Andrea Gioia

In recent years, data products have emerged as a solution to the enterprise problem of siloed data and knowledge. Andrea Gioia helps his clients build composable, reusable data products so they can capitalize on the value in their data assets. Built around collaboratively developed ontologies, these data products evolve into something that might also be called a knowledge product.

We talked about:
his work as CTO at Quantyca, a data and metadata management consultancy
his description of data products and their lifecycle
how the lack of reusability in most data products inspired his current approach to modular, composable data products - and brought him into the world of ontology
how focusing on specific data assets facilitates the creation of reusable data products
his take on the role of data as a valuable enterprise asset
how he accounts for technical metadata and conceptual metadata in his modeling work
his preference for a federated model in the development of enterprise ontologies
the evolution of his data architecture thinking from a central-governance model to a federated model
the importance of including the right variety of business stakeholders in the design of the ontology for a knowledge product
his observation that semantic modeling is mostly about people, and working with them to come to agreements about how they each see their domain

Andrea's bio

Andrea Gioia is a Partner and CTO at Quantyca, a consulting company specializing in data management. He is also a co-founder of blindata.io, a SaaS platform focused on data governance and compliance. With over two decades of experience in the field, Andrea has led cross-functional teams in the successful execution of complex data projects across diverse market sectors, ranging from banking and utilities to retail and industry.
In his current role as CTO at Quantyca, Andrea primarily focuses on advisory, helping clients define and execute their data strategy with a strong emphasis on organizational and change management issues. Actively involved in the data community, Andrea is a regular speaker, writer, and author of 'Managing Data as a Product'. Currently, he is the main organizer of the Data Engineering Italian Meetup and leads the Open Data Mesh Initiative. Within this initiative, Andrea has published the data product descriptor open specification and is guiding the development of the open-source ODM Platform to support the automation of the data product lifecycle. Andrea is an active member of DAMA and, since 2023, has been part of the scientific committee of the DAMA Italian Chapter.

Connect with Andrea online

LinkedIn (#TheDataJoy)
GitHub

Video

Here’s the video version of our conversation: https://www.youtube.com/watch?v=g34K_kJGZMc

Podcast intro transcript

This is the Knowledge Graph Insights podcast, episode number 30. In the world of enterprise architectures, data products are emerging as a solution to the problem of siloed data and knowledge. As a data and metadata management consultant, Andrea Gioia helps his clients realize the value in their data assets by assembling them into composable, reusable data products. Built around collaboratively developed ontologies, these data products evolve into something that might also be called a knowledge product.

Interview transcript

Larry: Hi, everyone. Welcome to episode number 30 of the Knowledge Graph Insights podcast. I'm really happy today to welcome to the show Andrea Gioia. Andrea's, he does a lot of stuff. He's a busy guy. He's a partner and the chief technical officer at Quantyca, a consulting firm that works on data and metadata management. He's the founder of Blindata, a SaaS product that goes with his consultancy. I'll let him talk a little bit more about that.
He's the author of the book Managing Data as a Product, and he's also, he comes out of the data heritage but he's now one of these knowledge people like us. So welcome, Andrea. Tell the folks a little bit more about what you're up to these days.

Andrea: Thank you. Thank you very much, Larry, for having me. It's a pleasure. Yes, as CTO at Quantyca, I'm in charge of all our advisory services. So I'm helping customers figure out how to manage their data properly, especially to leverage the potential of artificial intelligence. So basically I see all sorts of problems in data management. Each client is different, but each client has a lot of problems with data that is very fragmented or too complex to manage. And so it's a very complex problem to feed this data to the AI models and extract the potential that modern AI and all the breakthroughs that we are seeing these days have made available. So I'm really focused at this moment on helping customers find a way to manage their knowledge, the knowledge that is characteristic of the company, that is a differentiator of the company, the knowledge that is not known to the large language models, what makes the company different and can be leveraged to implement domain-specific, company-specific use cases based on AI and leveraging the data collected.

Larry: Yeah. As you mentioned that, we were just chatting a bit before we went on about the scope of the conversation. And I totally forgot to mention AI, which of course is like the main driver for half of this stuff we're doing nowadays. But a couple of things you mentioned there. I want to go back to, one, you mentioned what a complex problem space this is and the challenges of data management and every organization has its own issues there. One of the ways that folks like you have helped people cope with this is the notion of a data product. And I know that's a newish concept and maybe new to some of the listeners to this.
Can you talk a little bit about your conception of what a data product is and how you put one together?

Andrea: Yeah, absolutely. The concept is new, but the rationale behind it is not new. When a problem is too complex, the only way that humans have found to solve it is to split it into different parts, into smaller parts, and try to keep all the complexity within each single part. And the idea of the data product comes from this strategy. So the idea is not to manage the data in a single central platform in which all the data of the company is collected, but to split it into a modular architecture. So the platform is still there. You have the data layer, the data warehouse, whatever architecture you prefer, but it's no longer a monolithic solution in which you store all the data that you have in your company. It's built as a composition of independent modules. Each module focuses on one or more, but usually one, specific data asset, and there is a team that is in charge of managing the lifecycle of the data product that manages that specific data asset.

Andrea: Of course, the composition of all the data assets creates the platform, and the platform can be used to support the different use cases, but basically you can work on each single module without caring too much about the other modules, because each module is isolated behind a specific interface. So if you do not modify the interface, you can modify the technology and implementation inside. And if you want to understand how the different modules connect, you can ignore the implementation and just concentrate on the relations between the different interfaces.

Andrea: So to make it very, very simple, we can think of the data product as a sort of microservice, that is, a software application, it is actually a software application, that does not expose functionality, transactional functionality, to acquire data and drive the transaction, but exposes the data.
It's a software application that exposes the data in order to make the data it manages as usable as possible for its customer base, for its users. So this is a data product. And of course, because it is a product, it is managed with a product mindset. So it's not a project. It's not something that the team develops and then forgets about. There is a dedicated team that implements the first version and then evolves the software application that supports that specific data asset through its whole lifecycle, until retirement, when the data asset is no longer relevant for the company. That's pretty much what a data product is for me.

Andrea: So basically I call this kind of data product a pure data product, to underline even more the fact that it's a software application that exposes data, because I also get asked a lot whether a report or a dashboard is a data product, and I say, yes, it's a data product if it is managed as a product with a product mindset. But my book, my research is more focused on the pure data product, that specific kind of data product that does not expose visualizations or insights or actions but exposes just pure data, to make it reusable and composable over time to support multiple use cases.

Larry: That's right. We didn't talk about this before we went on the air, but the episode right before this one is with Dave McComb, and I know I've heard you talk before about how you appreciate his approach and his data-centricity. And everything you just said, I'm like, "Oh yeah, he's read Dave's books." Was that the major influence, or what are the influences?

Andrea: Absolutely. It was for me an epiphany because at that time when I read McComb's books, I was looking for... I had a problem, because we had started a couple of years earlier to help our customers create this kind of modular architecture. That is, an architecture that is built as a composition of different data products, even managed with a distributed operating model.
So all the data products are managed by different business domains in an autonomous way.
