Knowledge Graph Insights

Mike Pool: Is it time for a moratorium on the word “semantics”? – Episode 22

31 min · 11 February 2025
Mike Pool sees irony in the fact that semantic-technology practitioners struggle to use the word "semantics" in ways that meaningfully advance conversations about their knowledge-representation work. In a recent LinkedIn post, Mike even proposed a moratorium on the use of the word.

We talked about:

- his multi-decade career in knowledge representation and ontology practice
- his opinion that we might benefit from a moratorium on the term "semantics"
- the challenges in pinning down the exact scope of semantic technology
- how semantic tech permits reusability and enables scalability
- the balance in semantic practice between 1) ascribing meaning in tech architectures independent of its use in applications and 2) considering end-use cases
- the importance of staying domain-focused as you do semantic work
- how to stay pragmatic in your choice of semantic methods
- how reification of objects is not inherently semantic but does create a framework for discovering meaning
- how to understand and capture subtle differences in meaning of seemingly clear terms like "merger" or "customer"
- how LLMs can facilitate capturing meaning

Mike's bio

Michael Pool works in the Office of the CTO at Bloomberg, where he is working on a tool to create and deploy ontologies across the firm. Previously, he was a principal ontologist on the Amazon Product Knowledge team, and has also worked to deploy semantic technologies/approaches and enterprise knowledge graphs at a number of big banks in New York City. Michael also spent a couple of years on the famous Cyc project and has evaluated knowledge representation technologies for DARPA. He has also worked on tooling to integrate probabilistic and semantic models and oversaw development of an ontology to support a consumer-facing semantic search engine. He lives in New York City and loves to run around in circles in Central Park.
Connect with Mike online

LinkedIn

Video

Here’s the video version of our conversation: https://youtu.be/JlJjBWGwSDg

Podcast intro transcript

This is the Knowledge Graph Insights podcast, episode number 22. The word "semantics" is often used imprecisely by semantic-technology practitioners. It can describe a wide array of knowledge-representation practices, from simple glossaries and taxonomies to full-blown enterprise ontologies, any of which may be summarized in a conversation as "semantics." Mike Pool thinks that this dynamic - using a word that lacks precise meaning while assuming that it communicates a lot - may justify a moratorium on the use of the term.

Interview transcript

Larry: Hi everyone, welcome to episode number 22 of the Knowledge Graph Insights podcast. I'm really happy today to welcome to the show Mike Pool. Mike is a longtime ontologist, a couple of decades plus. He recently took a position at Bloomberg. But he made this really provocative post on LinkedIn lately that I want to flesh out today, and we'll talk more about that throughout the rest of the show. Welcome, Mike, tell the folks a little bit more about what you're up to these days.

Mike: Hey, thank you, Larry. Yeah. As you noted, I've just taken a position with Bloomberg and for these many years that you alluded to, I've been very heavily focused on building, doing knowledge representation in general. In the last let's say decade or so I've been particularly focused on using ontologies and knowledge graphs in large banks, or large organizations at least, to help organize disparate data, to make it more accessible, break down data silos, et cetera. It's particularly relevant in the finance industry where things can be sliced and diced in so many different ways. I find there's a really important use case in the financial space but in large organizations in general, in my opinion, for using ontology.
So that's a lot of what I've been thinking about, to make that more accessible to the organization and to help them build these ontologies and utilize them effectively.

Larry: Nice. One of the intellectual I guess foundations of that kind of practice is what we call semantics. Anyhow, I want to read part of that post you made on LinkedIn, which started a great conversation. One of the things you suggested, "I think we need to impose a moratorium on the use of the word semantics. The reason is simple, it's ironically a term lacking any precise meaning while we assume it's communicating a lot." That's brilliant, can you elaborate on that a little bit? Was there a particular, did something inspire that or has it just been on your mind?

Mike: Yeah. I mean, it's mostly, one term that often triggers it for me is this term I see within, let's call it this community of practice ... I see used very, very frequently, people will say, "Let's look at the semantic meaning." So this redundancy in terms, that we said, "Well, what in the world?" But we use it for all kinds of things. We say we need a semantic solution, we need the semantic meaning. And very often what that ends up being when we drill into that, it's just not always clear. The term I think in some sense has become either too vague ... it's unclear of what precisely it means. Or it's a shorthand for something else, that we're not actually saying we're going to capture meaning. We're saying, we're going to use this particular set of tools or something like that. So my concern is that it's sort of lost. We know when we say it that we mean we're going to use these particular set of tools, these particular set of languages, but to the people with whom we're communicating that might remain completely unclear. So yeah, that's my concern about the way we're using the concept.

Larry: Yeah. That's really interesting that ...
it's not laziness, it's like heuristics or something like that, that people use all the time to just try to advance whatever conversation they're in or project or whatever. It's just like, "Oh yeah, we need a semantic thing there," or something like that, it sounds like. Or they're thinking of possibly 20 different things or they just say, "Oh, semantic is the closest word I know to that idea," that we need to advance this.

Mike: Yeah. I mean, an example is ... Because I think, as I noted somewhat ironically, I herald myself as a semantic technology practitioner or something like that. After you said to me, "Well, what in the world is semantic technology?" It's a good question. If I create a property graph, there's part of me that says, well, that's not really semantic but a triple store is. It's like, well, what's the dividing line? What precisely makes it count as semantic or not? It's a little bit hard to pin that down.

Larry: The way you just said it, it's almost like there's an on/off switch someplace. But I've seen, there's a lot of representations of what various people have called something like the semantic spectrum, from just term lists to glossaries to thesauri, to the ontologies, that kind of thing. It's easy enough to disambiguate between each of those things I just mentioned, but is there something like a spectrum in there? Is that why people are grasping for words, do you think, to describe exactly what they're talking about in the moment?

Mike: Yeah. As I said, I think that's part of the problem. As I said, the people with whom I often communicate, I think we more or less mean it as a shorthand for using RDF OWL. And that might be as simple as using SKOS and creating a simple taxonomy with that, or creating a very elaborate OWL ontology. But it's interesting because, let's say, we create a taxonomy in SKOS.
Well, is there any reason that if you just had that taxonomy and you didn't bother to put it in SKOS, you just put it in an Excel spreadsheet with appropriate indentations? We'd say, well, how does the SKOS, or how does the RDF magically capture the meaning where the spreadsheet didn't, right? It's a little bit unclear. But I think other people use it differently. I think there's lots of people who would say using a property graph is a semantic solution. We're capturing knowledge, et cetera, in it. So I think it varies a little bit, that's again, part of the point. But I do believe people with whom I communicate, that's the shorthand we're using. It's like, this is either technology that we use to extend RDF OWL, or it's a knowledge graph that encodes that knowledge. But that's often what it means in my space I think.

Larry: Also, you mentioned in that post too, I think, when you're talking about, if your intent is to capture meaning, that there are other ways to do that technologically. You talked about just an old-fashioned ERD or a graph schema or a Python script that captures something in some project you're working on. And even I think you also alluded to, or maybe it came up in the discussion, it could be a natural language thing that you say to an LLM that it could discern. But I guess that gets at, what is the amount of meaning you need to capture? Does that make sense? What's the intent behind your attempt to do something semantic in a technical project?

Mike: It sounds like a straightforward question, but it's actually a very good one. Because as I said, that if we say, well, we're trying to capture meaning, you could write a Python script that does it, or there's a lot of different ways to do that. I think this whole, at least the background that I have in this, when we're talking about capturing semantics, what we are really concerned with is really trying to say, can we get a computer to reason in the same way that we do?
Can we get the computer to respond to natural language prompts in a similar way that a human can? That's kind of what we meant by semantics. But then if we try to talk about that in the technology space, what exactly does that mean? Let's take the Python script example. If I said, oh, I'm trying to solve this problem, I have this search engine and every time people,
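
As a rough illustration of the SKOS-versus-spreadsheet question Mike raises earlier in the conversation, here is a minimal Python sketch that reads one small taxonomy from indented rows and then restates it as SKOS-style triples. The taxonomy labels, the `http://example.org/taxonomy/` namespace, and the function names are all hypothetical and chosen for illustration; only the SKOS namespace and the `prefLabel` and `broader` terms come from the SKOS vocabulary. The triples are plain Python tuples rather than the output of an RDF library. The sketch does not settle the question; it only shows that both forms carry the same hierarchy, and that the triple form adds shared identifiers for the concepts and for the relation between them.

```python
# Hypothetical example data: a taxonomy as it might appear in an
# indented spreadsheet column (two spaces per level).
INDENTED = """\
Financial event
  Merger
  Acquisition
Party
  Customer
"""

SKOS = "http://www.w3.org/2004/02/skos/core#"   # real SKOS namespace
BASE = "http://example.org/taxonomy/"           # assumed namespace, illustration only


def parse_indented(text, indent=2):
    """Return (label, parent_label_or_None) pairs from indented rows."""
    stack = []   # stack[d] holds the most recent label seen at depth d
    pairs = []
    for row in text.splitlines():
        if not row.strip():
            continue
        depth = (len(row) - len(row.lstrip(" "))) // indent
        label = row.strip()
        del stack[depth:]                 # forget labels at this depth or deeper
        parent = stack[-1] if stack else None
        pairs.append((label, parent))
        stack.append(label)
    return pairs


def to_skos_triples(pairs):
    """Restate the parent/child pairs as (subject, predicate, object) tuples."""
    def uri(label):
        return BASE + label.replace(" ", "_")

    triples = []
    for label, parent in pairs:
        triples.append((uri(label), SKOS + "prefLabel", label))
        if parent is not None:
            triples.append((uri(label), SKOS + "broader", uri(parent)))
    return triples


pairs = parse_indented(INDENTED)
triples = to_skos_triples(pairs)

for triple in triples:
    print(triple)
```

Nothing in the triples says more about what a "Merger" is than the indentation did. What changes is that the parent/child relation is named explicitly with a published term instead of being implied by layout.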

Knowledge Graph Insights with Larry Swanson is available on several platforms. The information on this page comes from public podcast feeds.