
Ontology Had Its Moment. Then Another One. We Were There for Both.

By Robert Goodyear · December 9, 2024 · 5 min read

2024: Conferences Again

SEMANTiCS 2024 in Amsterdam, ISWC 2024 with its "Large Language Models for Ontology Learning Challenge," the Semantic Web Journal's special issue on "Large Language Models, Generative AI and Knowledge Graphs." Twenty years after OWL became a W3C standard, ontology is having a moment again.

The Word

Ontos (being) plus logos (study). Aristotle's Categories, around 350 BCE, offered ten primitives for describing what exists: substance, quantity, quality, relation, place, time, posture, state, action, passion. In the 1970s, AI researchers borrowed the term because it fit, and Tom Gruber codified the computer science definition in 1993: "an explicit specification of a conceptualization," meaning a formal way to describe what things are and how they relate.

First Wave: 1994-2006

Tim Berners-Lee outlined the Semantic Web at the first WWW Conference in 1994, describing a web readable by machines. The 2001 Scientific American article with Hendler and Lassila crystallized the vision: everyone marks up their pages with semantic metadata, intelligent agents traverse the structured web.

The problem was adoption. Healthcare had HIPAA requirements, so medical ontologies like SNOMED CT got built. Search engines had business models, so Google built knowledge graphs. The open web had neither incentive nor mandate, which meant JSON-LD survived for SEO purposes and knowledge panels survived for search, but the universal semantic web failed to materialize.

Second Wave: 2023-Present

GPT-4 is trained on essentially the entire internet, fluent and confident, including when wrong. The initial fix was Retrieval-Augmented Generation: rather than relying on what the model memorized, retrieve relevant documents at inference time and include them in the prompt. Text chunks work, but relationships are implicit and multi-hop reasoning is fragile.
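The retrieval step described above can be sketched in a few lines. This is an illustrative toy, not any production RAG stack: the corpus, the bag-of-words cosine scoring, and the prompt template are all placeholder assumptions standing in for real embedding models and vector stores.

```python
# Toy sketch of the RAG pattern: score documents against a query,
# then prepend the top matches to the prompt instead of relying on
# what the model memorized. Bag-of-words cosine similarity stands in
# for a real embedding model; the corpus and template are made up.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Retrieved text becomes explicit context in the prompt at inference time.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The limitation the post goes on to describe is visible even here: the retrieved chunks are flat strings, so any relationship between them stays implicit, which is the gap GraphRAG-style approaches close with explicit graph structure.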

Microsoft's GraphRAG in 2024 integrated knowledge graphs into the retrieval pipeline, where the graph structure explicitly captures entity relationships. The model traverses connections instead of hoping they're implicit in text. Gartner now predicts graph technologies will appear in 80% of data and analytics innovations by end of 2025, up from 10% in 2021. Structured knowledge matters because LLM reliability demands it.


Where We Came In

In 2014, we were shifting away from a services business (tech product marketing/CRM/PLM) toward stronger bets on earlier-stage products: participating in their success and putting our future at risk. Chop that up into smaller slices across more partners and it becomes a study in reconciling portfolios. We won't dwell on what went sideways when we tried to de-risk our project concentration with some global travel titans on the eve of the, erm, GLOBAL pandemic, but that's a story for another time.

Back to the origin (war) story of why we're here:

2014-ish

Same municipal bond, three different prices across systems at one institution. These were material discrepancies affecting portfolio decisions, not rounding errors: a problem that haunted me for years as we watched every platform speak its own language.

Simon Property Group: "Real Estate" in one system, "Financials" in another (old GICS), "Alternatives" in a third (endowment model), "Discretionary" in a fourth (consumer exposure lens). Same REIT, four classifications, four risk models, four different answers to simple questions. DTCC and Oliver Wyman estimated $20-40 billion annually in reconciliation, failed trades, and manual mapping. You cannot surface cross-asset patterns when systems disagree on what assets are.

2016-2020… and Beyond

Early prototypes classified bonds by cash flows rather than legal names. The key was a directed acyclic graph that lets instruments inherit multiple properties with no circular dependencies: a convertible bond is both debt and equity-linked without the system choking. By 2020, machine learning classified new securities with 99.7% accuracy across 10+ million instruments, and the models found patterns humans missed, including munis behaving like corporate debt and REITs trading like fixed income.
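The multiple-inheritance idea above can be sketched as a toy taxonomy. The node names and the API are hypothetical illustrations, not ReferenceModel's actual schema; the point is just that a node can have several parents while the graph rejects cycles.

```python
# Toy DAG taxonomy: an instrument category can inherit from multiple
# parents (a convertible bond is both debt and equity-linked), and
# edges that would create a cycle are rejected, keeping it acyclic.
# Node names are illustrative, not a real classification scheme.
from collections import defaultdict

class Taxonomy:
    def __init__(self) -> None:
        self.parents: defaultdict[str, set[str]] = defaultdict(set)

    def ancestors(self, node: str) -> set[str]:
        # Walk parent edges transitively to collect every ancestor.
        seen: set[str] = set()
        stack = list(self.parents[node])
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(self.parents[n])
        return seen

    def add(self, child: str, *new_parents: str) -> None:
        for p in new_parents:
            # Refuse any edge that would make the child its own ancestor.
            if child == p or child in self.ancestors(p):
                raise ValueError(f"{child} -> {p} would create a cycle")
            self.parents[child].add(p)

tax = Taxonomy()
tax.add("debt", "instrument")
tax.add("equity_linked", "instrument")
tax.add("convertible_bond", "debt", "equity_linked")  # two parents, no choking
```

With this shape, a query for everything that behaves like debt naturally picks up the convertible bond too, which is the cross-system question the single-parent hierarchies in the Simon Property Group example could not answer consistently.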

A funny thing happened along the way. In the pandemic haze of 2020, we poked around at some map/reduce and ML experiments for our friend Sean Hsieh and his team at Concreit, looking for the meaning of meaning in the rising trend of D2C fractional REIT investors. We instrumented some GPT-2 / GPT-3 pipelining and ran some regressions against our other friends' massive datasets at ATTOM Data Solutions.

It was an early experiment in cracking the "Big data? Big deal… so what are you gonna do about it?" deadlock. Everybody had dashboards by the late '90s. Everybody could spell CRM in the '00s. Web 2.0 made it all pretty in the '10s. Our founder-led, sometimes rage-baity quest for actionable insight over consultant-billed, thud-factor reports was coming to a head.

This "OK, but… why?" thorn in the side sat idle through a short distraction in the digital assets boom. By late 2023, when that boom self-corrected by force, the rear-view mirror showed these new asset types to be yet another justification for sciencing the sh*t out of the supposedly transparent yet parochial, or perhaps insular, world of finance.

This became ReferenceModel, the classification backbone for Aaim's valuation infrastructure and the asset taxonomy for all of our interrelated but distributed systems.

Don't Call it a Comeback, it's a Convergence

Please forgive the negation-pivot headline that sounds like genAI slop; it had to be said, because LL Cool J was our founder's grad school soundtrack. The momentum of GPT, or more precisely the consumerization moment of ChatGPT, brought the world an accelerator that cuts both ways. Remember what we said about "...including when wrong." The LLM researchers needed structured knowledge to ground model outputs. We needed it to make incompatible financial nomenclature systems interoperable. Same destination, different routes.

Unstructured information has limits, and at some point you need explicit relationships and formal categories. Aristotle used substance and quality, Gruber used conceptualizations, the LLM researchers use knowledge graphs, and we use directed acyclic graphs of financial instruments. Not to conflate ourselves with other big brains in this space, but we are doing our best to stand on the shoulders of giants.


Robert Goodyear
Founder/CEO

Robert Goodyear is the founder of Aaim, a financial technology company providing alternative asset infrastructure to financial institutions.
