After playing around with some chat-to-SQL tools in the past month, I’m surprised to report that this problem seems solved. A motivated analyst can feasibly create a SQL chatbot in a few minutes and have answers delivered to users in their preferred medium.
It’s early days, yes, but the shape of the thing aligns with practice: a user asks a poorly formed question; an analyst interprets it and does their best with the available data objects and business context; and the user gets queries, charts, and datasets with answers. The relevant variables are users, interpreters, data objects, business context, and the APIs for producing artifacts.
What’s changed is that the newer LLM tools leave only the “business context” and “data objects” as unfixed variables. (i.e., users can’t be faulted, system APIs are reliable, and LLMs are good enough, given context.)
Getting those two to good is tractable now. Users like to point and grunt—“revneue yestrday” is interpreted as “Please share the total revenue from orders in the US on 2024-04-07”. If the agent gets confused, developers can update the YAML and redeploy. The agent improves. Repeat.
Finally, I think there’s a reasonable test for business intelligence layer quality. Let’s call it the mediocre model test:
A data interface is poorly modeled if an LLM system cannot reliably answer 80% of user questions in a single pass at the question.
Framed positively: well-modeled data is intuitive, performant, and accessible to mediocre data users. An LLM analyst is nothing less than an infinite well of mediocre data users.
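The test is mechanical enough to run as an eval. A sketch, assuming you supply a graded question set and an `answer` callable that makes exactly one pass per question; both are hypothetical stand-ins:

```python
from typing import Callable

def mediocre_model_test(
    questions: list[tuple[str, str]],  # (question, expected answer) pairs
    answer: Callable[[str], str],      # one single-pass attempt; no retries
    threshold: float = 0.8,            # the 80% bar from the definition
) -> bool:
    """True if the data interface passes the mediocre model test."""
    passed = sum(1 for q, expected in questions if answer(q) == expected)
    return passed / len(questions) >= threshold
```

Run it against every release of the semantic model; a failing score points at the modeling, not the model.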
I’m not saying it’s easy—but the challenges are precisely of the sort that lead to an interesting design space. Engineers need to iteratively expose the agent to more information in the world in a way that nudges its response towards the “right” answer, much the same way an application frontend nudges users into desirable flows.
The future of data engineering—or at least one piece of it—is unquestionably building data frontends for agents.
The “semantic layer” is remarkable primarily because it solved the problem of context sharing before ChatGPT existed. But it’s beginning to look like just another implementation of a model context protocol.
The problem is to layer in context, which all interoperable agents will need, whether creating a revenue dashboard or retrieving a user’s orders. Unlike business intelligence metrics layers, Anthropic’s model context protocol will be adopted in a big way, for big projects. It’s an order of magnitude more impactful than automating away BI.
That doesn’t mean it won’t have the same problems as the semantic layer. For example, the reference MCP server for PostgreSQL gives agents access to table schema information, such as:
- JSON schema information for each table
- Column names and data types
- Automatically discovered from database metadata
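That output is genuinely useful and genuinely thin. A sketch of the idea using Python’s bundled sqlite3 (SQLite standing in for Postgres); the JSON shape is illustrative, not the actual MCP resource format:

```python
import json
import sqlite3

def table_schema(conn: sqlite3.Connection, table: str) -> str:
    """Column names and types, discovered from database metadata alone."""
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return json.dumps(
        [{"column_name": name, "data_type": dtype} for _, name, dtype, *_ in cols]
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL, created_at TEXT)")
print(table_schema(conn, "orders"))
# Nothing here says whether `total` nets out refunds, or what timezone
# `created_at` is in -- that nuance is exactly what the metadata lacks.
```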
The same applies to Slack, ElasticSearch, SQLite, and other platforms. Data professionals know what problems will arise next: semantics, quality, intention, ownership—in short, nuance.
What’s changed is that the gap from “database metadata” to “expert nuance” is gone. It’s all just tokens injected at runtime. The teams willing to invest in designing that interface will deliver the best product.
It’s still early days, though. We’re pre-CSS. We’ve got “<header>” and “<body>” tags, and that’s about it.
The context frontend is not pretty. Just look at Snowflake’s Cortex Analyst YAML spec: verified queries, synonyms, comments, example values. Look how pedantic you have to be.
Then again, building a pretty web UI isn’t rocket science either. Just look at Tailwind CSS’s index of utility classes. What’s more pedantic than standardized names for button classes and accent colors?
No—the design paradigms are still emerging, but the design space seems relatively determined. Today’s context protocols are like sitemaps from government research departments.
Traditional front-end development emphasizes visual cues—buttons, colors, and gestures—to nudge users to the right flows. In contrast, data front-ends depend on namespaces, descriptions, synonyms, and example values to render data accessible to AI.
If there are four different timestamp columns and one of them isn’t tagged as primary, the agent is unlikely to get the “right” one—in fact, you’ve probably leaked that there is no systemically understood “right” one.
The art will be to maximize what is made available to the agent’s context window, much as a mobile app must optimize what’s presented to the retinal field.
For agents, the data interface is nothing more than a constructed linguistic reality: one that must be as complete as possible, in as narrow a manner as possible, while keeping the potential search space as wide as possible. Having a hundred tables that all do the same thing is not just clutter; it will have a measurable impact on performance and accuracy.
We may find that this drains all creativity from data modeling. We may face a semantic apocalypse where every business generates a cheap Kimballification of the spreadsheets they care most about.
But how much data complexity is invented, or due to a lack of clear interfaces? Might we find freedom in constraint—in minimum viable APIs?
Perhaps any business needs only a bit of nuance—just enough to fit in a context window. That’s how mobile apps work, restricting users to a few primate gestures: taps, swipes, clicks, presses, and hour-long stares.
It’s those stares that matter, and that’s precisely what the data frontends facilitate. How do you guide the AI to the “right” data when it has only the most meager instructions? Does it matter how deep the DAG is underneath if it can guide the right business leader’s attention to the anointed table? If it can answer questions in the heat of the boardroom? If it can respond to the barest inputs?
ChatGPT, what were our revenue drivers last month?
Which ones should I focus on?
Other insights?
Trends?
Make slides
do pie