In life and in work, asking great questions can reshape your thinking and open doors to important answers. LLMs have made it amazingly easy to ask questions about most things, but asking great data-backed questions about our businesses is still difficult - it requires writing and running SQL, fiddling with complex query builders, or asking an engineer or data analyst to do it.
At the last Maven team hackathon[1] I built a data exploration tool that makes it easy to ask natural language questions about our business and get exact data as answers. It operates over almost all the data our company collects, and it's being used by over half our team every week across all functions, from the support team to the CEO.
They seem to love it.

Here's how and why I built it.
Maven collects data from many sources: user actions in Amplitude, email activity in Customer.io, and application state in our Postgres database. This data is combined via Segment and extended through dbt transformations in Snowflake. On top of this, we have Metabase for querying, charting, and sharing cuts with the team.
This is a reasonably thoughtful data pipeline for a 20-person company, and Metabase's query builder is easy enough to use for simple queries, but there are some challenges:
To query data you need to know that it exists, and what it's called. We add new tables all the time, and the names might not match the feature's external name. Some of the data may live as columns, while other data might live in JSON fields whose contents are opaque to database tools. It's hard for a casual analytics user to keep up and to remember enough details to ask all their questions.
Many interesting questions require complex queries which can get gnarly even with visual query builders. For example, I wanted to know which courses engineers had taken in the last 3 months. This required a multi-part query: first pattern matching over user job titles to find engineer users, then separately joining courses with tags and enrollments to find users of engineering courses, and finally two kinds of joins to bring the pieces together. Simple question, complex query to generate.
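To make that concrete, here's roughly the shape of that query as a minimal sketch. Every table and column name below (users, enrollments, courses, course_tags, job_title, and so on) is an illustrative guess, not our actual schema:

```python
# Illustrative sketch only: these table and column names are hypothetical,
# not our real schema. The SQL targets Snowflake (DATEADD, ILIKE).
WHICH_COURSES_ENGINEERS_TOOK = """
WITH engineer_users AS (
    -- Pattern match over job titles to find engineers
    SELECT id
    FROM users
    WHERE job_title ILIKE '%engineer%' OR job_title ILIKE '%developer%'
),
engineering_courses AS (
    -- Join courses with tags to find engineering courses
    SELECT c.id, c.title
    FROM courses c
    JOIN course_tags ct ON ct.course_id = c.id
    JOIN tags t ON t.id = ct.tag_id
    WHERE t.name ILIKE '%engineering%'
)
-- Two more joins to bring the pieces together
SELECT ec.title, COUNT(DISTINCT e.user_id) AS engineer_count
FROM enrollments e
JOIN engineer_users u ON u.id = e.user_id
JOIN engineering_courses ec ON ec.id = e.course_id
WHERE e.created_at >= DATEADD(month, -3, CURRENT_DATE)
GROUP BY ec.title
ORDER BY engineer_count DESC
"""
```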
Because of these challenges, people start their analysis by either taking an existing query and modifying it, or by posting questions in the #analytics channel in Slack and waiting for a PM or an engineer to help them answer it, which is a drag for everyone involved.
LLMs can help here because they're great at two important things:
First, LLMs are excellent at writing well-structured SQL. SQL is inherently structured, with a small, well-defined grammar. Natural language queries are declarative, like SQL. There is a ton of training data for SQL, and responses are easy to verify. Even the engineers on the team who are very comfortable with SQL use ChatGPT to help them write the more complex queries.
Second, LLMs can accurately handle very large context windows. Hallucinations still happen but there is active effort and good progress in increasing factuality in LLM responses. And as the context windows of LLMs have grown from 8k to millions of tokens, work has been done to measure and improve on the "needle-in-a-haystack" test. Gemini 2.5 Pro claims 100% recall from a context of 530k tokens, and 99.7% recall for a million tokens!
So if we can give an LLM a complete description of our data model - the tables, the columns, the names of fields in JSON columns - it wouldn't be able to answer our question directly, but it might figure out how to write accurate queries to answer these questions. This turns out to work quite well.
What I built was a "chat with an LLM" page on our admin site backed by an API endpoint that sends the user's query and a description of our data model to Gemini 2.5, gets back SQL with an explanation, and runs that SQL in our Snowflake warehouse through Metabase. There are no microservices here - both the client and server sides were built into our main apps.
In more detail:
We construct our context to contain a description of the data model. This context is a giant text blob containing three components:
- The models.py file, where we define all our Postgres tables.
- The schemas.py file, where we define Pydantic schemas for JSONB columns (and much more). If we were short on context window space, we'd cut out the irrelevant schemas.
- The .sql files which define all our dbt models.

Each of these components is in a different format, but the LLM understands each one with no additional explanation. And since this feature is built into our primary application, the context is always up to date with our data model.
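As a minimal sketch of the assembly step (the file paths and section headings here are my assumptions, not our exact layout), the whole thing is just file reads and string concatenation:

```python
from pathlib import Path

def build_data_model_context(repo_root: Path) -> str:
    """Concatenate schema sources into one text blob for the LLM.

    Paths are hypothetical; the point is that the source files are
    shipped as-is, in their native formats, with no translation layer.
    """
    parts = ["## Postgres tables (models.py)\n"]
    parts.append((repo_root / "app" / "models.py").read_text())
    parts.append("\n## Pydantic schemas for JSONB columns (schemas.py)\n")
    parts.append((repo_root / "app" / "schemas.py").read_text())
    parts.append("\n## dbt models (Snowflake)\n")
    for sql_file in sorted((repo_root / "dbt" / "models").rglob("*.sql")):
        parts.append(f"\n-- {sql_file.name}\n{sql_file.read_text()}")
    return "\n".join(parts)
```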
We prompt Gemini with this context, task instructions, and the user's query, which comes in via a chat interface largely built by Cursor. Concretely, we call the LLM through litellm's chat completions API, with a system message containing the data model context and a user message containing the user's query.
This is the exact system prompt we use (notice the spots where the schemas above get inserted). We add specific instructions to handle edge cases, formatting specifications, and definitions of application-specific terms. We use Gemini because it has the largest context window, but most of the latest LLMs should work.
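A minimal sketch of that call, assuming litellm's OpenAI-style completion API and its gemini/ model prefix (the prompt below is a stand-in, not our real one):

```python
from litellm import completion

SYSTEM_PROMPT = """You are a data analyst for our company.
Using the data model below, answer each question by writing a single
Snowflake SQL query inside a fenced sql block, followed by a short
explanation of what the query does.

{data_model_context}
"""  # Stand-in prompt; the real one also covers edge cases and app-specific terms.

def ask_data_explorer(question: str, data_model_context: str) -> str:
    response = completion(
        model="gemini/gemini-2.5-pro",  # chosen for the large context window
        messages=[
            {"role": "system",
             "content": SYSTEM_PROMPT.format(data_model_context=data_model_context)},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```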
When an answer comes back, we extract the SQL and turn it into a Metabase link using Metabase's undocumented question URL API. We return this Metabase link and the LLM's explanation of the query. The user can then go to that Metabase link to run the query and see the results, or can keep chatting with the tool to modify the query.
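The link construction itself is small: Metabase encodes ad-hoc questions as base64 JSON in the URL fragment. Since this is undocumented, the payload shape below is a best-effort reconstruction that may drift between Metabase versions, and the host and database id are placeholders:

```python
import base64
import json

METABASE_HOST = "https://metabase.example.com"  # placeholder host
WAREHOUSE_DATABASE_ID = 2  # Metabase's internal id for the warehouse connection

def metabase_question_url(sql: str) -> str:
    """Build an ad-hoc question link that opens the SQL in Metabase's editor.

    Mirrors what the Metabase UI itself puts after /question# - an
    undocumented format, so treat the exact keys as assumptions.
    """
    card = {
        "dataset_query": {
            "type": "native",
            "native": {"query": sql, "template-tags": {}},
            "database": WAREHOUSE_DATABASE_ID,
        },
        "display": "table",
        "visualization_settings": {},
    }
    encoded = base64.b64encode(json.dumps(card).encode("utf-8")).decode("ascii")
    return f"{METABASE_HOST}/question#{encoded}"
```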
We could directly execute the SQL we get back from the LLM, but we don't! Even with the safety of a read-only replica, we'd have to consider pagination, the UI for displaying data, and a number of other concerns better left to Metabase or a proper querying tool.
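Putting the pieces together, the endpoint itself stays small. This is a sketch under assumptions - FastAPI as a stand-in framework, the helpers from the earlier sketches, and a model instructed to fence its SQL (as in the prompt sketch above):

```python
import re
from pathlib import Path

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
REPO_ROOT = Path(__file__).parent  # hypothetical: wherever models.py, schemas.py, dbt/ live

class ExploreRequest(BaseModel):
    question: str

class ExploreResponse(BaseModel):
    explanation: str
    sql: str
    metabase_url: str

@app.post("/api/data-explorer", response_model=ExploreResponse)
def explore(req: ExploreRequest) -> ExploreResponse:
    context = build_data_model_context(REPO_ROOT)      # sketched earlier
    answer = ask_data_explorer(req.question, context)  # sketched earlier
    # Pull the SQL out of the model's fenced sql block; the rest is the explanation.
    match = re.search(r"```sql\s*(.*?)```", answer, re.DOTALL)
    sql = match.group(1).strip() if match else ""
    explanation = re.sub(r"```sql\s*.*?```", "", answer, flags=re.DOTALL).strip()
    return ExploreResponse(
        explanation=explanation,
        sql=sql,
        metabase_url=metabase_question_url(sql),  # sketched earlier
    )
```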
The work wasn't done until people were using the tool, so the final step was to market it internally. I started pasting the exact questions posted in our Slack #analytics channel into the data explorer and replying with a link to the chat. When I'd ask an interesting question or get a surprising query, I'd share the result with the team and credit the Data Explorer. With regular reminders, usage crept up.
A couple of months in, most of the team has tried the tool and around a third of the team uses it every week. The heaviest user is one of our PMs, who says it has replaced a lot of his direct use of Metabase. I'm the second heaviest user, and it has greatly increased the number of data questions I ask every week. After us, the support and instructor success teams use it most often - usage that would certainly have required assistance from the engineering team before this tool existed.
In this time, I've noticed a few failure modes. First, people expect more from the tool than it can do, and the edges are non-obvious. They might expect it to read a URL (a webpage, a previous query) and answer questions based on it. Or they might expect us to have data that we don't collect or that isn't in Snowflake yet. Some failures point to future improvements. Second, the SQL can certainly be subtly wrong, primarily via hallucinated field names, so it's important to read the query to verify it. Finally, infrequent users forget to use the tool and go back to asking questions in #analytics that the tool can answer. Adoption is a constant effort - a truth that applies to many product features.
There are unexpected positives too: because the tool sees the full data model, it sometimes finds paths to answering a question that I would not have taken. For example, I asked which of our instructors were in Europe, and it used an instructor_timezone field that I had forgotten we collected. And assuming one reads the generated SQL (I do, but I don't expect most would), it's also possible to learn SQL techniques like CTEs and rank window functions from its output.
The most important positive of course is that because asking questions is now cheap, I ask way more questions.
All this was possible because of two strong foundations: we've insisted on type checking across our Python code base, which means that our "schemaless" JSON is also typed and well defined in application code. And our prior data work connecting different sources into Snowflake via Segment and dbt allows us to write and run queries across our data in Metabase. That said, the core of the tool would be useful with just a Postgres db and psql in read-only mode to run the generated queries.
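For a sense of that minimal version, here's a sketch assuming pg_dump for the schema context and the ask_data_explorer helper from above; you'd review the SQL yourself before running it:

```python
import subprocess

# Minimal variant: pg_dump's schema-only output is the entire data model context.
schema_context = subprocess.run(
    ["pg_dump", "--schema-only", "mydb"],
    capture_output=True, text=True, check=True,
).stdout

answer = ask_data_explorer("Which users signed up last week?", schema_context)
print(answer)
# Review the generated SQL, then run it in a read-only session, e.g.:
#   PGOPTIONS='-c default_transaction_read_only=on' psql mydb
```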
AI gives us the power to cross role boundaries. PMs can build prototypes, marketers can generate microsites, and everyone should be able to do the data analysis they need to do their jobs better. I imagine every company will have some version of this tool in the coming years, and if you have a few days to spare, you can give your team this as well. If you've got questions about it along the way, message me at [email protected].
I'm Shreyans, co-founder and CTO of Maven, the missing grad school for tech, and home to some incredible engineering/AI courses. I love building useful products with small, excellent teams, and I think we've got a pretty great one here at Maven. We're hiring senior/staff engineers to work on our tight-knit, remote team.
Before this, I co-founded Socratic (one of the first AI-powered tutors) and sold it to Google. Before that, I was the first employee at Venmo.
[1] We're a remote team but stay tight-knit by meeting once a quarter. We do full team offsites in Feb and Sept (we were in Porto last week), and two-day hackathons in New York in May and Dec.