Verified Queries
Curated SQL patterns, few-shot injection, and the promotion pipeline
Verified Queries (VQs) are curated, production-validated SQL patterns that serve as few-shot examples for the NLQ engine. They are the highest-trust tier of SQL generation — when a question matches a verified query, the LLM receives the pattern as context and generates highly accurate SQL.
How It Works
The verified query system operates in three phases:
- Matching — When a question arrives, the system scores existing verified queries for relevance using keyword overlap, table coverage, and semantic similarity
- Injection — Matched VQs are injected into the LLM prompt as few-shot examples
- Promotion — Queries that receive consistent positive feedback can be promoted to verified status
Finding Relevant VQs (verified_query_loader.py)
The find_relevant() function scores VQs against the incoming question:
- Keyword scoring — Matches question keywords against VQ descriptions and SQL content
- Table overlap — Prioritizes VQs that use the same tables as the schema context
- Semantic fallback — When keyword scoring is insufficient, uses embedding similarity (if available)
- Exact match — Direct string matching for known question patterns
The format_for_prompt() function converts matched VQs into the prompt format expected by the LLM.
VQ Pipeline (vq_pipeline/)
The pipeline manages the lifecycle of verified queries:
- QueryMiner (
query_miner.py) — Mines the query log for frequently asked questions and high-confidence patterns that are candidates for verification - FeedbackPromoter (
feedback_promoter.py) — Automatically promotes queries with consistent positive feedback ratings above a threshold - GenAIGenerator (
genai_generator.py) — Uses the LLM to generate candidate VQs from concept definitions, which human stewards then review - SQLImporter (
sql_importer.py) — Imports SQL patterns from external sources (dbt models, Looker views, etc.)
Architecture
Verified queries are stored in PostgreSQL with the following key fields:
question_pattern— The natural language pattern this VQ answerssql_template— The validated SQLtables— Tables referenced (for scoring)status— Active, pending review, or retiredexecution_count— How many times this VQ has been matched and usedworkspace_id— Workspace isolation for multi-tenant deployments
Configuration
Verified queries can be loaded from:
- Database — PostgreSQL
verified_queriestable - JSON files —
demo/verified_queries.jsonfor demo data - API — REST endpoints for CRUD operations
The invalidate_cache() function clears the in-memory VQ cache, forcing a reload from the database on the next request.
Technical Details
- VQ matching uses a scoring algorithm that weights keyword overlap, table coverage, and freshness
- Active VQ IDs are cached with a TTL for performance (
get_active_vq_ids()) - The promotion threshold is configurable — queries need N positive feedbacks with no negative feedback in the last M days
- VQs support dialect-specific SQL (PostgreSQL, Snowflake, BigQuery) via the dialect compiler