Knowledge Model
OWL ontology, SPARQL context, schema loading, and semantic model
The knowledge model is the semantic foundation of Epistom. It stores verified business definitions in an OWL 2 QL ontology, maps them to database schemas, and provides the context that makes LLM-generated SQL accurate.
How It Works
The knowledge model is stored in Oxigraph, a fast Rust-based SPARQL 1.1 triplestore. User-facing concepts map to internal OWL/SHACL primitives:
| User-Facing | Internal | Example |
|---|---|---|
| Object Type | OWL Class | Customer, Invoice, Subscription |
| Link Type | OWL Property | customerHasSubscription, invoiceBelongsToCustomer |
| Business Rule | SHACL Shape | "Revenue = net_amount WHERE status = 'paid'" |
Users never interact with OWL syntax or SPARQL directly. They work through a visual editor (card-based UI) or the auto-discovery pipeline.
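To make the mapping in the table concrete, the sketch below renders object types and link types as Turtle. It is an illustration only: the `ObjectType` and `LinkType` classes, the `to_turtle` helper, and the `ex:` namespace are hypothetical names, and it assumes link types are stored as OWL object properties with a domain and range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectType:
    """User-facing object type (hypothetical class for this sketch)."""
    name: str


@dataclass(frozen=True)
class LinkType:
    """User-facing link type between two object types (hypothetical)."""
    name: str
    domain: str
    range_: str


# The ex: namespace is a placeholder, not the product's real namespace.
PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix ex: <http://example.org/ontology#> .\n"
)


def to_turtle(object_types, link_types):
    """Render object types as owl:Class and link types as owl:ObjectProperty."""
    lines = [PREFIXES]
    for ot in object_types:
        lines.append(f"ex:{ot.name} a owl:Class .")
    for lt in link_types:
        lines.append(
            f"ex:{lt.name} a owl:ObjectProperty ;\n"
            f"    rdfs:domain ex:{lt.domain} ;\n"
            f"    rdfs:range ex:{lt.range_} ."
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_turtle(
        [ObjectType("Customer"), ObjectType("Subscription")],
        [LinkType("customerHasSubscription", "Customer", "Subscription")],
    ))
```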
Architecture
OntologyBridge (ontology_bridge.py)
The OntologyBridge class handles import/export of the knowledge model in multiple formats:
- JSON-LD — Standard linked data format
- dbt YAML — Integration with dbt semantic layer
- OpenMetadata — Integration with OpenMetadata catalogs
OxigraphSPARQL (oxigraph_sparql.py)
Low-level SPARQL client for Oxigraph. Handles query execution, graph management, and triple operations against the triplestore endpoint.
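As a rough illustration of what query execution against the triplestore involves, the sketch below sends a SPARQL query over HTTP using only the standard library and flattens the JSON results. It is not the OxigraphSPARQL implementation: the function names are hypothetical, and the endpoint URL assumes an Oxigraph server at its default address.

```python
import json
import urllib.request

# Assumed endpoint: a local Oxigraph server on its default port.
DEFAULT_ENDPOINT = "http://localhost:7878/query"


def build_query_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol 'query via POST directly' request."""
    return urllib.request.Request(
        endpoint,
        data=query.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def parse_bindings(payload: str) -> list[dict[str, str]]:
    """Flatten a SPARQL JSON results document into {variable: value} rows."""
    doc = json.loads(payload)
    return [
        {var: term["value"] for var, term in row.items()}
        for row in doc["results"]["bindings"]
    ]


def run_query(query: str, endpoint: str = DEFAULT_ENDPOINT) -> list[dict[str, str]]:
    """Execute a SELECT query and return its rows (needs a running server)."""
    request = build_query_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

A call such as `run_query("SELECT ?cls WHERE { ?cls a <http://www.w3.org/2002/07/owl#Class> }")` would list the classes in the default graph, provided a server is reachable at the assumed address.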
SchemaLoader (schema_loader.py)
Loads database schema metadata (tables, columns, types, constraints) from connected data sources. The SQL validator uses this metadata to check that generated SQL references only tables and columns that actually exist.
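The idea can be sketched with SQLite from the standard library. This is a minimal example under stated assumptions, not the SchemaLoader code: `load_schema` and `unknown_columns` are hypothetical helpers, and a real loader would query each data source's own catalog.

```python
import sqlite3


def load_schema(conn: sqlite3.Connection) -> dict:
    """Return {table: {column: metadata}} for every user table in the database."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = {
            name: {
                "type": col_type,
                "nullable": not notnull,
                "primary_key": bool(pk),
            }
            for _cid, name, col_type, notnull, _default, pk in columns
        }
    return schema


def unknown_columns(schema: dict, table: str, columns: list) -> list:
    """Columns referenced by generated SQL that the loaded schema does not contain."""
    known = schema.get(table, {})
    return [column for column in columns if column not in known]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoice (id INTEGER PRIMARY KEY, "
        "net_amount REAL NOT NULL, status TEXT)"
    )
    schema = load_schema(conn)
    print(unknown_columns(schema, "invoice", ["net_amount", "gross_amount"]))
```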
SemanticModelLoader (semantic_model_loader.py)
Loads and validates the complete semantic model — the combination of ontology classes, source mappings, verified queries, and business rules. The load_semantic_model() function assembles all components into a unified model, and validate_model() checks for consistency (orphaned classes, missing source mappings, etc.).
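The kind of consistency check described here could look like the sketch below. It does not reproduce validate_model(): the `SemanticModel` shape and `check_consistency` are hypothetical, and "orphaned" is assumed to mean a class that takes part in no link type.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticModel:
    """Simplified stand-in for the assembled model (hypothetical shape)."""
    classes: set = field(default_factory=set)
    # Each link is (link name, domain class, range class).
    links: list = field(default_factory=list)
    # Maps a class name to the table it is sourced from.
    source_mappings: dict = field(default_factory=dict)


def check_consistency(model: SemanticModel) -> list:
    """Return human-readable issues; an empty list means the checks passed."""
    issues = []
    linked = {cls for _name, domain, range_ in model.links for cls in (domain, range_)}
    for cls in sorted(model.classes):
        if cls not in linked:
            issues.append(f"orphaned class: {cls}")
        if cls not in model.source_mappings:
            issues.append(f"missing source mapping: {cls}")
    for name, domain, range_ in model.links:
        for endpoint in (domain, range_):
            if endpoint not in model.classes:
                issues.append(f"link {name} references unknown class: {endpoint}")
    return issues


if __name__ == "__main__":
    model = SemanticModel(
        classes={"Customer", "Invoice", "Subscription"},
        links=[("invoiceBelongsToCustomer", "Invoice", "Customer")],
        source_mappings={"Customer": "customers", "Invoice": "invoices"},
    )
    for issue in check_consistency(model):
        print(issue)
```

Returning a list of issues rather than raising on the first one lets a caller report every problem in a single pass.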
Configuration
The knowledge model is loaded from one of three sources:
- Auto-discovery (database schema scan + AI-proposed mappings)
- Starter kits (pre-built industry ontologies for SaaS, Retail, and Healthcare, 150 classes each)
- Manual import (JSON-LD, dbt YAML, OpenMetadata)
Technical Details
- The ontology uses OWL 2 QL profile for efficient query answering
- SPARQL queries are parameterized via templates in sparql_templates.py
- Schema snapshots are versioned: ontology_snapshot_service.py manages point-in-time snapshots for drift comparison
- Inference rules in ontology_inference.py derive implicit relationships (e.g., transitivity)
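To illustrate what deriving an implicit relationship through transitivity means, the sketch below computes the transitive closure of a set of (subject, object) pairs for a single relationship. It is a generic fixed-point loop, not the logic in ontology_inference.py; `transitive_closure` and the `partOf` example are hypothetical.

```python
def transitive_closure(edges):
    """Return edges plus every pair implied by transitivity.

    If (a, b) and (b, c) are present, (a, c) is added, and the process
    repeats until no new pair can be derived.
    """
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


if __name__ == "__main__":
    # Hypothetical "partOf" facts: a team is part of a department, and so on.
    part_of = {("Team", "Department"), ("Department", "Company")}
    inferred = transitive_closure(part_of) - part_of
    print(inferred)
```

The nested loop is quadratic per pass, which is fine for a sketch but would need a smarter algorithm for large graphs.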