Project: datalens
81 entity types

discovery.py service

The discovery.py service is the backend discovery service of the Data Discovery feature. It processes Danish questions and performs entity extraction, table ranking, and join discovery, using TableIndex for semantic table matching.

Tables are ranked by a 4-factor relevance score over matching criteria. In the intelligent table discovery process, the service uses Qwen3 LLM for table selection.

Joins are discovered with three strategies, each with an associated confidence: Known keys (95%), ID matching (85%), and Value overlap (75%), which identifies joins by data overlap.

To improve query success rate, the service implements the Schema consolidation mechanism. DiscoveryService uses Table to represent database tables with their metadata and JoinPath to represent joins between tables, and it produces ConsolidationRecommendation to suggest table consolidations for a question. After table consolidation, the service passes the unified schemas to Arctic LLM for SQL generation.

FilePrioritizer uses DiscoveryService outputs to prioritize project files relevant to analytical questions.
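
The three join strategies and their confidences can be illustrated with a minimal sketch. This is not the actual discovery.py code. The strategy order (highest confidence first), the KNOWN_KEYS registry, the "_id" naming heuristic, the overlap measure, the min_overlap threshold, and the fields of Table and JoinPath are all assumptions made for illustration. Only the strategy names and the 95% / 85% / 75% confidences come from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Confidence per join strategy, as stated in the description above.
STRATEGY_CONFIDENCE = {"known_keys": 0.95, "id_matching": 0.85, "value_overlap": 0.75}

# Hypothetical registry of curated key pairs:
# (left table, left column, right table, right column).
KNOWN_KEYS = {("orders", "customer_id", "customers", "id")}


@dataclass
class Table:
    name: str
    columns: dict[str, set]  # column name -> sample of values (assumed layout)


@dataclass
class JoinPath:
    left: str
    right: str
    left_column: str
    right_column: str
    strategy: str
    confidence: float


def _path(a: Table, b: Table, lc: str, rc: str, strategy: str) -> JoinPath:
    return JoinPath(a.name, b.name, lc, rc, strategy, STRATEGY_CONFIDENCE[strategy])


def discover_join(a: Table, b: Table, min_overlap: float = 0.5) -> Optional[JoinPath]:
    """Try the strategies from highest to lowest confidence (assumed order)."""
    # 1. Known keys: the pair is listed in the curated registry.
    for lt, lc, rt, rc in sorted(KNOWN_KEYS):
        if (lt, rt) == (a.name, b.name) and lc in a.columns and rc in b.columns:
            return _path(a, b, lc, rc, "known_keys")

    # 2. ID matching: both tables share a column whose name ends in "_id".
    shared_ids = sorted(c for c in a.columns if c.endswith("_id") and c in b.columns)
    if shared_ids:
        return _path(a, b, shared_ids[0], shared_ids[0], "id_matching")

    # 3. Value overlap: pick the column pair whose sampled values overlap most.
    best = None
    for lc in sorted(a.columns):
        for rc in sorted(b.columns):
            left, right = a.columns[lc], b.columns[rc]
            if not left or not right:
                continue
            overlap = len(left & right) / min(len(left), len(right))
            if overlap >= min_overlap and (best is None or overlap > best[0]):
                best = (overlap, lc, rc)
    if best is not None:
        return _path(a, b, best[1], best[2], "value_overlap")
    return None


orders = Table("orders", {"order_id": {1, 2, 3}, "customer_id": {10, 11, 12}})
customers = Table("customers", {"id": {10, 11, 12, 13}, "name": {"Ane", "Bo"}})
invoices = Table("invoices", {"customer_id": {10, 11}, "amount": {5.0}})
shipments = Table("shipments", {"recipient": {10, 11, 99}})

for left, right in [(orders, customers), (orders, invoices), (shipments, customers)]:
    print(discover_join(left, right))
```

In this sketch, the first strategy that matches wins, so a pair is never reported with a lower confidence than the best strategy that applies to it. A consolidation step could then use the returned confidence to decide whether a join is trustworthy enough to recommend.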