schema detection
Schema detection (AI via Ollama)
AI-based schema detection is a must-have feature. Development is ongoing: Ollama is being integrated to detect schema types and suggest column mappings, with full functionality as the goal.
schema detection via Ollama
Schema Graph
Phase A (Schema Graph Construction) builds the Schema Graph, which represents join relationships and clusters of tables. The logical Schema Graph maps to the physical Consolidated Unified Views, created as transient database views.
Schema Limiting
Schema Limiting reduces the number of database tables passed to Arctic-Text2SQL-R1-7B for each query, avoiding context-window overflow. Phase 1 caps the maximum number of tables handed to the SQL generator.
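As a rough sketch of the limiting step (the function name, data shapes, and relevance scoring are illustrative assumptions, not the platform's actual API):

```python
def limit_schema(tables, relevance, max_tables=10):
    """Keep only the top-N most relevant tables so the serialized schema
    fits in the SQL generator's context window.

    `tables` maps table name -> DDL string; `relevance` maps table name ->
    score (e.g. from vector similarity against the user question).
    """
    ranked = sorted(tables, key=lambda t: relevance.get(t, 0.0), reverse=True)
    return {t: tables[t] for t in ranked[:max_tables]}
```

The cap trades recall for reliability: an occasionally missing table is preferable to truncated prompts that make the generator fail unpredictably.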
schema mapping
schema mapping suggestions
Schema Relationship Explorer
Schema Relationship Explorer is a feature of the User Schema Relationship Mapper used for data visualization. The schema consolidation mechanism is planned to be surfaced in the Schema Relationship Explorer UI for better transparency and customization.
Schema Selection Stage
The Schema Selection Stage was developed to improve table relevancy and joinability detection, enabling more reliable user queries through schema relationship discovery.
schema system
schema.sql
The DataLens platform backend includes schema.sql, its database schema file.
SchemaMapper
A SchemaMapper service is under development to automate column renaming and schema application; the current iteration addresses persistent mapping storage and automated-application issues. SchemaMapper uses StorageService to manage file storage when mapping uploaded file columns to standard schemas.
SchemaMapper Service
The Standard Schemas capability requires the SchemaMapper Service for AI-powered column mapping.
SchemaProfile
Pydantic model in agent_models.py representing schema profile data. Classification: value_object. Fields include domain_area, classification, sensitivity_flag, and persistence_type; domain_area and the other attributes are optional. Schema profiles relate to projects via the project_id field in the schema_profiles table, which holds profile data for the database schemas associated with each project. The ProjectGdprFlag data entity uses SchemaProfile to define schema-related GDPR flags.
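The field list above can be sketched as follows. This is a stdlib dataclass stand-in for illustration only; the actual model in agent_models.py is Pydantic, and the defaults shown are assumptions:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SchemaProfile:
    """Illustrative shape of the SchemaProfile value object.

    All attributes are optional per the description; defaults of None
    are an assumption.
    """
    domain_area: Optional[str] = None
    classification: Optional[str] = None
    sensitivity_flag: Optional[bool] = None
    persistence_type: Optional[str] = None
    project_id: Optional[int] = None  # links the profile to its project
```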
scope field
The project goal concept is represented by the scope field in the data model, though it is renamed "Project Goal" in the UI, including on the Project Creation Form. The scope field is required in the ProjectCreate Pydantic model by design and is always included in ProjectResponse. It replaces hardcoded text in backend/app/workers/catalog.py to generate more accurate file summaries. ProjectCreate validation enforces a hard minimum of 20 words on the field.
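The word-count rule can be sketched as a plain function (the real check lives in a Pydantic validator on ProjectCreate; the function name here is hypothetical, but the 20-word hard minimum matches the description):

```python
def validate_scope(scope: str, min_words: int = 20) -> str:
    """Enforce the hard minimum word count on the `scope` field.

    Raises ValueError when the scope is shorter than `min_words` words,
    mirroring how a Pydantic validator would reject the payload.
    """
    word_count = len(scope.split())
    if word_count < min_words:
        raise ValueError(
            f"scope must contain at least {min_words} words, got {word_count}"
        )
    return scope
```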
SCP
Data files are transferred to the elin GPU server using SCP before Docling extraction runs remotely.
scripts/migrate_duckdb_to_pg.py
scripts/migrate_duckdb_to_postgresql.py
Migration script for transferring existing data from DuckDB to PostgreSQL, run with a specific project ID and path; it completed during the migration process. The script reads Project 14 data from the DuckDB file in read-only mode to avoid conflicts, and writes the migrated data into PostgreSQL via PgDataService.
scripts/monitor-extraction.sh
The monitor-extraction.sh script monitors the extraction progress of the 132 files including queue size and extracted/pending counts.
scripts/reset-and-reextract.py
The reset-and-reextract.py script processes the 132 files for full reset and re-extraction of the SVGV dataset.
search
Performs vector similarity search in the project's Qdrant collection. The /api/v1/discovery endpoints include the search endpoint.
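Conceptually, the endpoint ranks stored embeddings by cosine similarity to the query embedding and returns the top matches. A pure-Python sketch of that ranking (the real implementation delegates to Qdrant; names here are illustrative):

```python
import math

def top_k_similar(query, vectors, k=3):
    """Rank stored vectors by cosine similarity to `query`.

    `vectors` maps point id -> embedding; returns (id, score) pairs,
    highest similarity first.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    scored = [(pid, cosine(query, vec)) for pid, vec in vectors.items()]
    scored.sort(key=lambda s: s[1], reverse=True)
    return scored[:k]
```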
SECRET_KEY environment variable
DataLens Platform requires setting a unique SECRET_KEY environment variable for session signing; the backend API reads its session-signing key from this variable.
section-based chunking
Semantic chunking based on document sections or slides is prioritized because it better preserves document structure. It detects headings and divides content at those boundaries, falling back to fixed-size chunks for efficiency.
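A minimal sketch of the heading-split-with-fallback strategy, assuming Markdown-style headings (the real extractors work on parsed document structure such as Docling output, not raw text):

```python
import re

def chunk_by_sections(text, max_chunk_chars=500):
    """Split text at Markdown-style headings, keeping each heading with
    its section; fall back to fixed-size slices for any section that
    exceeds max_chunk_chars.
    """
    # Lookahead split keeps the heading line attached to its section body.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        if len(section) <= max_chunk_chars:
            chunks.append(section)
        else:
            # Fallback: fixed-size slices for oversized sections.
            for i in range(0, len(section), max_chunk_chars):
                chunks.append(section[i:i + max_chunk_chars])
    return chunks
```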
SecurePass123!
Security Checklist
semantic chunking
Technique for dividing documents into meaningful segments for embedding and search, used by the RAG system to improve retrieval accuracy. Semantic chunking is enforced as a business rule in the Docling extraction system for section/slide-based document processing. The DOCX extractor implements it with section-based chunk boundaries and heading-hierarchy tracking to preserve document structure; the PPTX extractor chunks on slide units, splitting dense slides into sub-slide chunks.
Semantic Layer
WrenAI employs a semantic layer with YAML definitions encoding schema, metrics, joins, and governance rules.
Semantic Layer (MDL Models)
WrenAI implements a Semantic Layer using MDL Models, expressed as YAML model definitions, to define table schemas, metrics, joins, and governance rules.
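As a generic illustration of the kind of information such a model encodes (this is not WrenAI's actual MDL syntax; every key and name below is hypothetical):

```yaml
# Hypothetical semantic-layer model; field names are illustrative only.
model: orders
source: warehouse.public.orders
columns:
  - name: order_id
    type: integer
    primary_key: true
  - name: amount
    type: decimal
metrics:
  - name: total_revenue
    expression: sum(amount)
joins:
  - to: customers
    on: orders.customer_id = customers.id
governance:
  row_policy: region = current_user_region()
```

Centralizing schema, metric, and join definitions this way lets the text-to-SQL layer generate queries against governed, pre-agreed semantics rather than raw tables.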