Intent
Phase 4 Switch Query Pipeline
Phase 5 Test & Validate
Phase 6 Deploy
Phase 7 Cleanup
Phase A Schema Graph Construction
Phase A Schema Graph Construction implements the creation of the Schema Graph representing join relationships and clustering of tables.
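The idea above can be sketched with plain Python: build an undirected graph from join pairs and cluster tables into connected components. This is a minimal illustration, not the actual Phase A implementation; the function names and the component-based clustering rule are assumptions.

```python
from collections import defaultdict

def build_schema_graph(joins):
    """Build an undirected adjacency map from (table_a, table_b) join pairs."""
    graph = defaultdict(set)
    for a, b in joins:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def cluster_tables(graph):
    """Cluster tables into the connected components of the join graph."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(component)
    return clusters
```

A real schema graph would likely carry edge metadata (join columns, cardinality), but connected components already give a first-pass table clustering.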
Phase A: Batch upload and enhanced AI cataloging
Phase A includes the batch upload and enhanced AI cataloging as part of the batch upload pipeline implementation. The batch upload pipeline includes Phase A which is batch upload plus enhanced AI cataloging.
Phase B Intelligent Retrieval
Phase B Intelligent Retrieval implements the Query Enhancer, which extracts entities from queries and identifies the tables relevant to them. Phase B, which covers the smart auto-processing pipeline with Qdrant, is part of the smart processing UX model.
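A Query Enhancer of this kind can be sketched as keyword overlap between a question and a table catalog. This is purely illustrative: the catalog contents, function name, and scoring rule are assumptions, and the real Phase B presumably uses vector search over Qdrant rather than literal token matching.

```python
import re

# Hypothetical catalog: table name -> descriptive keywords (illustrative only).
CATALOG = {
    "salaries": {"salary", "pay", "wage"},
    "budgets": {"budget", "allocation", "spend"},
    "departments": {"department", "unit", "office"},
}

def enhance_query(question: str):
    """Extract candidate entities and rank tables by keyword overlap."""
    tokens = set(re.findall(r"[a-z]+", question.lower()))
    known = set().union(*CATALOG.values())
    scores = {t: len(kw & tokens) for t, kw in CATALOG.items()}
    relevant = [t for t, s in sorted(scores.items(), key=lambda kv: -kv[1]) if s]
    return {"entities": sorted(tokens & known), "tables": relevant}
```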
Phase C Integration
Phase C Integration modifies Backend Application Services to use consolidated views for query analysis within the DataLens System. Phase C, the unified question interface, is part of the smart processing UX model for DataLens.
Pipeline Architecture
The plan calls for testing the full pipeline from upload to query and insight generation.
Port Mapping
POSTGRES_DB environment variable
POSTGRES_PASSWORD environment variable
POSTGRES_USER environment variable
PostgreSQL Health Check
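The port mapping, POSTGRES_* environment variables, and health check above can be wired together in one container invocation. This is a sketch assuming the official postgres image's standard environment variables and its bundled pg_isready tool; the container name, credentials, and image tag are hypothetical.

```shell
# Illustrative only: POSTGRES_* env vars, port mapping, and a pg_isready
# health check for the official postgres image (values are hypothetical).
docker run -d --name datalens-postgres \
  -e POSTGRES_DB=datalens \
  -e POSTGRES_USER=datalens \
  -e POSTGRES_PASSWORD=change-me \
  --health-cmd="pg_isready -U datalens -d datalens" \
  --health-interval=10s \
  --health-retries=5 \
  -p 5432:5432 \
  postgres:16
```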
PostgreSQL metadata storage
Uses PostgreSQL for storing system metadata, with a production-ready deployment.
PostgreSQL MVCC
prepare-data
DataLens Agent Mode implements the prepare-data skill for cleaning and transforming datasets via SQL or Python operations.
PrepareDataSkill
PrepareDataSkill produces SkillResult during data cleaning and transformation executions. ExtractionCoordinator uses PrepareDataSkill to process data after extraction across CPU and GPU services.
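A minimal sketch of this shape follows. The source only says PrepareDataSkill produces a SkillResult, so the fields of SkillResult, the execute method, and the specific cleaning steps shown here are all assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class SkillResult:
    # Hypothetical shape: the source only says the skill "produces SkillResult".
    success: bool
    rows: list = field(default_factory=list)
    notes: str = ""

class PrepareDataSkill:
    """Sketch of a cleaning/transformation skill (method names are assumed)."""
    def execute(self, rows):
        cleaned = []
        for row in rows:
            # Drop records whose fields are all empty.
            if not any(v not in (None, "") for v in row.values()):
                continue
            # Normalize whitespace in string fields.
            cleaned.append({k: v.strip() if isinstance(v, str) else v
                            for k, v in row.items()})
        return SkillResult(success=True, rows=cleaned,
                           notes=f"kept {len(cleaned)}/{len(rows)} rows")
```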
prioritize worker
Background workers include the prioritize worker as a component. The batch processor orchestrator uses the prioritize worker to assign processing tiers.
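Tier assignment of this kind might look like the following. The thresholds, tier names, and the priority-org shortcut are invented for illustration; the source only states that the prioritize worker assigns processing tiers.

```python
def assign_tier(file_size_bytes: int, is_priority_org: bool) -> str:
    """Hypothetical tiering rule: priority orgs and small files get the
    fast lane; everything else is tiered by size."""
    if is_priority_org or file_size_bytes < 5 * 1024 * 1024:      # < 5 MB
        return "fast"
    if file_size_bytes < 100 * 1024 * 1024:                       # < 100 MB
        return "standard"
    return "bulk"
```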
Private Model Backend
DataLens Agent Mode supports a Private Model Backend using Ollama self-hosted LLMs for GDPR-compliant inference.
process_message
The process_message function is expected to call the _run_query function to execute queries, but a current data-flow problem stops execution before _run_query is reached. In agent_skills.py, the process_message() function calls _run_query asynchronously to generate query results.
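The intended call path can be sketched with asyncio. The message shape and the _run_query body below are stand-ins, not the real agent_skills.py code; the point is only that process_message awaits _run_query rather than returning early.

```python
import asyncio

async def _run_query(sql: str) -> list:
    # Stand-in for the real query executor in agent_skills.py.
    await asyncio.sleep(0)  # simulate async I/O
    return [{"sql": sql, "rows": 0}]

async def process_message(message: dict) -> list:
    # The data-flow bug described above would bail out somewhere before
    # this point; the intended flow is to await _run_query with the
    # extracted SQL.
    sql = message.get("sql")
    if not sql:
        return []
    return await _run_query(sql)
```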
Production-Ready Infrastructure
Production-Ready Infrastructure relies on DuckDB for analytics data storage with read-only connections and timeouts. Production-Ready Infrastructure integrates with Qdrant for vector storage. Production-Ready Infrastructure uses Ollama for GPU-accelerated LLM inference and embeddings.
Progress Indicator Fix
Corrected progress tracking by removing faulty ORM import, enabling accurate real-time display of catalog, extraction, and vectorization statuses.
Project
Represents a data analysis project involving internal stakeholders with high influence. Data models such as StandardSalaryRecord, StandardHealthRecord, StandardFinancialTransaction, StandardGeographicData, and StandardBudgetRecord structure the data used within projects. The Project entity is linked to physical tables such as FileUpload, Query, Insight, and ProcessingJob, which manage project files, executed queries, insights, and background tasks respectively. Each Project's data is stored in a dedicated DuckDB file (e.g., project_4.duckdb) managed by DuckDBService, and each Project's semantic data is stored in a dedicated Qdrant collection used by QdrantService. Users interact with Project data via the API, querying and managing project-specific information. The Project physical table is associated with multiple FileUpload records representing the files uploaded to the project. The PostgreSQL database stores project metadata such as org_id and the created_by user, and the Query data entity references the Project entity by project_id.
Project 4
SVGV Budget Analysis is Project 4, deployed on the platform for batch extraction and analysis.
Project 9 with SVGV scope
Comprehensive plan to build a multi-tenant data platform for municipal finance data, leveraging AI extraction and structured schemas.
Project context in summary generation
Project context in summary generation is required by the DSStarService file cataloging workflow to produce contextually relevant AI summaries.
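Injecting project context into summary generation might look like the following prompt-assembly sketch. The function name, parameters, and prompt wording are all hypothetical; the source only states that DSStarService needs project context to produce contextually relevant AI summaries.

```python
def build_summary_prompt(file_name: str, columns: list, project_goal: str) -> str:
    """Hypothetical prompt assembly: inject the project's goal so the AI
    summary is relevant to the project rather than generic."""
    return (
        f"Project goal: {project_goal}\n"
        f"File: {file_name}\n"
        f"Columns: {', '.join(columns)}\n"
        "Summarize what this file contributes to the project goal."
    )
```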
Project Goal Feature
A key epic for setting project scope, priorities, and goals, under development and testing.
Project goal field
Project goal generation
Phase 2 requirement: automated creation of project-specific goals, validated and integrated into the platform, enabling targeted data analysis.