Project: datalens
81 entity types
Intent

495 entities found

BusinessRuleIntent

Docling MANDATORY

The GPU-first extraction system must use Docling for docx and pptx files, with no fallback to Python tools. Extraction fails hard on errors so that processing quality stays consistent. The rule is enforced in the recent implementation.
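A minimal sketch of the "no fallback, fail hard" rule may make the intent concrete. It assumes the Docling call is wrapped in a callable passed in as docling_convert; that callable, ExtractionError, and extract_document are hypothetical names for illustration, not identifiers from the DataLens codebase.

```python
from pathlib import Path
from typing import Callable

# Extensions that, per the rule above, must go through Docling.
DOCLING_ONLY = {".docx", ".pptx"}


class ExtractionError(RuntimeError):
    """Raised when a mandatory Docling conversion fails (hypothetical type)."""


def extract_document(path: str, docling_convert: Callable[[str], str]) -> str:
    """Convert a docx/pptx file with Docling, or fail; never try another tool.

    `docling_convert` is an assumed wrapper around the real Docling call.
    """
    suffix = Path(path).suffix.lower()
    if suffix not in DOCLING_ONLY:
        raise ValueError(f"{path}: not covered by the Docling-only rule")
    try:
        return docling_convert(path)
    except Exception as exc:
        # Fail hard: surface the error instead of falling back to a
        # Python-only parser.
        raise ExtractionError(f"Docling failed for {path}") from exc
```

The point of the design is that a failed conversion stops the run with the original error attached, rather than silently producing lower-quality output from a second parser.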

BusinessProcessIntent

docs/DISCOVERY_WORKFLOW.md file

Workflow documentation describing the data discovery, consolidation, and analysis process.

EpicIntent

Document RAG

Integrates LlamaIndex and Qdrant, using bge-large-en-v1.5 embeddings, for semantic document chunking, embedding, and retrieval in the DataLens platform, leveraging RAGAgent for unstructured text search and answer synthesis.

BusinessProcessIntent

Documentation

Complete documentation exists with guides detailing the system architecture and workflows.

EpicIntent

DS-STAR

DataLens uses DS-STAR as part of its integrated architecture for AI cataloging and extraction. DS-STAR reasoning and queries draw on the rich metadata (such as hierarchy and provenance) and the embeddings produced by the Docling extraction system for advanced AI cataloging and analysis. The DataLens platform backend includes the DS-STAR pipeline and uses it for various processing tasks. The pipeline includes the FileAnalyzer, PlannerAgent, VerifierAgent, and RouterAgent components. The Platform Backend uses DS-STAR subprocess calls to implement cataloging, extraction, and SQL generation features. DS-STAR Intelligence is encompassed within the broader DS-STAR System epic. DataLens incorporates the DS-STAR autonomous extraction pattern as a built-in integration for the SVGV project data processing.

RequirementIntent

DS-STAR AI cataloging

DS-STAR AI cataloging is integrated via the DS-STAR Agent API running on elin and proxied on theo. The DS-STAR AI cataloging system uses Text-to-SQL with Ollama for natural language query translation. The cataloging functionality uses 12 DS-STAR agents running on elin.

CapabilityIntent

DS-STAR autonomous extraction

The AI Core capability requires the DS-STAR autonomous extraction capability. The AI Core includes DS-STAR autonomous extraction components such as PlannerAgent, VerifierAgent, RouterAgent, and Orchestrator. The DS-STAR Intelligence Layer realizes the Autonomous Data Extraction capability with local LLMs and iterative quality refinement.

CapabilityIntent

DS-STAR Autonomous Extraction Pattern

BusinessRuleIntent

DS-STAR cataloging (12 agents)

EpicIntent

DS-STAR Intelligence

Includes components like PlannerAgent, VerifierAgent, and RouterAgent within the DS-STAR system, implementing autonomous file analysis, data validation, and query generation on GPU infrastructure and DuckDB, involving iterative extraction and AI-driven quality assessment.

BusinessProcessIntent

DS-STAR loop orchestrator

The DS-STAR Orchestrator comprises PlannerAgent, VerifierAgent, and RouterAgent for autonomous data extraction. It processes CSV, Excel, and PDF files and implements an iterative refinement loop during extraction. The DS-STAR Intelligence capability contains the DS-STAR loop orchestrator, which coordinates the agent workflow loop for iterative extraction improvement.

UseCaseIntent

DS-STAR planning loop

Objective: Implement an autonomous cycle that fixes extraction errors via multi-step reasoning, quality assessment, and iterative refinement to improve data extraction quality.
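The cycle described in this objective can be sketched as a generic plan, execute, verify loop. This is an illustration under stated assumptions, not the DS-STAR implementation: the plan, execute, and verify callables, the Verdict type, and the max_rounds limit are all hypothetical stand-ins for the planner, extractor, and verifier roles.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    ok: bool        # True when the extraction passes the quality check
    feedback: str   # what the next attempt should fix


def refine(
    plan: Callable[[Optional[str]], str],
    execute: Callable[[str], object],
    verify: Callable[[object], Verdict],
    max_rounds: int = 3,
) -> tuple[object, int]:
    """Plan, execute, and verify; feed verifier feedback into the next plan.

    Returns the accepted result and the round in which it was accepted.
    """
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        step = plan(feedback)        # planner role: decide how to extract
        result = execute(step)       # run the extraction
        verdict = verify(result)     # verifier role: assess quality
        if verdict.ok:
            return result, round_no
        feedback = verdict.feedback  # carry the critique into the next round
    raise RuntimeError(f"no acceptable extraction after {max_rounds} rounds")
```

Bounding the number of rounds keeps a file that never passes verification from looping forever; the limit of 3 is an arbitrary placeholder.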

RequirementIntent

DSSTAR_AGENTS_DIR environment variable

RequirementIntent

DSSTAR_VENV_PYTHON environment variable
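
Neither DSSTAR_AGENTS_DIR nor DSSTAR_VENV_PYTHON is described in this catalog. Read from their names, and from the statement elsewhere that the Platform Backend uses DS-STAR subprocess calls, they plausibly point at the directory holding the agent scripts and at the Python interpreter of the DS-STAR virtual environment. The sketch below shows how such a pair could be turned into a subprocess command under that assumption; build_agent_command and the script name in the test are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Sequence


def build_agent_command(
    script: str,
    args: Sequence[str] = (),
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Build the argv for running an agent script with the venv interpreter.

    Assumes DSSTAR_VENV_PYTHON is an interpreter path and DSSTAR_AGENTS_DIR
    is the directory containing `script`.
    """
    try:
        python = env["DSSTAR_VENV_PYTHON"]
        agents_dir = env["DSSTAR_AGENTS_DIR"]
    except KeyError as exc:
        # Report the missing variable by name instead of failing later.
        raise RuntimeError(f"missing environment variable: {exc.args[0]}") from exc
    return [python, str(Path(agents_dir) / script), *args]
```

The returned list can be passed to subprocess.run(cmd, check=True, capture_output=True, text=True), which raises on a non-zero exit code.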

BusinessProcessIntent

DSStarService

DSStarService depends on BatchProcessor to orchestrate DS-STAR Agent API workflows. DataLens Development uses DSStarService as an HTTP client to integrate with the DS-STAR Agent API for AI cataloging and extraction. The DSStarService file cataloging workflow requires project context in summary generation to produce contextually relevant AI summaries. DSStarService integrates with DiscoveryService as a client for the DS-STAR Agent API on elin.

RequirementIntent

Dual-write then cutover

RequirementIntent

DuckDB extraction

Employs DuckDB for data extraction and schema auto-discovery, though initially marked as a technical requirement not yet fully met.

CapabilityIntent

DuckDB schema auto-discovery

The Text-to-SQL capability in the DataLens Project requires DuckDB schema auto-discovery.
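
One way to see why Text-to-SQL depends on schema discovery: the model needs the table and column names before it can write a query. The sketch below assumes the schema has already been read as (table, column, type) rows, for example from DuckDB's information_schema.columns view, and renders it as text for a prompt. The function name schema_prompt and the CREATE TABLE rendering are illustrative choices, not the DataLens implementation.

```python
from collections import defaultdict
from typing import Iterable


def schema_prompt(rows: Iterable[tuple[str, str, str]]) -> str:
    """Render (table, column, type) rows as CREATE TABLE-style text."""
    tables: dict[str, list[str]] = defaultdict(list)
    for table, column, dtype in rows:
        tables[table].append(f"  {column} {dtype}")
    # One statement per table, in the order tables were first seen.
    return "\n".join(
        f"CREATE TABLE {name} (\n" + ",\n".join(cols) + "\n);"
        for name, cols in tables.items()
    )
```

The resulting text would be placed ahead of the user's question in the prompt sent to the SQL generation model.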

RequirementIntent

DuckDB Text-to-SQL strategies

UserStoryIntent

DuckDB-NSQL-7B

The DuckDB-NSQL-7B model is planned as a future upgrade for faster and more efficient SQL generation. SQLCoder-7B currently handles SQL generation.

CapabilityIntent

DuckDBService class

The DuckDBService class is defined within backend/app/services/duckdb_service.py. DuckDBService class uses the project_4.duckdb file to store and query project-specific data. PgDataService replaces DuckDBService for data storage and querying of extracted data. The new pg_data_service.py file provides PostgreSQL data management functionality replacing DuckDBService. The DATA_BACKEND feature flag can toggle between using PostgreSQL (PgDataService) and DuckDBService as the data backend. PgDataService replaced DuckDBService in extract.py and question_router.py to switch backend from DuckDB to PostgreSQL. The execute_query method is part of the DuckDB Service.
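
The DATA_BACKEND toggle can be pictured as a small factory that picks a service class from the environment. This is a sketch under assumptions: the flag values "postgres" and "duckdb", the default, and the idea that both services expose execute_query are guesses for illustration, and the two classes here are empty stand-ins for the real ones.

```python
import os
from typing import Mapping, Protocol


class DataService(Protocol):
    def execute_query(self, sql: str) -> list: ...


class DuckDBService:
    """Stand-in for the DuckDB-backed service."""

    def execute_query(self, sql: str) -> list:
        raise NotImplementedError("stand-in only")


class PgDataService:
    """Stand-in for the PostgreSQL-backed service."""

    def execute_query(self, sql: str) -> list:
        raise NotImplementedError("stand-in only")


def get_data_service(env: Mapping[str, str] = os.environ) -> DataService:
    """Select the data backend from DATA_BACKEND (values are assumed)."""
    backend = env.get("DATA_BACKEND", "postgres").lower()
    if backend == "postgres":
        return PgDataService()
    if backend == "duckdb":
        return DuckDBService()
    # Reject unknown values rather than silently picking a backend.
    raise ValueError(f"unsupported DATA_BACKEND: {backend!r}")
```

Callers such as extract.py and question_router.py would then depend only on the shared interface, so switching backends is a configuration change rather than a code change.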

RequirementIntent

E2E workflow test

DataLens Development realizes a complete end-to-end flow that covers file upload, data extraction, SQL query execution, and result export.

CapabilityIntent

Efficiency Analyzer

Efficiency Analyzer is a DS-STAR agent focused on operational and efficiency metrics analysis. Agent Selector directs queries to the Efficiency Analyzer agent.

RequirementIntent

Elin environment variables

The Anthropic API key must be set in the elin environment for the OpenClaw Gateway to authenticate API calls to Anthropic. The Anthropic API key was added to the elin environment and resulted in successful OpenClaw Gateway authentication and Claude response.

BusinessProcessIntent

EmbeddingService

EmbeddingService produces embeddings used by QdrantService for semantic search and vector collections. Document RAG uses embedding models for semantic vector generation, optionally on CPU or Ollama GPU.

RequirementIntent

Environment Variables

The ANTHROPIC_API_KEY must be set in the Coolify environment to allow Claude to respond and to prevent analysis request timeouts. The User is expected to provide the Anthropic API key. The OpenClaw Gateway requires the key to authenticate calls to Claude for query processing. The key must also be set in the Elin environment variables for the OpenClaw Gateway to use it.
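
Because a missing key shows up only later as a request timeout, a startup check can make the failure visible immediately. The sketch below is one possible form of such a check; require_anthropic_key is a hypothetical helper, and only the variable name ANTHROPIC_API_KEY comes from the catalog.

```python
import os
from typing import Mapping


def require_anthropic_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key, or fail at startup with a clear message."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; set it in the deployment environment"
        )
    return key
```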

RequirementIntent

Excel Extraction

Excel extractor component implements the Excel Extraction requirement.

BusinessProcessIntent

Excel files

Phase 2 Strategy depends on the prior processing status and results of Excel files from Phase 1. Budget questions are partially answerable from Excel files available after Phase 1 processing. Phase 2 Strategy Research & Decision Point uses data from Excel files processed in Phase 1 as context for decision making. Extractor Agents handle Excel files for data normalization and extraction.

CapabilityIntent

Excel Worker AI Agents

DataLens implements production-ready autonomous agents approximating the functionality of Excel Worker AI Agents like LangGraph and LangChain. Excel Worker AI Agents are prototype implementations following the ReAct pattern, while DataLens offers a production-ready equivalent solution.

StakeholderIntent

Exerun

The Admin User belongs to the organization Exerun.