Project: datalens
81 entity types
All Domains

1587 entities found

Page / User Interface

RESULTS_VISUALIZATION_TEST_REPORT.md

Test report for visualization components, verifying functionality and performance.

BusinessRule / Intent

Rich metadata

The Docling extraction system enforces, as a business rule, that extracted content carries rich metadata: hierarchy, confidence, and provenance. DS-STAR reasoning consumes this metadata for advanced AI cataloging and analysis.
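A minimal sketch of what such chunk-level metadata might look like; the field names below are illustrative assumptions, not the actual DataLens schema:

```python
from dataclasses import dataclass, field

@dataclass
class ChunkMetadata:
    """Hypothetical shape of the rich metadata attached to each
    extracted chunk (names are illustrative, not the real schema)."""
    hierarchy: list[str]   # e.g. ["Doc Title", "Section 2", "Subsection 2.1"]
    confidence: float      # extraction confidence in [0.0, 1.0]
    provenance: dict = field(default_factory=dict)  # source file, page, ...

chunk = ChunkMetadata(
    hierarchy=["Report", "Results"],
    confidence=0.97,
    provenance={"file": "report.docx", "page": 4},
)
```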

BusinessProcess / Intent

RingfencedSkills

RingfencedSkills replace raw SQL skills with constrained operations for the DataLens agent. SkillExecutor invokes them when executing agent skills, and ElinSkillClient carries out the ringfenced executions on elin.
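A minimal sketch of the ringfencing idea: the agent may only invoke a fixed allowlist of named operations instead of arbitrary SQL. The allowlist contents and the client's run() method are assumptions; the real SkillExecutor/ElinSkillClient interfaces are not documented here:

```python
# Hypothetical allowlist of constrained skills; anything outside it is refused.
ALLOWED_SKILLS = {
    "profile_table": lambda client, table: client.run("profile", table=table),
    "sample_rows":   lambda client, table: client.run("sample", table=table, limit=10),
}

class SkillExecutor:
    def __init__(self, client):
        self.client = client  # e.g. an ElinSkillClient instance

    def execute(self, skill_name: str, **kwargs):
        if skill_name not in ALLOWED_SKILLS:
            raise PermissionError(f"skill {skill_name!r} is not ringfenced")
        return ALLOWED_SKILLS[skill_name](self.client, **kwargs)
```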

ContingencyPlan / Operations

Rollback Plan

DesignDecision / Architecture

Router

The Router agent manages fixes and extensions to the extraction plan.

DesignDecision / Architecture

Router Architecture

DataLens Development applies the Router Architecture for modular endpoint routing and domain separation.

ThirdPartyComponent / Architecture

RouterAgent

RouterAgent is a component of the DS-STAR Intelligence layer and its pipeline. It manages the iteration loop in the extraction plan, using output from the Verifier Agent to decide among FIX (repair a step), ADD (extend the plan), and PROCEED. The DS-STAR Orchestrator relies on RouterAgent for this decision logic.
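A minimal sketch of that decision rule, assuming the Verifier Agent's report is a plain dict; the real RouterAgent presumably drives this with an LLM, and the field names here are illustrative:

```python
from enum import Enum

class RouterDecision(Enum):
    FIX = "fix"          # repair the failing plan step
    ADD = "add"          # extend the plan with a new step
    PROCEED = "proceed"  # plan is sufficient; execute it

def route(verifier_report: dict) -> RouterDecision:
    # Hypothetical rule: fix broken steps first, then extend the plan
    # until the Verifier reports the goal as satisfied.
    if verifier_report.get("step_failed"):
        return RouterDecision.FIX
    if not verifier_report.get("goal_satisfied"):
        return RouterDecision.ADD
    return RouterDecision.PROCEED
```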

UIComponent / User Interface

Row-Level Security

Vanna 2.0 enforces row-level security by filtering queries according to each user's permissions.
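A sketch of row-level security as query filtering: wrap the generated SQL so rows outside the user's permitted scope are never returned. The region column and permission model are illustrative assumptions, not Vanna's actual mechanism:

```python
def apply_row_level_security(sql: str, user_regions: list[str]) -> str:
    # Wrap the untrusted/generated query and filter it by the caller's
    # permitted regions (hypothetical permission model).
    placeholders = ", ".join(f"'{r}'" for r in user_regions)
    return f"SELECT * FROM ({sql}) AS q WHERE q.region IN ({placeholders})"

print(apply_row_level_security("SELECT * FROM sales", ["EMEA", "APAC"]))
```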

ThirdPartyComponent / Architecture

RQ

RQ is the background job management tool used for tasks such as schema profiling, session warming, batch uploads, AI summary generation, and vectorization. The Batch Upload process enqueues extract_file_job() on the RQ job queue, with a timeout set for large files, for reliable background execution. AI Summary Generation runs asynchronously on the queue so that file-list retrieval does not block HTTP responses. Vectorize Progress Tracking queries the queue and chunk counts to report accurate vectorization-progress percentages for embedding jobs. The RQ worker for async job processing consumes these jobs to generate AI summaries and process embeddings, and background workers use RQ job chaining to coordinate sequential processing tasks.
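A sketch of the enqueue side using RQ's real API; the import path, file_id, and the 30-minute timeout are assumptions, not the project's actual settings:

```python
from redis import Redis
from rq import Queue

from app.jobs import extract_file_job  # assumed import path

queue = Queue("extraction", connection=Redis())

# job_timeout is a standard RQ parameter; the value here is a guess
# at a "large file" budget.
job = queue.enqueue(extract_file_job, file_id=42, job_timeout="30m")
print(job.id, job.get_status())
```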

ThirdPartyComponent / Architecture

rq

RQ depends on Redis as the message broker for job queueing.

BatchJob / Integrations

RQ job queue

The AI Summary Generation feature uses the RQ job queue for asynchronous summary generation after extraction completes. The queue is managed from backend/app/api/files.py and depends on the Redis service for asynchronous job scheduling and processing.
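Since summaries must run only after extraction completes, RQ's built-in depends_on chaining fits naturally. A sketch, with assumed job functions and import paths:

```python
from redis import Redis
from rq import Queue

from app.jobs import extract_file_job, generate_summary_job  # assumed paths

queue = Queue("extraction", connection=Redis())

# depends_on is RQ's native chaining mechanism: the summary job only
# runs once the extraction job has finished successfully.
extraction = queue.enqueue(extract_file_job, file_id=42)
summary = queue.enqueue(generate_summary_job, file_id=42, depends_on=extraction)
```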

BatchJob / Integrations

RQ job queue for async summary generation

BatchJob / Integrations

RQ Queue

Redis provides the backend queue for the RQ worker in the extraction pipeline. The RQ worker listens to the RQ queue to process extraction jobs.

Integration / Integrations

RQ Queue with Redis backend

Uses Redis to manage background jobs such as file extraction, summaries, and vectorization in DataLens. The extraction pipeline and the Backend container depend on the RQ queue for batch extraction job management; the RQ Worker consumes extraction jobs from it, and the Backend orchestrates batch extraction through it. The RQ extraction queue on Redis triggers the extract_file_job(file_id) function to process file extraction jobs for summaries. The Docker-compose configuration depends on this queue, and a misconfiguration of it once caused job-processing issues. The Batch Processing Strategy relies on the RQ job queue for job management and reliability.

SecurityConstraint / Security

RQ serialization

A serialization challenge was addressed by replacing RQ jobs with subprocess calls for GPU extraction.
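A sketch of that workaround: rather than pickling a GPU extraction job into RQ, spawn the extractor as a separate process. The script name, flags, and timeout below are assumptions:

```python
import subprocess
import sys

# Run the (hypothetical) GPU extractor out-of-process so nothing has
# to be serialized into the RQ job payload.
result = subprocess.run(
    [sys.executable, "extract_gpu.py", "--file-id", "42"],
    capture_output=True, text=True, timeout=1800,
)
if result.returncode != 0:
    raise RuntimeError(f"GPU extraction failed: {result.stderr}")
```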

Server / Operations

RQ worker

DataLens depends on the RQ worker to process queued extraction jobs, including the 132 SVGV dataset files, with Redis as the queue backend. The Backend API interacts with the worker to manage extraction job queues and status for SVGV files, and the worker executes the extract_file_job function for each file. The worker container runs the RQ worker instance and sits idle waiting for jobs after the reset. The FastAPI backend also uses the worker for asynchronous jobs such as embeddings and summary generation.

Server / Operations

RQ Worker

The worker process is correctly configured and processing extraction jobs; initial queueing issues were fixed by using the proper function name, and the worker is now actively handling the 132 SVGV files for re-extraction (idle again after completing them). RQ Worker extraction processing depends on Redis RQ job queuing and coordinates extraction tasks through Backend API endpoints, calling the Extraction API for each SVGV file; the SVGV Full Reset process relies on it to handle jobs after files and schema are reset. The Data Discovery system and the Extraction Pipeline depend on the worker to process extraction and consolidation jobs asynchronously, and the Backend delegates asynchronous tasks such as extraction and AI summary generation to it. The worker uses Docling as the exclusive extraction method for DOCX and PPTX files and fails on any Docling error, with fallback extraction methods prohibited. The Worker container hosts the worker process, which listens on the RQ queue; the extraction queue fix requires the worker to be running and enabled all 132 extraction jobs to complete.

Server / Operations

RQ worker for async job processing

The RQ worker for async job processing consumes jobs from the RQ job queue to generate AI summaries and process embeddings asynchronously. It is part of the DataLens backend infrastructure, runs on the theo server, and its deployment and availability depend on the Coolify deployment platform's configuration. The Extraction Pipeline depends on it to process the extraction queue asynchronously, and the Backend configures it to listen on the 'extraction' queue.
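The worker side in RQ's real API, equivalent to running `rq worker extraction` from a shell; the Redis host details are assumptions:

```python
from redis import Redis
from rq import Queue, Worker

# A worker process that consumes jobs from the 'extraction' queue.
redis_conn = Redis(host="localhost", port=6379)
worker = Worker([Queue("extraction", connection=redis_conn)],
                connection=redis_conn)
worker.work()  # blocks, processing jobs until stopped
```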

BatchJob / Integrations

RQ Worker service

The RQ Worker service depends on the service definition in docker-compose.coolify.yml for backend container deployment and job processing.

Server / Operations

RQ worker service on backend server theo

A background worker, currently in progress, that will handle file extraction and processing tasks.

PasswordPolicy / Security

RQ workers

The DataLens Platform plans to integrate RQ background jobs for asynchronous cataloging and extraction in future iterations. RQ workers depend on Redis for job-queue management and consume extraction jobs from the RQ queue; the extract.py worker runs as one of them. The workers depend on the Docling extraction system for DOCX/PPTX extraction jobs, with no fallback tolerated if Docling fails, and backend processing relies on them to execute background extraction and embedding jobs.

Server / Operations

RTX 4000 SFF Ada

DataLens performs GPU-accelerated workloads on elin using the RTX 4000 SFF Ada 20GB GPU. GPU-first document extraction and vectorization run on this card, and Docling-based extraction leverages it for DOCX and PPTX processing. Ollama runs on the same GPU to provide embedding services for batches of document chunks. Because the hardware is shared, the DataLens agent is constrained to a maximum GPU memory utilization of 0.5, and resource-management policies require monitoring the card's use across extraction and embedding tasks.
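One way to honor the 0.5 memory-utilization cap from a Python process, using PyTorch's per-process memory fraction; whether DataLens uses this exact call is an assumption:

```python
import torch

# Cap this process's share of the shared 20 GB card at 50%.
if torch.cuda.is_available():
    torch.cuda.set_per_process_memory_fraction(0.5, device=0)
```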

AgentIdentity / Agentic Discipline

RunSqlTool

A tool used within the agent to execute SQL queries on DuckDB for structured data retrieval.
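A minimal sketch of what such a tool call boils down to, using DuckDB's Python API; the analytics.db filename matches the catalog, while the sales columns queried are illustrative:

```python
import duckdb

def run_sql(sql: str):
    # Execute a query against the analytics.db DuckDB file and
    # return the result rows.
    conn = duckdb.connect("analytics.db", read_only=True)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()

rows = run_sql("SELECT region, SUM(amount) FROM sales GROUP BY region")
```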

Requirement / Intent

Safety net cleanup

Safety-net cleanup strengthens the SQL-extraction regex fix by stripping explanation-text markers after extraction, ensuring that only pure SQL reaches execution. It was deployed on theo.
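A sketch of that cleanup step; the exact markers DataLens strips are assumptions, so the patterns below (markdown fences, "Explanation:"/"Note:" lines) are illustrative:

```python
import re

def strip_explanation_markers(text: str) -> str:
    # Drop leftover markdown code fences around the extracted SQL.
    text = re.sub(r"```(?:sql)?", "", text)
    # Drop prose marker lines the model may have appended.
    text = re.sub(r"(?im)^\s*(explanation|note):.*$", "", text)
    return text.strip().rstrip(";") + ";"

print(strip_explanation_markers("```sql\nSELECT 1;\n```\nExplanation: trivial"))
# -> SELECT 1;
```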

PhysicalTable / Data Model

sales table

The sales table is stored in DuckDB (analytics.db).

PhysicalTable / Data Model

sample_sales table

The sample_sales table is stored as a physical table in DuckDB (analytics.db).

Entity

sample_sales.csv

Entity

Scaling Considerations

DataEntity / Data Model

Scandinavian budget data schema

Integration / Integrations

Schema API endpoints

API endpoints for schema detection and mapping are planned as part of ongoing development improvements, to enable optional, AI-assisted schema assignment.
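Since the feature is only planned, here is a heavily hedged sketch of what such endpoints could look like in the FastAPI backend; the routes, models, and response shapes are all hypothetical:

```python
from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter(prefix="/schema")

class SchemaMapping(BaseModel):
    file_id: int
    table_name: str
    column_map: dict[str, str]  # source column -> target column

@router.post("/detect/{file_id}")
def detect_schema(file_id: int) -> dict:
    # Would queue AI-assisted schema detection for the file.
    return {"file_id": file_id, "status": "detection queued"}

@router.post("/map")
def assign_schema(mapping: SchemaMapping) -> dict:
    # Would persist the (optionally AI-suggested) schema assignment.
    return {"table": mapping.table_name, "status": "mapped"}
```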