Project: datalens
81 entity types

Intent

495 entities found

CapabilityIntent

Full Findings Visualization Layer

The Full Findings Visualization Layer is implemented by the findings_generator.py service, which analyzes query results and produces structured Finding objects covering metrics, trends, and outliers. The UI must support 7 filter categories for filtering findings. The layer comprises four Svelte UI components: MetricCard.svelte (KPI display with statistics), TrendChart.svelte (SVG line charts with grid), OutlierHighlight.svelte (anomaly detection displays), and FindingsPanelNew.svelte (the master panel with filtering capabilities). Integration with agent.py, so that findings are returned on each query, is planned.
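
As a rough illustration of what "structured Finding objects" could look like, here is a minimal sketch. The field names, the `generate_findings` and `filter_findings` helpers, and the two-standard-deviation outlier rule are all assumptions made for this example; the description above does not specify the real schema or name the 7 filter categories, so the sketch filters by finding kind only.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Finding:
    # Hypothetical shape; the real Finding schema is not given in the source.
    kind: str  # "metric", "trend", or "outlier"
    title: str
    payload: dict[str, Any] = field(default_factory=dict)


def generate_findings(rows: list[dict[str, float]], column: str) -> list[Finding]:
    """Derive one metric, one trend, and any outliers from a numeric column."""
    values = [row[column] for row in rows]
    if not values:
        return []
    mean = sum(values) / len(values)
    findings = [Finding("metric", f"Mean of {column}", {"mean": mean, "count": len(values)})]
    if len(values) >= 2:
        change = values[-1] - values[0]
        direction = "up" if change > 0 else "down" if change < 0 else "flat"
        findings.append(Finding("trend", f"Trend of {column}", {"direction": direction, "change": change}))
    # Assumed rule: flag values more than 2 population standard deviations from the mean.
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    for index, value in enumerate(values):
        if std > 0 and abs(value - mean) > 2 * std:
            findings.append(Finding("outlier", f"Outlier in {column}", {"index": index, "value": value}))
    return findings


def filter_findings(findings: list[Finding], kinds: set[str]) -> list[Finding]:
    """Keep only findings whose kind is in the selected set."""
    return [finding for finding in findings if finding.kind in kinds]
```

A filter panel could then call `filter_findings(findings, {"outlier"})` to show anomalies only.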

BusinessRuleIntent

GDPR PII detection

GDPR PII detection results are recorded in the project_gdpr_flags table to track personal data sensitivity.

BusinessProcessIntent

GdprDetector

GdprDetector uses StorageService to scan project data for GDPR-relevant personal data indicators.
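
The scan could work along the following lines. This is a sketch under stated assumptions: the hint list, the email pattern, the `scan_columns` helper, and the shape of the returned flag rows are hypothetical, and the `columns` dictionary stands in for whatever StorageService actually returns. The real columns of project_gdpr_flags are not described in the source.

```python
import re

# Assumed indicators; the real detector's rules are not specified in the source.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
NAME_HINTS = ("name", "email", "phone", "address", "cpr", "ssn")


def scan_columns(table: str, columns: dict[str, list[str]]) -> list[dict]:
    """Return one flag row per column that looks like personal data.

    `columns` maps a column name to a few sample values.
    """
    flags = []
    for column, samples in columns.items():
        reasons = []
        if any(hint in column.lower() for hint in NAME_HINTS):
            reasons.append("column_name")
        if any(EMAIL_RE.search(str(value)) for value in samples):
            reasons.append("email_value")
        if reasons:
            # Hypothetical row shape for a table such as project_gdpr_flags.
            flags.append({"table": table, "column": column, "reasons": reasons})
    return flags
```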

CapabilityIntent

GenBI Insights

WrenAI generates AI summaries and charts as part of its generated insights.

CapabilityIntent

Generate Insights

Generate Insights is the process of automatically creating summaries, charts, or insights from data; it is part of the system's output capabilities.

RequirementIntent

generate-report

DataLens Agent Mode includes the generate-report skill to compile findings into structured markdown reports.

BusinessProcessIntent

GenerateReportSkill

GenerateReportSkill uses SkillResult to compile findings into structured reports.
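
A minimal sketch of how findings might be compiled into a markdown report. The `SkillResult` fields, the `generate_report` function, and the `title`/`summary` keys on each finding are assumptions for illustration; the source names SkillResult but does not describe its structure.

```python
from dataclasses import dataclass


@dataclass
class SkillResult:
    # Hypothetical fields; the real SkillResult is not described in the source.
    success: bool
    output: str


def generate_report(title: str, findings: list[dict[str, str]]) -> SkillResult:
    """Render findings as a markdown document with one numbered section each."""
    if not findings:
        return SkillResult(False, "")
    lines = [f"# {title}", ""]
    for number, finding in enumerate(findings, 1):
        lines.append(f"## {number}. {finding['title']}")
        lines.append("")
        lines.append(finding["summary"])
        lines.append("")
    # Normalize to exactly one trailing newline.
    return SkillResult(True, "\n".join(lines).rstrip() + "\n")
```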

EpicIntent

Google's DS-STAR Paper

BusinessRuleIntent

Governance rules

CapabilityIntent

GPU embedding

GPU embedding is planned but not yet started; it covers GPU acceleration for embedding large datasets or document chunks within DataLens. Ollama on elin provides GPU-accelerated embeddings with the nomic-embed-text model for batch vectorization. The resulting embeddings are stored in the Qdrant vector database for semantic search and retrieval.

CapabilityIntent

GPU inference

GPU inference capability is implemented for real-time AI model inference tasks within DataLens, supporting document processing and embeddings with GPU acceleration.

CapabilityIntent

GPU leverage

CapabilityIntent

GPU Resource Management

GPU resource management policies require monitoring and usage of the shared RTX 4000 SFF Ada 20GB GPU for extraction and embedding tasks.
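
One way to monitor a shared GPU is to poll `nvidia-smi` before scheduling extraction or embedding work. The sketch below assumes `nvidia-smi` is on the PATH of the GPU host; the function names and the headroom check are hypothetical and not taken from the DataLens codebase.

```python
import subprocess

# Standard nvidia-smi query flags; with nounits, memory values are reported in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line: str) -> dict[str, int]:
    """Parse one CSV line such as '5120, 20475, 37' (illustrative values)."""
    used, total, util = (int(part.strip()) for part in line.split(","))
    return {"memory_used_mib": used, "memory_total_mib": total, "utilization_pct": util}


def read_gpu_usage() -> dict[str, int]:
    """Query the first GPU on this machine; raises if nvidia-smi fails."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_line(out.strip().splitlines()[0])


def has_headroom(usage: dict[str, int], needed_mib: int) -> bool:
    """True when enough free memory remains for a new task."""
    return usage["memory_total_mib"] - usage["memory_used_mib"] >= needed_mib
```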

CapabilityIntent

GPU-accelerated workloads

GPU-accelerated workloads are in scope for development but have not yet been started.

CapabilityIntent

GPU-first document extraction system

The GPU-first document extraction system is the core capability of Phase 2 GPU-First Document Extraction. It uses the RTX 4000 GPU on the elin server for fast document extraction and vectorization. The theo backend server orchestrates the system by triggering extraction and processing on the elin GPU over SSH. Docling, running on elin, is the mandatory extraction method for DOCX and PPTX files, with no fallback options: backend/app/extractors/docx_extractor.py handles DOCX files and backend/app/extractors/pptx_extractor.py handles PPTX files, and both call Docling on the elin GPU. Embeddings come from the embedding service in backend/app/services/embedding_service.py, which communicates with Ollama on the GPU and uses the nomic-embed-text embedding model to vectorize semantic chunks.
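
The embedding step could be sketched as a call to Ollama's HTTP embedding endpoint. This is not the content of embedding_service.py, which the source does not show. The base URL `http://elin:11434` (Ollama's default port) and both function names are assumptions.

```python
import json
import urllib.request


def build_embed_request(base_url: str, chunks: list[str],
                        model: str = "nomic-embed-text") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/embed endpoint (batch input)."""
    body = json.dumps({"model": model, "input": chunks}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed_chunks(base_url: str, chunks: list[str], timeout: float = 60.0) -> list[list[float]]:
    """Send the chunks to Ollama and return one vector per chunk."""
    request = build_embed_request(base_url, chunks)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["embeddings"]
```

Splitting request construction from the network call keeps the payload easy to inspect without a running GPU host, for example `embed_chunks("http://elin:11434", ["first chunk", "second chunk"])` once the server is reachable.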

BusinessProcessIntent

GPU-first extraction pipeline

The GPU-first extraction pipeline is an operational process that combines GPU-based Docling extraction with semantic chunking and rich metadata to provide high-quality, scalable document processing. It includes the theo orchestration server, which manages the FastAPI backend, RQ workers, PostgreSQL metadata, DuckDB data storage, and the Redis job queue. It also uses the elin GPU processing server, which hosts the RTX 4000 GPU and runs Docling for extraction, Ollama for embeddings, and CUDA 12.8.

CapabilityIntent

Grant Administration Cluster

Grant Administration Cluster relies on the Consolidation Mechanism for consolidating grant-related tables for analysis.

BusinessProcessIntent

Grant Administration Use Case

Use case modeling grant data analysis, potentially involving multi-table joins.

RequirementIntent

grants questions

Schema detection, schema mapping, and cross-file join capabilities have been implemented and tailored for grant data analysis.

BusinessProcessIntent

GRPO

GRPO is referenced as part of the document processing infrastructure for budget projects, supporting namespace organization and data segmentation among various project components.

BusinessProcessIntent

guided analysis

Guided analysis involves complex data exploration, combining manual and automated steps, with actors including backend systems, extraction services, AI models, and user interfaces, focused on data analysis and AI-driven insights.

RequirementIntent

Happy path and error scenarios

StakeholderIntent

HBS Economics

HBS Economics is one of the consulting firms that worked on the SVGV Budget Analysis Project.

RequirementIntent

heading hierarchy tracking

The DOCX extractor will track heading hierarchy levels to enable semantic section boundary detection during chunking.
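
Heading tracking for section-aware chunking might look like the sketch below. The input format (a list of `(level, text)` pairs, where level 0 is a body paragraph and 1 and up are heading levels) and the `chunk_by_headings` name are assumptions; the actual extractor output is not described in the source.

```python
def chunk_by_headings(blocks: list[tuple[int, str]]) -> list[dict[str, str]]:
    """Group body paragraphs into chunks labeled with their heading path."""
    path: list[str] = []   # current heading hierarchy, outermost first
    body: list[str] = []   # paragraphs collected since the last heading
    chunks: list[dict[str, str]] = []

    def flush() -> None:
        # Emit the collected paragraphs as one chunk under the current path.
        if body:
            chunks.append({"section": " > ".join(path), "text": "\n".join(body)})
            body.clear()

    for level, text in blocks:
        if level == 0:
            body.append(text)
        else:
            flush()                 # a heading closes the previous section
            del path[level - 1:]    # drop headings at this level or deeper
            path.append(text)
    flush()
    return chunks
```

Each new heading marks a section boundary, so a chunk never spans two sections and carries its full heading path as context.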

AcceptanceCriteriaIntent

Health Checks

The Docker deployment uses health checks to monitor the health of its components. The Backend API implements a health check that exposes HTTP GET / to validate readiness. The Frontend implements a health check that exposes HTTP GET / to verify availability.
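
A probe for such an endpoint can be sketched as follows. The `check_health` function and its injectable `opener` parameter are hypothetical and exist only to keep the example testable without a running service; the URLs to probe depend on the deployment.

```python
import urllib.error
import urllib.request


def check_health(url: str, timeout: float = 2.0, opener=urllib.request.urlopen) -> bool:
    """Return True when an HTTP GET on `url` answers with a 2xx status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```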

UserStoryIntent

What is the total budget reduction from 2025 to 2028? (Hvad er den samlede budgetnedsættelse fra 2025 til 2028?)

This user story covers a key analytical question: accurately calculating and displaying the total budget reduction between 2025 and 2028.
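
The arithmetic behind this question can be sketched as below, assuming "total reduction" means the budget in the start year minus the budget in the end year. That interpretation and the `total_reduction` helper are assumptions, and any figures passed in are the caller's own; the source gives no budget values.

```python
def total_reduction(budget_by_year: dict[int, float], start: int, end: int) -> float:
    """Budget in `start` minus budget in `end`; positive means a reduction."""
    missing = [year for year in (start, end) if year not in budget_by_year]
    if missing:
        raise KeyError(f"no budget figure for year(s): {missing}")
    return budget_by_year[start] - budget_by_year[end]
```

With placeholder figures such as `{2025: 1000.0, 2028: 880.0}`, the call `total_reduction(budgets, 2025, 2028)` would subtract the 2028 figure from the 2025 figure.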

CapabilityIntent

Hybrid (manual + AI-Assist) Approach

The Hybrid (manual + AI-Assist) Approach uses Claude to generate a detailed project goal from a brief description. AI Generated Goals are derived using the Hybrid (manual + AI-Assist) Approach combining user brief input with AI expansion via Claude.

CapabilityIntent

Hybrid Search

DataLens implements Hybrid Search combining semantic search with Qdrant vectors and structured Text-to-SQL on DuckDB. Document RAG incorporates a hybrid SQL+RAG orchestrator to combine structured and unstructured queries.
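
The orchestration idea can be sketched without committing to a specific client library. Here `run_sql` and `search_vectors` are hypothetical callables standing in for the Text-to-SQL path on DuckDB and the Qdrant semantic search; the real orchestrator's interface is not described in the source.

```python
from typing import Any, Callable


def hybrid_query(
    question: str,
    run_sql: Callable[[str], list[dict[str, Any]]],
    search_vectors: Callable[[str, int], list[Any]],
    top_k: int = 3,
) -> dict[str, Any]:
    """Run the structured and semantic branches and merge their results."""
    result: dict[str, Any] = {"question": question, "rows": [], "passages": [], "errors": []}
    try:
        result["rows"] = run_sql(question)
    except Exception as exc:  # one failing branch should not hide the other
        result["errors"].append(f"sql: {exc}")
    try:
        result["passages"] = search_vectors(question, top_k)[:top_k]
    except Exception as exc:
        result["errors"].append(f"rag: {exc}")
    return result
```

Keeping the two branches independent means a question that has no matching table can still be answered from document passages, and the reverse.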

RequirementIntent

ID matching join strategy

The Discovery Service applies the ID matching join strategy for join discovery with 85% confidence.
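
An ID-matching strategy could be sketched as follows: propose a join wherever two tables share a column that is named like an identifier, and attach the fixed 85% confidence mentioned above. The naming rule (`id` or a `_id` suffix) and the `id_match_joins` helper are assumptions; the Discovery Service's actual matching logic is not given in the source.

```python
def id_match_joins(schemas: dict[str, list[str]], confidence: float = 0.85) -> list[dict]:
    """Propose joins between tables that share an identifier-like column."""
    candidates = []
    tables = sorted(schemas)
    for position, left in enumerate(tables):
        for right in tables[position + 1:]:
            shared = sorted(set(schemas[left]) & set(schemas[right]))
            for column in shared:
                name = column.lower()
                # Assumed rule: only identifier-like names count as join keys.
                if name == "id" or name.endswith("_id"):
                    candidates.append({
                        "left": left,
                        "right": right,
                        "column": column,
                        "confidence": confidence,
                    })
    return candidates
```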

RequirementIntent

image metadata extraction

The PPTX extractor will extract image metadata such as image counts and types for added document context without OCR in Phase 2.
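
Image counts and types can be read without OCR because a PPTX file is a ZIP archive that stores embedded media under `ppt/media/`. The sketch below uses only the standard library rather than Docling, so it is a stand-in technique and not the extractor's actual method; the extension list and the function name are assumptions.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumed set of image extensions; other media (video, audio) is ignored.
IMAGE_EXTENSIONS = {"png", "jpg", "jpeg", "gif", "bmp", "tiff", "emf", "wmf", "svg"}


def pptx_image_metadata(path_or_file) -> dict:
    """Count embedded images in a PPTX file, grouped by file extension."""
    with zipfile.ZipFile(path_or_file) as archive:
        names = archive.namelist()
    extensions = [
        PurePosixPath(name).suffix.lower().lstrip(".")
        for name in names
        if name.startswith("ppt/media/")
    ]
    images = [ext for ext in extensions if ext in IMAGE_EXTENSIONS]
    return {"image_count": len(images), "image_types": dict(Counter(images))}
```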