Full Findings Visualization Layer
findings_generator.py implements the Full Findings Visualization Layer by analyzing query results and producing structured Finding objects covering metrics, trends, and outliers. The capability requires seven filter categories for filtering findings in the UI and is planned to be integrated with agent.py so that findings are returned on each query. Its UI components are MetricCard.svelte (KPI display with statistics), TrendChart.svelte (SVG line charts with grid), OutlierHighlight.svelte (anomaly detection displays), and FindingsPanelNew.svelte (the master panel with filtering capabilities).
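A minimal sketch of what such a generator might look like. The Finding shape, field names, and the mean/two-sigma heuristics are illustrative assumptions, not the project's actual implementation:

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev

@dataclass
class Finding:
    """One structured finding surfaced to the UI (hypothetical shape)."""
    kind: str          # "metric" | "trend" | "outlier"
    title: str
    value: float
    details: dict = field(default_factory=dict)

def generate_findings(column: str, values: list[float]) -> list[Finding]:
    """Derive metric, trend, and outlier findings from one numeric column."""
    findings = [Finding("metric", f"Average {column}", mean(values))]
    # Trend: compare the first and second half of the series.
    half = len(values) // 2
    if half:
        delta = mean(values[half:]) - mean(values[:half])
        findings.append(Finding("trend", f"{column} trend", delta,
                                {"direction": "up" if delta >= 0 else "down"}))
    # Outliers: flag points more than two standard deviations from the mean.
    mu, sigma = mean(values), pstdev(values)
    for i, v in enumerate(values):
        if sigma and abs(v - mu) > 2 * sigma:
            findings.append(Finding("outlier", f"{column}[{i}]", v))
    return findings
```

The filter categories mentioned above would then key off `Finding.kind` and `Finding.details` in the FindingsPanelNew.svelte master panel.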
GDPR PII detection
GDPR PII detection results are recorded in the project_gdpr_flags table to track personal data sensitivity.
GdprDetector
GdprDetector uses StorageService to scan project data for GDPR-relevant personal data indicators.
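An indicator scan of this kind could be sketched as below. The pattern set (email addresses and Danish CPR numbers) is an assumption for illustration; a real detector would use a broader, validated rule set:

```python
import re

# Illustrative GDPR indicator patterns only (assumed, not the project's real set).
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    # Danish CPR number, DDMMYY-XXXX: assumed relevant given the project's context.
    "cpr": re.compile(r"\b\d{6}-\d{4}\b"),
}

def scan_for_pii(text: str) -> dict[str, int]:
    """Return a count of matches per indicator category."""
    return {name: len(rx.findall(text)) for name, rx in PII_PATTERNS.items()}
```

Per-project counts like these could then be aggregated into the project_gdpr_flags table described above.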
GenBI Insights
WrenAI generates AI summaries and charts as part of its generated insights.
Generate Insights
Process for automatically creating summaries, charts, or insights from data, part of the system's output capabilities.
generate-report
DataLens Agent Mode includes the generate-report skill to compile findings into structured markdown reports.
GenerateReportSkill
GenerateReportSkill uses SkillResult to compile findings into structured reports.
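A sketch of the compile step, assuming SkillResult carries at least a title and a summary (the exact fields are not given in the source):

```python
from dataclasses import dataclass

@dataclass
class SkillResult:
    """Assumed minimal shape of a skill's output."""
    title: str
    summary: str

def generate_report(results: list[SkillResult]) -> str:
    """Compile skill results into a structured markdown report."""
    lines = ["# Findings Report", ""]
    for r in results:
        lines += [f"## {r.title}", "", r.summary, ""]
    return "\n".join(lines)
```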
Google's DS-STAR Paper
Governance rules
GPU embedding
GPU embedding is planned but not yet started: GPU acceleration for embedding large datasets or document chunks within DataLens. Ollama on elin provides GPU-accelerated embeddings with the nomic-embed-text model for batch vectorization, and the resulting vectors are stored in the Qdrant vector database for semantic search and retrieval.
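A client-side sketch of batch vectorization against Ollama. The host/port and batch size are assumptions; the `/api/embeddings` request shape (`model`, `prompt`) follows Ollama's HTTP API:

```python
import json
import urllib.request

OLLAMA_URL = "http://elin:11434/api/embeddings"  # assumed host and default port

def batch(chunks: list[str], size: int) -> list[list[str]]:
    """Split document chunks into fixed-size batches for vectorization."""
    return [chunks[i:i + size] for i in range(0, len(chunks), size)]

def embed(text: str) -> list[float]:
    """Request one embedding vector from Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": "nomic-embed-text", "prompt": text}).encode()
    req = urllib.request.Request(OLLAMA_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```

The returned vectors would then be upserted into Qdrant for semantic retrieval.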
GPU inference
GPU inference capability is implemented for real-time AI model inference tasks within DataLens, supporting document processing and embeddings with GPU acceleration.
GPU leverage
GPU Resource Management
GPU resource management policies require monitoring and usage of the shared RTX 4000 SFF Ada 20GB GPU for extraction and embedding tasks.
GPU-accelerated workloads
GPU-accelerated workloads are in scope for development, currently not started.
GPU-first document extraction system
The GPU-first document extraction system mandates Docling for DOCX and PPTX extraction, with no fallback options, and runs it on the RTX 4000 GPU on the elin server for fast document extraction and vectorization; this is the core capability of Phase 2 (GPU-First Document Extraction). The theo backend server orchestrates the system, triggering extraction and processing over SSH to the elin GPU. backend/app/extractors/docx_extractor.py and backend/app/extractors/pptx_extractor.py call Docling on elin to extract DOCX and PPTX files, while the embedding service in backend/app/services/embedding_service.py communicates with Ollama on the GPU, which vectorizes semantic chunks using the nomic-embed-text embedding model.
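The SSH trigger from theo to elin might be sketched as follows. The remote script path and its CLI are hypothetical placeholders, not the project's actual interface:

```python
import shlex
import subprocess

def build_extraction_command(remote_path: str, host: str = "elin") -> list[str]:
    """Build the ssh invocation theo would use to run Docling on the GPU host.

    /opt/docling/extract.py is a hypothetical remote entry point.
    """
    remote = f"python /opt/docling/extract.py {shlex.quote(remote_path)}"
    return ["ssh", host, remote]

def run_extraction(remote_path: str) -> int:
    """Trigger remote extraction and return the process exit code."""
    return subprocess.run(build_extraction_command(remote_path)).returncode
```

Quoting the remote path with `shlex.quote` keeps filenames with spaces intact across the SSH boundary.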
GPU-first extraction pipeline
Operational process involving GPU-based Docling extraction and semantic chunking, with rich metadata, ensuring high-quality, scalable document processing. The Extraction Pipeline (GPU-First) includes the theo orchestration server that manages FastAPI backend, RQ workers, PostgreSQL metadata, DuckDB data storage, and Redis job queue. The Extraction Pipeline (GPU-First) utilizes the elin GPU processing server which hosts the RTX 4000 GPU, runs Docling for extraction, Ollama for embeddings, and CUDA 12.8.
Grant Administration Cluster
Grant Administration Cluster relies on the Consolidation Mechanism for consolidating grant-related tables for analysis.
Grant Administration Use Case
Use case modeling grant data analysis, potentially involving multi-table joins.
grants questions
Implemented schema detection, schema mapping, and cross-file join capabilities; tailored for grant data analysis.
GRPO
GRPO is referenced as part of the document processing infrastructure for budget projects, supporting namespace organization and data segmentation among various project components.
guided analysis
Guided analysis involves complex data exploration, combining manual and automated steps, with actors including backend systems, extraction services, AI models, and user interfaces, focused on data analysis and AI-driven insights.
Happy path and error scenarios
HBS Economics
HBS Economics is one of the consulting firms involved in the SVGV Budget Analysis Project, where it worked as a consultant.
heading hierarchy tracking
The DOCX extractor will track heading hierarchy levels to enable semantic section boundary detection during chunking.
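One way to sketch that tracking, assuming extraction yields (heading_level, text) pairs where level 0 is body text and levels 1..n come from DOCX heading styles:

```python
def section_paths(blocks: list[tuple[int, str]]) -> list[tuple[str, list[str]]]:
    """Assign each body block the heading path it falls under.

    Maintains a stack of open headings: a new heading at level N closes
    every open heading at level N or deeper, which is exactly the semantic
    section boundary the chunker needs.
    """
    stack: list[str] = []
    out = []
    for level, text in blocks:
        if level > 0:
            del stack[level - 1:]   # close sections at this level and deeper
            stack.append(text)
        else:
            out.append((text, stack.copy()))
    return out
```

A chunker can then start a new chunk whenever the heading path changes.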
Health Checks
The Docker deployment includes health checks for monitoring service health across components. The backend API exposes HTTP GET / to validate readiness, and the frontend exposes HTTP GET / for availability verification.
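A probe of this kind can be sketched as a simple HTTP check (the URL and timeout below are illustrative):

```python
import urllib.request

def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Probe a service's root endpoint; healthy iff it answers HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```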
What is the total budget reduction from 2025 to 2028? ("Hvad er den samlede budgetnedsættelse fra 2025 til 2028?")
User story with goal to accurately calculate and display total budget reduction between 2025 and 2028, a key analytical question.
Hybrid (manual + AI-Assist) Approach
The Hybrid (manual + AI-Assist) Approach combines brief user input with AI expansion: Claude generates a detailed project goal (an AI Generated Goal) from the user's short description.
Hybrid Search
DataLens implements Hybrid Search combining semantic search with Qdrant vectors and structured Text-to-SQL on DuckDB. Document RAG incorporates a hybrid SQL+RAG orchestrator to combine structured and unstructured queries.
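The merge step of such an orchestrator could be sketched with reciprocal rank fusion, a common way to combine heterogeneous ranked lists (the choice of RRF and the constant k=60 are assumptions, not the project's stated method):

```python
def fuse(semantic: list[str], sql: list[str], k: int = 60) -> list[str]:
    """Merge two ranked result lists with reciprocal rank fusion.

    `semantic` holds ids from the Qdrant vector search, `sql` holds ids
    from the Text-to-SQL path; ids appearing in both lists are boosted.
    """
    scores: dict[str, float] = {}
    for ranking in (semantic, sql):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```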
ID matching join strategy
The Discovery Service applies the ID matching join strategy for join discovery with 85% confidence.
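A value-overlap scoring of this kind might look like the sketch below; the function names and the containment metric are illustrative, with the 0.85 default mirroring the confidence threshold mentioned above:

```python
def id_join_confidence(left: list, right: list) -> float:
    """Score a candidate join: fraction of left-column IDs found on the right."""
    if not left:
        return 0.0
    right_set = set(right)
    return sum(v in right_set for v in left) / len(left)

def find_join_candidates(a: dict[str, list], b: dict[str, list],
                         threshold: float = 0.85) -> list[tuple[str, str, float]]:
    """Return (column_a, column_b, confidence) pairs at or above the threshold."""
    out = []
    for ca, va in a.items():
        for cb, vb in b.items():
            conf = id_join_confidence(va, vb)
            if conf >= threshold:
                out.append((ca, cb, conf))
    return out
```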
image metadata extraction
The PPTX extractor will extract image metadata such as image counts and types for added document context without OCR in Phase 2.
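Because a PPTX file is an OOXML zip container, counts and types can be read straight from the archive listing without rendering anything; a sketch (the media-path convention is standard OOXML, the function shape is an assumption):

```python
import collections
import zipfile
from pathlib import PurePosixPath

def image_metadata(pptx_path: str) -> dict[str, int]:
    """Count embedded images per file type by listing ppt/media/ inside
    the OOXML zip container (no rendering, no OCR)."""
    counts = collections.Counter()
    with zipfile.ZipFile(pptx_path) as zf:
        for name in zf.namelist():
            if name.startswith("ppt/media/"):
                counts[PurePosixPath(name).suffix.lstrip(".").lower()] += 1
    return dict(counts)
```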