All Domains
1587 entities found
get_all_schemas
Gets complete schema info for all tables in a project, used for prompt building.
get_schema
Retrieves schema details for a specific table in a project.
get_tables
Lists all tables within a project's DuckDB database.
GIN index
Git repository
The Coolify server deployment method either references a Git repository for source code or uploads files directly.
GitHub
The Backend integrates with GitHub as the source repository system.
GitHub API
The Coolify daemon integrates with the GitHub API to fetch repository source code during deployment builds.
GitHub repo
DataLens integrates with a GitHub repository for managing code and deployments.
Google
Google services are listed as third-party AI providers available for integration to support AI inference and model hosting within DataLens.
Google's DS-STAR Paper
Governance rules
GPT-5.2
GPU
An NVIDIA RTX 4000 SFF Ada 20GB GPU server is deployed for Docling extraction and embedding tasks, supporting GPU-first data processing.
GPU acceleration
The deployment on theo lacks local GPU features, which constrains the availability of AI features like DS-STAR and Ollama inference.
GPU Box
The GPU Box refers to the GPU hardware infrastructure on elin, used for high-performance inference tasks including Ollama models and vector embeddings. It supports GPU-accelerated extraction, embedding, and document search, enabling efficient AI computations for the DataLens platform.
GPU embedding
GPU embedding is planned but not yet started; it involves GPU acceleration for embedding large datasets or document chunks within DataLens. Ollama on elin provides GPU-accelerated embeddings with the nomic-embed-text model for batch vectorization, and the resulting vectors are stored in the Qdrant vector database for semantic search and retrieval.
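A minimal batch-vectorization sketch, assuming the entry's Ollama/nomic-embed-text setup; the host URL and point layout are assumptions, and the `embed` parameter exists so the network call can be stubbed out offline:

```python
import requests

OLLAMA_URL = "http://elin:11434"  # assumed host for Ollama on elin

def embed_batch(texts, embed=None):
    """Vectorize a batch of document chunks.

    By default calls Ollama's /api/embeddings endpoint with the
    nomic-embed-text model; pass an `embed` callable to stub it.
    Returns Qdrant-style (id, vector, payload) point dicts.
    """
    if embed is None:
        def embed(text):
            resp = requests.post(
                f"{OLLAMA_URL}/api/embeddings",
                json={"model": "nomic-embed-text", "prompt": text},
                timeout=60,
            )
            resp.raise_for_status()
            return resp.json()["embedding"]
    return [
        {"id": i, "vector": embed(t), "payload": {"text": t}}
        for i, t in enumerate(texts)
    ]
```

The returned dicts map directly onto a Qdrant upsert, which is where the entry says the vectors are stored.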
GPU inference
GPU inference capability is implemented for real-time AI model inference tasks within DataLens, supporting document processing and embeddings with GPU acceleration.
GPU leverage
GPU Resource Management
GPU resource management policies require monitoring and usage of the shared RTX 4000 SFF Ada 20GB GPU for extraction and embedding tasks.
GPU usage
Infrastructure includes GPU usage as a monitored specification; monitoring depends on the elin server. The DataLens DS-STAR Implementation Plan lists GPU Infrastructure as a requirement. GPU Infrastructure requires deployment of vLLM with the Qwen2.5-Coder-14B-AWQ model, deployment of the Qdrant vector database, installation of the DuckDB database system, and a Python environment with all dependencies. It uses vLLM for large language model execution on elin and Qdrant for semantic search capabilities. The plan considers GPU usage on elin especially for Ollama calls and embedding models.
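The GPU Infrastructure requirements above could be brought up roughly as follows. This is a hedged sketch, not the actual provisioning script: the Hugging Face model id (the plan only says "Qwen2.5-Coder-14B-AWQ"), port numbers, and the Docker-based Qdrant install are all assumptions:

```shell
# Python environment with the required dependencies.
pip install vllm qdrant-client duckdb

# Serve the quantized coder model with vLLM (OpenAI-compatible API,
# assumed model id and port).
vllm serve Qwen/Qwen2.5-Coder-14B-Instruct-AWQ \
    --quantization awq --port 8000 &

# Qdrant vector database for semantic search (default ports).
docker run -d -p 6333:6333 \
    -v qdrant_storage:/qdrant/storage qdrant/qdrant
```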
GPU-accelerated workloads
GPU-accelerated workloads are in scope for development, currently not started.
GPU-first design
DataLens implements a GPU-first design by leveraging Ollama on elin for embeddings and inference. GPU-first document extraction uses Docling for DOCX and PPTX extraction as a mandatory component with no fallback, the theo server for orchestration (FastAPI backend, RQ workers, and job queuing), and the RTX 4000 SFF Ada 20GB GPU on elin for document extraction and embeddings generation. The implementation is validated by the test suite 'test_docling_extractors.py'.
GPU-first document extraction system
The GPU-first document extraction system uses Docling as the mandatory method for DOCX and PPTX extraction, with no fallback options. Extraction runs on the RTX 4000 GPU on the elin server for fast document extraction and vectorization; the theo backend server orchestrates the system by triggering extraction and processing over SSH to the elin GPU. Phase 2 GPU-First Document Extraction has this capability at its core. DOCX files are extracted via backend/app/extractors/docx_extractor.py and PPTX files via backend/app/extractors/pptx_extractor.py, both of which call Docling on the elin GPU. The embedding service in backend/app/services/embedding_service.py communicates with Ollama on the GPU, which generates embeddings with the nomic-embed-text model to vectorize semantic chunks.
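The theo-to-elin SSH handoff described above could be sketched as below. The SSH alias, the remote Docling CLI invocation, and its output format are assumptions; the `run` parameter is injectable so the call can be tested without a live GPU host:

```python
import subprocess

ELIN_HOST = "elin"        # assumed SSH alias for the GPU server
DOCLING_CMD = "docling"   # assumed CLI entry point on elin

def extract_on_gpu(remote_path, host=ELIN_HOST, run=subprocess.run):
    """Trigger Docling extraction on the elin GPU over SSH.

    Mirrors the orchestration pattern (theo triggers work on elin over
    SSH); the exact remote command line is an assumption.
    """
    result = run(
        ["ssh", host, DOCLING_CMD, remote_path, "--to", "md"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Keeping the GPU work behind a single SSH-invoked command means the theo side needs no CUDA or Docling installation of its own.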
GPU-first extraction pipeline
Operational process involving GPU-based Docling extraction and semantic chunking, with rich metadata, ensuring high-quality, scalable document processing. The Extraction Pipeline (GPU-First) includes the theo orchestration server, which manages the FastAPI backend, RQ workers, PostgreSQL metadata, DuckDB data storage, and the Redis job queue, and utilizes the elin GPU processing server, which hosts the RTX 4000 GPU and runs Docling for extraction, Ollama for embeddings, and CUDA 12.8.
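The RQ-workers-plus-Redis-queue part of the pipeline could look roughly like this. The queue name and the dotted path of the worker function are assumptions; imports are deferred into the factory so the enqueue helper itself can be exercised with a stub queue:

```python
def make_queue(name="extraction"):
    """Connect to the Redis-backed RQ job queue (queue name assumed)."""
    from redis import Redis  # third-party: redis, rq
    from rq import Queue
    return Queue(name, connection=Redis(host="localhost"))

def enqueue_extraction(doc_path, queue):
    # RQ serializes the call by dotted name; an RQ worker on theo picks
    # it up and runs it. The worker function path is hypothetical.
    job = queue.enqueue("backend.app.extractors.run_extraction", doc_path)
    return job.id
```

Passing the function by dotted string keeps the enqueueing process free of heavyweight extraction imports.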
GPU/DS-STAR access
Access to GPU and DS-STAR components is configured on elin for GPU-intensive extraction and AI tasks, enabling full pipeline operation without fallback. The backend runs as a systemd service on elin with GPU and DS-STAR access.
Grant Administration Cluster
Grant Administration Cluster relies on the Consolidation Mechanism for consolidating grant-related tables for analysis.
Grant Administration Use Case
Use case modeling grant data analysis, potentially involving multi-table joins.
grants questions
Capability for answering grant-related questions; implements schema detection, schema mapping, and cross-file joins, tailored for grant data analysis.
groq
Pydantic-ai-slim integrates with groq for AI model support.
GRPO
GRPO is referenced as part of the document processing infrastructure for budget projects, supporting namespace organization and data segmentation among various project components.