Server Operations
elin (GPU processing)
elin is the GPU processing server for the platform. It hosts an RTX 4000 GPU with CUDA 12.8 and runs Docling for document extraction, Ollama for LLM inference and embeddings, Qdrant for vector storage, and the DS-STAR agents that provide the DataLens Platform's AI capabilities. The GPU-first extraction pipeline runs on elin; Docling extraction of DOCX and PPTX files in particular requires the GPU to accelerate extraction and embedding. The theo backend connects to elin over SSH to orchestrate Docling extraction and embedding jobs, and the Backend API calls the Ollama service to run qwen3-coder-next for query classification and nomic-embed-text for embedding generation.
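A minimal sketch of how the theo backend might dispatch work to elin. The SSH alias, Docling CLI flags, and helper names here are illustrative assumptions, not the actual implementation; the Ollama payload follows its standard `/api/embeddings` request shape.

```python
import shlex
import subprocess

ELIN_HOST = "elin"  # assumed SSH alias for the GPU server
OLLAMA_URL = "http://elin:11434"  # Ollama's default port (assumed reachable)


def docling_ssh_command(input_path: str, output_dir: str) -> list[str]:
    """Build the SSH command that runs Docling on elin (illustrative CLI)."""
    remote = f"docling {shlex.quote(input_path)} --output {shlex.quote(output_dir)}"
    return ["ssh", ELIN_HOST, remote]


def embedding_request(text: str, model: str = "nomic-embed-text") -> dict:
    """Request body for Ollama's /api/embeddings endpoint."""
    return {"model": model, "prompt": text}


def run_extraction(input_path: str, output_dir: str) -> None:
    """Fire a Docling extraction job on elin and wait for it to finish."""
    subprocess.run(docling_ssh_command(input_path, output_dir), check=True)
```

In this sketch the backend shells out over SSH rather than exposing a job API on elin; either design matches the "orchestrate via SSH" description above, but the exact mechanism is an assumption.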