ThirdPartyComponentArchitecture
Ollama (GPU inference on elin)
Runs local LLMs such as qwen3-coder-next on elin's GPU for inference; this is integral to autonomous cataloging and text-to-SQL. The Data Discovery system uses the Ollama and Arctic LLM inference engines for intelligent table discovery and query processing, and the Qdrant Service uses the Ollama Embedding Service to create vector embeddings.