Project: datalens
DesignDecision, Architecture

GPU-first design

DataLens follows a GPU-first design: embeddings and inference run through Ollama on elin, using the RTX 4000 SFF Ada 20GB GPU on that machine. The same GPU handles document extraction and embedding generation. Docling extracts DOCX and PPTX files and is a mandatory component with no fallback extractor. The theo server handles orchestration: the FastAPI backend, RQ workers, and job queuing. The extraction implementation is validated by the test suite 'test_docling_extractors.py'.
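The extraction and embedding steps described above might look roughly like the following sketch. It is not the DataLens implementation: the function names, the embedding model name, and the assumption that elin is reachable by hostname on Ollama's default port 11434 are all hypothetical. The sketch shows Docling used without a fallback (a missing install is a hard error) and an embedding request built for Ollama's `/api/embeddings` endpoint.

```python
import json
import urllib.request
from pathlib import Path

# Assumptions, not taken from the source: hostname resolution, model name.
OLLAMA_URL = "http://elin:11434/api/embeddings"  # 11434 is Ollama's default port
EMBED_MODEL = "nomic-embed-text"                 # hypothetical model choice
DOCLING_SUFFIXES = {".docx", ".pptx"}            # formats named in the text


def extract_text(path: Path) -> str:
    """Extract a DOCX/PPTX file with Docling; fail loudly instead of falling back."""
    if path.suffix.lower() not in DOCLING_SUFFIXES:
        raise ValueError(f"unsupported format for Docling extraction: {path.suffix}")
    try:
        # Imported lazily so this module still loads where Docling is absent.
        from docling.document_converter import DocumentConverter
    except ImportError as exc:
        # No fallback extractor: a missing Docling install is a hard error.
        raise RuntimeError("Docling is required and has no fallback") from exc
    result = DocumentConverter().convert(str(path))
    return result.document.export_to_markdown()


def build_embedding_request(text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to Ollama's embeddings endpoint."""
    body = json.dumps({"model": EMBED_MODEL, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str) -> list[float]:
    """Send the request to Ollama on elin and return the embedding vector."""
    with urllib.request.urlopen(build_embedding_request(text), timeout=60) as resp:
        return json.load(resp)["embedding"]
```

In a setup like the one described, a job function run by an RQ worker could call `embed(extract_text(path))`, with the FastAPI backend on theo only enqueuing the job. That wiring is omitted here because the source does not specify it.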

Attributes
labels: DesignDecision, Entity
rationale: GPU acceleration for data analysis tasks enables faster processing of large datasets and complex models, particularly with tools like Ollama for embeddings and inference, and improves performance for hybrid search.
alternatives considered: CPU-only architectures, or cloud-hosted GPU services without a dedicated GPU-first approach.
decided by: Jesper
decision date: 2023-10-01
tier: 2
Relationships: 6 connections