Project: datalens

extract_file_job

extract_file_job is a batch job that extracts data from DOCX, PPTX, PDF, and Excel files and triggers GPU extraction workflows. No recurrence schedule or failure consequences are documented for it.

The job is enqueued on the RQ extraction queue, backed by Redis, as extract_file_job(file_id), and an RQ worker executes it. The queue runs the function with a timeout so that large files can be processed. The extraction output is used to produce file summaries.

For DOCX and PPTX files, the job runs Docling on the host elin over SSH and produces JSON results.
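The flow described above (enqueue on RQ with a timeout, then run Docling on elin over SSH) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code. The queue name, the timeout value, the remote paths, and the helpers resolve_remote_path, build_docling_command, and enqueue_extraction are all hypothetical. The Docling command-line options shown are assumed to match the installed Docling version.

```python
import shlex
import subprocess

REMOTE_HOST = "elin"            # host named in the description; SSH access is assumed
QUEUE_NAME = "extraction"       # hypothetical RQ queue name
JOB_TIMEOUT_SECONDS = 3600      # hypothetical timeout chosen for large files
REMOTE_OUT_DIR = "/tmp/docling-out"  # hypothetical output directory on the remote host


def resolve_remote_path(file_id):
    """Hypothetical lookup from a file_id to a path on the remote host."""
    return f"/data/uploads/{file_id}.docx"


def build_docling_command(remote_path, remote_out_dir):
    """Build the ssh argv that runs Docling remotely and writes JSON output.

    The '--to json' and '--output' options are assumptions about the Docling CLI.
    """
    remote = "docling {} --to json --output {}".format(
        shlex.quote(remote_path), shlex.quote(remote_out_dir)
    )
    return ["ssh", REMOTE_HOST, remote]


def extract_file_job(file_id):
    """Job body that an RQ worker would execute for one file."""
    cmd = build_docling_command(resolve_remote_path(file_id), REMOTE_OUT_DIR)
    # check=True raises CalledProcessError if the remote command fails.
    subprocess.run(cmd, check=True)


def enqueue_extraction(file_id):
    """Put extract_file_job(file_id) on the RQ queue with a job timeout."""
    # Imported here so the helpers above can be used without rq or redis installed.
    from redis import Redis
    from rq import Queue

    queue = Queue(QUEUE_NAME, connection=Redis())
    return queue.enqueue(extract_file_job, file_id, job_timeout=JOB_TIMEOUT_SECONDS)
```

In this sketch, calling enqueue_extraction("1234") would schedule the job, and a worker listening on the same queue would pick it up. Quoting the remote arguments with shlex.quote keeps file names that contain spaces intact when the remote shell parses the command.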