Remote data engineer - 78243
Turing • Noida, Uttar Pradesh • Posted May 23, 2026
Position Overview
70% precision
- Has implemented a golden dataset or ML evaluation dataset management system with versioning and lineage
- Has built multi-format document ingestion pipelines processing 10K+ documents reliably
- Has implemented physical data separation for a compliance-sensitive system
Project-Specific Skills and Domain Knowledge
Must-Have:
- Experience implementing semantic chunking strategies for different document types (transcripts, reports, code) with measurable retrieval quality impact
- Experience building data pipelines that integrate with LLM APIs for extraction and augmentation tasks
- Experience implementing physical data separation (separate storage, separate schemas) for compliance-sensitive ML datasets
- Experience with TimescaleDB or equivalent time-series databases for metrics and cost tracking
Preferred Qualifications:
- Experience with knowledge graph data models and entity resolution pipelines
- Experience operating data infrastruc...