Remote Data Engineer - 78243

Turing • Noida, Uttar Pradesh • Posted May 23, 2026

Position Overview

70% precision
- Has implemented a golden dataset or ML evaluation dataset management system with versioning and lineage
- Has built multi-format document ingestion pipelines processing 10K+ documents reliably
- Has implemented physical data separation for a compliance-sensitive system
Project-Specific Skills and Domain Knowledge
Must-Have:
- Experience implementing semantic chunking strategies for different document types (transcripts, reports, code) with measurable retrieval quality impact
- Experience building data pipelines that integrate with LLM APIs for extraction and augmentation tasks
- Experience implementing physical data separation (separate storage, separate schemas) for compliance-sensitive ML datasets
- Experience with TimescaleDB or an equivalent time-series database for metrics and cost tracking
Preferred Qualifications
- Experience with knowledge graph data models and entity resolution pipelines
- Experience operating data infrastruc...