Lead Inference Platform Support Engineer - AI I

Refinitiv • Toronto, ON • Posted May 22, 2026
Position Overview

* Optimize LLMs and ML models for high-performance inference using techniques such as quantization, pruning, distillation, and hardware-specific tuning
* Deploy and scale inference workloads on GPUs across AWS, Azure, GCP and internal Kubernetes clusters, ensuring predictable performance during peak traffic, especially during business hours
* Implement routing and failover strategies for OpenAI/Anthropic/Vertex AI traffic
* Integrate models into production-grade APIs supporting TR products and enterprise workflows
* Develop highly optimized environments and eliminate performance bottlenecks to reduce latency
* Collaborate with Platform Engineering teams (Landing Zones, Network, Storage, Compute, AI) to ensure inference workloads align with TR’s cloud-native patterns (AWS, Azure, GCP, OCI)
* Build and optimize containerized inference pipelines using Kubernetes for large-scale distributed workloads
* Ensure compliance with TR’s AI standards for d...