AdTechTalent
Engineering | 12 days ago | Hybrid

Criteo

Senior Site Reliability Engineer (GPU & ML Infrastructure)

site reliability engineering, SRE, GPU, machine learning, ML infrastructure, Ray, Kubernetes, NVIDIA Triton, distributed systems, Python, Go, C#, inference serving, cloud-native, GKE, EKS, platform engineering

Key details

Salary

Not specified

Employment type

Permanent Full Time

Seniority

Senior

Years experience

5-10

Location

Grenoble, France; Paris, France

Full job description

This is a Senior Site Reliability Engineer role focused on GPU and ML infrastructure. Responsibilities include building and operating scalable Ray clusters on Kubernetes, developing self-service distributed computing platforms for ML workloads, and operating and optimizing large-scale inference platforms built on NVIDIA Triton Inference Server. The role requires 5+ years of experience in backend, SRE, or platform engineering on distributed systems, strong Kubernetes skills, hands-on experience with GPU workloads in production, and software engineering skills in C#, Python, Go, or a similar language. Experience with distributed ML frameworks, inference serving stacks, GPU scheduling, or cloud-native GPU orchestration is a bonus. The position is hybrid and based in Paris and Grenoble, France. Benefits include career development programs, health and wellness support, an inclusive team environment, a competitive salary, and potential equity.

What you'll do

  • Build and operate scalable Ray clusters running on Kubernetes
  • Develop reliable self-service distributed computing platforms for ML workloads
  • Improve provisioning, observability, reliability, and operational efficiency of Ray-as-a-service environments
  • Operate and optimize large-scale inference platforms using NVIDIA Triton Inference Server
  • Improve latency, throughput, scalability, and GPU utilization for deep learning inference workloads
  • Collaborate closely with ML engineers, data scientists, and infrastructure teams to deliver reliable, production-grade ML platforms

Requirements

  • 5+ years of experience in backend engineering, Site Reliability Engineering, or platform engineering roles focused on distributed systems
  • Strong experience with Kubernetes, including workload scheduling, dynamic provisioning, and custom controllers/operators
  • Hands-on experience running or optimizing GPU-based workloads in production, ideally for ML training or inference systems
  • Strong software engineering skills in C#, Python, Go, or similar languages, with a focus on building reliable distributed systems
  • Experience building or operating production-grade infrastructure with strong requirements around performance, scalability, and reliability
  • Strong interest in automation, observability, and designing systems that scale efficiently under high load
  • Bonus: Experience with distributed ML frameworks such as Ray or similar systems
  • Bonus: Familiarity with inference serving stacks such as NVIDIA Triton or TensorRT
  • Bonus: Experience with GPU scheduling, resource management, or multi-tenant GPU platforms
  • Bonus: Exposure to cloud-native GPU orchestration (GKE, EKS, or on-prem Kubernetes GPU clusters)

Tech stack

Ray, Kubernetes, NVIDIA Triton Inference Server, C#, Python, Go, TensorRT, GKE, EKS

Benefits

  • Hybrid work model blending home and in-office experiences
  • Learning, mentorship & career development programs
  • Health benefits, wellness perks & mental health support
  • Diverse, inclusive, and globally connected team
  • Attractive salary with performance-based rewards and family-friendly policies
  • Potential for equity depending on role and level
