ML Infrastructure Engineer - Quantum AI
ML Infrastructure Engineer (Junior)
Kadence Talent is partnered with a quantum hardware company building at the frontier of quantum-accelerated AI to find their next ML Infrastructure Engineer.
The Role
This team is doing genuinely novel work (quantum circuit simulation, large-scale numerical optimization, tensor network contractions, and AI model training), and they need someone to make the compute infrastructure that powers it all just work. That means reliable GPU access, reproducible experiments, and workloads that scale without researchers having to become cloud experts.
You'll own the full stack, from cloud provider configuration to the Python tooling researchers use to launch jobs. It's a high-ownership role on a small, fast-moving team where your work will be immediately visible and impactful.
What You'll Do
* Build job submission tooling and compute abstractions that handle diverse workloads (GPU simulation, distributed training, high-throughput CPU jobs) across PyTorch, JAX, and scientific computing frameworks
* Set up experiment tracking and reproducibility infrastructure so research is auditable and repeatable
* Manage and optimize cloud spend across multiple providers: track credits and burn rates, and flag problems before they surface
* Build CI/CD pipelines for research workloads: automated testing, benchmarks, and artifact management
* Support cross-functional teams beyond the core research group, including finance ops and hardware
What We're Looking For
* A solid foundation in cloud engineering: AWS or GCP, compute, storage, IAM basics
* Some exposure to ML or scientific computing environments (doesn't have to be deep)
* One area where you've gone deeper: containerization, GPU instances, job schedulers, or MLOps tooling
* Curious, responsive, and comfortable asking questions when you're in new territory
* Fresh graduates and early-career engineers welcome
Nice to Have
* Hands-on experience with PyTorch or JAX
* Familiarity with tools like MLflow, W&B, or similar experiment tracking platforms
* GitHub Actions or container-based CI/CD
* Any exposure to HPC job schedulers or hybrid cloud/on-prem setups
Target compensation range: $180k-$240k