Type: Long-term contract

Location: Remote (overlap with PST)

At Sphere, we partner with a global logistics company that leverages AI, Machine Learning, and Data Engineering to optimize warehouse operations, predictive maintenance, and route planning.

Role: Build and maintain scalable AI infrastructure, enabling teams to run ML experiments, deploy machine learning models, and implement MLOps pipelines for production-grade AI.

Responsibilities

* Design distributed training pipelines for large-scale machine learning and deep learning models.

* Optimize compute and storage resources for cloud-based AI/ML workloads on AWS, GCP, or Azure.

* Collaborate with data scientists and ML engineers to deploy models to production efficiently.

* Implement monitoring, logging, and alerting for model performance and AI workflows.

* Keep the AI infrastructure scalable, maintainable, and reliable to support real-time and batch ML applications.

Requirements

* 5+ years of experience with Python and ML infrastructure.

* Experience with cloud AI platforms (AWS SageMaker, GCP AI Platform, Azure ML).

* Experience with containerization (Docker), orchestration (Kubernetes), and CI/CD for ML.

* Experience with distributed systems, data pipelines, and high-performance computing for AI.

* Hands-on experience with deep learning frameworks such as TensorFlow or PyTorch.

Lead AI Infrastructure Engineer (Python/ML)