MLOps Platform Engineer
Cliff Services Inc, Reston, VA (Onsite)
W2
Description
The Data Modeling Analytics & AI Engineering team is seeking an experienced MLOps Platform Engineer to design, build, and support enterprise-grade machine learning operations capabilities. This role will play a key part in enabling scalable, reliable, and secure ML model development and deployment across our cloud and container platforms.
This is a hands-on engineering role requiring strong expertise in AWS, Kubernetes (EKS), CI/CD automation, containerization, and ML platform operations. The ideal candidate will have solid engineering fundamentals combined with practical knowledge of ML workflows, deployment patterns, and platform reliability.
Key Responsibilities
Platform Engineering & Operations
* Engineer, manage, and support MLOps platform components across AWS and EKS-based environments.
* Oversee deployment, configuration, and operation of infrastructure used for ML training, batch inference, and real-time model serving.
* Ensure platform availability, resilience, and performance across dev, test, and production environments.
* Implement role-based access controls (RBAC), network policies, and scalable namespace designs within EKS.
Model Deployment & CI/CD Automation
* Build and support CI/CD pipelines (GitLab) for model packaging, container image builds, vulnerability scanning, and automated deployment flows.
* Enable standardized model release processes including environment promotion, versioning, and rollback workflows.
* Integrate CI/CD with ML frameworks, model repositories, artifacts, and runtime environments.
Container & Kubernetes Workloads
* Design and manage EKS workloads supporting containerized ML jobs and microservices.
* Implement auto-scaling, resource quotas, cluster optimization, and multi-tenant workload isolation.
* Support GPU- and CPU-based training/inference workloads.
Monitoring, Observability & Optimization
* Implement logging, monitoring, and alerting for ML pipelines, model endpoints, batch jobs, and platform components.
* Analyze compute, storage, and data transfer usage to optimize cost efficiency across ML workloads.
* Perform incident response, root cause analysis, and long-term remediation planning.
Collaboration & Enablement
* Partner with Data Scientists, ML Engineers, and application teams to operationalize end-to-end machine learning solutions.
* Provide technical guidance on best practices for ML model lifecycle management, deployment patterns, and scalable architectures.
* Contribute to documentation, runbooks, onboarding materials, and internal knowledge bases.
Required Qualifications
* 3+ years of hands-on experience with AWS services, including EKS, EC2, S3, IAM, CloudWatch, and ECR.
* Strong experience operating and troubleshooting Kubernetes (preferably AWS EKS).
* Proficiency in containerization (Docker) and orchestration concepts.
* Strong programming/scripting experience in Python and Bash.
* Experience building and managing CI/CD pipelines (GitLab or equivalent).
* Familiarity with machine learning workflows, including training, inference, and model monitoring.
* Experience with infrastructure-as-code (Terraform or CloudFormation).
* Experience supporting production platforms, including incident management and root cause analysis.
Preferred Qualifications
* Experience managing data analytics platforms and tools (e.g., Domino, SageMaker).
* Experience with ML lifecycle tools such as MLflow or similar.
* Experience supporting GPU-based workloads or distributed training environments.
* Familiarity with enterprise MLOps architectures and patterns (batch, real-time, microservices).
* Understanding of data processing frameworks and feature pipelines.
Other Competencies
* Strong analytical, troubleshooting, and problem-solving skills.
* Effective communication and documentation abilities.
* Ability to collaborate across engineering, analytics, and product teams.
* Self-motivated with the ability to drive initiatives independently.
* Ability to work in a complex, regulated enterprise environment.