Senior AI Ops Engineer
Vericence
About Vericence
Vericence is a digital engineering and technology consulting firm helping enterprises build AI-driven platforms, modernize legacy systems, and scale innovation through cloud, data, and intelligent engineering. We partner with global organizations to deliver high-impact technology solutions and build world-class engineering teams.
Role Overview
We are seeking a highly skilled Senior AI Ops Engineer to support and scale enterprise AI/ML platforms within a strategic Oracle Cloud Infrastructure (OCI) ecosystem. This role sits at the intersection of AI engineering, platform engineering, and DevOps, and focuses on building, deploying, and maintaining reliable, scalable AI systems.
The ideal candidate will have strong hands-on experience with Kubernetes, Python, distributed systems, and cloud-native architectures, with a preference for Oracle Cloud Infrastructure (OCI).
Key Responsibilities
* Design, deploy, and manage AI/ML infrastructure and pipelines on cloud platforms (OCI)
* Build and maintain scalable, production-grade AI platforms using Kubernetes and containerization
* Develop and optimize CI/CD pipelines for ML workflows (training, validation, deployment, monitoring)
* Collaborate with data scientists and engineers to move ML models into production
* Implement monitoring, logging, and observability frameworks for AI systems
* Ensure high availability, performance, and reliability of distributed AI systems
* Build infrastructure-as-code (IaC) and automation for repeatable deployments
* Optimize compute, storage, and networking for AI workloads on cloud platforms
* Support MLOps best practices, including model versioning, governance, and lifecycle management
Required Skills & Experience
* 6-10+ years of experience in AI/ML Engineering, DevOps, or Platform Engineering
* Strong programming experience in Python
* Hands-on expertise with Kubernetes (EKS/AKS/OKE preferred)
* Experience working with distributed systems and microservices architecture
* Strong experience with CI/CD tools (GitHub Actions, Jenkins, etc.)
* Experience with containerization (Docker)
* Exposure to MLOps frameworks and tools
* Solid understanding of cloud platforms (OCI strongly preferred; AWS/Azure acceptable)