
Software/Production ML Engineer (Python, AWS)

Sphere
North Miami Beach, FL
Full-time
AI tools:
AWS
scikit-learn
pandas
numpy

We are looking for a Software/Production ML Engineer to own and evolve real-world, production-grade AI systems within a fast-paced insurance technology company.

This is a hands-on engineering role focused on building, deploying, and operating customer-facing and internal AI services in production. Our team owns multiple live systems, including real-time decisioning pipelines, AI-driven operational automations, chatbots, and the ML infrastructure that powers them.

This role is not focused on offline modeling or research-only machine learning. We are looking for engineers who take end-to-end ownership of ML systems, from data and features through inference services, deployment, monitoring, and on-call support in production environments. Candidates whose experience is mainly in offline modeling, experimentation, or handoff-based deployment workflows will not be a good fit for this role.

Responsibilities

* Design and build APIs and pub/sub event streams to support real-time machine learning inference and automated agentic processes.

* Contribute to the development and maintenance of both online and offline feature stores for machine learning.

* Gain familiarity with the property and casualty insurance sector, including key policyholder and product attributes, to help enhance model effectiveness.

* Implement industry-standard MLOps and LLMOps techniques to monitor ML models, feature sets, and agentic systems for performance degradation and data drift.

* Support the ongoing development of our core MLOps platform, as well as the codebase and infrastructure for serverless AI applications.

* Validate the performance of machine learning models through rigorous training and testing methodologies.

* Collaborate with Data Science teams to engineer new features, construct transformation pipelines, integrate custom loss functions, and experiment with novel inference strategies such as chaining and shadow deployments.

* Create and scale new agentic AI automations, guiding them from initial proof-of-concept through to full production deployment.

* Construct evaluation frameworks designed to rigorously test AI applications, covering not only standard workflows but also the complex, real-world scenarios common to the car insurance domain.

* Use the Python data ecosystem (scikit-learn, pandas, numpy, and related tools) to deliver machine learning projects.

* Take part in the team's weekly on-call rotation, addressing alerts promptly to maintain high service availability for both customers and internal stakeholders.

Requirements

* Experience writing production-quality Python code.

* Experience with Python data science and machine learning libraries, including scikit-learn, pandas, numpy, and related tools.

* Experience deploying, operating, and supporting ML or AI services in production, including monitoring, incident response, and iterative improvement.

* Hands-on experience with AWS (e.g., Lambda, Step Functions, DynamoDB, IAM, containerized services).

* Experience with Kafka or other event-driven / pub-sub systems.

* Experience with Git and CI/CD pipelines in production environments.

Nice to have

* Experience building or operating MLOps platforms or ML infrastructure.

* Experience with real-time data pipelines and streaming architectures.

* Experience with AI chatbots, LLM-based systems, or retrieval-augmented generation (RAG).

* Familiarity with feature stores, model monitoring, and deployment strategies such as A/B or shadow deployments.

Applications go to the hiring team directly