Machine Learning Engineer | Remote

Crossing Hurdles
United States
Contract
$70 – $120 / hour
Applications go directly to the hiring team

Join Crossing Hurdles as a PhD Rater in a part-time, remote role where you will tackle real-world STEM problems in data science and machine learning. You'll collaborate with a team dedicated to improving AI model evaluation and contribute to innovative projects that have a significant impact on the field.

Contract
Fully Remote
Entry Level
PhD

Skills & Expertise

Python
Data Science
Machine Learning
Software Engineering
AI Model Evaluation
Research
Benchmark Problems

Key Responsibilities

Design challenging, real-world STEM benchmark problems in various domains.

Implement tasks using Python within an agentic development environment.

Evaluate and analyze AI model behavior and diagnose reasoning failures.

Full Description

Position: PhD Rater

Type: Part-Time

Compensation: $70–$120/hour

Location: Remote

Commitment: 30+ hours/week (primarily weekdays)

Role Responsibilities

* Design challenging, real-world STEM benchmark problems in domains such as data science, machine learning, finance, and software engineering.

* Implement tasks within an agentic development environment using Python.

* Create reproducible problem setups with clear specifications and executable tests.

* Evaluate and analyze AI model behavior, including reasoning traces and agent workflows.

* Diagnose reasoning failures, logic gaps, and problem-solving limitations in AI systems.

* Contribute to improving benchmark quality and evaluation frameworks for frontier AI models.

Requirements

* Currently enrolled in a PhD program or recently completed a PhD.

* Deep expertise in data science, machine learning, finance, and/or Python-based software development.

* Strong research background in advanced STEM topics.

* Ability to commit reliably for 30+ hours per week.

* Demonstrated technical output, such as high-quality open-source contributions or research work.

* Ability to analyze agent behavior traces and diagnose failures beyond surface-level errors.

Application Process

* Upload resume

* Interview

* Submit form
