Machine Learning Evaluation Specialist

Alignerr
Toronto, Ontario, Canada
Contract
20,000 – 40,000 / year
AI tools: LLMs
Applications go directly to the hiring team

Full Description

About The Role

We're looking for domain experts with strong machine learning backgrounds to design challenging ML evaluation problems that test the boundaries of state-of-the-art AI systems. You'll draw on your specialized research expertise to craft problems that go beyond textbook knowledge: challenges that require deep, nuanced domain understanding to solve correctly.

Your work directly shapes how we measure and improve the next generation of AI models.

* Organization: Alignerr

* Type: Hourly Contract

* Compensation: $200–$400 /hour

* Location: Remote

* Commitment: 10–40 hours/week

What You'll Do

* Propose complex, original machine learning problems rooted in your domain of expertise

* Design evaluation tasks that require advanced domain knowledge beyond standard ML pipelines

* Draw from your own research experience to craft problems that would challenge a highly capable LLM

* Define clear problem statements, evaluation criteria, and gold-standard solutions

* Assess AI-generated ML solutions for correctness, creativity, and methodological rigor

* Document problem difficulty, required domain knowledge, and expected failure modes

* Collaborate asynchronously with a global team of researchers and engineers

Who You Are

* Graduate-level expertise (MS or PhD preferred) in a scientific or technical domain that intersects with machine learning

* Strong working knowledge of ML methods: model selection, feature engineering, evaluation metrics, and pipeline design

* Deep familiarity with active research problems in your field

* Ability to identify where general ML knowledge falls short and specialized domain insight becomes critical

* Experience publishing or conducting original research is highly valued

* Excellent written communication, with the ability to articulate complex problems clearly and precisely

* Self-motivated and comfortable working independently on intellectually demanding tasks

Example Domains (Not Exhaustive)

* Computational biology, genomics, or bioinformatics

* Climate science and environmental modeling

* Medical imaging and healthcare ML

* Materials science and computational chemistry

* Astrophysics and signal processing

* Natural language processing for low-resource or specialized corpora

* Robotics, control theory, or reinforcement learning in complex environments

* Financial modeling and quantitative analysis

Why Join Us

* Work at the frontier of AI evaluation and safety research

* Collaborate with top research labs pushing the boundaries of what AI can do

* Leverage your hard-earned domain expertise in a high-impact, meaningful way

* Full autonomy, flexible schedule, and global collaboration

* Potential for ongoing work, contract extension, and deeper research involvement

* Build your profile as a contributor to cutting-edge AI development

Application Process (Takes 10–15 min)

* Submit your resume highlighting your domain expertise and ML experience

* Complete a short screening assessment

* Project matching and onboarding

PS: Our team reviews applications daily. Please complete your application steps to be considered for this opportunity.
