About The Role

We're looking for domain experts with strong machine learning backgrounds to design challenging ML evaluation problems that test the boundaries of state-of-the-art AI systems. You'll draw on your specialized research expertise to craft problems that go beyond textbook knowledge and require deep, nuanced domain understanding to solve correctly.

Your work directly shapes how we measure and improve the next generation of AI models.

* Organization: Alignerr

* Type: Hourly Contract

* Compensation: $200–$400/hour

* Location: Remote

* Commitment: 10–40 hours/week

What You'll Do

* Propose complex, original machine learning problems rooted in your domain of expertise

* Design evaluation tasks that require advanced domain knowledge beyond standard ML pipelines

* Draw from your own research experience to craft problems that would challenge a highly capable LLM

* Define clear problem statements, evaluation criteria, and gold-standard solutions

* Assess AI-generated ML solutions for correctness, creativity, and methodological rigor

* Document problem difficulty, required domain knowledge, and expected failure modes

* Collaborate asynchronously with a global team of researchers and engineers

Who You Are

* Graduate-level expertise (MS or PhD preferred) in a scientific or technical domain that intersects with machine learning

* Strong working knowledge of ML methods — model selection, feature engineering, evaluation metrics, and pipeline design

* Deep familiarity with active research problems in your field

* Ability to identify where general ML knowledge falls short and specialized domain insight becomes critical

* Experience publishing or conducting original research is highly valued

* Excellent written communication — able to articulate complex problems clearly and precisely

* Self-motivated and comfortable working independently on intellectually demanding tasks

Example Domains (Not Exhaustive)

* Computational biology, genomics, or bioinformatics

* Climate science and environmental modeling

* Medical imaging and healthcare ML

* Materials science and computational chemistry

* Astrophysics and signal processing

* Natural language processing for low-resource or specialized corpora

* Robotics, control theory, or reinforcement learning in complex environments

* Financial modeling and quantitative analysis

Why Join Us

* Work at the frontier of AI evaluation and safety research

* Collaborate with top research labs pushing the boundaries of what AI can do

* Leverage your hard-earned domain expertise in a high-impact, meaningful way

* Full autonomy, flexible schedule, and global collaboration

* Potential for ongoing work, contract extension, and deeper research involvement

* Build your profile as a contributor to cutting-edge AI development

Application Process (Takes 10–15 min)

* Submit your resume highlighting your domain expertise and ML experience

* Complete a short screening assessment

* Get matched to a project and complete onboarding

PS: Our team reviews applications daily. Please complete all application steps to be considered for this opportunity.

Machine Learning Evaluation Specialist
