AI Safety & Alignment Specialist
Pasiflora AI
About Pasiflora AI
Pasiflora AI builds the human layer for frontier AI. We deploy small teams of credentialed specialists (PhDs, MDs, JDs, and senior practitioners) across 431+ fields to produce the high-quality training, evaluation, and red-team data that frontier labs depend on. Our work powers reinforcement learning, alignment research, and post-training pipelines at some of the most demanding AI organizations in the world.
We're hiring AI Safety & Alignment Specialists into our expert network to lead and contribute to client engagements focused specifically on making AI systems safer, more aligned, and more reliable.
About the role
As an AI Safety & Alignment Specialist on Pasiflora's bench, you'll work on contract engagements with frontier AI labs and research organizations. Each engagement is scoped, time-bounded, and fully managed by Pasiflora. You focus on the work, not on chasing clients or wrangling logistics.
Typical engagements include:
* Designing and executing red-team evaluations to surface harmful, deceptive, or unsafe model behaviors
* Building and refining safety-focused scoring rubrics for RLHF and RLAIF pipelines
* Producing high-quality safety training data, including adversarial prompts, refusal examples, and edge-case scenarios
* Auditing existing model outputs against safety and alignment criteria
* Contributing to interpretability or behavioral analysis research projects
* Advising on safety taxonomies, threat models, and evaluation methodologies
Engagement length varies from one-week sprints to multi-month embedded work. We match you to projects that fit your specific expertise, availability, and interests.
What you'll do
* Apply your domain expertise to safety and alignment data work that directly improves how frontier models behave
* Operate flexibly across creating, evaluating, and mentoring — depending on the engagement, you may design tasks, audit other experts' work, or coach newer specialists
* Maintain Pasiflora's six-layer quality bar on every deliverable
* Communicate findings, concerns, and methodological recommendations clearly to client teams
* Stay current on the rapidly evolving safety and alignment research landscape
What we're looking for
Required:
* Graduate-level degree (PhD preferred, Master's considered with strong experience) in a relevant field: ML, CS, cognitive science, philosophy of mind, mathematics, statistics, or a directly related discipline
* Demonstrable expertise in at least one of: AI safety research, alignment research, red-teaming, model evaluation, interpretability, or AI policy with technical depth
* Prior hands-on experience with LLMs, RLHF pipelines, or AI evaluation frameworks
* Strong written communication — much of this work involves producing rubrics, written assessments, and structured findings
* Ability to work independently, manage your own deliverables, and meet quality standards without close supervision
Strongly preferred:
* Published research, technical reports, or public writing on AI safety, alignment, or related topics
* Prior contract or consulting work for frontier AI labs (Anthropic, OpenAI, DeepMind, Meta AI, etc.) or established AI safety organizations (MIRI, ARC, METR, Apollo Research, Redwood, etc.)
* Experience designing or executing red-team campaigns at scale
* Familiarity with current safety evaluation methodologies (e.g., dangerous capability evals, deception evals, sycophancy detection)
* Background in adjacent technical domains where safety expertise compounds: cybersecurity, biosecurity, formal verification, or game theory
Why work with Pasiflora
* Real impact, real models. Our engagements directly inform how frontier AI systems get trained and deployed. Your work matters.
* Fully managed engagements. You don't chase clients, write proposals, or handle billing. We bring you scoped work and you do it.
* Compounding expertise. Most specialists work across multiple engagements over time, building deep familiarity with the safety landscape across different labs and approaches.
* Small, elite teams. You'll work alongside other credentialed specialists, not generalist crowds. The quality of your peers raises the quality of the work.
* Flexible commitment. Take engagements that fit your schedule. Decline ones that don't.
How to apply
Submit your application through our expert portal. Applications are reviewed by domain leads, not generic recruiters. If you're a strong fit, you'll typically hear back within one week, with a screening conversation scheduled shortly after.
Apply to Pasiflora's expert network →
Pasiflora AI is an equal opportunity organization. We work with specialists from every background and geography. What matters is your expertise, your judgment, and the quality of your work.