
Founding Machine Learning Scientist (Molecular AI)

Novogaia
San Francisco Bay Area
Full-time
AI tools: PyTorch
Applications go directly to the hiring team

Full Description

Company description

We are building computational systems to discover and develop small molecule medicines from fungi. Nearly half of all oral medicines originate from natural molecules, yet discovery from nature has historically been slow. Advances in mass spectrometry and computation now make it possible to systematically explore nature’s chemical diversity at scale.

We recently introduced Gaia-01, a 1B-parameter foundation model for molecular structure prediction from mass spectrometry that outperforms current state-of-the-art systems on the MassSpecGym benchmark. We are now developing the next generation of this model.

Role description

We are looking for a founding machine learning scientist to design and advance models that infer molecular structure and properties directly from mass spectrometry data.

You will take ownership of the next iteration of our molecular foundation model (Gaia-02), extending spectrum-to-structure prediction into broader molecular reasoning and downstream applications. This role sits at the intersection of machine learning, chemistry, and metabolomics, and involves close collaboration with computational biology and experimental teams.

This is a hands-on, fast-paced role in an early-stage company with significant autonomy and technical responsibility.

What you’ll own

* Lead the development of the next generation of our molecular foundation model for mass spectrometry

* Design and train models that infer molecular structure from mass spectra

* Develop latent molecular representations from MS/MS and related data

* Extend structure predictions into downstream molecular reasoning (e.g., bioactivity, prioritization)

Core experience

* Experience developing and training machine learning models in PyTorch or similar frameworks

* Experience designing novel modeling approaches and implementing the latest methods from the literature

* Ability to independently scope and execute research problems involving large, high-dimensional datasets, including handling noise and distribution shift

* Experience training models at scale (cloud or HPC environments)

* Strong software engineering skills in Python, including writing clean, well-structured, production-quality code

Nice-to-haves

* Experience with generative models (e.g., autoregressive, diffusion, flow) and geometric deep learning (e.g., GNNs, Deep Sets, EGNNs)

* Experience working with molecular, chemical, or spectral datasets

* Familiarity with metabolomics or mass spectrometry workflows and computational models (e.g., MIST, DreaMS, ICEBERG)

What we offer

* Hands-on involvement in tackling unresolved scientific problems, with the opportunity to shape how we think, work, and build from day one

* Competitive salary and early-stage equity: you will be a founding member of this team, and your compensation reflects it

* Comprehensive benefits: medical, dental, and 401(k)

* Visa sponsorship available

We value agency, technical depth, and learning velocity more than years of experience.

If you find this exciting and think you’d be a great fit, we’d love to hear from you. We can go from first conversation to offer decision in days.

To apply, email us at [email protected]. We do not review LinkedIn applications.
