Machine Learning Researcher

Bagel Labs
Canada
Full-time
AI tools: TensorFlow, PyTorch

Bagel Labs is an artificial intelligence research lab developing novel methods for distributed training of frontier diffusion models on commodity hardware. Our work enables training of state-of-the-art generative models for image, video, and world modelling without centralized GPU superclusters, cutting training compute capex by up to 50%.

We ignore years of experience and pedigree. If you have high agency — meaning your default assumption is that you can control the outcome of whatever situation you are in — we want to hear from you. Every requirement below is flexible for a candidate with high enough agency and tolerance for ambiguity.

Role Overview

We encourage curiosity-driven research and welcome bold, untested ideas that challenge conventional paradigms. You will push the boundaries of machine learning and distributed systems, testing hypotheses across generative AI, representation learning, and scalable infrastructure.

Key Responsibilities

* Develop novel training algorithms, optimization strategies, and model architectures that unlock new capabilities in efficiency, robustness, or generalization.

* Design and train large-scale models, spanning generative, discriminative, and self-supervised paradigms, that scale across distributed infrastructure.

* Publish papers at top-tier ML venues, organize workshops, and keep our roadmap aligned with the latest academic advances.

* Share insights through internal notes, external blog posts, and conference-grade write-ups (e.g., blog.bagel.com).

* Contribute to open-source code and stay active in the ML community.

Who You Might Be

* Strong foundations in modern deep learning: transformer architectures, self-supervised learning, generative modelling, or reinforcement learning.

* Hands-on experience with distributed training frameworks: FSDP, DeepSpeed, Megatron-LM, or custom implementations of tensor/pipeline/data parallelism.

* Solid mathematical grounding in optimization, probability, statistics, and linear algebra — comfortable deriving and implementing novel objectives from scratch.

What We Offer

* Top-of-market compensation and time to pursue open-ended research.

* A deeply technical culture where bold, frontier ideas are debated, stress-tested, and built.

* Opportunities to publish at, and paid travel to, top ML conferences around the world.

Applications go directly to the hiring team.