Founding MLE (On-device AI)
Greylock Partners
Summary
One of our early-stage AI investments, founded by a successful repeat entrepreneur, is looking to hire a Senior Machine Learning Engineer or Applied Research Scientist focused on efficient on-device and edge-deployed language models.
This will be one of the earliest ML hires and a foundational technical role inside the company. The person will help architect and build core ML systems focused on efficient inference, model optimization, deployment reliability, and production-scale edge AI infrastructure.
The role sits at the intersection of applied ML research, inference systems, and production engineering.
What You’ll Work On
* Architect and build core ML infrastructure for efficient language model deployment
* Optimize small language models for constrained compute and memory environments
* Improve inference latency, throughput, memory footprint, and deployment reliability
* Develop production-ready pipelines for model evaluation, benchmarking, deployment, and monitoring
* Translate experimental research code into scalable, maintainable production systems
* Work closely across research and engineering to productionize new model capabilities
* Help define long-term technical direction across edge AI and inference systems
Key Qualifications
* Strong experience deploying ML systems or language models in constrained runtime environments
* Deep understanding of model optimization techniques, including quantization, distillation, and pruning, and of efficient inference
* Experience with modern inference runtimes, deployment frameworks, or accelerated ML systems
* Strong systems intuition around latency, memory efficiency, and real-time inference behavior
* Strong PyTorch experience and comfort operating across both research and production environments
* Experience building scalable ML infrastructure and evaluation pipelines
* Ability to operate independently in ambiguous, zero-to-one startup environments
* Prior experience leading small, high-impact technical initiatives or teams preferred
Strong Plus Signals
* Experience working with small language models (SLMs) or edge-deployed LLM systems
* Background in embedded AI, systems optimization, runtime engineering, or inference infrastructure
* Familiarity with low-level performance optimization or hardware-aware ML deployment
* Experience productionizing transformer-based systems in resource-constrained environments
Background
* Advanced degree in Computer Science, Electrical Engineering, Applied Mathematics, or a related field preferred
* 4+ years of relevant industry or applied research experience
Please note:
There are no fees associated with any of the support we provide our investments. Greylock Talent provides free candidate referrals/introductions to all of our active investments (one of the many services we provide).
Due to the volume of applicants we typically receive, a follow-up email will not be sent unless a match is identified.