Machine Learning Engineer
Financial News Systems
ML Engineer - Financial NLP
We build fast, specialized NLP models that extract structured information from financial press releases: M&A deals, share buybacks, IPOs, earnings and more. Our system processes thousands of documents daily, turning unstructured text into production-ready structured data.
We're a small team working in a modern Python monorepo, and we move fast. You'll own real problems end-to-end: from data pipelines and labeling through model training and optimized inference.
What You'll Do
- Train and iterate on models for NER on financial text
- Build and maintain data pipelines: database sampling, preprocessing, and training data assembly
- Design label refinement workflows
- Optimize models for production inference via ONNX export, quantization and runtime tuning
- Build Streamlit tools for model inspection, error analysis and annotation review
- Work across the full stack: label guidelines for our labelers, Hydra-based training configs, evaluation metrics and deployment
What Makes You a Great Fit
- MSc or PhD in Computer Science, Computational Linguistics, or a related field (or equivalent research/industry experience in NLP or ML)
- Experience training and deploying transformer-based NLP models, especially for NER or structured extraction
- Strong Python skills. Comfortable with PyTorch, HuggingFace Transformers and Pydantic
- Experience with data pipelines at scale. You've wrangled large, messy datasets and built reproducible workflows
- Familiarity with GCP/Azure/AWS or similar cloud ML infrastructure
- Comfortable with Docker for packaging and deploying ML workloads
- You care about data quality and understand that label quality drives model quality
- Pragmatic engineering instincts. You ship working systems, not over-engineered abstractions
Bonus Points
- Experience with ONNX Runtime optimization (quantization, OpenVINO, hardware-specific compilation)
- Experience with spaCy, LightGBM or other classical NLP/ML tools alongside deep learning
- Familiarity with financial texts
- Track record of building internal tools that accelerate team velocity
Our Stack
Python 3.12+ · PyTorch · HuggingFace Transformers · ONNX Runtime · GCP · Hydra · Pydantic · uv · DVC · Streamlit · spaCy
Questions?
Reach out to Rasmus Jones, our Head of Machine Learning.