Job Overview:

We are looking for a highly skilled AI/LLM Engineer to lead the training, alignment, and optimization of large language models. This role focuses on Reinforcement Learning from Human Feedback (RLHF) and end-to-end post-training pipelines, while ensuring models are efficient and production-ready.

Key Responsibilities:

* Lead and manage the end-to-end RLHF pipeline (data collection, reward modeling, and RL fine-tuning with PPO, DPO, GRPO, and RLAIF)

* Design and implement Supervised Fine-Tuning (SFT) pipelines using models like LLaMA, Mistral, and Qwen

* Build and train reward models based on human feedback

* Develop annotation pipelines (guidelines, calibration, dataset curation)

* Apply Constitutional AI & RLAIF to reduce manual labeling

* Perform model evaluation & red teaming for safety and quality

* Create benchmarks for performance, alignment, and reliability

* Optimize inference pipelines (llama.cpp, vLLM, TensorRT)

* Implement model optimization: quantization (INT4/INT8/FP8) and parameter-efficient fine-tuning (LoRA, QLoRA)

* Troubleshoot training issues & production bugs

* Collaborate with teams to bring research into production

* Stay current with the latest advancements in AI, RL, and LLMs

Qualifications:

* Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related field

* 3-5 years of proven experience in AI/ML, NLP, or LLM development

* Strong understanding of Reinforcement Learning and RLHF

Required Skills:

* Hands-on experience with end-to-end RLHF pipelines

* Strong knowledge of PPO, KL divergence, reward shaping

* Experience with DPO and related techniques

* Familiarity with RLAIF & Constitutional AI

* Strong Python programming skills

* Solid background in math (probability, linear algebra, optimization)

* Experience troubleshooting training instabilities

* Understanding of transformers & LLM workflows

* Experience with distributed training

Technical Skills:

* Languages: Python, C/C++, Rust (preferred)

* Frameworks: PyTorch, TensorFlow, Hugging Face

* Inference Tools: llama.cpp, vLLM, TensorRT

* Data Tools: FAISS, Milvus, RAG pipelines

* Integration: APIs, agent systems, external tools

Good to Have:

* Experience with vector databases & retrieval systems

* Exposure to Rapid Application Development (RAD)

* Strong interest in AI alignment & safety

Why Join Us?

* Work on cutting-edge AI technologies

* Build impactful, real-world LLM solutions

* Be part of an innovative and collaborative team

Additional information:

Location: Candidates based in the Philippines are preferred.

Availability: Can start immediately or as soon as possible.

Flexibility: Open to any shift assignment, including the graveyard shift.

Artificial Intelligence Engineer
