About The Role

The STEM Careers in Brazil role at Rex.zone supports AI/ML training workflows and large language model (LLM) evaluation. In this full-time remote engineering role, you will help improve model performance through high-quality training data, reinforcement learning from human feedback (RLHF) evaluation, and rigorous QA processes.

What You Will Do

* Support LLM training pipelines by validating datasets, rubrics, and evaluation protocols used in RLHF and offline evaluation

* Perform data labeling and QA evaluation across NLP, named entity recognition (NER), and computer vision annotation tasks

* Execute prompt evaluation and response grading to improve helpfulness, correctness, and policy compliance

* Check annotations for compliance with guidelines, identify inconsistencies, and propose guideline clarifications

* Track training data quality metrics (agreement rates, defect taxonomy, escalation patterns) and recommend fixes

* Contribute to content safety labeling workflows (toxicity, self-harm, hate/harassment, privacy, and sensitive content categories)

* Collaborate asynchronously with cross-functional stakeholders (engineering, ops, QA) to unblock delivery and improve throughput

Required Qualifications

* Mid- to senior-level experience in a STEM discipline (engineering, computer science, data, or related technical field) or equivalent practical experience

* Hands-on experience with at least two of: data labeling, QA evaluation, prompt evaluation, RLHF workflows, or dataset auditing

* Strong written communication for producing clear guidelines, rubrics, and defect reports

* Working familiarity with NLP concepts (classification, NER, information extraction) and/or computer vision annotation

* Ability to follow strict annotation guidelines while applying sound judgment to ambiguous edge cases

* Experience collaborating remotely across time zones while delivering consistently on a full-time schedule

FAQ

Is this role remote?

Yes. The role is designed for full-time remote work.

Why does a STEM engineering role include RLHF, data labeling, and LLM evaluation?

Modern AI systems depend on high-quality training data and robust evaluation. This role supports LLM training pipelines through RLHF datasets, prompt evaluation, QA evaluation, and annotation guideline compliance, all of which help improve model performance.
