We are sharing a specialised consulting opportunity, available on a full-time or part-time contract basis, for bilingual professionals with strong English and Arabic language skills, analytical judgment, and experience evaluating general-purpose conversational AI systems.

This role supports leading AI teams working to improve the quality, usefulness, and reliability of large language models used across a wide range of everyday and professional scenarios.

Selected professionals will help evaluate model-generated responses, fact-check outputs using trusted public sources, provide structured human feedback, and contribute to improving how advanced AI systems communicate across diverse topics and use cases.

Key Responsibilities

Professionals in this role may contribute to:

AI Response Evaluation

Evaluate LLM-generated responses based on their ability to effectively answer user queries

Assess reasoning quality, clarity, tone, completeness, and overall usefulness of model outputs

Ensure responses align with expected conversational behaviour and system guidelines

Fact-Checking & Feedback

Conduct fact-checking using trusted public sources and external tools

Generate high-quality human evaluation data by annotating response strengths, weaknesses, and factual inaccuracies

Identify reasoning errors, communication gaps, and subtle issues in model outputs

Annotation Quality & Consistency

Apply consistent annotations by following clear taxonomies, benchmarks, and detailed evaluation guidelines

Produce clear, reproducible evaluation artifacts

Help surface issues before public deployment and support measurable improvements in response quality

Ideal Profile

Strong candidates may have:

Bachelor's degree

Native-level Arabic proficiency (ILR 5, or CEFR C2 equivalent)

Strong fluency in English

Significant experience using large language models and understanding how and why people use them

Excellent writing skills and ability to clearly articulate nuanced feedback

Strong attention to detail and ability to notice subtle issues others may overlook

Adaptability across topics, domains, and customer requirements

Background in fields requiring structured analytical thinking such as research, policy, analytics, linguistics, or engineering

Excellent college-level mathematics skills

Preferred Qualifications

Prior experience with RLHF, model evaluation, or data annotation work

Experience writing or editing high-quality written content

Experience comparing multiple outputs and making fine-grained qualitative judgments

Familiarity with evaluation rubrics, benchmarks, or quality scoring systems

Availability for either full-time or part-time contract work

Why This Opportunity

Work at the frontier of human-in-the-loop AI development

Help shape how advanced language models behave in real-world settings

Contribute to improving AI systems used by millions of people

Flexible remote contract work with competitive compensation

Contract Details

Independent contractor role

Fully remote with flexible scheduling

Compensation of $22.64/hr

Full-time or part-time contract work

Weekly payments via Stripe or Wise

Projects may be extended, shortened, or concluded early depending on project needs and performance

Work will not involve access to confidential or proprietary information from any employer, client, or institution

Please note: We are unable to support H-1B or STEM OPT candidates at this time

Location restricted to Egypt, Saudi Arabia, UAE, or the United States

About The Platform

This opportunity is available through a leading AI-driven work platform that connects domain experts with frontier AI research projects.

Experts contribute to improving advanced AI systems by providing specialised expertise across model evaluation, fact-checking, structured annotation, and conversational quality assessment.

By submitting this application, you acknowledge that your information may be processed by 24-MAG LLC for recruitment and opportunity matching in accordance with our Privacy Policy: https://www.24-mag.com/privacy-policy

Remote | Generalist - English & Arabic — $22.64/hr
