We are a Y-Combinator-backed startup building your AI-powered Recruiter Agent
Generalist - English & Chinese (Mandarin)
Location: United States
Posted: 2 days ago
Salary: Not specified
Seniority: Mid Level
Job Description
This role is for one of our clients
Compensation: $36.16 per hour
Location: Geography restricted to Taiwan, Malaysia, USA
Type: Full-time or Part-time Contract Work
Fluent Language Skills Required: English & Chinese (Mandarin)
This role focuses on improving the quality, accuracy, and reliability of conversational AI systems. You will evaluate and enhance how large language models (LLMs) respond to real-world queries, ensuring outputs are clear, well-reasoned, and aligned with human expectations across a wide range of topics.
Job Requirements
What You’ll Do
- Evaluate AI-generated responses for their effectiveness in addressing user queries.
- Conduct fact-checking using reliable public sources and external tools.
- Generate high-quality evaluation data by identifying strengths, weaknesses, and factual inaccuracies.
- Assess reasoning, clarity, tone, and completeness of responses.
- Ensure outputs align with expected conversational standards and system guidelines.
- Apply consistent annotations based on defined taxonomies, benchmarks, and evaluation frameworks.
Who You Are
- Bachelor’s degree in any discipline.
- Native-level fluency in Mandarin Chinese (ILR 5 / CEFR C2) and strong proficiency in English.
- Hands-on experience using large language models (LLMs) with a strong understanding of their applications.
- Excellent writing skills with the ability to provide structured and nuanced feedback.
- Strong attention to detail and ability to identify subtle inconsistencies or issues.
- Adaptable and comfortable working across multiple domains and topics.
- Background in fields requiring structured analytical thinking such as research, analytics, policy, linguistics, or engineering.
- Strong mathematical and logical reasoning skills at a college level.
Nice-to-Have Skills
- Experience with RLHF (Reinforcement Learning from Human Feedback), model evaluation, or data annotation.
- Background in writing, editing, or content quality review.
- Experience comparing multiple outputs and making detailed qualitative judgments.
- Familiarity with evaluation frameworks, scoring systems, or benchmarking methodologies.
What Success Looks Like
- Identify factual inaccuracies, reasoning gaps, and communication issues in AI outputs.
- Deliver consistent, high-quality evaluation results.
- Provide actionable feedback that improves model performance and user experience.
- Contribute to building reliable and trustworthy AI systems at scale.
Why Join This Opportunity
- Work at the forefront of human-in-the-loop AI development.
- Contribute to improving AI systems used across diverse real-world applications.
- Flexible, remote contract role with the ability to manage your own schedule.
- Competitive compensation aligned with expertise and contribution level.
Contract & Payment Terms
- Engagement on an independent contractor basis.
- Fully remote with flexible working hours.
- Project duration may vary depending on performance and business requirements.
- Work involves only publicly available information, with no access to confidential or proprietary data.
- Payments processed weekly via Stripe or Wise based on completed work.
- Candidates requiring H-1B or STEM OPT sponsorship are not eligible.
Core Skills
- LLM Evaluation | Content Analysis | Bilingual Communication (English & Mandarin)

