We’re hiring a Code Reviewer with deep Python expertise to review evaluations completed by data annotators assessing AI-generated Python code responses. Your role is to ensure that annotators follow strict quality guidelines related to instruction-following, factual correctness, and code functionality.
Responsibilities
Review and audit annotator evaluations of AI-generated Python code.
Assess whether the Python code follows the prompt instructions, is functionally correct, and is secure.
Validate code snippets using proof-of-work methodology, including Docker-based execution environments when applicable.
Identify inaccuracies in annotator ratings or explanations.
Provide constructive feedback to maintain high annotation standards.
Work within Project Atlas guidelines for evaluation integrity and consistency.
Required Qualifications
5–7+ years of experience in Python development, QA, or code review.
Strong knowledge of Python syntax, debugging, edge cases, and testing.
Comfortable using code execution environments and testing tools, including Docker containers for environment isolation and reproducibility.
Excellent written communication and documentation skills.
Experience working with structured QA or annotation workflows.
English proficiency at B2, C1, C2, or Native level.
Preferred Qualifications
Experience in AI training, LLM evaluation, or model alignment.
Familiarity with annotation platforms.
Exposure to RLHF (Reinforcement Learning from Human Feedback) pipelines.
Working knowledge of Docker for replicating or reviewing complex code environments.
Why Join Us?
Join a high-impact team working at the intersection of AI and software development. Your Python expertise will directly influence the accuracy, safety, and clarity of AI-generated code. This role offers remote flexibility, milestone-based delivery, and competitive compensation.