AI Evaluation Specialist hourly

Upamind AI | HQ: Houston, Texas, United States | Remote job | Jun 15

In this role, you'll apply your expertise to help train next-generation AI systems. Your work will shape how models learn, reason, and perform through high-quality, real-world input.

Key Responsibilities

  • Design and implement self-contained evaluation tasks (prompts, supporting files, and detailed grading rubrics) that assess AI performance on practical computer-based workflows.
  • Define clear, unambiguous written criteria for what constitutes successful and unsuccessful task completion across diverse administrative and workflow scenarios.
  • Meticulously observe and document AI agent behaviors, producing crisp, precise summaries and reports in high-quality English.
  • Iterate on and refine evaluation tasks and rubrics based on feedback and team collaboration to ensure robust benchmarking methodologies.
  • Work cross-functionally across a wide range of domains, adapting evaluation frameworks as project requirements evolve.
  • Collaborate with the customer's team to share insights and help drive continuous improvement in AI evaluation techniques.
  • Champion meticulousness, structured observation, and clear written communication throughout all project deliverables.

Required Skills and Qualifications

  • Minimum 3 years of experience in roles emphasizing written precision and structured thinking, such as paralegal, executive assistant, junior analyst, librarian, document archival specialist, research assistant, technical writer, or QA analyst.
  • Native or fluent written English, with a demonstrated ability to produce observations that are succinct, specific, and unambiguous.
  • Proven skill in designing or applying rubric-based evaluation, grading against set criteria, or building structured scoring frameworks.
  • High attention to detail and ability to notice subtle patterns or inconsistencies others might miss.
  • Exceptional written and verbal communication skills, especially for documenting nuanced observations and feedback.
  • Fluency in navigating computers, common SaaS tools, web browsers, file management, and document editing platforms.
  • Strong self-direction, with the ability to independently take ownership of ambiguous or loosely defined projects.

Requirements

  • Availability: Hourly contract
  • Languages: English

$30/hr