
Engineering · AI / Machine Learning / LLM Engineering

AI Evaluation Engineer

Help track model quality and improve results with clear checks.

Remote · Full-time · Mid
Evaluation · AI · LLM · Quality

Snapshot

Team: Engineering
Category: AI / Machine Learning / LLM Engineering
Type: Full-time
Location: Remote
Level: Mid

Focus

Evaluation, scoring, quality checks

Role overview

You will design evaluation procedures, score model outputs, and help teams understand model quality. You'll back each model update with clear results.

Responsibilities

  • Create evaluation tests.
  • Review model outputs.
  • Track metrics and share reports.
  • Suggest small improvements.

Requirements

  • 2+ years of experience in AI or QA.
  • Comfort with Python.
  • Basic data analysis skills.
  • Clear communication.

Perks

Remote work

Learning support

Tooling help

Weekly syncs

Ready?

This could be your best role yet.

Apply now or talk to us about the team.
