- Experience: Any
- Salary: Not specified
- Openings: 1
- Posted: 1 day ago
- Work mode: In office
- Eligibility: SDETs and candidates with strong testing backgrounds are preferred for this role. Applicants with domain knowledge relevant to the work will have an added advantage.
- Resume: Required to apply
Job Description
Role Overview
This position is for a Model Validator working on agentic AI systems. The role focuses on designing evaluation approaches, stress-testing agent behavior, and verifying that model outputs and tool usage align with business requirements.
Key Responsibilities
- Build evaluation sets that can benchmark agent performance by tracing reasoning paths.
- Carry out adversarial tests by presenting conflicting or challenging instructions to expose weak spots in the agent.
- Run regression checks to measure how much agent behavior changes across test cycles.
- Validate tool calls to ensure the agent is invoking the right external APIs and databases.
- Review the agent's chains of thought and pinpoint where its logic starts to drift from the business requirements document.
- Use LLM-as-a-judge scoring to assess model outputs.
- Perform semantic debugging: analyze the agent's reasoning trace to identify where its decisions go wrong.
Skills and Technical Expectations
- Strong Python programming ability.
- Hands-on familiarity with evaluation frameworks such as DeepEval and LangSmith.
- Solid understanding of data- and SQL-driven testing methods.
- Ability to work with model traces, reasoning paths, and output grading workflows.
Screening Criteria
- SDET profiles are preferred because their coding and testing background suits evaluation development.
- Prior domain exposure is an added advantage for this role.
Location and Work Mode
This is a full-time onsite opportunity based in Andhra Pradesh, India.