
Broxer


Virtusa

Model Validator for Agentic AI

Andhra Pradesh, India · Full Time


Experience: Any
Salary: —
Openings: 1
Posted: 3 weeks ago

Work mode: In office
Resume: Required to apply


Job description

Role Overview

This role validates AI agent models by building evaluation datasets that benchmark both their reasoning and their task performance.

Key Responsibilities

Create evaluation sets that measure agent performance based on reasoning pathways.
Conduct adversarial testing by providing conflicting instructions to challenge the agent.
Perform stochastic regression testing to identify variations in agent behavior over time.
Verify accuracy of agent calls to external APIs and databases.
Analyze the reasoning chains of the agent and pinpoint divergences from business requirements documentation.
Use judge large language models (LLM-as-a-judge) to assess output quality.
Work proficiently in Python and relevant frameworks such as DeepEval and LangSmith.
Conduct semantic debugging by reviewing the agent's thought process traces.
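
As an illustration of the regression-testing responsibility above, here is a minimal sketch of how a non-deterministic agent might be compared across versions. It assumes the agent can be called as a plain function from prompt to text; `make_fake_agent`, `pass_rate`, `has_regressed`, and the 0.05 tolerance are hypothetical names and values chosen for this example, not part of the posting or of any framework it mentions.

```python
import random
from typing import Callable

Agent = Callable[[str], str]


def make_fake_agent(success_prob: float, seed: int) -> Agent:
    """Hypothetical stand-in for a real agent: answers correctly with a given probability."""
    rng = random.Random(seed)

    def agent(prompt: str) -> str:
        return "REFUND_APPROVED" if rng.random() < success_prob else "REFUND_DENIED"

    return agent


def pass_rate(agent: Agent, check: Callable[[str], bool], prompt: str, runs: int) -> float:
    """Run the same prompt several times and return the fraction of outputs that pass the check."""
    passed = sum(1 for _ in range(runs) if check(agent(prompt)))
    return passed / runs


def has_regressed(baseline_rate: float, candidate_rate: float, tolerance: float = 0.05) -> bool:
    """Flag a regression when the candidate's pass rate drops below the baseline by more than the tolerance."""
    return candidate_rate < baseline_rate - tolerance


if __name__ == "__main__":
    def check(output: str) -> bool:
        return output == "REFUND_APPROVED"

    prompt = "Customer returned the item within 30 days. Decide the refund."
    baseline = pass_rate(make_fake_agent(0.95, seed=1), check, prompt, runs=200)
    candidate = pass_rate(make_fake_agent(0.80, seed=2), check, prompt, runs=200)
    print(f"baseline={baseline:.2f} candidate={candidate:.2f} "
          f"regressed={has_regressed(baseline, candidate)}")
```

Because a single run of a stochastic agent proves little, the sketch compares pass rates over repeated runs rather than individual outputs. A real setup would likely use a proper statistical test and a larger evaluation set.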

Preferred Qualifications and Skills

Background as a Software Development Engineer in Test (SDET), with coding and testing expertise that extends to building evaluations.
Strong knowledge of data testing and SQL querying for validation purposes.
Domain experience relevant to specific roles is considered advantageous.

Skills

SQL, Software Testing, LLM evaluation, Python Programming, Regression Testing, API Testing, Model Validation, Adversarial testing, Semantic Debugging, DeepEval Framework, LangSmith Framework
