Academic Researcher

Crossing Hurdles

Remote · Contract

Be an early applicant

Experience
Any
Salary
USD 80 – USD 110 / year
Openings
1
Posted
1 week ago
Work mode
Work from home
Education
PhD
Eligibility
Current or retired professors and PhD candidates in STEM or professional disciplines based in the United States.
Resume
Required to apply

Job description

Role overview

This contract role is for academics and research professionals in the United States who want to contribute to a frontier model evaluation program. The work centers on improving next-generation large language model systems across technical and professional subject areas.

What you will do

  • Create demanding benchmark tasks based on your academic or professional expertise and make sure they reflect real-world use cases.
  • Develop Python-based problem sets that can be executed, clearly specified, and backed by test cases for agent-style workflows.
  • Review model responses to spot weaknesses in reasoning, logic, and problem solving across complex scenarios.
  • Produce gold-standard answers and evaluation rubrics that enable consistent assessment.
  • Study system behavior to identify capability gaps and recurring failure patterns in advanced reasoning tasks.
  • Work with subject-matter experts from STEM and quantitative fields to raise the quality and rigor of evaluations.

Requirements

  • You should be a current or retired professor, or a PhD candidate, in a STEM or professional field such as computer science, mathematics, physics, engineering, statistics, economics, finance, law, or a closely related area.
  • A strong academic record from a leading university or an equivalent research setting is expected.
  • You need practical Python skills used in research, academic work, or a professional environment.
  • You should be able to create executable problem-solving tasks and computational workflows.
  • Prior exposure to benchmarking, structured evaluation, or research-based task design is an advantage.
  • Strong analytical judgment is important for checking logical validity and understanding system behavior.
  • You must be able to work on your own and maintain a steady schedule of at least 30 hours per week on weekdays.

Additional information

This position is a W-2 contingent role based in the United States. The pay range is stated as $80 to $110 per year, and the expected workload is 30+ hours per week. Applicants should proceed through the easy-apply process to continue.
