Hi, I'm Zhehao Zhang

I'm a PhD Student in Computer Science at The Ohio State University, working on Language Agents.

I am a first-year PhD student in Computer Science & Engineering at The Ohio State University and a member of the OSU NLP Lab, advised by Prof. Yu Su and closely collaborating with Prof. Huan Sun. Previously, I worked as a Research Intern at Stanford SALT Lab, Netflix, Amazon, Adobe Research, and Microsoft Research Lab – Asia. I received my Master's degree from Dartmouth College and my Bachelor's degree from the Artificial Intelligence Honor Class at Shanghai Jiao Tong University.

My research interests lie in Language Agents, Agent Safety, (Recursive) Self-Evolving Agents, and LLM Alignment. I focus on developing methods to evaluate and improve the safety, robustness, and reliability of language agents and LLMs in real-world applications. I believe that agentic AI will drive the next industrial revolution, and I am excited to build agents that are not only capable but also safe, trustworthy, and continually self-improving.

Please feel free to contact me by email (zhang.16420@osu.edu) for collaboration opportunities!

News

2026 June Joined Netflix as a Machine Learning Intern, working on language agents for automatic video editing.

2026 May 🎉 Two papers, AutoElicit and Misaligned Action Detection, are accepted to ICML 2026!

2026 May 🏅 Recognized as a Gold Reviewer for ICML 2026.

2025 Nov 🎉 VipAct, on visual-perception enhancement via specialized VLM agent collaboration and tool-use, is accepted to AAAI 2026!

2025 Sept ✈️ Received the COLM 2025 Travel Grant.

2025 Sept 🎓 Started my PhD journey at the OSU NLP Group, working on language agents, supported by an OSU Fellowship.

2025 July 📰 Our FalseReject work is featured on the Amazon Science Blog and AI Era.

2025 July 🎉 FalseReject, a resource for improving contextual safety and mitigating over-refusals in LLMs via structured reasoning, is accepted to COLM 2025. See you in Montreal!

2025 Mar 🎉 Personalization of Large Language Models: A Survey is accepted to TMLR.

2024 Nov Joined Amazon as an Applied Scientist Intern in Seattle.

2024 Sept 25 🎉 The DARG paper is accepted to NeurIPS 2024. See you in Vancouver!

2024 Sept 20 🎉 One first-author paper on vision-language models for academic chart generation is accepted to EMNLP 2024. See you in Miami!

2024 Sept 5 🎤 Honored to give a talk on Recent Advances in Synthetic Data for Foundation Models at Stanford SALT Lab. Slides.

2024 June 25 📄 A new preprint is released! Please check out DARG, a dynamic evaluation framework that augments current reasoning benchmarks at the level of reasoning graphs.

2024 Mar 13 🎉 One first-author paper on LLMs for hierarchical table analysis is accepted to NAACL 2024. See you in Mexico City!

2024 Mar 🎤 Honored to give a talk on Augmented Language Models at TRIP Lab at Dartmouth, hosted by Prof. Yaoqing Yang. Slides and Recording.

2024 Feb I will join Adobe Research as a Research Intern this summer. See you in San Jose and the Bay Area!

2023 Dec 14 🎉 One first-author paper from my undergraduate studies is accepted to ICASSP 2024.

2023 Oct 27 🎉 The paper "Can Large Language Models Transform Computational Social Science?" is accepted to Computational Linguistics.

2023 Oct 7 🎉 Two first-author papers (CRT-QA and Hate Speech Detection) from my undergraduate studies are accepted to EMNLP 2023. See you in Singapore!

Featured Research Publications

Selected papers on language agents, agent safety, and the alignment and robustness of large language models. See my Google Scholar for the full list.

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

Yuting Ning*, Zhehao Zhang*, Yash Kumar Lal, Boyu Gou, Junyi Li, Weitong Ruan, Chentao Ye, Rahul Gupta, Diyi Yang, Yu Su, Huan Sun (*equal contribution)

arXiv 2026

Project PDF Code Data

When Benign Inputs Lead to Severe Harms: Eliciting Unsafe Unintended Behaviors of Computer-Use Agents

Jaylen Jones*, Zhehao Zhang*, Yuting Ning, Eric Fosler-Lussier, Pierre-Luc St-Charles, Yoshua Bengio, Dawn Song, Yu Su, Huan Sun (*equal contribution)

ICML 2026

Project PDF Code Data

QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks

Jian Xie, Tianhe Lin, Zilu Wang, Yuting Ning, Yuekun Yao, Tianci Xue, Zhehao Zhang, Zhongyang Li, Kai Zhang, Yufan Wu, Shijie Chen, Boyu Gou, Mingzhe Han, Yifei Wang, Vint Lee, Xinpeng Wei, Xiangjun Wang, Yu Su, Huan Sun

arXiv 2026

Project Demo PDF Code Model

When Actions Go Off-Task: Detecting and Correcting Misaligned Actions in Computer-Use Agents

Yuting Ning, Jaylen Jones, Zhehao Zhang, Chentao Ye, Weitong Ruan, Junyi Li, Rahul Gupta, Huan Sun

ICML 2026

Project PDF Code

Distractor Injection Attacks on Large Reasoning Models: Characterization and Defense

Zhehao Zhang, Weijie Xu, Shixian Cui, Chandan K Reddy

arXiv 2025

PDF Code

FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning

Zhehao Zhang, Weijie Xu, Fanyou Wu, Chandan K Reddy

COLM 2025

Project PDF Code

DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph

Zhehao Zhang, Jiaao Chen, Diyi Yang

NeurIPS 2024

Project PDF Code

VipAct: Visual-Perception Enhancement via Specialized VLM Agent Collaboration and Tool-use

Zhehao Zhang, Ryan A. Rossi, Tong Yu, Franck Dernoncourt, Ruiyi Zhang, Jiuxiang Gu, Sungchul Kim, Xiang Chen, Zichao Wang, Nedim Lipka

AAAI 2026

PDF

Is GPT-4V(ision) All You Need for Automating Academic Data Visualization? Exploring Vision-Language Models' Capability in Reproducing Academic Charts

Zhehao Zhang, Weicheng Ma, Soroush Vosoughi

EMNLP 2024

PDF Code

Personalization of Large Language Models: A Survey

Zhehao Zhang, Ryan A. Rossi, Branislav Kveton, Yijia Shao, Diyi Yang, Hamed Zamani, Franck Dernoncourt, Joe Barrow, Tong Yu, Sungchul Kim, Ruiyi Zhang, Jiuxiang Gu, Tyler Derr, Hongjie Chen, Junda Wu, Xiang Chen, Zichao Wang, Subrata Mitra, Nedim Lipka, Nesreen Ahmed, Yu Wang

TMLR 2025

PDF