Bohan Lyu

I am an undergraduate at Tsinghua University, pursuing a dual degree in Computer Science and Economics since 2022.

In the summer of 2023, I joined the Tsinghua NLP Lab, advised by Prof. Zhiyuan Liu. In the summer of 2024, I interned at the Rose-STL-Lab in the Department of CSE at UCSD, advised by Prof. Rose Yu. I am now working on LLMs for automated theorem proving in a formal language (Lean 4) at PLI, Princeton.

Before that, I interned as a quantitative researcher at China Securities in 2023.

My research interests lie in facilitating scientific discovery with large models. My current research questions are:
1. How can we enable models to learn from their interactions with the physical world and drive progress in scientific discovery?
2. How can we rigorously evaluate the capabilities of scientific agents with methods that go beyond static datasets?
3. How can we verify natural-language mathematical reasoning with formal languages?

I have served as a reviewer for ICLR 2025 and for workshops at ICLR 2024, ICML 2024, and ACL 2024.

I am seeking PhD opportunities starting in Fall 2026. I welcome any collaboration or discussion. Please feel free to reach out!

Email  /  GitHub  /  Google Scholar  /  Twitter  /  CV

   


News

  • 2024-12 Delivered a talk at the Challenge Cup Seminar hosted by the Department of Automation, Tsinghua University.
  • 2024-11 Presented my summer project at AAAI FSS.
  • 2024-10 Awarded the Scholarship for Technological Innovation (5,000 RMB).
  • 2024-06 My team placed 1st out of 600+ teams in a Baidu Inc. data mining competition.
  • 2024-05 I became a member of the 'Sparking Program', the most prestigious and selective academic organization for students at Tsinghua University.
  • 2024-04 I led my team to win Second Place and the Newcomer Prize at Tsinghua University's Challenge Cup.
  • 2024-04 I received a Grade A in the 'Global Exploration Initiative' and was awarded a grant of 20,000 RMB.

Research (* indicates equal contribution)

I'm interested in Natural Language Processing, AI4Science and multimodal learning.


Goedel-Prover: A New Frontier in Automated Theorem Proving


Yong Lin*, Shange Tang*, Bohan Lyu, Jiayun Wu, Hongzhou Lin, Kaiyu Yang, Jia Li, Mengzhou Xia, Danqi Chen, Sanjeev Arora, Chi Jin
Preprint, 2025
code / website /

The 7B Goedel-Prover sets a new state of the art in open-source automated theorem proving, beating previous records with a 7% improvement on miniF2F, topping the PutnamBench Leaderboard, and solving nearly twice as many problems on Lean Workbook.


Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation


Bohan Lyu*, Yadi Cao*, Duncan Watson-Parris, Leon Bergen, Taylor Berg-Kirkpatrick, Rose Yu
AAAI FSS 2024 (Oral); main conference version under review
arxiv / slides / youtube /

This work proposes a fine-tuning method where LLMs internalize tool-generated solutions (World Knowledge Distillation) and learn to switch between direct answers and tool use for complex problems (Tool Usage Adaptation). It outperforms GPT-4 and Claude-3.5 across six scientific benchmarks.


Enhancing LLM’s Capabilities in Open Domains via Autonomous Tool Integration


Bohan Lyu*, Xin Cong*, Heyang Yu, Pan Yang, Yujia Qin, Yining Ye, Yaxi Lu, Zhong Zhang, Yukun Yan, Yankai Lin, Zhiyuan Liu, Maosong Sun
Under Peer Review, 2023
arxiv /

This work develops an autonomous agent that leverages GitHub repositories to extend its capabilities and address diverse user queries, and introduces a new agent architecture that achieves SOTA performance on SciAct.






Design and source code from Leonid Keselman's website