About Me

I am currently a research assistant at the University of Illinois Urbana-Champaign, working with Prof. Tong Zhang. Previously, I was a visiting student at Rutgers University (2024-2025), advised by Prof. Hao Wang. I obtained my B.S. in Electronic Information Engineering from Wuhan University in 2024, graduating with a GPA of 3.60/4.0.

My research focuses on improving multimodal understanding and reasoning in artificial intelligence systems. I have extensive experience with multimodal large language models (MLLMs), computer vision, and applications of reinforcement learning.

Research Interests

Multimodal Learning: Developing frameworks that bridge visual and textual understanding, particularly in geometric reasoning and cross-modal alignment.

Computer Vision & Reinforcement Learning: Designing efficient neural architectures and reinforcement learning frameworks for visual understanding tasks.

Trustworthy AI: Working on interpretability and reliability of large language models and multimodal systems.

Formal Languages & Reasoning: Exploring the potential of formal languages to augment reasoning capabilities in AI systems.

Recent Work

I am currently working on Geo-Image-Textualization, a reinforcement learning-based framework for generating semantically aligned geometry image-caption pairs. This work produced GeoReasoning-10K, the first dataset with full modality equivalence for geometric reasoning.

My research has been published in top-tier venues including NAACL and Neural Networks, and I have submissions under review at NeurIPS, IJCAI, and TMLR. I have also contributed to comprehensive surveys on continual learning for large language models.