Generalizable Geometric Image Caption Synthesis

Published in ICLR 2026 (Under Review), 2024

Key Contributions

  • Proposed Geo-Image-Textualization, a reinforcement learning-based framework for generating semantically aligned geometry image-caption pairs
  • Constructed GeoReasoning-10K, the first dataset with full modality equivalence for geometric reasoning
  • Demonstrated significant improvements in Qwen2.5-VL performance across geometric, arithmetic, algebraic, and numeric domains

Status: ICLR 2025 (Under Review)

Recommended citation: Yue Xin*, Wenyuan Wang*, Rui Pan, Ruida Wang, BingXu Meng, Renjie Pi, Shizhe Diao, Tong Zhang. "Generalizable Geometric Image Caption Synthesis." ICLR 2025 (Under Review).