Generalizable Geometric Image Caption Synthesis
Published in ICLR 2026 (Under Review), 2024
Key Contributions
- Proposed Geo-Image-Textualization, a reinforcement learning-based framework for generating semantically aligned geometry image-caption pairs
- Constructed GeoReasoning-10K, the first dataset with full modality equivalence for geometric reasoning
- Demonstrated significant improvements in Qwen2.5-VL performance across geometry, arithmetic, algebraic, and numeric domains
Status: ICLR 2026 (Under Review)
Recommended citation: Yue Xin*, Wenyuan Wang*, Rui Pan, Ruida Wang, BingXu Meng, Renjie Pi, Shizhe Diao, Tong Zhang. "Generalizable Geometric Image Caption Synthesis." ICLR 2026 (Under Review).
Download Paper | Download Slides | Download BibTeX
