Generalizable Geometric Image Caption Synthesis

Published in ICLR 2026 (Under Review), 2024

Key Contributions

Proposed Geo-Image-Textualization, a reinforcement learning-based framework for generating semantically aligned geometry image-caption pairs
Constructed GeoReasoning-10K, the first dataset with full modality equivalence for geometric reasoning
Demonstrated significant improvements in Qwen2.5-VL performance across geometry, arithmetic, algebraic, and numeric domains

Status: ICLR 2025 (Under Review)

Recommended citation: Yue Xin*, Wenyuan Wang*, Rui Pan, Ruida Wang, BingXu Meng, Renjie Pi, Shizhe Diao, Tong Zhang. "Generalizable Geometric Image Caption Synthesis." ICLR 2025 (Under Review).

Wenyuan Wang(王文渊)
