Publications
A collection of my research work.
Superstructuring a Task-Sufficient World Model on Top of Visual Foundational Model
Minghao Fu, Fan Feng, Wenyuan Wang, Biwei Huang
WMW Workshop 2026
Analyzing the latent space of world models using t-SNE clustering visualization and attention pattern analysis
MMMG: A Comprehensive and Reliable Benchmark for Multitask Multimodal Generation
Jihan Yao, Yushi Hu, Wenyuan Wang, Bin Han, Shangbin Feng, Guang Yang, Yujie Yi, Bingbing Wen, Ranjay Krishna, Lucy Lu Wang, Yulia Tsvetkov, Noah A. Smith, Banghua Zhu
ICLR 2026
Extended benchmark with additional examples and systematic analysis of generative model failures
Generalizable Geometric Image Caption Synthesis
Yue Xin†, Wenyuan Wang†, Rui Pan, Ruida Wang, Howard Meng, Renjie Pi, Shizhe Diao, Tong Zhang
ICLR under review 2025
Integrated symbolic reasoning with neural-guided heuristics for geometric problem-solving and image generation
Probabilistic Residual User Clustering
Wenyuan Wang, Yusong Zhao, Zihao Xu, Hengyi Wang, Shreya Venugopal, Desmond Lobo, Chengzhi Mao, Qi Xu, Zhigang Hua, Yan Xie, Bo Long, Shuang Yang, Hao Wang
TMLR under review 2025
A causal Bayesian framework that clusters users and models residuals to enhance recommendation accuracy
Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of MLLMs
Hengyi Wang, Haizhou Shi, Shiwei Tan, Weiyi Qin, Wenyuan Wang, Tunyu Zhang, Akshay Nambi, Tanuja Ganu, Hao Wang
NAACL 2025
Evaluated performance of MLLMs on multimodal long-context benchmarks
Continual Learning of Large Language Models: A Comprehensive Survey
Haizhou Shi, Zihao Xu, Hengyi Wang, Weiyi Qin, Wenyuan Wang, Yibin Wang, Zifeng Wang, Sayna Ebrahimi, Hao Wang
ACM Computing Surveys 2024
Comprehensive survey of advancements in continual learning for large language models
Multi-tailed Vision Transformer for Efficient Inference
Yunke Wang, Bo Du, Wenyuan Wang, Chang Xu
Neural Networks 2024
Multi-tailed architecture for Vision Transformers that achieves a significant FLOP reduction