Multi-tailed vision transformer for efficient inference

Published in Neural Networks, 2024

Key Contributions

  • Designed multiple tails to generate visual sequences of different lengths for the Transformer encoder
  • Employed a tail predictor to determine which tail produces the most accurate prediction for each image
  • Achieved significant reduction in FLOPs with no accuracy degradation
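The idea behind the first two contributions can be sketched in a few lines: several "tails" tokenize the same image at different patch sizes (longer token sequences for smaller patches, hence higher encoder cost), and a per-image predictor selects one tail. The code below is a minimal illustrative sketch, not the paper's implementation; the random linear scorer stands in for the learned tail predictor, and `tail`, `embed_dim`, and the pooled feature are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def tail(image, patch_size, embed_dim=8):
    # Hypothetical tail: split a square image into non-overlapping
    # patches and linearly project each patch to a token. The patch
    # size controls sequence length, and thus encoder FLOPs.
    h, w, c = image.shape
    n = (h // patch_size) * (w // patch_size)
    patches = (
        image[: h - h % patch_size, : w - w % patch_size]
        .reshape(h // patch_size, patch_size, w // patch_size, patch_size, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(n, patch_size * patch_size * c)
    )
    proj = rng.standard_normal((patches.shape[1], embed_dim)) * 0.02
    return patches @ proj  # (n, embed_dim) token sequence

image = rng.standard_normal((32, 32, 3))

# Multiple tails -> visual sequences of different lengths.
seqs = {p: tail(image, p) for p in (4, 8, 16)}  # 64, 16, 4 tokens

# Stand-in tail predictor: a tiny scorer over a pooled image
# feature picks one tail per image (a learned module in the paper).
feat = image.mean(axis=(0, 1))          # (3,) pooled feature
W = rng.standard_normal((3, len(seqs)))
choice = list(seqs)[int(np.argmax(feat @ W))]
tokens = seqs[choice]                   # sequence fed to the encoder
```

Since self-attention cost grows quadratically with sequence length, routing easy images to a short-sequence tail is where the FLOPs savings come from.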

Status: Neural Networks (2024), Vol. 174: 106235

Recommended citation: Yunke Wang, Bo Du, Wenyuan Wang, Chang Xu. "Multi-tailed vision transformer for efficient inference." Neural Networks, 2024, 174: 106235.