I am a fourth-year Ph.D. candidate in the Department of Automation at SEIEE, SJTU, advised by Prof. Xiaolin Huang. I also spent several wonderful months in the Department of Computer Science and Engineering at HKUST as a visiting student, under the guidance of Prof. James Kwok. My research primarily revolves around machine learning and optimization. Specifically, I am deeply interested in the efficiency, robustness, and generalization of learning algorithms and large models.

I have a keen interest in data structures and algorithms and have participated in several competitive programming contests. I am also a Nikon user.

🔥 News

📝 Publications

TMLR 2024

Revisiting Random Weight Perturbation for Efficiently Improving Generalization
Tao Li, Qinghua Tao, Weihao Yan, Yingwen Wu, Zehao Lei, Kun Fang, Mingzhen He, Xiaolin Huang

Code

  • This work enhances the generalization performance of random weight perturbation from the perspectives of convergence and perturbation generation, and shows that it can improve generalization more efficiently than the adversarial weight perturbation used in SAM, especially on large-scale problems.
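
Below is a minimal sketch of the general idea of a random-weight-perturbation training step in PyTorch. It is illustrative only: the noise scale `sigma` and the plain Gaussian perturbation are my assumptions, not the paper's exact generation scheme.

```python
import torch

def rwp_step(model, loss_fn, x, y, optimizer, sigma=0.01):
    """One illustrative training step with random weight perturbation:
    perturb the weights with Gaussian noise, take the gradient at the
    perturbed point, then update the clean weights."""
    noises = []
    with torch.no_grad():
        for p in model.parameters():
            noise = sigma * torch.randn_like(p)
            p.add_(noise)                 # move to a random nearby point
            noises.append(noise)

    optimizer.zero_grad()
    loss = loss_fn(model(x), y)           # loss at the perturbed weights
    loss.backward()                       # gradient at the perturbed point

    with torch.no_grad():
        for p, noise in zip(model.parameters(), noises):
            p.sub_(noise)                 # restore the clean weights

    optimizer.step()                      # apply the perturbed-point gradient
    return loss.item()
```

Unlike SAM, no second forward-backward pass at an adversarially chosen point is needed, which is where the efficiency advantage on large-scale problems comes from.
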
CVPR 2024

Friendly Sharpness-Aware Minimization
Tao Li, Pan Zhou, Zhengbao He, Xinwen Cheng, Xiaolin Huang

Code

  • This work uncovers that the full gradient component in SAM’s adversarial perturbation does not contribute to generalization and, in fact, has undesirable effects. We propose an efficient variant to mitigate these effects and enhance the generalization performance.
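
A rough sketch of the idea in PyTorch, assuming an exponential moving average of batch gradients as a stand-in estimate of the full gradient; `rho`, `lam`, and the exact decomposition are illustrative choices, not the paper's precise algorithm.

```python
import torch

def fsam_step(model, loss_fn, x, y, optimizer, ema_grads,
              rho=0.05, lam=0.9, eps=1e-12):
    """Illustrative SAM-style step that perturbs along the batch gradient
    with an estimated full-gradient component removed. Initialize once
    with: ema_grads = [torch.zeros_like(p) for p in model.parameters()]."""
    params = list(model.parameters())

    # First pass: batch gradient at the current weights.
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    grads = [p.grad.detach().clone() for p in params]

    # Update the EMA estimate of the full gradient and subtract it,
    # keeping only the stochastic component as the perturbation direction.
    dirs = []
    for g, m in zip(grads, ema_grads):
        m.mul_(lam).add_(g, alpha=1 - lam)
        dirs.append(g - m)
    norm = torch.sqrt(sum((d ** 2).sum() for d in dirs)).item() + eps

    with torch.no_grad():
        for p, d in zip(params, dirs):
            p.add_(d, alpha=rho / norm)   # ascend to the perturbed point

    # Second pass: the gradient at the perturbed weights drives the update.
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()

    with torch.no_grad():
        for p, d in zip(params, dirs):
            p.sub_(d, alpha=rho / norm)   # restore the clean weights
    optimizer.step()
    return loss.item()
```
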
ICLR 2023

Trainable Weight Averaging: Efficient Training by Optimizing Historical Solutions
Tao Li, Zhehao Huang, Qinghua Tao, Yingwen Wu, Xiaolin Huang

Code

  • This work introduces trainable weight averaging, which optimizes a weighted combination of the historical solutions produced during DNN training to achieve efficient training and better performance.
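
A minimal sketch of the idea in PyTorch: learn mixing coefficients over K saved checkpoints by backpropagating through the averaged weights. The softmax parameterization and the helper names are my assumptions for illustration.

```python
import torch
from torch.func import functional_call

def twa_loss(model, checkpoints, alpha, loss_fn, x, y):
    """Loss of the model evaluated at a learned average of historical
    checkpoints. `checkpoints` is a list of K state_dicts saved during
    training; gradients flow into the mixing coefficients `alpha`."""
    coeffs = torch.softmax(alpha, dim=0)   # K coefficients summing to 1
    mixed = {
        name: sum(c * ckpt[name] for c, ckpt in zip(coeffs, checkpoints))
        for name in checkpoints[0]
    }
    preds = functional_call(model, mixed, (x,))  # forward with averaged weights
    return loss_fn(preds, y)

# Usage sketch: only K coefficients are trained, not the full weight vector.
# alpha = torch.zeros(len(checkpoints), requires_grad=True)
# opt = torch.optim.Adam([alpha], lr=1e-2)
# twa_loss(model, checkpoints, alpha, loss_fn, x, y).backward(); opt.step()
```
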
CVPR 2022 (oral)

Subspace Adversarial Training
Tao Li, Yingwen Wu, Sizhe Chen, Kun Fang, Xiaolin Huang

Code

  • This work proposes subspace training to address the overfitting issues in single-step and multi-step adversarial training, known as catastrophic overfitting and robust overfitting, respectively. We achieve efficient and stable single-step adversarial training with robustness comparable to that of multi-step methods.
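
The sketch below illustrates the flavor of the approach, assuming a precomputed orthonormal basis `P` (shape d x k) for the weight subspace, e.g. obtained from historical checkpoints as sketched under the TPAMI 2022 entry: craft a single-step FGSM example, then restrict the weight update to the subspace. The step sizes and the plain-SGD update are illustrative.

```python
import torch
from torch.nn.utils import parameters_to_vector, vector_to_parameters

def subspace_fgsm_step(model, loss_fn, x, y, P, eps=8 / 255, lr=0.1):
    """Illustrative single-step adversarial training step whose weight
    update is projected onto the low-dimensional subspace spanned by
    the orthonormal columns of P."""
    # Craft a single-step (FGSM) adversarial example.
    x_adv = x.clone().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    x_adv = (x + eps * x_adv.grad.sign()).clamp(0, 1).detach()

    # Gradient of the adversarial loss w.r.t. the flattened weights.
    model.zero_grad()
    loss = loss_fn(model(x_adv), y)
    loss.backward()

    with torch.no_grad():
        g = parameters_to_vector([p.grad for p in model.parameters()])
        g_sub = P @ (P.T @ g)             # keep only the subspace component
        w = parameters_to_vector(model.parameters())
        vector_to_parameters(w - lr * g_sub, model.parameters())
    return loss.item()
```
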
TPAMI 2022

Low Dimensional Trajectory Hypothesis is True: DNNs can be Trained in Tiny Subspaces
Tao Li, Lei Tan, Zhehao Huang, Qinghua Tao, Yipeng Liu, Xiaolin Huang

Code

  • This work explores the low-dimensional characteristics of DNN training trajectories and proposes a dimension reduction method for training DNNs within a lower-dimensional subspace. This approach has the potential to reduce training costs and enhance model robustness.
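
As a companion to the subspace step sketched above, here is a minimal sketch of how such a tiny subspace could be extracted from sampled trajectory checkpoints: PCA (via SVD) over the flattened weight vectors. The sampling scheme and the choice of `k` are illustrative assumptions.

```python
import torch

def extract_subspace(checkpoint_vectors, k):
    """Top-k principal directions of T sampled trajectory checkpoints.
    `checkpoint_vectors` is a list of T flattened weight vectors, each of
    length d, e.g. from torch.nn.utils.parameters_to_vector."""
    W = torch.stack(checkpoint_vectors)            # (T, d)
    mean = W.mean(dim=0)
    _, _, Vh = torch.linalg.svd(W - mean, full_matrices=False)
    P = Vh[:k].T                                   # (d, k), orthonormal columns
    return P, mean

# Training then optimizes only k coordinates: each gradient is projected
# through P, as in the subspace step above, instead of updating all d weights.
```
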

🎖 Honors and Awards

  • 2020.07 Shanghai Outstanding Graduate
  • 2013.12 First Prize in CSP-S (full marks)

📖 Education

  • 2020.09 - present, Ph.D., Control Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
  • 2016.09 - 2020.06, B.Eng., Automation, Shanghai Jiao Tong University, Shanghai, China

💬 Invited Talks

  • 2023.03, Trainable Weight Averaging, Huawei 2012 Lab internal talk.

💻 Internships

  • 2021.09 - 2021.12, Tencent WeChat Group, China.