Bio
I am a final-year PhD student in the School of Computer Science at the University of Sydney (USYD), under the supervision of Prof. Dacheng Tao. Prior to this, I received an MPhil in Computer Science from USYD in 2022, also advised by Prof. Dacheng Tao, and a BEng from the School of Automation Science and Electrical Engineering (SASEE), Beihang University, in 2019.
I am currently a research intern at ByteDance BandAI, where my research focuses on LLM post-training, agentic reinforcement learning, and data-centric AI.
Contact
leishiye@gmail.com
shiye.lei@sydney.edu.au
Publications [Google Scholar]
* indicates co-first authors
LLM Post-training
A Step Back: Prefix Importance Ratio Stabilizes Policy Optimization [paper]
Shiye Lei, Zhihao Cheng, and Dacheng Tao
arXiv preprint, 2026
EAPO: Enhancing Policy Optimization with On-Demand Expert Assistance [paper]
Siyao Song, Cong Ma, Zhihao Cheng, Shiye Lei, Minghao Li, Ying Zeng, Huaixiao Tou, and Kai Jia
arXiv preprint, 2025
Revisiting LLM Reasoning via Information Bottleneck [paper]
Shiye Lei, Zhihao Cheng, Kai Jia, and Dacheng Tao
arXiv preprint, 2025
Data-centric AI
Offline Behavioral Data Selection [paper][code]
Shiye Lei, Zhihao Cheng, and Dacheng Tao
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2026
EarthSynth: Generating Informative Earth Observation with Diffusion Models [paper]
Jiancheng Pan*, Shiye Lei*, Yuqian Fu, Jiahao Li, Yanxing Liu, Yuze Sun, Xiao He, Long Peng, Xiaomeng Huang, and Bo Zhao
arXiv preprint, 2025
State Diversity Matters in Offline Behavior Distillation [paper]
Shiye Lei, Zhihao Cheng, and Dacheng Tao
arXiv preprint, 2025
Image Captions are Natural Prompts for Training Data Synthesis [paper][code]
Shiye Lei*, Hao Chen*, Sen Zhang, Bo Zhao, and Dacheng Tao
International Journal of Computer Vision (IJCV), 2025
Offline Behavior Distillation [paper][code][poster]
Shiye Lei, Sen Zhang, and Dacheng Tao
Advances in Neural Information Processing Systems (NeurIPS), 2024
A Comprehensive Survey of Dataset Distillation [paper]
Shiye Lei and Dacheng Tao
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Trustworthy Deep Learning
Attentive Learning Facilitates Generalization of Neural Networks [paper][code]
Shiye Lei, Fengxiang He, Haowen Chen, and Dacheng Tao
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024
Understanding Deep Learning via Decision Boundary [paper]
Shiye Lei, Fengxiang He, Yancheng Yuan, and Dacheng Tao
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Spectral Complexity-scaled Generalisation Bound of Complex-Valued Neural Networks [paper][code]
Haowen Chen, Fengxiang He, Shiye Lei, and Dacheng Tao
Artificial Intelligence (AIJ), 2023
Spatial-temporal-fusion BNN: Variational Bayesian feature layer [paper]
Shiye Lei, Zhuozhuo Tu, Leszek Rutkowski, Feng Zhou, Li Shen, Fengxiang He, and Dacheng Tao
arXiv preprint, 2021
Neural Networks Behave as Hash Encoders: An Empirical Study [paper]
Fengxiang He*, Shiye Lei*, Jianmin Ji, and Dacheng Tao
arXiv preprint, 2021
Teaching Assistant
- COMP5318: Machine Learning and Data Mining, 2024 S2 @ USYD
Reviewer
Conference reviewer: ICML, NeurIPS, ICLR, AISTATS, CVPR, ICCV, ECCV, AAAI, ACM MM, etc.
Journal reviewer: JMLR, Springer Machine Learning, Neurocomputing, etc.
