
Youjie Li

DistMLSyser | PyTorch Post-Doc | UIUC Ph.D.

Distributed ML, ML Systems, ML Hardware


I am a (tiny) Research Scientist at ByteDance, building something big from 0 to 1 (project veScale) with Li-Wen Chang and Haibin Lin.

Previously, I was a post-doc on the PyTorch-Distributed team at Meta, working with Shen Li in 2023. Before that, I was a research intern at Microsoft Research, working closely with Amar Phanishayee in 2020-2022 and Anirudh Badam in 2019.

I received my Ph.D. from UIUC in 2022.

My research centers on Distributed Machine Learning Systems, spanning the areas of Efficient Machine Learning, System Optimization, and Hardware Acceleration. I am one of the pioneers of pipelined data parallelism (NeurIPS'18) and of in-network acceleration for distributed training with SmartNICs (MICRO'18) and programmable switches (ISCA'19). Recently, I have been working on massive model training systems (VLDB'22) and massive graph training systems (MLSys'22).

Few students have studied distributed machine learning systems at Illinois. I created my research direction from scratch and carried out my research projects autonomously during my Ph.D.




(VLDB'22) Harmony: Overcoming the Hurdles of GPU Memory Capacity to Train Massive DNN Models on Commodity Servers

Youjie Li, Amar Phanishayee, Derek Murray, Jakub Tarnawski, Nam Sung Kim
The 48th International Conference on Very Large Databases (VLDB), 2022.

[pdf] [slides] [video] [code] [bibtex]

(MLSys'22) BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling

Cheng Wan*, Youjie Li*, Ang Li, Nam Sung Kim, Yingyan Lin

(*Equal contribution)

The Fifth Conference on Machine Learning and Systems (MLSys), 2022. Acceptance rate of 20%.

[pdf] [slides] [video] [code] [bibtex]

(ICLR'22) PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication

Cheng Wan, Youjie Li, Cameron R. Wolfe, Anastasios Kyrillidis, Nam Sung Kim, Yingyan Lin
The Tenth International Conference on Learning Representations (ICLR), 2022.

[pdf] [slides] [video] [code] [bibtex]

(MobiCom'21) Visage: Enabling Timely Analytics for Drone Imagery

Sagar Jha*, Youjie Li*, Shadi Noghabi, Vaishnavi Ranganathan, Peeyush Kumar, Andrew Nelson, Michael Toelle, Sudipta Sinha, Ranveer Chandra, Anirudh Badam

(*Co-first authors, alphabetical order)

The 27th International Conference on Mobile Computing and Networking (MobiCom), 2021. Acceptance rate of 19%.

[pdf] [code deployed in Microsoft's FarmBeats project] [bibtex]

(HotOS'21) Doing More with Less: Training Large DNN Models on Commodity Servers for the Masses

Youjie Li, Amar Phanishayee, Derek Murray, Nam Sung Kim

The 18th Workshop on Hot Topics in Operating Systems (HotOS XVIII), 2021. Acceptance rate of 26%.

[pdf] [slides] [video] [bibtex]

(MICRO'19) DeepStore: In-Storage Acceleration for Intelligent Queries

Vikram S. Mailthody, Zaid Qureshi, Weixin Liang, Ziyan Feng, Simon Garcia de Gonzalo, Youjie Li, Hubertus Franke, Jinjun Xiong, Jian Huang, Wen-mei Hwu

The 52nd International Symposium on Microarchitecture (MICRO), 2019. Acceptance rate of 23%.

[pdf] [slides] [bibtex]

(ISCA'19) iSwitch: Accelerating Distributed Reinforcement Learning with In-Switch Computing

Youjie Li, Iou-Jen Liu, Yifan Yuan, Deming Chen, Alexander Schwing, Jian Huang

The 46th International Symposium on Computer Architecture (ISCA), 2019. Acceptance rate of 17%.

[pdf] [slides] [bibtex]

(NeurIPS'18) Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training

Youjie Li, Mingchao Yu, Songze Li, Salman Avestimehr, Nam Sung Kim, Alexander Schwing

Advances in Neural Information Processing Systems (NeurIPS), 2018. Acceptance rate of 20%.

[pdf] [poster] [bibtex]

(NeurIPS'18) GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training

Mingchao Yu, Zhifeng Lin, Krishna Narra, Songze Li, Youjie Li, Nam Sung Kim, Alexander Schwing, Murali Annavaram, Salman Avestimehr

Advances in Neural Information Processing Systems (NeurIPS), 2018. Acceptance rate of 20%.

[pdf] [poster] [bibtex]

(MICRO'18) INCEPTIONN: A Network-Centric Hardware/Algorithm Co-Design to Accelerate Distributed Training of Deep Neural Networks

Youjie Li, Jongse Park, Mohammad Alian, Yifan Yuan, Zheng Qu, Peitian Pan, Ren Wang, Alexander Schwing, Hadi Esmaeilzadeh, Nam Sung Kim

The 51st International Symposium on Microarchitecture (MICRO), 2018. Acceptance rate of 21%.

[pdf] [slides] [bibtex]

(Neurocomputing'17) Energy Efficient Parallel Neuromorphic Architectures with Approximate Arithmetic on FPGA

Qian Wang, Youjie Li, Botang Shao, Siddhartha Dey, Peng Li

Neurocomputing, 2017.

[pdf] [bibtex]

(ISCAS'16) Liquid State Machine based Pattern Recognition on FPGA with Firing-Activity Dependent Power Gating and Approximate Computing

Qian Wang, Youjie Li, Peng Li

The IEEE International Symposium on Circuits and Systems (ISCAS), 2016. 

Best Paper Award

[pdf] [bibtex]


Research Intern

  • Microsoft Research, Redmond, Summer 2020

    • Amazing mentor: Dr. Amar Phanishayee

    • Project: Efficient Parallel Training of Massive DNN Models

  • Microsoft Research, Redmond, Summer 2019​​​​

    • Amazing mentors: Dr. Anirudh Badam, Dr. Ranveer Chandra, Dr. Amar Phanishayee

    • Project: Efficient Inference/Training of Mobile DNNs

  • IBM Research, Yorktown, Summer 2018

    • Project: System Acceleration for Large DNN Training

  • IBM Research, Austin, Summer 2017

    • Project: Efficient Communication for Distributed DNN Training

"Tenured" Teaching Assistant

  • ECE385 Digital Systems Laboratory, Fall 2018 & Spring 2022 & Fall 2022

  • ECE411 Computer Architecture, Spring 2021 & Fall 2021

  • ECE438 Computer Networks, Fall 2019

  • ECE446 Machine Learning, Spring 2018

Peer Reviewer​

  • ICML, 2024

  • ICLR, 2024

  • NeurIPS, 2023

  • TNET, 2023

  • TNNLS, 2022

  • ICCV, 2021

  • NeurIPS, 2020

  • SIGCOMM, 2020

Awards & Interests


  • Dan Vivoli Endowed Fellowship, ECE, UIUC, 2019-2020

  • Best Paper Award, The IEEE International Symposium on Circuits and Systems (ISCAS), 2016

  • ECE Department Scholarship, Texas A&M University, 2014-2015



  • My name is hard to pronounce correctly; it sounds roughly like "Yo-Jay Lee".

  • My last name is actually "黎" : )

  • My dream life is doing research only 996, while playing badminton every week. Yo~


youjieli [at]

​San Jose, CA

  • LinkedIn
  • Facebook