Hong Xu, Henry
Research Interests
I work on systems and networking. I lead the NetX Lab. You are welcome to visit us!
Our research is/was supported by funding from
Research Grants Council (RGC) of Hong Kong, The Chinese University of Hong Kong, City University of Hong Kong
Amazon Web Services, Azure, ByteDance, Huawei, Microsoft Research Asia, etc.
Current Projects
Recent Publications (full list)
“Efficient GPU-Centric Evolving Graph Processing at Scale”, OSDI 2026
“TSGuard: Automated User-Centric Incident Diagnosis for AI Workloads in the Cloud”, FSE 2026
“Offloading Cloud Network Services at Production Scale with SONiC DASH SmartSwitch”, NSDI 2026
“Dynamic Sparsity in Large-Scale Video DiT Training”, ASPLOS 2026
“Mycroft: Tracing Dependencies in Collective Communication Towards Reliable LLM Training”, SOSP 2025
“Towards End-to-End Optimization of LLM-based Applications with Ayo”, ASPLOS 2025
“Performance Prediction of On-NIC Network Functions with Multi-Resource Contention and Traffic Awareness”, ASPLOS 2025
“Accelerating Distributed MoE Training and Inference with Lina”, USENIX ATC 2023
“Lyra: Elastic Scheduling for Deep Learning Clusters”, ACM EuroSys 2023
“Software-defined Network Assimilation: Bridging the Last Mile Towards Centralized Network Configuration Management with NAssim”, ACM SIGCOMM 2022 (Best Paper Award)
Prospective Students and Current Openings
[04/2025] I am looking for PhD students starting in fall 2026. The early admissions exercise has begun, and the deadline for the first-round intake is May 22, 2025. See here for more details.
News
05/2026. AGZO, FOCUS, UniScale accepted to ICML’26! Congrats to Wei, Kaihua, Xin, and the team!
04/2026. ReasonCache accepted to IWQoS’26, and Matt to APNet’26! Congrats to Kaiwen and Jianqiang!
03/2026. Omega accepted to OSDI’26! Congrats to Yunmo!
03/2026. Gave a talk on reliability engineering in AI infra at the IRTF open meeting as part of IETF-125.
02/2026. Gave a keynote on AI infra at NINeS 2026 HKUST pod and Secure & Scalable Machine Learning Symposium 2026 in Thailand.
01/2026. PRISM accepted to MLSys’26! Congrats to Yuetao!
01/2026. Serving as a PC member for IEEE ICDCS and APNet, publicity chair for IWQoS, and area chair for ACL ARR 2026 January.
12/2025. TSGuard accepted to FSE’26! Congrats to Yitao, Yangtao, and all collaborators at Microsoft!
12/2025. Serving as an associate editor for ACM Transactions on Computer Systems. Please submit!
12/2025. SmartSwitch accepted to NSDI’26! Congrats to Shaofeng and all collaborators at Microsoft!
12/2025. Serving on the PC of SIGCOMM’26. Please submit!
12/2025. Gave a keynote on AI infra for ACM CoNEXT’25 Student Workshop.
11/2025. Serving on the external review committee of MLSys’26.
11/2025. FAISys’25 was successfully held at CUHK! More than 160 people attended the workshop, with 70+ from industry. Check out the photo album here.
10/2025. Registration is open for FAISys’25! We have a terrific 2-day program ahead; register early to secure your spot!
09/2025. Our work on learning for gradient descent accepted to NeurIPS’25! Congrats to Qingyu and Wei!
08/2025. We are launching FAISys, a unique workshop on AI/ML systems! The first FAISys is to be held at CUHK! Please submit, sponsor, and participate!