Intro

I am a Pre-Doc MS student in Computer Science at the University of Chicago, advised by Prof. Junchen Jiang. My research focuses on systems for large language models (LLMs) and computer networking. I am particularly interested in MLSys, which bridges the gap between rapidly developing machine learning algorithms and hardware. Previously, I was a visiting student at the University of Pennsylvania, where I worked closely with Dr. Liangcheng Yu. I earned my B.E. in Information Engineering from Shanghai Jiao Tong University.

 

I am one of the core contributors and maintainers of the following open-source projects:

  • LMCache (Stars 5.6k): The first high-performance KV cache management layer for distributed LLM inference.
  • vLLM production stack (Stars 1.9k): vLLM’s reference system for Kubernetes-native, cluster-wide deployment.

NEWS

[10/2025]  Attending PyTorch Conference 2025 in San Francisco.

[10/2025]  The LMCache technical report is now live on arXiv!

[10/2025]  AdaptCache was presented at SOSP’25. Thanks to Ganesh for the presentation!