Publications:

Open-source projects that I'm maintaining:

LMCache

The first open-source Knowledge Delivery Network (KDN), which speeds up LLM applications by up to 8x at 8x lower cost.

vLLM production stack

Scale from a single vLLM instance to a distributed vLLM deployment without changing any application code.

LMBenchmark

Systematic and comprehensive benchmarks for LLM systems.

Past Projects:

Network MAC Layer Implementation for LoRa Development Board

Designed and implemented a robust MAC layer for the LoRa communication protocol, optimized for long range, low power consumption, and interference resistance. Built a complete software stack for a software-defined radio platform, integrating MCU control with features such as timeout-based retransmission and duty-cycle sleep management.