This summer, I'll be joining Cloudflare's Workers AI team to optimize AI model serving. I'm also working on Tokenspeed, a speed-of-light LLM inference engine, with mentorship from Mingxing Zhang and the Tsinghua MADSys group.

I spend my free time contributing to SGLang while learning about MLSys, distributed systems, and inference optimization.

"Once a word leaves your mouth, even four horses cannot chase it back." - 邓析
