A Chinese reading note on Zero Sum SVD: it stacks every layer's singular values into one global priority queue and uses signed loss sensitivities plus a greedy "zero-sum conservation" rule to decide the whole model's rank budget in a single pass, letting heterogeneous per-layer ranks fall out of one scalar constraint.
Read more »

A detailed technical review of Zero Sum SVD, which replaces per-layer rank optimization with a global, signed loss-sensitivity heap and a greedy zero-sum rule, letting heterogeneous per-layer ranks fall out of one scalar conservation law.
Read more »
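To make the mechanism concrete, here is a minimal Python sketch of the global-heap allocation as the summary describes it: every layer's singular directions compete in one priority queue ordered by signed loss sensitivity, and a single scalar budget (the zero-sum conservation) decides how many directions each layer keeps. The function name, the scoring inputs, and the stopping rule are illustrative, not the paper's API.

```python
import heapq

def allocate_ranks(sensitivities, total_budget):
    """Greedy global rank allocation under one scalar budget (a sketch).

    sensitivities: dict layer_name -> list of signed loss-sensitivity
    scores, one per singular value (hypothetical inputs; the paper's
    exact scoring rule is not reproduced here).
    total_budget: total number of singular directions the whole model
    may keep -- the "zero-sum" conservation constraint.
    """
    heap = []
    for layer, scores in sensitivities.items():
        for i, s in enumerate(scores):
            heapq.heappush(heap, (-s, layer, i))  # max-heap via negation

    ranks = {layer: 0 for layer in sensitivities}
    for _ in range(total_budget):
        if not heap:
            break
        neg_s, layer, _ = heapq.heappop(heap)
        if -neg_s <= 0:
            break  # signed score: keeping this direction no longer helps
        ranks[layer] += 1
    return ranks

# Heterogeneous per-layer ranks emerge from the single budget:
print(allocate_ranks({"q_proj": [3.0, 1.0, 0.2], "ffn": [2.5, 2.0]}, 3))
```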

A Chinese reading note on DisagMoE: it places attention and FFN on separate GPU pools and stitches the two sides together with the AF-Pipe schedule and the M2N communication primitive, hiding the MoE all-to-all bottleneck under computation during training.
Read more »

A detailed technical review of DisagMoE, which disaggregates attention and FFN layers onto separate GPU pools and stitches them together via the AF-Pipe schedule to hide the MoE all-to-all bottleneck during training.
Read more »
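A toy sketch of the disaggregation idea, not DisagMoE's implementation: two worker pools joined by queues standing in for the M2N primitive, with micro-batches streamed through so the hand-off of batch k overlaps the attention compute of batch k+1, AF-Pipe style. All names and the compute placeholders are illustrative.

```python
from queue import Queue
from threading import Thread

def attention_worker(inbox: Queue, to_ffn: Queue):
    # Attention pool: consumes micro-batches, forwards hidden states.
    while True:
        batch = inbox.get()
        if batch is None:
            to_ffn.put(None)
            return
        to_ffn.put(f"attn({batch})")   # placeholder for real attention compute

def ffn_worker(from_attn: Queue, outbox: Queue):
    # FFN/expert pool: consumes hidden states while attention works ahead.
    while True:
        hidden = from_attn.get()
        if hidden is None:
            return
        outbox.put(f"ffn({hidden})")   # placeholder for real expert compute

inbox, link, outbox = Queue(), Queue(), Queue()
Thread(target=attention_worker, args=(inbox, link)).start()
Thread(target=ffn_worker, args=(link, outbox)).start()
for k in range(4):
    inbox.put(f"mb{k}")                # stream micro-batches through the pipe
inbox.put(None)
for _ in range(4):
    print(outbox.get())
```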

A Chinese reading note on DAPO: it combines Clip-Higher, dynamic sampling, token-level loss, and overlong reward shaping into a reproducible recipe for large-scale LLM reinforcement learning.
Read more »

A detailed technical review of DAPO, an open-source large-scale reinforcement learning recipe for reasoning LLMs using Clip-Higher, dynamic sampling, token-level loss, and overlong reward shaping.
Read more »
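The two most code-shaped ingredients, Clip-Higher and the token-level loss, fit in a few lines. A hedged PyTorch sketch, assuming per-token log-probs and advantages are already available; the asymmetric bounds follow the decoupled-clipping idea (the paper reports eps_low=0.2, eps_high=0.28), and the function name and argument layout are mine.

```python
import torch

def dapo_policy_loss(logp_new, logp_old, adv, mask,
                     eps_low=0.2, eps_high=0.28):
    """Token-level Clip-Higher loss: a minimal sketch of two DAPO pieces.

    logp_new / logp_old: per-token log-probs under current / behavior policy.
    adv: per-token advantage (broadcast from sequence-level reward).
    mask: 1 for real tokens, 0 for padding.
    """
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * adv
    # eps_high > eps_low decouples the upper clip range, so low-probability
    # tokens can still gain mass (Clip-Higher).
    clipped = torch.clamp(ratio, 1 - eps_low, 1 + eps_high) * adv
    per_token = -torch.minimum(unclipped, clipped)
    # Token-level loss: average over all tokens in the batch,
    # not per sample, so long responses are not down-weighted.
    return (per_token * mask).sum() / mask.sum()
```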

A Chinese reading note on Tutti: starting from a GPU-native KV cache object store, GPU io_uring, and slack-aware scheduling, it makes SSD-backed KV cache better suited to long-context LLM serving.
Read more »
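Of the three ingredients, slack-aware scheduling is the easiest to sketch. The toy below (names illustrative, not Tutti's API) orders pending SSD-to-GPU KV-cache fetches by slack, i.e. deadline minus estimated transfer time, so requests that are about to stall decoding are issued first.

```python
import heapq

def schedule_fetches(requests, now):
    """requests: list of (request_id, deadline, est_fetch_time) tuples.

    Returns fetch issue order, least slack first; a request with small
    slack will stall the GPU soon, so it jumps the queue.
    """
    heap = [(deadline - est - now, rid) for rid, deadline, est in requests]
    heapq.heapify(heap)            # min-heap on slack
    order = []
    while heap:
        slack, rid = heapq.heappop(heap)
        order.append((rid, slack))
    return order

# "c" has the least slack (7.0 - 4.0), so its SSD read is issued first:
print(schedule_fetches([("a", 10.0, 2.0), ("b", 5.0, 1.0), ("c", 7.0, 4.0)],
                       now=0.0))
```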

A detailed technical review of Swift-SVD, an activation-aware low-rank compression method for LLM weights and KV cache that uses output covariance eigendecomposition to avoid expensive generalized SVD.
Read more »
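The core trick is small enough to sketch in NumPy, under the assumption that "output covariance eigendecomposition" means eigendecomposing W·XXᵀ·Wᵀ on calibration activations and projecting onto its top-r eigenvectors, instead of running a generalized SVD on the pair (W, X). The factor convention and names here are mine, not the paper's.

```python
import numpy as np

def low_rank_via_output_cov(W, X, r):
    """Activation-aware rank-r factorization via the output covariance (a sketch).

    W: (d_out, d_in) weight; X: (d_in, n) calibration activations; r: target rank.
    Projecting outputs onto the top-r eigenvectors of W @ X @ X.T @ W.T
    minimizes ||W X - A B X||_F among rank-r orthogonal output projections.
    """
    S = W @ (X @ X.T) @ W.T            # output covariance, (d_out, d_out)
    evals, evecs = np.linalg.eigh(S)   # eigenvalues in ascending order
    U = evecs[:, -r:]                  # top-r principal output directions
    A, B = U, U.T @ W                  # W ≈ A @ B, rank r
    return A, B

rng = np.random.default_rng(0)
W, X = rng.standard_normal((64, 128)), rng.standard_normal((128, 256))
A, B = low_rank_via_output_cov(W, X, r=16)
err = np.linalg.norm((W - A @ B) @ X) / np.linalg.norm(W @ X)
print(f"relative activation error: {err:.3f}")
```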