A detailed technical review of DisagMoE, which disaggregates attention and FFN layers onto separate GPU pools and stitches them together via the AF-Pipe schedule to hide the MoE all-to-all bottleneck during training.
A detailed technical review of DAPO, an open-source large-scale reinforcement learning recipe for reasoning LLMs using Clip-Higher, dynamic sampling, token-level loss, and overlong reward shaping.
A detailed technical review of MASPO, a joint prompt optimization method for multi-agent LLM systems that balances local, downstream, and global rewards.
A detailed technical review of Swift-SVD, an activation-aware low-rank compression method for LLM weights and KV cache that uses output covariance eigendecomposition to avoid expensive generalized SVD.
A detailed technical review of Piper, a resource-model-driven system for large-scale MoE training with pipelined hybrid parallelism, HALO hierarchical all-to-all, and topology-aware expert placement.
A detailed technical review of NExt, a method that models low-rank optimization trajectories to accelerate reinforcement learning with verifiable rewards for large language models.