04-11 Language Agent Tree Search (LATS): Unifying Reasoning, Acting, and Planning in Language Models — Deep Technical Review
04-10 SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression — Deep Technical Review
04-09 DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving — Deep Technical Review
04-08 SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models — In-Depth Technical Review