
AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning — Detailed Technical Review

Paper: AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
Authors: Qingru Zhang, Minshuo Chen, Alexander Bukharin, Nikos Karampatziakis, Pengcheng He, Yu Cheng, Weizhu Chen, Tuo Zhao
Affiliations: Georgia Institute of Technology, Princeton University, Microsoft Azure AI
Published: ICLR 2023 (arXiv: 2303.10512)
Reviewer: Zhongzhu Zhou
Review Date: February 19, 2026


I. Prerequisites: What You Need to Know

Before diving into AdaLoRA's contributions, this section establishes the foundational concepts needed to fully understand the paper. These prerequisites are designed to be accessible even if you are encountering parameter-efficient fine-tuning for the first time.

Read more »

vLLM and PagedAttention: Efficient Memory Management for Large Language Model Serving — Detailed Technical Review

Paper: Efficient Memory Management for Large Language Model Serving with PagedAttention
Authors: Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, Ion Stoica
Affiliations: UC Berkeley, Stanford University
Published: SOSP 2023 (arXiv: 2309.06180)
Reviewer: Zhongzhu Zhou
Review Date: February 19, 2026


I. Prerequisites: What You Need to Know

This section builds up all the foundational concepts needed to understand why vLLM matters and how PagedAttention works. Even if you are new to LLM serving systems, this section will give you the complete background.

Read more »

GLM-5: from Vibe Coding to Agentic Engineering — Deep Technical Review (EN)

Author: Zhongzhu Zhou
Paper: GLM-5: from Vibe Coding to Agentic Engineering (arXiv 2602.15763v1, 2026)
ArXiv: https://arxiv.org/abs/2602.15763
Project: https://github.com/zai-org/GLM-5


TL;DR

  • GLM-5 is best read as a full-stack agent-engineering system paper, not only a model-scale update.
  • The strongest practical story is the combination of DSA long-context efficiency + staged asynchronous RL + realistic agent evaluation.
  • Evidence is strong on engineering-oriented tasks, with clear progress over prior GLM generations and competitive standing vs proprietary models.
  • The main caveat remains full external reproducibility at the same scale.

Read more »

DeepSeek-V2: Multi-head Latent Attention and DeepSeekMoE — Detailed Technical Review

Paper: DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Authors: DeepSeek-AI
Affiliation: DeepSeek
Published: May 2024 (arXiv: 2405.04434)
Reviewer: Zhongzhu Zhou
Review Date: February 18, 2026


I. Prerequisites: What You Need to Know

This section covers every foundational concept needed to understand DeepSeek-V2's innovations. We will build from basic attention mechanics all the way to the KV cache bottleneck and low-rank compression.

1.1 The Transformer and Multi-Head Attention (MHA)

The Transformer architecture is the backbone of virtually all modern large language models. At its core, each Transformer block has two components: an attention module and a Feed-Forward Network (FFN).
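
To make the attention half concrete, below is a minimal NumPy sketch of standard multi-head attention; the shapes and names (d_model, n_heads, the weight matrices) are illustrative assumptions, not code from the paper. Note that the per-token K and V computed here are exactly what a decoder must keep around between steps, i.e. the KV cache whose size DeepSeek-V2's low-rank compression targets.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
        # X: (seq_len, d_model); Wq, Wk, Wv, Wo: (d_model, d_model)
        seq_len, d_model = X.shape
        d_head = d_model // n_heads
        Q, K, V = X @ Wq, X @ Wk, X @ Wv  # K, V are what the KV cache stores
        # Split into heads: (n_heads, seq_len, d_head)
        split = lambda M: M.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
        Qh, Kh, Vh = split(Q), split(K), split(V)
        scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head)  # (h, L, L)
        heads = softmax(scores) @ Vh                           # (h, L, d_head)
        concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
        return concat @ Wo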

Read more »

Direct Preference Optimization: Your Language Model Is Secretly a Reward Model — Detailed Technical Review

Paper: Direct Preference Optimization: Your Language Model Is Secretly a Reward Model
Authors: Rafael Rafailov*, Archit Sharma*, Eric Mitchell*, Stefano Ermon, Christopher D. Manning, Chelsea Finn
Affiliations: Stanford University, CZ Biohub
Published: NeurIPS 2023 (arXiv: 2305.18290)
Reviewer: Zhongzhu Zhou
Review Date: February 17, 2026


I. Prerequisites: What You Need to Know

This section builds up every concept you need to understand DPO from scratch. Even if you have never encountered reinforcement learning or language model alignment, you should be able to follow along.

Read more »

Tree of Thoughts: Deliberate Problem Solving with Large Language Models — Detailed Technical Review

Paper: Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Authors: Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
Affiliations: Princeton University, Google DeepMind
Published: NeurIPS 2023 (arXiv: 2305.10601)
Reviewer: Zhongzhu Zhou
Review Date: February 16, 2026


I. Prerequisites: What You Need to Know

Before diving into the Tree of Thoughts framework, let us establish the foundational concepts that make this paper accessible, even if you are new to LLM reasoning research.

Read more »

ReAct In-Depth Technical Review (English, v5)

Author: Zhongzhu Zhou
Paper: ReAct: Synergizing Reasoning and Acting in Language Models (ICLR 2023)

Abstract

ReAct’s contribution is not simply “longer reasoning traces.” Its key idea is to reframe LLM problem solving as an executable closed loop: Thought (reasoning) → Action (interaction) → Observation (feedback). This unifies reasoning and tool interaction in one trajectory, enabling planning before acting and correction after observing. As a result, ReAct improves robustness, interpretability, and diagnosability across both knowledge-intensive reasoning and long-horizon decision-making tasks. This review provides a detailed analysis of method mechanics, experiments, failure modes, and practical deployment guidance.
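
To make the closed loop concrete, here is a minimal Python sketch of one ReAct trajectory. The Search[...]/Finish[...] action format follows the paper's prompting style, but llm (a text-completion callable) and tools (a name-to-function dict) are hypothetical stand-ins, not the paper's code.

    def react_loop(question, llm, tools, max_steps=8):
        trajectory = f"Question: {question}\n"
        for _ in range(max_steps):
            # Thought: free-form reasoning about the current state.
            thought = llm(trajectory + "Thought:")
            trajectory += f"Thought: {thought}\n"
            # Action: e.g. "Search[Apple Remote]" or "Finish[the answer]".
            action = llm(trajectory + "Action:")
            trajectory += f"Action: {action}\n"
            name, _, rest = action.partition("[")
            name, arg = name.strip(), rest.rstrip("]")
            if name == "Finish":
                return arg, trajectory
            # Observation: environment feedback appended for the next step.
            observation = tools[name](arg)
            trajectory += f"Observation: {observation}\n"
        return None, trajectory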

Read more »

Introduction

In particular, we focus on four topics. First, we present a taxonomy of instruction set alternatives and give some qualitative assessment of the advantages and disadvantages of various approaches. Second, we present and analyze some instruction set measurements that are largely independent of a specific instruction set. Third, we address the issue of languages and compilers and their bearing on instruction set architecture. Finally, the “Putting It All Together” section shows how these ideas are reflected in the RISC-V instruction set, which is typical of RISC architectures.

Read more »

Preface

I got two certificates from the University of Alberta's RL courses, and I feel I understand more RL concepts now. Keep going! The third part, as I see it, is related to gradient descent (GD) and function approximation.

  • Introduction
  • Value-function Approximation
  • The Prediction Objective (VE)

Introduction

The control problem is the task of improving a policy. So if we only need to evaluate states (prediction), it is not a control problem.

Can we always represent the value function with a table? => No; this motivates GD with function approximation and the average-reward setting.

The novelty in this chapter is that the approximate value function is represented not as a table but as a parameterized functional form with weight vector $\mathbf{w} \in \mathbb{R}^d$.
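
Since this chapter is about prediction (policy evaluation) rather than control, a minimal sketch of semi-gradient TD(0) with a linear parameterization v(s, w) = w · x(s) makes the idea concrete. Here env, policy, and feature are hypothetical stand-ins; only the update rule is from the book.

    import numpy as np

    def semi_gradient_td0(env, policy, feature, d, alpha=0.01, gamma=0.99,
                          episodes=500):
        w = np.zeros(d)  # weight vector w in R^d
        for _ in range(episodes):
            s, done = env.reset(), False
            while not done:
                s_next, r, done = env.step(policy(s))
                v_s = w @ feature(s)
                v_next = 0.0 if done else w @ feature(s_next)
                # TD error times the gradient of w·x(s), which is just x(s)
                w += alpha * (r + gamma * v_next - v_s) * feature(s)
                s = s_next
        return w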

Read more »

Preface

Recently I have a lot of research work to do, but I just do not feel like doing it; I would rather tinker with something else. So today I decided to mess around with my router, a HiWiFi (极路由) 1S, model 5661A. It turned out to be quite interesting.

Looking at the HiWiFi 1S I bought as an undergraduate in 2016: although the company has shut down, after reading through the forums I found that, thanks to its developer firmware, it should be one of the routers with the most jailbreak methods available today. Honestly, though, 2.4 GHz is still a bit slow. When I have time I would like to replace it with an easy-to-flash dual-band (2.4 GHz / 5 GHz) Wi-Fi 6 router. But looking around the market, surprisingly few Wi-Fi 6 routers run OpenWrt. Glinet has one, but it is expensive and not yet on sale. I am itching to learn enough to write one myself, but I have been far too busy lately, so that will have to wait. Once I have some free time in Australia, I definitely want to give it a try.

For now, the plan is that my future home network will have a router under full control, so this one will serve as a relay router for bypassing the firewall. Here is a brief record of the painful flashing process. Later on, using many member services may require first setting the DNS on this intermediary proxy router.

Read more »

Preface

Ever since reading Understanding the Linux Kernel, I have been interested in certain parts of operating systems, such as the file system and system calls, and I wanted to know how each module composes such a large and complex operating system. But when implementing an operating system myself, the assembly code was too difficult, so I never had the time to read and carefully analyze the details of an operating system. This time, through Professor Chen Haibo's operating systems book, I hope to fill in the gaps I have neglected in operating systems, learn most of the assembly code along the way, and write my first mini operating system.

Overview

  1. Manage and abstract the hardware
  2. Provide services to applications and manage them (serving applications, managing applications)
Read more »

Recently, because of limited memory, my MacBook runs slowly when I open several applications, code, and do paperwork at the same time. So I bought a new MacBook Pro 2021 with the M1 chip. However, there is a lot of data to migrate to the new MacBook, and there are some differences between the M1 and Intel chips. This blog gives some hints for migrating between the two computers.

1. Erase the old computer

  1. Back up using Time Machine (a scripted variant is sketched after these steps)

  2. Sign out of iCloud

    If using macOS Catalina or later, choose Apple menu > System Preferences, then click Apple ID. Select Overview in the sidebar, then click Sign Out.

    If using an earlier version of macOS, choose Apple menu > System Preferences, click iCloud, then click Sign Out.
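
For step 1, a tiny Python wrapper around the real tmutil command-line tool can script the backup, assuming a Time Machine destination is already configured (tmutil ships with macOS):

    import subprocess

    # Start a Time Machine backup and wait until it completes (--block).
    # Requires a configured backup destination; raises on failure (check=True).
    subprocess.run(["tmutil", "startbackup", "--block"], check=True)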

Read more »