
1. What This Paper Does

Imagine you ask a very smart assistant to solve a math word problem. If you just say "answer this," the assistant might blurt out a number without thinking it through—and often get it wrong. But if you first show the assistant a few examples of how to think step by step, suddenly it can solve much harder problems. That is the core insight of this paper.

Wei et al. introduce chain-of-thought (CoT) prompting, a remarkably simple technique: instead of giving a language model plain input-output examples in a few-shot prompt, you include intermediate reasoning steps—a "chain of thought"—in each example. The model then learns to produce its own chain of thought before arriving at an answer. No fine-tuning, no new training data, no architectural changes—just a different way of writing your prompt.
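The prompt format is easy to reproduce. Here is a minimal sketch in Python — the first exemplar is the paper's well-known tennis-ball example, but the helper function and its name are our own illustration, not anything from the paper:

```python
def build_cot_prompt(exemplars, question):
    """Assemble a few-shot chain-of-thought prompt: each exemplar pairs a
    question with worked reasoning that ends in 'The answer is ...'."""
    parts = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    parts.append(f"Q: {question}\nA:")  # the model continues from here
    return "\n\n".join(parts)

EXEMPLARS = [(
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?",
    "Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 "
    "tennis balls. 5 + 6 = 11. The answer is 11.",
)]

prompt = build_cot_prompt(
    EXEMPLARS,
    "The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?",
)
```

Because the exemplar's answer walks through its reasoning, the model tends to continue in the same style for the new question before committing to a final number.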

Read more »

1. What This Paper Does

Ring Attention solves one of the most stubborn problems in modern deep learning: the memory wall that prevents Transformers from processing long sequences. Even with memory-efficient attention (FlashAttention) and blockwise computation, the output activations of each Transformer layer must be stored and have size proportional to the sequence length. For 100 million tokens with hidden size 1024, that alone exceeds 1,000 GB — far beyond any single GPU or TPU.

The key insight is elegant: if you compute self-attention in a blockwise fashion (block-by-block), the order in which you process key-value blocks does not matter, as long as you combine the statistics correctly. This permutation invariance means you can place devices in a ring, have each device hold one query block, and rotate key-value blocks around the ring. While a device computes attention against the current key-value block, it simultaneously sends that block to the next device and receives a new block from the previous device. If the computation takes longer than the communication, the communication is completely hidden — zero overhead.
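The order-invariant combination relies on the same online-softmax statistics used by FlashAttention. The following single-process sketch (our own simplification — real Ring Attention overlaps each loop iteration with device-to-device sends and receives) shows why the result does not depend on the order of key-value blocks:

```python
import numpy as np

def ring_attention_sim(q, k_blocks, v_blocks):
    """Simulate the ring schedule for one query block: attend to KV blocks
    one at a time, maintaining a running row-max m, softmax normalizer l,
    and weighted output o, so any block order yields the same result."""
    d = q.shape[-1]
    m = np.full(q.shape[0], -np.inf)      # running row-max of logits
    l = np.zeros(q.shape[0])              # running softmax normalizer
    o = np.zeros_like(q)                  # running weighted output
    for k, v in zip(k_blocks, v_blocks):  # stand-in for KV rotating the ring
        s = q @ k.T / np.sqrt(d)          # attention logits for this block
        m_new = np.maximum(m, s.max(axis=-1))
        p = np.exp(s - m_new[:, None])
        scale = np.exp(m - m_new)         # rescale old stats to the new max
        l = l * scale + p.sum(axis=-1)
        o = o * scale[:, None] + p @ v
        m = m_new
    return o / l[:, None]
```

On real hardware, the loop body is where computation and communication overlap: while the matmuls for the current block run, the block itself is already in flight to the next device.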

Read more »

1. What This Paper Does

Mamba introduces a selective state space model (selective SSM) that, for the first time, achieves Transformer-quality performance on language modeling while scaling linearly in sequence length — not quadratically like standard attention. The key insight is deceptively simple: make the parameters of a state space model depend on the input, so the model can choose what to remember and what to forget at each timestep. This seemingly small change has profound consequences: it breaks the mathematical equivalence between SSMs and convolutions that prior work relied on for efficiency, forcing the authors to invent a new hardware-aware parallel algorithm. The result is Mamba, a clean architecture with no attention and no MLP blocks, that runs at 5× the inference throughput of Transformers and matches or exceeds their quality across language, audio, and genomics.
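The selectivity mechanism can be illustrated with a toy recurrence. This is our own simplified single-channel sketch, not Mamba's exact parameterization (which operates over many channels with a fused, hardware-aware scan), but it shows the key move: B, C, and the step size all depend on the current input.

```python
import numpy as np

def selective_scan(x, A, W_b, W_c, W_dt):
    """Sequential form of a toy selective SSM for one channel.
    x: (T,) input; A: (N,) diagonal state matrix (negative entries);
    W_b, W_c: (N,) weights; W_dt: scalar. Because B_t, C_t, dt_t are
    functions of x[t], the state can gate what it stores or forgets --
    this input dependence is what breaks the convolutional shortcut."""
    N, T = A.shape[0], x.shape[0]
    h = np.zeros(N)
    y = np.zeros(T)
    for t in range(T):
        dt = np.log1p(np.exp(W_dt * x[t]))  # softplus: positive step size
        B_t = W_b * x[t]                    # input-dependent input projection
        C_t = W_c * x[t]                    # input-dependent output projection
        A_bar = np.exp(dt * A)              # zero-order-hold discretization
        h = A_bar * h + dt * B_t * x[t]     # selective state update
        y[t] = C_t @ h
    return y
```

With fixed (non-input-dependent) B, C, and dt, this recurrence collapses to a linear time-invariant system computable as a convolution; making them input-dependent forfeits that shortcut, which is exactly why Mamba needs its custom parallel scan.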

Read more »

Executive Summary

This paper addresses one of the most critical bottlenecks in modern large language model training: optimizer state memory consumption. While most practitioners focus on reducing parameter count through methods like LoRA, GaLore takes a different approach by attacking the actual memory-dominant term—the first and second moment estimates maintained by optimizers like AdamW.

The key innovation is elegant: instead of forcing weights into low-rank spaces (which constrains model expressivity), GaLore exploits the observation that gradient matrices naturally exhibit low-rank structure during training. By decomposing gradients, performing optimizer updates in the compressed rank-r space, and projecting updates back to full rank, the method achieves the memory savings of low-rank methods while still allowing full-parameter learning.
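One GaLore-style update can be sketched as follows. This is a simplified rendering under our own naming — the function names and the bias-correction details are ours, and we omit scheduling details like the per-layer rank choice:

```python
import numpy as np

def refresh_projector(G, r):
    """Periodically recompute the rank-r projector from an SVD of the
    current gradient (GaLore refreshes this every few hundred steps)."""
    U, _, _ = np.linalg.svd(G, full_matrices=False)
    return U[:, :r]

def galore_step(W, G, P, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam-style update with GaLore's projection. G: full gradient
    (d_out, d_in); P: (d_out, r) projector. The moments m, v live in the
    small (r, d_in) space, so optimizer memory shrinks from
    2*d_out*d_in to 2*r*d_in floats."""
    R = P.T @ G                       # project gradient into rank-r space
    m = b1 * m + (1 - b1) * R         # first moment, low-rank
    v = b2 * v + (1 - b2) * R**2      # second moment, low-rank
    m_hat = m / (1 - b1**t)           # standard Adam bias correction
    v_hat = v / (1 - b2**t)
    update = P @ (m_hat / (np.sqrt(v_hat) + eps))  # back to full rank
    return W - lr * update, m, v
```

Note that the weights W themselves remain full-rank throughout; only the optimizer's bookkeeping is compressed.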

Read more »

Abstract

Training extremely large deep learning models — with billions or even hundreds of billions of parameters — on distributed GPU clusters is one of the most challenging engineering problems in modern machine learning. Today, achieving high throughput requires manually combining multiple forms of parallelism (data, tensor/operator, pipeline) in ways that are specific to both the model architecture and the cluster topology. This manual process demands deep systems expertise and does not generalize across different models or hardware setups. Alpa is a compiler system that automates this entire process. It introduces a new hierarchical view of parallelism — distinguishing between intra-operator and inter-operator parallelism — and uses a combination of Integer Linear Programming (ILP) and Dynamic Programming (DP) to automatically generate near-optimal execution plans. Evaluated on GPT-3, GShard MoE, and Wide-ResNet at up to 64 GPUs, Alpa matches or outperforms hand-tuned systems like Megatron-LM and DeepSpeed, and generalizes to models that have no manually-designed parallelization strategies at all.

Read more »

Table of Contents

  1. Introduction: Why Model Quantization Matters
  2. Prerequisites: Background Knowledge You Need
  3. The GPTQ Method: Three Key Insights
  4. The Full Algorithm
  5. Experimental Results and Analysis
  6. Practical Speedups and Deployment
  7. Extreme Quantization and Grouping
  8. Limitations and Discussion
  9. Conclusion and Impact

1. Introduction: Why Model Quantization Matters

Consider the practical challenge of deploying a state-of-the-art large language model. GPT-3, with its 175 billion parameters, requires 326 GB of memory when stored in the compact FP16 (16-bit floating point) format. This exceeds the capacity of even the most powerful single GPU available (NVIDIA A100 with 80 GB), meaning you need at least 5 GPUs just for inference — not training, just running the model to generate text.
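The arithmetic behind these numbers is worth making explicit — a few lines of Python (our own illustration) reproduce the 326 GB figure and the GPU count:

```python
def model_memory_gib(n_params, bits_per_param):
    """Weight memory in GiB for a model stored at a given precision."""
    return n_params * bits_per_param / 8 / 1024**3

gpt3_fp16 = model_memory_gib(175e9, 16)  # ~326 GiB, matching the text
gpt3_int4 = model_memory_gib(175e9, 4)   # ~81 GiB after 4-bit quantization
gpus_needed = -(-gpt3_fp16 // 80)        # ceil-divide by 80 GiB per A100
```

The same arithmetic shows why quantization to 4 bits is so attractive: the 4-bit model fits comfortably on two 80 GB GPUs, and (with grouping overhead aside) approaches single-GPU territory.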

Read more »

Abstract

Reinforcement learning (RL) has become a cornerstone of modern AI — from training robots to walk, to aligning large language models with human preferences via RLHF. At the heart of many of these breakthroughs lies a deceptively simple algorithm called Proximal Policy Optimization (PPO), introduced by John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov at OpenAI in 2017.

PPO proposes a new family of policy gradient methods that alternate between sampling data through interaction with the environment and optimizing a "surrogate" objective function using stochastic gradient ascent. Unlike standard policy gradient methods that perform one gradient update per data sample, PPO introduces a novel clipped objective function that enables multiple epochs of minibatch updates without catastrophically large policy changes. The result is an algorithm that inherits the stability of Trust Region Policy Optimization (TRPO) while being dramatically simpler to implement — requiring only a few lines of code change to a vanilla policy gradient implementation.
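The clipped objective really is only a few lines. A minimal sketch (our own rendering of the paper's L^CLIP, written to be maximized):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO's clipped surrogate objective (maximized in training).
    ratio = pi_theta(a|s) / pi_theta_old(a|s). Taking the min with the
    clipped term removes any incentive to push the ratio outside
    [1 - eps, 1 + eps], which is what keeps repeated minibatch epochs
    from blowing up the policy."""
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantage
    return np.minimum(unclipped, clipped).mean()
```

With a positive advantage, the objective stops rewarding ratios above 1 + eps; with a negative advantage, it stops rewarding ratios below 1 - eps — the pessimistic min handles both cases with one expression.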

Read more »

Abstract

Reinforcement learning (RL) has become a cornerstone of modern AI, from training robots to walk to aligning large language models with human preferences via RLHF. At the core of these breakthroughs is a seemingly simple yet far-reaching algorithm: Proximal Policy Optimization (PPO), introduced by John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov at OpenAI in 2017.

PPO proposes a new family of policy gradient methods that alternate between sampling data through interaction with the environment and optimizing a "surrogate" objective with stochastic gradient ascent. Unlike standard policy gradient methods, which perform only one gradient update per data sample, PPO introduces a novel clipped objective that allows multiple epochs of minibatch updates on the same batch of data without catastrophically large policy changes. The result: PPO inherits the stability of Trust Region Policy Optimization (TRPO) while being dramatically simpler, requiring only a few lines of code change over vanilla policy gradients.

Experiments show that PPO performs strongly across a wide range of benchmarks: continuous control tasks (MuJoCo), complex 3D humanoid locomotion (Roboschool), and Atari games. PPO strikes an excellent balance between sample efficiency, simplicity, and wall-clock time.

Why does this paper still matter today? PPO is arguably the most influential RL algorithm of the deep learning era. It became the default algorithm for robot training and game AI, and, most critically, it is the core optimization engine of the RLHF pipeline: human-preference alignment for large language models such as ChatGPT, Claude, and Gemini depends on it. Understanding PPO is essential background for anyone working on AI alignment, LLM training, or modern RL systems.

Read more »

MiRA: A Subgoal-driven Framework for Improving Long-Horizon LLM Agents — Detailed Technical Review

Paper: A Subgoal-driven Framework for Improving Long-Horizon LLM Agents
Authors: Taiyi Wang, Sian Gooding, Florian Hartmann, Oriana Riva, Edward Grefenstette
Affiliation: Google DeepMind
Published: March 20, 2026 (arXiv: 2603.19685)
Reviewer: Zhongzhu Zhou
Review Date: March 23, 2026


I. Prerequisites: What You Need to Know

Before diving into the paper's contributions, let us establish the foundational concepts necessary for understanding the MiRA framework. This section is designed to make the paper accessible even if you are new to the intersection of LLM agents and reinforcement learning.

Read more »

Author: Zhongzhu Zhou
Paper: Attention Is All You Need (NeurIPS 2017)
ArXiv: https://arxiv.org/abs/1706.03762


Abstract

In June 2017, a team of eight researchers at Google Brain and Google Research published what would become arguably the most influential paper in modern artificial intelligence. "Attention Is All You Need" introduced the Transformer, a neural network architecture that completely discards the recurrent and convolutional building blocks that had dominated sequence modeling for decades, relying instead entirely on attention mechanisms. The result was a model that was not only more powerful but dramatically more parallelizable, training to state-of-the-art quality on machine translation in just 3.5 days on 8 GPUs — a fraction of the cost of competing approaches.
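The mechanism at the heart of the architecture fits in a few lines. A minimal NumPy sketch of scaled dot-product attention, the paper's softmax(QK^T / sqrt(d_k)) V (single head, no masking or batching):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention for one head.
    Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # similarity of queries to keys
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                        # convex combination of values
```

Every query attends to every key in a single matrix multiplication — that all-pairs structure is what makes the computation trivially parallelizable, and also what makes its cost quadratic in sequence length.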

Read more »

Author: Zhongzhu Zhou
Paper: BitNet: Scaling 1-bit Transformers for Large Language Models (arXiv 2023)
ArXiv: https://arxiv.org/abs/2310.11453


Abstract

Large language models have grown so large that their deployment costs — in terms of memory, compute, and energy — now rival or exceed the cost of training itself. BitNet, introduced by researchers at Microsoft Research, the University of Chinese Academy of Sciences, and Tsinghua University, proposes a radical solution: train Transformer-based language models with 1-bit weights from scratch. By replacing the standard nn.Linear layer with a custom BitLinear layer that binarizes weights to +1 or −1 and quantizes activations to 8-bit integers, BitNet achieves competitive performance to full-precision (FP16) Transformers while dramatically reducing memory footprint and energy consumption. More provocatively, the authors demonstrate that BitNet follows a scaling law similar to full-precision models, suggesting that 1-bit architectures could scale to hundreds of billions of parameters without sacrificing the predictable performance improvements that make scaling worthwhile.
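The forward pass of the quantized layer can be sketched as follows. This is our own simplified rendering — it omits the LayerNorm BitNet folds into BitLinear and the straight-through estimator used for training — but it shows the two quantizers and the rescaling:

```python
import numpy as np

def bitlinear_forward(x, W, eps=1e-5):
    """Simplified BitLinear-style forward pass.
    Weights are binarized to +1/-1 around their mean and rescaled by
    beta (mean absolute weight); activations are quantized to 8-bit
    integers via absmax scaling, then the product is dequantized."""
    W_b = np.sign(W - W.mean())              # 1-bit weights: {-1, +1}
    beta = np.abs(W).mean()                  # scalar scale for the weights
    gamma = np.abs(x).max() + eps            # absmax scale for activations
    x_q = np.clip(np.round(x / gamma * 127), -127, 127)  # int8 range
    return (x_q @ W_b.T) * (beta * gamma / 127)  # integer matmul, dequantize
```

The payoff is that the matmul itself involves only {-1, +1} weights and 8-bit activations, which is what drives the memory and energy savings the paper reports.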

Read more »

ZeRO: Memory Optimizations Toward Training Trillion Parameter Models — In-Depth Technical Review (English)

Author: Zhongzhu Zhou
Paper: ZeRO: Memory Optimizations Toward Training Trillion Parameter Models (SC 2020 / arXiv 2019)
ArXiv: https://arxiv.org/abs/1910.02054
Authors: Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He (Microsoft)


Abstract

ZeRO (Zero Redundancy Optimizer) is one of the most important systems papers in the large-model training era. It fundamentally rethinks how memory is consumed during distributed deep learning training and proposes a family of optimizations that eliminate redundant memory storage across data-parallel processes — without sacrificing computational efficiency. The result is staggering: ZeRO enables training of models with over 100 billion parameters on 400 GPUs with super-linear speedup, achieving 15 PetaFlops of throughput. This represents an 8× increase in trainable model size and 10× improvement in throughput over the state-of-the-art at the time. Perhaps most importantly, ZeRO democratizes large model training: it allows data scientists to train models with up to 13 billion parameters using nothing more than standard data parallelism — no model parallelism, no pipeline parallelism, no model refactoring required. ZeRO is the backbone of Microsoft's DeepSpeed library and powered Turing-NLG, which at the time was the world's largest language model (17B parameters).
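The memory accounting behind ZeRO's three stages is simple enough to compute directly. A sketch using the paper's mixed-precision Adam accounting (2 bytes/param for fp16 weights, 2 for fp16 gradients, and K = 12 for fp32 master weights, momentum, and variance); the function and its name are ours:

```python
def zero_memory_gb(n_params, n_gpus, stage):
    """Per-GPU model-state memory (GB) under each ZeRO stage."""
    p, g, opt = 2 * n_params, 2 * n_params, 12 * n_params
    if stage == 0:    # plain data parallelism: everything replicated
        total = p + g + opt
    elif stage == 1:  # ZeRO-1: partition optimizer states across GPUs
        total = p + g + opt / n_gpus
    elif stage == 2:  # ZeRO-2: also partition gradients
        total = p + (g + opt) / n_gpus
    else:             # ZeRO-3: also partition the parameters themselves
        total = (p + g + opt) / n_gpus
    return total / 1e9

# The paper's running example: a 7.5B-parameter model on 64 GPUs goes
# from 120 GB per GPU (replicated) to under 2 GB per GPU with ZeRO-3.
```

The 16 bytes/param baseline is also why a 13B model fits on a single 32 GB GPU only once the optimizer states are partitioned away — the states, not the weights, dominate.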

Read more »