David Meyer · @dmm

Policy gradient methods in reinforcement learning are very cool and are used in a wide variety of machine learning applications, including robotics, game playing, autonomous vehicles, and incremental training of Large Language Models (LLMs), among many others.

A few of my notes on policy gradients are here: davidmeyer.github.io/ml/policy. The LaTeX source is here: overleaf.com/read/kbgxbmhmrksb.

As always, questions/comments/corrections/* are greatly appreciated.

#texlatex #largelanguagemodels #policygradients #reinforcementlearning #machinelearning

Last updated 1 year ago