The "log derivative trick" (see image below) is an incredibly cool and useful thing. In many cases it lets you rewrite the gradient of an expectation as an expectation itself, which is a big win: you can then estimate the gradient directly from samples, because the sample average is an unbiased estimate and the law of large numbers tells us it converges to the true value as the number of samples grows. One place it comes up is in likelihood ratio policy gradients for reinforcement learning.
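To make that concrete, here is a minimal sketch of the trick on a toy problem. The setup is an assumption chosen for illustration, not something from the notes: x is drawn from a Gaussian with mean theta and unit variance, and f(x) = x^2. The identity used is grad_theta E[f(x)] = E[f(x) * grad_theta log p(x; theta)], and for this Gaussian grad_theta log p(x; theta) = x - theta. Since E[x^2] = theta^2 + 1, the exact gradient is 2 * theta, which gives us something to check against. The function name `score_function_gradient` is hypothetical.

```python
import random


def score_function_gradient(theta, f, num_samples=200_000, seed=0):
    """Monte Carlo estimate of d/dtheta E_{x ~ N(theta, 1)}[f(x)].

    Uses the log derivative trick: the gradient of the expectation equals
    E[f(x) * d/dtheta log p(x; theta)], and for a unit-variance Gaussian
    d/dtheta log p(x; theta) = x - theta.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(theta, 1.0)   # sample x ~ N(theta, 1)
        score = x - theta           # gradient of the log density w.r.t. theta
        total += f(x) * score       # one sample of the gradient estimator
    return total / num_samples      # sample average


# Toy check (assumed example): f(x) = x^2, so E[f(x)] = theta^2 + 1
# and the exact gradient is 2 * theta.
theta = 1.5
estimate = score_function_gradient(theta, lambda x: x * x)
exact = 2.0 * theta
print(f"estimate = {estimate:.3f}, exact = {exact:.3f}")
```

Note that f is never differentiated here; only the log density is. That is why the same idea works in policy gradients, where the reward may be a black box. The estimator can be quite noisy, so in practice a baseline is usually subtracted from f(x) to reduce variance.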
See https://davidmeyer.github.io/ml/policy_gradient_methods_for_robotics.pdf for some notes on all of this.
#policygradients #logderivativetrick #machinelearning