Automatic Differentiation

Tobias Madsen
(BiRC, MOMA)
Thiele Seminar
Thursday, 2 February, 2017, at 13:15-14:00, in Koll. D (1531-211)
Abstract:
Automatic Differentiation (AD) is a method to evaluate derivatives of a function, $f:\mathbb{R}^M\rightarrow\mathbb{R}^N$, defined by a computer program. By viewing the computation as a composition of elementary functions and applying the chain rule, the derivatives are exact up to machine precision and the method has the same computational time complexity as the original program.
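The "composition of elementary functions plus chain rule" idea can be illustrated with a minimal forward-mode sketch using dual numbers. This is a toy example for illustration, not code from the talk or from any particular library; the `Dual` class and `sin` helper are hypothetical names.

```python
import math

class Dual:
    """Number a + b*eps with eps^2 = 0; the eps coefficient carries the derivative."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def sin(x):
    # Chain rule for an elementary function: d sin(u) = cos(u) du
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)

# Differentiate f(x) = x*sin(x) + 3x at x = 2 by seeding dx/dx = 1:
x = Dual(2.0, 1.0)
y = x * sin(x) + 3 * x
# y.der now holds f'(2) = sin(2) + 2*cos(2) + 3, exact to machine precision.
```

Because each elementary operation propagates the derivative alongside the value, one pass through the program yields the derivative at the same cost order as the original evaluation.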

Although the fundamental ideas behind AD were discovered in the 1960s, there has been a recent spike in interest, partly owing to efficient implementations embedded in statistical computing libraries: the probabilistic programming tool Stan uses AD to do automated Bayesian inference using either Hamiltonian Monte Carlo or variational inference, and TensorFlow and Theano, software libraries primarily used in machine learning to train neural networks, also both rely on AD.

In this talk I will introduce the two modes of AD: forward and reverse. I will discuss the differences between AD and both numerical and symbolic differentiation, demonstrate stand-alone tools for AD, and finally discuss recent statistical software libraries employing AD.
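Reverse mode, the second of the two modes mentioned above, records the computation on a tape and then sweeps backwards, which is what makes gradients of $f:\mathbb{R}^M\rightarrow\mathbb{R}$ cheap when $M$ is large. The following is a toy sketch with hypothetical names, not the API of Stan, TensorFlow, or Theano:

```python
class Var:
    """A node in the computation graph, taping local partial derivatives."""
    def __init__(self, val):
        self.val = val
        self.grad = 0.0
        self._children = []  # (local partial, child node) pairs

    def __add__(self, other):
        out = Var(self.val + other.val)
        out._children = [(1.0, self), (1.0, other)]
        return out

    def __mul__(self, other):
        out = Var(self.val * other.val)
        out._children = [(other.val, self), (self.val, other)]
        return out

    def backward(self):
        # Topologically order the graph, then accumulate d(output)/d(node)
        # in reverse, applying the chain rule once per recorded operation.
        topo, seen = [], set()
        def build(v):
            if id(v) not in seen:
                seen.add(id(v))
                for _, child in v._children:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for node in reversed(topo):
            for w, child in node._children:
                child.grad += w * node.grad

x, y = Var(3.0), Var(4.0)
z = x * y + x          # z = xy + x, so dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()           # x.grad == 5.0, y.grad == 3.0
```

A single backward sweep delivers all partial derivatives of the output at once, in contrast to forward mode, which needs one pass per input direction.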


Organised by: The T.N. Thiele Centre
Contact person: Lars Nørvang Andersen