Citation
in Lectures from Caianiello Summer School on Adaptive Processing of Sequences
Abstract
Deriving gradient algorithms for time-dependent neural network structures typically requires numerous chain rule expansions, diligent bookkeeping, and careful manipulation of terms. While principled methods using Euler-Lagrange or ordered derivative approaches exist, we present an alternative approach based on a set of simple block diagram manipulation rules. The approach provides a common framework to derive popular algorithms, including backpropagation and backpropagation-through-time, without a single chain rule expansion. We further illustrate a complementary approach using flow graph interreciprocity to show transformations between on-line and batch learning algorithms. This provides simple, intuitive relationships between such algorithms as real-time recurrent learning, dynamic backpropagation, and backpropagation-through-time. Additional examples are provided for a variety of architectures to illustrate both the generality and the simplicity of the diagrammatic approaches.