3rd seminar
The spectacular results of deep learning rely significantly on the elegance and effectiveness of the Backpropagation (BP) algorithm, which has primarily succeeded for feedforward neural architectures. When time is involved, the related solutions devised so far, which are essentially variations of the BP computational scheme, exhibit limited capabilities in capturing long-term dependencies. Interestingly, those solutions are still developed under the framework of learning from collections of sequences, whereas one could be interested in learning over time on the basis of environmental interactions, without relying on the careful preparation of a training set.
This cycle of seminars covers the classic architectures and learning algorithms of recurrent neural networks and shows that backpropagation-like algorithms can naturally be transformed into neural graph propagation schemes that are truly local in both time and space. It is shown that this significant paradigm shift in the propagation scheme over the neural architecture arises from the formulation of learning as an optimization problem over time which, under ergodic assumptions, corresponds to optimizing the functional risk indexes used in statistical learning. The corresponding learning process turns out to be driven by Hamiltonian dynamics, whose effective optimization involves most of the classic topics that have independently been regarded as crucial in the field of cognitive science.
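As a rough, generic illustration of the optimization-over-time viewpoint mentioned above (a standard optimal-control sketch, not the speakers' exact formulation; all notation here is assumed), one can think of a cost accumulated over time, its associated Hamiltonian, and the resulting Hamiltonian dynamics, with ergodicity linking the time-averaged cost to a statistical risk:

% Generic optimal-control sketch (illustrative only; x is the state, u the control,
% p the costate, ell the instantaneous loss -- notation not taken from the seminar).
\begin{align}
  &\min_{u(\cdot)} \int_0^T \ell\bigl(x(t), u(t)\bigr)\,dt
    \quad \text{subject to} \quad \dot{x}(t) = f\bigl(x(t), u(t)\bigr), \\
  &H(x, p, u) = \ell(x, u) + p^\top f(x, u)
    \quad \text{(Hamiltonian)}, \\
  &\dot{x} = \frac{\partial H}{\partial p}, \qquad
   \dot{p} = -\frac{\partial H}{\partial x}
    \quad \text{(Hamiltonian dynamics along optimal trajectories)}, \\
  &\frac{1}{T} \int_0^T \ell\,dt \;\longrightarrow\; \mathbb{E}[\ell]
    \quad \text{as } T \to \infty
    \quad \text{(under ergodic assumptions).}
\end{align}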
Finally, the generative mechanisms behind the Hamiltonian dynamics are shown, and an in-depth comparison is carried out with the current autoregressive schemes used in Large Language Models.
Join at: imt.lu/aula2