CS 6890: Deep Learning

Spring 2020

This course will introduce the following topics:

Logistic and Softmax Regression, Feed-Forward Neural Networks, Backpropagation, Vectorization, PCA and Whitening, Deep Networks, Convolution and Pooling, Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units, Neural Attention Models, Sequence-to-Sequence Models, Distributional Representations, Variational Auto-Encoders, Generative Adversarial Networks, Deep Reinforcement Learning.

Prerequisites: previous exposure to basic concepts in machine learning, such as supervised vs. unsupervised learning, classification vs. regression, linear regression, logistic and softmax regression, cost functions, overfitting and regularization, and gradient-based optimization; experience with programming; and familiarity with basic concepts in linear algebra and statistics.

- Syllabus & Introduction
- Handwritten notes, Jan 14.

- Linear Regression, Logistic Regression, and Vectorization
- Gradient Descent algorithms
- An overview of gradient descent optimization algorithms, Sebastian Ruder, CoRR 2016
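As a taste of the optimization material, here is a minimal sketch of batch gradient descent for least-squares linear regression in NumPy (the toy data and learning rate are invented for illustration):

```python
import numpy as np

# Toy data: y = 2x + 1 plus a little noise (made-up for illustration)
rng = np.random.default_rng(0)
X = np.c_[np.ones(50), rng.uniform(-1, 1, 50)]   # bias column + one feature
y = X @ np.array([1.0, 2.0]) + 0.01 * rng.standard_normal(50)

theta = np.zeros(2)
lr = 0.5
for _ in range(500):
    grad = X.T @ (X @ theta - y) / len(y)   # gradient of (1/2) * mean squared error
    theta -= lr * grad

print(theta)   # close to [1, 2]
```

Note the vectorized update: the whole dataset is processed in one matrix product per step, with no Python loop over examples.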

- Linear algebra and optimization in NumPy and PyTorch
- Handwritten notes, Jan 28.
- Tutorials on NumPy and SciPy.
- Broadcasting explained.
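The broadcasting rules linked above can be seen in a small example (shapes chosen purely for illustration): an array of shape (2,) stretches across the rows of a (3, 2) array, and a (3, 1) array stretches across its columns.

```python
import numpy as np

A = np.arange(6).reshape(3, 2)       # shape (3, 2)
col_means = A.mean(axis=0)           # shape (2,)
centered = A - col_means             # (2,) broadcasts against (3, 2) row-wise

row_scales = np.array([[1.0], [2.0], [3.0]])   # shape (3, 1)
scaled = centered * row_scales       # (3, 1) broadcasts column-wise

print(centered.shape, scaled.shape)  # (3, 2) (3, 2)
```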

- NumPy/SciPy examples and NumPy session on Jan 23.
- PyTorch examples and linear regression in Jupyter Notebook.
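A condensed version of the linear-regression notebook idea, using PyTorch autograd instead of hand-derived gradients (toy data and hyperparameters are invented):

```python
import torch

torch.manual_seed(0)
x = torch.linspace(-1, 1, 50).unsqueeze(1)
y = 2 * x + 1 + 0.01 * torch.randn_like(x)   # toy targets: y = 2x + 1 + noise

w = torch.zeros(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
opt = torch.optim.SGD([w, b], lr=0.1)

for _ in range(300):
    loss = ((x * w + b - y) ** 2).mean()   # mean squared error
    opt.zero_grad()
    loss.backward()                        # autograd computes dloss/dw, dloss/db
    opt.step()

print(w.item(), b.item())   # close to 2 and 1
```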

- Feed-Forward Neural Networks and Backpropagation
- Andrej Karpathy: Yes you should understand backprop.
- Unsupervised Feature Learning with Autoencoders
- Introduction to Automatic Differentiation, invited lecture by Dr. David Juedes.
- PCA, PCA whitening, and ZCA whitening
- Convolutional Neural Networks
- Andrej Karpathy's notes on CS231n: Convolutional Neural Networks for Visual Recognition.
- UFLDL Tutorial at Stanford.
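To make the backpropagation material concrete, here is a sketch of the forward and backward passes for a one-hidden-layer network with a squared-error loss, checked against a finite-difference gradient (layer sizes and data are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(3)            # input
t = np.array([1.0])                   # target
W1 = rng.standard_normal((4, 3)) * 0.5
W2 = rng.standard_normal((1, 4)) * 0.5

# Forward pass
z1 = W1 @ x
h = np.tanh(z1)
y = W2 @ h
loss = 0.5 * np.sum((y - t) ** 2)

# Backward pass: chain rule, layer by layer
dy = y - t                            # dL/dy
dW2 = np.outer(dy, h)
dh = W2.T @ dy
dz1 = dh * (1 - h ** 2)               # tanh'(z1) = 1 - tanh(z1)^2
dW1 = np.outer(dz1, x)

# Finite-difference check on one entry of W1
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
num = (0.5 * np.sum((W2 @ np.tanh(W1p @ x) - t) ** 2) - loss) / eps
print(num, dW1[0, 0])                 # should agree to several decimals
```

Gradient checking of this kind is a standard sanity test before trusting a hand-written backward pass.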

- Word Embeddings
- Natural Language Processing (Almost) from Scratch, Collobert, Weston, Bottou, Karlen, Kavukcuoglu, and Kuksa, JMLR 2011.
- Distributed Representations of Words and Phrases and their Compositionality, Mikolov, Sutskever, Chen, Corrado, and Dean, NIPS 2013.
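Once embeddings are loaded (e.g. the word2vec vectors used later in the course), word similarity reduces to cosine similarity between vectors. A sketch with made-up 4-dimensional toy vectors, just to show the computation:

```python
import numpy as np

# Toy "embeddings" (invented values, not real word2vec vectors)
emb = {
    "king":  np.array([0.8, 0.6, 0.1, 0.0]),
    "queen": np.array([0.7, 0.7, 0.1, 0.1]),
    "apple": np.array([0.0, 0.1, 0.9, 0.8]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(emb["king"], emb["queen"]))   # high: similar directions
print(cosine(emb["king"], emb["apple"]))   # low: dissimilar directions
```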

- Recurrent Neural Networks for NLP
- RNNs with Attention for Machine Translation
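The attention step in the machine-translation lecture boils down to a softmax over alignment scores followed by a weighted sum. A minimal dot-product attention sketch (dimensions invented; real models score with learned parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
encoder_states = rng.standard_normal((5, 8))   # 5 source positions, dim 8
decoder_state = rng.standard_normal(8)         # current decoder hidden state

scores = encoder_states @ decoder_state        # one score per source position
weights = np.exp(scores - scores.max())
weights /= weights.sum()                       # softmax attention weights
context = weights @ encoder_states             # weighted sum of encoder states

print(weights.sum(), context.shape)            # weights sum to 1; context is (8,)
```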

- Assignment and code.
- Assignment and code.
- Assignment, code and data.
- Assignment, code and data.
- Assignment, code, word2vec Google News embeddings, and the Stanford Natural Language Inference (SNLI) dataset.
- Reasoning about entailment with neural attention, Rocktäschel et al., ICLR 2016.

- Petersen and Pedersen's The Matrix Cookbook
- James H. Martin's Introduction to probabilities
- Jason Eisner's equestrian Introduction to probabilities
- Gilbert Strang's Introduction to Linear Algebra
- Strang's Video Lectures on Linear Algebra
- Inderjit Dhillon's Linear Algebra Background
- Convex Optimization, Stephen Boyd and Lieven Vandenberghe, Cambridge University Press 2004