Code: EE5516   Category: PME   Credits: 2-0-2-3

Course Description: This course introduces the design and implementation of signal processing and machine learning algorithms on resource-constrained systems. The first part of the course covers formal techniques and high-level transformations for mapping signal processing algorithms onto hardware. The least mean squares (LMS) algorithm is then used as a vehicle to connect signal processing and machine learning, including algorithmic properties (training, convergence) and analytical estimation of bit-precision requirements. The final part covers hardware architectures for machine learning. The course includes hands-on system-building experiments using MATLAB/Python, Verilog, and finally FPGA implementations. MATLAB/Python lab sessions are used to understand the algorithmic reformulations and to analyze LMS learning rates and ranges for fixed-point implementations. The primary focus of the lab sessions is Verilog simulation; FPGA implementations are limited to algorithm-to-architecture mapping case studies such as GCD, CORDIC, FFT, and matrix inversion. The detailed course content is given below:

Course Contents

Fundamentals (Lecture: 4 hours, Lab: 0 hours)

Motivation and scope of the course; terminology, applications, and platforms; use of block diagrams, signal flow graphs, data-flow graphs, and dependence graphs for representing signal processing algorithms
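As a small taste of these representations, the sketch below (an illustrative assumption, not part of the syllabus) encodes the first-order IIR filter y[n] = a*y[n-1] + x[n] as a data-flow graph whose edges carry delay counts, the form used by the transformations in the next unit.

```python
# Illustrative sketch: the first-order IIR filter y[n] = a*y[n-1] + x[n]
# as a data-flow graph. Each edge is (source, destination, delays), where
# delays counts the z^-1 registers on that edge; node names are hypothetical.
dfg_edges = [
    ("mul", "add", 0),  # a*y[n-1] feeds the adder combinationally
    ("add", "mul", 1),  # y[n] feeds back through one delay element
]

# Per-node computation times (assumed units), used later for timing analysis.
node_time = {"add": 1, "mul": 2}

for src, dst, d in dfg_edges:
    kind = f"{d} delay(s)" if d else "combinational"
    print(f"{src} -> {dst}: {kind}")
```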

High-Level Transformations (Lecture: 9 hours, Lab: 8 hours)

Iteration bound and algorithms to compute it; critical path; pipelining and retiming of data-flow graphs; unfolding and folding transformations of data-flow graphs; systolic array architectures; distributed arithmetic; implementation of the transformed DFGs
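To make the iteration bound concrete: for a DFG it is T_inf = max over all loops of (loop computation time / number of loop delays). The Python sketch below is a minimal brute-force version for tiny graphs, with assumed node timings; it is not one of the formal algorithms covered in the course (e.g., the longest-path-matrix or minimum-cycle-mean methods).

```python
# Minimal iteration-bound sketch. The DFG is an adjacency list of
# (destination, delays) pairs; node computation times are assumed values.
node_time = {"add": 1, "mul": 2}
graph = {"add": [("mul", 1)],  # y[n] reaches the multiplier through 1 delay
         "mul": [("add", 0)]}  # product feeds the adder combinationally

def simple_loops(graph):
    """Enumerate each simple cycle once via DFS (fine for tiny DFGs)."""
    found = []
    def dfs(start, node, path, delays):
        for nxt, d in graph[node]:
            if nxt == start:
                found.append((list(path), delays + d))
            elif nxt not in path and nxt > start:  # count each cycle once
                dfs(start, nxt, path + [nxt], delays + d)
    for s in graph:
        dfs(s, s, [s], 0)
    return found

# T_inf = max over loops of (sum of node times in loop) / (delays in loop)
t_inf = max(sum(node_time[n] for n in nodes) / d
            for nodes, d in simple_loops(graph))
print(f"Iteration bound T_inf = {t_inf}")  # (1 + 2) / 1 = 3.0 time units
```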

Algorithm-to-Architecture Mapping Case Studies (Lecture: 5 hours, Lab: 10 hours)

Performance-complexity trade-offs in mapping an algorithm onto an architecture; case studies of GCD (greatest common divisor), CORDIC (coordinate rotation digital computer), FFT (fast Fourier transform), and matrix inversion algorithms and their fixed-point implementations
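As a flavor of the fixed-point case studies, here is a hedged Python sketch of CORDIC in rotation mode: shifts, adds, and a small arctangent table compute cos and sin. The 16-bit fractional format and 16 iterations are illustrative choices, not course specifications.

```python
# Hedged sketch: fixed-point CORDIC in rotation mode, computing
# (cos(theta), sin(theta)) with 16 fractional bits.
import math

FRAC = 16                      # fractional bits of the fixed-point format
ONE = 1 << FRAC
N_ITERS = 16

# Precompute the arctan(2^-i) table and the reciprocal CORDIC gain 1/G.
ATAN = [round(math.atan(2**-i) * ONE) for i in range(N_ITERS)]
K = 1.0
for i in range(N_ITERS):
    K *= math.cos(math.atan(2**-i))  # 1/G ~ 0.60725
INV_GAIN = round(K * ONE)            # scales the initial vector

def cordic_cos_sin(theta):
    """Rotate (1/G, 0) by theta (radians, |theta| <= pi/2) using shifts/adds."""
    x, y = INV_GAIN, 0
    z = round(theta * ONE)
    for i in range(N_ITERS):
        d = 1 if z >= 0 else -1          # rotate toward z = 0
        x, y = x - d * (y >> i), y + d * (x >> i)
        z -= d * ATAN[i]
    return x / ONE, y / ONE

c, s = cordic_cos_sin(math.pi / 6)
print(f"cos={c:.5f} (exact {math.cos(math.pi/6):.5f}), "
      f"sin={s:.5f} (exact {math.sin(math.pi/6):.5f})")
```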

LMS Learning Algorithm (Lecture: 4 hours, Lab: 5 hours)

The LMS algorithm described in the language of machine learning; connection to the stochastic gradient descent (SGD) algorithm; applications (linear prediction, estimation, system identification) to time-series data; finite-precision effects; case studies of CMOS prototypes of LMS and its variants
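To connect the pieces, a minimal NumPy sketch of LMS-based system identification follows; the plant taps, filter length, and step size mu are illustrative assumptions. The update w <- w + mu*e*u is exactly an SGD step on the instantaneous squared error.

```python
# Hedged sketch: LMS system identification. Plant coefficients, filter
# length L, and step size mu are assumed values for illustration only.
import numpy as np

rng = np.random.default_rng(0)
h_true = np.array([0.5, -0.3, 0.1])   # unknown plant (assumed)
N, L, mu = 5000, 3, 0.01              # samples, filter taps, step size

x = rng.standard_normal(N)            # white training input
d = np.convolve(x, h_true)[:N]        # desired signal from the plant
d += 0.001 * rng.standard_normal(N)   # small measurement noise

w = np.zeros(L)
for n in range(L, N):
    u = x[n:n-L:-1]                   # regressor [x[n], ..., x[n-L+1]]
    e = d[n] - w @ u                  # a priori error
    w += mu * e * u                   # stochastic-gradient (LMS) update

print("estimated taps:", np.round(w, 3))  # should approach h_true
```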

Hardware Architectures for Machine Learning (Lecture: 6 hours, Lab: 5 hours)

Architectural approaches for implementing DNNs: reducing the precision of operations and operands (floating point to fixed point, reduced bit widths, non-uniform quantization, and weight sharing); reducing the number of operations and the model size (compression, pruning, and compact network architectures); advanced topics in ML hardware design

Total: 28 lecture hours, 28 lab hours
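As one concrete instance of the reduced-precision techniques above, the sketch below quantizes a weight tensor to signed 8-bit integers with a per-tensor max-abs scale; this scheme is a common illustrative choice, not a prescribed course method.

```python
# Hedged sketch: symmetric uniform quantization of DNN weights to int8.
# The per-tensor max-abs scale is an illustrative assumption.
import numpy as np

def quantize_int8(w):
    """Map float weights to int8 plus a per-tensor scale factor."""
    scale = np.abs(w).max() / 127.0         # full-range symmetric scale
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).standard_normal((4, 4)).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).max()
print(f"scale={s:.4f}, max abs quantization error={err:.4f}")
```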

Learning Outcomes:

At the end of the course, the students should be able to

  • Understand design methodologies for realizing dedicated VLSI architectures for signal processing and machine learning applications.
  • Understand the area-power-speed trade-offs across different applications.
  • Be familiar with the intrinsic error tolerance of machine learning algorithms and with approximate computing.
  • Implement a given complex algorithm on an FPGA according to given specifications.

Teaching Methodology: Classroom lectures and MATLAB/Python/Verilog programming in lab sessions

Assessment Methods: Written examinations and continuous lab assessment

Text Books

1. K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation, John Wiley, 1999. ISBN 978-8-126-51098-6.

2. U. Meyer-Baese, Digital Signal Processing with Field Programmable Gate Arrays, 4th ed., Springer, 2014. ISBN 978-3-540-72613-5.

Selected articles

1. V. Sze, "Designing Hardware for Machine Learning," in IEEE Solid-State Circuits Magazine, vol. 9, no. 4, pp. 46-54, Fall 2017.

2. N. R. Shanbhag, N. Verma, Y. Kim, A. D. Patil and L. R. Varshney, "Shannon-Inspired Statistical Computing for the Nanoscale Era," in Proceedings of the IEEE, vol. 107, no. 1, pp. 90-107, Jan. 2019.

3. V. Sze, Y. Chen, T. Yang and J. S. Emer, "Efficient Processing of Deep Neural Networks: A Tutorial and Survey," in Proceedings of the IEEE, vol. 105, no. 12, pp. 2295-2329, Dec. 2017.