Date of Award
Electrical and Computer Engineering
Neural networks have been responsible for many recent advances in machine learning, powering technologies such as digital assistants and AR photography. LPLANN (Low-Precision Linear Algebra for Neural Networks) is a cross-platform C++ library for implementing neural networks. The software allows users to set specific levels of precision for calculations. Low-precision calculations use parallelization techniques such as SIMD (single instruction, multiple data) and SWAR (SIMD within a register) to run neural networks faster than full-precision calculations. The library is lightweight enough to run on embedded systems, relies only on OpenMP as a dependency, and is portable to any operating system. LPLANN also includes optimizations that provide drastic speedups on a workstation, allowing it to serve as a testbed for novel low-precision neural network architectures. The purpose of this project was to implement optimizations and to test how the execution time of binary networks compares to that of floating-point networks. Two-dimensional convolution on binary weights is implemented with 43% overhead for 3 x 3 filters and 1.5% overhead for 7 x 7 filters. Depending on the architecture of the neural network, the speedup with 3 x 3 filters varied from 2.5x to 8x.
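To illustrate the kind of SWAR arithmetic that binary networks rely on, the following is a minimal sketch of a dot product between two vectors of +1/-1 values packed one bit per value into 64-bit words. It is not taken from LPLANN: the function names `pack_signs` and `binary_dot`, the bit encoding (1 for +1, 0 for -1), and the use of `std::bitset` for population counting are all assumptions made for this example.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of LPLANN): pack a vector of +1/-1 values
// into 64-bit words, one bit per value (bit 1 encodes +1, bit 0 encodes -1).
std::vector<std::uint64_t> pack_signs(const std::vector<int>& values) {
    std::vector<std::uint64_t> words((values.size() + 63) / 64, 0);
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] > 0) {
            words[i / 64] |= std::uint64_t{1} << (i % 64);
        }
    }
    return words;
}

// Dot product of two packed +1/-1 vectors of logical length n.
// Each matching bit contributes +1 and each differing bit contributes -1,
// so dot = n - 2 * popcount(a XOR b). Padding bits are 0 in both inputs,
// XOR to 0, and therefore do not affect the count.
int binary_dot(const std::vector<std::uint64_t>& a,
               const std::vector<std::uint64_t>& b,
               std::size_t n) {
    assert(a.size() == b.size());
    std::size_t differing = 0;
    for (std::size_t w = 0; w < a.size(); ++w) {
        // One XOR and one population count process 64 values at once.
        differing += std::bitset<64>(a[w] ^ b[w]).count();
    }
    return static_cast<int>(n) - 2 * static_cast<int>(differing);
}
```

For example, `binary_dot(pack_signs({1, -1, 1, 1}), pack_signs({1, 1, -1, 1}), 4)` evaluates to 0, matching the element-wise sum 1 - 1 - 1 + 1. Each 64-bit word replaces 64 floating-point multiply-accumulate operations with a single XOR and population count, which is the general source of speedup in binary networks.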
Mitchell, Frost Bennion, "Low-Precision Linear Algebra for Neural Networks" (2018). Undergraduate Honors Capstone Projects. 307.
Copyright for this work is retained by the student. If you have any questions regarding the inclusion of this work in the Digital Commons, please email us at .