Software Techniques for the Measurement, Management and Reduction of Numerical Error in Programs
Dr John L. Gustafson, Research Professor, School of Computing
Join Zoom Meeting
https://nus-sg.zoom.us/j/92850207555?pwd=OXhGTytCelVDMDUwa29icGFkUTZFZz09
Abstract:
Floating-point arithmetic is ubiquitous in software. Scientists, engineers, programmers, and anyone else intending to solve real-world problems with computers must grapple with the floating-point number systems available on machines in order to encode the real-valued quantities in their applications into a format that can be computed with. However, owing to finite memory and the design of these formats, floating-point numbers more often than not represent only an approximation of the actual value. The resulting numerical error can play the role of both friend and foe, providing opportunities for energy savings through further approximation but also presenting formidable challenges to the correctness of programs. In this thesis, we develop automatic software techniques that are efficient and effective in dealing with numerical error in three areas: measurement, management, and reduction, so as to achieve desired program objectives. A unique feature of these techniques is that we employ novel representations of floating-point numbers for this purpose.
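As a minimal illustration of this approximation error (the example is ours, not drawn from the thesis), consider decimal fractions such as 0.1, which have no exact binary floating-point representation:

```python
# Minimal sketch: 0.1 and 0.2 are rounded when stored as IEEE 754
# doubles, so even this trivial sum carries numerical error.
from decimal import Decimal

x = 0.1 + 0.2
print(x == 0.3)       # False: both operands and the sum were rounded
print(Decimal(0.1))   # the exact binary value actually stored for 0.1
print(abs(x - 0.3))   # ~5.55e-17, the accumulated rounding error
```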
To measure the numerical error of a computation by obtaining rigorous error bounds on it, we propose the use of a new number system called unums. As a use case, we employ unums to measure the numerical stability of Strassen-Winograd matrix multiplication when executed with different hardware instructions and with techniques that can potentially improve its stability. Managing numerical error in a controlled manner allows for potential energy savings in applications that are amenable to approximation. To this end, we augment existing symbolic execution methods to identify program components that lend themselves to reducing execution cost under acceptable losses of accuracy. Emerging applications such as deep learning demand floating-point representations more efficient than those currently available, so as to reduce the effects of numerical error when smaller bit widths are used to maximize energy savings. We show that the use of posit numbers in the training of several state-of-the-art deep neural networks produces superior results compared to other existing representations, and based on our analysis we propose a new configuration for low-precision posit training. Finally, we develop a framework that combines the above techniques to address the aforementioned issues in programs.
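Unum arithmetic is not part of any standard library, but the underlying idea of carrying rigorous bounds through a computation can be sketched with plain interval arithmetic. The sketch below is our illustration, not the thesis's implementation, and it omits the directed (outward) rounding that a truly rigorous implementation would apply at every step:

```python
# Illustrative sketch of propagating guaranteed-style bounds through a
# computation, in the spirit of unum/interval arithmetic. A rigorous
# implementation would round lower bounds down and upper bounds up.

class Interval:
    """A closed interval [lo, hi] enclosing the true real value."""
    def __init__(self, lo, hi=None):
        self.lo = lo
        self.hi = lo if hi is None else hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

    def width(self):
        # The interval width bounds the accumulated numerical uncertainty.
        return self.hi - self.lo

def dot(xs, ys):
    """Dot product over intervals; the result encloses the exact value."""
    acc = Interval(0.0)
    for x, y in zip(xs, ys):
        acc = acc + x * y
    return acc

# Inputs known only up to a small uncertainty; the output width shows
# how much error the computation has accumulated.
xs = [Interval(0.1, 0.1 + 1e-12) for _ in range(1000)]
ys = [Interval(0.3, 0.3 + 1e-12) for _ in range(1000)]
result = dot(xs, ys)
print(result.lo, result.hi, result.width())
```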