For software developers, avoiding errors and crashes is top priority. When an application crashes, it wastes time and resources, either rolling back to the last checkpoint or starting over entirely. But in the growing realm of approximate computing, where the output of applications doesn’t need to be exactly right but just “good enough,” avoiding crashes by giving these errors the silent treatment might help save energy and time.
That’s the idea behind “Crash Skipping,” a new technique created by UChicago CS graduate Yan Verdeja Herms and assistant professor Yanjing Li for efficient error recovery in approximate computing environments. The research received the best paper award at the ACM Great Lakes Symposium on VLSI (GLSVLSI) conference held in May 2019 in Washington DC.
The approach holds promise for approximate applications used in machine learning, computer vision, data mining, and other increasingly popular functions that don’t always require precise accuracy. Following the philosophy of “perfect is the enemy of good,” Crash Skipping largely ignores errors that would normally cause the application to crash, trusting that the end result will still be acceptable.
“The notion of allowing detected errors to propagate unchecked seemed strange, because one generally tries to avoid them. But sometimes starting with an aggressive sounding solution can yield pleasantly surprising results,” said Verdeja Herms. “If the goal of approximate computing is to relax reliability to achieve performance and energy efficiency benefits, crashes actually occur much more frequently than unacceptable outputs. We show that novel ways to handle crashes can provide large benefits.”