Problems with Floating-point Precision

As high-level programmers, we can take a lot of things for granted. Things like memory management and portability are problems for someone further down the food chain. That doesn’t mean there aren’t low-level concepts that still affect our everyday workflow, and one that seems to rear its head more often than others is the fundamental problem of floating-point arithmetic.
The root of the problem is that it is impossible to represent an infinite collection of numbers in a finite space. Just as the decimal system has problems representing fractions such as “⅓”, binary representation must overcome similar obstacles. For instance, the number “2.05” is actually stored as something around “2.04999995” as a 32-bit floating-point value. What’s more, if you are expecting a specific value, even a simple value by decimal standards, you may be in for a surprise. Try running this statement in PHP:
var_dump( 0.1 + 0.2 == 0.3 );
For those unfamiliar with PHP, this statement asks “is ‘0.1 + 0.2’ equal to ‘0.3’?” and prints whether the answer is true or false.
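You may or may not be surprised by this point to find out that the above statement prints false. Why? Well, let’s consider a standard 32-bit (single-precision) floating-point number, which uses 1 bit to hold the sign of the number (positive or negative), 8 bits to hold the exponent and the remaining 23 bits to hold the mantissa (the significant digits of the number). Without going into a full decimal-to-binary conversion tutorial, I’ll tell you that the two closest values to 2.05 that can be represented are 2.0499999523162841796875 and 2.05000019073486328125. The first is nearer, so that is what gets stored. PHP’s own floats are normally 64-bit doubles, which have more mantissa bits but the same limitation: 0.1, 0.2 and 0.3 are each stored as the nearest representable value, and the sum of the stored 0.1 and 0.2 comes out slightly higher than the stored 0.3.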
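To see these stored values first-hand, here is a minimal sketch. It assumes PHP 7 or later on a platform where floats are 64-bit IEEE 754 doubles (the common case), and the helper name toSinglePrecision is made up for this illustration. It rounds a value to 32 bits with pack() and unpack(), then prints more digits than PHP normally shows.

```php
<?php
// Hypothetical helper: round a PHP float (normally a 64-bit double) to the
// nearest 32-bit single-precision value by packing and unpacking it.
function toSinglePrecision(float $value): float
{
    return unpack('f', pack('f', $value))[1];
}

// 2.05 as a 32-bit float: the nearest representable value is slightly below.
printf("%.22f\n", toSinglePrecision(2.05)); // 2.0499999523162841796875

// The same effect at double precision explains the failed comparison.
printf("%.17f\n", 0.1 + 0.2); // 0.30000000000000004
printf("%.17f\n", 0.3);       // 0.29999999999999999
```

The last two lines show that the sum and the literal land on different neighboring values, which is why the strict comparison fails.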
In many everyday calculations, this lack of precision goes unnoticed, and some languages have built-in features to help address it, but there are times when it can become particularly troublesome. One is the “0.1 + 0.2” example: cases in which we test for very specific values that result from arithmetic.
Another is when a calculation involves many different values and large quantities, especially with multiplication and division. For instance, a business that sells items by size, such as fabric or other raw materials, may choose to set prices based on size and/or weight. Combine these calculations with quantity discounts, coupons, tax and shipping calculations and you’ll find that every new calculation can add to the margin of error of the result.
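A small sketch of how the error builds up, assuming 64-bit double floats: adding 0.1 to a running total ten times should give exactly 1, but each addition carries its own rounding error. The variable name is only illustrative.

```php
<?php
// Add 0.1 to a running total ten times. Mathematically the result is 1.
$total = 0.0;
for ($i = 0; $i < 10; $i++) {
    $total += 0.1;
}

var_dump($total == 1.0);   // bool(false)
printf("%.17f\n", $total); // 0.99999999999999989
```

The drift here is tiny, but a chain of price, discount and tax calculations gives it many more chances to grow.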
So what do we do? Well, 9 times out of 10, we work with it. We try to be proactive and recognize situations in which compounded arithmetic might lead to a significant loss of accuracy. We round our calculations at certain points so that small errors do not carry forward. We do comparisons using ranges, with “greater than” or “less than” checks against a small tolerance, instead of strict equality.
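Both habits can be sketched in a few lines. The function name nearlyEqual and the default tolerance of 1e-9 are assumptions made for this example, not part of PHP. The right tolerance depends on the magnitudes and the precision your application cares about.

```php
<?php
// Hypothetical helper: treat two floats as equal when they differ by no more
// than a tolerance scaled to the size of the operands.
function nearlyEqual(float $a, float $b, float $tolerance = 1e-9): bool
{
    $scale = max(1.0, abs($a), abs($b));
    return abs($a - $b) <= $tolerance * $scale;
}

var_dump(0.1 + 0.2 == 0.3);            // bool(false)
var_dump(nearlyEqual(0.1 + 0.2, 0.3)); // bool(true)

// Rounding both sides to the precision that matters (here, two decimals)
// before comparing is another option.
var_dump(round(0.1 + 0.2, 2) == round(0.3, 2)); // bool(true)
```

Scaling the tolerance by the larger operand keeps the check meaningful for large values, where the gap between neighboring floats is wider.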
Of course, developers have also come up with solutions that are available in many of the most common programming languages, such as PHP’s GMP (for arbitrary-precision integers) and BC Math (for arbitrary-precision decimal numbers) functions. These solutions come with a performance cost, but such are the compromises we make as programmers.
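As a rough illustration, here is the earlier comparison redone with BC Math. This assumes the bcmath extension is installed and enabled, which is not the case on every PHP build. The operands are passed as strings so they never go through binary floating point.

```php
<?php
// Assumes the BC Math extension is loaded. The third argument is the scale:
// the number of digits kept after the decimal point.
$sum = bcadd('0.1', '0.2', 1);

var_dump($sum);                         // string(3) "0.3"
var_dump(bccomp($sum, '0.3', 1) === 0); // bool(true)
```

bccomp() returns 0 when its two operands are equal at the given scale, so the comparison that failed with native floats now succeeds.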

By Brad Hagedorn on 02/13/2018 11:00 AM

eLink Design is a national web design, application development, SEO, and business consulting firm, founded in 2001 that specializes in custom solutions for over 800 clients around the country. With this blog, we hope to provide insights into what we are working on, areas where we think we can help shed light on problems we hear, and sometimes just cool things we have come across.

