
Floating Point Numbers

Like integers, floating point numbers come in three sizes:
* float - 4 bytes (about 7 decimal digits of precision),
* double - 8 bytes (about 15 decimal digits of precision),
* long double - 16 bytes on many systems (the exact size and precision depend on the compiler and platform).
These sizes determine not only the range of numbers that can be represented, but also the precision of each number.

How are floating point numbers represented internally?

Like scientific notation. For example, the national debt is about $4 trillion. We can write that as:
     $ 4 0 0 0 0 0 0 0 0 0 0 0 0

which takes up a lot of space on the page. A more compact representation is:

     4.0 x 10^12

In C, we can write a constant like that as:

     4.0E12
or
     4.0e12

By default this will be stored as a double, but we can force it to be float:
     4.0E12F or 4.0E12f

or long double:
     4.0E12L or 4.0E12l

Of course this is the same as:
      40.0E11
     400.0E10
       0.4E13

To make best use of the bits available for representing a floating point number, we would like to have just one representation for each. Such numbers will always be stored in normalized form - say with one significant digit to the left of the radix point (adjusting the exponent as appropriate).

So, to store the number in memory, we do not need to store the base: it is implied. (We will use base 10 for our examples, though in the computer it would really be 2.) We do not need to store the radix point either, since it is implied by the normalization. So the national debt would be stored as just a sign, an exponent (12), and the digits of the mantissa (4 followed by zeros).


Here we are representing a floating point number in 8 digits in decimal, but the real binary representation is similar.

But the national debt is really:

     $ 4 1 3 7 3 4 1 8 2 6 1 7 5 . 6 3
or

     $ 4 . 1 3 7 3 4 1 8 2 6 1 7 5 6 3 E 12
but since we are limited in the number of digits we can store, we would have to keep only the leading digits, 4 1 3 7, throwing away the rest:
     3 4 1 8 2 6 1 7 5 6 3
and thus losing some of the precision we had for that number.

So we can see that we cannot represent all real numbers exactly; some error is introduced because we have to truncate digits. But the more digits of mantissa we can store, the greater the precision and therefore the accuracy.

