Floating Point
Last updated
Was this helpful?
Last updated
Was this helpful?
We have a binary point that signifies the boundary between the integer and fractional portion. That is, the number to the right of the decimal point represents , and the number to the right of the decimal number represents . .
Same as traditional addition.
Same as traditional multiplication.
If we could just move the binary point wherever we wanted, we could store much more varied information. To do this we use a variant on scientific notation. It has a few components: given a scientific number , is the mantissa, is the exponent, is the radix, or base. In IEEE 754 Floating Point Standard, we dedicate one bit to the sign, 8 bits for our exponent, 23 bits for our significand.
We can use bias notation to represent more numbers, where the bias is a number subtracted to get the real number.