Why?

While there are some excellent sources of information about the theory behind fixed point, most of them present the facts without examples that show the idea in action.

On this page you can find an online demo of the representation of fixed point and some of its operations. While there might be other schemes out there, the techniques here have been used in various projects and as such should be useful.

Note that the target audience is expected to have a basic understanding of binary representation of signed and unsigned numbers and arithmetic operations at the binary level.

Representation

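[Interactive demo: the Signed, Bits and Fractional fields select a format, for which the page shows the Format and the Range. The Decimal, Integer, Hex and Bits fields show one value in each representation; type a representation in one of them and finish with enter to update the others.]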
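The format names used on this page follow the pattern FIX_8_5 (8 bits, 5 of which are fractional), with a UFIX prefix for unsigned formats. As a minimal sketch of how the range of such a format follows from those two numbers, assuming two's complement for signed formats (the helper name `fixed_range` is made up for this illustration):

```python
def fixed_range(bits, frac, signed=True):
    """Return (minimum, maximum, step) of a fixed point format as floats.

    Hypothetical helper: `bits` is the total width, `frac` the number of
    fractional bits. Signed formats are assumed to be two's complement.
    """
    scale = 1 << frac                    # 2**frac
    if signed:
        lo = -(1 << (bits - 1))          # most negative raw integer
        hi = (1 << (bits - 1)) - 1       # most positive raw integer
    else:
        lo = 0
        hi = (1 << bits) - 1
    return lo / scale, hi / scale, 1 / scale

print(fixed_range(8, 4))           # FIX_8_4:  (-8.0, 7.9375, 0.0625)
print(fixed_range(8, 4, False))    # UFIX_8_4: (0.0, 15.9375, 0.0625)
```

The step is the value of the least significant bit, so adding fractional bits trades range for resolution.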

Fixed point addition

Because fixed point is a way of interpreting bits rather than a separate binary encoding, addition and subtraction of two fixed point numbers in the same format are the same as integer addition and subtraction.

Example using FIX_8_4:
4.50
0.75 +
------
5.25

Now the same example using binary addition in FIX_8_4 (the dot is only to emphasize the fractional part):
0100.1000
0000.1100 +
-----------
0101.0100

Naturally, overflow is still a possibility. But when handling fixed point logic, one can use normal integer addition and subtraction.
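The addition example can be reproduced with plain integers. This is only a sketch: the helper `to_signed` is a hypothetical name, and the wrap-around behaviour shown for overflow assumes an 8-bit two's complement register.

```python
BITS = 8
FRAC = 4            # FIX_8_4: 8 bits, 4 of which are fractional

def to_signed(raw, bits):
    """Interpret the low `bits` bits of raw as a two's complement integer."""
    raw &= (1 << bits) - 1
    return raw - (1 << bits) if raw & (1 << (bits - 1)) else raw

a = 0b01001000      # 4.50
b = 0b00001100      # 0.75
total = to_signed(a + b, BITS)      # ordinary integer addition
print(format(total & 0xFF, "08b"), total / (1 << FRAC))   # 01010100 5.25

# Overflow: 7.5 + 1.0 exceeds the FIX_8_4 maximum of 7.9375 and wraps around.
wrapped = to_signed(0b01111000 + 0b00010000, BITS)
print(wrapped / (1 << FRAC))        # -7.5
```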

Fixed point multiplication

Multiplication is a bit trickier.
First, you need to align the virtual dot in both numbers and then sign extend and pad both until their lengths match.

Let's look at FIX_8_2 0.5 and FIX_4_3 -0.25, which are both aligned to FIX_9_3:
FIX_8_2: 000000.10 => 000000.100
FIX_4_3: 1.110 => 111111.110
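The alignment step can be sketched as follows. The helper names (`sign_extend`, `align`, `bit_string`) are made up for this illustration, and the sketch assumes the target format has at least as many fractional bits as the input. Python integers have unlimited width, so the sign extension becomes visible only when the value is masked to the target width for printing.

```python
def sign_extend(raw, bits):
    """Interpret the low `bits` bits of raw as a two's complement integer."""
    raw &= (1 << bits) - 1
    return raw - (1 << bits) if raw & (1 << (bits - 1)) else raw

def align(raw, bits, frac, target_frac):
    """Sign extend a raw value and pad it with zero fractional bits."""
    return sign_extend(raw, bits) << (target_frac - frac)

def bit_string(value, bits):
    """Show the two's complement bit pattern of value in `bits` bits."""
    return format(value & ((1 << bits) - 1), "0{}b".format(bits))

a = align(0b00000010, 8, 2, 3)    # FIX_8_2  0.50
b = align(0b1110, 4, 3, 3)        # FIX_4_3 -0.25

print(bit_string(a, 9), a / 8)    # 000000100 0.5
print(bit_string(b, 9), b / 8)    # 111111110 -0.25
```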

After aligning, you can multiply both numbers in the same format. Let's start with an example of two positive fractions of the same type: we multiply FIX_8_4 0.5 and 0.25.
Representations:
0000.1000              0.50
0000.0100 *            0.25 *
-----------------      -----
00000000.00100000      0.125
The result has as many integer bits as both fixed point inputs combined, and the same goes for the fractional bits: two FIX_8_4 inputs give a FIX_16_8 result.
Note that this is a normal integer multiplication at the bit level. If the output has to have the same format as the inputs (at the risk of overflow or loss of precision), shift the result right by 4 bits and take the lower 8 bits to obtain a FIX_8_4 result.
In practice you need to extend the number of bits so that both formats match and the virtual dot is aligned to get the correct result; see the next example.
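As a small sketch of the positive FIX_8_4 example, using nothing but integer multiplication, a shift and a mask (this covers non-negative inputs only; the variable names are chosen for this illustration):

```python
FRAC = 4                  # FIX_8_4: 4 fractional bits

a = 0b00001000            # 0.50 in FIX_8_4
b = 0b00000100            # 0.25 in FIX_8_4

product = a * b           # plain integer multiply, result is FIX_16_8
print(format(product, "016b"), product / (1 << (2 * FRAC)))
# 0000000000100000 0.125

# Back to FIX_8_4: drop 4 fractional bits, keep the lower 8 bits.
narrowed = (product >> FRAC) & 0xFF
print(format(narrowed, "08b"), narrowed / (1 << FRAC))
# 00000010 0.125
```

In this case no precision is lost, because the four bits that are shifted out are all zero.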

The same procedure works for inputs with mixed signs. Let's multiply FIX_3_2 0.5 and FIX_3_2 -0.25, where the result will be FIX_6_4.
Note that we sign-extend the inputs to double the bit width in order to get the correct multiplication result. The multiplication in essence produces a 12-bit result, but the upper 6 bits are discarded.
0.10 => 0000.10       0.50
1.11 => 1111.11 *    -0.25 *
---------------      ------
        11.1110      -0.125

In this example the result is 11.1110. If this were truncated to the FIX_3_2 format, we would lose precision: 1.11. This is the representation of -0.25, the nearest value below -0.125 that the 3 bits can hold.
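The mixed-sign example can be traced step by step. This is a sketch that emulates fixed-width registers with masks; `sign_extend` is a hypothetical helper, and the truncation to FIX_3_2 is done with an arithmetic right shift, which rounds toward negative infinity.

```python
def sign_extend(raw, bits):
    """Interpret the low `bits` bits of raw as a two's complement integer."""
    raw &= (1 << bits) - 1
    return raw - (1 << bits) if raw & (1 << (bits - 1)) else raw

a6 = sign_extend(0b010, 3) & 0x3F     # 0.50 -> 000010 (6 bits)
b6 = sign_extend(0b111, 3) & 0x3F     # -0.25 -> 111111 (6 bits)

full = a6 * b6                        # up to 12 bits wide
low6 = full & 0x3F                    # discard the upper 6 bits
result = sign_extend(low6, 6)         # FIX_6_4 raw value

print(format(low6, "06b"), result / (1 << 4))        # 111110 -0.125

# Truncating to FIX_3_2 drops the two lowest fractional bits.
truncated = result >> 2
print(format(truncated & 0b111, "03b"), truncated / (1 << 2))   # 111 -0.25
```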

Fixed point division

Division is similar to multiplication but requires a different alignment.

The difference between the number of fractional bits of the numerator and that of the denominator gives the number of fractional bits in the answer.
If we divide a numerator in UFIX_8_4 notation by a denominator in UFIX_8_4, the result will be a UFIX_4_0 number.
Now, if we divide a UFIX_12_8 number by a UFIX_8_4 number, the result will be a UFIX_8_4 number: the highest number of integer bits is 4, so the answer will have 4 integer bits. The numerator has 8 fractional bits while the denominator has 4, so the result of the division will have 8 - 4 = 4 fractional bits.

Note that this means that the numerator has to have at least as many fractional bits as the denominator to end up with a complete integer result.
If the denominator has more fractional bits than the numerator, the result will not be complete, as the lower integer bits will be truncated.

Let's do an example: we will divide 2 by 8, both in UFIX_8_4 notation: 0x20 / 0x80. As noted before, the end result will be in UFIX_4_0 notation.
0x20   0010.0000
0x80   1000.0000 /
----------------
0x0    0000.
As we can see, the result gets truncated to 0.

Let's do the same thing again, but now we want a UFIX_8_4 result. This means the numerator has to have 4 more fractional bits than the denominator.
As we still want 4 integer bits, the numerator keeps the 4 integer bits from the first example.
The desired format for the numerator is thus UFIX_12_8.

Converting 2.0 to UFIX_12_8 notation means shifting it 4 bits to the left compared to the first example: 0x200.
0x200   0010.0000 0000
0x80    1000.0000 /
----------------------
0x4     0000.0100

The result is 0x4, which in UFIX_8_4 notation is 0.25 in decimal, the correct result of 2 / 8.
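Both division examples can be expressed with one small function. This is a sketch for unsigned values only; the name `ufix_div` and its parameters are made up for this illustration. It shifts the numerator so that it has exactly as many extra fractional bits as the result should have, and then performs an ordinary integer division.

```python
def ufix_div(num, num_frac, den, den_frac, result_frac):
    """Divide raw unsigned fixed point values (hypothetical helper).

    The numerator is shifted so that it has result_frac more fractional
    bits than the denominator before the integer division.
    """
    shift = result_frac - (num_frac - den_frac)
    if shift >= 0:
        num <<= shift       # add fractional bits to the numerator
    else:
        num >>= -shift      # drop surplus fractional bits
    return num // den       # integer division truncates

two, eight = 0x20, 0x80                 # 2.0 and 8.0 in UFIX_8_4

print(ufix_div(two, 4, eight, 4, 0))    # 0: no fractional bits in the result
q = ufix_div(two, 4, eight, 4, 4)       # numerator becomes 0x200 (UFIX_12_8)
print(hex(q), q / (1 << 4))             # 0x4 0.25
```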

Manually convert between decimal and fixed point

To convert between decimal and fixed point, you simply multiply or divide by the power of 2 that you scale with.

Example:
Let's say you want to convert the decimal '1.05' to FIX_8_5 (8 bits, 5 of which are fractional). You determine the scale factor from the position of the virtual dot: 2^5 = 32.
Now multiply the decimal by the scale: 1.05 * 32 = 33.6. This needs to be rounded to 34, or 0x22, to obtain the 'raw' fixed point value.

Example 2:
We have 0x42 as a FIX_8_7 number and we want to know its decimal value. The scale factor is 2^7 = 128.
So 0x42 / 128 = 66 / 128 = 0.52 (rounded).
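The two conversions can be written as a pair of one-line functions. This is a minimal sketch with hypothetical names (`to_fixed`, `from_fixed`); it does not check whether the value fits in the target number of bits, and Python's built-in `round` rounds exact ties to the nearest even integer, which may differ from the rounding used elsewhere.

```python
def to_fixed(value, frac):
    """Scale a decimal value by 2**frac and round to the raw integer."""
    return round(value * (1 << frac))

def from_fixed(raw, frac):
    """Divide the raw integer by 2**frac to get the decimal value."""
    return raw / (1 << frac)

raw = to_fixed(1.05, 5)           # FIX_8_5, scale 32
print(raw, hex(raw))              # 34 0x22
print(from_fixed(raw, 5))         # 1.0625, the closest FIX_8_5 value to 1.05

print(from_fixed(0x42, 7))        # FIX_8_7: 0.515625
```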