Two’s Complement

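Positive numbers: ordinary binary representation
Negative numbers: flip every bit of the magnitude and add one; to reverse, do the same again (flip and add one)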

Another trick: scanning from right to left, keep every bit up to and including the first 1 encountered, then flip all the bits to its left
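A minimal Python sketch of both methods, assuming a fixed width given in bits. The helper names twos_complement and twos_complement_trick are made up for illustration.

```python
def twos_complement(value: int, bits: int) -> int:
    """Return the two's complement of `value` in `bits` bits (flip, then add one)."""
    mask = (1 << bits) - 1
    flipped = value ^ mask           # flip every bit
    return (flipped + 1) & mask      # add one, keep only `bits` bits


def twos_complement_trick(value: int, bits: int) -> int:
    """Same result: keep bits up to the rightmost 1, flip everything to its left."""
    mask = (1 << bits) - 1
    if value & mask == 0:
        return 0                     # no 1 to find, zero negates to zero
    lowest_one = value & -value      # isolates the rightmost 1
    keep = (lowest_one << 1) - 1     # positions up to and including that 1
    return ((value ^ mask) & ~keep & mask) | (value & keep)


print(format(twos_complement(0b0110, 4), "04b"))        # 1010
print(format(twos_complement_trick(0b0110, 4), "04b"))  # 1010
```

Applying either function twice returns the original pattern, which matches the "reverse: flip and add one" rule.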

400

Subtraction

Suppose we compute A - B with 4 bits

  1. Convert B to its two’s complement form (e.g. for B = 0110: 1111 - 0110 + 0001 = 1010)
  2. Add the result to A and discard any carry out of the 4th bit
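The two steps can be sketched in Python as below. This assumes a 4-bit width by default and that the carry out of the top bit is simply dropped; subtract is a hypothetical helper name.

```python
def subtract(a: int, b: int, bits: int = 4) -> int:
    """Compute a - b by adding the two's complement of b, within `bits` bits."""
    mask = (1 << bits) - 1
    neg_b = ((b ^ mask) + 1) & mask   # step 1: two's complement of b
    return (a + neg_b) & mask         # step 2: add, drop the carry out


print(format(subtract(0b0111, 0b0110), "04b"))  # 0001  (7 - 6 = 1)
print(format(subtract(0b0011, 0b0110), "04b"))  # 1101  (3 - 6 = -3)
```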

Signed vs Unsigned

For n bits:

            Unsigned         Signed 2’s Complement
Minimum     0                -2^(n-1)
Maximum     2^n - 1          2^(n-1) - 1
Range       0 to 2^n - 1     -2^(n-1) to 2^(n-1) - 1

A signed integer overflows when a result exceeds the maximum representable value and underflows when it falls below the minimum
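A small Python sketch of the n-bit limits and of what exceeding them looks like. It assumes the result wraps around modulo 2^n, which is how fixed-width two's complement hardware typically behaves (some languages define it differently). All function names are illustrative.

```python
def unsigned_range(bits: int) -> tuple:
    """Smallest and largest unsigned value in `bits` bits."""
    return 0, (1 << bits) - 1


def signed_range(bits: int) -> tuple:
    """Smallest and largest two's complement value in `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def wrap_signed(value: int, bits: int) -> int:
    """Reduce `value` modulo 2**bits and read it as a signed integer."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


print(unsigned_range(8))       # (0, 255)
print(signed_range(8))         # (-128, 127)
print(wrap_signed(127 + 1, 8))   # -128, overflow past the maximum
print(wrap_signed(-128 - 1, 8))  # 127, underflow past the minimum
```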

Sign Extension

  • For signed integers
  • Fill missing bits with the sign digit, e.g. 1001 → 1111 1001
  • Value will be preserved
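A minimal Python sketch of sign extension, with a check that the signed value is unchanged. The names sign_extend and to_signed are hypothetical helpers.

```python
def sign_extend(pattern: int, from_bits: int, to_bits: int) -> int:
    """Widen a two's complement bit pattern by copying its sign bit upward."""
    sign = (pattern >> (from_bits - 1)) & 1
    if sign:
        fill = ((1 << (to_bits - from_bits)) - 1) << from_bits  # new bits all 1
        return pattern | fill
    return pattern                                              # new bits all 0


def to_signed(pattern: int, bits: int) -> int:
    """Read a `bits`-wide pattern as a signed two's complement value."""
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


wide = sign_extend(0b1001, 4, 8)
print(format(wide, "08b"))                      # 11111001
print(to_signed(0b1001, 4), to_signed(wide, 8))  # -7 -7
```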

Zero Extension

  • For unsigned integers
  • Fill missing bits with 0
  • Used for bitwise logical operations
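For comparison, a short Python sketch of zero extension. Python integers have no fixed width, so the widening is shown on the bit string; zero_extend is an illustrative name.

```python
def zero_extend(pattern: int, from_bits: int, to_bits: int) -> str:
    """Return the `to_bits`-wide bit string of an unsigned `from_bits` pattern."""
    if not 0 <= pattern < (1 << from_bits) or to_bits < from_bits:
        raise ValueError("pattern does not fit or target width is too small")
    return format(pattern, f"0{to_bits}b")   # new high bits are all 0


wide = zero_extend(0b1001, 4, 8)
print(wide)          # 00001001
print(int(wide, 2))  # 9, the unsigned value is unchanged
```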

Floating Point Numbers

Normalized Scientific Notation

  • Scientific notation with no leading 0’s, i.e. exactly one nonzero digit before the point

Important

In a normalized binary number, the leading digit is always 1, so it does not need to be stored; this saves one bit of memory

Important

A bias of 2^(n-1) - 1, where n is the size of the exponent field in bits, is added to the exponent so that every stored exponent is non-negative, which makes comparison faster
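A short Python sketch of the bias, plus a look at the stored exponent field of a few single-precision values using the standard struct module. The helper names bias and stored_exponent are made up for this example.

```python
import struct


def bias(exponent_bits: int) -> int:
    """Bias for an exponent field of the given width: 2^(n-1) - 1."""
    return (1 << (exponent_bits - 1)) - 1


def stored_exponent(x: float) -> int:
    """Return the 8-bit stored exponent of x packed as single precision."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return (bits >> 23) & 0xFF


print(bias(8), bias(11))     # 127 1023
print(stored_exponent(0.5))  # 126, actual exponent -1
print(stored_exponent(1.0))  # 127, actual exponent 0
print(stored_exponent(4.0))  # 129, actual exponent 2
```

Even the negative exponent of 0.5 is stored as a non-negative number, so stored exponents can be compared as plain unsigned integers.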

Single Precision

  • 32 bits, 1 sign bit, 8 bits exponent, 23 bits significand

Range

Exponent fields 0000 0000 and 1111 1111 are reserved (for zero and subnormal numbers, and for infinity and NaN)

  • Smallest

    • Exponent: 0000 0001, i.e. 1 - 127 = -126
    • Fraction: 000...00
  • Largest

    • Exponent: 1111 1110, i.e. 254 - 127 = 127
    • Fraction: 111...11
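The two extremes can be checked in Python by assembling the bit fields and reinterpreting them as a single-precision float with the standard struct module. This is a sketch that assumes the 1/8/23 field layout given above; from_fields is a hypothetical helper.

```python
import struct


def from_fields(sign: int, exponent: int, fraction: int) -> float:
    """Build a single-precision float from its three raw bit fields."""
    bits = (sign << 31) | (exponent << 23) | fraction
    return struct.unpack(">f", struct.pack(">I", bits))[0]


smallest = from_fields(0, 0b00000001, 0)              # 1.0 x 2^-126
largest = from_fields(0, 0b11111110, (1 << 23) - 1)   # 1.11...1 x 2^127

print(smallest == 2.0 ** -126)                    # True
print(largest == (2 - 2.0 ** -23) * 2.0 ** 127)   # True
```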

Double Precision

  • 64 bits, 1 sign bit, 11 bits exponent, 52 bits significand

Bias

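e.g. 8 bits exponent: stored values 0 to 255
after subtracting the bias of 127: actual exponents -127 to 128 (-126 to 127 once the two reserved patterns are excluded)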

Formula

value = (-1)^sign x 1.fraction x 2^(E - bias)
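A minimal Python sketch of decoding a normalized single-precision value from its raw 32 bits, assuming the usual layout of 1 sign bit, 8 exponent bits and 23 fraction bits. decode_single is a hypothetical helper and deliberately rejects the two reserved exponent patterns.

```python
def decode_single(bits: int) -> float:
    """Decode a normalized single-precision bit pattern into a Python float."""
    sign = bits >> 31
    stored = (bits >> 23) & 0xFF             # biased exponent E
    fraction = bits & ((1 << 23) - 1)
    if stored in (0, 255):
        raise ValueError("reserved exponent: not a normalized number")
    significand = 1 + fraction / (1 << 23)   # restore the implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (stored - 127)


print(decode_single(0x40B80000))  # 5.75
print(decode_single(0xC0B80000))  # -5.75
```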

Conversion Example

5.75 (10) = 101.11 (2)

use short division by 2 for the integer part (read the remainders from bottom to top)
use repeated multiplication by 2 for the fractional part

0.75 x 2 = 1.5
0.5 x 2 = 1.0 (stop when the fractional part is 0)

read the integer parts of the products from top to bottom: 0.75 (10) = 0.11 (2)

then shift the result until there is only one single digit in the integer part

101.11 → 1.0111 x 2^2 (point shifted left by 2 digits)
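The conversion and the shift can be sketched in Python as follows. This is a rough sketch for non-negative values only, and normalize assumes the value is at least 1; both function names are illustrative.

```python
def to_binary(value: float, max_fraction_bits: int = 23) -> str:
    """Convert a non-negative number to a binary string such as '101.11'."""
    integer = int(value)
    fraction = value - integer
    int_digits = ""
    while integer > 0:
        int_digits = str(integer % 2) + int_digits  # remainders, last one first
        integer //= 2
    frac_digits = ""
    while fraction > 0 and len(frac_digits) < max_fraction_bits:
        fraction *= 2
        frac_digits += str(int(fraction))           # integer parts, top to bottom
        fraction -= int(fraction)
    return (int_digits or "0") + ("." + frac_digits if frac_digits else "")


def normalize(binary: str):
    """Shift the point so one digit remains before it; assumes value >= 1."""
    int_digits, _, frac_digits = binary.partition(".")
    exponent = len(int_digits) - 1                  # number of places shifted
    return int_digits[0] + "." + int_digits[1:] + frac_digits, exponent


print(to_binary(5.75))      # 101.11
print(normalize("101.11"))  # ('1.0111', 2)
```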

then we have to store the sign, significand, exponent

storing: E = bias + exp
retrieve: exp = E - bias
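Putting the three fields together for the 5.75 example, a Python sketch under the single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits). encode_single is a hypothetical helper; the result is compared with what the standard struct module produces.

```python
import struct


def encode_single(sign: int, exponent: int, fraction_digits: str) -> int:
    """Pack sign, true exponent and fraction digits (after '1.') into 32 bits."""
    bias = (1 << (8 - 1)) - 1                          # 127
    stored = exponent + bias                           # E = bias + exp
    fraction = int(fraction_digits.ljust(23, "0"), 2)  # leading 1 is not stored
    return (sign << 31) | (stored << 23) | fraction


bits = encode_single(0, 2, "0111")                     # +1.0111 x 2^2
print(format(bits, "032b"))  # 01000000101110000000000000000000
print(bits.to_bytes(4, "big") == struct.pack(">f", 5.75))  # True
```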

bias is the max value of n - 1 bits, e.g. with an 8-bit exponent, an exponent of 0 is stored as 127

bias = 2^(n-1) - 1

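Minimum positive value for single precision, written as (sign)(exponent)(fraction); it uses the reserved all-zero exponent, so it is a subnormal number:
(0)(0)(000…0001)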

Overflow

A positive exponent becomes too large to fit in the exponent field

Underflow

A negative exponent becomes too large in magnitude to fit in the exponent field
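Both effects can be observed in Python. Note the assumption: Python's float is double precision on typical platforms, so the limits here are those of the 64-bit format rather than the 32-bit one.

```python
import math
import sys

largest = sys.float_info.max   # largest finite double
overflowed = largest * 10      # exponent no longer fits: result is inf

tiny = 5e-324                  # smallest positive (subnormal) double
underflowed = tiny / 10        # too small to represent: result is 0.0

print(math.isinf(overflowed))  # True
print(underflowed == 0.0)      # True
```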

ASCII