Types

  • RAM

    • Static RAM for cache
    • Dynamic for main memory
  • HDD or SSD

  • DRAM

    • Capacitor
    • Wait for recharge
  • SRAM

    • Transistor
    • No periodic refreshes
    • Read is not destructive
  • HDD

    • Cylinder
    • Platter
    • Track (moving head to correct track is called seek time)
    • Sector (rotational latency)

Principle of Locality

  • Memory bottleneck memory hierarchy to exploit the principle of locality

Temporal locality

  • Items accessed recently are likely to be accessed again

Spatial locality

  • Items near those accessed recently are likely to be accessed soon

How

  • Put things into SRAM (cache)

  • Memory hierarchy

    • Illusion of as large as largest level
    • Illusion of as fast as the fastest level
    • Always get data from closet cache
    • Steps
      • Hard disk
      • Load into DRAM
      • On first access, copy instruction/data to cache
        • Also copy nearby memories
      • On subsequent access, look in cache look in memory
    • Hit: access satisfied by upper level
      • time to get data from upper level
    • Miss: block copied from lower level
      • miss penalty: time to copy
    • Hit rate (hit ratio): probability
    • Miss rate: 1 - hit rate: 1 - hit rate
    • Cache hit time: time to determine hit/miss + time to access the cache
    • Cache miss penalty: time to bring a block from lower to higher
    • Hit time << miss penalty
    • average memory latency
      • single
        • hit time + miss ratio * miss penalty
      • two level

c for cycle, m for miss rate
average memory access time =

Cache

Block Placement

  • Organized as an array of cache blocks
  • Cache and memory location is an one-to-one mapping
  • Cache size usually 32/64 bytes

Direct-mapped

  • One memory block to one possible cache block

  • Addr of cache = Block number % Number of blocks

  • Block number = floor(addr/block size)

    • e.g. , we only do shifting
    • and then , we put at cache addr 5
  • Shortcuts

    • division: shift right by n bits (blcok size = )
    • modulo: retrieve last n bits (cache size = )

Set-associative

  • Kind of like linked list in hash table

Fully-associative

  • Each block can be placed anywhere in the cache
  • Tag will be longer
  • Costly to search

Block Identification

  • Tags
  • Valid Bits
ValidTagCache
1 or 0Everything except the last 2 chunks of missing bits
  • 000…0100 101 10000

    • Tag
    • determines cache addr
    • determines cache block
  • The cache addr determines the offset in block of the main memory

  • It’s a hit if the tag is same as the upper bits in the main memory

  • miss rate = num of misses / num of total memory accesses

Block Replacement

Write Strategy