Saturday, November 25, 2017

Cache Coherence

Overview

In a shared-memory multiprocessor system where each processor (CPU) has its own cache, several copies of the same shared data can exist: one in main memory and one in the local cache of each processor that requested it. When one copy is changed, the other copies must reflect that change; otherwise the system runs into data integrity problems. The mechanism that solves this is called 'Cache Coherence'.

Cache Coherence is the discipline that ensures changes to the values of shared operands (data) are propagated throughout the system in a timely fashion.

Shared Memory: In computer hardware, shared memory refers to a (typically large) block of random access memory (RAM) that can be accessed by several different central processing units (CPUs) in a multiprocessor computer system. Below is an illustration of a shared memory system of three processors.

Multi-Processing: Multiprocessing is the use of two or more central processing units (CPUs) within a single computer system.

More Details:

Consider two clients that each hold a cached copy of a particular memory block from a previous read. If one client updates that memory block, the other client could be left with a stale copy in its cache, without any notification of the change. Cache coherence is intended to manage such conflicts by maintaining a coherent view of the data values across multiple caches.

More Technical

As discussed, when multiple processors with on-chip caches share a common bus and a common memory, the caches must be kept in a coherent state.

Let's look at the problem and its solution in more technical detail.
Assume main memory holds the value '200' at location 'x'.

Step 1: 
  • Processor_A (PA) reads location x. A copy of x is transferred to PA's cache.
  • Processor_B (PB) also reads location x. A copy of x is transferred to PB's cache too.

Step 2:
  • PA adds 1 to x. x is in PA's cache, so there's a cache hit, and only PA's copy is updated.
  • If PB reads x again (perhaps after synchronizing with PA), it will also see a cache hit. However, it will read a stale value of x.
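
To make Steps 1 and 2 concrete, here is a minimal Python sketch of the stale read. It is only a toy model, not real hardware: the `Cache` class, the dictionary standing in for main memory, and the variable names are all assumptions made for illustration. Each cache keeps a private copy, and nothing tells PB's cache about PA's write.

```python
# Toy model only: each cache is a private dict with no coherence hardware.
main_memory = {"x": 200}


class Cache:
    """Hypothetical private cache that never hears about other caches' writes."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def read(self, addr):
        if addr not in self.lines:           # miss: fetch from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]              # hit: serve the local copy

    def write(self, addr, value):
        self.lines[addr] = value             # updates only the local copy


pa_cache = Cache(main_memory)
pb_cache = Cache(main_memory)

# Step 1: both processors read x and each gets its own copy.
pa_cache.read("x")
pb_cache.read("x")

# Step 2: PA adds 1 to x (a hit in PA's cache).
pa_cache.write("x", pa_cache.read("x") + 1)

pa_value = pa_cache.read("x")   # PA sees its own update
pb_value = pb_cache.read("x")   # PB hits on its old copy
print(pa_value, pb_value)
```

Running it prints `201 200`: PA sees its own update, while PB still gets a cache hit on the old value.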

Problem Resolution:

This problem is avoided by adding 'Cache Coherence' hardware to the system interface. Each processor's coherence hardware monitors the bus for transactions that affect locations held in its own cache.
In addition, a cache must generate an 'Invalidate Transaction' whenever it writes to a shared location.

That is:
  • When PA updates x, PA's cache generates an Invalidate Transaction (i.e. it simply tells all the other processors the address of the cache line that has been invalidated).
  • When PB's hardware sees the invalidate x transaction, it finds a copy of x in its cache and marks it 'Invalid'.
  • Now a read of x by PB will cause a cache miss and initiate a data bus transaction to read x from main memory.
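
The three steps above can be sketched roughly as follows. This is a simplified model under stated assumptions: `Bus`, `SnoopingCache` and their method names are hypothetical, and writes go straight through to main memory (write-through) purely to keep the sketch short, so that PB's re-read finds the new value there.

```python
class Bus:
    """Hypothetical shared bus that carries address-only invalidates."""

    def __init__(self):
        self.caches = []

    def invalidate(self, addr, sender):
        # Only the address travels on the bus, never the data.
        for cache in self.caches:
            if cache is not sender:
                cache.snoop_invalidate(addr)


class SnoopingCache:
    def __init__(self, bus, memory):
        self.bus = bus
        self.memory = memory
        self.data = {}
        self.valid = set()        # addresses whose local copy is usable
        self.misses = 0
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.valid:               # miss: go to main memory
            self.misses += 1
            self.data[addr] = self.memory[addr]
            self.valid.add(addr)
        return self.data[addr]

    def write(self, addr, value):
        self.bus.invalidate(addr, sender=self)   # tell the other caches first
        self.data[addr] = value
        self.valid.add(addr)
        self.memory[addr] = value                # assumption: write-through

    def snoop_invalidate(self, addr):
        self.valid.discard(addr)                 # mark our copy 'Invalid'


main_memory = {"x": 200}
bus = Bus()
pa = SnoopingCache(bus, main_memory)
pb = SnoopingCache(bus, main_memory)

pa.read("x")
pb.read("x")                      # PB now holds a copy of x
pa.write("x", pa.read("x") + 1)   # PA's write invalidates PB's copy
pb_value = pb.read("x")           # miss, so PB re-reads main memory
print(pb_value, pb.misses)
```

This prints `201 2`: PB's second read is a miss (its copy was marked invalid), so it picks up the updated value instead of the stale 200.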

Invalidate Transaction: An address-only transaction. It simply communicates the address of the invalidated cache line to all the other processors.

Further:
  • When PA's hardware sees the memory read for x (by PB), it detects the modified copy in its own cache and emits a retry response, causing PB to suspend the read transaction.
  • PA now writes (flushes) the modified cache line to main memory.
  • PB later continues its suspended transaction and reads the correct value from main memory.
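
Here is a rough sketch of this retry-and-flush sequence, assuming a write-back cache where PA's update lives only in PA's cache until it is flushed. The names (`Bus`, `WriteBackCache`, `modified_owner`, the `retries` counter) are hypothetical, and the 'suspend' is modelled simply as the requester waiting for the owner's flush before it reads memory.

```python
class Bus:
    """Hypothetical shared bus connecting the caches to main memory."""

    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def invalidate(self, addr, sender):
        for cache in self.caches:
            if cache is not sender:
                cache.state[addr] = "invalid"

    def modified_owner(self, addr, requester):
        """Return the other cache holding a modified copy of addr, if any."""
        for cache in self.caches:
            if cache is not requester and cache.state.get(addr) == "modified":
                return cache
        return None


class WriteBackCache:
    def __init__(self, bus):
        self.bus = bus
        self.data = {}
        self.state = {}       # addr -> "valid" | "modified" | "invalid"
        self.retries = 0
        bus.caches.append(self)

    def read(self, addr):
        if self.state.get(addr, "invalid") != "invalid":
            return self.data[addr]                   # cache hit
        owner = self.bus.modified_owner(addr, requester=self)
        if owner is not None:
            self.retries += 1                        # retry response: read suspended
            owner.flush(addr)                        # owner writes the line back
        self.data[addr] = self.bus.memory[addr]      # resumed read from main memory
        self.state[addr] = "valid"
        return self.data[addr]

    def write(self, addr, value):
        self.bus.invalidate(addr, sender=self)
        self.data[addr] = value
        self.state[addr] = "modified"                # main memory is now out of date

    def flush(self, addr):
        self.bus.memory[addr] = self.data[addr]
        self.state[addr] = "valid"


memory = {"x": 200}
bus = Bus(memory)
pa = WriteBackCache(bus)
pb = WriteBackCache(bus)

pa.read("x")
pb.read("x")
pa.write("x", pa.read("x") + 1)    # PA holds the only up-to-date copy
memory_before = memory["x"]        # main memory still has the old value
pb_value = pb.read("x")            # triggers retry, PA's flush, then the read
print(memory_before, pb_value, memory["x"], pb.retries)
```

In this model main memory still holds 200 right after PA's write, and PB only gets 201 because PA flushes its modified line before PB's read completes.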

Hope this helps!!

Arun Manglick