









| Computer Architecture                                                                                                                                                                           |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| DRAM (Dynamic RAM):                                                                                                                                                                             |
| <ul> <li>It is used to implement the main memory.</li> </ul>                                                                                                                                    |
| A dynamic RAM (DRAM) is made with cells that store data as charge on capacitors. |
| <ul> <li>Because capacitors have a natural tendency to<br/>discharge, dynamic RAMs require periodic charge<br/>refreshing to maintain data storage (~ every 8 ms).</li> </ul> |
| <ul> <li>The stored charge leaks away, even with power<br/>continuously applied (the term <i>dynamic</i> refers to<br/>this tendency).</li> </ul> |
| Figure: DRAM cell (labels: address line, bit line (data), ground). |
| • During the refresh process, the memory is unavailable (latency). (-)                                                                                                                          |
| • Must be re-written after being read. Reading destroys the information (latency).                                                                                                              |
| • Difference between access time and cycle time. Cycle time > access time (-)                                                                                                                   |
| • Cheap and dense: one transistor/bit. More bits can be placed on one chip. (+)                                                                                                                 |
| Some improvements:                                                                                                                                                                              |
| <ul> <li>SDRAM (Synchronous DRAM): Added clock to DRAM interface. Burst mode.</li> </ul>                                                                                                        |
| • DDR SDRAM (Double Data Rate SDRAM): Data is transferred on both the rising edge and the falling edge of the clock. |
| http://akademi.itu.edu.tr/en/buzluca<br>http://www.buzluca.info                                                                                                                                 |



| Computer Architecture                                                                                                                                                                                              |  |  |  |  |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|--|--|
| Associative Memory / Content Addressable Memory (CAM):                                                                                                                                                             |  |  |  |  |
| <ul> <li>Random access memory (SRAM) + circuitry for search</li> </ul>                                                                                                                                             |  |  |  |  |
| • It consists of SRAM to store data and digital circuits for parallel search.                                                                                                                                      |  |  |  |  |
| <ul> <li>It is used in high speed searching applications.</li> </ul>                                                                                                                                               |  |  |  |  |
| Cache memory is one of the primary uses for associative memory.                                                                                                                                                    |  |  |  |  |
| <ul> <li>A Content Addressable Memory (CAM) is an SRAM-based memory whose rows can all be<br/>compared in parallel against a given search word; the result is<br/>the address of the matching data.</li> </ul> |  |  |  |  |
| • A word is retrieved based on a portion of its contents rather than its address.                                                                                                                                  |  |  |  |  |
| • The user supplies a data word (Argument A) (not the address), and the CAM searches its entire memory simultaneously (not sequentially) to see if that word is stored anywhere in it.                             |  |  |  |  |
| <ul> <li>The user also supplies a key value (K), a mask that selects which portion of the<br/>data word is compared during the search.</li> </ul> |  |  |  |  |
| • If the search data (or the required portion) is found in a row of the memory, then the match bit for that row is set, and the data stored in this row is output.                                                 |  |  |  |  |

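The masked search described above can be sketched in Python as follows. This is only a behavioral model: a real CAM compares all rows simultaneously in hardware, whereas the loop here visits the rows one at a time. The function name `cam_search`, the 8-bit word width, and the stored values are illustrative assumptions, not taken from the slides.

```python
def cam_search(memory, argument, key):
    """Return (match_bits, matches) for a masked CAM search.

    memory   -- list of stored words (ints), one per row
    argument -- the search word A
    key      -- the mask K; only bit positions where K has a 1 are compared
    """
    match_bits = []
    for word in memory:
        # A row matches when it agrees with A on every bit selected by K.
        match_bits.append(((word ^ argument) & key) == 0)
    # Report the row address and the stored word of every matching row.
    matches = [(row, word)
               for row, (word, hit) in enumerate(zip(memory, match_bits))
               if hit]
    return match_bits, matches


# Hypothetical 8-bit contents, chosen only for illustration.
memory = [0b10100011, 0b01100110, 0b10111100, 0b00001111]
A = 0b10110000   # argument: the data to search for
K = 0b11100000   # key: compare only the three most significant bits

match_bits, matches = cam_search(memory, A, K)
print(match_bits)  # [True, False, True, False]
print(matches)     # [(0, 163), (2, 188)]
```

With K set to all ones, the whole word is compared and at most the rows holding exactly A match.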






| Computer Architecture 6.5.1 Cache Memory Principles (cont'd)                                                                                                                                 |  |  |  |  |  |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|--|--|--|
| Block transfer:                                                                                                                                                                              |  |  |  |  |  |
| Upon a cache miss (the requested element is not found in cache), <b>a</b> <i>block</i> of elements that contains the requested element is brought (copied) from main memory to cache memory. |  |  |  |  |  |
| Reason for block (instead of a single word) transfer: spatial locality.                                                                                                                      |  |  |  |  |  |
| The next element to be requested will most likely be located near the currently requested element.                                                                                           |  |  |  |  |  |
| The disadvantage of transferring a block instead of a single element is that it takes longer.                                                                                                |  |  |  |  |  |
| To reduce the block transfer time between main and cache memories, the memory interleaving technique is used.                                                                                |  |  |  |  |  |
| Data are stored in different memory modules; that is, consecutive memory addresses are located in successive memory modules, so they can be accessed at the same time (similar to RAID, section 7.2). |  |  |  |  |  |
| Figure: Consecutive bytes B0 to B7 are stored in memory modules M0 to M7 of main memory; together they form one block of cache memory. |  |  |  |  |  |

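The placement of consecutive addresses in successive modules can be sketched as below. The sketch assumes 8 modules (M0 to M7, as in the figure) and the common low-order interleaving scheme, in which the module is selected by the address modulo the number of modules; the function name `locate` is hypothetical.

```python
NUM_MODULES = 8  # M0..M7, as in the figure (assumed)


def locate(address, num_modules=NUM_MODULES):
    """Map an address to (module, row) under low-order interleaving."""
    module = address % num_modules   # low-order address bits select the module
    row = address // num_modules     # remaining bits select the location inside it
    return module, row


# The 8 consecutive bytes B0..B7 of one block starting at address 0.
block = [locate(address) for address in range(8)]
print(block)

# Each byte lies in a different module, so all of them can be accessed together.
modules = [module for module, _ in block]
print(len(set(modules)) == NUM_MODULES)  # True
```

Without interleaving, all 8 bytes would sit in one module and would have to be read one after another.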


| Computer Architecture                                                                                                                                                                                                                                                                                                        |                                           |                                                                                   |                                                                                |  |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|-----------------------------------------------------------------------------------|--------------------------------------------------------------------------------|--|
| 6.5.2 Cache Memory Mapping Techniques <ul> <li>Is a given content of main memory currently present in cache memory?</li> <li>If present, where in cache memory is the data located?</li> </ul> |  |  |  |  |
| <ol> <li>Full Associative Mapping         <ul> <li>a) Without blocks:</li> <li>In practice, all mapping techniques transfer data in blocks.</li> <li>For the sake of simplicity, the technique is first explained without blocks.</li> </ul> </li> </ol> |  |  |  |  |
| <ul> <li>Method: The most frequently referenced addresses and their data are kept in an associative memory.</li> <li>The address generated by the CPU is searched for in the cache (content addressable memory).</li> <li>If there is a hit, data is read from cache.</li> <li>If a miss occurs, data is read from main memory and also copied into cache.</li> </ul> Figure: Cache memory as an associative memory. Each row holds a valid bit (V), an address, and its data (A000: 02, A001: 3A, 00C0: 54, 0400: A1); the outputs are Hit and Data. |  |  |  |  |
| Without blocks, the technique benefits only from temporal locality, not from spatial locality.<br>Therefore, in practice, data blocks are moved between main and cache memories. |  |  |  |  |
| 2013 - 2020 Feza BUZLUCA |  |  |  |  |

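A minimal sketch of the word-by-word (no blocks) associative cache described above. The class name `AssociativeCache`, the capacity of 4 entries, and the least-recently-used replacement policy are assumptions made for illustration; the slide does not specify a replacement policy here. An entry being present in the dictionary plays the role of the valid bit.

```python
from collections import OrderedDict


class AssociativeCache:
    """Word-granularity fully associative cache (no blocks)."""

    def __init__(self, capacity, main_memory):
        self.capacity = capacity
        self.main_memory = main_memory   # dict: address -> data
        self.entries = OrderedDict()     # address -> data, least recently used first

    def read(self, address):
        """Return (data, hit) for a read of `address`."""
        if address in self.entries:               # associative search: hit
            self.entries.move_to_end(address)     # mark as most recently used
            return self.entries[address], True
        data = self.main_memory[address]          # miss: read main memory
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)      # evict LRU entry (assumed policy)
        self.entries[address] = data              # also copy into the cache
        return data, False


# Addresses and data taken from the figure.
main_memory = {0xA000: 0x02, 0xA001: 0x3A, 0x00C0: 0x54, 0x0400: 0xA1}
cache = AssociativeCache(capacity=4, main_memory=main_memory)

print(cache.read(0xA000))  # (2, False)   first access: miss
print(cache.read(0xA000))  # (2, True)    repeated access: hit (temporal locality)
print(cache.read(0xA001))  # (58, False)  neighbouring address: still a miss
```

The last line shows why this variant gains nothing from spatial locality: the neighbour of a cached word is not brought in with it.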


| Computer Architecture                                                                                                                                                                                                                                                                |                                          |                              |                       |                      |                        |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------|------------------------------|-----------------------|----------------------|------------------------|
| Example: Full Associative Mapping                                                                                                                                                                                                                                                    |                                          |                              |                       |                      |                        |
| Main Memory: 256K words. Address: a = 18 bits |  |  |  |  |  |
| Block size: 16 words. w = 4 bits. Main memory contains 2<sup>14</sup> blocks. b = 14 |  |  |  |  |  |
| Cache Memory: 2K words of data can be stored. Cache memory contains 2<sup>7</sup> = 128 frames. f = 7 |  |  |  |  |  |
| Main memory address (18 bits): Block number (14 bits), Word number (4 bits) |  |  |  |  |  |
| Figure: Cache memory (SRAM) consists of a tag memory (associative) with 128 entries, each holding a valid bit (V) and a 14-bit tag that is compared by associative search, and a data memory with Frames 0 to 127 (one frame: 16 words). Main memory (DRAM) holds Blocks 0 to 16383 (2<sup>14</sup> blocks; one block: 16 words). |  |  |  |  |  |

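The field widths of this example can be recomputed from the memory sizes, as in the sketch below. The constant and function names are hypothetical; the sizes are the ones given in the example.

```python
MAIN_MEMORY_WORDS = 256 * 1024   # 256K words
CACHE_WORDS = 2 * 1024           # 2K words
BLOCK_WORDS = 16                 # 16 words per block

a = (MAIN_MEMORY_WORDS - 1).bit_length()            # address bits
w = (BLOCK_WORDS - 1).bit_length()                  # word-number bits
b = a - w                                           # block-number (tag) bits
f = (CACHE_WORDS // BLOCK_WORDS - 1).bit_length()   # bits needed to number the frames


def split_address(address):
    """Split an address into (block number, word number)."""
    block_number = address >> w
    word_number = address & (BLOCK_WORDS - 1)
    return block_number, word_number


print(a, w, b, f)               # 18 4 14 7
print(split_address(0x0000B))   # (0, 11)
print(split_address(0x00013))   # (1, 3)
print(split_address(0x3FFFF))   # (16383, 15)
```

In full associative mapping the 14-bit block number is stored as the tag, because a block may be placed in any of the 128 frames.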


| Computer Architecture                                                                                                                                                                                                                                                                                                                                                               |                    |                             |         |                                                                                         |  |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------|-----------------------------|---------|-----------------------------------------------------------------------------------------|--|
| <b>Example:</b> Array in a cache memory system<br>Main Memory: 256K words, Cache Memory: 2K words, Block size: 16 words |  |  |  |  |  |
| <b>Case A)</b> A program in this system accesses an array with the starting address \$00002 and a size of 10 words. The array is not in cache memory.<br>Assume: When the CPU starts accessing the array, the least recently used frame is Frame #1. |  |  |  |  |  |
| Starting address of the array: 00 0000 0000 0000 0010 (\$00002)<br>End address of the array: 00 0000 0000 0000 1011 (\$0000B) |  |  |  |  |  |
| <ul> <li>When the CPU accesses the first word of the array (\$00002), a miss occurs.</li> <li>Although the array has 10 words, the cache management system transfers Block 0 in its entirety (16 words) to Frame 1.</li> <li>When the CPU accesses the next 9 elements of the array, hits will occur.</li> <li>In total: 1 miss, 9 hits.</li> </ul> |  |  |  |  |  |
| Figure: The array (\$00002 to \$0000B, 10 words) lies within Block 0 of main memory. All 16 words of Block 0 are transferred to Frame 1 of cache memory. |  |  |  |  |  |

| Computer Architecture                                                                                                                                                                                   | License: https://creativecommons.org/licenses/by-nc-nd/4.0/                                                                                                                                                                                      |  |  |  |
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|--|
| Example: Array in a cache memory system (cont'd) |  |  |  |  |
| Main Memory: 256K words, Cache Memory: 2K words, Block size: 16 words<br><b>Case B)</b> A program in this system accesses an array with the starting address \$0000A and a size of 10 words. The array is not in cache memory.<br>Assume that the least recently used frames are Frames #0 and #2. |  |  |  |  |
| Starting address of the array: 00 0000 0000 0000 1010 (\$0000A)<br>End address of the array: 00 0000 0000 0001 0011 (\$00013) |  |  |  |  |
| Figure: The array spans two blocks of main memory: 6 words (\$0000A to \$0000F) lie in Block 0 and 4 words (\$00010 to \$00013) lie in Block 1. Both blocks are transferred to cache memory. |  |  |  |  |
| Although the size of the array is smaller than the block (frame) size, it occupies two frames in cache memory.                                                                                          |                                                                                                                                                                                                                                                  |  |  |  |

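Both cases can be checked with a short sketch that walks through the array word by word and counts block transfers. It assumes that none of the touched blocks is in the cache beforehand and that no block is evicted during the scan; the function name is hypothetical. The slides state the result only for Case A; the Case B figure printed here follows from the same reasoning.

```python
BLOCK_WORDS = 16  # block (frame) size in words


def count_misses_and_hits(start, length, block_words=BLOCK_WORDS):
    """Access `length` consecutive words from `start`; return (misses, hits)."""
    cached_blocks = set()
    misses = hits = 0
    for address in range(start, start + length):
        block = address // block_words
        if block in cached_blocks:
            hits += 1
        else:
            misses += 1                 # the whole block is transferred on a miss
            cached_blocks.add(block)
    return misses, hits


print(count_misses_and_hits(0x00002, 10))  # Case A: (1, 9)
print(count_misses_and_hits(0x0000A, 10))  # Case B: (2, 8)
```

The only difference between the two cases is the alignment of the array with respect to the block boundary at \$00010.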


| Computer Architecture                                                                             |                                                                                                                                     |                            |                 |  |  |
|---------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|----------------------------|-----------------|--|--|
| 2. Direct Mapping                                                                                 |                                                                                                                                     |                            |                 |  |  |
| An incoming main memory block is always placed into a specific, fixed cache frame location.       |                                                                                                                                     |                            |                 |  |  |
| It is not necessary to search for the location of a block in the cache because it is predetermined and fixed. |  |  |  |  |  |
| Therefore, associative mem                                                                        | ory is not necessary.                                                                                                               |                            |                 |  |  |
| As the size of main memory is larger than the size of the cache, several blocks of main memory map to the same frame. |  |  |  |  |  |
| It is necessary to determine which main memory block is currently residing in a frame.            |                                                                                                                                     |                            |                 |  |  |
| The cache memory control unit divides the address from the CPU (a bits) into three fields: <ul> <li>Tag (a-(f+w) bits): It indicates which of the blocks that can be placed in this frame is currently in cache.</li> <li>Cache frame number (f bits): This field determines the frame that will hold this data. The blocks that have this field in common (same) try to reside in the same frame. Only one of them can reside there at a time.</li> <li>Word number (w bits)</li> </ul> |  |  |  |  |  |

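The three-field split and the single tag comparison can be sketched as follows, using the parameters of the example system in this section (a = 18, f = 7, w = 4). The class and function names are hypothetical, the valid bit per frame is an assumption carried over from the associative example, and the data words themselves are left out.

```python
A_BITS, F_BITS, W_BITS = 18, 7, 4          # a, f, w of the example system
TAG_BITS = A_BITS - (F_BITS + W_BITS)      # a-(f+w) = 7


def split_address(address):
    """Split an address into (tag, frame number, word number)."""
    word = address & ((1 << W_BITS) - 1)
    frame = (address >> W_BITS) & ((1 << F_BITS) - 1)
    tag = address >> (W_BITS + F_BITS)
    return tag, frame, word


class DirectMappedCache:
    def __init__(self):
        # One (valid, tag) pair per frame; data words are omitted in this sketch.
        self.frames = [(False, 0)] * (1 << F_BITS)

    def access(self, address):
        """Return True on a hit; on a miss, load the block and return False."""
        tag, frame, _ = split_address(address)
        valid, stored_tag = self.frames[frame]
        if valid and stored_tag == tag:      # one comparison, no search
            return True
        self.frames[frame] = (True, tag)     # replace whatever was in the frame
        return False


cache = DirectMappedCache()
x = 0b0000000_0000000_0101   # tag 0, frame 0, word 5
y = 0b0000001_0000000_0101   # tag 1, frame 0, word 5: competes with x for Frame 0

print(split_address(x), split_address(y))              # (0, 0, 5) (1, 0, 5)
print([cache.access(addr) for addr in (x, x, y, x)])   # [False, True, False, False]
```

The last access misses even though x was loaded earlier, because y replaced it in Frame 0 in the meantime.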


| Computer Architecture                                                                                                                              |                                                                                                                                                                                                                     |                         |                      |        |  |
|----------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------|----------------------|--------|--|
| Example (Direct Mapping):                                                                                                                          |                                                                                                                                                                                                                     |                         |                      |        |  |
| Main Memory: 256K words. Address: 18 bits<br>Block size: 16 words. w = 4 bits. Main memory contains 2<sup>14</sup> blocks. b = 14<br>Cache Memory: 2K words data capacity. Cache memory contains 2<sup>7</sup> = 128 frames. f = 7 |  |  |  |  |  |
| Main memory address (18 bits): Tag (7 bits), Frame number (7 bits), Word number (4 bits) |  |  |  |  |  |
| In this system, the data in the following two addresses try to reside in the same cache frame.                                                     |                                                                                                                                                                                                                     |                         |                      |        |  |
| Tag Frame num. Word num.<br>0000000 0000000 XXXX                                                                                                   | <ul> <li>The "Frame number" fields of both addresses are<br/>the same: 0000000. They will be placed in Frame 0.</li> <li>At a specific point in time, only one of these data<br/>can be in cache memory.</li> </ul> |                         |                      |        |  |
| 0000001 0000000 XXXX                                                                                                                               |                                                                                                                                                                                                                     |                         |                      |        |  |
| To determine which data is currently in Frame 0 of cache memory, the tag value of the address is compared to the tag value stored in cache memory. |                                                                                                                                                                                                                     |                         |                      |        |  |
|                                                                                                                                                    |                                                                                                                                                                                                                     |                         |                      |        |  |
| http://akademi.itu.edu.tr/en/buzluca<br>http://www.buzluca.info | | CC BY-NC-ND | 2013 - 2020 Feza BUZLUCA 6.23 | | |

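The direct-mapping example above can be checked with a few lines of Python. This is a minimal sketch, assuming the 18-bit address layout from the example (7-bit tag, 7-bit frame number, 4-bit word number). The function name `split_direct` and the two sample addresses are illustrative and not part of the original slides.

```python
TAG_BITS, FRAME_BITS, WORD_BITS = 7, 7, 4  # 18-bit address from the example

def split_direct(address):
    """Split an 18-bit address into (tag, frame, word) fields."""
    word = address & ((1 << WORD_BITS) - 1)
    frame = (address >> WORD_BITS) & ((1 << FRAME_BITS) - 1)
    tag = address >> (WORD_BITS + FRAME_BITS)
    return tag, frame, word

# Two addresses that differ only in the tag field (hypothetical sample values).
addr_a = 0b0000000_0000000_0101
addr_b = 0b0000001_0000000_0101

tag_a, frame_a, word_a = split_direct(addr_a)
tag_b, frame_b, word_b = split_direct(addr_b)

# Same frame number, different tags: the two blocks compete for Frame 0.
conflict = frame_a == frame_b and tag_a != tag_b
print(split_direct(addr_a), split_direct(addr_b), conflict)
```

Comparing the stored tag with the tag field of the requested address is what tells the controller which of the two blocks currently occupies Frame 0.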




| Computer Architecture                                                                                                                                                                                                                                                                                                                    |                                                                                                  |  |  |  |  |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------|--|--|--|--|
| 3. Set Associative Mapping                                                                                                                                                                                                                                                                                                               |                                                                                                  |  |  |  |  |
| This method is a compromise between direct mapping and full associative mapping. | | | | | |
| The cache is divided into a number of <u>sets</u>, and each set consists of a number of frames. | | | | | |
| A given main memory block maps to a specific, fixed cache set (Snum), based on<br>the equation Snum = B mod S, where S is the number of sets in the cache, B is the<br>main memory block number, and Snum is the specific cache set to which block B<br>maps.<br>However, an incoming block maps to any frame in the assigned cache set. |                                                                                                  |  |  |  |  |
| <ul> <li>Direct mapping is used to determine the set (fixed).</li> <li>Associative mapping is used to determine the frame in the set (flexible).</li> </ul>                                                                                                                                                                              |                                                                                                  |  |  |  |  |
| The cache controller divides the address issued by the CPU into three fields:                                                                                                                                                                                                                                                            |                                                                                                  |  |  |  |  |
| Address: a bits | | | | | |
| Tag | Cache set number | Word number | | | |
| a-(s+w) bits | s bits | w bits | | | |
| The tag is used to search for the block within the determined set (associative). | The cache set number identifies the specific cache set that should hold the targeted block. | | | | |


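The field widths a-(s+w), s, and w follow directly from the memory and cache sizes. The sketch below derives them, assuming that all sizes are powers of two. The function names `field_widths` and `set_number` are hypothetical, and the sample sizes are the ones used in the examples of this section.

```python
def field_widths(mem_words, block_words, cache_words, frames_per_set):
    """Return (tag_bits, set_bits, word_bits) for a set associative cache.

    Assumes every argument is a power of two.
    """
    a = mem_words.bit_length() - 1        # address bits
    w = block_words.bit_length() - 1      # word-number bits
    frames = cache_words // block_words   # number of cache frames
    sets = frames // frames_per_set       # S, the number of sets
    s = sets.bit_length() - 1             # set-number bits
    return a - (s + w), s, w

def set_number(block_number, sets):
    """Snum = B mod S."""
    return block_number % sets

print(field_widths(256 * 1024, 16, 2 * 1024, 2))
print(set_number(130, 64))
```

With `frames_per_set=1` the same function yields the direct-mapping split (7, 7, 4), which illustrates why set associative mapping is described as a compromise between the two other methods.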


| Computer Architecture                                                                                                                                                                                                                                                                                                                                                                                                              |                                           |           |               |                   |  |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------|-----------|---------------|-------------------|--|
| Example (Set Associative):<br>Main memory: 256K words (address: a = 18 bits)<br>Block size: 16 words (w = 4 bits)<br>Main memory contains 2<sup>14</sup> blocks (b = 14).<br>Cache memory: 2K words of data capacity<br>Cache memory contains 2<sup>7</sup> = 128 frames.<br>Given: Each set contains 2 frames.<br>Hence, the cache memory contains 64 sets (s = 6). | | | | | |
| Main memory address (18 bits): | Tag | Set number | Word number | | |
| | 8 bits | 6 bits | 4 bits | | |
| Cache memory (each frame has a V bit and an 8-bit tag in tag memory): | | | | | |
| Set 0 | Frame 0, Frame 1 | | | | |
| Set 1 | Frame 2, Frame 3 | | | | |
| ⋮ | ⋮ | | | | |
| Set 63 | Frame 126, Frame 127 | | | | |
| Main memory: Block 0, Block 1, ..., Block 64, ..., Block 128, ..., Block 16383 (Blocks 0, 64, and 128 all map to Set 0, since B mod 64 = 0.) | | | | | |

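To illustrate the benefit of having two frames per set, the sketch below replays a short block trace on the 2-way set associative cache of the example and on a direct-mapped cache of the same capacity. It is a simplified model under stated assumptions: only block numbers are tracked (no data), replacement within a set is LRU, and the class name `SetAssociativeCache` and the trace are illustrative.

```python
class SetAssociativeCache:
    """Tracks which block numbers are resident; data is not modeled."""

    def __init__(self, num_sets, frames_per_set):
        self.num_sets = num_sets
        self.frames_per_set = frames_per_set
        # One list of resident block numbers per set, least recently used first.
        self.sets = [[] for _ in range(num_sets)]

    def access(self, block_number):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        frames = self.sets[block_number % self.num_sets]  # Snum = B mod S
        if block_number in frames:        # associative search within the set
            frames.remove(block_number)
            frames.append(block_number)   # mark as most recently used
            return True
        if len(frames) == self.frames_per_set:
            frames.pop(0)                 # evict the least recently used block
        frames.append(block_number)
        return False

trace = [0, 128, 0, 128]                  # both blocks map to set 0 / frame 0
two_way = SetAssociativeCache(num_sets=64, frames_per_set=2)
direct = SetAssociativeCache(num_sets=128, frames_per_set=1)
two_way_hits = [two_way.access(b) for b in trace]
direct_hits = [direct.access(b) for b in trace]
print(two_way_hits)
print(direct_hits)
```

In this trace the direct-mapped cache misses on every access because blocks 0 and 128 keep replacing each other, while the 2-way cache holds both blocks in Set 0 after the first two misses.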


| Computer Architecture                                                                                         |  |  |  |  |
|---------------------------------------------------------------------------------------------------------------|--|--|--|--|
| 6.5.3 Cache Memory - Main Memory Interactions                                                                 |  |  |  |  |
| → Read (Hit): Data is read from cache memory.                                                                 |  |  |  |  |
| → Read (Miss):                                                                                                |  |  |  |  |
| a) Read-Through (RT):                                                                                         |  |  |  |  |
| While the data (block) is being brought from main memory to cache, it is also read by the CPU simultaneously. |  |  |  |  |
| Cache memory and main memory are accessed in parallel.                                                        |  |  |  |  |
| b) No Read-Through (NRT):                                                                                     |  |  |  |  |
| Data are first brought from main memory to cache memory, and then the CPU reads data from the cache.          |  |  |  |  |
| → Write (Hit): |  |  |  |  |
| a) Write-Through (WT):                                                                                        |  |  |  |  |
| In each write operation, data is written to cache and also to main memory.                                    |  |  |  |  |
| Disadvantage: It increases the access time.                                                                   |  |  |  |  |
| Advantage: It provides coherence between the cache frames and their counterparts in main memory.              |  |  |  |  |

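The write-through behavior on a write hit can be sketched as follows. This is a simplified word-level model, not a definitive implementation: the cache has no capacity limit, blocks are not modeled, and a write miss is assumed to update main memory only. The class name `WriteThroughCache` and the counter `memory_writes` are hypothetical.

```python
class WriteThroughCache:
    def __init__(self, memory):
        self.memory = memory          # backing store: list of words
        self.cache = {}               # address -> cached word (no capacity limit)
        self.memory_writes = 0        # counts main memory write accesses

    def read(self, address):
        if address not in self.cache:             # read miss: load from memory
            self.cache[address] = self.memory[address]
        return self.cache[address]

    def write(self, address, value):
        if address in self.cache:                 # write hit
            self.cache[address] = value           # update the cache copy...
        self.memory[address] = value              # ...and always main memory
        self.memory_writes += 1

memory = [0] * 8
cache = WriteThroughCache(memory)
cache.read(3)                         # bring address 3 into the cache
for value in (10, 20, 30):
    cache.write(3, value)             # three write hits

# Every write reached main memory, so both copies hold the same value.
print(cache.memory_writes, memory[3], cache.cache[3])
```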
| Computer Architecture                      | License: https://creativecommons.org/licenses/by-nc-nd/4.0/                                                          |
|--------------------------------------------|----------------------------------------------------------------------------------------------------------------------|
| → Write (Hit) (cont'd): | |
| b) Write-Back (WB): | |
| Writes are done only to the cache. | |
| The cache frame is written back to main memory only when a replacement is needed. | |
| There are two types of write-back policies: simple write-back and flagged write-back. | |
| • Simple Write-Back (SWB): | |
| The replaced frame is always written back to main memory. | |
| It is not checked whether the frame was changed or not. | |
| • Flagged Write-Back (FWB): | |
| Each frame is assigned a bit, called the <i>dirty bit</i>, to indicate that at least one write has been made to the block while residing in cache. | |
| At replacement time, the dirty bit is checked: if it is set, then the block is written back to main memory; otherwise, it is simply overwritten by the incoming block. | |
| The dirty bit is stored in the tag memory of the cache. | |

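The difference between simple and flagged write-back can be made concrete by counting write-backs for a single cache frame. The sketch below is a minimal model under stated assumptions: main memory is a dictionary from block number to data, a block is treated as a single value, and the names `WriteBackFrame`, `flagged`, and `run` are illustrative.

```python
class WriteBackFrame:
    """One cache frame under write-back; `flagged` selects FWB over SWB."""

    def __init__(self, memory, flagged):
        self.memory = memory          # backing store: block number -> data
        self.flagged = flagged
        self.block = None             # block number currently held (None = empty)
        self.data = None
        self.dirty = False
        self.write_backs = 0

    def load(self, block):
        """Replace the current content with `block`, writing back if required."""
        if self.block is not None and (self.dirty or not self.flagged):
            self.memory[self.block] = self.data   # copy the frame to main memory
            self.write_backs += 1
        self.block, self.data, self.dirty = block, self.memory[block], False

    def write(self, value):
        """Write hit: only the cache copy changes; main memory becomes stale."""
        self.data = value
        self.dirty = True

def run(flagged):
    memory = {0: "a", 1: "b", 2: "c"}
    frame = WriteBackFrame(memory, flagged)
    frame.load(0)                     # frame was empty: nothing to write back
    frame.load(1)                     # block 0 was never modified
    frame.write("B")                  # block 1 becomes dirty
    frame.load(2)                     # block 1 must be written back
    return frame.write_backs, memory[1]

print(run(flagged=False))             # simple write-back
print(run(flagged=True))              # flagged write-back
```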
| <ul> <li>→ Write (Miss):</li> <li>a) Write Allocate (WA):<br/>The main memory block is updated and brought to cache.</li> <li>a) No Write Allocate (NWA):<br/>The missed main memory block is updated in main memory and not brought to cache.<br/>If an attempt is made to read this block later, a miss will occur, and data will be brought to cache.</li> <li>The write-through (WT) policy can be used together with write-allocate (WA) or no-write-allocate (NWA) methods. WTWA, WTNWA</li> <li>In write-back (WB) policy, to maintain coherence between cache and main memory at the beginning, the write-allocate (WA) method is used (WBWA).</li> <li>Information held in Tag memory:</li> <li>If LRU is used aging counters,</li> <li>If flagged Write-Back (FWB) method is used "dirty" bit (D).<br/>A single line of a tag memory:</li> <li>V D Counter Tag</li> </ul> | Computer Architecture                                  |  |  |  |  |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------|--|--|--|--|
| <ul> <li>The main memory block is updated and brought to cache.</li> <li>a) No Write Allocate (NWA): The missed main memory block is updated in main memory and not brought to cache. If an attempt is made to read this block later, a miss will occur, and data will be brought to cache. The write-through (WT) policy can be used together with write-allocate (WA) or no-write-allocate (NWA) methods. WTWA, WTNWA In write-back (WB) policy, to maintain coherence between cache and main memory at the beginning, the write-allocate (WA) method is used (WBWA). Information held in Tag memory: In addition to Valid (V) and tag bits, depending on the method used, the following data must be also kept in tag memory: If LRU is used aging counters, If flagged Write-Back (FWB) method is used "dirty" bit (D).</li></ul>                                               | $\rightarrow$ Write (Miss):                            |  |  |  |  |
| <ul> <li>a) No Write Allocate (NWA):<br/>The missed main memory block is updated in main memory and not brought to cache.<br/>If an attempt is made to read this block later, a miss will occur, and data will be brought to cache.</li> <li>The write-through (WT) policy can be used together with write-allocate (WA) or no-write-allocate (NWA) methods. WTWA, WTNWA</li> <li>In write-back (WB) policy, to maintain coherence between cache and main memory at the beginning, the write-allocate (WA) method is used (WBWA).</li> <li>Information held in Tag memory:</li> <li>In addition to Valid (V) and tag bits, depending on the method used, the following data must be also kept in tag memory:</li> <li>If LRU is used aging counters,</li> <li>If flagged Write-Back (FWB ) method is used "dirty" bit (D).</li> </ul>                                               | a) Write Allocate (WA):                                |  |  |  |  |
| <ul> <li>The missed main memory block is updated in main memory and not brought to cache.</li> <li>If an attempt is made to read this block later, a miss will occur, and data will be brought to cache.</li> <li>The write-through (WT) policy can be used together with write-allocate (WA) or no-write-allocate (NWA) methods. WTWA, WTNWA</li> <li>In write-back (WB) policy, to maintain coherence between cache and main memory at the beginning, the write-allocate (WA) method is used (WBWA).</li> <li>Information held in Tag memory:</li> <li>In addition to Valid (V) and tag bits, depending on the method used, the following data must be also kept in tag memory:</li> <li>If LRU is used aging counters,</li> <li>If flagged Write-Back (FWB ) method is used "dirty" bit (D).</li> </ul>                                                                          | The main memory block is updated and brought to cache. |  |  |  |  |
| <ul> <li>cache.</li> <li>If an attempt is made to read this block later, a miss will occur, and data will be brought to cache.</li> <li>The write-through (WT) policy can be used together with write-allocate (WA) or no-write-allocate (NWA) methods. WTWA, WTNWA</li> <li>In write-back (WB) policy, to maintain coherence between cache and main memory at the beginning, the write-allocate (WA) method is used (WBWA).</li> <li>Information held in Tag memory:</li> <li>In addition to Valid (V) and tag bits, depending on the method used, the following data must be also kept in tag memory:</li> <li>If LRU is used aging counters,</li> <li>If flagged Write-Back (FWB) method is used "dirty" bit (D).</li> </ul>                                                                                                                                                     | a) No Write Allocate (NWA):                            |  |  |  |  |
| <ul> <li>be brought to cache.</li> <li>The write-through (WT) policy can be used together with write-allocate (WA) or no-write-allocate (NWA) methods. WTWA, WTNWA</li> <li>In write-back (WB) policy, to maintain coherence between cache and main memory at the beginning, the write-allocate (WA) method is used (WBWA).</li> <li>Information held in Tag memory:</li> <li>In addition to Valid (V) and tag bits, depending on the method used, the following data must be also kept in tag memory:</li> <li>If LRU is used aging counters,</li> <li>If flagged Write-Back (FWB) method is used "dirty" bit (D).</li> </ul>                                                                                                                                                                                                                                                      |                                                        |  |  |  |  |
| <ul> <li>no-write-allocate (NWA) methods. WTWA, WTNWA</li> <li>In write-back (WB) policy, to maintain coherence between cache and main memory at the beginning, the write-allocate (WA) method is used (WBWA).</li> <li>Information held in Tag memory:</li> <li>In addition to Valid (V) and tag bits, depending on the method used, the following data must be also kept in tag memory:</li> <li>If LRU is used aging counters,</li> <li>If flagged Write-Back (FWB) method is used "dirty" bit (D).</li> </ul>                                                                                                                                                                                                                                                                                                                                                                   |                                                        |  |  |  |  |
| <ul> <li>at the beginning, the write-allocate (WA) method is used (WBWA).</li> <li>Information held in Tag memory:</li> <li>In addition to Valid (V) and tag bits, depending on the method used, the following data must be also kept in tag memory:</li> <li>If LRU is used aging counters,</li> <li>If flagged Write-Back (FWB) method is used "dirty" bit (D).</li> </ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |                                                        |  |  |  |  |
| <ul> <li>In addition to Valid (V) and tag bits, depending on the method used, the following data must be also kept in tag memory:</li> <li>If LRU is used aging counters,</li> <li>If flagged Write-Back (FWB) method is used "dirty" bit (D).</li> </ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |                                                        |  |  |  |  |
| <ul> <li>data must be also kept in tag memory:</li> <li>If LRU is used aging counters,</li> <li>If flagged Write-Back (FWB ) method is used "dirty" bit (D).</li> </ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | Information held in Tag memory:                        |  |  |  |  |
| <ul> <li>If flagged Write-Back (FWB) method is used "dirty" bit (D).</li> </ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | data must be also kept in tag memory:                  |  |  |  |  |
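
The extra tag-memory fields listed above can be pictured with a small sketch. It is only an illustration: the `TagEntry` layout, the `touch` and `victim` helpers, and the specific counter update rule (reset the accessed line to 0, increment every line that was younger) are assumptions, not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class TagEntry:
    """One tag-memory entry of a set-associative cache (hypothetical layout)."""
    tag: int = 0
    dirty: bool = False  # "D" bit: only needed with flagged write-back (FWB)
    age: int = 0         # aging counter: only needed with LRU replacement


def make_set(ways: int) -> list[TagEntry]:
    # Start with distinct ages so exactly one entry is the oldest.
    return [TagEntry(age=i) for i in range(ways)]


def touch(cache_set: list[TagEntry], way: int) -> None:
    """Record an access to `way` (assumed aging-counter rule)."""
    old = cache_set[way].age
    for entry in cache_set:
        if entry.age < old:
            entry.age += 1      # lines younger than the accessed one get older
    cache_set[way].age = 0      # the accessed line becomes the youngest


def victim(cache_set: list[TagEntry]) -> int:
    """Index of the least recently used line (largest aging counter)."""
    return max(range(len(cache_set)), key=lambda i: cache_set[i].age)


def needs_write_back(entry: TagEntry) -> bool:
    """With FWB, a replaced block is copied to main memory only if it is dirty."""
    return entry.dirty
```

With this rule the counters of an n-way set always remain a permutation of 0..n-1, so each counter needs only log2(n) bits.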



| Computer Architecture                                                       |                                             |                                       |                                         |                                                              |  |  |
|-----------------------------------------------------------------------------|---------------------------------------------|---------------------------------------|-----------------------------------------|--------------------------------------------------------------|--|--|
| 6.5.5 Access Time: |  |  |  |  |  |  |
| $t_a$: Average memory access time |  |  |  |  |  |  |
| $w$: Write ratio (number of write accesses / total number of accesses) |  |  |  |  |  |  |
| $h$: Hit ratio |  |  |  |  |  |  |
| $t_{cache}$: Cache memory access time |  |  |  |  |  |  |
| $t_{main}$: Main memory access time |  |  |  |  |  |  |
| $t_{trans}$: Time to transfer a block between main memory and cache |  |  |  |  |  |  |
| $w_d$: Probability that a block in the cache has been updated |  |  |  |  |  |  |
|  | WT, RT/LT (Write-through, parallel read/write) |  | WB, WA, NRT/NLT (Write-back, serial read/write) |  |  |  |
| Case (probability) | NWA: access time | WA: access time | SWB: access time | FWB: access time |  |  |
| Read hit: $(1-w)h$ | $t_{cache}$ | $t_{cache}$ | $t_{cache}$ | $t_{cache}$ |  |  |
| Read miss: $(1-w)(1-h)$ | $t_{trans}$ | $t_{trans}$ | $2t_{trans} + t_{cache}$ | $w_d(2t_{trans} + t_{cache}) + (1-w_d)(t_{trans} + t_{cache})$ |  |  |
| Write hit: $wh$ | $t_{main}$ | $t_{main}$ | $t_{cache}$ | $t_{cache}$ |  |  |
| Write miss: $w(1-h)$ | $t_{main}$ | $t_{main} + t_{trans}$ | $2t_{trans} + t_{cache}$ | $w_d(2t_{trans} + t_{cache}) + (1-w_d)(t_{trans} + t_{cache})$ |  |  |
| $t_a$ = sum over the four cases of (probability × access time) in the column of the method used. |  |  |  |  |  |  |
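
The access-time table can be evaluated numerically with a short sketch. The function name `average_access_time` and the policy strings are hypothetical; the per-case times are the ones from the table (NWA, WA, SWB, FWB columns), and the average is the probability-weighted sum of the four cases.

```python
def average_access_time(policy, w, h, t_cache, t_main, t_trans, w_d=0.0):
    """Average memory access time t_a for one column of the table.

    policy: "NWA" or "WA" (write-through, parallel read/write),
            "SWB" or "FWB" (write-back, serial read/write).
    w: write ratio, h: hit ratio, w_d: probability that a block is updated.
    """
    if policy in ("NWA", "WA"):
        read_hit = t_cache
        read_miss = t_trans
        write_hit = t_main
        write_miss = t_main if policy == "NWA" else t_main + t_trans
    elif policy in ("SWB", "FWB"):
        if policy == "SWB":
            # The replaced block is always written back before the new one is loaded.
            miss = 2 * t_trans + t_cache
        else:
            # The replaced block is written back only if it was updated (dirty).
            miss = (w_d * (2 * t_trans + t_cache)
                    + (1 - w_d) * (t_trans + t_cache))
        read_hit = write_hit = t_cache
        read_miss = write_miss = miss
    else:
        raise ValueError(f"unknown policy: {policy!r}")

    # Weight each case by its probability and add them up.
    return ((1 - w) * h * read_hit
            + (1 - w) * (1 - h) * read_miss
            + w * h * write_hit
            + w * (1 - h) * write_miss)
```

For example, `average_access_time("FWB", w=0.2, h=0.9, t_cache=10, t_main=100, t_trans=200, w_d=0.5)` uses illustrative values, not figures from the slides. Setting `w_d=1.0` makes the FWB result equal to the SWB result, since every replaced block is then written back.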



| Computer Architecture                                                                                                               |  |  |  |  |
|-------------------------------------------------------------------------------------------------------------------------------------|--|--|--|--|
| Example processors with cache memories: |  |  |  |  |
| • Intel386™: Cache memory is outside the CPU chip (external SRAM). |  |  |  |  |
| • Intel486™ (1989)                                                                                                                  |  |  |  |  |
| 8-KByte on-chip (L1)                                                                                                                |  |  |  |  |
| • Intel® Pentium® (1993) |  |  |  |  |
| L1 on-chip: 8 KB instruction, 8 KB data cache (Harvard architecture) |  |  |  |  |
| • Intel P6 Family: (1995-1999)                                                                                                      |  |  |  |  |
| - Intel Pentium Pro: |  |  |  |  |
| L1 on-chip: 8 KB instruction, 8 KB data cache (Harvard architecture) |  |  |  |  |
| First processor with the L2 cache in the same package as the CPU. |  |  |  |  |
| L2: 256 KB. Different interconnections between L1, L2 and the CPU. |  |  |  |  |
| - Intel Pentium II:                                                                                                                 |  |  |  |  |
| L1 on-chip: 16 KB instruction, 16 KB data cache (Harvard architecture)<br>L2 on-chip: 256 KB, 512 KB, 1 MB |  |  |  |  |
| • Intel® Pentium® M (2003)                                                                                                          |  |  |  |  |
| L1 on-chip: 32 KB instruction, 32 KB data cache                                                                                     |  |  |  |  |
| L2 on-chip: up to 2 MByte                                                                                                           |  |  |  |  |
| • Intel® Core™ i9-9900 (2019) |  |  |  |  |
| Multicore: 8 cores. Private caches per core (totals: L1 512 KiB, L2 2 MiB).<br>L3: 16 MiB Smart Cache, shared by all cores. |  |  |  |  |