477

Is 8GB a lot? Depends on the context. (lemmy.ml)

submitted 23 hours ago by cm0002@lemmy.world to c/memes@lemmy.world

74 comments


[–] Smoolak@lemmy.world 15 points 22 hours ago (2 children)

The meme doesn't make sense. An SRAM cache of that size would be so slow that you would most likely save clock cycles by reading directly from RAM and not having a cache at all...

[–] cogman@lemmy.world 28 points 21 hours ago (1 children)

Slow? Not necessarily.

The main issue with that much memory is the data routing and the physical locality of the memory. Assuming you (somehow) could shrink the distance from the cache to the registers and had wide enough data and request lines, you could get data from such a cache in ~4 cycles (assuming L1 and a hit).

What slows down memory for L2 is the wider address space and slower residence checks. L3 gets a bit slower because of even wider address spaces, but it also has to deal with concurrency issues since it's shared among cores. It also ends up being slower because it physically has to be further away from the cores due to its size.

If you ever look at a CPU die, you'll see that L1 caches are generally tiny and embedded right into the center of the processor. L2 tends to be bolted onto the sides of the physical cores. And L3 tends to take up the largest amount of silicon real estate on a CPU package. This is all what contributes to the increasing fetch latency for each layer, along with the fact that you have to check the closest layers first (an L3 hit, for example, means that the CPU checked L1 and L2 and failed at both, which takes time. So L3 access will always be at least the L1 + L2 times).
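
To put rough numbers on the "check the closest layers first" point, here is a minimal Python sketch. It assumes a strictly sequential lookup, as described above; real CPUs may overlap some of these checks. Only the ~4 cycle L1 figure comes from the comment; the L2 and L3 lookup costs and the `hit_latency` helper are made-up values and names for illustration.

```python
# Illustrative per-level lookup costs in cycles. Only the ~4-cycle L1 figure
# comes from the comment above; the L2 and L3 values are assumed round numbers.
LOOKUP_CYCLES = {"L1": 4, "L2": 10, "L3": 30}
LEVELS = ["L1", "L2", "L3"]


def hit_latency(level):
    """Cycles to fetch data that hits at `level`, checking closer levels first.

    Models a strictly sequential lookup: every closer level is checked
    (and misses) before the level that finally hits.
    """
    idx = LEVELS.index(level)
    return sum(LOOKUP_CYCLES[name] for name in LEVELS[: idx + 1])


if __name__ == "__main__":
    for level in LEVELS:
        print(level, "hit costs", hit_latency(level), "cycles")
```

With these assumed numbers, an L3 hit costs the L1 and L2 lookups plus its own, which is the "at least the L1 + L2 times" floor.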

[–] Smoolak@lemmy.world 5 points 20 hours ago

I agree. When evaluating cache access latency, it is important to consider the entire read path rather than just the intrinsic access time of a single SRAM cell. Much of the latency arises from all the supporting operations required for a functioning cache, such as tag lookups, address decoding, and bitline traversal. As you pointed out, implementing an 8 GB SRAM cache on-die using current manufacturing technology would be extremely impractical. The physical size would lead to substantial wire delays and increased complexity in the indexing and associativity circuits. As a result, the access latency of such a large on-chip cache could actually exceed that of off-chip DRAM, which would defeat the main purpose of having on-die caches in the first place.
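
A back-of-the-envelope way to see the "defeats the purpose" point is the standard average memory access time formula (hit time plus miss rate times miss penalty). The sketch below is only an illustration: every cycle count and hit rate is a hypothetical value picked to show the shape of the argument, not a measurement, and `amat_with_cache` is a name invented here.

```python
def amat_with_cache(cache_cycles, dram_cycles, hit_rate):
    """Average memory access time: every access pays the cache lookup,
    and misses additionally pay the DRAM access."""
    return cache_cycles + (1.0 - hit_rate) * dram_cycles


# All numbers below are hypothetical, chosen only for illustration.
DRAM_CYCLES = 200

# A small, fast cache with a good (but not perfect) hit rate.
small_fast = amat_with_cache(cache_cycles=4, dram_cycles=DRAM_CYCLES, hit_rate=0.95)

# A huge cache whose lookup alone is slower than DRAM, even if it never misses.
huge_slow = amat_with_cache(cache_cycles=250, dram_cycles=DRAM_CYCLES, hit_rate=1.0)

if __name__ == "__main__":
    print("no cache:   ", DRAM_CYCLES)
    print("small cache:", small_fast)
    print("huge cache: ", huge_slow)
```

Once the cache lookup itself costs more than a DRAM access, no hit rate can make the cache a net win, since the lookup cost is paid on every access.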