The huge caches also appear to be extremely fast – the L1D lands in at a 3-cycle load-use
latency. We don’t know if this is clever load-load cascading such as that described for
Samsung’s cores, but in any case, it’s very impressive for such a large structure. AMD has
a 32KB 4-cycle cache, whilst Intel’s latest Sunny Cove saw a regression to 5 cycles when
they grew the size to 48KB. Food for thought on the advantages or disadvantages of slow
or fast frequency designs.
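
To make the cycles-versus-frequency trade-off concrete, here is a minimal sketch that converts a load-use latency in cycles into nanoseconds. The cycle counts are the ones quoted above; the clock speeds are hypothetical round numbers chosen purely for illustration, not measured frequencies of any of these chips, and the function name `load_use_latency_ns` is our own.

```python
def load_use_latency_ns(cycles: int, freq_ghz: float) -> float:
    """Convert a load-use latency in core cycles to nanoseconds."""
    if freq_ghz <= 0:
        raise ValueError("freq_ghz must be positive")
    # 1 GHz is one cycle per nanosecond, so time = cycles / GHz.
    return cycles / freq_ghz


# Cycle counts come from the text; the clock speeds are hypothetical
# placeholders, not the real frequencies of any specific design.
EXAMPLES = [
    ("3-cycle L1D", 3, 3.0),
    ("4-cycle L1D", 4, 5.0),
    ("5-cycle L1D", 5, 4.0),
]

for label, cycles, ghz in EXAMPLES:
    latency = load_use_latency_ns(cycles, ghz)
    print(f"{label} at {ghz:.1f} GHz: {latency:.2f} ns")
```

Under these assumed clocks, the 4-cycle cache at 5.0 GHz would actually return data sooner in absolute time than the 3-cycle cache at 3.0 GHz, which is why cycle counts alone don't settle the comparison.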