summaryrefslogtreecommitdiff
path: root/src/mem/cache/base.hh
AgeCommit message (Collapse)Author
2018-10-18mem: Add write coalescing and write-no-allocate to the cachesNikos Nikoleris
Enable the cache to detect contiguous writes and hold on to the MSHR long enough to allow the entire line to be written. If the whole line is written, the MSHR will be sent out as an invalidation requests, as it is part of a whole-line write, i.e. no-fetch-on-write. The cache is also able to switch to a write-no-allocate policy on the actual completion of the writes, and instead use the tempBlock and turn the write operation into a writeback. These policies are all well-known, and described in works such as Jouppi, Cache Write Policies and Performance, vol 21, no 2, ACM, 1993. Change-Id: I19792f2970b3c6798c9b2b493acdd156897284ae Reviewed-on: https://gem5-review.googlesource.com/c/12907 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-10-18mem: Restructure whole-line writes to simplify write mergingNikos Nikoleris
This patch changes how we deal with whole-line writes their responses. With these changes, we use the MSHR tracking to determine if a whole-line is written, and on a fill we simply handle the invalidation response, with the actual writes taking place as part of satisfying the CPU-side hit. Change-Id: I9a18e41a95db3c20b97f8bca7d95ff33d35a578b Reviewed-on: https://gem5-review.googlesource.com/c/12905 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-10-11mem-cache: Rename blk.cc/hh to cache_blk.cc/hhDaniel R. Carvalho
Rename the files blk.cc and blk.hh to cache_blk.cc and cache_blk.hh to comply with the usual file-class naming rules. Change-Id: I8af45df3e4b8dd934fd9929ec914fb230cb2cb09 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/13416 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-09-13mem-cache: Fix bug in handleAtomicReqMissNikos Nikoleris
"4976ff5 mem-cache: Refactor the recvAtomic function" introduced a bug where if an atomic request that fills in using the tempBlock it will not evict it when it finishes handling the request as it should. This triggers an assertion. This change fixes this bug. Change-Id: I73c808a7e15237eddb36b5448ef6728f7bcf7fd9 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/12644 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-06-13mem-cache: Insert on block allocationDaniel R. Carvalho
When a block is being replaced in an allocation, if successfull, the block will be inserted. Therefore we move the insertion functionality to allocateBlock(). allocateBlock's signature has been modified to allow this modification. Change-Id: I60d17a83ff4f3021fdc976378868ccde6c7507bc Reviewed-on: https://gem5-review.googlesource.com/10812 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-06-01mem-cache: Create an address aware TempCacheBlkDaniel R. Carvalho
tempBlock has its member variables manually set in order to allow it to be used in the block address regeneration function. This is not necessary, and ti can be simply given the address, so it does not need to be aware of set and tag. This will simplify implementation of sector and skewed caches. Change-Id: Iaffb10c323509722cd5589fe1030b818d43336d6 Reviewed-on: https://gem5-review.googlesource.com/9961 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-05-31mem-cache: Replace block visitor with std::functionNikos Nikoleris
This change modifies forEachBlk tags function to accept std::function as parameter. It also adds an anyBlk tags function that given a condition, it iterates through the blocks and returns whether the condition is met. Finally, it uses forEachBlk to implement the print, computeStats and cleanupRefs functions that also work for the FALRU class. Change-Id: I2f75f4baa1fdd5a1d343a63ecace3eb9458fbf03 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10621 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-05-31mem-cache: Adopt a more sensible cache class hierarchyNikos Nikoleris
This patch changes what goes into the BaseCache and what goes into the Cache, to make it easier to add a NoncoherentCache with as much re-use as possible. A number of redundant members and definitions are also removed in the process. This is a modified version of a changeset put together by Andreas Hansson <andreas.hansson@arm.com> Change-Id: Ie9dd73c4ec07732e778e7416b712dad8b4bd5d4b Reviewed-on: https://gem5-review.googlesource.com/10431 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-05-17mem-cache: Move replacements stat to the base cache classNikos Nikoleris
Change-Id: I25dbcfcddfe1c422a76eb1af3f726c1360d8d110 Reviewed-on: https://gem5-review.googlesource.com/10426 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2017-12-04misc: Rename misc.(hh|cc) to logging.(hh|cc)Gabe Black
These files aren't a collection of miscellaneous stuff, they're the definition of the Logger interface, and a few utility macros for calling into that interface (panic, warn, etc.). Change-Id: I84267ac3f45896a83c0ef027f8f19c5e9a5667d1 Reviewed-on: https://gem5-review.googlesource.com/6226 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com>
2017-06-27mem-cache: Add missing overrides to BaseCacheAndreas Sandberg
Change-Id: I6a3a57e3067c247bd6ce6f01ac9459883f4aae2c Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/3880 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2017-06-20mem: Replace EventWrapper use with EventFunctionWrapperSean Wilson
NOTE: With this change there is a possibility for `DRAMCtrl::Rank`s event names to not properly match the rank they were generated by. This could occur if the public rank member is modified after the Rank's construction. A patch would mean refactoring Rank and `DRAMCtrl`b to privatize many of the members of Rank behind getters. Change-Id: I7b8bd15086f4ffdfd3f40be4aeddac5e786fd78e Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3745 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2017-03-03mem: Use pkt::getBlockAddr instead of BaseCace::blockAlignNikos Nikoleris
Change-Id: I0ed4e528cb750a323facdc811dde7f0ed1ff228e Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05mem: Make packet debug printing more uniformNikos Nikoleris
Previously DPRINTFs printing information about a packet would use ad hoc formats. This patch changes all DPRINTFs to use the print function defined by the packet class, making the packet printing format more uniform and easier to change. Change-Id: Idd436a9758d4bf70c86a574d524648b2a2580970 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-11-30mem: Split the hit_latency into tag_latency and data_latencySophiane Senni
If the cache access mode is parallel, i.e. "sequential_access" parameter is set to "False", tags and data are accessed in parallel. Therefore, the hit_latency is the maximum latency between tag_latency and data_latency. On the other hand, if the cache access mode is sequential, i.e. "sequential_access" parameter is set to "True", tags and data are accessed sequentially. Therefore, the hit_latency is the sum of tag_latency plus data_latency. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-05-26mem: fix the line length in the cache related classesNikos Nikoleris
Change-Id: I6d1feb164a958dde0da87a1cd2698096112c4a82 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-04-21mem: Remove unused cache statsAndreas Hansson
Prune cache stats that are never actually used.
2015-05-27mem: Add unused prefetch counter in cachesRekai Gonzalez Alberquilla
Added stat to the cache to account for HardPF'ed blocks that are evicted before being referenced (over-prefetching).
2016-03-17mem: Create a separate class for the cache write bufferAndreas Hansson
This patch breaks out the cache write buffer into a separate class, without affecting any stats. The goal of the patch is to avoid encumbering the much-simpler write queue with the complex MSHR handling. In a follow on patch this simplification allows us to implement write combining. The WriteQueue gets its own class, but shares a common ancestor, the generic Queue, with the MSHRQueue.
2016-02-10mem: Deduce if cache should forward snoopsAndreas Hansson
This patch changes how the cache determines if snoops should be forwarded from the memory side to the CPU side. Instead of having a parameter, the cache now looks at the port connected on the CPU side, and if it is a snooping port, then snoops are forwarded. Less error prone, and less parameters to worry about. The patch also tidies up the CPU classes to ensure that their I-side port is not snooping by removing overrides to the snoop request handler, such that snoop requests will panic via the default MasterPort implement
2015-12-31mem: Make cache terminology easier to understandAndreas Hansson
This patch changes the name of a bunch of packet flags and MSHR member functions and variables to make the coherency protocol easier to understand. In addition the patch adds and updates lots of descriptions, explicitly spelling out assumptions. The following name changes are made: * the packet memInhibit flag is renamed to cacheResponding * the packet sharedAsserted flag is renamed to hasSharers * the packet NeedsExclusive attribute is renamed to NeedsWritable * the packet isSupplyExclusive is renamed responderHadWritable * the MSHR pendingDirty is renamed to pendingModified The cache states, Modified, Owned, Exclusive, Shared are also called out in the cache and MSHR code to make it easier to understand.
2015-11-06mem: Add an option to perform clean writebacks from cachesAndreas Hansson
This patch adds the necessary commands and cache functionality to allow clean writebacks. This functionality is crucial, especially when having exclusive (victim) caches. For example, if read-only L1 instruction caches are not sending clean writebacks, there will never be any spills from the L1 to the L2. At the moment the cache model defaults to not sending clean writebacks, and this should possibly be re-evaluated. The implementation of clean writebacks relies on a new packet command WritebackClean, which acts much like a Writeback (renamed WritebackDirty), and also much like a CleanEvict. On eviction of a clean block the cache either sends a clean evict, or a clean writeback, and if any copies are still cached upstream the clean evict/writeback is dropped. Similarly, if a clean evict/writeback reaches a cache where there are outstanding MSHRs for the block, the packet is dropped. In the typical case though, the clean writeback allocates a block in the downstream cache, and marks it writable if the evicted block was writable. The patch changes the O3_ARM_v7a L1 cache configuration and the default L1 caches in config/common/Caches.py
2015-11-06mem: Add cache clusivityAndreas Hansson
This patch adds a parameter to control the cache clusivity, that is if the cache is mostly inclusive or exclusive. At the moment there is no intention to support strict policies, and thus the options are: 1) mostly inclusive, or 2) mostly exclusive. The choice of policy guides the behaviuor on a cache fill, and a new helper function, allocOnFill, is created to encapsulate the decision making process. For the timing mode, the decision is annotated on the MSHR on sending out the downstream packet, and in atomic we directly pass the decision to handleFill. We (ab)use the tempBlock in cases where we are not allocating on fill, leaving the rest of the cache unaffected. Simple and effective. This patch also makes it more explicit that multiple caches are allowed to consider a block writable (this is the case also before this patch). That is, for a mostly inclusive cache, multiple caches upstream may also consider the block exclusive. The caches considering the block writable/exclusive all appear along the same path to memory, and from a coherency protocol point of view it works due to the fact that we always snoop upwards in zero time before querying any downstream cache. Note that this patch does not introduce clean writebacks. Thus, for clean lines we are essentially removing a cache level if it is made mostly exclusive. For example, lines from the read-only L1 instruction cache or table-walker cache are always clean, and simply get dropped rather than being passed to the L2. If the L2 is mostly exclusive and does not allocate on fill it will thus never hold the line. A follow on patch adds the clean writebacks. The patch changes the L2 of the O3_ARM_v7a CPU configuration to be mostly exclusive (and stats are affected accordingly).
2015-11-06mem: Do not treat CleanEvict as a write operationAndreas Hansson
This patch changes the CleanEvict command type to not be considered a write. Initially it was made a zero-sized write to match the writeback command, but as things developed it became clear that it causes more problems than it solves. For example, the memory modules (and bridge) should not consider the CleanEvict as a write, but instead discard it. With this patch it will be neither a read, nor write, and as it does not need a response the slave will simply sink it.
2015-08-21mem: Add explicit Cache subclass and make BaseCache abstractAndreas Hansson
Open up for other subclasses to BaseCache and transition to using the explicit Cache subclass. --HG-- rename : src/mem/cache/BaseCache.py => src/mem/cache/Cache.py
2015-08-21mem: Move cache_impl.hh to cache.ccAndreas Hansson
There is no longer any need to keep the implementation in a header.
2015-07-30mem: Remove unused RequestCause in cacheAndreas Hansson
This patch removes the RequestCause, and also simplifies how we schedule the sending of packets through the memory-side port. The deassertion of bus requests is removed as it is not used.
2015-07-07sim: Decouple draining from the SimObject hierarchyAndreas Sandberg
Draining is currently done by traversing the SimObject graph and calling drain()/drainResume() on the SimObjects. This is not ideal when non-SimObjects (e.g., ports) need draining since this means that SimObjects owning those objects need to be aware of this. This changeset moves the responsibility for finding objects that need draining from SimObjects and the Python-side of the simulator to the DrainManager. The DrainManager now maintains a set of all objects that need draining. To reduce the overhead in classes owning non-SimObjects that need draining, objects inheriting from Drainable now automatically register with the DrainManager. If such an object is destroyed, it is automatically unregistered. This means that drain() and drainResume() should never be called directly on a Drainable object. While implementing the new functionality, the DrainManager has now been made thread safe. In practice, this means that it takes a lock whenever it manipulates the set of Drainable objects since SimObjects in different threads may create Drainable objects dynamically. Similarly, the drain counter is now an atomic_uint, which ensures that it is manipulated correctly when objects signal that they are done draining. A nice side effect of these changes is that it makes the drain state changes stricter, which the simulation scripts can exploit to avoid redundant drains.
2015-07-03mem: Remove redundant is_top_level cache parameterAndreas Hansson
This patch takes the final step in removing the is_top_level parameter from the cache. With the recent changes to read requests and write invalidations, the parameter is no longer needed, and consequently removed. This also means that asymmetric cache hierarchies are now fully supported (and we are actually using them already with L1 caches, but no table-walker caches, connected to a shared L2).
2015-07-03mem: Allow read-only caches and check complianceAndreas Hansson
This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data. A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data.
2015-05-05mem: Snoop into caches on uncacheable accessesAndreas Hansson
This patch takes a last step in fixing issues related to uncacheable accesses. We do not separate uncacheable memory from uncacheable devices, and in cases where it is really memory, there are valid scenarios where we need to snoop since we do not support cache maintenance instructions (yet). On snooping an uncacheable access we thus provide data if possible. In essence this makes uncacheable accesses IO coherent. The snoop filter is also queried to steer the snoops, but not updated since the uncacheable accesses do not allocate a block.
2015-03-27mem: Remove redundant allocateUncachedReadBuffer in cacheAndreas Hansson
This patch removes the no-longer-needed allocateUncachedReadBuffer. Besides the checks it is exactly the same as allocateMissBuffer and thus provides no value.
2015-03-27mem: Align all MSHR entries to block boundariesAndreas Hansson
This patch aligns all MSHR queue entries to block boundaries to simplify checks for matches. Previously there were corner cases that could lead to existing entries not being identified as matches. There are, rather alarmingly, a few regressions that change with this patch.
2015-03-02mem: Tidy up the cache debug messagesAndreas Hansson
Avoid redundant inclusion of the name in the DPRINTF string.
2015-03-02mem: Split port retry for all different packet classesAndreas Hansson
This patch fixes a long-standing isue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios. The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting. The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fixes the previously seen deadlocks.
2015-02-11mem: Clarify usage of latency in the cacheMarco Balboni
This patch adds some much-needed clarity in the specification of the cache timing. For now, hit_latency and response_latency are kept as top-level parameters, but the cache itself has a number of local variables to better map the individual timing variables to different behaviours (and sub-components). The introduced variables are: - lookupLatency: latency of tag lookup, occuring on any access - forwardLatency: latency that occurs in case of outbound miss - fillLatency: latency to fill a cache block We keep the existing responseLatency The forwardLatency is used by allocateInternalBuffer() for: - MSHR allocateWriteBuffer (unchached write forwarded to WriteBuffer); - MSHR allocateMissBuffer (cacheable miss in MSHR queue); - MSHR allocateUncachedReadBuffer (unchached read allocated in MSHR queue) It is our assumption that the time for the above three buffers is the same. Similarly, for snoop responses passing through the cache we use forwardLatency.
2015-02-03mem: Clarify cache behaviour for pending dirty responsesAndreas Hansson
This patch adds a bit of clarification around the assumptions made in the cache when packets are sent out, and dirty responses are pending. As part of the change, the marking of an MSHR as in service is simplified slightly, and comments are added to explain what assumptions are made.
2014-12-02mem: Remove WriteInvalidate supportCurtis Dunham
Prepare for a different implementation following in the next patch
2014-06-27mem: write streaming support via WriteInvalidate promotionCurtis Dunham
Support full-block writes directly rather than requiring RMW: * a cache line is allocated in the cache upon receipt of a WriteInvalidateReq, not the WriteInvalidateResp. * only top-level caches allocate the line; the others just pass the request along and invalidate as necessary. * to close a timing window between the *Req and the *Resp, a new metadata bit tracks whether another cache has read a copy of the new line before the writeback to memory.
2014-09-03mem: Fix a bug in the cache port flow controlAndreas Hansson
This patch fixes a bug in the cache port where the retry flag was reset too early, allowing new requests to arrive before the retry was actually sent, but with the event already scheduled. This caused a deadlock in the interactions with the O3 LSQ. The patche fixes the underlying issue by shifting the resetting of the flag to be done by the event that also calls sendRetry(). The patch also tidies up the flow control in recvTimingReq and ensures that we also check if we already have a retry outstanding.
2014-01-24mem: Add support for a security bit in the memory systemGiacomo Gabrielli
This patch adds the basic building blocks required to support e.g. ARM TrustZone by discerning secure and non-secure memory accesses.
2014-01-24mem: track per-request latencies and access depths in the cache hierarchyMatt Horsnell
Add some values and methods to the request object to track the translation and access latency for a request and which level of the cache hierarchy responded to the request.
2013-02-15mem: Tighten up cache constness and scopingAndreas Hansson
This patch merely adopts a more strict use of const for the cache member functions and variables, and also moves a large portion of the member functions from public to protected.
2013-01-28cache: remove drainManager because it's not usedAnthony Gutierrez
the cache drainManager is set but never cleared, this is because the cache itself does not need to be drained and thus never triggers a signalDrainDone(). because the drainManager variable is not used properly and does not appear to be necessary it has been removed with this patch.
2012-11-02mem: Add support for writing back and flushing cachesAndreas Sandberg
This patch adds support for the following optional drain methods in the classical memory system's cache model: memWriteback() - Write back all dirty cache lines to memory using functional accesses. memInvalidate() - Invalidate all cache lines. Dirty cache lines are lost unless a writeback is requested. Since memWriteback() is called when checkpointing systems, this patch adds support for checkpointing systems with caches. The serialization code now checks whether there are any dirty lines in the cache. If there are dirty lines in the cache, the checkpoint is flagged as bad and a warning is printed.
2012-11-02sim: Move the draining interface into a separate base classAndreas Sandberg
This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager.
2012-10-15Port: Add protocol-agnostic ports in the port hierarchyAndreas Hansson
This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.
2012-10-15Mem: Use cycles to express cache-related latenciesAndreas Hansson
This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counter in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.
2012-09-25Cache: add a response latency to the cachesMrinmoy Ghosh
In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set of the cache for the backward path.
2012-08-22Port: Extend the QueuedPort interface and use where appropriateAndreas Hansson
This patch extends the queued port interfaces with methods for scheduling the transmission of a timing request/response. The methods are named similar to the corresponding sendTiming(Snoop)Req/Resp, replacing the "send" with "sched". As the queues are currently unbounded, the methods always succeed and hence do not return a value. This functionality was previously provided in the subclasses by calling PacketQueue::schedSendTiming with the appropriate parameters. With this change, there is no need to introduce these extra methods in the subclasses, and the use of the queued interface is more uniform and explicit.