summaryrefslogtreecommitdiff
path: root/src/mem/cache/prefetch
AgeCommit message (Collapse)Author
2016-11-09style: [patch 3/22] reduce include dependencies in some headersBrandon Potter
Used cppclean to help identify useless includes and removed them. This involved erroneously included headers, but also cases where forward declarations could have been used rather than a full include.
2016-11-09style: [patch 1/22] use /r/3648/ to reorganize includesBrandon Potter
2016-06-06sim: Call regStats of base-class as wellStephan Diestelhorst
We want to extend the stats of objects hierarchically and thus it is necessary to register the statistics of the base-class(es), as well. For now, these are empty, but generic stats will be added there. Patch originally provided by Akash Bagdia at ARM Ltd.
2016-05-26mem: change NULL to nullptr in the cache related classesNikos Nikoleris
Change-Id: I5042410be54935650b7d05c84d8d9efbfcc06e70 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-04-07mem: Add priority to QueuedPrefetcherRekai Gonzalez Alberquilla
Queued prefetcher entries now count with a priority field. The idea is to add packets ordered by priority and then by age. For the existing algorithms in which priority doesn't make sense, it is set to 0 for all deferred packets in the queue.
2016-04-07mem: Handful extra features for BasePrefetcherRekai Gonzalez Alberquilla
Some common functionality added to the base prefetcher, mainly dealing with extracting the block address, page address, block index inside the page and some other information that can be inferred from the block address. This is used for some prefetching algorithms, and having the methods in the base, as well as the block size and other information is the sensible way.
2016-04-07mem: Remove threadId from memory request classMitch Hayenga
In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu. This is a re-spin of 20264eb after the revert (bd1c6789) and includes some fixes of that commit.
2016-04-06Revert power patch sets with unexpected interactionsAndreas Sandberg
The following patches had unexpected interactions with the current upstream code and have been reverted for now: e07fd01651f3: power: Add support for power models 831c7f2f9e39: power: Low-power idle power state for idle CPUs 4f749e00b667: power: Add power states to ClockedObject Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> --HG-- extra : amend_source : 0b6fb073c6bbc24be533ec431eb51fbf1b269508
2016-04-05mem: Remove threadId from memory request classMitch Hayenga
In general, the ThreadID parameter is unnecessary in the memory system as the ContextID is what is used for the purposes of locks/wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary as the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or ContextID offset from the base ContextID for a cpu.
2016-02-06style: fix missing spaces in control statementsSteve Reinhardt
Result of running 'hg m5style --skip-all --fix-control -a'.
2015-10-12misc: Remove redundant compiler-specific definesAndreas Hansson
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions.
2015-07-30mem: Remove unused RequestCause in cacheAndreas Hansson
This patch removes the RequestCause, and also simplifies how we schedule the sending of packets through the memory-side port. The deassertion of bus requests is removed as it is not used.
2015-07-13mem: Fix (ab)use of emplace to avoid temporary object creationAndreas Hansson
2015-07-03mem: Add clean evicts to improve snoop filter trackingAli Jafri
This patch adds eviction notices to the caches, to provide accurate tracking of cache blocks in snoop filters. We add the CleanEvict message to the memory heirarchy and use both CleanEvicts and Writebacks with BLOCK_CACHED flags to propagate notice of clean and dirty evictions respectively, down the memory hierarchy. Note that the BLOCK_CACHED flag indicates whether there exist any copies of the evicted block in the caches above the evicting cache. The purpose of the CleanEvict message is to notify snoop filters of silent evictions in the relevant caches. The CleanEvict message behaves much like a Writeback. CleanEvict is a write and a request but unlike a Writeback, CleanEvict does not have data and does not need exclusive access to the block. The cache generates the CleanEvict message on a fill resulting in eviction of a clean block. Before travelling downwards CleanEvict requests generate zero-time snoop requests to check if the same block is cached in upper levels of the memory heirarchy. If the block exists, the cache discards the CleanEvict message. The snoops check the tags, writeback queue and the MSHRs of upper level caches in a manner similar to snoops generated from HardPFReqs. Currently CleanEvicts keep travelling towards main memory unless they encounter the block corresponding to their address or reach main memory (since we have no well defined point of serialisation). Main memory simply discards CleanEvict messages. We have modified the behavior of Writebacks, such that they generate snoops to check for the presence of blocks in upper level caches. It is possible in our current implmentation for a lower level cache to be writing back a block while a shared copy of the same block exists in the upper level cache. If the snoops find the same block in upper level caches, we set the BLOCK_CACHED flag in the Writeback message. We have also added logic to account for interaction of other message types with CleanEvicts waiting in the writeback queue. A simple example is of a response arriving at a cache removing any CleanEvicts to the same address from the cache's writeback queue.
2015-03-27mem: Support any number of master-IDs in stride prefetcherStephan Diestelhorst
The stride prefetcher had a hardcoded number of contexts (i.e. master-IDs) that it could handle. Since master IDs need to be unique per system, and every core, cache etc. requires a separate master port, a static limit on these does not make much sense. Instead, this patch adds a small hash map that will map all master IDs to the right prefetch state and dynamically allocates new state for new master IDs.
2015-03-19mem: Use emplace front/back for deferred packetsAndreas Hansson
Embrace C++11 for the deferred packets as we actually store the objects in the data structure, and not just pointers.
2014-12-23mem: Change prefetcher to use random_mtMitch Hayenga
Prefechers has used rand() to generate random numers previously.
2014-12-23mem: Hide WriteInvalidate requests from prefetchersCurtis Dunham
Without this tweak, a prefetcher will happily prefetch data that will promptly be invalidated and overwritten by a WriteInvalidate.
2014-12-23mem: Rework the structuring of the prefetchersMitch Hayenga
Re-organizes the prefetcher class structure. Previously the BasePrefetcher forced multiple assumptions on the prefetchers that inherited from it. This patch makes the BasePrefetcher class truly representative of base functionality. For example, the base class no longer enforces FIFO order. Instead, prefetchers with FIFO requests (like the existing stride and tagged prefetchers) now inherit from a new QueuedPrefetcher base class. Finally, the stride-based prefetcher now assumes a custimizable lookup table (sets/ways) rather than the previous fully associative structure.
2014-10-16mem: Dynamically determine page bytes in memory componentsAndreas Hansson
This patch takes a step towards an ISA-agnostic memory system by enabling the components to establish the page size after instantiation. The swap operation in the memory is now also allowing any granularity to avoid depending on the IntReg of the ISA.
2014-09-20mem: Remove the GHB prefetcher from the source treeMitch Hayenga
There are two primary issues with this code which make it deserving of deletion. 1) GHB is a way to structure a prefetcher, not a definitive type of prefetcher 2) This prefetcher isn't even structured like a GHB prefetcher. It's basically a worse version of the stride prefetcher. It primarily serves to confuse new gem5 users and most functionality is already present in the stride prefetcher.
2014-09-09misc: Fix a number of unitialised variables and membersAndreas Hansson
Static analysis unearther a bunch of uninitialised variables and members, and this patch addresses the problem. In all cases these omissions seem benign in the end, but at least fixing them means less false positives next time round.
2014-09-03arch: Cleanup unused ISA traits constantsAndreas Hansson
This patch prunes unused values, and also unifies how the values are defined (not using an enum for ALPHA), aligning the use of int vs Addr etc. The patch also removes the duplication of PageBytes/PageShift and VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical values and the latter has been removed.
2014-01-29mem: Add additional tolerance to stride prefetcherMitch Hayenga
Forces the prefetcher to mispredict twice in a row before resetting the confidence of prefetching. This helps cases where a load PC strides by a constant factor, however it may operate on different arrays at times. Avoids the cost of retraining. Primarily helps with small iteration loops. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-29mem: Allowed tagged instruction prefetching in stride prefetcherMitch Hayenga
For systems with a tightly coupled L2, a stride-based prefetcher may observe access requests from both instruction and data L1 caches. However, the PC address of an instruction miss gives no relevant training information to the stride based prefetcher(there is no stride to train). In theses cases, its better if the L2 stride prefetcher simply reverted back to a simple N-block ahead prefetcher. This patch enables this option. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-29mem: prefetcher: add options, support for unaligned addressesMitch Hayenga ext:(%2C%20Amin%20Farmahini%20%3Caminfar%40gmail.com%3E)
This patch extends the classic prefetcher to work on non-block aligned addresses. Because the existing prefetchers in gem5 mask off the lower address bits of cache accesses, many predictable strides fail to be detected. For example, if a load were to stride by 48 bytes, with 64 byte cachelines, the current stride based prefetcher would see an access pattern of 0, 64, 64, 128, 192.... Thus not detecting a constant stride pattern. This patch fixes this, by training the prefetcher on access and not masking off the lower address bits. It also adds the following configuration options: 1) Training/prefetching only on cache misses, 2) Training/prefetching only on data acceses, 3) Optionally tagging prefetches with a PC address. #3 allows prefetchers to train off of prefetch requests in systems with multiple cache levels and PC-based prefetchers present at multiple levels. It also effectively allows a pipelining of prefetch requests (like in POWER4) across multiple levels of cache hierarchy. Improves performance on my gem5 configuration by 4.3% for SPECINT and 4.7% for SPECFP (geomean).
2014-01-24mem: Add support for a security bit in the memory systemGiacomo Gabrielli
This patch adds the basic building blocks required to support e.g. ARM TrustZone by discerning secure and non-secure memory accesses.
2014-01-24mem: per-thread cache occupancy and per-block agesDam Sunwoo
This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump.
2013-09-04arch: Resurrect the NOISA build target and rename it NULLAndreas Hansson
This patch makes it possible to once again build gem5 without any ISA. The main purpose is to enable work around the interconnect and memory system without having to build any CPU models or device models. The regress script is updated to include the NULL ISA target. Currently no regressions make use of it, but all the testers could (and perhaps should) transition to it. --HG-- rename : build_opts/NOISA => build_opts/NULL rename : src/arch/noisa/SConsopts => src/arch/null/SConsopts rename : src/arch/noisa/cpu_dummy.hh => src/arch/null/cpu_dummy.hh rename : src/cpu/intr_control.cc => src/cpu/intr_control_noisa.cc
2013-02-19mem: Add deferred packet class to prefetcherAndreas Hansson
This patch removes the time field from the packet as it was only used by the preftecher. Similar to the packet queue, the prefetcher now wraps the packet in a deferred packet, which also has a tick representing the absolute time when the packet should be sent.
2013-02-19sim: Make clock private and access using clockPeriod()Andreas Hansson
This patch makes the clock member private to the ClockedObject and forces all children to access it using clockPeriod(). This makes it impossible to inadvertently change the clock, and also makes it easier to transition to a situation where the clock is derived from e.g. a clock domain, or through a multiplier.
2012-11-02sim: Include object header files in SWIG interfacesAndreas Sandberg
When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it.
2012-10-15Mem: Use cycles to express cache-related latenciesAndreas Hansson
This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counter in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.
2012-05-10gem5: fix some iterator use and erase bugsAli Saidi
2012-04-14MEM: Remove the Broadcast destination from the packetAndreas Hansson
This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field. Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before). The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class. In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing.
2012-02-12mem: Add a master ID to each request object.Ali Saidi
This change adds a master id to each request object which can be used identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python.
2012-02-12prefetcher: Make prefetcher a sim object instead of it being a parameter on ↵Mrinmoy Ghosh
cache
2011-09-01Fix build for gcc-4.2 opt/fastLisa Hsu
Even though the code is safe, compiler flags a warning here, which are treated as errors for fast/opt. I know it's redundant but it has no side effects and fixes the compile.
2011-08-19Prefetcher: Fix some memory leaks with the prefetcher.Ali Saidi
2011-04-15trace: reimplement the DTRACE function so it doesn't use a vectorNathan Binkert
At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help
2011-04-15includes: sort all includesNathan Binkert
2011-02-23Includes: Don't include isa_traits.hh and use the TheISA namespace unless ↵Ali Saidi
really needed.
2010-11-19SCons: Support building without an ISAAli Saidi
2009-09-26Force prefetches to check cache and MSHRs immediately prior to issue.Steve Reinhardt
This prevents redundant prefetches from being issued, solving the occasional 'needsExclusive && !blk->isWritable()' assertion failure in cache_impl.hh that several people have run into. Eliminates "prefetch_cache_check_push" flag, neither setting of which really solved the problem.
2009-09-23arch: nuke arch/isa_specific.hh and move stuff to generated config/the_isa.hhNathan Binkert
2009-06-04types: clean up types, especially signed vs unsignedNathan Binkert
2009-04-20request: rename INST_READ to INST_FETCH.Steve Reinhardt
2009-03-10prefetch: don't panic on requests w/o contextID (e.g., writebacks).Steve Reinhardt
2009-03-05stats: Fix all stats usages to deal with template fixesNathan Binkert
2009-02-16Fixes to get prefetching working again.Steve Reinhardt
Apparently we broke it with the cache rewrite and never noticed. Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part of these changes (and for inspiring me to work on the rest). Some other overdue cleanup on the prefetch code too.