Age | Commit message (Collapse) | Author |
|
Replacement policies (LRU, Random) are currently considered as array
indexing methods, but have completely different functionalities:
- Array indexers determine the possible locations for block allocation.
This information is used to generate replacement candidates when
conflicts happen.
- Replacement policies determine which of the replacement candidates
should be evicted to make room for new allocations.
For this reason, they were split into different classes. Advantages:
- Easier and more straightforward to implement other replacement
policies (RRIP, LFU, ARC, ...)
- Allow easier future implementation of cache organization schemes
As now we can't assure the use of sets, the previous way to create a
true LRU is not viable. Now a timestamp_bits parameter controls how
many bits are dedicated for the timestamp, and a true LRU can be
achieved through an infinite number of bits (although a few bits suffice
in practice).
Change-Id: I23750db121f1474d17831137e6ff618beb2b3eda
Reviewed-on: https://gem5-review.googlesource.com/8501
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
|
Skewed caches need to know the way to regenerate a block address.
Change-Id: I62c61ac9509eff2f37bad36862751956db7a6e40
Reviewed-on: https://gem5-review.googlesource.com/8782
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
|
Change-Id: I79c2662fc81630ab321db8a75be6cd15fa07d372
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: Ia57cc104978861ab342720654e408dbbfcbe4b69
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: I5042410be54935650b7d05c84d8d9efbfcc06e70
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
This patch makes cache sets aware of the way number. This enables
some nice features such as the ablity to restrict way allocation. The
implemented mechanism allows to set a maximum way number to be
allocated 'k' which must fulfill 0 < k <= N (where N is the number of
ways). In the future more sophisticated mechasims can be implemented.
|
|
This patch changes the cache implementation to rely on virtual methods
rather than using the replacement policy as a template argument.
There is no impact on the simulation performance, and overall the
changes make it easier to modify (and subclass) the cache and/or
replacement policy.
|
|
this patch implements a new tags class that uses a random replacement policy.
these tags prefer to evict invalid blocks first, if none are available a
replacement candidate is chosen at random.
this patch factors out the common code in the LRU class and creates a new
abstract class: the BaseSetAssoc class. any set associative tag class must
implement the functionality related to the actual replacement policy in the
following methods:
accessBlock()
findVictim()
insertBlock()
invalidate()
|
|
The patch
(1) removes the redundant writeback argument from findVictim()
(2) fixes the description of access() function
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
|
This patch adds the basic building blocks required to support e.g. ARM
TrustZone by discerning secure and non-secure memory accesses.
|
|
Adds very basic statistics on the number of tag and data accesses within the
cache, which is important for power modelling. For the tags, simply count
the associativity of the cache each time. For the data, this depends on
whether tags and data are accessed sequentially, which is given by a new
parameter. In the parallel case, all data blocks are accessed each time, but
with sequential accesses, a single data block is accessed only on a hit.
|
|
This patch enables tracking of cache occupancy per thread along with
ages (in buckets) per cache blocks. Cache occupancy stats are
recalculated on each stat dump.
|
|
This patch reorganizes the cache tags to allow more flexibility to
implement new replacement policies. The base tags class is now a
clocked object so that derived classes can use a clock if they need
one. Also having deriving from SimObject allows specialized Tag
classes to be swapped in/out in .py files.
The cache set is now templatized to allow it to contain customized
cache blocks with additional informaiton. This involved moving code to
the .hh file and removing cacheset.cc.
The statistics belonging to the cache tags are now including ".tags"
in their name. Hence, the stats need an update to reflect the change
in naming.
|
|
This patch provides useful printouts throughut the memory system. This
includes pretty-printed cache tags and function call messages
(call-stack like).
|
|
This patch changes the cache-related latencies from an absolute time
expressed in Ticks, to a number of cycles that can be scaled with the
clock period of the caches. Ultimately this patch serves to enable
future work that involves dynamic frequency scaling. As an immediate
benefit it also makes it more convenient to specify cache performance
without implicitly assuming a specific CPU core operating frequency.
The stat blocked_cycles that actually counter in ticks is now updated
to count in cycles.
As the timing is now rounded to the clock edges of the cache, there
are some regressions that change. Plenty of them have very minor
changes, whereas some regressions with a short run-time are perturbed
quite significantly. A follow-on patch updates all the statistics for
the regressions.
|
|
This seperates the functionality to clear the state in a block into
blk.hh and the functionality to udpate the tag information into the
tags. This gets rid of the case where calling invalidateBlk on an
already-invalid block does something different than calling it on a
valid block, which was confusing.
|
|
The LRU policy always evicted the least recently touched way, even if it
contained valid data and another way was invalid, as can happen if a block has
been invalidated by coherance. This can result in caches never warming up even
though they are replacing blocks. This modifies the LRU policy to move blocks
to LRU position on invalidation.
|
|
This patch fixes the cache stats to use the new request ids.
Cache stats also display the requestor names in the vector subnames.
Most cache stats now include "nozero" and "nonan" flags to reduce the
amount of excessive cache stat dump. Also, simplified
incMissCount()/incHitCount() functions.
|
|
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
|
|
|
|
This step makes it easy to replace the accessor functions
(which still access a global variable) with ones that access
per-thread curTick values.
|
|
|
|
when it in received
|
|
Plus, a minor bugfix that neglects to update blk->contextSrc in certain cases on a cache insert.
|
|
On the config end, if a shared L2 is created for the system, it is
parameterized to have n sharers as defined by option.num_cpus. In addition to
making the cache sharing aware so that discriminating tag policies can make use
of context_ids to make decisions, I added an occupancy AverageStat and an occ %
stat to each cache so that you could know which contexts are occupying how much
cache on average, both in terms of blocks and percentage. Note that since
devices have context_id -1, having an array of occ stats that correspond to
each context_id will break here, so in FS mode I add an extra bucket for device
blocks. This bucket is explicitly not added in SE mode in order to not only
avoid ugliness in the stats.txt file, but to avoid broken stats (some formulas
break when a bucket is 0).
|
|
|
|
that the cache can make context-specific decisions within their various tag policy implementations.
|
|
|
|
|
|
connotations for what is really happening and how it should be used.
|
|
|
|
should configure their editors to not insert tabs
|
|
--HG--
extra : convert_revision : b5008115dc5b34958246608757e69a3fa43b85c5
|
|
--HG--
extra : convert_revision : 31724d19ebdf2cdc2a2bafff83d17845b3a0b183
|
|
--HG--
extra : convert_revision : 703da6128832eb0d5cfed7724e5105f4b3fe4f90
|
|
src/mem/cache/tags/lru.cc:
Add some replacement DPRINTFs
--HG--
extra : convert_revision : 7993ec24d6af7e7774d04ce36f20e3f43f887fd9
|
|
timing mode still broken.
configs/example/memtest.py:
Revamp options.
src/cpu/memtest/memtest.cc:
No need for memory initialization.
No need to make atomic response... memory system should do that now.
src/cpu/memtest/memtest.hh:
MemTest really doesn't want to snoop.
src/mem/bridge.cc:
checkFunctional() cleanup.
src/mem/bus.cc:
src/mem/bus.hh:
src/mem/cache/base_cache.cc:
src/mem/cache/base_cache.hh:
src/mem/cache/cache.cc:
src/mem/cache/cache.hh:
src/mem/cache/cache_blk.hh:
src/mem/cache/cache_builder.cc:
src/mem/cache/cache_impl.hh:
src/mem/cache/coherence/coherence_protocol.cc:
src/mem/cache/coherence/coherence_protocol.hh:
src/mem/cache/coherence/simple_coherence.hh:
src/mem/cache/miss/SConscript:
src/mem/cache/miss/mshr.cc:
src/mem/cache/miss/mshr.hh:
src/mem/cache/miss/mshr_queue.cc:
src/mem/cache/miss/mshr_queue.hh:
src/mem/cache/prefetch/base_prefetcher.cc:
src/mem/cache/tags/fa_lru.cc:
src/mem/cache/tags/fa_lru.hh:
src/mem/cache/tags/iic.cc:
src/mem/cache/tags/iic.hh:
src/mem/cache/tags/lru.cc:
src/mem/cache/tags/lru.hh:
src/mem/cache/tags/split.cc:
src/mem/cache/tags/split.hh:
src/mem/cache/tags/split_lifo.cc:
src/mem/cache/tags/split_lifo.hh:
src/mem/cache/tags/split_lru.cc:
src/mem/cache/tags/split_lru.hh:
src/mem/packet.cc:
src/mem/packet.hh:
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/tport.cc:
More major reorg. Seems to work for atomic mode now,
timing mode still broken.
--HG--
extra : convert_revision : 7e70dfc4a752393b911880ff028271433855ae87
|
|
directly configured by python. Move stuff from root.(cc|hh) to
core.(cc|hh) since it really belogs there now.
In the process, simplify how ticks are used in the python code.
--HG--
extra : convert_revision : cf82ee1ea20f9343924f30bacc2a38d4edee8df3
|
|
don't regenerate address from block in cache so that tags can
turn around and use address to look up block again.
--HG--
extra : convert_revision : 171018aa6e331d98399c4e5ef24e173c95eaca28
|
|
--HG--
extra : convert_revision : d9eb83ab77ffd2d725961f295b1733137e187711
|
|
configs/example/fs.py:
Add MOESI protocol to caches (uni coherence not quite working w/FS yet).
--HG--
extra : convert_revision : 7bef7d9c5b24bf7241cc810df692408837b06b86
|
|
--HG--
extra : convert_revision : a701ed9d078c67718a33f4284c0403a8aaac7b25
|
|
translatingPort read/write Blob function problems with caches.
-Basically removed the ASID from places it is no longer needed due to PageTable
src/mem/cache/cache.hh:
src/mem/cache/cache_impl.hh:
src/mem/cache/miss/blocking_buffer.cc:
src/mem/cache/miss/blocking_buffer.hh:
src/mem/cache/miss/miss_queue.cc:
src/mem/cache/miss/miss_queue.hh:
src/mem/cache/miss/mshr.cc:
src/mem/cache/miss/mshr.hh:
src/mem/cache/miss/mshr_queue.cc:
src/mem/cache/miss/mshr_queue.hh:
src/mem/cache/prefetch/base_prefetcher.cc:
src/mem/cache/prefetch/base_prefetcher.hh:
src/mem/cache/tags/fa_lru.cc:
src/mem/cache/tags/fa_lru.hh:
src/mem/cache/tags/iic.cc:
src/mem/cache/tags/iic.hh:
src/mem/cache/tags/lru.cc:
src/mem/cache/tags/lru.hh:
src/mem/cache/tags/split.cc:
src/mem/cache/tags/split.hh:
src/mem/cache/tags/split_lifo.cc:
src/mem/cache/tags/split_lifo.hh:
src/mem/cache/tags/split_lru.cc:
src/mem/cache/tags/split_lru.hh:
Remove asid where it wasn't neccesary anymore due to Page Table
--HG--
extra : convert_revision : ab8bbf4cc47b9eaefa9cdfa790881a21d0e7bf28
|
|
Need to clean up a bunch of flags/hacks in the code. Then onto Timming mode.
Functional accesses also work properly, although not exactly how we wanted them. I'll need to clean that up as well.
src/cpu/simple/atomic.cc:
Atomic CPU needs to set thread context so stats work in cache. Temporarily just use CPU=0 ThreadID=0
src/mem/cache/cache_impl.hh:
Need to return success/failure properly still
Physical memory object doesn't assert SATISFIED anymore, need to remove that flag
src/mem/cache/tags/lru.cc:
Doesn't work if the REQ doesn't set it's ASID. Temporary fix use 0 always
--HG--
extra : convert_revision : d06a39684af593db699b64df9a29f80c61d8d050
|
|
my initial work.
This now compiles.
src/mem/cache/base_cache.cc:
Fix getPort function that changed
src/mem/cache/base_cache.hh:
Fix get port function, provide default implementations of virtual functions in the base class
src/mem/cache/cache.hh:
Fix virtual function declerations
src/mem/cache/cache_builder.cc:
Fix params
src/mem/cache/cache_impl.hh:
src/mem/cache/miss/blocking_buffer.cc:
src/mem/cache/miss/miss_queue.cc:
src/mem/cache/miss/mshr.cc:
src/mem/cache/prefetch/base_prefetcher.cc:
src/mem/cache/tags/iic.cc:
src/mem/cache/tags/lru.cc:
Properly allocate data in packet
--HG--
extra : convert_revision : dedf8b0f76ab90b06b60f8fe079c0ae361f91a48
|
|
Missing some functionality (like split caches and copy support)
src/SConscript:
Typo
src/mem/cache/prefetch/base_prefetcher.cc:
src/mem/cache/prefetch/ghb_prefetcher.hh:
src/mem/cache/prefetch/stride_prefetcher.hh:
src/mem/cache/prefetch/tagged_prefetcher_impl.hh:
src/mem/cache/tags/fa_lru.cc:
src/mem/cache/tags/fa_lru.hh:
src/mem/cache/tags/iic.cc:
src/mem/cache/tags/iic.hh:
src/mem/cache/tags/lru.cc:
src/mem/cache/tags/lru.hh:
src/mem/cache/tags/split.cc:
src/mem/cache/tags/split.hh:
src/mem/cache/tags/split_lifo.cc:
src/mem/cache/tags/split_lifo.hh:
src/mem/cache/tags/split_lru.cc:
src/mem/cache/tags/split_lru.hh:
src/mem/packet.hh:
src/mem/request.hh:
Fix so it compiles
--HG--
extra : convert_revision : 0d87d84f6e9445bab655c0cb0f8541bbf6eab904
|
|
base_cache.cc compiles, continuing on
src/SConscript:
Add in compilation flags for cache files
src/mem/cache/base_cache.cc:
src/mem/cache/base_cache.hh:
Back in more fixes, now base_cache compiles
src/mem/cache/cache.hh:
src/mem/cache/cache_blk.hh:
src/mem/cache/cache_impl.hh:
src/mem/cache/coherence/coherence_protocol.cc:
src/mem/cache/miss/blocking_buffer.cc:
src/mem/cache/miss/blocking_buffer.hh:
src/mem/cache/miss/miss_queue.cc:
src/mem/cache/miss/miss_queue.hh:
src/mem/cache/miss/mshr.cc:
src/mem/cache/miss/mshr.hh:
src/mem/cache/miss/mshr_queue.cc:
src/mem/cache/miss/mshr_queue.hh:
src/mem/cache/prefetch/base_prefetcher.cc:
src/mem/cache/tags/fa_lru.cc:
src/mem/cache/tags/iic.cc:
src/mem/cache/tags/lru.cc:
src/mem/cache/tags/split_lifo.cc:
src/mem/cache/tags/split_lru.cc:
src/mem/packet.cc:
src/mem/packet.hh:
src/mem/request.hh:
Backing in more changsets, getting closer to compile
--HG--
extra : convert_revision : ac2dcda39f8d27baffc4db1df17b9a1fcce5b6ed
|
|
and will add back in the patches to make it work soon.
src/mem/cache/prefetch/tagged_prefetcher_impl.hh:
Trying to merge
src/mem/cache/base_cache.cc:
src/mem/cache/base_cache.hh:
src/mem/cache/cache.cc:
src/mem/cache/cache.hh:
src/mem/cache/cache_blk.hh:
src/mem/cache/cache_builder.cc:
src/mem/cache/cache_impl.hh:
src/mem/cache/coherence/coherence_protocol.cc:
src/mem/cache/coherence/coherence_protocol.hh:
src/mem/cache/coherence/simple_coherence.hh:
src/mem/cache/coherence/uni_coherence.cc:
src/mem/cache/coherence/uni_coherence.hh:
src/mem/cache/miss/blocking_buffer.cc:
src/mem/cache/miss/blocking_buffer.hh:
src/mem/cache/miss/miss_queue.cc:
src/mem/cache/miss/miss_queue.hh:
src/mem/cache/miss/mshr.cc:
src/mem/cache/miss/mshr.hh:
src/mem/cache/miss/mshr_queue.cc:
src/mem/cache/miss/mshr_queue.hh:
src/mem/cache/prefetch/base_prefetcher.cc:
src/mem/cache/prefetch/base_prefetcher.hh:
src/mem/cache/prefetch/ghb_prefetcher.cc:
src/mem/cache/prefetch/ghb_prefetcher.hh:
src/mem/cache/prefetch/stride_prefetcher.cc:
src/mem/cache/prefetch/stride_prefetcher.hh:
src/mem/cache/prefetch/tagged_prefetcher.hh:
src/mem/cache/tags/base_tags.cc:
src/mem/cache/tags/base_tags.hh:
src/mem/cache/tags/fa_lru.cc:
src/mem/cache/tags/fa_lru.hh:
src/mem/cache/tags/iic.cc:
src/mem/cache/tags/iic.hh:
src/mem/cache/tags/lru.cc:
src/mem/cache/tags/lru.hh:
src/mem/cache/tags/repl/gen.cc:
src/mem/cache/tags/repl/gen.hh:
src/mem/cache/tags/repl/repl.cc:
src/mem/cache/tags/repl/repl.hh:
src/mem/cache/tags/split.cc:
src/mem/cache/tags/split.hh:
src/mem/cache/tags/split_blk.hh:
src/mem/cache/tags/split_lifo.cc:
src/mem/cache/tags/split_lifo.hh:
src/mem/cache/tags/split_lru.cc:
src/mem/cache/tags/split_lru.hh:
Pulling an early version of the cache into the tree due to merging issues. Will apply patches and push.
--HG--
extra : convert_revision : 3276e5fb9a6272681a1690babf2b586dd0e1f380
|