Age | Commit message (Collapse) | Author |
|
Block allocation can fail when there is an in-service MSHR that
operates on the victim block. This can happed due to:
* an upgrade operation: a request that needs a writable copy of the
block finds a shared (non-writable) copy of the block in the cache
and has allocates an MSHR for the pending upgrade operation, or
* a clean operation: a clean request finds a dirty copy of the block
and allocates an MSHR for the pending clean operation.
This changes relaxes an assertion to allow for the 2nd case (cache
clean operations).
Change-Id: Ib51482160b5f2b3702ed744b0eac2029d34bc9d4
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/9021
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Gabe Black <gabeblack@google.com>
|
|
A response to a ReadReq can either be a ReadResp or a
ReadRespWithInvalidate. As we add targets to an MSHR for a ReadReq we
assume that the response will be a ReadResp. When the response is
invalidating (ReadRespWithInvalidate) servicing more than one targets
can potentially violate the memory ordering. This change fixes the way
we handle a ReadRespWithInvalidate. When a cache receives a
ReadRespWithInvalidate we service only the first FromCPU target and
all the FromSnoop targets from the MSHR target list. The rest of the
FromCPU targets are deferred and serviced by a new request.
Change-Id: I75c30c268851987ee5f8644acb46f440b4eeeec2
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
|
|
Previously the information of whether a response was allocating or not
was a property of the MSHR. This change makes this flag a property of
the TargetList. Differernt TargetLists, e.g. the targets and the
deferred targets lists might have different values. Additionally, the
information about whether each of the target expects an allocating
response is stored inside the TargetList container. This allows for
repopulating the flag in case some of the targets are removed.
Change-Id: If3ec2516992f42a6d9da907009ffe3ab8d0d2021
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
|
|
This patch adds support for repopulating the flags of an MSHR
TargetList. The added functionality makes it possible to remove
targets from a TargetList without leaving it in an inconsistent state.
Change-Id: I3f7a8e97bfd3e2e49bebad056d11bbfb087aad91
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
|
|
This patch breaks out the cache write buffer into a separate class,
without affecting any stats. The goal of the patch is to avoid
encumbering the much-simpler write queue with the complex MSHR
handling. In a follow on patch this simplification allows us to
implement write combining.
The WriteQueue gets its own class, but shares a common ancestor, the
generic Queue, with the MSHRQueue.
|
|
This patch adds assertions that enforce that only invalidating snoops
will ever reach into the logic that tracks in-order load completion and
also invalidation of LL/SC (and MONITOR / MWAIT) monitors. Also adds
some comments to MSHR::replaceUpgrades().
|
|
This patch changes the name of a bunch of packet flags and MSHR member
functions and variables to make the coherency protocol easier to
understand. In addition the patch adds and updates lots of
descriptions, explicitly spelling out assumptions.
The following name changes are made:
* the packet memInhibit flag is renamed to cacheResponding
* the packet sharedAsserted flag is renamed to hasSharers
* the packet NeedsExclusive attribute is renamed to NeedsWritable
* the packet isSupplyExclusive is renamed responderHadWritable
* the MSHR pendingDirty is renamed to pendingModified
The cache states, Modified, Owned, Exclusive, Shared are also called
out in the cache and MSHR code to make it easier to understand.
|
|
This patch removes the unused squash function from the MSHR queue, and
the associated (and also unused) threadNum member from the MSHR.
|
|
This patch adds a parameter to control the cache clusivity, that is if
the cache is mostly inclusive or exclusive. At the moment there is no
intention to support strict policies, and thus the options are: 1)
mostly inclusive, or 2) mostly exclusive.
The choice of policy guides the behaviuor on a cache fill, and a new
helper function, allocOnFill, is created to encapsulate the decision
making process. For the timing mode, the decision is annotated on the
MSHR on sending out the downstream packet, and in atomic we directly
pass the decision to handleFill. We (ab)use the tempBlock in cases
where we are not allocating on fill, leaving the rest of the cache
unaffected. Simple and effective.
This patch also makes it more explicit that multiple caches are
allowed to consider a block writable (this is the case
also before this patch). That is, for a mostly inclusive cache,
multiple caches upstream may also consider the block exclusive. The
caches considering the block writable/exclusive all appear along the
same path to memory, and from a coherency protocol point of view it
works due to the fact that we always snoop upwards in zero time before
querying any downstream cache.
Note that this patch does not introduce clean writebacks. Thus, for
clean lines we are essentially removing a cache level if it is made
mostly exclusive. For example, lines from the read-only L1 instruction
cache or table-walker cache are always clean, and simply get dropped
rather than being passed to the L2. If the L2 is mostly exclusive and
does not allocate on fill it will thus never hold the line. A follow
on patch adds the clean writebacks.
The patch changes the L2 of the O3_ARM_v7a CPU configuration to be
mostly exclusive (and stats are affected accordingly).
|
|
This patch addresses the upgrading of deferred targets in the MSHR,
and makes it clearer by explicitly calling out what is happening
(deferred targets are promoted if we get exclusivity without asking
for it).
|
|
This patch updates the iterators in the MSHR and MSHR queues to use
C++11 range-based for loops. It also does a bit of additional house
keeping.
|
|
This patch aligns all MSHR queue entries to block boundaries to
simplify checks for matches. Previously there were corner cases that
could lead to existing entries not being identified as matches.
There are, rather alarmingly, a few regressions that change with this
patch.
|
|
This patch adds a bit of clarification around the assumptions made in
the cache when packets are sent out, and dirty responses are
pending. As part of the change, the marking of an MSHR as in service
is simplified slightly, and comments are added to explain what
assumptions are made.
|
|
Prepare for a different implementation following in the next patch
|
|
WriteInvalidate semantics depend on the unconditional writeback
or they won't complete. Also, there's no point in deferring snoops
on their MSHRs, as they don't get new data at the end of their life
cycle the way other transactions do.
Add comment in the cache about a minor inefficiency re: WriteInvalidate.
|
|
Since WriteInvalidate directly writes into the cache, it can
create tricky timing interleavings with reads and writes to the
same cache line that haven't yet completed. This patch ensures
that these requests, when completed, don't overwrite the newer
data from the WriteInvalidate.
|
|
This patch adds the basic building blocks required to support e.g. ARM
TrustZone by discerning secure and non-secure memory accesses.
|
|
This patch does some minor tidying up of the MSHR and MSHRQueue. The
clean up started as part of some ad-hoc tracing and debugging, but
seems worthwhile enough to go in as a separate patch.
The highlights of the changes are reduced scoping (private) members
where possible, avoiding redundant new/delete, and constructor
initialisation to please static code analyzers.
|
|
This patch provides useful printouts throughut the memory system. This
includes pretty-printed cache tags and function call messages
(call-stack like).
|
|
|
|
This step makes it easy to replace the accessor functions
(which still access a global variable) with ones that access
per-thread curTick values.
|
|
Allow lower-level caches (e.g., L2 or L3) to pass exclusive
copies to higher levels (e.g., L1). This eliminates a lot
of unnecessary upgrade transactions on read-write sequences
to non-shared data.
Also some cleanup of MSHR coherence handling and multiple
bug fixes.
|
|
|
|
|
|
Apparently we broke it with the cache rewrite and never noticed.
Thanks to Bao Yungang <baoyungang@gmail.com> for a significant part
of these changes (and for inspiring me to work on the rest).
Some other overdue cleanup on the prefetch code too.
|
|
Makes adding write-through operations easier.
|
|
--HG--
rename : src/mem/cache/base_cache.cc => src/mem/cache/base.cc
rename : src/mem/cache/base_cache.hh => src/mem/cache/base.hh
rename : src/mem/cache/cache_blk.cc => src/mem/cache/blk.cc
rename : src/mem/cache/cache_blk.hh => src/mem/cache/blk.hh
rename : src/mem/cache/cache_builder.cc => src/mem/cache/builder.cc
rename : src/mem/cache/miss/mshr.cc => src/mem/cache/mshr.cc
rename : src/mem/cache/miss/mshr.hh => src/mem/cache/mshr.hh
rename : src/mem/cache/miss/mshr_queue.cc => src/mem/cache/mshr_queue.cc
rename : src/mem/cache/miss/mshr_queue.hh => src/mem/cache/mshr_queue.hh
rename : src/mem/cache/prefetch/base_prefetcher.cc => src/mem/cache/prefetch/base.cc
rename : src/mem/cache/prefetch/base_prefetcher.hh => src/mem/cache/prefetch/base.hh
rename : src/mem/cache/prefetch/ghb_prefetcher.cc => src/mem/cache/prefetch/ghb.cc
rename : src/mem/cache/prefetch/ghb_prefetcher.hh => src/mem/cache/prefetch/ghb.hh
rename : src/mem/cache/prefetch/stride_prefetcher.cc => src/mem/cache/prefetch/stride.cc
rename : src/mem/cache/prefetch/stride_prefetcher.hh => src/mem/cache/prefetch/stride.hh
rename : src/mem/cache/prefetch/tagged_prefetcher.cc => src/mem/cache/prefetch/tagged.cc
rename : src/mem/cache/prefetch/tagged_prefetcher.hh => src/mem/cache/prefetch/tagged.hh
rename : src/mem/cache/tags/base_tags.cc => src/mem/cache/tags/base.cc
rename : src/mem/cache/tags/base_tags.hh => src/mem/cache/tags/base.hh
rename : src/mem/cache/tags/Repl.py => src/mem/cache/tags/iic_repl/Repl.py
rename : src/mem/cache/tags/repl/gen.cc => src/mem/cache/tags/iic_repl/gen.cc
rename : src/mem/cache/tags/repl/gen.hh => src/mem/cache/tags/iic_repl/gen.hh
rename : src/mem/cache/tags/repl/repl.hh => src/mem/cache/tags/iic_repl/repl.hh
extra : convert_revision : ff7a35cc155a8d80317563c45cebe405984eac62
|