Age | Commit message (Collapse) | Author |
|
Request and Packet for squashed HWPrefetches were not deleted
Change-Id: I9b66bb01b8ed6a5ddfaaa8739a68165dc4a7006c
Signed-off-by: Pau Cabre <pau.cabre@metempsy.com>
Reviewed-on: https://gem5-review.googlesource.com/4340
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
|
Change-Id: I6a3a57e3067c247bd6ce6f01ac9459883f4aae2c
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3880
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
|
NOTE: With this change there is a possibility for `DRAMCtrl::Rank`s
event names to not properly match the rank they were generated by. This
could occur if the public rank member is modified after the Rank's
construction. A patch would mean refactoring Rank and `DRAMCtrl`b to
privatize many of the members of Rank behind getters.
Change-Id: I7b8bd15086f4ffdfd3f40be4aeddac5e786fd78e
Signed-off-by: Sean Wilson <spwilson2@wisc.edu>
Reviewed-on: https://gem5-review.googlesource.com/3745
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
|
blkAlign was defined as a separate function in the base associative
and fully-associative tags classes although both functions implemented
identical functionality. This patch moves the blkAlign in the base
tags class.
Change-Id: I3d415d0e62bddeec7ce0d559667e40a8c5fdc2d4
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
|
|
Change-Id: I0ed4e528cb750a323facdc811dde7f0ed1ff228e
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
|
|
Change-Id: I6149290d6d2ac1a4bd6165871c93d7b7d6a980ad
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: I29f45733c5fad822bdd0d8dcc7939d86b2e8c97b
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: I79c2662fc81630ab321db8a75be6cd15fa07d372
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: If9ebb8488e8db587482ecfa99d2c12cfe5734fb9
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: I4f3c2c027b1acaaf791a4c71086f34a9b9fbf4df
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Policies like the LRU need to be notified when a block is invalidated,
the helper function does this along with invalidating the block.
Change-Id: I3ed59cf07938caa7f394ee6054b0af9e00b267ea
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
This changeset updates an assert in src/mem/cache/mshr.cc which was
erroneously catching invalidated prefetch requests. These requests can
become invalidated if another component writes (an exclusive access)
to this location during the time that the read request is in
flight. The original assert made the assumption that these cases can
only occur for reads generated by the CPU, and hence
prefetcher-generated requests would sometimes trip the assert.
Change-Id: If4f043273a688c2bab8f7a641192a2b583e7b20e
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Previously the writeback visitor would not consider and set the secure
flag for the blocks that are written back to memory. This patch fixes
this.
Change-Id: Ie1a425fa9211407a70a4343f2c6b3d073371378f
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: I7ae5fa44a937f641a2ddd242a49e0cd23f68b9f2
Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
The traversal of drainable objects could potentially be
non-deterministic when using an unordered set containing object
pointers. To ensure that the iteration is deterministic, we switch to
a vector. Note that the lookup and traversal of the drainable objects
is not performance critical, so the change has no negative consequences.
|
|
This patch fixes a bug where a deferred snoop ended up being to a
partial cache line, and not cache-line aligned, all due to how we copy
the packet.
|
|
Rather than having the 1st line on the Log line and every other line on its
own, add a new line to have a common format for all of them. Makes parsing
a lot easier.
Reviewed at http://reviews.gem5.org/r/3808/
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
|
|
Used cppclean to help identify useless includes and removed them. This
involved erroneously included headers, but also cases where forward
declarations could have been used rather than a full include.
|
|
|
|
Previously when an InvalidateReq snooped a cache with a dirty block or
a pending modified MSHR, it would invalidate the block or set the
postInv flag. The cache would not send an InvalidateResp. though,
causing memory order violations. This patches changes this behavior,
making the cache with the dirty block or pending modified MSHR the
ordering point.
Change-Id: Ib4c31012f4f6693ffb137cd77258b160fbc239ca
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
|
|
Previously an MSHR with one or more invalidating targets would first
service all targets in the MSHR TargetList and then invalidate the
block. As a result any service snooping targets would lookup in the
cache and incorrectly find the block. This patch forces the
invalidation to happen when the first invalidating target is
encountered.
Change-Id: I9df15de24e1d351cd96f5a2c424d9a03d81c2cce
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
|
|
This patch changes an assertion that previously assumed that a non
invalidating snoop request should never be serviced by an
InvalidateReq MSHR. The MSHR serves as the ordering point for the
snooping packet. When the InvalidateResp reaches the cache the
snooping packet snoops the caches above to find the requested
block. One or more of the caches above will have the block since
earlier it has seen a WriteLineReq.
Change-Id: I0c147c8b5d5019e18bd34adf9af0fccfe431ae07
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
|
|
Previously, a WriteLineReq that missed in a cache would send out an
InvalidateReq if the block lookup failed or an UpgradeReq if the
block lookup succeeded but the block had sharers. This changes ensures
that a WriteLineReq always sends an InvalidateReq to invalidate all
copies of the block and satisfy the WriteLineReq.
Change-Id: I207ff5b267663abf02bc0b08aeadde69ad81be61
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
|
|
This patch fixes an issue where an MSHR would incorrectly be perceived
to provide data to targets arriving after an InvalidateReq. To address
this the InvalidateReq is now treated as isForward, much like an
UpgradeReq that did not hit in the cache.
Change-Id: Ia878444d949539b5c33fd19f3e12b0b8a872275e
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
|
|
Previously DPRINTFs printing information about a packet would use ad hoc
formats. This patch changes all DPRINTFs to use the print function
defined by the packet class, making the packet printing format more
uniform and easier to change.
Change-Id: Idd436a9758d4bf70c86a574d524648b2a2580970
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
|
|
A response to a ReadReq can either be a ReadResp or a
ReadRespWithInvalidate. As we add targets to an MSHR for a ReadReq we
assume that the response will be a ReadResp. When the response is
invalidating (ReadRespWithInvalidate) servicing more than one targets
can potentially violate the memory ordering. This change fixes the way
we handle a ReadRespWithInvalidate. When a cache receives a
ReadRespWithInvalidate we service only the first FromCPU target and
all the FromSnoop targets from the MSHR target list. The rest of the
FromCPU targets are deferred and serviced by a new request.
Change-Id: I75c30c268851987ee5f8644acb46f440b4eeeec2
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
|
|
Previously the information of whether a response was allocating or not
was a property of the MSHR. This change makes this flag a property of
the TargetList. Differernt TargetLists, e.g. the targets and the
deferred targets lists might have different values. Additionally, the
information about whether each of the target expects an allocating
response is stored inside the TargetList container. This allows for
repopulating the flag in case some of the targets are removed.
Change-Id: If3ec2516992f42a6d9da907009ffe3ab8d0d2021
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
|
|
This patch adds support for repopulating the flags of an MSHR
TargetList. The added functionality makes it possible to remove
targets from a TargetList without leaving it in an inconsistent state.
Change-Id: I3f7a8e97bfd3e2e49bebad056d11bbfb087aad91
Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
|
|
If the cache access mode is parallel, i.e. "sequential_access" parameter
is set to "False", tags and data are accessed in parallel. Therefore,
the hit_latency is the maximum latency between tag_latency and
data_latency. On the other hand, if the cache access mode is
sequential, i.e. "sequential_access" parameter is set to "True",
tags and data are accessed sequentially. Therefore, the hit_latency
is the sum of tag_latency plus data_latency.
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
|
|
Previously printing an mshr would trigger an assertion if the MSHR was
not in service or if the targets list was empty. This patch changes
the print function to bypasses the accessor functions for
postInvalidate and postDowngrade and avoid the relevant assertions. It
also checks if the targets list is empty before calling print on it.
Change-Id: Ic18bee6cb088f63976112eba40e89501237cfe62
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
This patch takes yet another step in maintaining the clusivity, in
that it allows a mostly-inclusive cache to hold on to blocks even when
responding to a ReadExReq or UpgradeReq. Previously the cache simply
invalidated these blocks, but there is no strict need to do so.
The most important part of this patch is that we simply mark the block
clean when satisfying the upstream request where the cache is allowed
to keep the block. The only tricky part of the patch is in the memory
management of deferred snoops, where we need to distinguish the cases
where only the packet was copied (we expected to respond), and the
cases where we created an entirely new packet and request (we kept it
only to replay later).
The code in satisfyRequest is definitely ready for some refactoring
after this.
Change-Id: I201ddc7b2582eaa46fb8cff0c7ad09e02d64b0fc
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com>
|
|
This patch changes how the mostly exclusive policy is enforced to
ensure that we drop blocks when we should. As part of this change, the
actual invalidation due to the clusivity enforcement is moved outside
the hit handling, to a separate method maintainClusivity. For the
timing mode that means we can deal with all MSHR targets before taking
any action and possibly dropping the block. The method
satisfyCpuSideRequest is also renamed satisfyRequest as part of this
change (since we only ever see requests from the cpu-side port).
Change-Id: If6f3d1e0c3e7be9a67b72a55e4fc2ec4a90fd3d2
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com>
|
|
This patch adds a FromCache attribute to the packet, and updates a
number of the existing request commands to reflect that the request
originates from a cache. The attribute simplifies checking if a
requests came from a cache or not, and this is used by both the cache
and snoop filter in follow-on patches.
Change-Id: Ib0a7a080bbe4d6036ddd84b46fd45bc7eb41cd8f
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com>
Reviewed-by: Steve Reinhardt <stever@gmail.com>
|
|
Change-Id: I70dd11c23b45dfc606ef08233d2e50fcc0817505
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
We want to extend the stats of objects hierarchically and thus it is necessary
to register the statistics of the base-class(es), as well. For now, these are
empty, but generic stats will be added there.
Patch originally provided by Akash Bagdia at ARM Ltd.
|
|
This patch fixes a memory leak where deferred snoop packets never got
deallocated. On the call to MSHR::handleSnoop these snoops were
treated as if a response will be sent, as the MSHR was
pendingModified. Consequently, a copy of the packet was created and
added to the MSHR targets. However, an preceeding target to the same
MSHR, originally from a CPU, was serviced before the snoop, and caused
the block to be invalidated. This happens for ReadExReq and
UpgradeReq.
Note that the original snoop will receive a response, just not from
the cache in question, but instead from the cache upstream that issued
the ReadExReq or UpgradeReq.
Change-Id: I4ac012fbc8a46cf693ca390fe9476105d444e6f4
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
|
This patch changes the flow control for HSHR::handleSnoop to ensure
that we only set cacheResponding on the snoop packet if we are
actually responding. This avoids situations where a responder is
stalling indefinitely on a response that never arrives.
Change-Id: I691dd01755b614b30203581aa74fc743b350eacc
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
|
|
Change-Id: Ia57cc104978861ab342720654e408dbbfcbe4b69
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: I57b56771086e1e2f512977fb7248d93c171ab925
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: I5042410be54935650b7d05c84d8d9efbfcc06e70
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: I6d1feb164a958dde0da87a1cd2698096112c4a82
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Somehow the WriteLineReq were never added to the list of commands
considered demand.
|
|
Prune cache stats that are never actually used.
|
|
This patch removes the write-queue entry tracking previously used for
uncacheable writes. The write-queue entry is now deallocated as soon
as the packet is sent. As a result we also forego the stats for
uncacheable writes. Additionally, there is no longer a need to attach
the write-queue entry to the packet.
|
|
This patch makes the control flow more uniform in atomic and timing,
ultimately making the code easier to understand.
|
|
Queued prefetcher entries now count with a priority field. The idea is to
add packets ordered by priority and then by age.
For the existing algorithms in which priority doesn't make sense, it is set
to 0 for all deferred packets in the queue.
|
|
Some common functionality added to the base prefetcher, mainly dealing with
extracting the block address, page address, block index inside the page and
some other information that can be inferred from the block address. This is
used for some prefetching algorithms, and having the methods in the base,
as well as the block size and other information is the sensible way.
|
|
Added stat to the cache to account for HardPF'ed blocks that are evicted
before being referenced (over-prefetching).
|
|
In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a cpu.
This is a re-spin of 20264eb after the revert (bd1c6789) and includes
some fixes of that commit.
|
|
The following patches had unexpected interactions with the current
upstream code and have been reverted for now:
e07fd01651f3: power: Add support for power models
831c7f2f9e39: power: Low-power idle power state for idle CPUs
4f749e00b667: power: Add power states to ClockedObject
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
--HG--
extra : amend_source : 0b6fb073c6bbc24be533ec431eb51fbf1b269508
|