Age | Commit message (Collapse) | Author |
|
This patch fixes a long-standing isue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.
The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.
The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fixes
the previously seen deadlocks.
|
|
This patch fixes a case where a store in Minor's store buffer never
leaves the store buffer as it is pre-maturely counted as having been
issued, leading to the store buffer idling.
LSQ::StoreBuffer::numUnissuedAccesses should count the number of accesses
either in memory, or still in the store buffer after being completed.
For stores which are also barriers, the store will stay in the store
buffer for a cycle after it is completed and will be cleaned up by the
barrier clearing code (to ensure that barriers are completed in-order).
To acheive this, numUnissuedAccesses is not decremented when a store-barrier
is issued to memory, but when its barrier effect is cleared.
Without this patch, the correct behaviour happens when a memory transaction
is immediately accepted, but not if it needs a retry.
|
|
This patch changes how faults are passed between methods in an attempt
to copy as few reference-counting pointer instances as possible. This
should avoid unecessary copies being created, contributing to the
increment/decrement of the reference counters.
|
|
This patch fixes cases where uncacheable/memory type flags are not set
correctly on a memory op which is split in the LSQ. Without this
patch, request->request if freely used to check flags where the flags
should actually come from the accumulation of request fragment flags.
This patch also fixes a bug where an uncacheable access which passes
through tryToSendRequest more than once can increment
LSQ::numAccessesInMemorySystem more than once.
|
|
This patch contains a new CPU model named `Minor'. Minor models a four
stage in-order execution pipeline (fetch lines, decompose into
macroops, decompose macroops into microops, execute).
The model was developed to support the ARM ISA but should be fixable
to support all the remaining gem5 ISAs. It currently also works for
Alpha, and regressions are included for ARM and Alpha (including Linux
boot).
Documentation for the model can be found in src/doc/inside-minor.doxygen and
its internal operations can be visualised using the Minorview tool
utils/minorview.py.
Minor was designed to be fairly simple and not to engage in a lot of
instruction annotation. As such, it currently has very few gathered
stats and may lack other gem5 features.
Minor is faster than the o3 model. Sample results:
Benchmark | Stat host_seconds (s)
---------------+--------v--------v--------
(on ARM, opt) | simple | o3 | minor
| timing | timing | timing
---------------+--------+--------+--------
10.linux-boot | 169 | 1883 | 1075
10.mcf | 117 | 967 | 491
20.parser | 668 | 6315 | 3146
30.eon | 542 | 3413 | 2414
40.perlbmk | 2339 | 20905 | 11532
50.vortex | 122 | 1094 | 588
60.bzip2 | 2045 | 18061 | 9662
70.twolf | 207 | 2736 | 1036
|