Age | Commit message (Collapse) | Author |
|
|
|
switching between O3 and another CPU, O3's tick event might still be scheduled
in the event queue (as squashed). Therefore, check for a squashed tick event
as well as a non-scheduled event when taking over from another CPU and deal
with it accordingly.
|
|
|
|
m5 doesnt do stats specific to binary and this resource request stat is probably only
useful for people who really know the ins/outs of the model anyway
|
|
the nextPC was getting sent to the branch predictor not the current PC, so
the RAS was returning the wrong PC and mispredicting everything.
|
|
replace priority queue with vector of lists(1 list per stage) and place inside a class
so that we have more control of when an instruction uses a particular schedule entry
...
also, this is the 1st step toward making the InOrderCPU fully parameterizable. See the
wiki for details on this process
|
|
remove the annotation 'virtual' from function declaration that isnt being derived from
|
|
|
|
|
|
this applies to multithreading models which would like to squash a thread on memory stall
|
|
|
|
- use InOrderBPred instead of Resource for DPRINTFs
- account for DELAY SLOT in updating RAS and in squashing
- don't let squashed instructions update the predictor
- the BTB needs to use the ASID not the TID to work for multithreaded programs
- add stats for BTB hits
|
|
also, remove inst-req stats as default.good for debugging
but in terms of pure processor stats they aren't useful
|
|
remove stall only when necessary
add debugging printfs
|
|
use nextCycle to calculate ticks after addition
|
|
the system pointers match in Full System mode.
|
|
|
|
|
|
add a couple of helper functions to base for deleteing all pointers in
a container and outputting containers to a stream
|
|
Expand the help text on the --remote-gdb-port option so
people know you can use it to disable remote gdb without
reading the source code, and thus don't waste any time
trying to add a separate option to do that.
Clean up some gdb-related cruft I found while looking
for where one would add a gdb disable option, before
I found the comment that told me that I didn't need
to do that.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Suppose the saturating counters of a branch predictor contain n bits. When the
counter is between 0 and (2^(n-1) - 1), boundaries included, the branch is
predicted as not taken. When the counter is between 2^(n-1) and (2^n - 1),
boundaries included, the branch is predicted as taken.
|
|
|
|
when insts execute, they mark the time they finish to be used for subsequent isnts
they may need forwarding of data. However, the regdepmap was using the wrong
value to index into the destination operands of the instruction to be forwarded.
Thus, in some cases, we are checking to see if the 3rd destination register
for an instruction is executed at a certain time, when there is only 1 dest. register
valid. Thus, we get a bad, uninitialized time value that will stall forwarding
causing performance loss but still the correct execution.
|
|
|
|
In addition to obvious changes, this required a slight change to the slicc
grammar to allow types with :: in them. Otherwise slicc barfs on std::string
which we need for the headers that slicc generates.
|
|
|
|
|
|
make sure to only read 1 src reg. for write-hint and any other similar
'store' instruction. Reading the source reg when its not necessary
can cause the simulator to read from uninitialized values
|
|
|
|
These recordEvent() calls could cause crashes since they
access the req pointer after it's potentially been
deleted during a failed translation call. (Similar
problem to the traceData bug fixed in the previous cset.)
Moving them above the translation call (as was done
recentlyi in cset 8b2b8e5e7d35) avoids the crash
but doesn't work, since at that point we don't know if
the access is uncached or not.
It's not clear why these calls are there, and no one
seems to use them, so we'll just delete them. If they
are needed, they should be moved to somewhere that's
guaranteed to be after the translation completes but
before the request is possibly deleted, e.g., in
finishTranslation().
|
|
Accessing traceData (to call setAddress() and/or setData())
after initiating a timing translation was causing crashes,
since a failed translation could delete the traceData
object before returning.
It turns out that there was never a need to access traceData
after initiating the translation, as the traced data was
always available earlier; this ordering was merely
historical. Furthermore, traceData->setAddress() and
traceData->setData() were being called both from the CPU
model and the ISA definition, often redundantly.
This patch standardizes all setAddress and setData calls
for memory instructions to be in the CPU models and not
in the ISA definition. It also moves those calls above
the translation calls to eliminate the crashes.
|
|
|
|
|
|
|
|
Previously the recording of an uncached read occurred after the request was
possibly deleted within the translateTiming function.
|
|
Do not use "using namespace std;" in headers
Include header files as needed
|
|
|
|
When implementing timing address translations instead of atomic, I
forgot to preserve the faults that are returned from the read and
write calls. This patch reinstates them.
|
|
When each load or store is sent to the LSQ, we check whether it will cross a
cache line boundary and, if so, split it in two. This creates two TLB
translations and two memory requests. Care has to be taken if the first
packet of a split load is sent but the second blocks the cache. Similarly,
for a store, if the first packet cannot be sent, we must store the second
one somewhere to retry later.
This modifies the LSQSenderState class to record both packets in a split
load or store.
Finally, a new const variable, HasUnalignedMemAcc, is added to each ISA
to indicate whether unaligned memory accesses are allowed. This is used
throughout the changed code so that compiler can optimise away code dealing
with split requests for ISAs that don't need them.
|
|
This initiates a timing translation and passes the read or write on to the
processor before waiting for it to finish. Once the translation is finished,
the instruction's state is updated via the 'finish' function. A new
DataTranslation class is created to handle this.
The idea is taken from the implementation of timing translations in
TimingSimpleCPU by Gabe Black. This patch also separates out the timing
translations from this CPU and uses the new DataTranslation class.
|
|
Make sure that instructions are dereferenced/deleted twice by marking they are
on the remove list
|
|
|
|
|