Age | Commit message (Collapse) | Author |
|
This patch takes the final step in removing the InOrderCPU from the
tree. Rest in peace.
The MinorCPU is now used to model an in-order microarchitecture, and
long term the MinorCPU will eventually be renamed InOrderCPU.
|
|
Using '== true' in a boolean expression is totally redundant,
and using '== false' is pretty verbose (and arguably less
readable in most cases) compared to '!'.
It's somewhat of a pet peeve, perhaps, but I had some time
waiting for some tests to run and decided to clean these up.
Unfortunately, SLICC appears not to have the '!' operator,
so I had to leave the '== false' tests in the SLICC code.
|
|
Make these names more meaningful.
Specifically, made these substitutions:
s/FP_Base_DepTag/FP_Reg_Base/g;
s/Ctrl_Base_DepTag/Misc_Reg_Base/g;
s/Max_DepTag/Max_Reg_Index/g;
|
|
The previous patch introduced a RegClass enum to clean
up register classification. The inorder model already
had an equivalent enum (RegType) that was used internally.
This patch replaces RegType with RegClass to get rid
of the now-redundant code.
|
|
This patch addresses the comments and feedback on the preceding patch
that reworks the clocks and now more clearly shows where cycles
(relative cycle counts) are used to express time.
Instead of bumping the existing patch I chose to make this a separate
patch, merely to try and focus the discussion around a smaller set of
changes. The two patches will be pushed together though.
This changes done as part of this patch are mostly following directly
from the introduction of the wrapper class, and change enough code to
make things compile and run again. There are definitely more places
where int/uint/Tick is still used to represent cycles, and it will
take some time to chase them all down. Similarly, a lot of parameters
should be changed from Param.Tick and Param.Unsigned to
Param.Cycles.
In addition, the use of curTick is questionable as there should not be
an absolute cycle. Potential solutions can be built on top of this
patch. There is a similar situation in the o3 CPU where
lastRunningCycle is currently counting in Cycles, and is still an
absolute time. More discussion to be had in other words.
An additional change that would be appropriate in the future is to
perform a similar wrapping of Tick and probably also introduce a
Ticks class along with suitable operators for all these classes.
|
|
handle them like we do in FS mode, by blocking the TLB until the fault
is handled by the fault->invoke()
|
|
this will make sure we get the correct view of a FP register
|
|
including IPR accesses and store-conditionals. These class of instructions will not
execute correctly in a superscalar machine
|
|
if a faulting instruction reaches an execution unit,
then ignore it and pass it through the pipeline.
Once we recognize the fault in the graduation unit,
dont allow a second fault to creep in on the same cycle.
|
|
Sharing the FP value w/the integer values was giving inconsistent results esp. when
their is a 32-bit integer register matched w/a 64-bit float value
|
|
this causes forwarding a bad value register value
|
|
use it in reg. dep. tracking
|
|
formerly, this was implicit when you accessed the execution unit
or the use-def unit but it's better that this just be something
that a user can specify.
|
|
|
|
keep stats for int/float reg file usage instead
of aggregating across reg file types
|
|
Architectures like SPARC need to read the window pointer
in order to figure out it's register dependence. However,
this may not get updated until after an instruction gets
executed, so now we lazily detect the register dependence
in the EXE stage (execution unit or use_def). This
makes sure we get the mapping after the most current change.
|
|
call trap function when a fault is received
|
|
|
|
- also use "threadId()" instead of readTid() everywhere
- this will help support more complex ISA indexing
|
|
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
|
|
|
|
Because int and not InstSeqNum was used in a couple of places, you can
overflow the int type and thus get wierd bugs when the sequence number
is negative (or some wierd value)
|
|
if a resource has a zero cycle latency (e.g. RegFile write), then dont allocate an event
for it to use
|
|
take away all instances of reqMap in the code and make all references use the built-in
request vectors inside of each resource. The request map was dynamically allocating
a request per instruction. The request vector just allocates N number of requests
during instantiation and then the surrounding code is fixed up to reuse those N requests
***
setRequest() and clearRequest() are the new accessors needed to define a new
request in a resource
|
|
we are going to be getting away from creating new resource requests for every
instruction so no more need to keep track of a reqRemoveList and clean it up
every tick
|
|
first change in an optimization that will stop InOrder from allocating new memory for every instruction's
request to a resource. This gets expensive since every instruction needs to access ~10 requests before
graduation. Instead, the plan is to allocate just enough resource request objects to satisfy each resource's
bandwidth (e.g. the execution unit would need to allocate 3 resource request objects for a 1-issue pipeline
since on any given cycle it could have 2 read requests and 1 write request) and then let the instructions
contend and reuse those allocated requests. The end result is a smaller memory footprint for the InOrder model
and increased simulation performance
|
|
allow the pipeline and resources to use the cached instruction schedule and resource
sked iterator
|
|
|
|
also, remove inst-req stats as default.good for debugging
but in terms of pure processor stats they aren't useful
|
|
when insts execute, they mark the time they finish to be used for subsequent isnts
they may need forwarding of data. However, the regdepmap was using the wrong
value to index into the destination operands of the instruction to be forwarded.
Thus, in some cases, we are checking to see if the 3rd destination register
for an instruction is executed at a certain time, when there is only 1 dest. register
valid. Thus, we get a bad, uninitialized time value that will stall forwarding
causing performance loss but still the correct execution.
|
|
|
|
- m5 line enforcement on use_def.cc,hh
|
|
|
|
|
|
|
|
inorder was incorrectly storing FP values and confusing the integer/fp storage view of floating point operations. A big issue was knowing trying to infer when were doing single or double precision access
because this lets you know the size of value to store (32-64 bits). This isnt exactly straightforward since alpha uses all 64-bit regs while mips/sparc uses a dual-reg view. by getting this value from
the actual floating point register file, the model can figure out what it needs to store
|
|
result-types for better tracing of these types of values
|
|
'Resource' flag
|
|
This model currently only works in MIPS_SE mode, so it will take some effort
to clean it up and make it generally useful. Hopefully people are willing to
help make that happen!
|