Age | Commit message | Author |
|
And by "everything" I mean all the quick regressions.
|
|
This change pulls the instruction decoding machinery (including caches) out of
the StaticInst class and puts it into its own class. This has a few intrinsic
benefits. First, the StaticInst code, which has gotten to be quite large, gets
simpler. Second, the code that handles decode caching is now separated out
into its own component and can be looked at in isolation, making it easier to
understand. I took the opportunity to restructure the code a bit which will
hopefully also help.
Beyond that, this change also lays some groundwork for each ISA to have its
own, potentially stateful decode object. We'd be able to include less
contextualizing information in the ExtMachInst objects since that context
would be applied at the decoder. Also, the decoder could "know" ahead of time
that all the instructions it's going to see are going to be in, for instance,
64-bit mode, and it will have one less thing to check when it decodes them.
Because the decode caching mechanism has been separated out, it's now possible
to have multiple caches which correspond to different types of decoding
context. Having one cache for each element of the cross product of different
configurations may become prohibitive, so it may be desirable to clear out the
cache when relatively static state changes and not to have one for each
setting.
Because the decode function is no longer universally accessible as a static
member of the StaticInst class, a new function was added to the ThreadContexts
that returns the applicable decode object.
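A minimal sketch of the separated-out decoder idea described above; the names
(Decoder, ExtMachInst, decodeInst, setMode64) are illustrative stand-ins, not
the exact gem5 interface:

    #include <cstdint>
    #include <memory>
    #include <unordered_map>

    class StaticInst;
    using StaticInstPtr = std::shared_ptr<StaticInst>;
    using ExtMachInst = uint64_t;   // stand-in for the real per-ISA type

    class Decoder
    {
      public:
        // Decode with caching: a repeated encoding hits the hash map
        // and skips the full decode logic.
        StaticInstPtr decode(ExtMachInst machInst)
        {
            auto it = cache.find(machInst);
            if (it != cache.end())
                return it->second;
            StaticInstPtr si = decodeInst(machInst);   // full decode
            cache.emplace(machInst, si);
            return si;
        }

        // Decoder-local context (e.g., "everything I see is 64-bit
        // mode") replaces context baked into every ExtMachInst; when
        // such relatively static state changes, flush the cache
        // rather than keeping one cache per configuration.
        void setMode64(bool m) { mode64 = m; cache.clear(); }

      private:
        StaticInstPtr decodeInst(ExtMachInst machInst);  // per-ISA logic
        std::unordered_map<ExtMachInst, StaticInstPtr> cache;
        bool mode64 = false;
    };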
|
|
if a faulting instruction reaches an execution unit,
then ignore it and pass it through the pipeline.
Once we recognize the fault in the graduation unit,
don't allow a second fault to creep in on the same cycle.
|
|
use a dummy instruction to facilitate the squash after
the interrupt trap
|
|
Before graduating an instruction, explicitly check fault
by making the fault check its own separate command
that can be put on an instruction schedule.
|
|
make syscall an SE-mode-only feature
copy over basic FS functions (hwrei) to make FS compile
|
|
Sharing the FP value with the integer values was giving inconsistent results,
especially when there is a 32-bit integer register matched with a 64-bit float value
|
|
define a syscallContext to schedule the syscall and then use syscall() to actually perform the action
|
|
remove events in the resource pool that can be called from the CPU event, since the CPU
event is scheduled at the same time as the resource pool event.
----
Also, match the resPool event function names to the cpu event function names
----
|
|
only update the BTB on a taken branch and update the branch predictor with the PCState
from the instruction
---
only pay attention to branch predictor updates if the inst. is in fact a branch
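A toy illustration of both rules (ignore updates from non-branches, and only
install a BTB entry on a taken branch), using made-up types rather than the
model's real branch-prediction interface:

    #include <cstdint>
    #include <unordered_map>

    struct BranchInfo {
        bool isBranch;      // was the instruction actually a branch?
        bool taken;         // did it resolve taken?
        uint64_t pc;        // PC of the branch
        uint64_t target;    // resolved target
    };

    class ToyBranchUnit
    {
      public:
        void update(const BranchInfo &b)
        {
            if (!b.isBranch)
                return;                       // ignore non-branch updates

            // train the direction predictor with the resolved outcome
            counters[b.pc] += b.taken ? 1 : -1;

            // install/refresh a BTB entry only on a taken branch
            if (b.taken)
                btb[b.pc] = b.target;
        }

      private:
        std::unordered_map<uint64_t, int> counters;   // toy direction table
        std::unordered_map<uint64_t, uint64_t> btb;   // toy BTB
    };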
|
|
this was causing a bad register value to be forwarded
|
|
don't use an offset to calculate this but rather an enum
that can be updated
|
|
|
|
implement a clean interface to handle branch misprediction and eventually all pipeline
flushing
|
|
formerly, this was implicit when you accessed the execution unit
or the use-def unit, but it's better that this just be something
that a user can specify.
|
|
make handling of speculative and nonspeculative insts
more explicit
|
|
call the trap function when a fault is received
|
|
ignore writes to the ISA zero register
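A tiny hypothetical register-file write showing the idea (ZeroReg and intRegs
are placeholders for the per-ISA definitions):

    #include <array>
    #include <cstdint>

    constexpr int ZeroReg = 0;          // e.g., r0 on MIPS, g0 on SPARC
    std::array<uint64_t, 32> intRegs{};

    void setIntReg(int idx, uint64_t val)
    {
        if (idx == ZeroReg)
            return;                     // writes to the zero reg vanish
        intRegs[idx] = val;
    }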
|
|
get rid of accessing iterators (for instructions) by reference
|
|
- also use "threadId()" instead of readTid() everywhere
- this will help support more complex ISA indexing
|
|
since we don't care whether the cache of instruction schedules is sorted,
the hash map should be faster
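The tradeoff in miniature, assuming the cache is keyed by an integer sked ID
(the real key and value types differ):

    #include <cstdint>
    #include <memory>
    #include <unordered_map>

    class ResourceSked;
    using RSkedPtr = std::shared_ptr<ResourceSked>;

    // Iteration order is irrelevant for a cache, so a hash map's O(1)
    // expected find() beats an ordered map's O(log n).
    std::unordered_map<uint64_t, RSkedPtr> skedCache;

    RSkedPtr lookupSked(uint64_t skedId)
    {
        auto it = skedCache.find(skedId);
        return it != skedCache.end() ? it->second : nullptr;
    }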
|
|
Add a few constants and functions that the InOrder model wants for SPARC.
* * *
sparc: add eaComp function
InOrder separates the address generation from the actual access, so give
SPARC that functionality
* * *
sparc: add control flags for branches
branch predictors and other CPU model functions need to know specific information
about branches, so add the necessary flags here
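A stand-alone sketch of what splitting address generation (eaComp) from the
access itself looks like; the real version lives in the ISA description, and
these names are invented for illustration:

    #include <cstdint>

    struct MemOp { uint64_t base, offset; };

    // address-generation step, schedulable on its own resource
    uint64_t eaComp(const MemOp &op)
    {
        return op.base + op.offset;
    }

    // the access itself runs later, against the precomputed address
    uint8_t memAccess(const uint8_t *mem, uint64_t ea)
    {
        return mem[ea];                 // byte load for illustration
    }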
|
|
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
|
|
***
(1): get rid of the expandForMT function
MIPS is the only ISA that cares about having a piece of ISA state integrate
multiple threads, so add constants for MIPS and relieve the other ISAs from
having to define this. Also, InOrder was the only core that was actively calling
this function
* * *
(2): get rid of the CoreSpecific type
The CoreSpecific type was used as a proxy to pass in HW-specific params to
a MIPS CPU, but since MIPS FS hasn't been touched for a while, it makes sense
not to force every other ISA to use CoreSpecific, and to use a special
reset function to set it instead. That probably should go in a PowerOn reset
fault anyway.
|
|
make sure instructions are able to commit before writing back to the RF
do not commit more than 1 non-speculative instruction per cycle
|
|
clean up dangling pointers and other cruft in the destructors
|
|
take away all instances of reqMap in the code and make all references use the built-in
request vectors inside of each resource. The request map was dynamically allocating
a request per instruction. The request vector just allocates N requests during
instantiation, and then the surrounding code is fixed up to reuse those N requests
***
setRequest() and clearRequest() are the new accessors needed to define a new
request in a resource
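Roughly the shape of the change, with invented types: a fixed pool of N
requests reused via setRequest()/clearRequest() instead of a map that news a
request per instruction:

    #include <cassert>
    #include <vector>

    struct ResourceRequest {
        bool valid = false;
        int seqNum = -1;
    };

    class Resource
    {
      public:
        explicit Resource(int width) : reqs(width) {}  // N slots, allocated once

        // claim a free slot instead of new'ing a request per instruction
        ResourceRequest *setRequest(int slot, int seqNum)
        {
            assert(!reqs[slot].valid);
            reqs[slot].valid = true;
            reqs[slot].seqNum = seqNum;
            return &reqs[slot];
        }

        // return the slot to the pool for reuse
        void clearRequest(int slot)
        {
            reqs[slot].valid = false;
            reqs[slot].seqNum = -1;
        }

      private:
        std::vector<ResourceRequest> reqs;
    };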
|
|
we are going to be getting away from creating new resource requests for every
instruction, so there is no more need to keep track of a reqRemoveList and clean
it up every tick
|
|
remove remnants of the old way of instruction scheduling, which dynamically
allocated a new resource schedule for every instruction
|
|
allow the pipeline and resources to use the cached instruction schedule and resource
sked iterator
|
|
add a stage scheduler class to replace InstStage in pipeline_traits.cc.
use that class to define a default front-end resource schedule that all
instructions will follow. This will also replace the back-end schedule in
pipeline_traits.cc. The reason for adding this is so that we can cache
instruction schedules in the future instead of calling the same function
over and over again, as well as constantly dynamically allocating memory on
every instruction to try to figure out its schedule
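A guess at the shape of such a stage scheduler, with invented field names;
each instruction's schedule becomes plain data that can be built once and
cached rather than reallocated per instruction:

    #include <vector>

    struct ScheduleEntry {
        int stage;       // pipeline stage number
        int priority;    // order within the stage
        int resource;    // resource to request (fetch unit, use-def, ...)
    };

    using ResourceSked = std::vector<ScheduleEntry>;

    // Fills in one stage's slice of a schedule; a default front-end
    // schedule built this way can be constructed once and shared by
    // every instruction.
    class StageScheduler
    {
      public:
        StageScheduler(ResourceSked &sked, int stage)
            : sked(sked), stage(stage) {}

        void needs(int resource)
        {
            sked.push_back({stage, nextPriority++, resource});
        }

      private:
        ResourceSked &sked;
        int stage;
        int nextPriority = 0;
    };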
|
|
first step in an optimization to not dynamically allocate an instruction schedule
for every instruction but rather use cached schedules
|
|
allow the user to specify how many instructions a pipeline stage can process
on any given cycle (stageWidth, i.e. bandwidth) by setting the parameter through
the Python interface rather than recompiling the code after changing the *.cc file.
(we always had the parameter there, but still used the static 'ThePipeline::StageWidth'
instead)
-
Since StageWidth is now dynamically defined, change the interstage communication
structure to use a vector, and get rid of the array and the array-handling index
(toNextStageIndex), since we can just make calls to the list for the same information
|
|
the current code was using an incorrect dummy instruction in the interrupts function
|
|
This step makes it easy to replace the accessor functions
(which still access a global variable) with ones that access
per-thread curTick values.
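The pattern in miniature (names invented): call sites go through an accessor
today, so only the accessor body has to change when curTick becomes per-thread:

    #include <cstdint>
    using Tick = uint64_t;

    Tick _curTick = 0;             // today: a single global tick

    inline Tick curTick() { return _curTick; }       // all call sites use this
    inline void setCurTick(Tick t) { _curTick = t; }

    // Later, only the two bodies above need to change, e.g. to read a
    //     thread_local Tick _curTick;
    // while every call site stays untouched.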
|
|
There were several copies of similar functions that looked
like they all replicated reschedule(), so I replaced them
with direct calls. Keeping this separate from the previous
cset since there may be some subtle functional differences
if the code ever reschedules an event that is scheduled but
not squashed (though none were detected in the regressions).
|
|
Events need to be scheduled on the queue assigned
to the SimObject, not on the global queue (which
should be going away).
Also cleaned up a number of redundant expressions
that made the code unnecessarily verbose.
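A schematic of the difference, with simplified stand-in classes rather than
the real SimObject/EventQueue code:

    #include <cstdint>
    using Tick = uint64_t;

    class Event { /* ... */ };

    class EventQueue
    {
      public:
        void schedule(Event *e, Tick when) { /* insert into this queue */ }
    };

    EventQueue mainEventQueue;         // the global queue, on its way out

    class SimObjectLike
    {
      public:
        explicit SimObjectLike(EventQueue *q) : eventq(q) {}

        void scheduleEvent(Event *e, Tick when)
        {
            eventq->schedule(e, when);           // the object's own queue
            // mainEventQueue.schedule(e, when)  // wrong: bypasses the assignment
        }

      private:
        EventQueue *eventq;            // queue assigned to this object
    };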
|
|
This change modifies the way prefetches work. They are now like normal loads
that don't write back a register. Previously prefetches were supposed to call
prefetch() on the execution context, so they executed with execute() methods
instead of initiateAcc()/completeAcc(). The prefetch() methods for all the CPUs
are blank, meaning that they get executed but don't actually do anything.
On Alpha, dead cache-copy code was removed and prefetches are now normal ops.
They count as executed operations, but still don't do anything, and IsMemRef is
no longer set on them.
On ARM, IsDataPrefetch or IsInstructionPrefetch is now set on all prefetch
instructions. The timing simple CPU doesn't try to do anything special for
prefetches now, and they execute with the normal memory code path.
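A toy split-phase version of the new behavior, with invented types: the
prefetch issues its access like a load in initiateAcc(), and completeAcc()
deliberately writes no register:

    #include <cstdint>

    struct Memory { void startRead(uint64_t addr) { /* issue to cache */ } };

    struct SoftwarePrefetch {
        uint64_t ea;                       // effective address

        // kicks off the access exactly like a normal load
        void initiateAcc(Memory &mem) { mem.startRead(ea); }

        // unlike a load, writes back no register: the cache fill is
        // the only architectural effect
        void completeAcc(const uint8_t * /*data*/) {}
    };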
|