Age | Commit message (Collapse) | Author |
|
On the config end, if a shared L2 is created for the system, it is
parameterized to have n sharers as defined by option.num_cpus. In addition to
making the cache sharing aware so that discriminating tag policies can make use
of context_ids to make decisions, I added an occupancy AverageStat and an occ %
stat to each cache so that you could know which contexts are occupying how much
cache on average, both in terms of blocks and percentage. Note that since
devices have context_id -1, having an array of occ stats that correspond to
each context_id will break here, so in FS mode I add an extra bucket for device
blocks. This bucket is explicitly not added in SE mode in order to not only
avoid ugliness in the stats.txt file, but to avoid broken stats (some formulas
break when a bucket is 0).
|
|
Also, make Formulas work on AverageVector. First, Stat::Average (and thus
Stats::AverageVector) was broken when coming out of a checkpoint and on resets,
this fixes that. Formulas also didn't work with AverageVector, but added
support for that.
|
|
|
|
When implementing timing address translations instead of atomic, I
forgot to preserve the faults that are returned from the read and
write calls. This patch reinstates them.
|
|
When each load or store is sent to the LSQ, we check whether it will cross a
cache line boundary and, if so, split it in two. This creates two TLB
translations and two memory requests. Care has to be taken if the first
packet of a split load is sent but the second blocks the cache. Similarly,
for a store, if the first packet cannot be sent, we must store the second
one somewhere to retry later.
This modifies the LSQSenderState class to record both packets in a split
load or store.
Finally, a new const variable, HasUnalignedMemAcc, is added to each ISA
to indicate whether unaligned memory accesses are allowed. This is used
throughout the changed code so that compiler can optimise away code dealing
with split requests for ISAs that don't need them.
|
|
This initiates a timing translation and passes the read or write on to the
processor before waiting for it to finish. Once the translation is finished,
the instruction's state is updated via the 'finish' function. A new
DataTranslation class is created to handle this.
The idea is taken from the implementation of timing translations in
TimingSimpleCPU by Gabe Black. This patch also separates out the timing
translations from this CPU and uses the new DataTranslation class.
|
|
|
|
Fixed data block assignment to not delete if not internally allocated.
|
|
|
|
|
|
Added full-system support to the simple mesh toplogy by allowing dma contrllers
to be attached to router zero in the network.
|
|
|
|
Make sure that instructions are dereferenced/deleted twice by marking they are
on the remove list
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
- on certain retry requests you can get an assertion failure
- fix by allowing the request to literally "Retry" itself
if it wasnt successful before, and then block any requests
through cache port while waiting for the cache to be
made available for access
|
|
only show requests processed when the resource is actually in use
|
|
- m5 line enforcement on use_def.cc,hh
|
|
add idle/run/utilization stats for each pipeline stage
|
|
each stage keeps track of insts_processed on a per_thread basis but we should
be keeping that on a total basis inorder to enforce stage width limits
|
|
set Active/Suspended/Halted status for threads. useful for system when determining
if/when to exit simulation
|
|
Halt is called from the exit() system call while
deallocate is unused. So to clear up things, just
use halt and remove deallocate.
|
|
when threads are switching in/out the CPU, we need to keep
track of special cases like branches. Add appropriate
variables in ThreadState t track this and then use
these variables when updating pc after context switch
|
|
this will be used for when a thread comes back from a cache miss, it needs to update the PCs
because the inst might of been a branch or delayslot in which the next PC isnt always
a straight addition
|
|
allow a thread to wakeup and be activated after
it has been in suspended state and another
thread is switched out. Need to give
pipeline stages a "activateThread" function
so that can get to their suspended instruction
when the time is right.
|
|
this prints out messages relative to what
threading model is being used (smt, switch-on-miss, single, etc.)
|
|
update address List and address Map to take
into account multiple threads
|
|
give resources their own specific
activity to do for a "suspend" event
instead of defaulting to deactivating the thread for a
suspend thread event. This really matters
for the fetch sequence unit which wants to remove the
thread from fetching while other units want to
ignore a thread suspension. If you deactivate a thread
in a resource then you may lose some of the allotted
bandwidth that the thread is taking up...
|
|
dont check total # of threads but instead all
active threads
|
|
update/add in the use of isThreadReady & isThreadSuspended
functions.Check in activateThread what list a thread is
on so it can be managed accordingly.
|
|
|
|
-Support ability to activate next ready thread after a cache miss
through the activateNextReadyContext/Thread() functions
-To support this a "readyList" of thread ids is added
-After a cache miss, thread will suspend and then call
activitynextreadythread
|
|
allow for events to schedule themselves later if desired. this is important
because of cases like where you need to activate a thread only after the previous
thread has been deactivated. The ordering there has to be enforced
|
|
add code to recognize memory stalls in resources and the pipeline as well
as squash a thread if there is a stall and we are in the switch on cache miss
model
|
|
some events are going to need instruction data when they process, so just
include the instruction in the event construction
|
|
add buffer for instructions to switch out to in a pipeline stage
can't squash the instruction and remove the pipeline so we kind of need
to 'suspend' an instruction at the stage while the memory stall resolves
for the switch on cache miss model
|
|
- loads were happening on same cycle as the address was generated which is slightly
unrealistic. Instead, force address generation to be on separate cycle from load
initiation
- also, mark the stages in a more traditional way (F-D-X-M-W)
|
|
|
|
- cpuEventNum
- resReqCount
|
|
This patch includes the necessary regression updates to test the new ruby
configuration system. The patch includes support for multiple ruby protocols
and adds the ruby random tester. The patch removes atomic mode test for
ruby since ruby does not support atomic mode acceses. These tests can be
added back in when ruby supports atomic mode for real.
--HG--
rename : tests/quick/50.memtest/test.py => tests/quick/60.rubytest/test.py
|
|
Replaced Ruby debug statements with M5 statements.
|
|
Removed the last level cache support and MOESI_hammer's dependency on it.
Replaces the LLC support with the more generic MachineType count.
|
|
|
|
Removed static members in RubyPort and removed the ruby request unique id.
|
|
Removed the old config interface from RubySystem and libruby.
|