Age | Commit message (Collapse) | Author |
|
Port proxies are used to replace non-structural ports, and thus enable
all ports in the system to correspond to a structural entity. This has
the advantage of accessing memory through the normal memory subsystem
and thus allowing any constellation of distributed memories, address
maps, etc. Most accesses are done through the "system port" that is
used for loading binaries, debugging etc. For the entities that belong
to the CPU, e.g. threads and thread contexts, they wrap the CPU data
port in a port proxy.
The following replacements are made:
FunctionalPort > PortProxy
TranslatingPort > SETranslatingPortProxy
VirtualPort > FSTranslatingPortProxy
--HG--
rename : src/mem/vport.cc => src/mem/fs_translating_port_proxy.cc
rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh
rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc
rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
|
|
|
|
And by "everything" I mean all the quick regressions.
|
|
In FS mode the syscall function will panic, but the interface will be
consistent and code which calls syscall can be compiled in. This will allow,
for instance, instructions that use syscall to be built unconditionally but
then not returned by the decoder.
|
|
Having two StaticInst classes, one nominally ISA dependent and the other ISA
dependent, has not been historically useful and makes the StaticInst class
more complicated that it needs to be. This change merges StaticInstBase into
StaticInst.
|
|
This change pulls the instruction decoding machinery (including caches) out of
the StaticInst class and puts it into its own class. This has a few intrinsic
benefits. First, the StaticInst code, which has gotten to be quite large, gets
simpler. Second, the code that handles decode caching is now separated out
into its own component and can be looked at in isolation, making it easier to
understand. I took the opportunity to restructure the code a bit which will
hopefully also help.
Beyond that, this change also lays some ground work for each ISA to have its
own, potentially stateful decode object. We'd be able to include less
contextualizing information in the ExtMachInst objects since that context
would be applied at the decoder. Also, the decoder could "know" ahead of time
that all the instructions it's going to see are going to be, for instance, 64
bit mode, and it will have one less thing to check when it decodes them.
Because the decode caching mechanism has been separated out, it's now possible
to have multiple caches which correspond to different types of decoding
context. Having one cache for each element of the cross product of different
configurations may become prohibitive, so it may be desirable to clear out the
cache when relatively static state changes and not to have one for each
setting.
Because the decode function is no longer universally accessible as a static
member of the StaticInst class, a new function was added to the ThreadContexts
that returns the applicable decode object.
|
|
|
|
readBytes and writeBytes had the word "bytes" in their names because they
accessed blobs of bytes. This distinguished them from the read and write
functions which handled higher level data types. Because those functions don't
exist any more, this change renames readBytes and writeBytes to more general
names, readMem and writeMem, which reflect the fact that they are how you read
and write memory. This also makes their names more consistent with the
register reading/writing functions, although those are still read and set for
some reason.
|
|
|
|
|
|
this will safeguard future code from trying to remove
from the list twice. That code wouldnt break but would
waste time.
|
|
|
|
|
|
handle them like we do in FS mode, by blocking the TLB until the fault
is handled by the fault->invoke()
|
|
|
|
|
|
|
|
implement clearfetchbufferfunction
extend predecoder to use multiple threads and clear those on trap
|
|
this will make sure we get the correct view of a FP register
|
|
|
|
The DTB expects the correct PC in the ThreadContext
but how if the memory accesses are speculative? Shouldn't
we send along the requestor's PC to the translate functions?
|
|
including IPR accesses and store-conditionals. These class of instructions will not
execute correctly in a superscalar machine
|
|
if a faulting instruction reaches an execution unit,
then ignore it and pass it through the pipeline.
Once we recognize the fault in the graduation unit,
dont allow a second fault to creep in on the same cycle.
|
|
handle "snoop" port registration as well as functional
port setup for FS mode
|
|
use a dummy instruction to facilitate the squash after
the interrupts trap
|
|
Before graduating an instruction, explicitly check fault
by making the fault check it's own separate command
that can be put on an instruction schedule.
|
|
|
|
|
|
make syscall a SE mode only functionality
copy over basic FS functions (hwrei) to make FS compile
|
|
speculative load/store pipelines can reenable this
|
|
calculate blocks in use for the fetch buffer to figure out how many total blocks
are pending
|
|
Sharing the FP value w/the integer values was giving inconsistent results esp. when
their is a 32-bit integer register matched w/a 64-bit float value
|
|
define a syscallContext to schedule the syscall and then use syscall() to actually perform the action
|
|
segfault was caused by squashed multiply thats in the process of an event.
use isProcessing flag to handle this and cleanup the MDU code
|
|
|
|
remove events in the resource pool that can be called from the CPU event, since the CPU
event is scheduled at the same time at the resource pool event.
----
Also, match the resPool event function names to the cpu event function names
----
|
|
once a ST is sent off, it's OK to keep processing, however it's a little more
complicated to handle the packet acknowledging the store is completed
|
|
once a ST is sent off, it's OK to keep processing, however it's a little more
complicated to handle the packet acknowledging the store is completed
|
|
also, cleanup comments for gem5.fast compilation
|
|
dont treat read() and write() fields as mut. exclusive
|
|
only update BTB on a taken branch and update branch predictor w/pcstate from instruction
---
only pay attention to branch predictor updates if the the inst. is in fact a branch
|
|
define separate priority resource pool squash and graduate events
|
|
|
|
this causes forwarding a bad value register value
|
|
|
|
|
|
use it in reg. dep. tracking
|
|
dont use offset to calculate this but rather an enum
that can be updated
|
|
|
|
|