Age | Commit message (Collapse) | Author |
|
occurs.
This change fixes an issue where a DTLB fault occurs and redirects fetch to
handle the fault and the ITLB requires a walk which delays translation. In this
case the status of the cpu isn't updated appropriately, and an additional
instruction fetch occurs. Eventually this hits an assert as multiple instruction
fetches are occuring in the system and when the second one returns the
processor is in the wrong state.
Some asserts below are removed because it was always true (typo) and the state
after the initiateAcc() the processor could be in any valid state when a
d-side fault occurs.
|
|
Some ISAs (like ARM) relies on hardware page table walkers. For those ISAs,
when a TLB miss occurs, initiateTranslation() can return with NoFault but with
the translation unfinished.
Instructions experiencing a delayed translation due to a hardware page table
walk are deferred until the translation completes and kept into the IQ. In
order to keep track of them, the IQ has been augmented with a queue of the
outstanding delayed memory instructions. When their translation completes,
instructions are re-executed (only their initiateAccess() was already
executed; their DTB translation is now skipped). The IEW stage has been
modified to support such a 2-pass execution.
|
|
The timer calculations were a bit off so time would run faster than
it otherwise should
|
|
Setup initial timesync event in initState or loadState so that curTick has
been updated to the new value, otherwise the event is scheduled in the past.
|
|
|
|
The TBE pointer in the MESI CMP implementation was not being set to NULL
when the TBE is deallocated. This resulted in segmentation fault on testing
the protocol when the ProtocolTrace was switched on.
|
|
If cr0.wp ("write protect" bit) is clear then do not generate page faults when
writing to write-protected pages in kernel mode.
|
|
During SYSCALL_64, use dataSize=8 when handling new rip (ref
http://www.intel.com/Assets/PDF/manual/253668.pdf 5.8.8 IA32_LSTAR is a 64-bit
address)
|
|
JMP_FAR_I was unpacking its far pointer operand using sll instead of srl like
it should, and also putting the components in the wrong registers for use by
other microcode.
|
|
During iret access LDT/GDT at CPL0 rather than after transition to user mode
(if I'm reading the Intel IA-64 architecture spec correctly, the contents of
the descriptor table are read before the CPL is updated).
|
|
The code for Orion 2.0 makes use of printf() at several places where there as
an error in configuration of the model. These have been replaced with fatal().
|
|
missing header file caused RUBY_FS to not compile
|
|
|
|
By stalling and waiting the mandatory queue instead of recycling it, one can
ensure that no incoming messages are starved when the mandatory queue puts
signficant of pressure on the L1 cache controller (i.e. the ruby memtester).
--HG--
rename : src/mem/slicc/ast/WakeUpDependentsStatementAST.py => src/mem/slicc/ast/WakeUpAllDependentsStatementAST.py
|
|
|
|
Split out dynamic and static power numbers for printing to ruby.stats
|
|
|
|
|
|
|
|
The packet now identifies whether static or dynamic data has been allocated and
is used by Ruby to determine whehter to copy the data pointer into the ruby
request. Subsequently, Ruby can be told not to update phys memory when
receiving packets.
|
|
|
|
|
|
Move page table walker state to its own object type, and make the
walker instantiate state for each outstanding walk. By storing the
states in a queue, the walker is able to handle multiple outstanding
timing requests. Note that functional walks use separate state
elements.
|
|
In sendSplitData, keep a pointer to the senderState that may be updated after
the call to handle*Packet. This way, if the receiver updates the packet
senderState, it can still be accessed in sendSplitData.
|
|
|
|
|
|
|
|
|
|
This patch ensures only aligned access are passed to ruby and includes a fix
to the DPRINTF address print.
|
|
|
|
Add checkpointing capability to the Intel 8254 timer, CMOS, I8042,
PS2 Keyboard and Mouse, I82094AA, I8237, I8254, I8259, and speaker
devices
|
|
Add checkpointing capability to the x86 interrupt device and the TLBs
|
|
Calls walker to look up virt. to phys. page mapping
|
|
The x86 local apic now includes a separate latency parameter for interrupts.
|
|
delete
Double packet delete problem is due to an interrupt device deleting a packet that the SimpleTimingPort also deletes. Since MessagePort descends from SimpleTimingPort, simply reimplement the failing code from SimpleTimingPort: recvTiming.
|
|
|
|
Updated patches from Rick Strong's set that modify performance counters for
McPAT
|
|
|
|
Separate data VCs and ctrl VCs in garnet, as ctrl VCs have 1 buffer per VC,
while data VCs have > 1 buffers per VC. This is for correct power estimations.
|
|
|
|
|
|
Maintain all information about an instruction's fault in the DynInst object rather
than any cpu-request object. Also, if there is a fault during the execution stage
then just save the fault inside the instruction and trap once the instruction
tries to graduate
|
|
not taken delay slots were not being advanced correctly to pc+8, so for those ISAs
we 'advance()' the pcstate one more time for the desired effect
|
|
Give fetch unit it's own parameterizable fetch buffer to read from. Very inefficient
(architecturally and in simulation) to continually fetch at the granularity of the
wordsize. As expected, the number of fetch memory requests drops dramatically
|
|
no need to have separate function name findSplitRequest, just overload the function
|
|
instead of having one cache-unit class be responsible for both data and code
accesses, separate code that is just for fetch in it's own derived class off the
original base class. This makes the code easier to manage as well as handle
future cases of special fetch handling
|
|
set the request to false when the cache port blocks so we dont deadlock.
also, comment out the outstanding address list sanity check for now.
|
|
allow the user to specify how many instructions a pipeline stage can process
on any given cycle (stageWidth...i.e.bandwidth) by setting the parameter through
the python interface rather than compile the code after changing the *.cc file.
(we always had the parameter there, but still used the static 'ThePipeline::StageWidth'
instead)
-
Since StageWidth is now dynamically defined, change the interstage communication
structure to use a vector and get rid of array and array handling index (toNextStageIndex)
since we can just make calls to the list for the same information
|
|
Only execute (resolve) one branch per cycle because handling more than one is
a little more complicated
|
|
use skidbuffer as only location for instructions between stages. before,
we had the insts queue from the prior stage and the skidbuffer for the
current stage, but that gets confusing and this consolidation helps
when handling squash cases
|