summaryrefslogtreecommitdiff
path: root/src/cpu
AgeCommit message (Collapse)Author
2011-02-12inorder: cache instruction schedulesKorey Sewell
first step in a optimization to not dynamically allocate an instruction schedule for every instruction but rather used cached schedules
2011-02-12inorder: comments for resource sked classKorey Sewell
2011-02-12inorder: remove unused fileKorey Sewell
inst_buffer file isn't used , so remove it
2011-02-11O3: Fix pipeline restart when a table walk completes in the fetch stage.Giacomo Gabrielli
When a table walk is initiated by the fetch stage, the CPU can potentially move to the idle state and never wake up. The fetch stage must call cpu->wakeCPU() when a translation completes (in finishTranslation()).
2011-02-11SimpleCPU: Fix a case where a DTLB fault redirects fetch and an I-side walk ↵Ali Saidi
occurs. This change fixes an issue where a DTLB fault occurs and redirects fetch to handle the fault and the ITLB requires a walk which delays translation. In this case the status of the cpu isn't updated appropriately, and an additional instruction fetch occurs. Eventually this hits an assert as multiple instruction fetches are occuring in the system and when the second one returns the processor is in the wrong state. Some asserts below are removed because it was always true (typo) and the state after the initiateAcc() the processor could be in any valid state when a d-side fault occurs.
2011-02-11O3: Enhance data address translation by supporting hardware page table walkers.Giacomo Gabrielli
Some ISAs (like ARM) relies on hardware page table walkers. For those ISAs, when a TLB miss occurs, initiateTranslation() can return with NoFault but with the translation unfinished. Instructions experiencing a delayed translation due to a hardware page table walk are deferred until the translation completes and kept into the IQ. In order to keep track of them, the IQ has been augmented with a queue of the outstanding delayed memory instructions. When their translation completes, instructions are re-executed (only their initiateAccess() was already executed; their DTB translation is now skipped). The IEW stage has been modified to support such a 2-pass execution.
2011-02-06m5: added work completed monitoring supportBrad Beckmann
2011-02-06TimingSimpleCPU: split data sender state fixJoel Hestness
In sendSplitData, keep a pointer to the senderState that may be updated after the call to handle*Packet. This way, if the receiver updates the packet senderState, it can still be accessed in sendSplitData.
2011-02-06mcpat: Adds McPAT performance countersJoel Hestness
Updated patches from Rick Strong's set that modify performance counters for McPAT
2011-02-04inorder: fault handlingKorey Sewell
Maintain all information about an instruction's fault in the DynInst object rather than any cpu-request object. Also, if there is a fault during the execution stage then just save the fault inside the instruction and trap once the instruction tries to graduate
2011-02-04inorder: pcstate and delay slots bugKorey Sewell
not taken delay slots were not being advanced correctly to pc+8, so for those ISAs we 'advance()' the pcstate one more time for the desired effect
2011-02-04inorder: add a fetch buffer to fetch unitKorey Sewell
Give fetch unit it's own parameterizable fetch buffer to read from. Very inefficient (architecturally and in simulation) to continually fetch at the granularity of the wordsize. As expected, the number of fetch memory requests drops dramatically
2011-02-04inorder: overload find-req fnKorey Sewell
no need to have separate function name findSplitRequest, just overload the function
2011-02-04inorder: implement separate fetch unitKorey Sewell
instead of having one cache-unit class be responsible for both data and code accesses, separate code that is just for fetch in it's own derived class off the original base class. This makes the code easier to manage as well as handle future cases of special fetch handling
2011-02-04inorder: cache port blockingKorey Sewell
set the request to false when the cache port blocks so we dont deadlock. also, comment out the outstanding address list sanity check for now.
2011-02-04inorder: stage width as a python parameterKorey Sewell
allow the user to specify how many instructions a pipeline stage can process on any given cycle (stageWidth...i.e.bandwidth) by setting the parameter through the python interface rather than compile the code after changing the *.cc file. (we always had the parameter there, but still used the static 'ThePipeline::StageWidth' instead) - Since StageWidth is now dynamically defined, change the interstage communication structure to use a vector and get rid of array and array handling index (toNextStageIndex) since we can just make calls to the list for the same information
2011-02-04inorder: multi-issue branch resolutionKorey Sewell
Only execute (resolve) one branch per cycle because handling more than one is a little more complicated
2011-02-04inorder: pipe. stage inst. bufferingKorey Sewell
use skidbuffer as only location for instructions between stages. before, we had the insts queue from the prior stage and the skidbuffer for the current stage, but that gets confusing and this consolidation helps when handling squash cases
2011-02-04inorder: change skidBuffer to list instead of queueKorey Sewell
manage insertion and deletion like a queue but will need access to internal elements for future changes Currently, skidbuffer manages any instruction that was in a stage but could not complete processing, however we will want to manage all blocked instructions (from prev stage and from cur. stage) in just one buffer.
2011-02-04inorder: activity tracking bugKorey Sewell
Previous code was marking CPU activity on almost every cycle due to a bug in tracking the status of pipeline stages. This disables the CPU from sleeping on long latency stalls and increases simulation time
2011-02-03Fault: Rename sim/fault.hh to fault_fwd.hh to distinguish it from faults.hh.Gabe Black
--HG-- rename : src/sim/fault.hh => src/sim/fault_fwd.hh
2011-02-03Config: Keep track of uncached and cached ports separately.Gabe Black
This makes sure that the address ranges requested for caches and uncached ports don't conflict with each other, and that accesses which are always uncached (message signaled interrupts for instance) don't waste time passing through caches.
2011-02-02O3: Fix a style bug in O3.Gabe Black
2011-02-01X86: Add L1 caches for the TLB walkers.Gabe Black
Small L1 caches are connected to the TLB walkers when caches are used. This allows them to participate in the coherence protocol properly.
2011-01-18O3: Fix some variable length instruction issues with the O3 CPU and ARM ISA.Matt Horsnell
2011-01-18O3: Don't test misprediction on load instructions until executed.Matt Horsnell
2011-01-18O3: Keep around the last committed instruction and use for squashing.Ali Saidi
Without this change 0 is always used for the youngest sequence number if a squash occured and the ROB was empty (E.g. an instruction is marked serializeAfter or a fetch stall prevents other instructions from issuing). Using 0 there is a race to rename where an instruction that committed the same cycle as the squashing instruction can have it's renamed state undone by the squash using sequence number 0.
2011-01-18O3: Don't try to scoreboard misc registers.Ali Saidi
I'm not positive this is the correct fix, but it's working right now. Either we need to do something like this, prevent the misc reg from being renamed at all, or there something else going on. We need to find the root cause as to why this is only a problem sometimes.
2011-01-18O3: Fix corner cases where multiple squashes/fetch redirects overwrite timebuf.Matt Horsnell
2011-01-18O3: Fix mispredicts from non control instructions.Matt Horsnell
The squash inside the fetch unit should not attempt to remove them from the branch predictor as non-control instructions are not pushed into the predictor.
2011-01-18O3: Fixes the way prefetches are handled inside the iew unit.Matt Horsnell
This patch prevents the prefetch being added to the instCommit queue twice.
2011-01-18O3: Support timing translations for O3 CPU fetch.Ali Saidi
2011-01-18ARM: Add support for moving predicated false dest operands from sources.Ali Saidi
2011-01-18O3: Fixes fetch deadlock when the interrupt clears before CPU handles it.Min Kyu Jeong
When this condition occurs the cpu should restart the fetch stage to fetch from the original execution path. Fault handling in the commit stage is cleaned up a little bit so the control flow is simplier. Finally, if an instruction is being used to carry a fault it isn't executed, so the fault propagates appropriately.
2011-01-12inorder: fix RUBY_FS buildKorey Sewell
the current code was using incorrect dummy instruction in interrupts function
2011-01-07Replace curTick global variable with accessor functions.Steve Reinhardt
This step makes it easy to replace the accessor functions (which still access a global variable) with ones that access per-thread curTick values.
2011-01-07inorder: replace schedEvent() code with reschedule().Steve Reinhardt
There were several copies of similar functions that looked like they all replicated reschedule(), so I replaced them with direct calls. Keeping this separate from the previous cset since there may be some subtle functional differences if the code ever reschedules an event that is scheduled but not squashed (though none were detected in the regressions).
2011-01-07inorder: get rid of references to mainEventQueue.Steve Reinhardt
Events need to be scheduled on the queue assigned to the SimObject, not on the global queue (which should be going away). Also cleaned up a number of redundant expressions that made the code unnecessarily verbose.
2011-01-03Move sched_list.hh and timebuf.hh from src/base to src/cpu.Steve Reinhardt
These files really aren't general enough to belong in src/base. This patch doesn't reorder include lines, leaving them unsorted in many cases, but Nate's magic script will fix that up shortly. --HG-- rename : src/base/sched_list.hh => src/cpu/sched_list.hh rename : src/base/timebuf.hh => src/cpu/timebuf.hh
2011-01-03Make commenting on close namespace brackets consistent.Steve Reinhardt
Ran all the source files through 'perl -pi' with this script: s|\s*(};?\s*)?/\*\s*(end\s*)?namespace\s*(\S+)\s*\*/(\s*})?|} // namespace $3|; s|\s*};?\s*//\s*(end\s*)?namespace\s*(\S+)\s*|} // namespace $2\n|; s|\s*};?\s*//\s*(\S+)\s*namespace\s*|} // namespace $1\n|; Also did a little manual editing on some of the arch/*/isa_traits.hh files and src/SConscript.
2010-12-22This patch removes the WARN_* and ERROR_* from src/mem/ruby/common/Debug.hh ↵Nilay Vaish
file. These statements have been replaced with warn(), panic() and fatal() defined in src/base/misc.hh
2010-12-21memtest: delete some crufty dead codeSteve Reinhardt
2010-12-20Style: Replace some tabs with spaces.Gabe Black
2010-12-07O3: Allow a store entry to store up to 16 bytes (instead of TheISA::IntReg).Ali Saidi
The store queue doesn't need to be ISA specific and architectures can frequently store more than an int registers worth of data. A 128 bits seems more common, but even 256 bits may be appropriate. Pretty much anything less than a cache line size is buildable.
2010-12-07O3: Support squashing all state after special instructionAli Saidi
For SPARC ASIs are added to the ExtMachInst. If the ASI is changed simply marking the instruction as Serializing isn't enough beacuse that only stops rename. This provides a mechanism to squash all the instructions and refetch them
2010-12-07O3: Make all instructions that write a misc. register not perform the write ↵Giacomo Gabrielli
until commit. ARM instructions updating cumulative flags (ARM FP exceptions and saturation flags) are not serialized. Added aliases for ARM FP exceptions and saturation flags in FPSCR. Removed write accesses to the FP condition codes for most ARM VFP instructions: only VCMP and VCMPE instructions update the FP condition codes. Removed a potential cause of seg. faults in the O3 model for NEON memory macro-ops (ARM).
2010-12-07O3: Support SWAP and predicated loads/store in ARM.Min Kyu Jeong
2010-12-07ARM: Support switchover with hardware table walkersAli Saidi
2010-12-01ruby: Converted old ruby debug calls to M5 debug callsNilay Vaish
This patch developed by Nilay Vaish converts all the old GEMS-style ruby debug calls to the appropriate M5 debug calls.
2010-11-23X86: Loosen an assert for x86 and connect the APIC ports when caches are used.Gabe Black