summaryrefslogtreecommitdiff
path: root/src/cpu
AgeCommit message (Collapse)Author
2013-02-19scons: Fix up numerous warnings about name shadowingAndreas Hansson
This patch address the most important name shadowing warnings (as produced when using gcc/clang with -Wshadow). There are many locations where constructor parameters and function parameters shadow local variables, but these are left unchanged.
2013-02-19x86: Move APIC clock divider to PythonAndreas Hansson
This patch moves the 16x APIC clock divider to the Python code to avoid the post-instantiation modifications to the clock. The x86 APIC was the only object setting the clock after creation time and this required some custom functionality and configuration. With this patch, the clock multiplier is moved to the Python code and the objects are instantiated with the appropriate clock.
2013-02-19mem: Add predecessor to SenderState base classAndreas Hansson
This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses. There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest.
2013-02-15cpu: Document exec trace flagsAndreas Sandberg
2013-02-15cpu: Avoid duplicate entries in tracking structures for writes to misc regsGeoffrey Blake
setMiscReg currently makes a new entry for each write to a misc reg without checking for duplicates, this can cause a triggering of the assert if an instruction get replayed and writes to the same misc regs multiple times. This fix prevents duplicate entries and instead updates the value.
2013-02-15cpu: Fix rename mis-handling serializing instructions when resource constrainedGeoffrey Blake
The rename can mis-handle serializing instructions (i.e. strex) if it gets into a resource constrained situation and the serializing instruction has to be placed on the skid buffer to handle blocking. In this situation the instruction informs the pipeline it is serializing and logs that the next instruction must be serialized, but since we are blocking the pipeline defers this action to place the serializing instruction and incoming instructions into the skid buffer. When resuming from blocking, rename will pull the serializing instruction from the skid buffer and the current logic will see this as the "next" instruction that has to be serialized and because of flags set on the serializing instruction, it passes through the pipeline stage as normal and resets rename to non-serializing. This causes instructions to follow the serializing inst incorrectly and eventually leads to an error in the pipeline. To fix this rename should check first if it has to block before checking for serializing instructions.
2013-02-15o3: fix tick used for renaming and issue with range selectionMatt Horsnell
Fixes the tick used from rename: - previously this gathered the tick on leaving rename which was always 1 less than the dispatch. This conflated the decode ticks when back pressure built in the pipeline. - now picks up tick on entry. Added --store_completions flag: - will additionally display the store completion tail in the viewer. - this highlights periods when large numbers of stores are outstanding (>16 LSQ blocking) Allows selection by tick range (previously this caused an infinite loop)
2013-02-15sim: Add a system-global option to bypass cachesAndreas Sandberg
Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virutalized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches. To make memory mode tests cleaner, the following methods are added to the System class: * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed. The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.
2013-02-15cpu: Refactor memory system checksAndreas Sandberg
CPUs need to test that the memory system is in the right mode in two places, when the CPU is initialized (unless it's switched out) and on a drainResume(). This led to some code duplication in the CPU models. This changeset introduces the verifyMemoryMode() method which is called by BaseCPU::init() if the CPU isn't switched out. The individual CPU models are responsible for calling this method when resuming from a drain as this code is CPU model specific.
2013-02-15cpu: Make checker CPUs inherit from CheckerCPU in the Python hierarchyAndreas Sandberg
Checker CPUs currently don't inherit from the CheckerCPU in the Python object hierarchy. This has two consequences: * It makes CPU model discovery from the Python world somewhat complicated as there is no way of testing if a CPU is a checker. * Parameters are duplicated in the checker configuration specification. This changeset makes all checker CPUs inherit from the base checker CPU class.
2013-02-15cpu: Add CPU metadata om the Python classesAndreas Sandberg
The configuration scripts currently hard-code the requirements of each CPU. This is clearly not optimal as it makes writing new configuration scripts painful and adding new CPU models requires existing scripts to be updated. This patch adds the following class methods to the base CPU and all relevant CPUs: * memory_mode -- Return a string describing the current memory mode (invalid/atomic/timing). * require_caches -- Does the CPU model require caches? * support_take_over -- Does the CPU support CPU handover?
2013-02-15cpu: include set in o3/commit_impl.Ali Saidi
While the majority of compilers seemed to pickup set from else where, one version of gcc 4.7 complains, so explictly add it.
2013-02-15cpu: fix case with o3 cpu blocking and unblocking decode in cycleAli Saidi
Fix a case in the O3 CPU where the decode stage blocks and unblocks in a single cycle sending both signals to fetch which causes an assert or worse. The previous check could never work before since the status was set to Blocked before a test for the status being Unblocking was executed.
2013-02-15cpu: Fix a livelock in the o3 cpu.Ali Saidi
Check if an instruction just enabled interrupts and we've previously had an interrupt pending that was not handled because interrupts were subsequently disabled before the pipeline reached a place to handle the interrupt. In that case squash now to make sure the interrupt is handled.
2013-01-24branch predictor: move out of o3 and inorder cpusNilay Vaish ext:(%2C%20Timothy%20Jones%20%3Ctimothy.jones%40cl.cam.ac.uk%3E)
This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository. --HG-- rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh
2013-01-22o3 cpu: fix zero reg problemAndrea Pellegrini
There was an issue w/ the rename logic, which would assign a previous physical register to the ZeroReg architectural register in x86. This issue was giving problems for instructions squashed in threads w/ ID different from 0, sometimes allowing non-mispredicted instructions to obtain a value different from zero when reading the zeroReg.
2013-01-22x86, cpu: corrects 270c9a75e91f, take over decoder on cpu switchNilay Vaish
The changes made by the changeset 270c9a75e91f do not work well with switching of cpus. The problem is that decoder for the old thread context holds state that is not taken over by the new decoder. This patch adds a takeOverFrom() function to Decoder class in each ISA. Except for x86, functions in other ISAs are blank. For x86, the function copies state from the old decoder to the new decoder.
2013-01-19O3 IEW: Make incrWb and decrWb clearerJoel Hestness
Move the increment/decrement of wbOutstanding outside of the comparison in incrWb and decrWb in the IEW. This also fixes a compiler bug with gcc 4.4.7, which incorrectly optimizes "-- ==" as "-=".
2013-01-17ruby: remove calls to g_system_ptr->getTime()Nilay Vaish
This patch further removes calls to g_system_ptr->getTime() where ever other clocked objects are available for providing current time.
2013-01-12base simple cpu: removes commented out code about cache opsNilay Vaish
2013-01-12x86: Changes to decoder, corrects 9376Nilay Vaish
The changes made by the changeset 9376 were not quite correct. The patch made changes to the code which resulted in decoder not getting initialized correctly when the state was restored from a checkpoint. This patch adds a startup function to each ISA object. For x86, this function sets the required state in the decoder. For other ISAs, the function is empty right now.
2013-01-07cpu: Unify the serialization code for all of the CPU modelsAndreas Sandberg
Cleanup the serialization code for the simple CPUs and the O3 CPU. The CPU-specific code has been replaced with a (un)serializeThread that serializes the thread state / context of a specific thread. Assuming that the thread state class uses the CPU-specific thread state uses the base thread state serialization code, this allows us to restore a checkpoint with any of the CPU models.
2013-01-07cpu: Flush TLBs on switchOut()Andreas Sandberg
This changeset inserts a TLB flush in BaseCPU::switchOut to prevent stale translations when doing repeated switching. Additionally, the TLB flushing functionality is exported to the Python to make debugging of switching/checkpointing easier. A simulation script will typically use the TLB flushing functionality to generate a reference trace. The following sequence can be used to simulate a handover (this depends on how drain is implemented, but is generally the case) between identically configured CPU models: m5.drain(test_sys) [ cpu.flushTLBs() for cpu in test_sys.cpu ] m5.resume(test_sys) The generated trace should normally be identical to a trace generated when switching between identically configured CPU models or checkpointing and resuming.
2013-01-07cpu: Rewrite O3 draining to avoid stopping in microcodeAndreas Sandberg
Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after that the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty.
2013-01-07cpu: Make sure that a drained atomic CPU isn't executing ucodeAndreas Sandberg
Currently, the atomic CPU can be in the middle of a microcode sequence when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * Since curMacroStaticInst is populated when executing microcode, repeated switching between CPUs executing microcode leads to incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset fixes a bug where the multiple switches to the same atomic CPU sometimes corrupts the target state because of dangling pointers to the currently executing microinstruction. Note: This changeset moves tick event descheduling from switchOut() to drain(), which makes timing consistent between just draining a system and draining /and/ switching between two atomic CPUs. This makes debugging quite a lot easier (execution traces get the same timing), but the latency of the last instruction before a drain will not be accounted for correctly (it will always be 1 cycle). Note 2: This changeset removes so_state variable, the locked variable, and the tickEvent from checkpoints since none of them contain state that needs to be preserved across checkpoints. The so_state is made redundant because we don't use the drain state variable anymore, the lock variable should never be set when the system is drained, and the tick event isn't scheduled.
2013-01-07cpu: Make sure that a drained timing CPU isn't executing ucodeAndreas Sandberg
Currently, the timing CPU can be in the middle of a microcode sequence or multicycle (stayAtPC is true) instruction when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * If stayAtPC is true we might execute half of an instruction twice when restoring a checkpoint or switching CPUs, which leads to an incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset also fixes a bug where the timing CPU sometimes switches out with while stayAtPC is true, which corrupts the target state after a CPU switch or checkpoint. Note: This changeset removes the so_state variable from checkpoints since the drain state isn't used anymore.
2013-01-07cpu: Fix broken thread context handoverAndreas Sandberg
The thread context handover code used to break when multiple handovers were performed during the same quiesce period. Previously, the thread contexts would assign the TC pointer in the old quiesce event to the new TC. This obviously broke in cases where multiple switches were performed within the same quiesce period, in which case the TC pointer in the quiesce event would point to an old CPU. The new implementation deschedules pending quiesce events in the old TC and schedules a new quiesce event in the new TC. The code has been refactored to remove most of the code duplication.
2013-01-07cpu: Fix O3 LSQ debug dumping constness and formattingAndreas Sandberg
2013-01-07cpu: Fix broken squashAfter implementation in O3 CPUAndreas Sandberg
Commit can currently both commit and squash in the same cycle. This confuses other stages since the signals coming from the commit stage can only signal either a squash or a commit in a cycle. This changeset changes the behavior of squashAfter so that it commits all instructions, including the instruction that requested the squash, in the first cycle and then starts to squash in the next cycle.
2013-01-07o3 cpu: Remove unused variablesAndreas Sandberg
2013-01-07cpu: Rename defer_registration->switched_outAndreas Sandberg
The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive.
2013-01-07cpu: Remove unused params.hh header file in inorder CPUAndreas Sandberg
2013-01-07cpu: Introduce sanity checks when switching between CPUsAndreas Sandberg
This patch introduces the following sanity checks when switching between CPUs: * Check that the set of new and old CPUs do not overlap. Having an overlap between the set of new CPUs and the set of old CPUs is currently not supported. Doing such a switch used to result in the following assertion error: BaseCPU::takeOverFrom(BaseCPU*): \ Assertion `!new_itb_port->isConnected()' failed. * Check that all new CPUs are in the switched out state. * Check that all old CPUs are in the switched in state.
2013-01-07cpu: Correctly call parent on switchOut() and takeOverFrom()Andreas Sandberg
This patch cleans up the CPU switching functionality by making sure that CPU models consistently call the parent on switchOut() and takeOverFrom(). This has the following implications that might alter current functionality: * The call to BaseCPU::switchout() in the O3 CPU is moved from signalDrained() (!) to switchOut(). * A call to BaseSimpleCPU::switchOut() is introduced in the simple CPUs.
2013-01-07cpu: Unify SimpleCPU and O3 CPU serialization codeAndreas Sandberg
The O3 CPU used to copy its thread context to a SimpleThread in order to do serialization. This was a bit of a hack involving two static SimpleThread instances and a magic constructor that was only used by the O3 CPU. This patch moves the ThreadContext serialization code into two global procedures that, in addition to the normal serialization parameters, take a ThreadContext reference as a parameter. This allows us to reuse the serialization code in all ThreadContext implementations.
2013-01-07cpu: Initialize the O3 pipeline from startup()Andreas Sandberg
The entire O3 pipeline used to be initialized from init(), which is called before initState() or unserialize(). This causes the pipeline to be initialized from an incorrect thread context. This doesn't currently lead to correctness problems as instructions fetched from the incorrect start PC will be squashed a few cycles after initialization. This patch will affect the regressions since the O3 CPU now issues its first instruction fetch to the correct PC instead of 0x0.
2013-01-07cpu: Implement a flat register interface in thread contextsAndreas Sandberg
Some architectures map registers differently depending on their mode of operations. There is currently no architecture independent way of accessing all registers. This patch introduces a flat register interface to the ThreadContext class. This interface is useful, for example, when serializing or copying thread contexts.
2013-01-07arch: Move the ISA object to a separate sectionAndreas Sandberg
After making the ISA an independent SimObject, it is serialized automatically by the Python world. Previously, this just resulted in an empty ISA section. This patch moves the contents of the ISA to that section and removes the explicit ISA serialization from the thread contexts, which makes it behave like a normal SimObject during serialization. Note: This patch breaks checkpoint backwards compatibility! Use the cpt_upgrader.py utility to upgrade old checkpoints to the new format.
2013-01-07cpu: Check that the memory system is in the correct modeAndreas Sandberg
This patch adds checks to all CPU models to make sure that the memory system is in the correct mode at startup and when resuming after a drain. Previously, we only checked that the memory system was in the right mode when resuming. This is inadequate since this is a configuration error that should be detected at startup as well as when resuming. Additionally, since the check was done using an assert, it wasn't performed when NDEBUG was set (e.g., the fast target).
2013-01-07cpu: Share the send functionality between traffic generatorsAndreas Hansson
This patch moves the packet creating and sending to a member function in the shared base class to avoid code duplication.
2013-01-07cpu: Add support for protobuf input for the trace generatorAndreas Hansson
This patch adds support for reading input traces encoded using protobuf according to what is done in the CommMonitor. A follow-up patch adds a Python script that can be used to convert the previously used ASCII traces to protobuf equivalents. The appropriate regression input is updated as part of this patch.
2013-01-07cpu: Encapsulate traffic generator input in a streamAndreas Hansson
This patch encapsulates the traffic generator input in a stream class such that the parsing is not visible to the trace generator. The change takes us one step closer to using protobuf-based input traces for the trace replay. The functionality of the current input stream is identical to what it was, and the ASCII format remains the same for now.
2013-01-07cpu: Fix the traffic gen read percentageAndreas Hansson
This patch fixes the computation that determines whether to perform a read or a write such that the two corner cases (0 and 100) are both more efficient and handled correctly.
2013-01-07arch: Make the ISA class inherit from SimObjectAndreas Sandberg
The ISA class on stores the contents of ID registers on many architectures. In order to make reset values of such registers configurable, we make the class inherit from SimObject, which allows us to use the normal generated parameter headers. This patch introduces a Python helper method, BaseCPU.createThreads(), which creates a set of ISAs for each of the threads in an SMT system. Although it is currently only needed when creating multi-threaded CPUs, it should always be called before instantiating the system as this is an obvious place to configure ID registers identifying a thread/CPU.
2013-01-07o3: Fix issue with LLSC ordering and speculationAli Saidi
This patch unlocks the cpu-local monitor when the CPU sees a snoop to a locked address. Previously we relied on the cache to handle the locking for us, however some users on the gem5 mailing list reported a case where the cpu speculatively executes a ll operation after a pending sc operation in the pipeline and that makes the cache monitor valid. This should handle that case by invaliding the local monitor.
2013-01-07cpu: rename the misleading inSyscall to noSquashFromTCAli Saidi
isSyscall was originally created because during handling of a syscall in SE mode the threadcontext had to be updated. However, in many places this is used in FS mode (e.g. fault handlers) and the name doesn't make much sense. The boolean actually stops gem5 from squashing speculative and non-committed state when a write to a threadcontext happens, so re-name the variable to something more appropriate
2013-01-04Decoder: Remove the thread context get/set from the decoder.Gabe Black
This interface is no longer used, and getting rid of it simplifies the decoders and code that sets up the decoders. The thread context had been used to read architectural state which was used to contextualize the instruction memory as it came in. That was changed so that the state is now sent to the decoders to keep locally if/when it changes. That's significantly more efficient. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2012-12-11ruby: modify the directed tester to read/write streamsNilay Vaish
The directed tester supports only generating only read or only write accesses. The patch modifies the tester to support streams that have both read and write accesses.
2012-12-06TournamentBP: Fix some bugs with table sizes and countersErik Tomusk
globalHistoryBits, globalPredictorSize, and choicePredictorSize are decoupled. globalHistoryBits controls how much history is kept, global and choice predictor sizes control how much of that history is used when accessing predictor tables. This way, global and choice predictors can actually be different sizes, and it is no longer possible to walk off the predictor arrays and cause a seg fault. There are now individual thresholds for choice, global, and local saturating counters, so that taken/not taken decisions are correct even when the predictors' counters' sizes are different. The interface for localPredictorSize has been removed from TournamentBP because the value can be calculated from localHistoryBits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2012-12-06inorder cpu: add missing DPRINTF argumentMalek Musleh
Committed by: Nilay Vaish <nilay@cs.wisc.edu>