summaryrefslogtreecommitdiff
path: root/src/cpu
AgeCommit message (Collapse)Author
2013-05-30cpu: Make hash struct instead of class to please clangAndreas Hansson
This patch changes the type of the hash function for BasicBlockRanges to match the original definition of the templatized type. Without this, clang raises a warning and combined with the "-Werror" flag this causes compilation to fail.
2013-05-14cpu: remove local/globalHistoryBits params from branch predAnthony Gutierrez
having separate params for the local/globalHistoryBits and the local/globalPredictorSize can lead to inconsistencies if they are not carefully set. this patch dervies the number of bits necessary to index into the local/global predictors based on their size. the value of the localHistoryTableSize for the ARM O3 CPU has been increased to 1024 from 64, which is more accurate for an A15 based on some correlation against A15 hardware.
2013-05-14kvm: Add support for disabling coalesced MMIOAndreas Sandberg
Add the option useCoalescedMMIO to the BaseKvmCPU. The default behavior is to disable coalesced MMIO since this hasn't been heavily tested.
2013-05-14kvm: Dump state before panic in KVM exit handlersAndreas Sandberg
2013-05-14kvm: Fix the memory interface used by KVMAndreas Sandberg
The CpuPort class was removed before the KVM patches were committed, which means that the KVM interface currently doesn't compile. This changeset adds the BaseKvmCPU::KVMCpuPort class which derives from MasterPort. This class is used on the data and instruction ports instead of the old CpuPort.
2013-05-02kvm: Add a stat counting number of instructions executedAndreas Sandberg
This changeset adds a 'numInsts' stat to the KVM-based CPU. It also cleans up the variable names in kvmRun to make the distinction between host cycles and estimated simulated cycles clearer. As a bonus feature, it also fixes a warning (unreferenced variable) when compiling in fast mode.
2013-05-02kvm: Add checkpoint debug printAndreas Sandberg
Add a debug print (when the Checkpoint debug flag is set) on serialize and unserialize. Additionally, dump the KVM state before serializing. The KVM state isn't dumped after unserializing since the state is loaded lazily on the next KVM entry.
2013-05-02kvm: Make MMIO requests uncacheableAndreas Sandberg
Device accesses are normally uncacheable. This change probably doesn't make any difference since we normally disable caching when KVM is active. However, there might be devices that check this, so we'd better enable this flag to be safe.
2013-04-23cpu: Fix TraceGen flag initalisationAndreas Hansson
This patch ensures the flags are always initialised.
2013-04-22cpu: Use request flags in trace playbackAndreas Hansson
This patch changes the TraceGen such that it uses the optional request flags from the protobuf trace if they are present.
2013-04-22cpu: Make the generators usable outside the TrafficGen moduleAndreas Hansson
This patch enables the use of the generator behaviours outside the TrafficGen module. This is useful e.g. to allow packet replay modes for other devices in the system without having to replace them with a TrafficGen in the configuration files. This change also enables more specific behaviours to be composed as specific modules, e.g. BaseBandModem can use a number of generators and have application-specific parameters based around a specific set of generators.
2013-04-22kvm: Add support for pseudo-ops on ARMAndreas Sandberg
This changeset adds support for m5 pseudo-ops when running in kvm-mode. Unfortunately, we can't trap the normal gem5 co-processor entry in KVM (it doesn't seem to be possible to trap accesses to non-existing co-processors). We therefore use BZJ instructions to cause a trap from virtualized mode into gem5. The BZJ instruction is becomes a normal branch to the gem5 fallback code when running in simulated mode, which means that this patch does not need to change the ARM ISA-specific code. Note: This requires a patched host kernel.
2013-04-22kvm: Add support for state dumping on ARMAndreas Sandberg
2013-04-22kvm: Add basic support for ARMAndreas Sandberg
Architecture specific limitations: * LPAE is currently not supported by gem5. We therefore panic if LPAE is enabled when returning to gem5. * The co-processor based interface to the architected timer is unsupported. We can't support this due to limitations in the KVM API on ARM. * M5 ops are currently not supported. This requires either a kernel hack or a memory mapped device that handles the guest<->m5 interface.
2013-04-22kvm: Add experimental support for a perf-based execution timerAndreas Sandberg
Add support for using the CPU cycle counter instead of a normal POSIX timer to generate timed exits to gem5. This should, in theory, provide better resolution when requesting timer signals. The perf-based timer requires a fairly recent kernel since it requires a working PERF_EVENT_IOC_PERIOD ioctl. This ioctl has existed in the kernel for a long time, but it used to be completely broken due to an inverted match when the kernel copied things from user space. Additionally, the ioctl does not change the sample period correctly on all kernel versions which implement it. It is currently only known to work reliably on kernel version 3.7 and above on ARM.
2013-04-22kvm: Avoid synchronizing the TC on every KVM exitAndreas Sandberg
Reduce the number of KVM->TC synchronizations by overloading the getContext() method and only request an update when the TC is requested as opposed to every time KVM returns to gem5.
2013-04-22kvm: Basic support for hardware virtualized CPUsAndreas Sandberg
This changeset introduces the architecture independent parts required to support KVM-accelerated CPUs. It introduces two new simulation objects: KvmVM -- The KVM VM is a component shared between all CPUs in a shared memory domain. It is typically instantiated as a child of the system object in the simulation hierarchy. It provides access to KVM VM specific interfaces. BaseKvmCPU -- Abstract base class for all KVM-based CPUs. Architecture dependent CPU implementations inherit from this class and implement the following methods: * updateKvmState() -- Update the architecture-dependent KVM state from the gem5 thread context associated with the CPU. * updateThreadContext() -- Update the thread context from the architecture-dependent KVM state. * dump() -- Dump the KVM state using (optional). In order to deliver interrupts to the guest, CPU implementations typically override the tick() method and check for, and deliver, interrupts prior to entering KVM. Hardware-virutalized CPU currently have the following limitations: * SE mode is not supported. * PC events are not supported. * Timing statistics are currently very limited. The current approach simply scales the host cycles with a user-configurable factor. * The simulated system must not contain any caches. * Since cycle counts are approximate, there is no way to request an exact number of cycles (or instructions) to be executed by the CPU. * Hardware virtualized CPUs and gem5 CPUs must not execute at the same time in the same simulator instance. * Only single-CPU systems can be simulated. * Remote GDB connections to the guest system are not supported. Additionally, m5ops requires an architecture specific interface and might not be supported.
2013-04-22cpu: Let python scripts obtain the number of instructions executedTimothy M. Jones
2013-04-22arm: Enable support for triggering a sim panic on kernel panicsAndreas Sandberg
Add the options 'panic_on_panic' and 'panic_on_oops' to the LinuxArmSystem SimObject. When these option are enabled, the simulator panics when the guest kernel panics or oopses. Enable panic on panic and panic on oops in ARM-based test cases.
2013-04-22sim: separate nextCycle() and clockEdge() in clockedObjectsDam Sunwoo
Previously, nextCycle() could return the *current* cycle if the current tick was already aligned with the clock edge. This behavior is not only confusing (not quite what the function name implies), but also caused problems in the drainResume() function. When exiting/re-entering the sim loop (e.g., to take checkpoints), the CPUs will drain and resume. Due to the previous behavior of nextCycle(), the CPU tick events were being rescheduled in the same ticks that were already processed before draining. This caused divergence from runs that did not exit/re-entered the sim loop. (Initially a cycle difference, but a significant impact later on.) This patch separates out the two behaviors (nextCycle() and clockEdge()), uses nextCycle() in drainResume, and uses clockEdge() everywhere else. Nothing (other than name) should change except for the drainResume timing.
2013-04-22cpu: generate SimPoint basic block vector profilesDam Sunwoo
This patch is based on http://reviews.m5sim.org/r/1474/ originally written by Mitch Hayenga. Basic block vectors are generated (simpoint.bb.gz in simout folder) based on start and end addresses of basic blocks. Some comments to the original patch are addressed and hooks are added to create and resume from checkpoints based on instruction counts dictated by external SimPoint analysis tools. SimPoint creation/resuming options will be implemented as a separate patch.
2013-04-22cpu: fix a switching issue with the o3 cpu.Ali Saidi
This change fixes the switcheroo test that broke earlier this month. The code that was checking for the pipeline being blocked wasn't checking for a pending translation, only for a icache access.
2013-03-29o3cpu: commit: changes interrupt handlingNilay Vaish
Currently the commit stage keeps a local copy of the interrupt object. Since the interrupt is usually handled several cycles after the commit stage becomes aware of it, it is possible that the local copy of the interrupt object may not be the interrupt that is actually handled. It is possible that another interrupt occurred in the interval between interrupt detection and interrupt handling. This patch creates a copy of the interrupt just before the interrupt is handled. The local copy is ignored.
2013-03-26cpu: Remove CpuPort and use MasterPort in the CPU classesAndreas Hansson
This patch changes the port in the CPU classes to use MasterPort instead of the derived CpuPort. The functions of the CpuPort are now distributed across the relevant subclasses. The port accessor functions (getInstPort and getDataPort) now return a MasterPort instead of a CpuPort. This simplifies creating derivative CPUs that do not use the CpuPort.
2013-03-20cpu: Avoid including inorder TLBUnit to avoid gcc LTO bugAndreas Hansson
This patch comments out the inclusion of the inorder TLBUnit which is only used in the 9-stage pipeline. With the TLBUnit present, gcc >= 4.6 in combination with LTO ends up throwing away the definition of the TLBUnit destructor, and consequently fail to link. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53808 for more details about the bug, and http://gcc.gnu.org/ml/gcc/2012-06/msg00397.html for the discussion thread that also touches on similar issues seen with clang.
2013-03-12cpu: Fix state transition bug in the traffic generatorAndreas Sandberg
The traffic generator used to incorrectly determine the next state in when state 0 had a non-zero probability. Due to the way the next transition was determined, state 0 could never be entered other than as an initial state. This changeset updates the transitition() method to correctly handle such cases and cases where the transition matrix is a 1x1 matrix.
2013-03-04cpu: fix a switching issue with the o3 cpu.Ali Saidi
This change fixes the switcheroo test that broke earlier this month. The code that was checking for the pipeline being blocked wasn't checking for a pending translation, only for a icache access.
2013-02-19scons: Fix warnings issued by clang 3.2svn (XCode 4.6)Andreas Hansson
This patch fixes the warnings that clang3.2svn emit due to the "-Wall" flag. There is one case of an uninitialised value in the ARM neon ISA description, and then a whole range of unused private fields that are pruned.
2013-02-19scons: Add warning for missing declarationsAndreas Hansson
This patch enables warnings for missing declarations. To avoid issues with SWIG-generated code, the warning is only applied to non-SWIG code.
2013-02-19scons: Fix up numerous warnings about name shadowingAndreas Hansson
This patch address the most important name shadowing warnings (as produced when using gcc/clang with -Wshadow). There are many locations where constructor parameters and function parameters shadow local variables, but these are left unchanged.
2013-02-19x86: Move APIC clock divider to PythonAndreas Hansson
This patch moves the 16x APIC clock divider to the Python code to avoid the post-instantiation modifications to the clock. The x86 APIC was the only object setting the clock after creation time and this required some custom functionality and configuration. With this patch, the clock multiplier is moved to the Python code and the objects are instantiated with the appropriate clock.
2013-02-19mem: Add predecessor to SenderState base classAndreas Hansson
This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses. There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest.
2013-02-15cpu: Document exec trace flagsAndreas Sandberg
2013-02-15cpu: Avoid duplicate entries in tracking structures for writes to misc regsGeoffrey Blake
setMiscReg currently makes a new entry for each write to a misc reg without checking for duplicates, this can cause a triggering of the assert if an instruction get replayed and writes to the same misc regs multiple times. This fix prevents duplicate entries and instead updates the value.
2013-02-15cpu: Fix rename mis-handling serializing instructions when resource constrainedGeoffrey Blake
The rename can mis-handle serializing instructions (i.e. strex) if it gets into a resource constrained situation and the serializing instruction has to be placed on the skid buffer to handle blocking. In this situation the instruction informs the pipeline it is serializing and logs that the next instruction must be serialized, but since we are blocking the pipeline defers this action to place the serializing instruction and incoming instructions into the skid buffer. When resuming from blocking, rename will pull the serializing instruction from the skid buffer and the current logic will see this as the "next" instruction that has to be serialized and because of flags set on the serializing instruction, it passes through the pipeline stage as normal and resets rename to non-serializing. This causes instructions to follow the serializing inst incorrectly and eventually leads to an error in the pipeline. To fix this rename should check first if it has to block before checking for serializing instructions.
2013-02-15o3: fix tick used for renaming and issue with range selectionMatt Horsnell
Fixes the tick used from rename: - previously this gathered the tick on leaving rename which was always 1 less than the dispatch. This conflated the decode ticks when back pressure built in the pipeline. - now picks up tick on entry. Added --store_completions flag: - will additionally display the store completion tail in the viewer. - this highlights periods when large numbers of stores are outstanding (>16 LSQ blocking) Allows selection by tick range (previously this caused an infinite loop)
2013-02-15sim: Add a system-global option to bypass cachesAndreas Sandberg
Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virutalized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches. To make memory mode tests cleaner, the following methods are added to the System class: * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed. The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.
2013-02-15cpu: Refactor memory system checksAndreas Sandberg
CPUs need to test that the memory system is in the right mode in two places, when the CPU is initialized (unless it's switched out) and on a drainResume(). This led to some code duplication in the CPU models. This changeset introduces the verifyMemoryMode() method which is called by BaseCPU::init() if the CPU isn't switched out. The individual CPU models are responsible for calling this method when resuming from a drain as this code is CPU model specific.
2013-02-15cpu: Make checker CPUs inherit from CheckerCPU in the Python hierarchyAndreas Sandberg
Checker CPUs currently don't inherit from the CheckerCPU in the Python object hierarchy. This has two consequences: * It makes CPU model discovery from the Python world somewhat complicated as there is no way of testing if a CPU is a checker. * Parameters are duplicated in the checker configuration specification. This changeset makes all checker CPUs inherit from the base checker CPU class.
2013-02-15cpu: Add CPU metadata om the Python classesAndreas Sandberg
The configuration scripts currently hard-code the requirements of each CPU. This is clearly not optimal as it makes writing new configuration scripts painful and adding new CPU models requires existing scripts to be updated. This patch adds the following class methods to the base CPU and all relevant CPUs: * memory_mode -- Return a string describing the current memory mode (invalid/atomic/timing). * require_caches -- Does the CPU model require caches? * support_take_over -- Does the CPU support CPU handover?
2013-02-15cpu: include set in o3/commit_impl.Ali Saidi
While the majority of compilers seemed to pickup set from else where, one version of gcc 4.7 complains, so explictly add it.
2013-02-15cpu: fix case with o3 cpu blocking and unblocking decode in cycleAli Saidi
Fix a case in the O3 CPU where the decode stage blocks and unblocks in a single cycle sending both signals to fetch which causes an assert or worse. The previous check could never work before since the status was set to Blocked before a test for the status being Unblocking was executed.
2013-02-15cpu: Fix a livelock in the o3 cpu.Ali Saidi
Check if an instruction just enabled interrupts and we've previously had an interrupt pending that was not handled because interrupts were subsequently disabled before the pipeline reached a place to handle the interrupt. In that case squash now to make sure the interrupt is handled.
2013-01-24branch predictor: move out of o3 and inorder cpusNilay Vaish ext:(%2C%20Timothy%20Jones%20%3Ctimothy.jones%40cl.cam.ac.uk%3E)
This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository. --HG-- rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh
2013-01-22o3 cpu: fix zero reg problemAndrea Pellegrini
There was an issue w/ the rename logic, which would assign a previous physical register to the ZeroReg architectural register in x86. This issue was giving problems for instructions squashed in threads w/ ID different from 0, sometimes allowing non-mispredicted instructions to obtain a value different from zero when reading the zeroReg.
2013-01-22x86, cpu: corrects 270c9a75e91f, take over decoder on cpu switchNilay Vaish
The changes made by the changeset 270c9a75e91f do not work well with switching of cpus. The problem is that decoder for the old thread context holds state that is not taken over by the new decoder. This patch adds a takeOverFrom() function to Decoder class in each ISA. Except for x86, functions in other ISAs are blank. For x86, the function copies state from the old decoder to the new decoder.
2013-01-19O3 IEW: Make incrWb and decrWb clearerJoel Hestness
Move the increment/decrement of wbOutstanding outside of the comparison in incrWb and decrWb in the IEW. This also fixes a compiler bug with gcc 4.4.7, which incorrectly optimizes "-- ==" as "-=".
2013-01-17ruby: remove calls to g_system_ptr->getTime()Nilay Vaish
This patch further removes calls to g_system_ptr->getTime() where ever other clocked objects are available for providing current time.
2013-01-12base simple cpu: removes commented out code about cache opsNilay Vaish
2013-01-12x86: Changes to decoder, corrects 9376Nilay Vaish
The changes made by the changeset 9376 were not quite correct. The patch made changes to the code which resulted in decoder not getting initialized correctly when the state was restored from a checkpoint. This patch adds a startup function to each ISA object. For x86, this function sets the required state in the decoder. For other ISAs, the function is empty right now.