path: root/src/cpu
Age | Commit message | Author
2015-10-12 | misc: Remove redundant compiler-specific defines | Andreas Hansson
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions.
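A minimal sketch of what such a migration could look like in client code. The classes and the lookup helper are hypothetical and not taken from the gem5 tree; the only points illustrated are that the C++11 `override` keyword can replace a macro such as M5_ATTR_OVERRIDE, and that std::unordered_map can replace a wrapper such as m5::hashmap.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical classes for illustration only; not taken from the gem5 tree.
struct BaseStage
{
    virtual ~BaseStage() = default;
    virtual std::string name() const { return "base"; }
};

struct FetchStage : public BaseStage
{
    // A compiler-abstraction macro is no longer needed here: with C++11
    // compilers the keyword can be written directly.
    std::string name() const override { return "fetch"; }
};

// The standard container is used directly instead of a wrapper type.
// Returns -1 when the operation is not in the table (sketch convention).
inline int
lookupLatency(const std::unordered_map<std::string, int> &table,
              const std::string &op)
{
    auto it = table.find(op);
    return it == table.end() ? -1 : it->second;
}
```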
2015-10-09 | isa: Add parameter to pick different decoder inside ISA | Rekai Gonzalez Alberquilla
The decoder is responsible for splitting instructions into micro-operations (uops). Given that different microarchitectures may split operations differently, this patch allows specifying which microarchitecture each ISA implements, so different cores in the system can split instructions differently. This also decouples uop splitting (microArch) from the ISA (Arch). This is done by making the decoding calls templates that receive a type 'DecoderFlavour' that maps the name of the operation to the class that implements it. This way there is only one selection point (converting the command line enum to the appropriate DecodeFeatures object). In addition, there is no explicit code replication: template instantiation hides that, and the compiler should be able to resolve a number of things at compile-time.
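The flavour-as-template-parameter idea can be sketched roughly as follows. Every name here (the uop classes, the two flavours, the enum, and decodeLoadPairFor) is hypothetical and chosen for illustration; only the name 'DecoderFlavour' comes from the commit message.

```cpp
#include <cassert>

// Hypothetical micro-op classes: two ways to implement the same operation.
struct LoadPairSingleUop { static int numUops() { return 1; } };
struct LoadPairSplitUops { static int numUops() { return 2; } };

// Each flavour maps an operation name to the class implementing it.
struct GenericFlavour { using LoadPair = LoadPairSingleUop; };
struct SplitFlavour { using LoadPair = LoadPairSplitUops; };

// The decode routine is a template over the flavour, so the choice of
// implementation is resolved at compile time.
template <typename DecoderFlavour>
int
decodeLoadPair()
{
    return DecoderFlavour::LoadPair::numUops();
}

// Hypothetical enum standing in for the command-line parameter.
enum class DecoderKind { Generic, Split };

// The single selection point: convert the runtime enum to a flavour type.
inline int
decodeLoadPairFor(DecoderKind kind)
{
    switch (kind) {
      case DecoderKind::Split:
        return decodeLoadPair<SplitFlavour>();
      case DecoderKind::Generic:
      default:
        return decodeLoadPair<GenericFlavour>();
    }
}
```

The decode code is written once; each flavour only supplies type aliases, so the duplication is produced by template instantiation rather than by hand.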
2015-10-06 | sim: add ExecMacro to Exec* compound debug flags | Steve Reinhardt
Really should have been there in the first place, IMO. Makes debugging x86 execution a lot easier.
2015-09-30 | base: remove Trace::enabled flag | Curtis Dunham
The DTRACE() macro tests both Trace::enabled and the specific flag. This change uses the same administrative interface for enabling/disabling tracing, but masks the SimpleFlags settings directly. This eliminates a load for every DTRACE() test, e.g. DPRINTF.
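One way to picture the masking, as a sketch under assumptions rather than gem5's actual SimpleFlag code: each flag remembers what the user requested, and the value read on the hot path is recomputed whenever tracing is switched on or off globally. The class and member names below are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for a debug flag; not gem5's SimpleFlag class.
class SketchFlag
{
    bool requested = false;  // what the user asked for
    bool active = false;     // requested AND tracing globally enabled

  public:
    void enable() { requested = true; }
    void disable() { requested = false; active = false; }

    // Called for each flag when tracing is globally switched on or off.
    void sync(bool tracingOn) { active = requested && tracingOn; }

    // The hot path reads a single value instead of two.
    bool status() const { return active; }
};
```

With this layout a trace test only has to load the flag's own state, which is the saving the commit message describes.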
2015-09-30 | cpu,isa,mem: Add per-thread wakeup logic | Mitch Hayenga
Changes wakeup functionality so that only specific threads on SMT-capable CPUs are woken.
2015-09-30 | isa,cpu: Add support for FS SMT Interrupts | Mitch Hayenga
Adds per-thread interrupt controllers and thread/context logic so that interrupts properly get routed in SMT systems.
2015-09-30 | cpu: Add per-thread monitors | Mitch Hayenga
Adds per-thread address monitors to support FullSystem SMT.
2015-09-30 | config,cpu: Add SMT support to Atomic and Timing CPUs | Mitch Hayenga
Adds SMT support to the "simple" CPU models so that they can be used with other SMT-supported CPUs. Example usage: this enables the TimingSimpleCPU to be used to warm up caches before swapping to detailed mode with the in-order or out-of-order based CPU models.
2015-09-30 | cpu: Change thread assignments for heterogeneous SMT | Mitch Hayenga
Trying to run an SE system with varying threads per core (SMT cores + Non-SMT cores) caused failures due to the CPU id assignment logic. The comment about thread assignment (worrying about core 0 not having tid 0) seems not to be valid given that our configuration scripts initialize them in order. This removes that constraint so a heterogeneously threaded system can work.
2015-09-15 | cpu: pred: Local Predictor Reset in Tournament Predictor | Andrew Lukefahr
When a branch gets squashed, its speculative branch predictor state should get rolled back in squash(). However, only the globalHistory state was being rolled back. This patch adds (at least some) support for rolling back the local predictor state also. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-09-15 | cpu, o3: consider split requests for LSQ checksnoop operations | Hongil Yoon
This patch enables instructions in the LSQ to track two physical addresses, one for each of the two split requests. Later, the information is used in checksnoop() to search for and invalidate the corresponding LD instructions. The previous implementation kept track of only the physical address referenced by the first split request. Thus, for checksnoop(), the line accessed by the second request was not considered, causing potential correctness issues. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-08-29 | ruby: eliminate type uint64 and int64 | Nilay Vaish
These types are being replaced with uint64_t and int64_t.
2015-08-21 | mem: Reflect that packet address and size are always valid | Andreas Hansson
This patch simplifies the packet, and removes the possibility of creating a packet without a valid address and/or size. Under no circumstances are these fields set at a later point, and thus they really have to be provided at construction time. The patch also fixes a case where the MinorCPU creates a packet without a valid address and size, only to later delete it.
2015-08-21 | cpu: Move invldPid constant from Request to BaseCPU | Andreas Hansson
A more natural home for this constant.
2015-08-19 | ruby: reverts to changeset: bf82f1f7b040 | Nilay Vaish
2015-08-14 | ruby: eliminate type uint64 and int64 | Nilay Vaish
These types are being replaced with uint64_t and int64_t.
2015-08-14 | ruby: replace Address by Addr | Nilay Vaish
This patch eliminates the type Address defined by the ruby memory system. The ruby memory system now uses the type Addr, which is already in use by the rest of the system.
2015-08-11 | ruby: drop some redundant includes | Nilay Vaish
2015-08-07 | base: Declare a type for context IDs | Andreas Sandberg
Context IDs used to be declared as ad hoc (usually as int). This changeset introduces a typedef for ContextIDs and a constant for invalid context IDs.
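As a rough illustration of the idea, a typedef plus a sentinel constant might look like the sketch below. The underlying type, the sentinel value, and the isValidContext helper are assumptions made for this example; the commit message only states that a typedef and an invalid-ID constant were introduced.

```cpp
#include <cassert>

// Assumed underlying type and sentinel value (illustrative only).
typedef int ContextID;
const ContextID InvalidContextID = -1;

// Hypothetical helper, not part of the changeset described above.
inline bool
isValidContext(ContextID id)
{
    return id != InvalidContextID;
}
```

Using a named type makes interfaces self-documenting and gives one place to change the representation later.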
2015-07-20 | cpu: Fixed a bug on where to fetch the next instruction from | David Hashe
Figure out if the next instruction to fetch comes from the micro-op ROM or not. Otherwise, wrong instructions may be fetched.
2015-07-31 | cpu: Update debug message from Fetch1 isDrained() in Minor | Andreas Sandberg
Fix a spurious %s and include the state of the Fetch1 stage in the debug printout.
2015-07-31 | cpu: Fix Minor drain issues when switched out | Andreas Sandberg
The Minor CPU currently doesn't drain properly when it is switched out. This happens because Fetch 1 expects to be in the FetchHalted state when it is drained. However, because the CPU is switched out, it is stuck in the FetchWaitingForPC state. Fix this by ignoring drain requests and returning DrainState::Drained from MinorCPU::drain() if the CPU is switched out. This is always safe since a switched out CPU, by definition, doesn't have any instructions in flight.
2015-07-30 | cpu: Only activate thread 0 in Minor if the CPU is active | Andreas Sandberg
Minor currently activates thread 0 in startup() to work around an issue where activateContext() is called from LiveProcess before the process entry point is known. When activateContext() is called, Minor creates a branch instruction to the process's entry point. The first time it is called, the branch points to an undefined location (0). The call in startup() updates the branch to point to the actual entry point. When instantiating a switched out Minor CPU, it still tries to activate thread 0. This is clearly incorrect since a switched out CPU can't have any active threads. This changeset adds a check to ensure that the thread is active before reactivating it.
2015-07-30 | cpu: Fix drain issues in the Minor CPU | Andreas Sandberg
The drain refactor patches introduced a couple of bugs in the way Minor handles draining. This patch fixes an incorrect assert and a case of infinite recursion when the CPU signals drain done.
2015-07-30 | cpu: Fix issue identified by UBSan | Andreas Hansson
2015-07-28 | revert 5af8f40d8f2c | Nilay Vaish
2015-07-26 | cpu: implements vector registers | Nilay Vaish
This adds a vector register type. The type is defined as a std::array of a fixed number of uint64_ts. The isa_parser.py has been modified to parse vector register operands and generate the required code. Different cpus have vector register files now.
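A minimal sketch of a vector register defined as a std::array of uint64_t values. The element count of 8, the type name, and the vecAdd helper are assumptions for illustration; the commit message only says the array holds "a fixed number" of elements.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed width: the commit only says "a fixed number of uint64_ts".
constexpr std::size_t SketchNumVecElems = 8;
using SketchVectorReg = std::array<uint64_t, SketchNumVecElems>;

// Element-wise add, as a hypothetical vector instruction might do.
inline SketchVectorReg
vecAdd(const SketchVectorReg &a, const SketchVectorReg &b)
{
    SketchVectorReg r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = a[i] + b[i];
    return r;
}
```

Because std::array is a value type with a compile-time size, registers of this kind can be copied and stored in a register file like any scalar register value.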
2015-07-26 | cpu: o3: slight correction to indentation in rename_impl.hh | Nilay Vaish
2015-07-10 | ruby: replace global g_abs_controls with per-RubySystem var | Brandon Potter
This is another step in the process of removing global variables from Ruby to enable multiple RubySystem instances in a single simulation. The list of abstract controllers is per-RubySystem and should be represented that way, rather than as a global. Since this is the last remaining Ruby global variable, the src/mem/ruby/Common/Global.* files are also removed.
2015-07-07 | sim: Refactor and simplify the drain API | Andreas Sandberg
The drain() call currently passes around a DrainManager pointer, which is now completely pointless since there is only ever one global DrainManager in the system. It also contains vestiges from the time when SimObjects had to keep track of their child objects that needed draining. This changeset moves all of the DrainState handling to the Drainable base class and changes the drain() and drainResume() calls to reflect this. Particularly, the drain() call has been updated to take no parameters (the DrainManager argument isn't needed) and return a DrainState instead of an unsigned integer (there is no point returning anything other than 0 or 1 any more). Drainable objects should return either DrainState::Draining (equivalent to returning 1 in the old system) if they need more time to drain or DrainState::Drained (equivalent to returning 0 in the old system) if they are already in a consistent state. Returning DrainState::Running is considered an error. Drain done signalling is now done through the signalDrainDone() method in the Drainable class instead of using the DrainManager directly. The new call checks if the state of the object is DrainState::Draining before notifying the drain manager. This means that it is safe to call signalDrainDone() without first checking if the simulator has requested draining. The intention here is to reduce the code needed to implement draining in simple objects.
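The contract described above can be sketched as follows. This is a simplified stand-in, not gem5's Drainable class: SketchDrainable, SketchQueue, requestDrain(), and the notification counter are hypothetical, while DrainState, drain(), and signalDrainDone() follow the names in the commit message.

```cpp
#include <cassert>

// Simplified stand-ins; the real classes have more responsibilities.
enum class DrainState { Running, Draining, Drained };

class SketchDrainable
{
    DrainState state = DrainState::Running;
    int notifications = 0;  // times the (imaginary) manager was notified

  protected:
    // Objects return Draining if they need more time, Drained otherwise.
    virtual DrainState drain() = 0;

  public:
    virtual ~SketchDrainable() = default;

    // What a drain manager would call; records the returned state.
    DrainState requestDrain() { state = drain(); return state; }

    // Safe to call at any time: only notifies while actually draining.
    void signalDrainDone()
    {
        if (state == DrainState::Draining) {
            state = DrainState::Drained;
            ++notifications;
        }
    }

    DrainState drainState() const { return state; }
    int notified() const { return notifications; }
};

// Hypothetical object with a number of outstanding requests.
class SketchQueue : public SketchDrainable
{
    int outstanding;

  protected:
    DrainState drain() override
    {
        return outstanding ? DrainState::Draining : DrainState::Drained;
    }

  public:
    explicit SketchQueue(int n) : outstanding(n) {}

    // Completing the last request signals unconditionally; the base
    // class decides whether anyone needs to be told.
    void complete()
    {
        if (outstanding > 0 && --outstanding == 0)
            signalDrainDone();
    }
};
```

Note that complete() never checks whether a drain was requested; that check lives in signalDrainDone(), which is what keeps simple objects simple.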
2015-07-07 | sim: Make the drain state a global typed enum | Andreas Sandberg
The drain state enum is currently a part of the Drainable interface. The same state machine will be used by the DrainManager to identify the global state of the simulator. Make the drain state a global typed enum to better cater for this usage scenario.
2015-07-07 | sim: Refactor the serialization base class | Andreas Sandberg
Objects that can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section.
* Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects.
* Add a ScopedCheckpointSection helper class. Some objects need to serialize data structures that do not derive from Serializable into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection() helper class. When this class is instantiated, it adds a new /subsection/ and subsequent serialization calls during the lifetime of this helper class happen inside this section (or a subsection in case of nested sections).
* The serialize() call is now const which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects.
* Both the input and output checkpoints now use their own named types. This hides underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code.
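The scoped-section idea from the list above is an instance of RAII, which can be sketched like this. SketchCheckpoint and SketchScopedSection are hypothetical classes written for this example; they do not reproduce the interface of gem5's ScopedCheckpointSection or its checkpoint types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical checkpoint that tracks the current section path.
class SketchCheckpoint
{
    std::vector<std::string> path;

  public:
    void push(const std::string &name) { path.push_back(name); }
    void pop() { path.pop_back(); }

    // Fully qualified name of the current section, e.g. "cpu.regs".
    std::string current() const
    {
        std::string s;
        for (const auto &p : path)
            s += (s.empty() ? "" : ".") + p;
        return s;
    }
};

// RAII helper: the subsection is open for exactly the helper's lifetime.
class SketchScopedSection
{
    SketchCheckpoint &cp;

  public:
    SketchScopedSection(SketchCheckpoint &c, const std::string &name)
        : cp(c)
    {
        cp.push(name);
    }

    ~SketchScopedSection() { cp.pop(); }

    // Copying would pop the section twice, so it is disabled.
    SketchScopedSection(const SketchScopedSection &) = delete;
    SketchScopedSection &operator=(const SketchScopedSection &) = delete;
};
```

Nesting two helpers yields a nested section name, and leaving a scope restores the enclosing section without any manual name bookkeeping.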
2015-07-04 | o3: correct the number of cc registers in rename map | Nilay Vaish
2015-06-01 | kvm, arm: Add support for aarch64 | Andreas Sandberg
This changeset adds support for aarch64 in kvm. The CPU module supports both checkpointing and online CPU model switching as long as no devices are simulated by the host kernel. It currently has the following limitations:
* The system register based generic timer can only be simulated by the host kernel. Workaround: Use a memory mapped timer instead to simulate the timer in gem5.
* Simulating devices (e.g., the generic timer) in the host kernel requires that the host kernel also simulates the GIC.
* ID registers in the host and in gem5 must match for switching between simulated CPUs and KVM. This is particularly important for ID registers describing memory system capabilities (e.g., ASID size, physical address size).
* Switching between a virtualized CPU and a simulated CPU is currently not supported if in-kernel device emulation is used. This could be worked around by adding support for switching to the gem5 (e.g., the KvmGic) side of the device models. A simpler workaround is to avoid in-kernel device models altogether.
2015-06-01 | kvm, arm, dev: Add an in-kernel GIC implementation | Andreas Sandberg
This changeset adds a GIC implementation that uses the kernel's built-in support for simulating the interrupt controller. Since there is currently no support for state transfer between gem5 and the kernel, the device model does not support serialization and CPU switching (which would require switching to a gem5-simulated GIC).
2015-06-01 | kvm: Handle inst events at the current instruction count | Andreas Sandberg
There are cases (particularly when attaching GDB) when instruction events are scheduled at the current instruction tick. This used to trigger an assertion error in kvm. This changeset adds a check for this condition and forces KVM to do a quick entry that completes any pending IO operations, but does not execute any new instructions, before servicing the event. We could check if we need to enter KVM at all, but forcing a quick entry makes the code slightly cleaner and does not hurt correctness (performance is hardly an issue in these cases).
2015-06-01 | kvm, arm: Move ARM-specific files to arch/arm/kvm/ | Andreas Sandberg
This changeset moves the ARM-specific KVM CPU implementation to arch/arm/kvm/. This change is expected to keep the source tree somewhat cleaner as we start adding support for ARMv8 and KVM in-kernel interrupt controller simulation.
--HG--
rename : src/cpu/kvm/ArmKvmCPU.py => src/arch/arm/kvm/ArmKvmCPU.py
rename : src/cpu/kvm/arm_cpu.cc => src/arch/arm/kvm/arm_cpu.cc
rename : src/cpu/kvm/arm_cpu.hh => src/arch/arm/kvm/arm_cpu.hh
2015-05-26 | cpu: Fix a bug in counting issued instructions in MinorCPU | Andrew Bardsley
The MinorCPU would count bubbles in Execute::issue as part of the num_insts_issued and so sometimes reach the instruction issue limit incorrectly. Fixed by checking for a bubble in one new place.
2015-05-23 | kvm: Fix dumping code for large registers | Andreas Sandberg
The register dumping code in kvm tries to print the bytes in large registers (128 bits and larger) instead of printing them as hex. This changeset fixes that.
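For illustration, a byte-wise hex dump of an arbitrarily sized register might be written as below. This is a sketch, not the code from the changeset: the function name, the byte order, and the lack of a prefix are choices made for this example. The cast to unsigned matters in general C++ because streaming a uint8_t directly emits it as a character rather than a number; whether that was the cause of the original bug is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Format a register of arbitrary size as hex, one byte at a time.
// Byte order and the absence of a prefix are choices made for this sketch.
inline std::string
regToHex(const uint8_t *data, std::size_t len)
{
    std::ostringstream os;
    os << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < len; ++i) {
        // Cast so the stream formats a number instead of a character.
        os << std::setw(2) << static_cast<unsigned>(data[i]);
    }
    return os.str();
}
```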
2015-05-23 | kvm, x86: Guard x86-specific APIs in KvmVM | Andreas Sandberg
Protect x86-specific APIs in KvmVM with compile-time guards to avoid breaking ARM builds.
2015-05-15 | misc: Appease gcc 5.1 | Andreas Hansson
Three minor issues are resolved:
1. Apparently gcc 5.1 does not like negation of booleans followed by bitwise AND.
2. Somehow the compiler also gets confused and warns about NoopMachInst being unused (removing it causes compilation errors though). Most likely a compiler bug.
3. There seems to be a number of instances where loop unrolling causes false positives for the array-bounds check. For now, switch to std::array. Potentially we could disable the warning for newer gcc versions, but switching to std::array is probably a good move in any case.
2015-05-05 | mem, cpu: Add a separate flag for strictly ordered memory | Andreas Sandberg
The Request::UNCACHEABLE flag currently has two different functions. The first, and obvious, function is to prevent the memory system from caching data in the request. The second function is to prevent reordering and speculation in CPU models. This changeset gives the order/speculation requirement a separate flag (Request::STRICT_ORDER). This flag prevents CPU models from doing the following optimizations:
* Speculation: CPU models are not allowed to issue speculative loads.
* Write combining: CPU models and caches are not allowed to merge writes to the same cache line.
Note: The memory system may still reorder accesses unless the UNCACHEABLE flag is set. It is therefore expected that the STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent this behavior.
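The split into two independent flags can be sketched as follows. The bit positions, the SketchRequest type, and the two predicate functions are hypothetical; only the flag names UNCACHEABLE and STRICT_ORDER and the two restricted optimizations come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit values; the real ones are defined in gem5's Request.
enum SketchReqFlags : uint32_t {
    SKETCH_UNCACHEABLE = 1u << 0,
    SKETCH_STRICT_ORDER = 1u << 1,
};

struct SketchRequest
{
    uint32_t flags;

    bool isUncacheable() const
    {
        return (flags & SKETCH_UNCACHEABLE) != 0;
    }

    bool isStrictlyOrdered() const
    {
        return (flags & SKETCH_STRICT_ORDER) != 0;
    }
};

// A CPU model may only speculate on loads that are not strictly ordered.
inline bool
maySpeculate(const SketchRequest &req)
{
    return !req.isStrictlyOrdered();
}

// Writes to the same line may be merged only without strict ordering.
inline bool
mayMergeWrites(const SketchRequest &req)
{
    return !req.isStrictlyOrdered();
}
```

A device access would typically carry both flags, whereas an uncacheable access to ordinary memory can set only the first and still allow speculation in this sketch.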
2015-05-05 | mem: Snoop into caches on uncacheable accesses | Andreas Hansson
This patch takes a last step in fixing issues related to uncacheable accesses. We do not separate uncacheable memory from uncacheable devices, and in cases where it is really memory, there are valid scenarios where we need to snoop since we do not support cache maintenance instructions (yet). On snooping an uncacheable access we thus provide data if possible. In essence this makes uncacheable accesses IO coherent. The snoop filter is also queried to steer the snoops, but not updated since the uncacheable accesses do not allocate a block.
2015-05-05 | cpu: Work around gcc 4.9 issues with Num_OpClasses | Andreas Hansson
This patch fixes a recent issue with gcc 4.9 (and possibly more) being convinced that indices outside the array bounds are used when initialising the FUPool members.
2015-04-29 | cpu: o3: replace issueLatency with bool pipelined | Nilay Vaish
Currently, each op class has a parameter issueLat that denotes the cycles after which another op of the same class can be issued. As of now, this latency can either be one cycle (fully pipelined) or same as execution latency of the op (not at all pipelined). The fact that issueLat is a parameter of type Cycles makes one believe that it can be set to any value. To avoid the confusion, the parameter is being renamed as 'pipelined' with type boolean. If set to true, the op would execute in a fully pipelined fashion. Otherwise, it would execute in an unpipelined fashion.
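The two cases described above reduce to a one-line rule, sketched here with a hypothetical SketchOpDesc struct and issueInterval() helper (only the names issueLat, pipelined, and the two latency cases come from the commit message).

```cpp
#include <cassert>

// Hypothetical op description; the field names are illustrative.
struct SketchOpDesc
{
    unsigned opLat;   // execution latency in cycles
    bool pipelined;   // replaces the old issueLat parameter
};

// Cycles before another op of the same class can issue on this unit:
// one cycle if fully pipelined, otherwise the full execution latency.
inline unsigned
issueInterval(const SketchOpDesc &op)
{
    return op.pipelined ? 1 : op.opLat;
}
```

A boolean makes the two supported behaviours explicit, where a Cycles-typed parameter suggested that intermediate values were meaningful.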
2015-04-29 | cpu: o3: single cycle default div microop latency on x86 | Nilay Vaish
This patch sets the default latency of the division microop to a single cycle on x86. This is because the division instructions DIV and IDIV have been implemented as loops of div microops, where each microop computes a single bit of the quotient.
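To see why a single-cycle micro-op latency is reasonable when each micro-op produces one quotient bit, consider the textbook restoring division loop below. It is a generic sketch, not the x86 microcode from gem5; DivState and bitSerialDivide are hypothetical names, and each loop iteration stands in for one div micro-op.

```cpp
#include <cassert>
#include <cstdint>

struct DivState
{
    uint64_t quotient;
    uint64_t remainder;
    unsigned steps;  // number of one-bit steps performed
};

// Restoring division of 32-bit unsigned values, one quotient bit per step.
inline DivState
bitSerialDivide(uint32_t dividend, uint32_t divisor)
{
    assert(divisor != 0);
    DivState s{0, 0, 0};
    for (int bit = 31; bit >= 0; --bit) {
        // Shift in the next dividend bit, most significant first.
        s.remainder = (s.remainder << 1) | ((dividend >> bit) & 1u);
        s.quotient <<= 1;
        // Subtract the divisor when it fits; that sets this quotient bit.
        if (s.remainder >= divisor) {
            s.remainder -= divisor;
            s.quotient |= 1;
        }
        ++s.steps;
    }
    return s;
}
```

In a scheme like this, the total latency of a division is the number of steps times the per-step latency, so a multi-cycle per-step default would multiply across the whole loop.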
2015-04-22 | cpu: remove conditional check (count > 0) on o3 IQ squashes | Brandon Potter
The o3 cpu instruction queue model uses the count variable to track the number of unissued instructions in the queue. Previously, the squash method used this variable to avoid executing the doSquash method when there were no unissued instructions in the pipeline. A corner case problem exists when only issued instructions exist in the pipeline and a squash occurs; the doSquash code is not invoked and subsequently does not clean up state properly.
2015-04-20 | cpu: Remove the InOrderCPU from the tree | Andreas Hansson
This patch takes the final step in removing the InOrderCPU from the tree. Rest in peace. The MinorCPU is now used to model an in-order microarchitecture, and in the long term the MinorCPU will be renamed InOrderCPU.
2015-04-14 | config, cpu: fix progress interval for switched CPUs | Malek Musleh
This patch ensures that the CPU progress Event is triggered for the new set of switched_cpus that get scheduled (e.g. during fast-forwarding). It also avoids printing the interval state if the CPU is currently switched out. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-13 | cpu: re-organizes the branch predictor structure. | Dibakar Gope
Committed by: Nilay Vaish <nilay@cs.wisc.edu>