Age | Commit message | Author |
|
|
|
The Minor and O3 CPU models share the branch prediction
code. Minor relies on the BPredUnit::squash() function
to update the branch predictor tables on a branch
misprediction. This is fine because Minor executes in-order,
so the update is on the correct path. However, when using the
O3 model this causes the branch predictor to be updated on
out-of-order branch mispredictions, which should not
be the case.
This patch guards against speculative update of the branch
prediction tables. On a branch misprediction, BPredUnit::squash()
calls BPredUnit::update(..., squashed = true). The underlying
branch predictor tests the value of squashed: if it is
true, it restores any speculatively updated internal state
it might have (e.g., global/local branch history), then returns;
if it is false, it updates its prediction tables. Previously,
existing predictors did not test the "squashed" parameter.
To accommodate this change, the Minor model must now call
BPredUnit::squash() followed by BPredUnit::update(..., squashed = false)
on branch mispredictions; before, calling BPredUnit::squash()
alone performed the prediction table update.
The effect is a slight MPKI improvement when using the O3
model. A further patch should perform the same modifications
for the indirect target predictor and BTB (less critical).
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
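For illustration, a minimal sketch of the squashed-update guard this
commit describes, using a hypothetical two-bit predictor; the struct,
field names, and table size are invented, not gem5's BPredUnit
interface:

    #include <cstdint>

    using Addr = uint64_t;

    // Hypothetical predictor showing the "squashed" guard.
    struct TwoBitPredictorSketch
    {
        uint8_t table[4096] = {};   // 2-bit saturating counters
        uint64_t globalHistory = 0; // updated speculatively at predict

        void update(Addr pc, bool taken, uint64_t preHistory,
                    bool squashed)
        {
            if (squashed) {
                // Misprediction recovery: roll back speculative
                // history but leave the counter table alone, since
                // this update may be on the wrong (out-of-order) path.
                globalHistory = preHistory;
                return;
            }
            // Commit-time, correct-path update of the tables.
            uint8_t &ctr = table[pc % 4096];
            if (taken && ctr < 3) ++ctr;
            else if (!taken && ctr > 0) --ctr;
        }
    };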
|
|
Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to
Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which
distinguish memory accesses to the INT and FP register banks.
Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72,
where the "latency" of FMADD is 3 if the next instruction is an FMADD
with only the augend-to-destination dependency, and 7 cycles otherwise.
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
|
|
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: Ic37311443ca11ee6d95bceffea599e054e7aa110
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
The behavior of WFI is to cause Minor to cease evaluating
pipeline logic until an interrupt is observed. However,
a user may wish to drain the system while a core is sleeping
due to a WFI. This patch makes the CPU drain while in WFI: if a
drain is requested during a WFI, the CPU is already drained and will
immediately be ready for swapping, checkpointing, etc. This
should not negatively impact performance, as WFI instructions
are 'stream-changing' (treated like unpredicted branches), so
all remaining instructions are on the wrong path and will be
squashed quickly.
Change-Id: I63833d5acb53d8dde78f9f0c9611de0ece385e45
|
|
This patch adds SMT support to the MinorCPU. RoundRobin and
Random thread scheduling are currently supported.
Change-Id: I91faf39ff881af5918cca05051829fc6261f20e3
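A sketch of a round-robin pick of the kind described, with invented
names (not Minor's actual scheduling code):

    #include <cstddef>
    #include <vector>

    // Hypothetical round-robin picker: each cycle, start the search
    // one past the last thread that issued, so all ready threads
    // share the pipeline fairly.
    struct RoundRobinSketch
    {
        std::size_t last = 0;

        // Returns the chosen thread id, or ready.size() if none.
        std::size_t pick(const std::vector<bool> &ready)
        {
            const std::size_t n = ready.size();
            for (std::size_t i = 1; i <= n; ++i) {
                const std::size_t tid = (last + i) % n;
                if (ready[tid]) { last = tid; return tid; }
            }
            return n;
        }
    };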
|
|
Add functionality to the BaseCPU that will put the entire CPU
into a low-power idle state whenever all threads in it are idle.
Change-Id: I984d1656eb0a4863c87ceacd773d2d10de5cfd2b
|
|
MinorCPU fix for corrupt numCycles when resuming from a previous simulation.
---
src/cpu/minor/cpu.cc | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
|
|
In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a CPU.
This is a re-spin of 20264eb after the revert (bd1c6789) and includes
some fixes of that commit.
|
|
The following patches had unexpected interactions with the current
upstream code and have been reverted for now:
e07fd01651f3: power: Add support for power models
831c7f2f9e39: power: Low-power idle power state for idle CPUs
4f749e00b667: power: Add power states to ClockedObject
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
In general, the ThreadID parameter is unnecessary in the memory system
as the ContextID is what is used for the purposes of locks/wakeups.
Since we allocate sequential ContextIDs for each thread on MT-enabled
CPUs, ThreadID is unnecessary as the CPUs can identify the requesting
thread through sideband info (SenderState / LSQ entries) or ContextID
offset from the base ContextID for a CPU.
|
|
Add functionality to the BaseCPU that will put the entire CPU into a low-power
idle state whenever all threads in it are idle.
|
|
|
|
Writes to locked memory addresses (LLSC) did not wake up the locking
CPU, which can lead to deadlocks in multi-core runs. In AtomicSimpleCPU,
recvAtomicSnoop was checking whether the incoming packet was an
invalidation (isInvalidate) and only then handled a locked snoop. But
writes, rather than invalidates, are seen when running without caches
(fast-forward configurations). As a simple fix, handleLockedSnoop is
now also called when the incoming snoop packet is a write.
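A sketch of the condition the fix implies; the enum and function names
are invented for illustration, not gem5's actual snoop code:

    #include <cstdint>

    using Addr = uint64_t;

    enum class SnoopKind { Read, Write, Invalidate };

    // An LLSC reservation must be broken (and the locking CPU woken)
    // on an invalidation OR a plain write, because cacheless runs
    // deliver remote stores as writes.
    bool
    mustHandleLockedSnoop(SnoopKind kind, Addr snoopAddr, Addr reserved)
    {
        return (kind == SnoopKind::Invalidate ||
                kind == SnoopKind::Write) &&
               snoopAddr == reserved;
    }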
|
|
Since the last round of fixes, a few new issues have snuck in. We
should consider switching the regression runs to clang.
|
|
This patch changes how the cache determines whether snoops should be
forwarded from the memory side to the CPU side. Instead of having a
parameter, the cache now looks at the port connected on the CPU side,
and if it is a snooping port, snoops are forwarded. This is less error
prone and leaves fewer parameters to worry about.
The patch also tidies up the CPU classes to ensure that their I-side
port is not snooping, by removing overrides to the snoop request
handler, such that snoop requests will panic via the default
MasterPort implementation.
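A minimal sketch of the idea, with invented types standing in for
gem5's port classes:

    // Derive the behaviour from the connected port instead of
    // carrying a separate forward_snoops parameter.
    struct PeerPortSketch
    {
        virtual bool isSnooping() const = 0;
        virtual ~PeerPortSketch() = default;
    };

    bool
    shouldForwardSnoops(const PeerPortSketch &cpuSidePeer)
    {
        return cpuSidePeer.isSnooping();
    }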
|
|
Result of running 'hg m5style --skip-all --fix-control -a'.
|
|
For historical reasons, the ExecContext interface had a single
function, readMem(), that did two different things depending on
whether the ExecContext supported atomic memory mode (i.e.,
AtomicSimpleCPU) or timing memory mode (all the other models).
In the former case, it actually performed a memory read; in the
latter case, it merely initiated a read access, and the read
completion did not happen until later when a response packet
arrived from the memory system.
This led to some confusing things, including timing accesses
being required to provide a pointer for the return data even
though that pointer was only used in atomic mode.
This patch splits this interface, adding a new initiateMemRead()
function to the ExecContext interface to replace the timing-mode
use of readMem().
For consistency and clarity, the readMemTiming() helper function
in the ISA definitions is renamed to initiateMemRead() as well.
For x86, where the access size is passed in explicitly, we can
also get rid of the data parameter at this level. For other ISAs,
where the access size is determined from the type of the data
parameter, we have to keep the parameter for that purpose.
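A sketch of the split interface with simplified signatures; the Fault
stand-in and the exact parameter lists are assumptions, not gem5's
actual declarations:

    #include <cstdint>

    using Addr = uint64_t;
    using Fault = int;           // simplified stand-in
    constexpr Fault NoFault = 0;

    struct ExecContextSketch
    {
        // Atomic mode: performs the read and fills *data now.
        virtual Fault readMem(Addr addr, uint8_t *data,
                              unsigned size) = 0;

        // Timing mode: merely initiates the access; the data arrives
        // later with the response packet, so no data pointer needed.
        virtual Fault initiateMemRead(Addr addr, unsigned size) = 0;

        virtual ~ExecContextSketch() = default;
    };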
|
|
This patch adds explicit overrides as this is now required when using
"-Wall" with clang >= 3.5, the latter now part of the most recent
XCode. The patch consequently removes "virtual" for those methods
where "override" is added. The latter should be enough of an
indication.
As part of this patch, a few minor issues that clang >= 3.5 complains
about are also resolved (unused methods and variables).
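The pattern the patch applies, in miniature:

    // "override" replaces "virtual" on overriding methods, so the
    // compiler rejects signature mismatches instead of silently
    // introducing a new virtual function.
    struct Base
    {
        virtual void regStats() {}
        virtual ~Base() = default;
    };

    struct Derived : Base
    {
        // Before:  virtual void regStats();
        void regStats() override {}   // after: explicit and checked
    };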
|
|
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap
(and similar) abstractions, as these are no longer needed with gcc 4.7
and clang 3.1 as minimum compiler versions.
|
|
Changes wakeup functionality so that only specific threads on
SMT-capable CPUs are woken.
|
|
Adds per-thread interrupt controllers and thread/context logic
so that interrupts properly get routed in SMT systems.
|
|
Adds per-thread address monitors to support FullSystem SMT.
|
|
This patch simplifies the packet, and removes the possibility of
creating a packet without a valid address and/or size. Under no
circumstances are these fields set at a later point, and thus they
really have to be provided at construction time.
The patch also fixes a case where the MinorCPU created a packet
without a valid address and size, only to delete it later.
|
|
Context IDs used to be declared ad hoc (usually as int). This
changeset introduces a typedef for ContextIDs and a constant for
invalid context IDs.
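The change likely amounts to something of this shape (hedged; the real
definitions live in gem5's base types header):

    typedef int ContextID;
    const ContextID InvalidContextID = -1;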
|
|
Fix a spurious %s and include the state of the Fetch1 stage in the
debug printout.
|
|
The Minor CPU currently doesn't drain properly when it is switched
out. This happens because Fetch1 expects to be in the FetchHalted
state when it is drained. However, because the CPU is switched out, it
is stuck in the FetchWaitingForPC state. Fix this by ignoring drain
requests and returning DrainState::Drained from MinorCPU::drain() if
the CPU is switched out. This is always safe since a switched out CPU,
by definition, doesn't have any instructions in flight.
|
|
Minor currently activates thread 0 in startup() to work around an
issue where activateContext() is called from LiveProcess before the
process entry point is known. When activateContext() is called, Minor
creates a branch instruction to the process's entry point. The first
time it is called, the branch points to an undefined location (0). The
call in startup() updates the branch to point to the actual entry
point.
When instantiating a switched out Minor CPU, it still tries to
activate thread 0. This is clearly incorrect since a switched out CPU
can't have any active threads. This changeset adds a check to ensure
that the thread is active before reactivating it.
|
|
The drain refactor patches introduced a couple of bugs in the way
Minor handles draining. This patch fixes an incorrect assert and a
case of infinite recursion when the CPU signals drain done.
|
|
|
|
This adds a vector register type. The type is defined as a std::array of a
fixed number of uint64_ts. isa_parser.py has been modified to parse vector
register operands and generate the required code. The different CPU models
now have vector register files.
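A sketch of the container described above; the element count here is
illustrative, not gem5's chosen width:

    #include <array>
    #include <cstddef>
    #include <cstdint>

    // Fixed-size array of 64-bit chunks backing a vector register.
    constexpr std::size_t NumVecRegElems = 4;  // illustrative width
    using VecRegSketch = std::array<uint64_t, NumVecRegElems>;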
|
|
The drain() call currently passes around a DrainManager pointer, which
is now completely pointless since there is only ever one global
DrainManager in the system. It also contains vestiges from the time
when SimObjects had to keep track of their child objects that needed
draining.
This changeset moves all of the DrainState handling to the Drainable
base class and changes the drain() and drainResume() calls to reflect
this. Particularly, the drain() call has been updated to take no
parameters (the DrainManager argument isn't needed) and return a
DrainState instead of an unsigned integer (there is no point returning
anything other than 0 or 1 any more). Drainable objects should return
either DrainState::Draining (equivalent to returning 1 in the old
system) if they need more time to drain or DrainState::Drained
(equivalent to returning 0 in the old system) if they are already in a
consistent state. Returning DrainState::Running is considered an
error.
Drain done signalling is now done through the signalDrainDone() method
in the Drainable class instead of using the DrainManager directly. The
new call checks if the state of the object is DrainState::Draining
before notifying the drain manager. This means that it is safe to call
signalDrainDone() without first checking if the simulator has
requested draining. The intention here is to reduce the code needed to
implement draining in simple objects.
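An illustrative, simplified shape of the refactored interface (not
gem5's exact Drainable class):

    enum class DrainState { Running, Draining, Drained };

    struct DrainableSketch
    {
        DrainState state = DrainState::Running;

        // Return Drained if already quiescent, Draining if more time
        // is needed; returning Running is considered an error.
        virtual DrainState drain() = 0;
        virtual void drainResume() { state = DrainState::Running; }

        // Safe to call unconditionally: it only signals completion
        // when the object is actually draining.
        void signalDrainDone()
        {
            if (state == DrainState::Draining) {
                state = DrainState::Drained;
                // ...notify the global DrainManager here...
            }
        }

        virtual ~DrainableSketch() = default;
    };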
|
|
The drain state enum is currently a part of the Drainable
interface. The same state machine will be used by the DrainManager to
identify the global state of the simulator. Make the drain state a
global typed enum to better cater for this usage scenario.
|
|
Objects that can be serialized are supposed to inherit from the
Serializable class. This class is meant to provide a unified API for
such objects. However, so far it has mainly been used by SimObjects
due to some fundamental design limitations. This changeset redesigns
the serialization interface to make it more generic and to hide the
underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current
object. Previously, objects that needed this functionality would
use ad-hoc solutions using nameOut() and section name
generation. In the new world, an object that implements the
interface has the methods serializeSection() and
unserializeSection() that serialize into a named /subsection/ of
the current object. Calling serialize() serializes an object into
the current section.
* Move the name() method from Serializable to SimObject as it is no
longer needed for serialization. The fully qualified section name
is generated by the main serialization code on the fly as objects
serialize sub-objects.
* Add a ScopedCheckpointSection helper class. Some objects need to
serialize data structures that do not derive from Serializable
into subsections; previously, this was done using nameOut() and
manual section name generation. When a ScopedCheckpointSection is
instantiated, it adds a new /subsection/, and subsequent
serialization calls during its lifetime happen inside this
section (or a subsection in the case of nested sections).
* The serialize() call is now const which prevents accidental state
manipulation during serialization. Objects that rely on modifying
state can use the serializeOld() call instead. The default
implementation simply calls serialize(). Note: The old-style calls
need to be explicitly called using the
serializeOld()/serializeSectionOld() style APIs. These are used by
default when serializing SimObjects.
* Both the input and output checkpoints now use their own named
types. This hides underlying checkpoint implementation from
objects that need checkpointing and makes it easier to change the
underlying checkpoint storage code.
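A hypothetical serialize() implementation following the description
above; MyObject, counter, child, and bufferContents are made-up names,
and the exact gem5 signatures may differ:

    // Hypothetical example; names other than the serialization API
    // are invented for illustration.
    void
    MyObject::serialize(CheckpointOut &cp) const
    {
        // Scalars go into the current section.
        SERIALIZE_SCALAR(counter);

        // A Serializable child serializes into a named subsection.
        child.serializeSection(cp, "child");

        // Non-Serializable helper state gets its own scoped
        // subsection; calls during sec's lifetime land inside it.
        ScopedCheckpointSection sec(cp, "buffer");
        SERIALIZE_CONTAINER(bufferContents);
    }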
|
|
The MinorCPU would count bubbles in Execute::issue as part of
the num_insts_issued and so sometimes reach the instruction
issue limit incorrectly.
Fixed by checking for a bubble in one new place.
|
|
The Request::UNCACHEABLE flag currently has two different
functions. The first, and obvious, function is to prevent the memory
system from caching data in the request. The second function is to
prevent reordering and speculation in CPU models.
This changeset gives the order/speculation requirement a separate flag
(Request::STRICT_ORDER). This flag prevents CPU models from doing the
following optimizations:
* Speculation: CPU models are not allowed to issue speculative
loads.
* Write combining: CPU models and caches are not allowed to merge
writes to the same cache line.
Note: The memory system may still reorder accesses unless the
UNCACHEABLE flag is set. It is therefore expected that the
STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent
this behavior.
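A small sketch of the flag split; the bit positions and namespace are
invented for illustration:

    #include <cstdint>

    namespace sketch {
    constexpr uint64_t UNCACHEABLE  = 1ULL << 0; // bypass the caches
    constexpr uint64_t STRICT_ORDER = 1ULL << 1; // no speculation or
                                                 // write combining
    // Device memory typically wants both behaviours combined.
    constexpr uint64_t DEVICE_FLAGS = UNCACHEABLE | STRICT_ORDER;
    }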
|
|
This patch fixes a recent issue with gcc 4.9 (and possibly more) being
convinced that indices outside the array bounds are used when
initialising the FUPool members.
|
|
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
|
The totalInstructions counter is only incremented when the whole
instruction is committed, not on every microop. It was incorrectly
reset in the atomic and timing CPUs.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
|
Refactor the way that specific MemCmd values are generated for packets.
The new approach is a little more elegant in that we assign the right
value up front, and it's also more amenable to non-heap-allocated
Packet objects.
Also replaced the code in the Minor model that was still doing it the
ad-hoc way.
This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249.
|
|
This patch fixes a long-standing issue with the port flow
control. Before this patch, the retry mechanism was shared between all
the different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.
The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.
The biggest change in functionality is in the cache's downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses; it is now split into two queues, each with its own
flow control, but sharing the same physical MasterPort. These changes
fix the previously seen deadlocks.
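An illustrative, simplified shape of the per-class flow control (not
the full gem5 port interface):

    struct TimingPortSketch
    {
        // Requests: returning false means "blocked, wait for
        // recvReqRetry before trying again".
        bool sendTimingReq() { return true; }
        virtual void recvReqRetry() = 0;

        // Responses have their own, independent retry channel, so a
        // blocked request no longer stalls a (snoop) response.
        bool sendTimingResp() { return true; }
        virtual void recvRespRetry() = 0;

        virtual ~TimingPortSketch() = default;
    };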
|
|
Finally took the plunge and made this apply to all ISAs, not just ARM.
|
|
Track memory size and flags as well as add some comments and consts.
|
|
|
|
Minor was reporting data cache accesses as ".inst" accesses.
This just switches the MasterPortID to dataMasterPortId.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
|
This patch fixes a case where a store in Minor's store buffer never
leaves the store buffer because it is prematurely counted as having
been issued, leaving the store buffer idle.
LSQ::StoreBuffer::numUnissuedAccesses should count the number of accesses
either in memory, or still in the store buffer after being completed.
For stores which are also barriers, the store stays in the store
buffer for a cycle after it is completed and is cleaned up by the
barrier-clearing code (to ensure that barriers are completed in order).
To achieve this, numUnissuedAccesses is now decremented not when a
store-barrier is issued to memory, but when its barrier effect is cleared.
Without this patch, the correct behaviour occurs when a memory transaction
is immediately accepted, but not if it needs a retry.
|
|
This patch fixes the checking of the number of memory instructions issued
per cycle in the Minor CPU.
|
|
This patch simplifies how we deal with dynamically allocated data in
the packet, always assuming that it is array allocated, and hence
should be array deallocated (delete[] as opposed to delete). The only
use of dataDynamic was in the Ruby testers.
The ARRAY_DATA flag in the packet is removed accordingly. No
defragmentation of the flags is done at this point, leaving a gap in
the bit masks.
As the last part of the patch, dataDynamicArray is renamed to dataDynamic.
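The invariant the patch establishes, in miniature (struct and member
names invented):

    #include <cstddef>
    #include <cstdint>

    // Packet data is always array-allocated, so it is always
    // array-deallocated.
    struct PacketDataSketch
    {
        uint8_t *data = nullptr;

        void allocate(std::size_t size) { data = new uint8_t[size]; }

        ~PacketDataSketch() { delete[] data; }  // never plain delete
    };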
|