gem5 - gem5

Age	Commit message	Author
2014-09-20	base: Add getSectionNames to IniFile	Andrew Bardsley
	Add an accessor to IniFile to list all the sections in the file.
2014-09-20	cpu: Add ExecFlags debug flag	Mitch Hayenga
	Adds a debug flag to print out the flags an instruction is tagged with.
2014-09-20	mem: Remove the GHB prefetcher from the source tree	Mitch Hayenga
	There are two primary issues with this code which make it deserving of deletion. 1) GHB is a way to structure a prefetcher, not a definitive type of prefetcher 2) This prefetcher isn't even structured like a GHB prefetcher. It's basically a worse version of the stride prefetcher. It primarily serves to confuse new gem5 users and most functionality is already present in the stride prefetcher.
2014-09-20	cpu: use probes infrastructure to do simpoint profiling	Dam Sunwoo
	Instead of having code embedded in cpu model to do simpoint profiling use the probes infrastructure to do it.
2014-09-20	config: Cleanup .json config file generation	Andrew Bardsley
	This patch 'completes' .json config files generation by adding in the SimObject references and String-valued parameters not currently printed. TickParamValues are also changed to print in the same tick-value format as in .ini files. This allows .json files to describe a system as fully as the .ini files currently do. This patch adds a new function config_value (which mirrors ini_str) to each ParamValue and to SimObject. This function can then be explicitly changed to give different .json and .ini printing behaviour rather than being written in terms of ini_str.
2014-09-19	arch: Pass faults by const reference where possible	Andreas Hansson
	This patch changes how faults are passed between methods in an attempt to copy as few reference-counting pointer instances as possible. This should avoid unnecessary copies being created, each of which contributes to the increment/decrement of the reference counters.
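	The effect on the reference count can be illustrated with a minimal sketch. It assumes std::shared_ptr as a stand-in for gem5's own reference-counting pointer; FaultBase, countByValue and countByConstRef are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a fault class; gem5's real fault types are
// not reproduced here.
struct FaultBase
{
    virtual ~FaultBase() = default;
};

// Assumption: std::shared_ptr plays the role of the reference-counting
// pointer mentioned in the commit message.
using Fault = std::shared_ptr<FaultBase>;

// Pass by value: the pointer is copied, so the reference count is
// incremented on entry and decremented again on return.
long
countByValue(Fault fault)
{
    return fault.use_count();
}

// Pass by const reference: no copy is made, so the count is untouched.
long
countByConstRef(const Fault &fault)
{
    return fault.use_count();
}
```

	With a single owner, countByValue observes a count of 2 while countByConstRef observes 1, which is the extra increment/decrement traffic the patch aims to avoid.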
2014-09-19	cpu: Use a deque in o3 rename instruction queue	Andreas Hansson
	Switch from a list to a data structure with better data layout.
2014-09-19	base: Ensure the CP annotation compiles again	Andreas Hansson
	A bit of revamping to get the CP annotate functionality to compile.
2014-09-19	misc: Use safe_cast when assumptions are made about return value	Andreas Hansson
	This patch changes two dynamic_cast to safe_cast as we assume the return value is not NULL (without checking).
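	The idea behind a checked cast of this kind can be sketched as follows. This is not gem5's safe_cast implementation; checked_cast, Base, Derived and readValue are hypothetical names, and the sketch only shows how an unchecked assumption about a dynamic_cast result can be turned into an assertion.

```cpp
#include <cassert>

// Hypothetical helper: performs a dynamic_cast and asserts that the
// result is not null, instead of silently assuming so at the call site.
template <class T, class U>
T
checked_cast(U *ptr)
{
    T ret = dynamic_cast<T>(ptr);
    assert(ret != nullptr && "cast was assumed to succeed");
    return ret;
}

struct Base
{
    virtual ~Base() = default;
};

struct Derived : Base
{
    int value = 7;
};

// The caller dereferences the result without a null check, which is
// only reasonable because checked_cast asserts on failure.
int
readValue(Base *b)
{
    return checked_cast<Derived *>(b)->value;
}
```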
2014-09-19	misc: Restore ostream flags where needed	Andreas Hansson
	This patch ensures we adhere to the normal ostream usage rules, and restore the flags after modifying them.
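	A minimal sketch of the save/restore pattern, using only the standard library. printHex and demo are hypothetical names; the code touched by the patch is not reproduced here.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: prints a value in hex without leaking the
// formatting change to later users of the same stream.
void
printHex(std::ostream &os, unsigned value)
{
    // Save the current format flags before modifying them.
    const std::ios_base::fmtflags saved = os.flags();
    os << std::hex << std::showbase << value;
    // Restore the flags so subsequent output is unaffected.
    os.flags(saved);
}

// Prints the same number twice: once via printHex, once with the
// stream's original (decimal) formatting.
std::string
demo()
{
    std::ostringstream os;
    printHex(os, 255);
    os << ' ' << 255;
    return os.str();
}
```

	Without the restore step, the second value would also be printed in hex, because std::hex and std::showbase are sticky.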
2014-09-19	stats: Fix flow-control bug in Vector2D printing	Andreas Hansson

2014-09-19	misc: Remove assertions ensuring unsigned values >= 0	Andreas Hansson

2014-09-19	mem: Check return value of checkFunctional in SimpleMemory	Andreas Hansson
	Simple fix to ensure we only iterate until we are done.
2014-09-19	mem: Add checks to sendTimingReq in cache	Andreas Hansson
	A small fix to ensure the return value is not ignored.
2014-09-15	ruby: network: revert some of the changes from ad9c042dce54	Nilay Vaish
	The changeset ad9c042dce54 made changes to the structures under the network directory to use a map of buffers instead of vector of buffers. The reasoning was that not all vnets that are created are used and we needlessly allocate more buffers than required and then iterate over them while processing network messages. But the move to map resulted in a slow down which was pointed out by Andreas Hansson. This patch moves things back to using vector of message buffers.
2014-09-12	cpu: Fix memory access in Minor not setting parent Request flags	Andrew Bardsley
	This patch fixes cases where uncacheable/memory type flags are not set correctly on a memory op which is split in the LSQ. Without this patch, request->request is freely used to check flags where the flags should actually come from the accumulation of request fragment flags. This patch also fixes a bug where an uncacheable access which passes through tryToSendRequest more than once can increment LSQ::numAccessesInMemorySystem more than once.
2014-09-12	style: Fix line continuation, especially in debug messages	Andrew Bardsley
	This patch closes a number of space gaps in debug messages caused by the incorrect use of line continuation within strings. (There's also one consistency change to a similar, but correct, use of line continuation)
2014-09-12	minor: Fix typo in DPRINTF for Minor branch prediction	Andreas Hansson

2014-09-09	sim: Automatically unregister probe listeners	Andreas Sandberg
	The ProbeListener base class automatically registers itself with a probe manager. Currently, the class does not unregister itself when it is destroyed, which makes removing probe listeners somewhat cumbersome. This patch adds an automatic call to manager->removeListener in the ProbeListener destructor, which solves the problem.
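	The register-in-constructor, unregister-in-destructor pattern can be sketched as below. Manager and Listener are simplified, hypothetical classes; they are not gem5's ProbeManager and ProbeListener, and only the removeListener call in the destructor mirrors what the commit message describes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class Listener;

// Hypothetical manager that keeps raw pointers to its listeners.
class Manager
{
  public:
    void addListener(Listener *l) { listeners.push_back(l); }

    void
    removeListener(Listener *l)
    {
        listeners.erase(
            std::remove(listeners.begin(), listeners.end(), l),
            listeners.end());
    }

    std::size_t numListeners() const { return listeners.size(); }

  private:
    std::vector<Listener *> listeners;
};

// Hypothetical listener: registers on construction and unregisters on
// destruction, so the manager never holds a dangling pointer.
class Listener
{
  public:
    explicit Listener(Manager *m) : manager(m)
    {
        manager->addListener(this);
    }

    ~Listener() { manager->removeListener(this); }

    // Copying would register nothing and unregister twice; forbid it.
    Listener(const Listener &) = delete;
    Listener &operator=(const Listener &) = delete;

  private:
    Manager *manager;
};

// Returns the number of registered listeners after one has gone out
// of scope.
std::size_t
listenersAfterScope()
{
    Manager manager;
    {
        Listener listener(&manager);
        assert(manager.numListeners() == 1);
    }
    return manager.numListeners();
}
```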
2014-09-09	config: Fix vectorparam command line parsing	Geoffrey Blake
	Parsing vectorparams from the command line was slightly broken: it would not accept the input format that the help message gave to the user, and it did not do the conversion on the second code path used to convert the string input to the actual internal representation. This patch fixes these bugs.
2014-09-09	cpu: Only iterate over possible threads on the o3 cpu	Mitch Hayenga
	Some places in O3 always iterated over "Impl::MaxThreads" even if a CPU had fewer threads. This removes a few of those instances.
2014-09-09	mem: Add accessor function for vaddr	Mitch Hayenga
	Determine if a request has an associated virtual address.
2014-09-09	sim: Fix resource leak in BaseGlobalEvent	Andreas Sandberg
	Static analysis revealed that BaseGlobalEvent::barrier was never deallocated. This changeset solves this leak by making the barrier allocation a part of the BaseGlobalEvent instead of storing a pointer to a separate heap-allocated barrier.
2014-09-09	misc: Fix a number of uninitialised variables and members	Andreas Hansson
	Static analysis unearthed a bunch of uninitialised variables and members, and this patch addresses the problem. In all cases these omissions seem benign in the end, but at least fixing them means fewer false positives next time round.
2014-09-03	dev: separate legacy io offsets from PCI offset	Ali Saidi
	The PC platform has a single IO range that is used for both legacy IO and PCI IO, while other platforms may use separate regions. Provide another mechanism to configure the legacy IO base address range and set it to the PCI IO address range for x86.
2014-09-03	arm: Support >2GB of memory for AArch64 systems	Ali Saidi

2014-09-03	dev, arm: Add support for linux generic pci host driver	Ali Saidi
	This change adds support for a generic pci host bus driver that has been included in recent Linux kernels instead of the more bespoke one we've been using to date. It also works with aarch64 so it provides PCI support for 64-bit ARM Linux.
	To make this work a new configuration option pci_io_base is added to the RealView platform that should be set to the start of the memory used as memory mapped IO ports (IO ports that are memory mapped, not regular memory mapped IO). A second parameter, pci_cfg_gen_offsets, specifies whether the config space offsets that the generic driver expects should be used.
	To use the pci-host-generic device you need to set:
	    pci_io_base = 0x2f000000 (Valid for VExpress EMM)
	    pci_cfg_gen_offsets = True
	and add the following to your device tree:
	    pci {
	        compatible = "pci-host-ecam-generic";
	        device_type = "pci";
	        #address-cells = <0x3>;
	        #size-cells = <0x2>;
	        #interrupt-cells = <0x1>;
	        //bus-range = <0x0 0x1>;
	        // CPU_PHYSICAL(2) SIZE(2)
	        // Note, some DTS blobs only support 1 size
	        reg = <0x0 0x30000000 0x0 0x10000000>;
	        // IO (1), no bus address (2), cpu address (2), size (2)
	        // MMIO (1), at address (2), cpu address (2), size (2)
	        ranges = <0x01000000 0x0 0x00000000 0x0 0x2f000000 0x0 0x10000>,
	                 <0x02000000 0x0 0x40000000 0x0 0x40000000 0x0 0x10000000>;
	        // With gem5 we typically use INTA/B/C/D one per device
	        interrupt-map = <0x0000 0x0 0x0 0x1 0x1 0x0 0x11 0x1
	                         0x0000 0x0 0x0 0x2 0x1 0x0 0x12 0x1
	                         0x0000 0x0 0x0 0x3 0x1 0x0 0x13 0x1
	                         0x0000 0x0 0x0 0x4 0x1 0x0 0x14 0x1>;
	        // Only match INTA/B/C/D and not BDF
	        interrupt-map-mask = <0x0000 0x0 0x0 0x7>;
	    };
2014-09-03	config: Add port splicing capability to PortRef class	Geoffrey Blake
	The new configuration scripts need the ability to splice a simobject between a pair of ports that are already connected. The primary use case is when a CommMonitor needs to be created after the system is configured and then spliced between the pair of ports it will monitor.
2014-09-03	config: Refactor RealviewEMM to fit into new config system	Geoffrey Blake
	This eliminates some default devices and adds in helper functions to connect the devices defined here to associate with the proper clock domains.
2014-09-03	base: Use STL C++11 random number generation	Andreas Hansson
	This patch changes the random number generator from the in-house Mersenne twister to an implementation relying entirely on C++11 STL. The format for the checkpointing of the twister is simplified. As the functionality was never used this should not matter. Note that this patch does not actually make use of the checkpointing functionality. As the random number generator is not thread safe, it may be sensible to create one generator per thread, system, or even object. Until this is decided the status quo is maintained in that no generator state is part of the checkpoint.
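	A wrapper of this kind can be sketched with the C++11 <random> header. SketchRandom is a hypothetical class, and the choice of std::mt19937_64 and std::uniform_int_distribution is an assumption made for illustration; gem5's actual Random class is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical wrapper around a C++11 Mersenne twister engine.
class SketchRandom
{
  public:
    explicit SketchRandom(std::uint64_t seed) : gen(seed) {}

    // Returns a uniformly distributed integer in the closed interval
    // [min, max]; both bounds are inclusive.
    int
    uniform(int min, int max)
    {
        std::uniform_int_distribution<int> dist(min, max);
        return dist(gen);
    }

  private:
    // Assumption: a 64-bit Mersenne twister from the standard library.
    // The engine is not thread safe, matching the caveat in the commit
    // message.
    std::mt19937_64 gen;
};
```

	Two instances constructed with the same seed produce the same sequence, which is what makes a run reproducible as long as the order of calls does not change.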
2014-09-03	base: Use the global Mersenne twister throughout	Andreas Hansson
	This patch tidies up random number generation to ensure that it is done consistently throughout the code base. In essence this involves a clean-up of Ruby, and some code simplifications in the traffic generator. As part of this patch a bunch of skewed distributions (off-by-one etc) have been fixed. Note that a single global random number generator is used, and that the object instantiation order will impact the behaviour (the sequence of numbers will be unaffected, but if module A calls random before module B then they would obviously see a different outcome). The dependency on the instantiation order is true in any case due to the execution-model of gem5, so we leave it as is. Also note that the global random generator is not thread safe at this point. Regressions using the memtest, TrafficGen or any Ruby tester are affected and will be updated accordingly.
2014-09-03	mem: Avoid unnecessary retries when bus peer is not ready	Andreas Hansson
	This patch removes unnecessary retries that happened when the bus layer itself was no longer busy, but the peer was not yet ready. Instead of sending a retry that will inevitably not succeed, the bus now silently waits until the peer sends a retry.
2014-09-03	arm: Make memory ops work on 64bit/128-bit quantities	Mitch Hayenga
	Multiple instructions assume only 32-bit load operations are available, this patch increases load sizes to 64-bit or 128-bit for many load pair and load multiple instructions.
2014-06-27	mem: write streaming support via WriteInvalidate promotion	Curtis Dunham
	Support full-block writes directly rather than requiring RMW:
	* a cache line is allocated in the cache upon receipt of a WriteInvalidateReq, not the WriteInvalidateResp.
	* only top-level caches allocate the line; the others just pass the request along and invalidate as necessary.
	* to close a timing window between the Req and the Resp, a new metadata bit tracks whether another cache has read a copy of the new line before the writeback to memory.
2014-09-03	mem: Fix a bug in the cache port flow control	Andreas Hansson
	This patch fixes a bug in the cache port where the retry flag was reset too early, allowing new requests to arrive before the retry was actually sent, but with the event already scheduled. This caused a deadlock in the interactions with the O3 LSQ. The patch fixes the underlying issue by shifting the resetting of the flag to be done by the event that also calls sendRetry(). The patch also tidies up the flow control in recvTimingReq and ensures that we also check if we already have a retry outstanding.
2014-05-13	cpu, mem: Make software prefetches non-blocking	Curtis Dunham
	Previously, they were treated so much like loads that they could stall at the head of the ROB. Now they are always treated like L1 hits. If they actually miss, a new request is created at the L1 and tracked from the MSHRs there if necessary (i.e. if it didn't coalesce with an existing outstanding load).
2014-05-13	mem: Refactor assignment of Packet types	Curtis Dunham
	Put the packet type swizzling (that is currently done in a lot of places) into a refineCommand() member function.
2014-09-03	x86: Flag instructions that call suspend as IsQuiesce	Mitch Hayenga
	The o3 cpu relies upon instructions that suspend a thread context being flagged as "IsQuiesce". If they are not, unpredictable behavior can occur. This patch fixes that for the x86 ISA.
2014-09-03	cpu: Fix o3 drain bug	Mitch Hayenga
	For X86, the o3 CPU would get stuck with the commit stage not being drained if an interrupt arrived while drain was pending. isDrained() makes sure that pcState.microPC() == 0, thus ensuring that we are at an instruction boundary. However, when we take an interrupt we execute:
	    pcState.upc(romMicroPC(entry));
	    pcState.nupc(romMicroPC(entry) + 1);
	    tc->pcState(pcState);
	As a result, the MicroPC is no longer zero. This patch ensures the drain is delayed until no interrupts are present. Once draining, non-synchronous interrupts are deferred until after the switch.
2014-09-03	arm: Fix v8 neon latency issue for loads/stores	Mitch Hayenga
	Neon memory ops that operate on multiple registers currently have very poor performance because of interleave/deinterleave micro-ops. This patch marks the deinterleave/interleave micro-ops as "No_OpClass" such that they take minimum cycles to execute and are never resource constrained. Additionally, the micro-ops over-read registers. Although one form may need to read up to 20 sources, not all do. This adds in new forms so false dependencies are not modeled. Instructions read their minimum number of sources.
2014-04-29	arm: use condition code registers for ARM ISA	Curtis Dunham
	Analogous to ee049bf (for x86). Requires a bump of the checkpoint version and corresponding upgrader code to move the condition code register values to the new register file.
2014-09-03	arm: ISA X31 destination register fix	Andrew Bardsley
	This patch substituted the zero register for X31 used as a destination register. This prevents false dependencies based on X31.
2014-09-03	cpu: fix bimodal predictor to use correct global history reg	Dam Sunwoo
	A small bug in the bimodal predictor caused significant degradation in performance on some benchmarks. This was caused by using the wrong globalHistoryReg during the update phase. This patch fixes the bug and brings the performance back to a normal level.
2014-09-03	arm: Mark v7 cbz instructions as direct branches	Mitch Hayenga
	v7 cbz/cbnz instructions were improperly marked as indirect branches.
2014-09-03	cpu: Fix cache blocked load behavior in o3 cpu	Mitch Hayenga
	This patch fixes the load blocked/replay mechanism in the o3 cpu. Rather than flushing the entire pipeline, this patch replays loads once the cache becomes unblocked. Additionally, deferred memory instructions (loads which had conflicting stores), when replayed would not respect the number of functional units (only respected issue width). This patch also corrects that. Improvements over 20% have been observed on a microbenchmark designed to exercise this behavior.
2014-09-03	cpu: Fix o3 quiesce fetch bug	Mitch Hayenga
	O3 is supposed to stop fetching instructions once a quiesce is encountered. However, due to a bug, it would continue fetching instructions from the current fetch buffer. This is because of a break statement that only broke out of the first of 2 nested loops. It should have broken out of both.
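	The class of bug described here can be shown with a generic sketch. processUntilStop is a hypothetical function and the nested vectors are only a stand-in for nested fetch loops; this is not the o3 fetch code.

```cpp
#include <cassert>
#include <vector>

// Counts the items processed before a stop marker is seen. Processing
// must stop completely at the marker, not just for the current buffer.
int
processUntilStop(const std::vector<std::vector<int>> &buffers,
                 int stopMarker)
{
    int processed = 0;
    bool stop = false;

    for (const auto &buffer : buffers) {
        for (int item : buffer) {
            if (item == stopMarker) {
                // A bare break here only leaves the inner loop, so a
                // flag is set to tell the outer loop to stop as well.
                stop = true;
                break;
            }
            ++processed;
        }
        // Without this check the outer loop would carry on with the
        // next buffer, which is the behaviour the fix removes.
        if (stop)
            break;
    }
    return processed;
}
```

	For the input {{1, 2, -1, 3}, {4, 5}} with -1 as the marker, the function returns 2; with only the inner break it would have returned 4.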
2014-09-03	cpu: Fix SMT scheduling issue with the O3 cpu	Mitch Hayenga
	The o3 cpu could attempt to schedule inactive threads under round-robin SMT mode. This is because it maintained an independent priority list of threads from the active thread list. This priority list could become stale once threads were inactive, leading to the cpu trying to fetch/commit from inactive threads. Additionally the fetch queue is now forcibly flushed of instructions from the de-scheduled thread.
	Relevant output:
	    24557000: system.cpu: [tid:1]: Calling deactivate thread.
	    24557000: system.cpu: [tid:1]: Removing from active threads list
	    24557500: system.cpu: FullO3CPU: Ticking main, FullO3CPU.
	    24557500: system.cpu.fetch: Running stage.
	    24557500: system.cpu.fetch: Attempting to fetch from [tid:1]
2014-09-03	cpu: Fix incorrect speculative branch predictor behavior	Mitch Hayenga
	When a branch mispredicted, gem5 would squash all history after and including the mispredicted branch. However, the mispredicted branch is still speculative and its history is required to roll back state if another, older, branch mispredicts. This leads to things like RAS corruption.
2014-09-03	cpu: Add a fetch queue to the o3 cpu	Mitch Hayenga
	This patch adds a fetch queue that sits between fetch and decode to the o3 cpu. This effectively decouples fetch from decode stalls, allowing it to be more aggressive and run further ahead in the instruction stream.
2014-09-03	cpu: Fix o3 front-end pipeline interlock behavior	Mitch Hayenga
	The o3 pipeline interlock/stall logic is incorrect. o3 unnecessarily stalled fetch and decode due to later stages in the pipeline. In general, a stage should usually only consider whether it is stalled by the adjacent, downstream stage. Forcing stalls due to later stages creates bubbles in the pipeline. Additionally, o3 stalled the entire frontend (fetch, decode, rename) on a branch mispredict while the ROB is being serially walked to update the RAT (robSquashing). Only rename should have stalled.