gem5 - gem5

Age	Commit message	Author
2014-09-20	mem: Add memory rank-to-rank delay	Wendy Elsasser
	Add the following delay to the DRAM controller: - tCS : Different rank bus turnaround delay. This will be applied for 1) read-to-read, 2) write-to-write, 3) write-to-read, and 4) read-to-write command sequences, where the new command accesses a different rank than the previous burst. The delay defaults to 2*tCK for each defined memory class. Note that this does not correspond to one particular timing constraint, but is a way of modelling all the associated constraints. The DRAM controller has some minor changes to prioritize commands to the same rank. This prioritization will only occur when the command stream is not switching from a read to write or vice versa (in the case of switching we have a gap in any case). To prioritize commands to the same rank, the model will determine if there are any commands queued (same type) to the same rank as the previous command. This check will ensure that the 'same rank' command will be able to execute without adding bubbles to the command flow, e.g. any ACT delay requirements can be done under the hood, allowing the burst to issue seamlessly.
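The effect of tCS and of the same-rank prioritization can be sketched with a toy model. This is a simplified illustration, not the gem5 controller: the function names, the fixed burst time `t_burst`, and the clock period are assumptions made for the example; only the default tCS = 2*tCK comes from the commit message.

```python
def schedule_bursts(ranks, t_ck=1.25, t_burst=5.0):
    """Return data-bus start times (ns) for back-to-back bursts.

    ranks   -- rank targeted by each burst, in issue order
    t_ck    -- clock period in ns (illustrative value)
    t_burst -- bus occupancy of one burst in ns (illustrative value)
    """
    t_cs = 2 * t_ck  # default stated in the commit message
    starts = []
    bus_free_at = 0.0
    prev_rank = None
    for rank in ranks:
        start = bus_free_at
        # Add the turnaround delay only when the rank changes.
        if prev_rank is not None and rank != prev_rank:
            start += t_cs
        starts.append(start)
        bus_free_at = start + t_burst
        prev_rank = rank
    return starts


def pick_next(queue, prev_rank):
    """Prefer the oldest queued command to the previous rank, else the oldest.

    A real controller would also check that the command is ready to
    issue; this sketch ignores that.
    """
    for i, rank in enumerate(queue):
        if rank == prev_rank:
            return queue.pop(i)
    return queue.pop(0)


same_rank = schedule_bursts([0, 0, 0])
alternating = schedule_bursts([0, 1, 0])
```

In this model, alternating ranks pay one tCS per switch, which is why picking a queued same-rank command first can avoid bubbles on the bus.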
2014-09-20	cpu: Update DRAM traffic gen	Wendy Elsasser
	Add new DRAM_ROTATE mode to traffic generator. This mode will generate DRAM traffic that rotates across banks per rank, command types, and ranks per channel. The looping order is illustrated below:
		for (ranks per channel)
			for (command types)
				for (banks per rank)
					// Generate DRAM Command Series
	This patch also adds the read percentage as an input argument to the DRAM sweep script. If the simulated read percentage is 0 or 100, the middle for loop does not generate additional commands. This loop is used only when the read percentage is set to 50, in which case the middle loop will toggle between read and write commands. Modified sweep.py script, which generates DRAM traffic. Added input arguments and support for new DRAM_ROTATE mode. The script now has input arguments for: 1) Read percentage 2) Number of ranks 3) Address mapping 4) Traffic generator mode (DRAM or DRAM_ROTATE). The default values are: 100% reads, 1 rank, RoRaBaCoCh address mapping, and DRAM traffic gen mode. For the DRAM traffic mode, added multi-rank support.
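The looping order described above can be sketched as a small generator. This is only an illustration of the nesting, not the traffic generator's code: the function name, the tuple layout, the read-before-write order, and the rejection of read percentages other than 0, 50, and 100 are assumptions made for the example.

```python
def dram_rotate_order(n_ranks, n_banks, read_percent):
    """Yield (rank, command, bank) in the DRAM_ROTATE nesting order."""
    if read_percent == 50:
        commands = ["read", "write"]  # middle loop toggles read/write
    elif read_percent == 100:
        commands = ["read"]           # middle loop runs once
    elif read_percent == 0:
        commands = ["write"]          # middle loop runs once
    else:
        # Assumption for this sketch: other mixes are not modelled here.
        raise ValueError("read_percent must be 0, 50, or 100")

    for rank in range(n_ranks):          # outer: ranks per channel
        for command in commands:         # middle: command types
            for bank in range(n_banks):  # inner: banks per rank
                yield rank, command, bank


sequence = list(dram_rotate_order(n_ranks=2, n_banks=2, read_percent=50))
```

With two ranks and two banks at 50% reads, every bank of rank 0 is read, then written, before the sequence moves on to rank 1.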
2014-09-20	dev: Add support for 9p proxying over VirtIO	Andreas Sandberg
	This patch adds support for 9p filesystem proxying over VirtIO. It can currently operate by connecting to a 9p server over a socket (VirtIO9PSocket) or by starting the diod 9p server and connecting over a pipe (VirtIO9PDiod). WARNING: Checkpoints are currently not supported for systems with 9p proxies!
2014-09-20	dev: Add a VirtIO block device model	Andreas Sandberg

2014-09-20	dev: Add a VirtIO console device model	Andreas Sandberg

2014-09-20	dev, pci: Implement basic VirtIO support	Andreas Sandberg
	This patch adds support for VirtIO over the PCI bus. It does so by providing the following new SimObjects: * VirtIODeviceBase - Abstract base class for VirtIO devices. * PciVirtIO - VirtIO PCI transport interface. A VirtIO device is hooked up to the guest system by adding a PciVirtIO device to the PCI bus and connecting it to a VirtIO device using the vio parameter. New VirtIO devices should inherit from VirtIODeviceBase and implement one or more VirtQueues. The VirtQueues are usually device-specific and all derive from the VirtQueue class. Queues must be registered with the base class from the constructor since the device assumes that the number of queues stays constant.
2014-09-20	dev: Refactor terminal<->UART interface to make it more generic	Andreas Sandberg
	The terminal currently assumes that the transport to the guest always inherits from the Uart class. This assumption breaks when implementing, for example, a VirtIO console. This patch removes this assumption by removing the pointer from the terminal to the UART and replacing it with a more general callback interface. The Uart class, or any other class using the terminal, implements an instance of the callbacks class and registers it with the terminal.
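A minimal sketch of the callback arrangement described above, written in Python for brevity. All class and method names here (`TerminalCallbacks`, `register_callbacks`, `data_available`, `inject`) are hypothetical and do not reproduce the gem5 C++ interface; the point is only that the terminal talks to an abstract callback object rather than to a Uart.

```python
class TerminalCallbacks:
    """Hypothetical interface a transport implements for the terminal."""

    def data_available(self):
        raise NotImplementedError


class Terminal:
    def __init__(self):
        self._callbacks = None  # no knowledge of Uart or any transport type
        self._rx = []

    def register_callbacks(self, callbacks):
        self._callbacks = callbacks

    def inject(self, byte):
        """Queue a byte from the user side and notify the transport."""
        self._rx.append(byte)
        if self._callbacks is not None:
            self._callbacks.data_available()

    def read(self):
        return self._rx.pop(0)


class Uart:
    """One possible transport; a VirtIO console could do the same."""

    class _Callbacks(TerminalCallbacks):
        def __init__(self, uart):
            self._uart = uart

        def data_available(self):
            # Pull the new byte from the terminal into the device.
            self._uart.received.append(self._uart.terminal.read())

    def __init__(self, terminal):
        self.terminal = terminal
        self.received = []
        terminal.register_callbacks(Uart._Callbacks(self))


term = Terminal()
uart = Uart(term)
term.inject(0x41)
```

Because the terminal only holds a `TerminalCallbacks` reference, any other device can register its own callbacks object without inheriting from `Uart`.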
2014-09-20	base: Clean up redundant string functions and use C++11	Andreas Hansson
	This patch does a bit of housekeeping on the string helper functions and relies on the C++11 standard library where possible. It also does away with our custom string hash as an implementation is already part of the standard library.
2014-09-20	base: Add getSectionNames to IniFile	Andrew Bardsley
	Add an accessor to IniFile to list all the sections in the file.
2014-09-20	cpu: Add ExecFlags debug flag	Mitch Hayenga
	Adds a debug flag to print out the flags an instruction is tagged with.
2014-09-20	mem: Remove the GHB prefetcher from the source tree	Mitch Hayenga
	There are two primary issues with this code which make it deserving of deletion. 1) GHB is a way to structure a prefetcher, not a definitive type of prefetcher 2) This prefetcher isn't even structured like a GHB prefetcher. It's basically a worse version of the stride prefetcher. It primarily serves to confuse new gem5 users and most functionality is already present in the stride prefetcher.
2014-09-20	cpu: use probes infrastructure to do simpoint profiling	Dam Sunwoo
	Instead of having code embedded in the CPU model to do simpoint profiling, use the probes infrastructure to do it.
2014-09-20	config: Cleanup .json config file generation	Andrew Bardsley
	This patch 'completes' .json config files generation by adding in the SimObject references and String-valued parameters not currently printed. TickParamValues are also changed to print in the same tick-value format as in .ini files. This allows .json files to describe a system as fully as the .ini files currently do. This patch adds a new function config_value (which mirrors ini_str) to each ParamValue and to SimObject. This function can then be explicitly changed to give different .json and .ini printing behaviour rather than being written in terms of ini_str.
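The relationship between `ini_str` and `config_value` can be sketched as follows. The two method names come from the commit message; the classes, their constructors, and the choice to emit a native JSON boolean are assumptions made for this example and are not taken from gem5's parameter code.

```python
import json


class ParamValue:
    """Hypothetical parameter wrapper for this sketch."""

    def __init__(self, value):
        self.value = value

    def ini_str(self):
        # String form used when writing .ini files.
        return str(self.value)

    def config_value(self):
        # By default the .json value mirrors the .ini string.
        return self.ini_str()


class BoolParam(ParamValue):
    def config_value(self):
        # Overridden so .json output can differ from .ini output.
        return bool(self.value)


params = {"name": ParamValue("cpu0"), "enabled": BoolParam(True)}
ini_text = "\n".join(f"{k}={v.ini_str()}" for k, v in sorted(params.items()))
json_text = json.dumps(
    {k: v.config_value() for k, v in params.items()}, sort_keys=True
)
```

Keeping the two methods separate means a parameter type can change its .json representation without touching how it is printed in .ini files.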
2014-09-19	arch: Pass faults by const reference where possible	Andreas Hansson
	This patch changes how faults are passed between methods in an attempt to copy as few reference-counting pointer instances as possible. This should avoid unnecessary copies being created, contributing to the increment/decrement of the reference counters.
2014-09-19	cpu: Use a deque in o3 rename instruction queue	Andreas Hansson
	Switch from a list to a data structure with better data layout.
2014-09-19	base: Ensure the CP annotation compiles again	Andreas Hansson
	A bit of revamping to get the CP annotate functionality to compile.
2014-09-19	misc: Use safe_cast when assumptions are made about return value	Andreas Hansson
	This patch changes two dynamic_cast to safe_cast as we assume the return value is not NULL (without checking).
2014-09-19	misc: Restore ostream flags where needed	Andreas Hansson
	This patch ensures we adhere to the normal ostream usage rules, and restore the flags after modifying them.
2014-09-19	stats: Fix flow-control bug in Vector2D printing	Andreas Hansson

2014-09-19	misc: Remove assertions ensuring unsigned values >= 0	Andreas Hansson

2014-09-19	mem: Check return value of checkFunctional in SimpleMemory	Andreas Hansson
	Simple fix to ensure we only iterate until we are done.
2014-09-19	mem: Add checks to sendTimingReq in cache	Andreas Hansson
	A small fix to ensure the return value is not ignored.
2014-09-15	ruby: network: revert some of the changes from ad9c042dce54	Nilay Vaish
	The changeset ad9c042dce54 made changes to the structures under the network directory to use a map of buffers instead of vector of buffers. The reasoning was that not all vnets that are created are used and we needlessly allocate more buffers than required and then iterate over them while processing network messages. But the move to map resulted in a slow down which was pointed out by Andreas Hansson. This patch moves things back to using vector of message buffers.
2014-09-12	cpu: Fix memory access in Minor not setting parent Request flags	Andrew Bardsley
	This patch fixes cases where uncacheable/memory type flags are not set correctly on a memory op which is split in the LSQ. Without this patch, request->request is freely used to check flags where the flags should actually come from the accumulation of request fragment flags. This patch also fixes a bug where an uncacheable access which passes through tryToSendRequest more than once can increment LSQ::numAccessesInMemorySystem more than once.
2014-09-12	style: Fix line continuation, especially in debug messages	Andrew Bardsley
	This patch closes a number of space gaps in debug messages caused by the incorrect use of line continuation within strings. (There's also one consistency change to a similar, but correct, use of line continuation)
2014-09-12	minor: Fix typo in DPRINTF for Minor branch prediction	Andreas Hansson

2014-09-09	sim: Automatically unregister probe listeners	Andreas Sandberg
	The ProbeListener base class automatically registers itself with a probe manager. Currently, the class does not unregister itself when it is destroyed, which makes removing probe listeners somewhat cumbersome. This patch adds an automatic call to manager->removeListener in the ProbeListener destructor, which solves the problem.
2014-09-09	config: Fix vectorparam command line parsing	Geoffrey Blake
	Parsing vectorparams from the command line was slightly broken in that it wouldn't accept the input that the help message provided to the user and it didn't do the conversion on the second code path used to convert the string input to the actual internal representation. This patch fixes these bugs.
2014-09-09	cpu: Only iterate over possible threads on the o3 cpu	Mitch Hayenga
	Some places in O3 always iterated over "Impl::MaxThreads" even if a CPU had fewer threads. This removes a few of those instances.
2014-09-09	mem: Add accessor function for vaddr	Mitch Hayenga
	Determine if a request has an associated virtual address.
2014-09-09	sim: Fix resource leak in BaseGlobalEvent	Andreas Sandberg
	Static analysis revealed that BaseGlobalEvent::barrier was never deallocated. This changeset solves this leak by making the barrier allocation a part of the BaseGlobalEvent instead of storing a pointer to a separate heap-allocated barrier.
2014-09-09	misc: Fix a number of uninitialised variables and members	Andreas Hansson
	Static analysis unearthed a bunch of uninitialised variables and members, and this patch addresses the problem. In all cases these omissions seem benign in the end, but at least fixing them means fewer false positives next time round.
2014-09-03	dev: separate legacy io offsets from PCI offset	Ali Saidi
	The PC platform has a single IO range that is used for both legacy IO and PCI IO, while other platforms may use separate regions. Provide another mechanism to configure the legacy IO base address range and set it to the PCI IO address range for x86.
2014-09-03	arm: Support >2GB of memory for AArch64 systems	Ali Saidi

2014-09-03	dev, arm: Add support for linux generic pci host driver	Ali Saidi
	This change adds support for a generic PCI host bus driver that has been included in recent Linux kernels instead of the more bespoke one we've been using to date. It also works with aarch64, so it provides PCI support for 64-bit ARM Linux. To make this work, a new configuration option pci_io_base is added to the RealView platform; it should be set to the start of the memory used as memory-mapped IO ports (IO ports that are memory mapped, not regular memory-mapped IO). A second parameter, pci_cfg_gen_offsets, specifies whether the config space offsets that the generic driver expects should be used. To use the pci-host-generic device you need to set:
		pci_io_base = 0x2f000000 (Valid for VExpress EMM)
		pci_cfg_gen_offsets = True
	and add the following to your device tree:
		pci {
			compatible = "pci-host-ecam-generic";
			device_type = "pci";
			#address-cells = <0x3>;
			#size-cells = <0x2>;
			#interrupt-cells = <0x1>;
			//bus-range = <0x0 0x1>;

			// CPU_PHYSICAL(2) SIZE(2)
			// Note, some DTS blobs only support 1 size
			reg = <0x0 0x30000000 0x0 0x10000000>;

			// IO (1), no bus address (2), cpu address (2), size (2)
			// MMIO (1), at address (2), cpu address (2), size (2)
			ranges = <0x01000000 0x0 0x00000000 0x0 0x2f000000 0x0 0x10000>,
			         <0x02000000 0x0 0x40000000 0x0 0x40000000 0x0 0x10000000>;

			// With gem5 we typically use INTA/B/C/D one per device
			interrupt-map = <0x0000 0x0 0x0 0x1 0x1 0x0 0x11 0x1
			                 0x0000 0x0 0x0 0x2 0x1 0x0 0x12 0x1
			                 0x0000 0x0 0x0 0x3 0x1 0x0 0x13 0x1
			                 0x0000 0x0 0x0 0x4 0x1 0x0 0x14 0x1>;

			// Only match INTA/B/C/D and not BDF
			interrupt-map-mask = <0x0000 0x0 0x0 0x7>;
		};
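To make the `ranges` property above easier to read, the following sketch decodes its cells. It assumes three child address cells (from `#address-cells = <0x3>`), two size cells (from `#size-cells = <0x2>`), and two parent (CPU) address cells, as the comments in the fragment suggest. The function name and the result layout are made up for this example; the address-space field in bits 25:24 of the first cell follows the standard PCI device tree binding.

```python
SPACE_NAMES = {0: "config", 1: "IO", 2: "MMIO32", 3: "MMIO64"}


def decode_pci_ranges(cells):
    """Split a flat list of 32-bit cells into decoded range entries."""
    entry_len = 3 + 2 + 2  # child address, parent address, size
    if len(cells) % entry_len != 0:
        raise ValueError("ranges length is not a multiple of 7 cells")
    entries = []
    for i in range(0, len(cells), entry_len):
        hi, pci_mid, pci_lo, cpu_hi, cpu_lo, size_hi, size_lo = \
            cells[i:i + entry_len]
        entries.append({
            "space": SPACE_NAMES[(hi >> 24) & 0x3],
            "pci_addr": (pci_mid << 32) | pci_lo,
            "cpu_addr": (cpu_hi << 32) | cpu_lo,
            "size": (size_hi << 32) | size_lo,
        })
    return entries


ranges = [
    0x01000000, 0x0, 0x00000000, 0x0, 0x2f000000, 0x0, 0x10000,
    0x02000000, 0x0, 0x40000000, 0x0, 0x40000000, 0x0, 0x10000000,
]
decoded = decode_pci_ranges(ranges)
```

The first entry is the IO window, whose CPU address matches the `pci_io_base` value given in the commit message; the second is a 256 MiB memory window mapped one-to-one at 0x40000000.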
2014-09-03	config: Add port splicing capability to PortRef class	Geoffrey Blake
	The new configuration scripts need the ability to splice a simobject between a pair of ports that are already connected. The primary use case is when a CommMonitor needs to be created after the system is configured and then spliced between the pair of ports it will monitor.
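The splicing idea can be sketched with a toy port class. This is not the PortRef implementation: the `Port` class, its `connect` and `splice` methods, and the port names are hypothetical and only illustrate inserting an object with two ports between an already connected pair.

```python
class Port:
    """Toy bidirectional port for this sketch."""

    def __init__(self, name):
        self.name = name
        self.peer = None

    def connect(self, other):
        self.peer = other
        other.peer = self

    def splice(self, near_end, far_end):
        """Insert an object between this port and its current peer.

        near_end -- the new object's port that should face this port
        far_end  -- the new object's port that should face the old peer
        """
        old_peer = self.peer
        if old_peer is None:
            raise RuntimeError(f"{self.name} is not connected")
        self.connect(near_end)
        far_end.connect(old_peer)


# An existing connection, made when the system was first configured.
cpu_port = Port("cpu.dcache_port")
bus_port = Port("membus.slave")
cpu_port.connect(bus_port)

# Later, a monitor is created and spliced into that connection.
mon_in = Port("monitor.slave")
mon_out = Port("monitor.master")
cpu_port.splice(mon_in, mon_out)
```

After the splice, traffic from `cpu_port` reaches `bus_port` only through the monitor's two ports, without the original configuration code having to know about the monitor.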
2014-09-03	config: Refactor RealviewEMM to fit into new config system	Geoffrey Blake
	This eliminates some default devices and adds in helper functions to connect the devices defined here to associate with the proper clock domains.
2014-09-03	base: Use STL C++11 random number generation	Andreas Hansson
	This patch changes the random number generator from the in-house Mersenne twister to an implementation relying entirely on C++11 STL. The format for the checkpointing of the twister is simplified. As the functionality was never used this should not matter. Note that this patch does not actually make use of the checkpointing functionality. As the random number generator is not thread safe, it may be sensible to create one generator per thread, system, or even object. Until this is decided the status quo is maintained in that no generator state is part of the checkpoint.
2014-09-03	base: Use the global Mersenne twister throughout	Andreas Hansson
	This patch tidies up random number generation to ensure that it is done consistently throughout the code base. In essence this involves a clean-up of Ruby, and some code simplifications in the traffic generator. As part of this patch a bunch of skewed distributions (off-by-one etc) have been fixed. Note that a single global random number generator is used, and that the object instantiation order will impact the behaviour (the sequence of numbers will be unaffected, but if module A calls random before module B then they would obviously see a different outcome). The dependency on the instantiation order is true in any case due to the execution-model of gem5, so we leave it as is. Also note that the global random generator is not thread safe at this point. Regressions using the memtest, TrafficGen or any Ruby tester are affected and will be updated accordingly.
2014-09-03	mem: Avoid unnecessary retries when bus peer is not ready	Andreas Hansson
	This patch removes unnecessary retries that happened when the bus layer itself was no longer busy, but the peer was not yet ready. Instead of sending a retry that will inevitably not succeed, the bus now silently waits until the peer sends a retry.
2014-09-03	arm: Make memory ops work on 64bit/128-bit quantities	Mitch Hayenga
	Multiple instructions assume only 32-bit load operations are available; this patch increases load sizes to 64-bit or 128-bit for many load pair and load multiple instructions.
2014-06-27	mem: write streaming support via WriteInvalidate promotion	Curtis Dunham
	Support full-block writes directly rather than requiring RMW: * a cache line is allocated in the cache upon receipt of a WriteInvalidateReq, not the WriteInvalidateResp. * only top-level caches allocate the line; the others just pass the request along and invalidate as necessary. * to close a timing window between the Req and the Resp, a new metadata bit tracks whether another cache has read a copy of the new line before the writeback to memory.
2014-09-03	mem: Fix a bug in the cache port flow control	Andreas Hansson
	This patch fixes a bug in the cache port where the retry flag was reset too early, allowing new requests to arrive before the retry was actually sent, but with the event already scheduled. This caused a deadlock in the interactions with the O3 LSQ. The patch fixes the underlying issue by shifting the resetting of the flag to be done by the event that also calls sendRetry(). The patch also tidies up the flow control in recvTimingReq and ensures that we also check if we already have a retry outstanding.
2014-05-13	cpu, mem: Make software prefetches non-blocking	Curtis Dunham
	Previously, they were treated so much like loads that they could stall at the head of the ROB. Now they are always treated like L1 hits. If they actually miss, a new request is created at the L1 and tracked from the MSHRs there if necessary (i.e. if it didn't coalesce with an existing outstanding load).
2014-05-13	mem: Refactor assignment of Packet types	Curtis Dunham
	Put the packet type swizzling (that is currently done in a lot of places) into a refineCommand() member function.
2014-09-03	x86: Flag instructions that call suspend as IsQuiesce	Mitch Hayenga
	The o3 cpu relies upon instructions that suspend a thread context being flagged as "IsQuiesce". If they are not, unpredictable behavior can occur. This patch fixes that for the x86 ISA.
2014-09-03	cpu: Fix o3 drain bug	Mitch Hayenga
	For X86, the o3 CPU would get stuck with the commit stage not being drained if an interrupt arrived while drain was pending. isDrained() makes sure that pcState.microPC() == 0, thus ensuring that we are at an instruction boundary. However, when we take an interrupt we execute: pcState.upc(romMicroPC(entry)); pcState.nupc(romMicroPC(entry) + 1); tc->pcState(pcState); As a result, the MicroPC is no longer zero. This patch ensures the drain is delayed until no interrupts are present. Once draining, non-synchronous interrupts are deferred until after the switch.
2014-09-03	arm: Fix v8 neon latency issue for loads/stores	Mitch Hayenga
	Neon memory ops that operate on multiple registers currently have very poor performance because of interleave/deinterleave micro-ops. This patch marks the deinterleave/interleave micro-ops as "No_OpClass" such that they take minimum cycles to execute and are never resource constrained. Additionally, the micro-ops over-read registers. Although one form may need to read up to 20 sources, not all do. This adds in new forms so false dependencies are not modeled. Instructions read their minimum number of sources.
2014-04-29	arm: use condition code registers for ARM ISA	Curtis Dunham
	Analogous to ee049bf (for x86). Requires a bump of the checkpoint version and corresponding upgrader code to move the condition code register values to the new register file.
2014-09-03	arm: ISA X31 destination register fix	Andrew Bardsley
	This patch substitutes the zero register for X31 used as a destination register. This prevents false dependencies based on X31.