gem5 - gem5

Age	Commit message	Author
2016-07-01	mem: different HMC configuration	Abdul Mutaal Ahmad
	This new HMC configuration uses existing gem5 components, mainly [SerialLink], [NoncoherentXbar], and [DRAMCtrl], to define 3 different HMC architectures. Highlights: 1- It explores 3 different HMC architectures. 2- It creates 4 HMC crossbars and attaches 16 vault controllers to them, connecting the vaults to the serial links. 3- The HMCController with round-robin functionality from the previous version is removed, and all serial links are directly accessible from user ports. 4- The latency incorporated by HMCController (in the previous version) is added to SerialLink. Committed by Jason Lowe-Power <jason@lowepower.com>
2016-06-20	config: Fix omission of walker cache in config scripts	Andreas Hansson
	This patch ensures a walker cache is instantiated if specified. Change-Id: I2c6b4bf3454d56bb19558c73b406e1875acbd986 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com>
2016-06-09	gpu-compute: parametrize Wavefront size	jkalamat
	Eliminate the VSZ constant that defined the Wavefront size (in numbers of work items); replaced it with a parameter in the GPU.py configuration script. Changed all data structures dependent on the Wavefront size to be dynamically sized. Legal values of Wavefront size are 16, 32, 64 for now and checked at initialization time.
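	A minimal sketch of the check described above, written as plain Python rather than as the actual GPU.py code. The names LEGAL_WF_SIZES, check_wf_size, and make_lane_mask are hypothetical and only illustrate validating the parameter at initialization and sizing a per-lane structure dynamically.

```python
# Hypothetical names; this is not the gem5 GPU.py implementation.
LEGAL_WF_SIZES = (16, 32, 64)


def check_wf_size(wf_size):
    """Reject wavefront sizes outside the legal set at initialization time."""
    if wf_size not in LEGAL_WF_SIZES:
        raise ValueError(
            "wavefront size %r is not one of %r" % (wf_size, LEGAL_WF_SIZES)
        )
    return wf_size


def make_lane_mask(wf_size):
    """Build a per-work-item structure sized from the parameter,
    instead of from a compile-time constant."""
    return [True] * check_wf_size(wf_size)


if __name__ == "__main__":
    print(len(make_lane_mask(32)))
```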
2016-05-27	mem, config: Selective use of snoop filter	Stephan Diestelhorst
	Disable the default snoop filter in the SystemXBar so that the typical membus does not have a snoop filter by default. Instead, add the snoop filter only when there are caches added to the system (with the caches / l2cache options). The underlying problem is that the snoop filter grows without bounds (for now) if there are no caches to tell it that lines have been evicted. This causes slow regression runs for all the atomic regressions. This patch fixes this behaviour. --HG-- extra : source : f97c20511828209757440839ed48d741d02d428f
2016-05-19	config, x86: Properly space pad the X86IntelMPBus Entry descriptions	Bjoern A. Zeeb
	According to the Intel Multi Processor Specification rev 1.4 (-006) (*), section 4.3.2 Bus Entries, Bus type strings are >>6-character ASCII (blank-filled) strings<<. This patch properly pads the entries with the missing spaces at the end. (*) http://www.intel.com/design/pentium/datashts/24201606.pdf Committed by Jason Lowe-Power <power.jg@gmail.com>
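	The padding rule quoted above can be sketched in a few lines of Python. The helper name pad_bus_type is hypothetical and is not the function used in the gem5 config scripts; it only shows blank-filling a bus type string to the 6-character field.

```python
def pad_bus_type(name, width=6):
    """Blank-fill an ASCII bus type string to the fixed field width."""
    encoded = name.encode("ascii")
    if len(encoded) > width:
        raise ValueError("bus type %r exceeds %d characters" % (name, width))
    # bytes.ljust pads on the right with the given fill byte.
    return encoded.ljust(width, b" ")


if __name__ == "__main__":
    for bus in ("PCI", "ISA"):
        print(repr(pad_bus_type(bus)))
```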
2016-04-21	config: Add missing point of coherency to memcheck script	Andreas Hansson
	Bring in line with changes to the XBar class.
2016-04-14	dist: config file for distributed switch	Mohammad Alian
	Distributed gem5 is the result of the convergence effort between multi-gem5 and pd-gem5. It relies on the base multi-gem5 infrastructure for packet forwarding, synchronisation and checkpointing but combines those with the elaborated network switch model from pd-gem5.
2016-03-08	configs: Add a lat_mem_rd style test script	Andreas Hansson
	This patch adds a config script that broadly replicates the behaviour of lat_mem_rd. The test is based on traffic generators, and as such we simply randomise addresses in increasingly large ranges, and play them back using the trace functionality of the traffic generator. The test script is accompanied by a post-processing and visualisation script. At the moment no configurability is added to tweak the memory hierarchy, but a follow on patch could easily extend the functionality.
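	The address-generation idea described above (random addresses drawn from increasingly large ranges) can be sketched as follows. This is an illustration in plain Python, not the config script itself: the function names, the 64-byte stride, and the doubling of the range at each step are assumptions, and the sketch does not produce the traffic generator's trace format.

```python
import random


def random_accesses(range_bytes, count, stride=64, rng=None):
    """Return `count` stride-aligned addresses drawn uniformly from [0, range_bytes)."""
    if rng is None:
        rng = random.Random(0)
    blocks = range_bytes // stride
    return [rng.randrange(blocks) * stride for _ in range(count)]


def sweep(min_range=1024, max_range=1024 * 1024, count=8, stride=64):
    """Map each range size (doubling from min_range to max_range) to its address list."""
    rng = random.Random(0)  # fixed seed so the sweep is repeatable
    traces = {}
    size = min_range
    while size <= max_range:
        traces[size] = random_accesses(size, count, stride, rng)
        size *= 2
    return traces


if __name__ == "__main__":
    for size, addrs in sweep(1024, 8192, 4).items():
        print(size, addrs)
```

	Plotting the average access latency against the range size is what exposes the steps at each cache capacity, which is the role of the post-processing script mentioned in the commit.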
2016-02-10	mem: Move the point of coherency to the coherent crossbar	Andreas Hansson
	This patch introduces the ability of making the coherent crossbar the point of coherency. If so, the crossbar does not forward packets where a cache with ownership has already committed to responding, and also does not forward any coherency-related packets that are not intended for a downstream memory controller. Thus, invalidations and upgrades are turned around in the crossbar, and the memory controller only sees normal reads and writes. In addition this patch moves the express snoop promotion of a packet to the crossbar, thus allowing the downstream cache to check the express snoop flag (as it should) for bypassing any blocking, rather than relying on whether a cache is responding or not.
2016-02-10	mem: Deduce if cache should forward snoops	Andreas Hansson
	This patch changes how the cache determines if snoops should be forwarded from the memory side to the CPU side. Instead of having a parameter, the cache now looks at the port connected on the CPU side, and if it is a snooping port, then snoops are forwarded. This is less error-prone and leaves fewer parameters to worry about. The patch also tidies up the CPU classes to ensure that their I-side port is not snooping by removing overrides to the snoop request handler, such that snoop requests will panic via the default MasterPort implement
2016-02-06	style: remove trailing whitespace	Steve Reinhardt
	Result of running 'hg m5style --skip-all --fix-white -a'.
2016-01-22	ruby: changed all references to numCPs to num-cp	Brad Beckmann

2016-01-19	gpu-compute: AMD's baseline GPU model	Tony Gutierrez

2016-01-15	dev, arm: Add a platform with support for both aarch32 and aarch64	Andreas Sandberg
	Add a platform with support for both aarch32 and aarch64. This platform implements a subset of the devices in a real Versatile Express and extends it with some gem5-specific functionality. It is in many ways similar to the old VExpress_EMM64 platform, but supports the following new features: * Automatic PCI interrupt assignment * PCI interrupts allocated in a contiguous range. * Automatic boot loader selection (32-bit / 64-bit) * Cleaner memory map where gem5-specific devices live in CS5 which isn't used by current Versatile Express platforms. * No fake devices. Devices that were previously faked will be removed from the device tree instead. * Support for 510 GiB contiguous memory
2016-01-11	configs: Fix inheritance of HMCSystem and cleanup spacing	Andreas Hansson
	Minor fix to ensure the HMCSystem can actually be instantiated (SimObject cannot be created). Also address some spacing issues.
2016-01-07	config: Updates for distributed gem5 simulations	Gabor Dozsa

2015-12-17	configs: Make the default memtest behaviour more complex	Andreas Hansson
	Add functional and uncacheable accesses by default.
2015-07-20	ruby: more flexible ruby tester support	Brad Beckmann
	This patch allows the ruby random tester to use ruby ports that may only support instr or data requests. This patch is similar to a previous changeset (8932:1b2c17565ac8) that was unfortunately broken by subsequent changesets. This current patch implements the support in a more straight-forward way. Since retries are now tested when running the ruby random tester, this patch splits up the retry and drain check behavior so that RubyPort children, such as the GPUCoalescer, can perform those operations correctly without having to duplicate code. Finally, the patch also includes better DPRINTFs for debugging the tester.
2015-12-07	config: Enable elastic trace capture and replay in se/fs	Radhika Jagtap
	This patch adds changes to the configuration scripts to support elastic tracing and replay. The patch adds a command line option to enable elastic tracing in SE mode and FS mode. When enabled the Elastic Trace cpu probe is attached to O3CPU and a few O3 CPU parameters are tuned. The Elastic Trace probe writes out both instruction fetch and data dependency traces. The patch also enables configuring the TraceCPU to replay traces using the SE and FS script. The replay run is designed to resume from checkpoint using atomic cpu to restore state keeping it consistent with FS run flow. It then switches to TraceCPU to replay the input traces.
2015-12-05	dev: Rewrite PCI host functionality	Andreas Sandberg
	gem5's current PCI host functionality is very ad hoc. The current implementations require PCI devices to be hooked up to the configuration space via a separate configuration port. Devices query the platform to get their config-space address range. Un-mapped parts of the config space are intercepted using the XBar's default port mechanism and a magic catch-all device (PciConfigAll).
	This changeset redesigns the PCI host functionality to improve code reuse and make config-space and interrupt mapping more transparent. Existing platform code has been updated to use the new PCI host and configured to stay backwards compatible (i.e., no guest-side visible changes). The current implementation does not expose any new functionality, but it can easily be extended with features such as automatic interrupt mapping.
	PCI devices now register themselves with a PCI host controller. The host controller interface is defined in the abstract base class PciHost. Registration is done by PciHost::registerDevice(), which takes the device, its bus position (bus/dev/func tuple), and its interrupt pin (INTA-INTC) as parameters. The registration interface returns a PciHost::DeviceInterface that the PCI device can use to query memory mappings and signal interrupts. The host device manages the entire PCI configuration space. Accesses to devices are decoded into the device's bus position and then forwarded to the correct device.
	Basic PCI host functionality is implemented in the GenericPciHost base class. Most platforms can use this class as a basic PCI controller. It provides the following functionality:
	* Configurable configuration space decoding. The number of bits dedicated to a device is a parameter, making it possible to support CAM, ECAM, and legacy mappings.
	* Basic interrupt mapping using the interruptLine value from a device's configuration space. This behavior is the same as in the old implementation. More advanced controllers can override the interrupt mapping method to dynamically assign host interrupts to PCI devices.
	* Simple (base + addr) remapping from the PCI bus's address space to physical addresses for PIO, memory, and DMA.
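	The configurable decoding in the first bullet can be illustrated with a short Python sketch. It assumes the conventional PCI layout (3 function bits, 5 device bits, 8 bus bits above the per-function register offset), with 8 offset bits for CAM-style mappings and 12 for ECAM. The names BusAddr, decode_conf_addr, and device_bits are hypothetical and do not come from the changeset.

```python
from collections import namedtuple

# Hypothetical container for a decoded config-space access.
BusAddr = namedtuple("BusAddr", ["bus", "dev", "func", "offset"])


def decode_conf_addr(addr, device_bits):
    """Split a config-space offset into bus/dev/func and register offset.

    device_bits is the number of low address bits reserved for each
    function's registers: 8 for CAM-style mappings, 12 for ECAM.
    """
    offset = addr & ((1 << device_bits) - 1)
    func = (addr >> device_bits) & 0x7          # 3 function bits
    dev = (addr >> (device_bits + 3)) & 0x1F    # 5 device bits
    bus = (addr >> (device_bits + 8)) & 0xFF    # 8 bus bits
    return BusAddr(bus, dev, func, offset)


if __name__ == "__main__":
    ecam_addr = (1 << 20) | (3 << 15) | (1 << 12) | 0x10
    print(decode_conf_addr(ecam_addr, device_bits=12))
    cam_addr = (2 << 16) | (5 << 11) | 0x04
    print(decode_conf_addr(cam_addr, device_bits=8))
```

	Keeping the field widths fixed and making only the offset width a parameter is what lets one decoder serve both mapping styles.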
2015-12-04	arm, config: Automatically discover available platforms	Andreas Sandberg
	Add support for automatically discovering available platforms. The Python side uses functionality similar to what we use when auto-detecting available CPU models. The machine IDs have been updated to match the platform configurations. If there isn't a matching machine ID, the configuration scripts default to -1, which Linux uses for device-tree-only platforms.
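	A rough sketch of the discovery-plus-default pattern described above, in plain Python. Everything here is hypothetical: the Platform base class, the two board classes, the ID value, and the helper names are invented for illustration and do not reflect the actual gem5 platform list or its machine IDs.

```python
import inspect


class Platform:
    """Hypothetical base class standing in for a platform SimObject."""


class BoardA(Platform):
    pass


class BoardB(Platform):
    pass


# Hypothetical ID table; BoardB deliberately has no entry.
MACHINE_IDS = {"BoardA": 1234}

# Stand-in for the contents of a module that is scanned for platforms.
NAMESPACE = {"Platform": Platform, "BoardA": BoardA, "BoardB": BoardB, "unrelated": 42}


def discover_platforms(namespace):
    """Collect every concrete Platform subclass found in a namespace."""
    return {
        name: cls
        for name, cls in namespace.items()
        if inspect.isclass(cls) and issubclass(cls, Platform) and cls is not Platform
    }


def machine_id_for(name):
    """Fall back to -1 (device-tree-only) when no machine ID is known."""
    return MACHINE_IDS.get(name, -1)


if __name__ == "__main__":
    for name in sorted(discover_platforms(NAMESPACE)):
        print(name, machine_id_for(name))
```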
2015-11-22	config: Added missing types to JSON/INI Python reader	Andrew Bardsley
	Added the missing types EthernetAddr and Current to the JSON/INI file reader example configs/example/read_config.py. Also added __str__ to EthernetAddr to make values appear in the same form in JSON and INI files.
2015-11-22	config: Minor fixes to the DRAM utilisation sweep	Andreas Hansson

2015-11-06	config: Update memtest to stress test clean writebacks	Andreas Hansson
	This patch adds yet another twist to the memtest cache hierarchy, in that the writeback_clean option is toggled at every level to match the clusivity of the downstream cache.
2015-11-06	mem: Add an option to perform clean writebacks from caches	Andreas Hansson
	This patch adds the necessary commands and cache functionality to allow clean writebacks. This functionality is crucial, especially when having exclusive (victim) caches. For example, if read-only L1 instruction caches are not sending clean writebacks, there will never be any spills from the L1 to the L2. At the moment the cache model defaults to not sending clean writebacks, and this should possibly be re-evaluated. The implementation of clean writebacks relies on a new packet command WritebackClean, which acts much like a Writeback (renamed WritebackDirty), and also much like a CleanEvict. On eviction of a clean block the cache either sends a clean evict, or a clean writeback, and if any copies are still cached upstream the clean evict/writeback is dropped. Similarly, if a clean evict/writeback reaches a cache where there are outstanding MSHRs for the block, the packet is dropped. In the typical case though, the clean writeback allocates a block in the downstream cache, and marks it writable if the evicted block was writable. The patch changes the O3_ARM_v7a L1 cache configuration and the default L1 caches in config/common/Caches.py
2015-11-06	config: Update memtest to stress test cache clusivity	Andreas Hansson
	This patch adds a new twist to the memtest cache hierarchy, in that it switches from mostly inclusive to mostly exclusive at every level in the tree. This has helped weed out plenty of issues, and serves as a good stress test.
2015-11-06	mem: Add cache clusivity	Andreas Hansson
	This patch adds a parameter to control the cache clusivity, that is if the cache is mostly inclusive or exclusive. At the moment there is no intention to support strict policies, and thus the options are: 1) mostly inclusive, or 2) mostly exclusive. The choice of policy guides the behaviour on a cache fill, and a new helper function, allocOnFill, is created to encapsulate the decision making process. For the timing mode, the decision is annotated on the MSHR on sending out the downstream packet, and in atomic we directly pass the decision to handleFill. We (ab)use the tempBlock in cases where we are not allocating on fill, leaving the rest of the cache unaffected. Simple and effective. This patch also makes it more explicit that multiple caches are allowed to consider a block writable (this is the case also before this patch). That is, for a mostly inclusive cache, multiple caches upstream may also consider the block exclusive. The caches considering the block writable/exclusive all appear along the same path to memory, and from a coherency protocol point of view it works due to the fact that we always snoop upwards in zero time before querying any downstream cache. Note that this patch does not introduce clean writebacks. Thus, for clean lines we are essentially removing a cache level if it is made mostly exclusive. For example, lines from the read-only L1 instruction cache or table-walker cache are always clean, and simply get dropped rather than being passed to the L2. If the L2 is mostly exclusive and does not allocate on fill it will thus never hold the line. A follow on patch adds the clean writebacks. The patch changes the L2 of the O3_ARM_v7a CPU configuration to be mostly exclusive (and stats are affected accordingly).
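	The fill decision described above can be pictured with a small Python sketch. This is a simplification and not the C++ allocOnFill helper: the commit message does not spell out the exact condition, so the rule used here (a mostly exclusive cache skips allocation only when the fill serves a request from an upstream cache) is an assumption, and the names Clusivity, alloc_on_fill, and from_upstream_cache are hypothetical.

```python
from enum import Enum


class Clusivity(Enum):
    MOSTLY_INCL = "mostly_incl"
    MOSTLY_EXCL = "mostly_excl"


def alloc_on_fill(clusivity, from_upstream_cache):
    """Decide whether a fill should allocate a block in this cache.

    Assumed rule for illustration only: a mostly inclusive cache always
    allocates; a mostly exclusive cache allocates only when the request
    did not come from an upstream cache that will hold the line itself.
    """
    if clusivity is Clusivity.MOSTLY_INCL:
        return True
    return not from_upstream_cache


if __name__ == "__main__":
    for policy in Clusivity:
        for upstream in (False, True):
            print(policy.name, upstream, alloc_on_fill(policy, upstream))
```

	Under this assumed rule, a mostly exclusive L2 below a read-only L1 never allocates the L1's lines on fill, which is why the commit notes that such lines are never held in the L2 until clean writebacks are added.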
2015-11-04	configs: fix bug introduced due to 276ad9121192	Nilay Vaish
	I had made a typo in changeset 276ad9121192. This changeset fixes it.
2015-11-03	mem: hmc: top level design	Erfan Azarkhish
	This patch enables modeling a complete Hybrid Memory Cube (HMC) device. It highly reuses the existing components in gem5's general memory system with some small modifications. This changeset requires additional patches to model a complete HMC device. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-11-03	sparc: add missing parameter to makeSparcSystem()	Palle Lyckegaard
	makeSparcSystem() in configs/common/FSConfig.py is missing the cmdLine parameter. Without the parameter the simulation fails to start. With the parameter the simulation starts properly.
2015-10-14	ruby: profiler: provide the number of vnets through ruby system	Nilay Vaish
	The aim is to ultimately do away with the static function Network::getNumberOfVirtualNetworks().
2015-10-01	config: Fix 'learning gem5' configs after SMT push	Andreas Hansson
	This patch updates the 'learning gem5' example scripts to match the recent push of the SMT patches.
2015-09-30	isa,cpu: Add support for FS SMT Interrupts	Mitch Hayenga
	Adds per-thread interrupt controllers and thread/context logic so that interrupts properly get routed in SMT systems.
2015-09-30	config,cpu: Add SMT support to Atomic and Timing CPUs	Mitch Hayenga
	Adds SMT support to the "simple" CPU models so that they can be used with other SMT-supported CPUs. Example usage: this enables the TimingSimpleCPU to be used to warmup caches before swapping to detailed mode with the in-order or out-of-order based CPU models.
2015-09-25	util: Fix minor issues in DRAM sweep scripts	Andreas Hansson
	This patch fixes a few issues in the sweep scripts, bringing them up-to-date with the latest memory configs and options.
2015-09-16	config: Add configs scripts used in Learning gem5	Jason Lowe-Power
	Added a new directory in configs (learning_gem5) to hold the scripts that are used in the book. See http://lowepower.com/jason/learning_gem5/ for a working copy. For now, only the scripts in Part 1: Getting started with gem5 have been added. A separate patch adds tests for these scripts. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-09-06	config: allow ruby to be used with Minor CPU	Nilay Vaish

2015-09-01	ruby: remove random seed	Nilay Vaish
	We no longer use the C library based random number generator: random(). Instead we use the C++ library provided rng. So setting the random seed for the RubySystem class has no effect. Hence the variable and the corresponding option are being dropped.
2015-08-30	ruby: specify number of vnets for each protocol	Nilay Vaish
	The default value for number of virtual networks is being removed. Each protocol should now specify the value it needs.
2015-08-21	mem: Add explicit Cache subclass and make BaseCache abstract	Andreas Hansson
	Open up for other subclasses to BaseCache and transition to using the explicit Cache subclass. --HG-- rename : src/mem/cache/BaseCache.py => src/mem/cache/Cache.py
2015-08-21	ruby: Move Ruby's cache class from Cache.py to RubyCache.py	Andreas Hansson
	This patch serves to avoid name clashes with the classic cache. For some reason having two 'SimObject' files with the same name creates problems. --HG-- rename : src/mem/ruby/structures/Cache.py => src/mem/ruby/structures/RubyCache.py
2015-08-19	ruby: reverts to changeset: bf82f1f7b040	Nilay Vaish

2015-08-14	ruby: profiler: provide the number of vnets through ruby system	Nilay Vaish
	The aim is to ultimately do away with the static function Network::getNumberOfVirtualNetworks().
2015-08-14	ruby: remove random seed	Nilay Vaish
	We no longer use the C library based random number generator: random(). Instead we use the C++ library provided rng. So setting the random seed for the RubySystem class has no effect. Hence the variable and the corresponding option are being dropped.
2015-08-14	ruby: Protocol changes for SimObject MessageBuffers	Joel Hestness

2015-08-14	ruby: Expose MessageBuffers as SimObjects	Joel Hestness
	Expose MessageBuffers from SLICC controllers as SimObjects that can be manipulated in Python. This patch has numerous benefits: 1) First and foremost, it exposes MessageBuffers as SimObjects that can be manipulated in Python code. This allows parameters to be set and checked in Python code to avoid obfuscating parameters within protocol files. Further, now as SimObjects, MessageBuffer parameters are printed to config output files as a way to track parameters across simulations (e.g. buffer sizes) 2) Cleans up special-case code for responseFromMemory buffers, and aligns their instantiation and use with mandatoryQueue buffers. These two special buffers are the only MessageBuffers that are exposed to components outside of SLICC controllers, and they're both slave ends of these buffers. They should be exposed outside of SLICC in the same way, and this patch does it. 3) Distinguishes buffer-specific parameters from buffer-to-network parameters. Specifically, buffer size, randomization, ordering, recycle latency, and ports are all specific to a MessageBuffer, while the virtual network ID and type are intrinsics of how the buffer is connected to network ports. The former are specified in the Python object, while the latter are specified in the controller *.sm files. Unlike buffer-specific parameters, which may need to change depending on the simulated system structure, buffer-to-network parameters can be specified statically for most or all different simulated systems.
2015-08-14	ruby: Remove the RubyCache/CacheMemory latency	Joel Hestness
	The RubyCache (CacheMemory) latency parameter is only used for top-level caches instantiated for Ruby coherence protocols. However, the top-level cache hit latency is assessed by the Sequencer as accesses flow through to the cache hierarchy. Further, protocol state machines should be enforcing these cache hit latencies, but RubyCaches do not expose their latency to any existing state machines through the SLICC/C++ interface. Thus, the RubyCache latency parameter is superfluous for all caches. This is confusing for users. As a step toward pushing L0/L1 cache hit latency into the top-level cache controllers, move their latencies out of the RubyCache declarations and over to their Sequencers. Eventually, these Sequencer parameters should be exposed as parameters to the top-level cache controllers, which should assess the latency. NOTE: Assessing these latencies in the cache controllers will require modifying each to eliminate instantaneous Ruby hit callbacks in transitions that finish accesses, which is likely a large undertaking.
2015-08-03	misc: Coupling gem5 with SystemC TLM2.0	Matthias Jung
	Transaction Level Modeling (TLM2.0) is widely used in industry for creating virtual platforms (IEEE 1666 SystemC). This patch contains a standard-compliant implementation of an external gem5 port that enables the usage of gem5 as a TLM initiator component in SystemC-based virtual platforms. Both TLM coding paradigms, loosely timed (b_transport) and approximately timed (nb_transport), are supported. Compared to the original patch, a TLM memory manager was added. Furthermore, the transaction object was removed, and for each TLM payload a PacketPointer that points to the original gem5 packet is added as a TLM extension. For event handling, single events are now created. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-08-03	ruby: correctly number the sequencer in MESI_Three_Level.py	Nilay Vaish

2015-07-20	config: add base class for ruby controllers	David Hashe
	The CntrlBase python class handles configuration parameters such as running counts of controllers and sequencers.