gem5 - gem5

Age	Commit message (Collapse)	Author
2016-08-10	arm, config: Exit with fatal error if using Ruby	Andreas Sandberg
	Ruby on ARM is currently very experimental. Fail with a fatal error that explains this to make sure users are aware of the limitations (it doesn't actually work yet!). Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-08-10	arm, config: Add initial support for Ruby	Andreas Sandberg
	Add initial support for creating an ARM system with a Ruby-based memory system. This support is currently experimental and limited to the new VExpress_GEM5_V1 platform. Change-Id: I36baeb68b0d891e34ea46aafe17b5e55217b4bfa Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brad Beckmann <brad.beckmann@amd.com>
2016-07-19	config: Allow SPARC FS image to be specified on the command line	Jakub Jermar
	At the moment the SPARC FS machine configuration comes with a hardcoded value for using the Solaris 10 disk image from the OpenSPARC tarball. The --disk-image option is completely ignored for SPARC. This simple patch modifies the behavior so that --disk-image option is both taken into account and also required. This makes it possible to easily change SPARC FS images without having to modify the configuration files.
2016-07-01	mem: different HMC configuration	Abdul Mutaal Ahmad
	In this new hmc configuration we have used the existing components in gem5 mainly [SerialLink] [NoncoherentXbar]& [DRAMCtrl] to define 3 different architecture for HMC. Highlights 1- It explores 3 different HMC architectures 2- It creates 4-HMC crossbars and attaches 16 vault controllers with it. This will connect vaults to serial links 3- From the previous version, HMCController with round robin funtionality is being removed and all the serial links are being accessible directly from user ports 4- Latency incorporated by HMCController (in previous version) is being added to SerialLink Committed by Jason Lowe-Power <jason@lowepower.com>
2016-06-20	config: Fix omission of walker cache in config scripts	Andreas Hansson
	This patch ensures a walker cache is instantiated if specfied. Change-Id: I2c6b4bf3454d56bb19558c73b406e1875acbd986 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com>
2016-05-27	mem, config: Selective use of snoop filter	Stephan Diestelhorst
	Disable the default snoop filter in the SystemXBar so that the typical membus does not have a snoop filter by default. Instead, add the snoop filter only when there are caches added to the system (with the caches / l2cache options). The underlying problem is that the snoop filter grows without bounds (for now) if there are no caches to tell it that lines have been evicted. This causes slow regression runs for all the atomic regressions. This patch fixes this behaviour. --HG-- extra : source : f97c20511828209757440839ed48d741d02d428f
2016-05-19	config, x86: Properly space pad the X86IntelMPBus Entry descriptions	Bjoern A. Zeeb
	According to the Intel Multi Processor Specification rev 1.4 (-006) (), section 4.3.2 Bus Entries, Bus type strings are >>6-character ASCII (blank-filled) strings<<. This patch properly pads the entries with the missing spaces at the end. () http://www.intel.com/design/pentium/datashts/24201606.pdf Committed by Jason Lowe-Power <power.jg@gmail.com>
2016-02-10	mem: Deduce if cache should forward snoops	Andreas Hansson
	This patch changes how the cache determines if snoops should be forwarded from the memory side to the CPU side. Instead of having a parameter, the cache now looks at the port connected on the CPU side, and if it is a snooping port, then snoops are forwarded. Less error prone, and less parameters to worry about. The patch also tidies up the CPU classes to ensure that their I-side port is not snooping by removing overrides to the snoop request handler, such that snoop requests will panic via the default MasterPort implement
2016-02-06	style: remove trailing whitespace	Steve Reinhardt
	Result of running 'hg m5style --skip-all --fix-white -a'.
2016-01-19	gpu-compute: AMD's baseline GPU model	Tony Gutierrez

2016-01-15	dev, arm: Add a platform with support for both aarch32 and aarch64	Andreas Sandberg
	Add a platform with support for both aarch32 and aarch64. This platform implements a subset of the devices in a real Versatile Express and extends it with some gem5-specific functionality. It is in many ways similar to the old VExpress_EMM64 platform, but supports the following new features: * Automatic PCI interrupt assignment * PCI interrupts allocated in a contiguous range. * Automatic boot loader selection (32-bit / 64-bit) * Cleaner memory map where gem5-specific devices live in CS5 which isn't used by current Versatile Express platforms. * No fake devices. Devices that were previously faked will be removed from the device tree instead. * Support for 510 GiB contiguous memory
2016-01-11	configs: Fix inheritance of HMCSystem and cleanup spacing	Andreas Hansson
	Minor fix to ensure the HMCSystem can actually be instantiated (SimObject cannot be created). Also address some spacing issues.
2016-01-07	config: Updates for distributed gem5 simulations	Gabor Dozsa

2015-12-07	config: Enable elastic trace capture and replay in se/fs	Radhika Jagtap
	This patch adds changes to the configuration scripts to support elastic tracing and replay. The patch adds a command line option to enable elastic tracing in SE mode and FS mode. When enabled the Elastic Trace cpu probe is attached to O3CPU and a few O3 CPU parameters are tuned. The Elastic Trace probe writes out both instruction fetch and data dependency traces. The patch also enables configuring the TraceCPU to replay traces using the SE and FS script. The replay run is designed to resume from checkpoint using atomic cpu to restore state keeping it consistent with FS run flow. It then switches to TraceCPU to replay the input traces.
2015-12-05	dev: Rewrite PCI host functionality	Andreas Sandberg
	The gem5's current PCI host functionality is very ad hoc. The current implementations require PCI devices to be hooked up to the configuration space via a separate configuration port. Devices query the platform to get their config-space address range. Un-mapped parts of the config space are intercepted using the XBar's default port mechanism and a magic catch-all device (PciConfigAll). This changeset redesigns the PCI host functionality to improve code reuse and make config-space and interrupt mapping more transparent. Existing platform code has been updated to use the new PCI host and configured to stay backwards compatible (i.e., no guest-side visible changes). The current implementation does not expose any new functionality, but it can easily be extended with features such as automatic interrupt mapping. PCI devices now register themselves with a PCI host controller. The host controller interface is defined in the abstract base class PciHost. Registration is done by PciHost::registerDevice() which takes the device, its bus position (bus/dev/func tuple), and its interrupt pin (INTA-INTC) as a parameter. The registration interface returns a PciHost::DeviceInterface that the PCI device can use to query memory mappings and signal interrupts. The host device manages the entire PCI configuration space. Accesses to devices decoded into the devices bus position and then forwarded to the correct device. Basic PCI host functionality is implemented in the GenericPciHost base class. Most platforms can use this class as a basic PCI controller. It provides the following functionality: * Configurable configuration space decoding. The number of bits dedicated to a device is a prameter, making it possible to support both CAM, ECAM, and legacy mappings. * Basic interrupt mapping using the interruptLine value from a device's configuration space. This behavior is the same as in the old implementation. More advanced controllers can override the interrupt mapping method to dynamically assign host interrupts to PCI devices. * Simple (base + addr) remapping from the PCI bus's address space to physical addresses for PIO, memory, and DMA.
2015-12-04	arm, config: Automatically discover available platforms	Andreas Sandberg
	Add support for automatically discover available platforms. The Python-side uses functionality similar to what we use when auto-detecting available CPU models. The machine IDs have been updated to match the platform configurations. If there isn't a matching machine ID, the configuration scripts default to -1 which Linux uses for device tree only platforms.
2015-11-06	mem: Add an option to perform clean writebacks from caches	Andreas Hansson
	This patch adds the necessary commands and cache functionality to allow clean writebacks. This functionality is crucial, especially when having exclusive (victim) caches. For example, if read-only L1 instruction caches are not sending clean writebacks, there will never be any spills from the L1 to the L2. At the moment the cache model defaults to not sending clean writebacks, and this should possibly be re-evaluated. The implementation of clean writebacks relies on a new packet command WritebackClean, which acts much like a Writeback (renamed WritebackDirty), and also much like a CleanEvict. On eviction of a clean block the cache either sends a clean evict, or a clean writeback, and if any copies are still cached upstream the clean evict/writeback is dropped. Similarly, if a clean evict/writeback reaches a cache where there are outstanding MSHRs for the block, the packet is dropped. In the typical case though, the clean writeback allocates a block in the downstream cache, and marks it writable if the evicted block was writable. The patch changes the O3_ARM_v7a L1 cache configuration and the default L1 caches in config/common/Caches.py
2015-11-06	mem: Add cache clusivity	Andreas Hansson
	This patch adds a parameter to control the cache clusivity, that is if the cache is mostly inclusive or exclusive. At the moment there is no intention to support strict policies, and thus the options are: 1) mostly inclusive, or 2) mostly exclusive. The choice of policy guides the behaviuor on a cache fill, and a new helper function, allocOnFill, is created to encapsulate the decision making process. For the timing mode, the decision is annotated on the MSHR on sending out the downstream packet, and in atomic we directly pass the decision to handleFill. We (ab)use the tempBlock in cases where we are not allocating on fill, leaving the rest of the cache unaffected. Simple and effective. This patch also makes it more explicit that multiple caches are allowed to consider a block writable (this is the case also before this patch). That is, for a mostly inclusive cache, multiple caches upstream may also consider the block exclusive. The caches considering the block writable/exclusive all appear along the same path to memory, and from a coherency protocol point of view it works due to the fact that we always snoop upwards in zero time before querying any downstream cache. Note that this patch does not introduce clean writebacks. Thus, for clean lines we are essentially removing a cache level if it is made mostly exclusive. For example, lines from the read-only L1 instruction cache or table-walker cache are always clean, and simply get dropped rather than being passed to the L2. If the L2 is mostly exclusive and does not allocate on fill it will thus never hold the line. A follow on patch adds the clean writebacks. The patch changes the L2 of the O3_ARM_v7a CPU configuration to be mostly exclusive (and stats are affected accordingly).
2015-11-04	configs: fix bug introduced due to 276ad9121192	Nilay Vaish
	I had made a typo in changeset 276ad9121192. This changeset fixes it
2015-11-03	mem: hmc: top level design	Erfan Azarkhish
	This patch enables modeling a complete Hybrid Memory Cube (HMC) device. It highly reuses the existing components in gem5's general memory system with some small modifications. This changeset requires additional patches to model a complete HMC device. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-11-03	sparc: add missing parameter to makeSparcSystem()	Palle Lyckegaard
	makeSparcSystem() in configs/common/FSConfig.py is missing the cmdLine parameter Without the parameter the simulation fails to start. With the parameter the simulation starts properly.
2015-09-16	config: Add configs scripts used in Learning gem5	Jason Lowe-Power
	Added a new directory in configs (learning_gem5) to hold the scripts that are used in the book. See http://lowepower.com/jason/learning_gem5/ for a working copy. For now, only the scripts in Part 1: Getting started with gem5 have been added. A separate patch adds tests for these scripts. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-08-21	mem: Add explicit Cache subclass and make BaseCache abstract	Andreas Hansson
	Open up for other subclasses to BaseCache and transition to using the explicit Cache subclass. --HG-- rename : src/mem/cache/BaseCache.py => src/mem/cache/Cache.py
2015-08-03	misc: Coupling gem5 with SystemC TLM2.0	Matthias Jung
	Transaction Level Modeling (TLM2.0) is widely used in industry for creating virtual platforms (IEEE 1666 SystemC). This patch contains a standard compliant implementation of an external gem5 port, that enables the usage of gem5 as a TLM initiator component in SystemC based virtual platforms. Both TLM coding paradigms loosely timed (b_transport) and aproximately timed (nb_transport) are supported. Compared to the original patch a TLM memory manager was added. Furthermore, the transaction object was removed and for each TLM payload a PacketPointer that points to the original gem5 packet is added as an TLM extension. For event handling single events are now created. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-07-03	mem: Remove redundant is_top_level cache parameter	Andreas Hansson
	This patch takes the final step in removing the is_top_level parameter from the cache. With the recent changes to read requests and write invalidations, the parameter is no longer needed, and consequently removed. This also means that asymmetric cache hierarchies are now fully supported (and we are actually using them already with L1 caches, but no table-walker caches, connected to a shared L2).
2015-07-03	mem: Allow read-only caches and check compliance	Andreas Hansson
	This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data. A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data.
2015-06-01	kvm, arm: Add support for aarch64	Andreas Sandberg
	This changeset adds support for aarch64 in kvm. The CPU module supports both checkpointing and online CPU model switching as long as no devices are simulated by the host kernel. It currently has the following limitations: * The system register based generic timer can only be simulated by the host kernel. Workaround: Use a memory mapped timer instead to simulate the timer in gem5. * Simulating devices (e.g., the generic timer) in the host kernel requires that the host kernel also simulates the GIC. * ID registers in the host and in gem5 must match for switching between simulated CPUs and KVM. This is particularly important for ID registers describing memory system capabilities (e.g., ASID size, physical address size). * Switching between a virtualized CPU and a simulated CPU is currently not supported if in-kernel device emulation is used. This could be worked around by adding support for switching to the gem5 (e.g., the KvmGic) side of the device models. A simpler workaround is to avoid in-kernel device models altogether.
2015-05-05	arch, cpu: Do not forward snoops to table walker	Andreas Hansson
	This patch simplifies the overall CPU by changing the TLB caches such that they do not forward snoops to the table walker port(s). Note that only ARM and X86 are affected. There is no reason for the ports to snoop as they do not actually take any action, and from a performance point of view we are better of not snooping more than we have to. Should it at a later point be required to snoop for a particular TLB design it is easy enough to add it back.
2015-04-29	cpu: o3: replace issueLatency with bool pipelined	Nilay Vaish
	Currently, each op class has a parameter issueLat that denotes the cycles after which another op of the same class can be issued. As of now, this latency can either be one cycle (fully pipelined) or same as execution latency of the op (not at all pipelined). The fact that issueLat is a parameter of type Cycles makes one believe that it can be set to any value. To avoid the confusion, the parameter is being renamed as 'pipelined' with type boolean. If set to true, the op would execute in a fully pipelined fashion. Otherwise, it would execute in an unpipelined fashion.
2015-04-23	config: enable setting SE-mode environment variables from file	bpotter

2015-04-20	config: Remove memory aliases and rely on class name	Andreas Hansson
	Instead of maintaining two lists, rely entirely on the class name. There is really no point in causing unecessary confusion.
2015-04-14	config, cpu: fix progress interval for switched CPUs	Malek Musleh
	This patch ensures that the CPU progress Event is triggered for the new set of switched_cpus that get scheduled (e.g. during fast-forwarding). it also avoids printing the interval state if the cpu is currently switched out. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-13	cpu: re-organizes the branch predictor structure.	Dibakar Gope
	Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-08	config: Support full-system with SST's memory system	Curtis Dunham
	This patch adds an example configuration in ext/sst/tests/ that allows an SST/gem5 instance to simulate a 4-core AArch64 system with SST's memHierarchy components providing all the caches and memories.
2015-03-27	arm, configs: Do not forward snoops from I cache	Andreas Hansson
	This fix simply tells the I cache to not forward snoops to the fetch unit (since there is really no reason to do so).
2015-03-23	config: expand '~' and '~user' in paths	Steve Reinhardt

2015-03-23	config: Add ability to exit simulation after initialization	Curtis Dunham
	When using gem5 as a slave simulator, it will not advance the clock on its own and depends on the master simulator calling simulate(). This new option lets us use the Python scripts to do all the configuration while stopping short of actually simulating anything.
2015-03-19	config: Specify OS type and release on command line	Chris Emmons
	This patch enables users to speficy --os-type on the command line. This option is used to take specific actions for an OS type, such as changing the kernel command line. This patch is part of the Android KitKat enablement.
2015-03-09	config: Fix for 'android' lookup in disk name	Rizwana Begum
	This patch modifies FSConfig.py to look for 'android' only in disk image name. Before this patch, 'android' was searched in full disk path. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-03-02	mem: Move crossbar default latencies to subclasses	Andreas Hansson
	This patch introduces a few subclasses to the CoherentXBar and NoncoherentXBar to distinguish the different uses in the system. We use the crossbar in a wide range of places: interfacing cores to the L2, as a system interconnect, connecting I/O and peripherals, etc. Needless to say, these crossbars have very different performance, and the clock frequency alone is not enough to distinguish these scenarios. Instead of trying to capture every possible case, this patch introduces dedicated subclasses for the three primary use-cases: L2XBar, SystemXBar and IOXbar. More can be added if needed, and the defaults can be overridden.
2015-01-16	config: add --root-device machine parameter	Curtis Dunham
	In case /dev/sda1 is not actually the boot partition for an image, we can override it on the command line or in a benchmark definition.
2015-02-05	config: rename 'file' var	Steve Reinhardt
	Rename uses of 'file' as a local variable to avoid conflict with the built-in type of the same name.
2015-02-05	config: make M5_PATH a real search path	Steve Reinhardt
	Although you can put a list of colon-separated directory names in M5_PATH, the current code just takes the first one that exists and assumes all files must live there. This change makes the code search the specified list of directories for each individual binary or disk image that's requested. The main motivation is that the x86/Alpha binaries and the ARM binaries are in separate downloads, and thus naturally end up in separate directories. With this change, you can have M5_PATH point to those two directories, then run any FS regression test without changing M5_PATH. Currently, you either have to merge the two download directories or change M5_PATH (or do something else I haven't figured out).
2015-02-03	config: Add XOR hashing to the DRAM channel interleaving	Andreas Hansson
	This patch uses the recently added XOR hashing capabilities for the DRAM channel interleaving. This avoids channel biasing due to strided access patterns.
2015-02-03	config: Adjust DRAM channel interleaving defaults	Andreas Hansson
	This patch changes the DRAM channel interleaving default behaviour to be more representative. The default address mapping (RoRaBaCoCh) moves the channel bits towards the least significant bits, and uses 128 byte as the default channel interleaving granularity. These defaults can be overridden if desired, but should serve as a sensible starting point for most use-cases.
2015-01-30	config: arm: fix os_flags	Malek Musleh
	Fix the makeArmSystem routine to reflect recent changes that support kernel commandline option when running android. Without this fix, trying to run android encounters a 'reference before assignment' error. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-20	scons: Do not build the InOrderCPU	Andreas Hansson
	One step closer to shifting focus to the MinorCPU.
2014-12-23	config: Expose the DRAM ranks as a command-line option	Andreas Hansson
	This patch gives the user direct influence over the number of DRAM ranks to make it easier to tune the memory density without affecting the bandwidth (previously the only means of scaling the device count was through the number of channels). The patch also adds some basic sanity checks to ensure that the number of ranks is a power of two (since we rely on bit slices in the address decoding).
2014-12-23	config: Add --memchecker option	Marco Elver
	This patch adds the --memchecker option, to denote that a MemChecker should be instantiated for the system. The exact usage of the MemChecker depends on the system configuration. For now CacheConfig.py makes use of the option, adding MemCheckerMonitor instances between CPUs and D-Caches. Note, however, that currently this only provides limited checking on a running system; other parts of the system, such as I/O devices are not monitored, and may cause warnings to be issued by the monitor.
2014-12-23	config: Add options to take/resume from SimPoint checkpoints	Dam Sunwoo
	More documentation at http://gem5.org/Simpoints Steps to profile, generate, and use SimPoints with gem5: 1. To profile workload and generate SimPoint BBV file, use the following option: --simpoint-profile --simpoint-interval <interval length> Requires single Atomic CPU and fastmem. <interval length> is in number of instructions. 2. Generate SimPoint analysis using SimPoint 3.2 from UCSD. (SimPoint 3.2 not included with this flow.) 3. To take gem5 checkpoints based on SimPoint analysis, use the following option: --take-simpoint-checkpoint=<simpoint file path>,<weight file path>,<interval length>,<warmup length> <simpoint file> and <weight file> is generated by SimPoint analysis tool from UCSD. SimPoint 3.2 format expected. <interval length> and <warmup length> are in number of instructions. 4. To resume from gem5 SimPoint checkpoints, use the following option: --restore-simpoint-checkpoint -r <N> --checkpoint-dir <simpoint checkpoint path> <N> is (SimPoint index + 1). E.g., "-r 1" will resume from SimPoint #0.