gem5 - gem5

Age	Commit message (Collapse)	Author
2014-01-29	mem: prefetcher: add options, support for unaligned addresses	Mitch Hayenga ext:(%2C%20Amin%20Farmahini%20%3Caminfar%40gmail.com%3E)
	This patch extends the classic prefetcher to work on non-block aligned addresses. Because the existing prefetchers in gem5 mask off the lower address bits of cache accesses, many predictable strides fail to be detected. For example, if a load were to stride by 48 bytes, with 64 byte cachelines, the current stride based prefetcher would see an access pattern of 0, 64, 64, 128, 192.... Thus not detecting a constant stride pattern. This patch fixes this, by training the prefetcher on access and not masking off the lower address bits. It also adds the following configuration options: 1) Training/prefetching only on cache misses, 2) Training/prefetching only on data acceses, 3) Optionally tagging prefetches with a PC address. #3 allows prefetchers to train off of prefetch requests in systems with multiple cache levels and PC-based prefetchers present at multiple levels. It also effectively allows a pipelining of prefetch requests (like in POWER4) across multiple levels of cache hierarchy. Improves performance on my gem5 configuration by 4.3% for SPECINT and 4.7% for SPECFP (geomean).
2014-01-29	cpu: fix bug when TrafficGen deschedules event	Xiangyu Dong
	Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-28	arm: Enable umask syscall in SE mode	Mitch Hayenga
	Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-28	base: Fix race condition in the socket listen function	Mitch Hayenga
	gem5 makes the incorrect assumption that by binding a socket, it effectively has allocated a port. Linux only allocates ports once you call listen on the given socket, not when you call bind. So even if the port was free when bind was called, another process (gem5 instance) could race in between the bind & listen calls and steal the port. In the current code, if the call to bind fails due to the port being in use (EADDRINUSE), gem5 retries for a different port. However if listen fails, gem5 just panics. The fix is testing the return value of listen and re-trying if it was due to EADDRINUSE. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-28	mem: Remove redundant findVictim() input argument	Amin Farmahini
	The patch (1) removes the redundant writeback argument from findVictim() (2) fixes the description of access() function Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-28	mem: Fixes a bug in simple_dram write merging	Amin Farmahini
	Fixes updating the value of size in the write merge function. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-28	x86: add a warning about the number of memory controllers	Nilay Vaish
	When memory size > 3GB, print a warning that twice the number of memory controllers would be created.
2014-01-27	x86: use lfpimm instead of limm for fptan	Nilay Vaish

2014-01-27	x86: implements x87 add/sub instructions	Nilay Vaish

2014-01-27	x86: implements fxch instruction.	Nilay Vaish

2014-01-27	x86: correct error in emms instruction.	Nilay Vaish

2014-01-27	config: allow more than 3GB of memory for x86 simulations	Nilay Vaish
	This patch edits the configuration files so that x86 simulations can have more than 3GB of memory. It also corrects a bug in the MemConfig.py script.
2014-01-27	stats: update sparc fs stats	Nilay Vaish

2014-01-27	stats: update eio stats for recent changes	Steve Reinhardt

2014-01-24	stats: update stats for ARMv8 changes	Ali Saidi

2014-01-24	arm: Add support for ARMv8 (AArch64 & AArch32)	ARM gem5 Developers
	Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64 kernel you are restricted to AArch64 user-mode binaries. This will be addressed in a later patch. Note: Virtualization is only supported in AArch32 mode. This will also be fixed in a later patch. Contributors: Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation) Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation) Mbou Eyole (AArch64 NEON, validation) Ali Saidi (AArch64 Linux support, code integration, validation) Edmund Grimley-Evans (AArch64 FP) William Wang (AArch64 Linux support) Rene De Jong (AArch64 Linux support, performance opt.) Matt Horsnell (AArch64 MP, validation) Matt Evans (device models, code integration, validation) Chris Adeniyi-Jones (AArch64 syscall-emulation) Prakash Ramrakhyani (validation) Dam Sunwoo (validation) Chander Sudanthi (validation) Stephan Diestelhorst (validation) Andreas Hansson (code integration, performance opt.) Eric Van Hensbergen (performance opt.) Gabe Black
2014-01-24	stats: update stats for cache occupancy and clock domain changes	Ali Saidi

2014-01-24	arch: Make all register index flattening const	Andreas Hansson
	This patch makes all the register index flattening methods const for all the ISAs. As part of this, readMiscRegNoEffect for ARM is also made const.
2014-01-24	checker: CheckerCPU handling of MiscRegs was incorrect	Geoffrey Blake
	The CheckerCPU model in pre-v8 code was not checking the updates to miscellaneous registers due to some methods for setting misc regs were not instrumented. The v8 patches exposed this by calling the instrumented misc reg update methods and then invoking the checker before the main CPU had updated its misc regs, leading to false positives about register mismatches. This patch fixes the non-instrumented misc reg update methods and places calls to the checker in the proper places in the O3 model.
2014-01-24	arch, cpu: Add support for flattening misc register indexes.	Ali Saidi
	With ARMv8 support the same misc register id results in accessing different registers depending on the current mode of the processor. This patch adds the same orthogonality to the misc register file as the others (int, float, cc). For all the othre ISAs this is currently a null-implementation. Additionally, a system variable is added to all the ISA objects.
2014-01-24	cpu: Add support for Memory+Barrier instruction types in O3 cpu.	Giacomo Gabrielli

2014-01-24	cpu: Add support for instructions that zero cache lines.	Ali Saidi

2014-01-24	cpu: Add CPU support for generatig wake up events when LLSC adresses are ↵	Ali Saidi
	snooped. This patch add support for generating wake-up events in the CPU when an address that is currently in the exclusive state is hit by a snoop. This mechanism is required for ARMv8 multi-processor support.
2014-01-24	mem: Add flag to request if it was generated by a page table walk	Giacomo Gabrielli

2014-01-24	mem: Add support for a security bit in the memory system	Giacomo Gabrielli
	This patch adds the basic building blocks required to support e.g. ARM TrustZone by discerning secure and non-secure memory accesses.
2014-01-24	sim: Add openat/fstatat syscalls and fix mremap	Chris Adeniyi-Jones
	This patch adds support for the openat and fstatat syscalls and broadens the support for mremap to make it work on OS X.
2014-01-24	mem: Remove explict cast from memhelper.	Ali Saidi
	Previously we were casting the result type to the the memory type which is incorrect for things like dual-memory operations which still return a single result.
2014-01-24	Cache: Collect very basic stats on tag and data accesses	Timothy M. Jones
	Adds very basic statistics on the number of tag and data accesses within the cache, which is important for power modelling. For the tags, simply count the associativity of the cache each time. For the data, this depends on whether tags and data are accessed sequentially, which is given by a new parameter. In the parallel case, all data blocks are accessed each time, but with sequential accesses, a single data block is accessed only on a hit.
2014-01-24	mem: per-thread cache occupancy and per-block ages	Dam Sunwoo
	This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump.
2014-01-24	base: add support for probe points and common probes	Matt Horsnell
	The probe patch is motivated by the desire to move analytical and trace code away from functional code. This is achieved by the probe interface which is essentially a glorified observer model. What this means to users: * add a probe point and a "notify" call at the source of an "event" * add an isolated module, that is being used to carry out your analysis (e.g. generate a trace) * register that module as a probe listener Note: an example is given for reference in src/cpu/o3/simple_trace.[hh\|cc] and src/cpu/SimpleTrace.py What is happening under the hood: * every SimObject maintains has a ProbeManager. * during initialization (src/python/m5/simulate.py) first regProbePoints and the regProbeListeners is called on each SimObject. this hooks up the probe point notify calls with the listeners. FAQs: Why did you develop probe points: * to remove trace, stats gathering, analytical code out of the functional code. * the belief that probes could be generically useful. What is a probe point: * a probe point is used to notify upon a given event (e.g. cpu commits an instruction) What is a probe listener: * a class that handles whatever the user wishes to do when they are notified about an event. What can be passed on notify: * probe points are templates, and so the user can generate probes that pass any type of argument (by const reference) to a listener. What relationships can be generated (1:1, 1:N, N:M etc): * there isn't a restriction. You can hook probe points and listeners up in a 1:1, 1:N, N:M relationship. They become useful when a number of modules listen to the same probe points. The idea being that you can add a small number of probes into the source code and develop a larger number of useful analysis modules that use information passed by the probes. Can you give examples: * adding a probe point to the cpu's commit method allows you to build a trace module (outputting assembler), you could re-use this to gather instruction distribution (arithmetic, load/store, conditional, control flow) stats. Why is the probe interface currently restricted to passing a const reference: * the desire, initially at least, is to allow an interface to observe functionality, but not to change functionality. * of course this can be subverted by const-casting. What is the performance impact of adding probes: * when nothing is actively listening to the probes they should have a relatively minor impact. Profiling has suggested even with a large number of probes (60) the impact of them (when not active) is very minimal (<1%).
2014-01-24	sim: Expose the current voltage for each object as a stat	Andreas Hansson

2014-01-24	sim: Expose the current clock period as a stat	Andreas Hansson
	This patch adds observability to the clock period of the clock domains by including it as a stat. As a result of adding this, the regressions will be updated in a separate patch.
2014-01-24	mem: track per-request latencies and access depths in the cache hierarchy	Matt Horsnell
	Add some values and methods to the request object to track the translation and access latency for a request and which level of the cache hierarchy responded to the request.
2014-01-24	config: Make the Clock a Tick parameter like Latency/Frequency	Andreas Hansson
	This patch makes the Clock a TickParamValue just like Latency/Frequency. There is no longer any need to distinguish it (originally needed to support multiplication).
2014-01-24	x86: Fix memory leak in table walker	Andreas Hansson
	This patch fixes a memory leak in the table walker, by ensuring that the sender state is deleted again if the request packet cannot be successfully sent.
2014-01-24	cpu: Relax check on squashed non-speculative instructions	Andreas Hansson
	This patch relaxes the check performed when squashing non-speculative instructions, as it caused problems with loads that were marked ready, and then stalled on a blocked cache. The assertion is now allowing memory references to be non-faulting.
2014-01-24	util: updated Streamline flow to support ARM DS-5 v5.17 protocol	Dam Sunwoo
	The previous flow supported ARM DS-5 v5.13 protocol.
2014-01-24	cpu: remove faulty simpoint basic block inst count assertion	Dam Sunwoo
	This patch removes an assertion in the simpoint profiling code that asserts that a previously-seen basic block has the exact same number of instructions executed as before. This can be false if the basic block generates aborts or takes interrupts at different locations within the basic block. The basic block profiling are not affected significantly as these events are rare in general.
2014-01-17	ruby: remove unused label no_vector	Nilay Vaish

2014-01-10	stats: updates due to changes to ruby	Nilay Vaish

2014-01-10	ruby: move all statistics to stats.txt, eliminate ruby.stats	Nilay Vaish

2014-01-10	stats: add function for adding two histograms	Nilay Vaish
	This patch adds a function to the HistStor class for adding two histograms. This functionality is required for Ruby. It also adds support for printing histograms in a single line.
2014-01-09	ruby: fix bug introduced to revision 8523754f8885	Nilay Vaish

2014-01-08	ruby: slicc: remove variable 'addr' used in calls to doTransition	Nilay Vaish
	This variable causes trouble if a variable of same name is declared in a protocol file. Hence it is being eliminated.
2014-01-04	ruby: add a three level MESI protocol.	Nilay Vaish
	The first two levels (L0, L1) are private to the core, the third level (L2)is possibly shared. The protocol supports clustered designs. For example, one can have two sets of two cores. Each core has an L0 and L1 cache. There are two L2 controllers where each set accesses only one of the L2 controllers.
2014-01-04	ruby: rename MESI_CMP_directory to MESI_Two_Level	Nilay Vaish
	This is because the next patch introduces a three level hierarchy. --HG-- rename : build_opts/ALPHA_MESI_CMP_directory => build_opts/ALPHA_MESI_Two_Level rename : build_opts/X86_MESI_CMP_directory => build_opts/X86_MESI_Two_Level rename : configs/ruby/MESI_CMP_directory.py => configs/ruby/MESI_Two_Level.py rename : src/mem/protocol/MESI_CMP_directory-L1cache.sm => src/mem/protocol/MESI_Two_Level-L1cache.sm rename : src/mem/protocol/MESI_CMP_directory-L2cache.sm => src/mem/protocol/MESI_Two_Level-L2cache.sm rename : src/mem/protocol/MESI_CMP_directory-dir.sm => src/mem/protocol/MESI_Two_Level-dir.sm rename : src/mem/protocol/MESI_CMP_directory-dma.sm => src/mem/protocol/MESI_Two_Level-dma.sm rename : src/mem/protocol/MESI_CMP_directory-msg.sm => src/mem/protocol/MESI_Two_Level-msg.sm rename : src/mem/protocol/MESI_CMP_directory.slicc => src/mem/protocol/MESI_Two_Level.slicc rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/config.ini => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/config.ini rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/ruby.stats rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/simerr => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/simerr rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/simout => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/simout rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/stats.txt rename : tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_CMP_directory/system.pc.com_1.terminal => tests/long/fs/10.linux-boot/ref/x86/linux/pc-simple-timing-ruby-MESI_Two_Level/system.pc.com_1.terminal rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/config.ini => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/config.ini rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/ruby.stats rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/simerr => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/simerr rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/simout => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/simout rename : tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/00.hello/ref/alpha/linux/simple-timing-ruby-MESI_Two_Level/stats.txt rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/config.ini => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/config.ini rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/ruby.stats rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/simerr => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/simerr rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/simout => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/simout rename : tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/00.hello/ref/alpha/tru64/simple-timing-ruby-MESI_Two_Level/stats.txt rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/config.ini => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/config.ini rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/ruby.stats rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/simerr => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/simerr rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/simout => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/simout rename : tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/50.memtest/ref/alpha/linux/memtest-ruby-MESI_Two_Level/stats.txt rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/config.ini => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/config.ini rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/ruby.stats => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/ruby.stats rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/simerr => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/simerr rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/simout => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/simout rename : tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_CMP_directory/stats.txt => tests/quick/se/60.rubytest/ref/alpha/linux/rubytest-ruby-MESI_Two_Level/stats.txt
2014-01-04	ruby: remove cntrl_id from python config scripts.	Nilay Vaish

2014-01-04	ruby: add support for clusters	Nilay Vaish
	A cluster over here means a set of controllers that can be accessed only by a certain set of cores. For example, consider a two level hierarchy. Assume there are 4 L1 controllers (private) and 2 L2 controllers. We can have two different hierarchies here: a. the address space is partitioned between the two L2 controllers. Each L1 controller accesses both the L2 controllers. In this case, each L1 controller is a cluster initself. b. both the L2 controllers can cache any address. An L1 controller has access to only one of the L2 controllers. In this case, each L2 controller along with the L1 controllers that access it, form a cluster. This patch allows for each controller to have a cluster ID, which is 0 by default. By setting the cluster ID properly, one can instantiate hierarchies with clusters. Note that the coherence protocol might have to be changed as well.
2014-01-04	ruby: some small changes	Nilay Vaish

2014-01-03	config, x86: move kernel specification from tests to FSConfig.py	Steve Reinhardt
	For some reason, the default x86 kernel is specified in tests/configs/x86_generic.py and not in configs/common/FSConfig.py, where the kernels for all the other ISAs are. This means that running configs/example/fs.py for x86 fails because no kernel is specified. Moving the specification over fixes this problem. There is another problem that this uncovers, which is that going past the init stage (i.e., past where the regression test stops) fails because the fsck test on the disk device fails, but that's a separate issue.