summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-01-20cpu: Fix retry bug in MinorCPU LSQAndreas Hansson
2015-01-20mem: Move DRAM interleaving check to initAndreas Hansson
This patch fixes a bug where the DRAM controller tried to access the system cacheline size before the system pointer was initialised. It also fixes a bug where the granularity is 0 (no interleaving).
2015-01-10stats: changes due to recent changesets.Nilay Vaish
2015-01-10x86 : fxsave and fxrestore missing template codeEmilio Castillo
This patch corrects the FXSAVE and FXRSTOR Macroops. The actual code used for saving/restore the FP registers is in the file but it was not used. The FXSAVE and FXRSTOR instructions are used in the kernel for saving and loading the state of the mmx,xmm and fpu registers. This operation is triggered in FS by issuing a Device Not Available Fault. The cr0 register has a TS flag that is set upon each context change. Every time a task access any FP related register (SIMD as well) if the TS flag is set to one, the device not available fault is issued. The kernel saves the current state of the registers, and restore the previous state of the currently running task. Right now Gem5 lacks of this capability. the Device Not Available Fault is never issued, leading to several problems when different threads share the same CPU and SMT is not used. The PARSEC Ferret benchmark is an example of this behavior. In order to test this a hack in the atomic cpu code was done to detect if a static instruction has any FP operands and the cr0 reg TS bit is set. This check must be done in the ISA dependent code. But it seems to be tricky to access the cr0 register while executing an instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-10cpu: fix RetiredStores probe pointNikos Nikoleris
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-06dev: prevent intel 8254 timer counter events firing before startupcdirik
This change includes edits to Intel8254Timer to prevent counter events firing before startup to comply with SimObject initialization call sequence. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-07test: Add a unittest for the BitUnion types.Gabe Black
2015-01-07base: Fix assigning between identical bitfields.Gabe Black
If two bitfields are of the same type, also implying that they have the same first and last bit positions, the existing implementation would copy the entire bitfield. That includes the __data member which is shared among all the bitfields, effectively overwritting the entire bitunion. This change also adjusts the write only signed bitfield assignment operator to be like the unsigned version, using "using" instead of implementing it again and calling down to the underlying implementation.
2015-01-07stats: x86: Update stats for the CPUID change.Gabe Black
2015-01-06x86: Enable three bits in the FamilyModelStepping ECX CPUID bitfield.Gabe Black
These are for the monitor/mwait instructions, SSSE3, and XSAVE.
2015-01-06cpuid, x86: Revert "Enabling more features in CPUid"Gabe Black
That change enables CPUID bits for features that aren't implemented in gem5. If a simulated system tries to use those features because it was told it could, bad things can happen.
2015-01-04stats: changes due to recent changesets.Nilay Vaish
2015-01-03arm: fix build_drive_system when not using default optionsAnthony Gutierrez
when trying to dual boot on arm build_drive_system will only use the default values for the dtb file, number of processors, and disk image. if you are using the non-default files by passing values on the command line for example, or by making a new entry in Benchmarks.py, the build config scripts will still look for the default files. this will lead to the wrong system files being used, or the simulator will fail if you do not have them. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-03minor: fixed LSQ MasterPortIDAndrew Lukefahr
Minor was reporting the data cache access as ".inst" accesses. This just switches the MasterPortID to dataMasterPortId. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-03arm: Add unlinkat syscall implementationmike upton
added ARM aarch64 unlinkat syscall support, modeled on other <xxx>at syscalls. This gets all of the cpu2006 int workloads passing in SE mode on aarch64. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-03x86: implements the simd128 ADDSUBPD instructionMaxime Martinasso
This patch implements the simd128 ADDSUBPD instruction for the x86 architecture. Tested with a simple program in assembly language which executes the instruction. Checked that different versions of the instruction are executed by using the execution tracing option. Committed by: Nilay Vaish <nilay@cs.wisc.edu
2015-01-03dev: prevent RTC events firing before startupCagdas Dirik
This change includes edits to MC146818 timer to prevent RTC events firing before startup to comply with SimObject initialization call sequence. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-03configs: ruby: removes bug introduced by 05b5a6cf3521Nilay Vaish
2014-12-27syscall_emul: Return correct writev valueJoel Hestness
According to Linux man pages, if writev is successful, it returns the total number of bytes written. Otherwise, it returns an error code. Instead of returning 0, return the result from the actual call to writev in the system call.
2014-12-23stats: Bump stats for decoder, TLB, prefetcher and DRAM changesAndreas Hansson
Changes due to speculative execution of an unaligned PC, introduction of TLB stats, changes and re-work of the prefetcher, and the introduction of rank-wise refresh in the DRAM controller.
2014-12-23mem: Change prefetcher to use random_mtMitch Hayenga
Prefechers has used rand() to generate random numers previously.
2014-12-23mem: Hide WriteInvalidate requests from prefetchersCurtis Dunham
Without this tweak, a prefetcher will happily prefetch data that will promptly be invalidated and overwritten by a WriteInvalidate.
2014-12-23mem: Fix event scheduling issue for prefetchesMitch Hayenga
The cache's MemSidePacketQueue schedules a sendEvent based upon nextMSHRReadyTime() which is the time when the next MSHR is ready or whenever a future prefetch is ready. However, a prefetch being ready does not guarentee that it can obtain an MSHR. So, when all MSHRs are full, the simulation ends up unnecessiciarly scheduling a sendEvent every picosecond until an MSHR is finally freed and the prefetch can happen. This patch fixes this by not signaling the prefetch ready time if the prefetch could not be generated. The event is rescheduled as soon as a MSHR becomes available.
2014-12-23mem: Fix bug relating to writebacks and prefetchesMitch Hayenga
Previously the code commented about an unhandled case where it might be possible for a writeback to arrive after a prefetch was generated but before it was sent to the memory system. I hit that case. Luckily the prefetchSquash() logic already in the code handles dropping prefetch request in certian circumstances.
2014-12-23mem: Rework the structuring of the prefetchersMitch Hayenga
Re-organizes the prefetcher class structure. Previously the BasePrefetcher forced multiple assumptions on the prefetchers that inherited from it. This patch makes the BasePrefetcher class truly representative of base functionality. For example, the base class no longer enforces FIFO order. Instead, prefetchers with FIFO requests (like the existing stride and tagged prefetchers) now inherit from a new QueuedPrefetcher base class. Finally, the stride-based prefetcher now assumes a custimizable lookup table (sets/ways) rather than the previous fully associative structure.
2014-12-23mem: Add parameter to reserve MSHR entries for demand accessMitch Hayenga
Adds a new parameter that reserves some number of MSHR entries for demand accesses. This helps prevent prefetchers from taking all MSHRs, forcing demand requests from the CPU to stall.
2014-12-23arm: Add stats to table walkerCurtis Dunham
This patch adds table walker stats for: - Walk events - Instruction vs Data - Page size histogram - Wait time and service time histograms - Pending requests histogram (per cycle) - measures dist. of L (p(1..) = how often busy, p(0) = how often idle) - Squashes, before starting and after completion
2014-12-23config: Expose the DRAM ranks as a command-line optionAndreas Hansson
This patch gives the user direct influence over the number of DRAM ranks to make it easier to tune the memory density without affecting the bandwidth (previously the only means of scaling the device count was through the number of channels). The patch also adds some basic sanity checks to ensure that the number of ranks is a power of two (since we rely on bit slices in the address decoding).
2014-12-23mem: Ensure DRAM controller is idle when in atomic modeAndreas Hansson
This patch addresses an issue seen with the KVM CPU where the refresh events scheduled by the DRAM controller forces the simulator to switch out of the KVM mode, thus killing performance. The current patch works around the fact that we currently have no proper API to inform a SimObject of the mode switches. Instead we rely on drainResume being called after any switch, and cache the previous mode locally to be able to decide on appropriate actions. The switcheroo regression require a minor stats bump as a result.
2014-12-23mem: Add rank-wise refresh to the DRAM controllerOmar Naji
This patch adds rank-wise refresh to the controller, as opposed to the channel-wide refresh currently in place. In essence each rank can be refreshed independently, and for this to be possible the controller is extended with a state machine per rank. Without this patch the data bus is always idle during a refresh, as all the ranks are refreshing at the same time. With the rank-wise refresh it is possible to use one rank while another one is refreshing, and thus the data bus can be kept busy. The patch introduces a Rank class to encapsulate the state per rank, and also shifts all the relevant banks, activation tracking etc to the rank. The arbitration is also updated to consider the state of the rank.
2014-12-23mem: Fix a bug in the DRAM controller arbitrationOmar Naji
Fix a minor issue that affects multi-rank systems.
2014-12-23tests: Add a regression for the stack distance calculatorAndreas Hansson
Re-use the existing traffic generator regression, and enable the stack distance calculation in the comm monitor, along with the verification stack. The traffic generator config is also tuned to not increase the run-time too much (and actually have some address re-use).
2014-12-23mem: Add stack distance statistics to the CommMonitorKanishk Sugand
This patch adds the stack distance calculator to the CommMonitor. The stats are disabled by default.
2014-12-23mem: Add a stack distance calculatorKanishk Sugand
This patch adds a stand-alone stack distance calculator. The stack distance calculator is a passive SimObject that observes the addresses passed to it. It calculates stack distances (LRU Distances) of incoming addresses based on the partial sum hierarchy tree algorithm described by Alamasi et al. http://doi.acm.org/10.1145/773039.773043. For each transaction a hashtable look-up is performed. At every non-unique transaction the tree is traversed from the leaf at the returned index to the root, the old node is deleted from the tree, and the sums (to the right) are collected and decremented. The collected sum represets the stack distance of the found node. At every unique transaction the stack distance is returned as numeric_limits<uint64>::max(). In addition to the basic stack distance calculation, a feature to mark an old node in the tree is added. This is useful if it is required to see the reuse pattern. For example, Writebacks to the lower level (e.g. membus from L2), can be marked instead of being removed from the stack (isMarked flag of Node set to True). And then later if this same address is accessed (by L1), the value of the isMarked flag would be True. This gives some insight on how the Writeback policy of the lower level affect the read/write accesses in an application. Debugging is enabled by setting the verify flag to true. Debugging is implemented using a dummy stack that behaves in a naive way, using STL vectors. Note that this has a large impact on run time.
2014-12-23config: Add --memchecker optionMarco Elver
This patch adds the --memchecker option, to denote that a MemChecker should be instantiated for the system. The exact usage of the MemChecker depends on the system configuration. For now CacheConfig.py makes use of the option, adding MemCheckerMonitor instances between CPUs and D-Caches. Note, however, that currently this only provides limited checking on a running system; other parts of the system, such as I/O devices are not monitored, and may cause warnings to be issued by the monitor.
2014-12-23mem: Add MemChecker and MemCheckerMonitorMarco Elver
This patch adds the MemChecker and MemCheckerMonitor classes. While MemChecker can be integrated anywhere in the system and is independent, the most convenient usage is through the MemCheckerMonitor -- this however, puts limitations on where the MemChecker is able to observe read/write transactions.
2014-12-23arm: Raise an alignment fault if a PC has illegal alignmentAndreas Sandberg
We currently don't handle unaligned PCs correctly. There is one check for unaligned PCs in the TLB when running in aarch64 mode, but this check does not cover cases where the CPU does not do a TLB lookup when decoding an instruction (e.g., a branch stays within the same cache line). Additionally, the Decoder class sometimes throws an assertion for unaligned PCs which breaks speculation. This changeset introduces a decoder fault bit field in the ExtMachInst structure. This field can be used to signal a decoder failure. If set, the decoder generates an internal gem5fault instruction instead of a normal instruction. This instruction in turns either panics (fault type PANIC), returns an PCAlignmentFault (fault type UNALIGNED, aarch64) or PrefetchAbort (fault type UNALIGNED, aarch32). The patch causes minor changes to the realview64 regressions, and a stats bump will follow.
2014-12-23arm: Clean up and document decoder APIAndreas Sandberg
This changeset adds more documentation to the ArmISA::Decoder class and restructures it slightly to make API groups more obvious.
2014-12-23arm: Add support for filtering in the PMUAndreas Sandberg
This patch adds support for filtering events in the PMU. In order to do so, it updates the ISADevice base class to forward an ISA pointer to ISA devices. This enables such devices to access the MiscReg file to determine the current execution level.
2014-12-23config: Add options to take/resume from SimPoint checkpointsDam Sunwoo
More documentation at http://gem5.org/Simpoints Steps to profile, generate, and use SimPoints with gem5: 1. To profile workload and generate SimPoint BBV file, use the following option: --simpoint-profile --simpoint-interval <interval length> Requires single Atomic CPU and fastmem. <interval length> is in number of instructions. 2. Generate SimPoint analysis using SimPoint 3.2 from UCSD. (SimPoint 3.2 not included with this flow.) 3. To take gem5 checkpoints based on SimPoint analysis, use the following option: --take-simpoint-checkpoint=<simpoint file path>,<weight file path>,<interval length>,<warmup length> <simpoint file> and <weight file> is generated by SimPoint analysis tool from UCSD. SimPoint 3.2 format expected. <interval length> and <warmup length> are in number of instructions. 4. To resume from gem5 SimPoint checkpoints, use the following option: --restore-simpoint-checkpoint -r <N> --checkpoint-dir <simpoint checkpoint path> <N> is (SimPoint index + 1). E.g., "-r 1" will resume from SimPoint #0.
2014-12-22scons: Make the USE_KVM variable available in C++.Gabe Black
We need it to determine whether we should expect KVM related parameters exist in the cirrus graphics device.
2014-12-14Added tag stable_2014_12_14 for changeset bdb307e8be54Nilay Vaish
2014-12-09Let other objects set up memory like regions in a KVM VM.Gabe Black
2014-12-08arm: Fix decoding of PMXEVTYPER_EL0 and PMCCFILTR_EL0Andreas Sandberg
The aarch64 system register decoder is currently not decoding PMXEVTYPER_EL0 and PMCCFILTR_EL0 correctly. This changeset updates the decoder so that they are decoded using the values in table C5-6 in ARM DDI 0478A.c.
2014-12-08dev: Add response sanity checks in PioPortAndreas Sandberg
Add an assert in the PioPort that checks if a response packet from a device has the right flags set before passing it to them rest of the memory system.
2014-12-08dev: Correctly transform packets into responsesAndreas Sandberg
The VirtIO devices didn't correctly set the response flags in memory packets. This changeset adds the required Packet::makeResponse() calls.
2014-12-05misc: Generalize GDB single stepping.Gabe Black
The new single stepping implementation for x86 doesn't rely on any ISA specific properties or functionality. This change pulls out the per ISA implementation of those functions and promotes the X86 implementation to the base class. One drawback of that implementation is that the CPU might stop on an instruction twice if it's affected by both breakpoints and single stepping. While that might be a little surprising, it's harmless and would only happen under somewhat unlikely circumstances.
2014-12-05x86: Implement a remote GDB stub.Gabe Black
This stub should allow remote debugging of 32 bit and 64 bit targets. Single stepping seems to work, as do breakpoints. If both breakpoints and single stepping affect an instruction, gdb will stop at the instruction twice before continuing. That's a little surprising, but is generally harmless.
2014-12-05misc: Add some utility functions for schedule inst commit events.Gabe Black
These can be used to simplify the implementation of single step in derived classes.
2014-12-05misc: Rename the GDB "Event" event class to InputEvent.Gabe Black
The "Event" name is the same as the base event class. That's a bit confusing, and makes it a little awkward to add other event types.