summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-11-06mem: Use the packet delays and do not just zero them outAndreas Hansson
This patch updates the I/O devices, bridge and simple memory to take the packet header and payload delay into account in their latency calculations. In all cases we add the header delay, i.e. the accumulated pipeline delay of any crossbars, and the payload delay needed for deserialisation of any payload. Due to the additional unknown latency contribution, the packet queue of the simple memory is changed to use insertion sorting based on the time stamp. Moreover, since the memory hands out exclusive (non shared) responses, we also need to ensure ordering for reads to the same address.
2015-11-06mem: Align rules for sinking inhibited packets at the slaveAndreas Hansson
This patch aligns how the memory-system slaves, i.e. the various memory controllers and the bridge, identify and deal with sinking of inhibited packets that are only useful within the coherent part of the memory system. In the future we could shift the onus to the crossbar, and add a parameter "is_point_of_coherence" that would allow it to sink the aforementioned packets.
2015-11-06mem: Do not treat CleanEvict as a write operationAndreas Hansson
This patch changes the CleanEvict command type to not be considered a write. Initially it was made a zero-sized write to match the writeback command, but as things developed it became clear that it causes more problems than it solves. For example, the memory modules (and bridge) should not consider the CleanEvict as a write, but instead discard it. With this patch it will be neither a read, nor write, and as it does not need a response the slave will simply sink it.
2015-11-06mem: Unify delayed packet deletionAndreas Hansson
This patch unifies how we deal with delayed packet deletion, where the receiving slave is responsible for deleting the packet, but the sending agent (e.g. a cache) is still relying on the pointer until the call to sendTimingReq completes. Previously we used a mix of a deletion vector and a construct using unique_ptr. With this patch we ensure all slaves use the latter approach.
2015-11-06misc: Appease clang static analyzerAndreas Hansson
A few minor fixes to issues identified by the clang static analyzer.
2015-11-06mem: Check the XBar's port queues on functional snoopsAndreas Sandberg
The CoherentXBar currently doesn't check its queued slave ports when receiving a functional snoop. This caused data corruption in cases when a modified cache lines is forwarded between two caches. Add the required functional calls into the queued slave ports.
2015-11-04configs: fix bug introduced due to 276ad9121192Nilay Vaish
I had made a typo in changeset 276ad9121192. This changeset fixes it
2015-11-03mem: hmc: minor fixesErfan Azarkhish
This patch performs two minor fixes to DRAMCtrl.py and xbar.hh in favor of the HMC patch series. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-11-03mem: hmc: serial link modelErfan Azarkhish
This changeset adds a serial link model for the Hybrid Memory Cube (HMC). SerialLink is a simple variation of the Bridge class, with the ability to account for the latency of packet serialization. Also trySendTiming has been modified to correctly model bandwidth. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-11-03mem: hmc: adds controllerErfan Azarkhish
This patch models a simple HMC Controller. It simply schedules the incoming packets to HMC Serial Links using a round robin mechanism. This patch should be applied in series with other patches modeling a complete HMC device. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-11-03mem: hmc: top level designErfan Azarkhish
This patch enables modeling a complete Hybrid Memory Cube (HMC) device. It highly reuses the existing components in gem5's general memory system with some small modifications. This changeset requires additional patches to model a complete HMC device. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-11-03sparc: add missing parameter to makeSparcSystem()Palle Lyckegaard
makeSparcSystem() in configs/common/FSConfig.py is missing the cmdLine parameter Without the parameter the simulation fails to start. With the parameter the simulation starts properly.
2015-10-29arm: Add secure flag to TableWalker request when neededNathanael Premillieu
2015-10-29dev: Fix segfault in flash deviceSascha Bischoff
Fix a bug in which the flash device would write out of bounds and could either trigger a segfault and corrupt the memory of other objects. This was caused by using pageSize in the place of pagesPerBlock when running the garbage collector. Also, added an assert to flag this condition in the future.
2015-10-29dev: Fix draining for UFSHostDevice and FlashDeviceSascha Bischoff
This patch fixes the drain logic for the UFSHostDevice and the FlashDevice. In the case of the FlashDevice, the logic for CheckDrain needed to be reversed, whilst in the case of the UFSHostDevice check drain was never being called. In both cases the system would never complete draining if the initial attampt to drain failed.
2015-10-29kvm, arm: Fix compilation errors due to API changesVictor Garcia
The checkpoint changes, along with the SMT patches have changed a number of APIs. Adapt the ArmKvmCPU accordingly.
2015-10-29mem: Clarify cache MSHR handling on fillAndreas Hansson
This patch addresses the upgrading of deferred targets in the MSHR, and makes it clearer by explicitly calling out what is happening (deferred targets are promoted if we get exclusivity without asking for it).
2015-10-25power: Implement Remote GDBBoris Shingarov
2015-10-23x86: Add missing explicit overrides for X86 devicesAndreas Hansson
Make clang >= 3.5 happy when compiling build/X86/gem5.opt on OSX.
2015-10-23arm: Add missing explicit overrides for ARM devicesAndreas Hansson
Make clang >= 3.5 happy when compiling build/ARM/gem5.opt on OSX.
2015-10-14mem: Pass snoop retries through the CommMonitorAndreas Hansson
Allow the monitor to be placed after a snooping port, and do not fail on snoop retries, but instead pass them on to the slave port.
2015-10-14ruby: profiler: provide the number of vnets through ruby systemNilay Vaish
The aim is to ultimately do away with the static function Network::getNumberOfVirtualNetworks().
2015-10-14ruby: remove unused functionalRead() function.Nilay Vaish
Not required since functional reads cannot rely on messages that are inflight.
2015-10-14ruby: garnet: flexible: refactor flitNilay Vaish
2015-10-12misc: Add explicit overrides and fix other clang >= 3.5 issuesAndreas Hansson
This patch adds explicit overrides as this is now required when using "-Wall" with clang >= 3.5, the latter now part of the most recent XCode. The patch consequently removes "virtual" for those methods where "override" is added. The latter should be enough of an indication. As part of this patch, a few minor issues that clang >= 3.5 complains about are also resolved (unused methods and variables).
2015-10-12misc: Remove redundant compiler-specific definesAndreas Hansson
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions.
2015-10-10stats: Update for UDelayEvent quiesce changeJoel Hestness
2015-10-10sim: Don't quiesce UDelayEvents with 0 latencyJoel Hestness
ARM uses UDelayEvents to emulate kernel __*udelay functions and speed up simulation. UDelayEvents call Pseudoinst::quiesceNs to quiesce the system for a specified delay. Changeset 10341:0b4d10f53c2d introduced the requirement that any quiesce process that is started must also be completed by scheduling an EndQuiesceEvent. This change causes the CPU to hang if an IsQuiesce instruction is executed, but the corresponding EndQuiesceEvent is not scheduled. Changeset 11058:d0934b57735a introduces a fix for uses of PseudoInst::quiesce* that would conditionally execute the EndQuiesceEvent. ARM UDelayEvents specify quiesce period of 0 ns (src/arch/arm/linux/system.cc), so changeset 11058 causes these events to now execute full quiesce processes, greatly increasing the total instructions executed in kernel delay loops and slowing simulation. This patch updates the UDelayEvent to conditionally execute PseudoInst::quiesceNs (**a quiesce operation**) only if the specified delay is >0 ns. The result is ARM delay loops no longer execute instructions for quiesce handling, and regression time returns to normal.
2015-10-09isa: Add parameter to pick different decoder inside ISARekai Gonzalez Alberquilla
The decoder is responsible for splitting instructions in micro operations (uops). Given that different micro architectures may split operations differently, this patch allows to specify which micro architecture each isa implements, so different cores in the system can split instructions differently, also decoupling uop splitting (microArch) from ISA (Arch). This is done making the decodification calls templates that receive a type 'DecoderFlavour' that maps the name of the operation to the class that implements it. This way there is only one selection point (converting the command line enum to the appropriate DecodeFeatures object). In addition, there is no explicit code replication: template instantiation hides that, and the compiler should be able to resolve a number of things at compile-time.
2015-10-09sim: Add relative break schedulingDylan Johnson
Add schedRelBreak() function, executable within a debugger, that sets a breakpoint by relative rather than absolute tick.
2015-10-06arch: clean up isa_parser error handlingSteve Reinhardt
Although some decent error messages were getting generated inside isa_parser.py, they weren't always getting printed because of the screwy way we were handling exceptions. (Basically an inner exception would get hidden by an outer exception, and the more informative inner error message would not get printed.) Also line numbers were messed up, since they were taken from the lexer, which is typically a token (or more) ahead of the grammar rule that's being matched. Using the 'lineno' attribute that PLY associates with the grammar production is more accurate. The new LineTracker class extends lineno to track filenames as well as line numbers.
2015-10-06sim: add ExecMacro to Exec* compound debug flagsSteve Reinhardt
Really should have been there in the first place, IMO. Makes debugging x86 execution a lot easier.
2015-10-06sim: print pid in output headerSteve Reinhardt
This information is useful if you have a bunch of simulations running and want to know which one to kill, but you've neglected to take advantage of the previous patch and embed the pid in your output path.
2015-10-06x86: implement rcpps and rcpss SSE instsSteve Reinhardt
These are packed single-precision approximate reciprocal operations, vector and scalar versions, respectively. This code was basically developed by copying the code for sqrtps and sqrtss. The mrcp micro-op was simplified relative to msqrt since there are no double-precision versions of this operation.
2015-10-06x86: implement fild, fucomi, and fucomip x87 instsSteve Reinhardt
fild loads an integer value into the x87 top of stack register. fucomi/fucomip compare two x87 register values (the latter also doing a stack pop). These instructions are used by some versions of GNU libstdc++.
2015-10-06ext: fix SST connectorCurtis Dunham
The renamings in changesets 8f5993cf (2015-03-23) "mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMW" and fdd4a895 (2015-07-03) "mem: Split WriteInvalidateReq into write and invalidate" broke the SST connector. This commit repeats those renamings in ext/sst.
2015-09-02sim: Add ability to break at specific kernel functionDylan Johnson
Adds a GDB callable function that sets a breakpoint at the beginning of a kernel function.
2015-10-05tests: Update SMT tests to correctly configure CPUsAndreas Sandberg
The 01.hello-2T-smt test case for the O3 CPU didn't correctly setup the number of threads before creating interrupt controllers, which confused the constructor in BaseCPU. This changeset adds SMT support to the test configuration infrastructure. --HG-- rename : tests/configs/o3-timing.py => tests/configs/o3-timing-mt.py rename : tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing/config.ini => tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing-mt/config.ini rename : tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing/simerr => tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing-mt/simerr rename : tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing/simout => tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing-mt/simout rename : tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing/stats.txt => tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing-mt/stats.txt
2015-10-02stats: update EIO stats for snoop filter changesSteve Reinhardt
2015-10-01config: Fix 'learning gem5' configs after SMT pushAndreas Hansson
This patch updates the 'learning gem5' example scripts to match the recent push of the SMT patches.
2015-09-30base: remove Trace::enabled flagCurtis Dunham
The DTRACE() macro tests both Trace::enabled and the specific flag. This change uses the same administrative interface for enabling/disabling tracing, but masks the SimpleFlags settings directly. This eliminates a load for every DTRACE() test, e.g. DPRINTF.
2015-09-30arm: Change TLB Software CachingMitch Hayenga
In ARM, certain variables are only updated when a necessary change is detected. Having 2 SMT threads share a TLB resulted in these not being updated as required. This patch adds a thread context identifer to assist in the invalidation of these variables.
2015-09-30cpu,isa,mem: Add per-thread wakeup logicMitch Hayenga
Changes wakeup functionality so that only specific threads on SMT capable cpus are woken.
2015-09-30isa,cpu: Add support for FS SMT InterruptsMitch Hayenga
Adds per-thread interrupt controllers and thread/context logic so that interrupts properly get routed in SMT systems.
2015-09-30arm: SMT MPIDR SettingMitch Hayenga
Changes assignment of the MPIDR for multi-threaded systems only.
2015-09-30cpu: Add per-thread monitorsMitch Hayenga
Adds per-thread address monitors to support FullSystem SMT.
2015-09-30config,cpu: Add SMT support to Atomic and Timing CPUsMitch Hayenga
Adds SMT support to the "simple" CPU models so that they can be used with other SMT-supported CPUs. Example usage: this enables the TimingSimpleCPU to be used to warmup caches before swapping to detailed mode with the in-order or out-of-order based CPU models.
2015-09-30cpu: Change thread assignments for heterogenous SMTMitch Hayenga
Trying to run an SE system with varying threads per core (SMT cores + Non-SMT cores) caused failures due to the CPU id assignment logic. The comment about thread assignment (worrying about core 0 not having tid 0) seems not to be valid given that our configuration scripts initialize them in order. This removes that constraint so a heterogenously threaded sytem can work.
2015-09-29ruby: Fix CacheMemory allocate leakJoel Hestness
If a cache entry permission was previously set to NotPresent, but the entry was not deleted, a following cache allocation can cause the entry to be leaked by setting the entry pointer to a newly allocated entry. To eliminate this possibility, check if the new entry is different from the old one, and if so, delete the old one.
2015-09-29arch, x86: Delete packet in IntDevice::recvResponseJoel Hestness
IntDevice::recvResponse is called from two places in current mainline: (1) the short circuit path of X86ISA::IntDevice::IntMasterPort::sendMessage for atomic mode, and (2) the full request->response path to and from the x86 interrupts device (finally called from MessageMasterPort::recvTimingResp). In the former case, the packet was deleted correctly, but in the latter case, the packet and request leak. To fix the leak, move request and packet deletion into IntDevice inherited class implementations of recvResponse.