summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2015-10-06sim: print pid in output headerSteve Reinhardt
This information is useful if you have a bunch of simulations running and want to know which one to kill, but you've neglected to take advantage of the previous patch and embed the pid in your output path.
2015-10-06x86: implement rcpps and rcpss SSE instsSteve Reinhardt
These are packed single-precision approximate reciprocal operations, vector and scalar versions, respectively. This code was basically developed by copying the code for sqrtps and sqrtss. The mrcp micro-op was simplified relative to msqrt since there are no double-precision versions of this operation.
2015-10-06x86: implement fild, fucomi, and fucomip x87 instsSteve Reinhardt
fild loads an integer value into the x87 top of stack register. fucomi/fucomip compare two x87 register values (the latter also doing a stack pop). These instructions are used by some versions of GNU libstdc++.
2015-10-06ext: fix SST connectorCurtis Dunham
The renamings in changesets 8f5993cf (2015-03-23) "mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMW" and fdd4a895 (2015-07-03) "mem: Split WriteInvalidateReq into write and invalidate" broke the SST connector. This commit repeats those renamings in ext/sst.
2015-09-02sim: Add ability to break at specific kernel functionDylan Johnson
Adds a GDB callable function that sets a breakpoint at the beginning of a kernel function.
2015-10-05tests: Update SMT tests to correctly configure CPUsAndreas Sandberg
The 01.hello-2T-smt test case for the O3 CPU didn't correctly setup the number of threads before creating interrupt controllers, which confused the constructor in BaseCPU. This changeset adds SMT support to the test configuration infrastructure. --HG-- rename : tests/configs/o3-timing.py => tests/configs/o3-timing-mt.py rename : tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing/config.ini => tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing-mt/config.ini rename : tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing/simerr => tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing-mt/simerr rename : tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing/simout => tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing-mt/simout rename : tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing/stats.txt => tests/quick/se/01.hello-2T-smt/ref/alpha/linux/o3-timing-mt/stats.txt
2015-10-02stats: update EIO stats for snoop filter changesSteve Reinhardt
2015-10-01config: Fix 'learning gem5' configs after SMT pushAndreas Hansson
This patch updates the 'learning gem5' example scripts to match the recent push of the SMT patches.
2015-09-30base: remove Trace::enabled flagCurtis Dunham
The DTRACE() macro tests both Trace::enabled and the specific flag. This change uses the same administrative interface for enabling/disabling tracing, but masks the SimpleFlags settings directly. This eliminates a load for every DTRACE() test, e.g. DPRINTF.
2015-09-30arm: Change TLB Software CachingMitch Hayenga
In ARM, certain variables are only updated when a necessary change is detected. Having 2 SMT threads share a TLB resulted in these not being updated as required. This patch adds a thread context identifer to assist in the invalidation of these variables.
2015-09-30cpu,isa,mem: Add per-thread wakeup logicMitch Hayenga
Changes wakeup functionality so that only specific threads on SMT capable cpus are woken.
2015-09-30isa,cpu: Add support for FS SMT InterruptsMitch Hayenga
Adds per-thread interrupt controllers and thread/context logic so that interrupts properly get routed in SMT systems.
2015-09-30arm: SMT MPIDR SettingMitch Hayenga
Changes assignment of the MPIDR for multi-threaded systems only.
2015-09-30cpu: Add per-thread monitorsMitch Hayenga
Adds per-thread address monitors to support FullSystem SMT.
2015-09-30config,cpu: Add SMT support to Atomic and Timing CPUsMitch Hayenga
Adds SMT support to the "simple" CPU models so that they can be used with other SMT-supported CPUs. Example usage: this enables the TimingSimpleCPU to be used to warmup caches before swapping to detailed mode with the in-order or out-of-order based CPU models.
2015-09-30cpu: Change thread assignments for heterogenous SMTMitch Hayenga
Trying to run an SE system with varying threads per core (SMT cores + Non-SMT cores) caused failures due to the CPU id assignment logic. The comment about thread assignment (worrying about core 0 not having tid 0) seems not to be valid given that our configuration scripts initialize them in order. This removes that constraint so a heterogenously threaded sytem can work.
2015-09-29ruby: Fix CacheMemory allocate leakJoel Hestness
If a cache entry permission was previously set to NotPresent, but the entry was not deleted, a following cache allocation can cause the entry to be leaked by setting the entry pointer to a newly allocated entry. To eliminate this possibility, check if the new entry is different from the old one, and if so, delete the old one.
2015-09-29arch, x86: Delete packet in IntDevice::recvResponseJoel Hestness
IntDevice::recvResponse is called from two places in current mainline: (1) the short circuit path of X86ISA::IntDevice::IntMasterPort::sendMessage for atomic mode, and (2) the full request->response path to and from the x86 interrupts device (finally called from MessageMasterPort::recvTimingResp). In the former case, the packet was deleted correctly, but in the latter case, the packet and request leak. To fix the leak, move request and packet deletion into IntDevice inherited class implementations of recvResponse.
2015-09-29ruby: RubyPort delete snoop requestsJoel Hestness
In RubyPort::ruby_eviction_callback, prior changes fixed a memory leak caused by instantiating separate packets for each port that the eviction was forwarded to. That change, however, left the instantiated request to also leak. Allocate it on the stack to avoid the leak.
2015-09-29ruby: Fix memory leak in AbstractControllerJoel Hestness
Recent changes to memory access queuing allocate requests for packets sent to memory controllers, but did not free the requests. Delete them to avoid leaks.
2015-09-29ruby: RubyMemoryControl delete requestsJoel Hestness
Changes to the RubyMemoryControl removed the dequeue function, which deleted MemoryNode instances. This results in leaked MemoryNode instances. Correctly delete these instances.
2015-09-29syscall_emul: Bandage readlink /proc/self/exeJoel Hestness
The recent changeset to readlink() to handle reading the /proc/self/exe link introduces a number of problems. This patch fixes two: 1) Because readlink() called on /proc/self/exe now uses LiveProcess::progName() to find the binary path, it will only get the zeroth parameter of the simulated system command line. However, if a config script also specifies the process' executable, the executable parameter is used to create the LiveProcess rather than the zeroth command line parameter. Thus, the zeroth command line parameter is not necessarily the correct path to the binary executing in the simulated system. To fix this, add a LiveProcess data member, 'executable', which is correctly set during instantiation and returned from progName(). 2) If a config script allows a user to pass a relative path as the zeroth simulated system command line parameter or process executable, readlink() will incorrecly return a relative path when called on '/proc/self/exe'. /proc/self/exe is always set to a full path, so running benchmarks can fail if a relative path is returned. To fix this, clean up the handling of LiveProcess::progName() within readlink() to get the full binary path. NOTE: This patch still leaves the potential problem that host full path to the binary bleeds into the simulated system, potentially causing the appearance of non-deterministic simulated system execution.
2015-09-25mem: Add PacketInfo to be used for packet probe pointsAndreas Hansson
This patch fixes a use-after-delete issue in the packet probe points by adding a PacketInfo struct to retain the key fields before passing the packet onwards. We want to probe the packet after it is successfully sent, but by that time the fields may be modified, and the packet may even be deleted. Amazingly enough the issue has gone undetected for months, and only recently popped up in our regressions.
2015-09-25stats: Update stats to reflect snoop-filter changesAndreas Hansson
2015-09-25mem: Add check for block status on WriteLineReq fillAndreas Hansson
More checks to help with understanding of functionality.
2015-09-25mem: Fix WriteLineReq fill behaviourAndreas Hansson
This patch fixes issues in the interactions between deferred snoops and WriteLineReq. More specifically, the patch addresses an issue where deferred snoops caused assertion failures when being serviced on the arrival of an InvalidateResp. The response packet was perceived to be invalidating, when actually it is not for the cache that sent out the original invalidation request.
2015-09-25mem: Comment clean-up for the snoop filterAndreas Hansson
Merely fixing up some style issues and adding more comments.
2015-09-25mem: Avoid adding and then removing empty snoop-filter itemsAndreas Hansson
This patch tidies up how we access the snoop filter for snoops, and avoids adding items only to later remove them.
2015-09-25mem: Only track snooping ports in the snoop filterAndreas Hansson
This patch changes the tracking of ports in the snoop filter to use local dense port IDs so that we can have 64 snooping ports (rather than crossbar slave ports). This is achieved by adding a simple remapping vector that translates the actal port IDs into the local slave IDs used in the SnoopMask. Ultimately this patch allows us to scale to much larger systems without introducing a hierarchy of crossbars.
2015-09-25mem: Add snoop filters to L2 crossbars, and check sizeAli Jafri
This patch adds a snoop filter to the L2XBar. For now we refrain from globally adding a snoop filter to the SystemXBar, since the latter is also used in systems without caches. In scenarios without caches the snoop filter will not see any writeback/clean evicts from the CPU ports, despite the fact that they are snooping. To avoid inadvertent use of the snoop filter in these cases we leave it out for now. A size check is added to the snoop filter, merely to ensure it does not grow beyond the total capacity of the caches above it. The size has to be set manually, and a value of 8 MByte is choosen as suitably high default.
2015-09-25mem: Store snoop filter lookup result to avoid second lookupAndreas Hansson
This patch introduces a private member storing the iterator from the lookupRequest call, such that it can be re-used when the request eventually finishes. The method previously called updateRequest is renamed finishRequest to make it more clear that the two functions must be called together.
2015-09-25mem: Add snoops for CleanEvicts and Writebacks in atomic modeAli Jafri
This patch mirrors the logic in timing mode which sends up snoops to check for cached copies before sending CleanEvicts and Writebacks down the memory hierarchy. In case there is a copy in a cache above, discard CleanEvicts and set the BLOCK_CACHED flag in Writebacks so that writebacks do not reset the cache residency bit in the snoop filter below.
2015-09-25mem: Add CleanEvict and Writeback support to snoop filtersAli Jafri
This patch adds the functionality to properly track CleanEvicts and Writebacks in the snoop filter. Previously there were no CleanEvicts, and Writebacks did not send up snoops to ensure there were no copies in caches above. Hence a writeback could never erase an entry from the snoop filter. When a CleanEvict message reaches a snoop filter, it confirms that the BLOCK_CACHED flag is not set and resets the bits corresponding to the CleanEvict address and port it arrived on. If none of the other peer caches have (or have requested) the block, the snoop filter forwards the CleanEvict to lower levels of memory. In case of a Writeback message, the snoop filter checks if the BLOCK_CACHED flag is not set and only then resets the bits corresponding to the Writeback address. If any of the other peer caches have (or has requested) the same block, the snoop filter sets the BLOCK_CACHED flag in the Writeback before forwarding it to lower levels of memory heirarachy.
2015-09-25mem: Add check for snooping ports in the snoop filterAli Jafri
This patch prevents the snoop filter from creating items for requests originating from non-snooping ports. The allocation decision is thus based both on the cacheability of the line, and the snooping status of the source port. Ultimately we should check if the source of the packet is caching, since also the CPU ports are snooping (but not allocating). Thus, at the moment we rely on the snoop filter being used together with caches. The patch also transitions to use the Packet::getBlockAddr in determining the line address.
2015-09-25mem: Make the coherent crossbar account for timing snoopsAndreas Hansson
This patch introduces the concept of a snoop latency. Given the requirement to snoop and forward packets in zero time (due to the coherency mechanism), the latency is accounted for later. On a snoop, we establish the latency, and later add it to the header delay of the packet. To allow multiple caches to contribute to the snoop latency, we use a separate variable in the packet, and then take the maximum before adding it to the header delay.
2015-09-25mem: Do not include snoop-filter latency in crossbar occupancyAndreas Hansson
This patch ensures that the snoop-filter latency only contributes to the packet latency, and not to the crossbar throughput/occupancy. In essence we treat the snoop-filter lookup as pipelined.
2015-09-25util: Fix minor issues in DRAM sweep scriptsAndreas Hansson
This patch fixes a few issues in the sweep scripts, bringing them up-to-date with the latest memory configs and options.
2015-09-24ruby: simple network: refactor codeNilay Vaish
Drops an unused variable and marks three variables as const.
2015-09-23ruby: garnet: refactor code in network linksNilay Vaish
2015-09-23ruby: bloom filters: refactor codeNilay Vaish
2015-09-23ruby: abstract controller: mark some variables as constNilay Vaish
2015-09-22mem: Add initial HBM configurationsWendy Elsasser
Created the following HBM configurations: 1) HBM gen1 (x128/CH), 2Gb die, 4H stack, 1Gbps, 8 channels 2) HBM gen2 (x64/PC), 8Gb die, 4H stack, 1Gbps, 16 pseudo-channels The configuration values are based on: - The HBM gen1 public JEDEC spec - Publically released data from MemCon presentations - Timing extrapolated from existing LPDDR configurations Will adjust once specs become available.
2015-09-18ruby: garnet: mark some variables as constNilay Vaish
2015-09-18ruby: print addresses in hexNilay Vaish
Changeset 4872dbdea907 replaced Address by Addr, but did not make changes to print statements. So the addresses which were being printed in hex earlier along with their line address, were now being printed in decimals. This patch adds a function printAddress(Addr) that can be used to print the address in hex along with the lines address. This function has been put to use in some of the places. At other places, change has been made to print just the address in hex.
2015-09-18ruby: slicc: derive DataMember class from Var instead of PairContainerNilay Vaish
The DataMember class in Type.py was being derived from PairContainer. A separate Var object was also created for the DataMember. This meant some duplication of across the members of these two classes (Var and DataMember). This patch changes DataMember from Var instead. There is no obvious reason to derive from PairContainer which can only hold pairs, something that Var class already supports. The only thing that DataMember has over Var is init_code, which is being retained. This change would later on help in having pointers in DataMembers.
2015-09-17ruby: update WireBuffer API to match that of MessageBufferTony Gutierrez
this patch updates the WireBuffer API to mirror the changes in revision 11111
2015-09-16stats: updates due to changes to MOESI_hammerNilay Vaish
2015-09-16ruby: Add missing block deallocations in MOESI_hammerLena Olson
Some blocks in MOESI hammer were not getting deallocated when they were set to an idle state (e.g. by invalidate or other_getx/s messages). While functionally correct, this caused some bad effects on performance, such as blocks in I in the L1s getting sent to the L2 upon eviction, in turn evicting valid blocks. Also, if a valid block was in LRU, that block could be evicted rather than a block in I. This patch adds in the missing deallocations. Committed by: Nilay Vaish<nilay@cs.wisc.edu>
2015-09-16ruby: fix message buffer init orderJoe Gross
The recent changes to make MessageBuffers SimObjects required them to be initialized in a particular order, which could break some protocols. Fix this by calling initNetQueues on the external nodes of each external link in the constructor of Network. This patch also refactors the duplicated code for checking network allocation and setting net queues (which are called by initNetQueues) from the simple and garnet networks to be in Network.
2015-09-16stats: slight changes to MOESI_CMP_token.Nilay Vaish
Due slight change to latency for the reissue table.