summaryrefslogtreecommitdiff
path: root/src/cpu
AgeCommit message (Collapse)Author
2015-03-02cpu: Add a PC-value to the traffic generator requestsStephan Diestelhorst
Have the traffic generator add its masterID as the PC address to the requests. That way, prefetchers (and other components) that use a PC for request classification will see per-tester streams of requests. This enables us to test strided prefetchers with the memchecker, too.
2015-02-16cpu: TrafficGen sinks snoops without complainingAndreas Hansson
To be able to use the TrafficGen in a system with caches we need to allow it to sink incoming snoop requests. By default the master port panics, so silently ignore any snoops.
2015-02-16arch: Make readMiscRegNoEffect const throughoutAndreas Hansson
Finally took the plunge and made this apply to all ISAs, not just ARM.
2015-02-16cpu: add support for outputing a protobuf formatted CPU traceAli Saidi
Doesn't support x86 due to static instruction representation. --HG-- rename : src/cpu/CPUTracers.py => src/cpu/InstPBTrace.py
2015-02-11cpu: Tidy up the MemTest and make false sharing more obviousAndreas Hansson
The MemTest class really only tests false sharing, and as such there was a lot of old cruft that could be removed. This patch cleans up the tester, and also makes it more clear what the assumptions are. As part of this simplification the reference functional memory is also removed. The regression configs using MemTest are updated to reflect the changes, and the stats will be bumped in a separate patch. The example config will be updated in a separate patch due to more extensive re-work. In a follow-on patch a new tester will be introduced that uses the MemChecker to implement true sharing.
2015-02-11sim: Move the BaseTLB to src/arch/generic/Andreas Sandberg
The TLB-related code is generally architecture dependent and should live in the arch directory to signify that. --HG-- rename : src/sim/BaseTLB.py => src/arch/generic/BaseTLB.py rename : src/sim/tlb.cc => src/arch/generic/tlb.cc rename : src/sim/tlb.hh => src/arch/generic/tlb.hh
2015-02-06cpu: Idle CPU status logic revisedAlexandru Dutu
This patch sets the CPU status to idle when the last active thread gets suspended.
2015-02-03cpu: Ensure timing CPU sinks response before sending new requestAndreas Hansson
This patch changes how the timing CPU deals with processing responses, always scheduling an event, even if it is for the current tick. This helps to avoid situations where a new request shows up before a response is finished in the crossbar, and also is more in line with any realistic behaviour.
2015-01-25arm: always set the IsFirstMicroop flagAli Saidi
While the IsFirstMicroop flag exists it was only occasionally used in the ARM instructions that gem5 microOps and therefore couldn't be relied on to be correct.
2015-01-25sim: Clean up InstRecordAli Saidi
Track memory size and flags as well as add some comments and consts.
2015-01-25cpu: Remove all notion that we know when the cpu is misspeculating.Ali Saidi
We have no way of knowing if a CPU model is on the wrong path with our execute-in-execute CPU models. Don't pretend that we do.
2015-01-25cpu: Put all CPU instruction tracers in a single fileAli Saidi
2015-01-25cpu: remove legion tracerAli Saidi
If someone wants to debug with legion again they can restore the code from the repository, but no need to have it hang around indefinately.
2015-01-22mem: Clean up Request initialisationAndreas Hansson
This patch tidies up how we create and set the fields of a Request. In essence it tries to use the constructor where possible (as opposed to setPhys and setVirt), thus avoiding spreading the information across a number of locations. In fact, setPhys is made private as part of this patch, and a number of places where we callede setVirt instead uses the appropriate constructor.
2015-01-20cpu: commit probe notification on every microop or macroopNikos Nikoleris
The ppCommit should notify the attached listener every time the cpu commits a microop or non microcoded insturction. The listener can then decide whether it will process only the last microop (eg. SimPoint probe). Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-20cpu: Fix retry bug in MinorCPU LSQAndreas Hansson
2015-01-10cpu: fix RetiredStores probe pointNikos Nikoleris
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-03minor: fixed LSQ MasterPortIDAndrew Lukefahr
Minor was reporting the data cache access as ".inst" accesses. This just switches the MasterPortID to dataMasterPortId. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-12-09Let other objects set up memory like regions in a KVM VM.Gabe Black
2014-12-05cpu: Only check for PC events on instruction boundaries.Gabe Black
Only the instruction address is actually checked, so there's no need to check repeatedly while we're working through the microops of a macroop and that's not changing.
2014-12-02cpu: Fix retries on barrier/store in Minor's store bufferAndrew Bardsley
This patch fixes a case where a store in Minor's store buffer never leaves the store buffer as it is pre-maturely counted as having been issued, leading to the store buffer idling. LSQ::StoreBuffer::numUnissuedAccesses should count the number of accesses either in memory, or still in the store buffer after being completed. For stores which are also barriers, the store will stay in the store buffer for a cycle after it is completed and will be cleaned up by the barrier clearing code (to ensure that barriers are completed in-order). To acheive this, numUnissuedAccesses is not decremented when a store-barrier is issued to memory, but when its barrier effect is cleared. Without this patch, the correct behaviour happens when a memory transaction is immediately accepted, but not if it needs a retry.
2014-12-02cpu: Fix memoryIssueLimit checking in MinorAndrew Bardsley
This patch fixes the checking of the number of memory instructions issued per cycles in the Minor CPU.
2014-12-02cpu, o3: Ignored invalidate causing same-address load reorderingMarco Elver
In case the memory subsystem sends a combined response with invalidate (e.g. ReadRespWithInvalidate), we cannot ignore the invalidate part of the response. If we were to ignore the invalidate part, under certain circumstances this effectively leads to reordering of loads to the same address which is not permitted under any memory consistency model implemented in gem5. Consider the case where a later load's address is computed before an earlier load in program order, and is therefore sent to the memory subsystem first. At some point the earlier load's address is computed and in doing so correctly marks the later load as a possibleLoadViolation. In the meantime some other node writes and sends invalidations to all other nodes. The invalidation races with the later load's ReadResp, and arrives before ReadResp and is deferred. Upon receipt of the ReadResp, the response is changed to ReadRespWithInvalidate, and sent to the CPU. If we ignore the invalidate part of the packet, we let the later load read the old value of the address. Eventually the earlier load's ReadResp arrives, but with new data. As there was no invalidate snoop (sunk into the ReadRespWithInvalidate), and if we did not process the invalidate of the ReadRespWithInvalidate, we obtain a load reordering. A similar scenario can be constructed where the earlier load's address is computed after ReadRespWithInvalidate arrives for the younger load. In this case hitExternalSnoop needs to be set to true on the ReadRespWithInvalidate, so that upon knowing the address of the earlier load, checkViolations will cause the later load to be squashed. Finally we must account for the case where both loads are sent to the memory subsystem (reordered), a snoop invalidate arrives and correctly sets the later loads fault to ReExec. However, before the CPU processes the fault, the later load's ReadResp arrives and the writeback discards the outstanding fault. We must add a check to ensure that we do not skip any unprocessed faults.
2014-12-02cpu: Move packet deallocation to recvTimingResp in the O3 CPUStephan Diestelhorst
Move the packet deallocations in the O3 CPU so that the completeDataAccess deals only with the LSQ specific parts and the generic recvTimingResp frees the packet in all other cases.
2014-12-02mem: Assume all dynamic packet data is array allocatedAndreas Hansson
This patch simplifies how we deal with dynamically allocated data in the packet, always assuming that it is array allocated, and hence should be array deallocated (delete[] as opposed to delete). The only uses of dataDynamic was in the Ruby testers. The ARRAY_DATA flag in the packet is removed accordingly. No defragmentation of the flags is done at this point, leaving a gap in the bit masks. As the last part the patch, it renames dataDynamicArray to dataDynamic.
2014-12-02mem: Add const getters for write packet dataAndreas Hansson
This patch takes a first step in tightening up how we use the data pointer in write packets. A const getter is added for the pointer itself (getConstPtr), and a number of member functions are also made const accordingly. In a range of places throughout the memory system the new member is used. The patch also removes the unused isReadWrite function.
2014-12-02mem: Remove null-check bypassing in Packet::getPtrAndreas Hansson
This patch removes the parameter that enables bypassing the null check in the Packet::getPtr method. A number of call sites assume the value to be non-null. The one odd case is the RubyTester, which issues zero-sized prefetches(!), and despite being reads they had no valid data pointer. This is now fixed, but the size oddity remains (unless anyone object or has any good suggestions). Finally, in the Ruby Sequencer, appropriate checks are made for flush packets as they have no valid data pointer.
2014-11-23kvm, x86: Adding support for SE mode executionAlexandru Dutu
This patch adds methods in KvmCPU model to handle KVM exits caused by syscall instructions and page faults. These types of exits will be encountered if KvmCPU is run in SE mode.
2014-11-14arm: Fixes based on UBSan and static analysisAndreas Hansson
Another churn to clean up undefined behaviour, mostly ARM, but some parts also touching the generic part of the code base. Most of the fixes are simply ensuring that proper intialisation. One of the more subtle changes is the return type of the sign-extension, which is changed to uint64_t. This is to avoid shifting negative values (undefined behaviour) in the ISA code.
2014-11-12arm: Fix timing wakeup with LLSCAli Saidi
2014-11-06x86 isa: This patch attempts an implementation at mwait.Marc Orr
Mwait works as follows: 1. A cpu monitors an address of interest (monitor instruction) 2. A cpu calls mwait - this loads the cache line into that cpu's cache. 3. The cpu goes to sleep. 4. When another processor requests write permission for the line, it is evicted from the sleeping cpu's cache. This eviction is forwarded to the sleeping cpu, which then wakes up. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-11-06cpu: Minor Draining BugAndrew Lukefahr
Fixes a bug where Minor drains in the midst of committing a conditional store. While committing a conditional store, lastCommitWasEndOfMacroop is true (from the previous instruction) as we still haven't finished the conditional store. If a drain occurs before the cache response, Minor would check just lastCommitWasEndOfMacroop, which was true, and set drainState=DrainHaltFetch, which increases the streamSeqNum. This caused the conditional store to be squashed when the memory responded and it completed. However, to the memory the store succeeded, while to the instruction sequence it never occurred. In the case of an LLSC, the instruction sequence will replay the squashed STREX, which will fail as the cache is no longer in LLSC. Then the instruction sequence will loop back to a LDREX, which receives the updated (incorrect) value. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-10-29cpu: Add writeback modeling for drain functionalityMitch Hayenga
It is possible for the O3 CPU to consider itself drained and later have a squashed instruction perform a writeback. This patch re-adds tracking of in-flight instructions to prevent falsely signaling a drained event.
2014-10-29cpu: Add drain check functionality to IEWMitch Hayenga
IEW did not check the instQueue and memDepUnit to ensure they were drained. This caused issues when drainSanityCheck() did check those structures after asserting IEW was drained.
2014-10-29cpu: Add support to checker for CACHE_BLOCK_ZERO commands.Ali Saidi
The checker didn't know how to properly validate these new commands.
2014-10-29cpu: Fix barrier push to store buffer when full bug in MinorAndrew Bardsley
This patch fixes a bug where a completing load or store which is also a barrier can push a barrier into the store buffer without first checking that there is a free slot. The bug was not fatal but would print a warning that the store buffer was full when inserting.
2014-10-20cpu: o3: corrects base FP and CC register index in removeThread()Nilay Vaish
2014-10-16arch: Use shared_ptr for all FaultsAndreas Hansson
This patch takes quite a large step in transitioning from the ad-hoc RefCountingPtr to the c++11 shared_ptr by adopting its use for all Faults. There are no changes in behaviour, and the code modifications are mostly just replacing "new" with "make_shared".
2014-10-16o3: Use shared_ptr for MemDepEntryAndreas Hansson
This patch transitions the o3 MemDepEntry from the ad-hoc RefCountingPtr to the c++11 shared_ptr. There are no changes in behaviour, and the code modifications are mainly replacing "new" with "make_shared".
2014-10-16cpu: Probe points for basic PMU statsAndreas Sandberg
This changeset adds probe points that can be used to implement PMU counters for CPU stats. The following probes are supported: * BaseCPU::ppCycles / Cycles * BaseCPU::ppRetiredInsts / RetiredInsts * BaseCPU::ppRetiredLoads / RetiredLoads * BaseCPU::ppRetiredStores / RetiredStores * BaseCPU::ppRetiredBranches RetiredBranches
2014-10-16cpu: Add branch predictor PMU probe pointsAndreas Sandberg
This changeset adds probe points that can be used to implement PMU counters for branch predictor stats. The following probes are supported: * BPRedUnit::ppBranches / Branches * BPRedUnit::ppMisses / Misses
2014-10-11cpu: Fix o3 SMT IQCount bugAndrew Lukefahr
Commmitted by: Nilay Vaish <nilay@cs.wisc.edu>
2014-10-09cpu: Remove Ozone CPU from the source treeMitch Hayenga
The Ozone CPU is now very much out of date and completely non-functional, with no one actively working on restoring it. It is a source of confusion for new users who attempt to use it before realizing its current state. RIP
2014-09-27arch: Use const StaticInstPtr references where possibleAndreas Hansson
This patch optimises the passing of StaticInstPtr by avoiding copying the reference-counting pointer. This avoids first incrementing and then decrementing the reference-counting pointer.
2014-09-27scons: Address issues related to gcc 4.9.1Andreas Hansson
Fix a number few minor issues to please gcc 4.9.1. Removing the '-fuse-linker-plugin' flag means no libraries are part of the LTO process, but hopefully this is an acceptable loss, as the flag causes issues on a lot of systems (only certain combinations of gcc, ld and ar work).
2014-09-27misc: Fix a bunch of minor issues identified by static analysisAndreas Hansson
Add some missing initialisation, and fix a handful benign resource leaks (including some false positives).
2014-09-20cpu: Remove unused deallocateContext callsMitch Hayenga
The call paths for de-scheduling a thread are halt() and suspend(), from the thread context. There is no call to deallocateContext() in general, though some CPUs chose to define it. This patch removes the function from BaseCPU and the cores which do not require it.
2014-09-20alpha,arm,mips,power,x86,cpu,sim: Cleanup activate/deactivateMitch Hayenga
activate(), suspend(), and halt() used on thread contexts had an optional delay parameter. However this parameter was often ignored. Also, when used, the delay was seemily arbitrarily set to 0 or 1 cycle (no other delays were ever specified). This patch removes the delay parameter and 'Events' associated with them across all ISAs and cores. Unused activate logic is also removed.
2014-09-20mem: Rename Bus to XBar to better reflect its behaviourAndreas Hansson
This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus. As part of this renaming, the code has also been clean up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code is due to the delay variables in the packet. --HG-- rename : src/mem/Bus.py => src/mem/XBar.py rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh rename : src/mem/bus.cc => src/mem/xbar.cc rename : src/mem/bus.hh => src/mem/xbar.hh
2014-09-20cpu: Update DRAM traffic genWendy Elsasser
Add new DRAM_ROTATE mode to traffic generator. This mode will generate DRAM traffic that rotates across banks per rank, command types, and ranks per channel The looping order is illustrated below: for (ranks per channel) for (command types) for (banks per rank) // Generate DRAM Command Series This patch also adds the read percentage as an input argument to the DRAM sweep script. If the simulated read percentage is 0 or 100, the middle for loop does not generate additional commands. This loop is used only when the read percentage is set to 50, in which case the middle loop will toggle between read and write commands. Modified sweep.py script, which generates DRAM traffic. Added input arguments and support for new DRAM_ROTATE mode. The script now has input arguments for: 1) Read percentage 2) Number of ranks 3) Address mapping 4) Traffic generator mode (DRAM or DRAM_ROTATE) The default values are: 100% reads, 1 rank, RoRaBaCoCh address mapping, and DRAM traffic gen mode For the DRAM traffic mode, added multi-rank support.