summaryrefslogtreecommitdiff
path: root/src/cpu
AgeCommit message (Collapse)Author
2015-05-23kvm: Fix dumping code for large registersAndreas Sandberg
The register dumping code in kvm tries to print the bytes in large registers (128 bits and larger) instead of printing them as hex. This changeset fixes that.
2015-05-23kvm, x86: Guard x86-specific APIs in KvmVMAndreas Sandberg
Protect x86-specific APIs in KvmVM with compile-time guards to avoid breaking ARM builds.
2015-05-15misc: Appease gcc 5.1Andreas Hansson
Three minor issues are resolved: 1. Apparently gcc 5.1 does not like negation of booleans followed by bitwise AND. 2. Somehow the compiler also gets confused and warns about NoopMachInst being unused (removing it causes compilation errors though). Most likely a compiler bug. 3. There seems to be a number of instances where loop unrolling causes false positives for the array-bounds check. For now, switch to std::array. Potentially we could disable the warning for newer gcc versions, but switching to std::array is probably a good move in any case.
2015-05-05mem, cpu: Add a separate flag for strictly ordered memoryAndreas Sandberg
The Request::UNCACHEABLE flag currently has two different functions. The first, and obvious, function is to prevent the memory system from caching data in the request. The second function is to prevent reordering and speculation in CPU models. This changeset gives the order/speculation requirement a separate flag (Request::STRICT_ORDER). This flag prevents CPU models from doing the following optimizations: * Speculation: CPU models are not allowed to issue speculative loads. * Write combining: CPU models and caches are not allowed to merge writes to the same cache line. Note: The memory system may still reorder accesses unless the UNCACHEABLE flag is set. It is therefore expected that the STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent this behavior.
2015-05-05mem: Snoop into caches on uncacheable accessesAndreas Hansson
This patch takes a last step in fixing issues related to uncacheable accesses. We do not separate uncacheable memory from uncacheable devices, and in cases where it is really memory, there are valid scenarios where we need to snoop since we do not support cache maintenance instructions (yet). On snooping an uncacheable access we thus provide data if possible. In essence this makes uncacheable accesses IO coherent. The snoop filter is also queried to steer the snoops, but not updated since the uncacheable accesses do not allocate a block.
2015-05-05cpu: Work around gcc 4.9 issues with Num_OpClassesAndreas Hansson
This patch fixes a recent issue with gcc 4.9 (and possibly more) being convinced that indices outside the array bounds are used when initialising the FUPool members.
2015-04-29cpu: o3: replace issueLatency with bool pipelinedNilay Vaish
Currently, each op class has a parameter issueLat that denotes the cycles after which another op of the same class can be issued. As of now, this latency can either be one cycle (fully pipelined) or same as execution latency of the op (not at all pipelined). The fact that issueLat is a parameter of type Cycles makes one believe that it can be set to any value. To avoid the confusion, the parameter is being renamed as 'pipelined' with type boolean. If set to true, the op would execute in a fully pipelined fashion. Otherwise, it would execute in an unpipelined fashion.
2015-04-29cpu: o3: single cycle default div microop latency on x86Nilay Vaish
This patch sets the default latency of the division microop to a single cycle on x86. This is because the division instructions DIV and IDIV have been implemented as loops of div microops, where each microop computes a single bit of the quotient.
2015-04-22cpu: remove conditional check (count > 0) on o3 IQ squashesBrandon Potter
The o3 cpu instruction queue model uses the count variable to track the number of unissued instructions in the queue. Previously, the squash method used this variable to avoid executing the doSquash method when there were no unissued instructions in the pipeline. A corner case problem exists when only issued instructions exist in the pipeline and a squash occurs; the doSquash code is not invoked and subsequently does not clean up state properly.
2015-04-20cpu: Remove the InOrderCPU from the treeAndreas Hansson
This patch takes the final step in removing the InOrderCPU from the tree. Rest in peace. The MinorCPU is now used to model an in-order microarchitecture, and long term the MinorCPU will eventually be renamed InOrderCPU.
2015-04-14config, cpu: fix progress interval for switched CPUsMalek Musleh
This patch ensures that the CPU progress Event is triggered for the new set of switched_cpus that get scheduled (e.g. during fast-forwarding). it also avoids printing the interval state if the cpu is currently switched out. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-13cpu: re-organizes the branch predictor structure.Dibakar Gope
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-03cpu: fix system total instructions accountingNikos Nikoleris
The totalInstructions counter is only incremented when the whole instruction is commited and not on every microop. It was incorrectly reset in atomic and timing cpus. Committed by: Nilay Vaish <nilay@cs.wisc.edu>"
2015-03-26cpu: Fix InstPBTrace inheritanceAndreas Hansson
This patch fixes an issue that prevented gem5 to be built with C++ config and without Python.
2015-03-23mem: rename Locked/LOCKED to LockedRMW/LOCKED_RMWSteve Reinhardt
Makes x86-style locked operations even more distinct from LLSC operations. Using "locked" by itself should be obviously ambiguous now.
2015-03-19cpu: Fix TrafficGen message formatWendy Elsasser
Fix erroneous message format for fatal error. Previously, code did not have type indicator (% instead of %d). Also removed redundant fatal check. Ran modified sweep.py with in range and out of range values to test.
2015-02-11mem: restructure Packet cmd initialization a bit moreSteve Reinhardt
Refactor the way that specific MemCmd values are generated for packets. The new approach is a little more elegant in that we assign the right value up front, and it's also more amenable to non-heap-allocated Packet objects. Also replaced the code in the Minor model that was still doing it the ad-hoc way. This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249.
2015-03-09cpu: o3: another assert instead of checkNilay Vaish
2015-03-09cpu: o3: Remove unused code in iew, add assert instead.Nilay Vaish
2015-03-09cpu: o3: commit: mark pipeline delay variable as constsNilay Vaish
2015-03-09cpu: o3: remove unused stat variables.Nilay Vaish
2015-03-09cpu: o3: combine if with same conditionNilay Vaish
2015-03-09cpu: o3: remove member variable squashCounterNilay Vaish
The variable is used in only one place and a whole new function setNextStatus() has been defined just to compute the value of the variable. Instead of calling the function, the value is now computed in the loop that preceded the function call.
2015-03-09cpu: o3: remove unused function annotateMemoryUnits()Nilay Vaish
2015-03-02mem: Move crossbar default latencies to subclassesAndreas Hansson
This patch introduces a few subclasses to the CoherentXBar and NoncoherentXBar to distinguish the different uses in the system. We use the crossbar in a wide range of places: interfacing cores to the L2, as a system interconnect, connecting I/O and peripherals, etc. Needless to say, these crossbars have very different performance, and the clock frequency alone is not enough to distinguish these scenarios. Instead of trying to capture every possible case, this patch introduces dedicated subclasses for the three primary use-cases: L2XBar, SystemXBar and IOXbar. More can be added if needed, and the defaults can be overridden.
2015-03-02arm: Share a port for the two table walker objectsAndreas Hansson
This patch changes how the MMU and table walkers are created such that a single port is used to connect the MMU and the TLBs to the memory system. Previously two ports were needed as there are two table walker objects (stage one and stage two), and they both had a port. Now the port itself is moved to the Stage2MMU, and each TableWalker is simply using the port from the parent. By using the same port we also remove the need for having an additional crossbar joining the two ports before the walker cache or the L2. This simplifies the creation of the CPU cache topology in BaseCPU.py considerably. Moreover, for naming and symmetry reasons, the TLB walker port is connected through the stage-one table walker thus making the naming identical to x86. Along the same line, we use the stage-one table walker to generate the master id that is used by all TLB-related requests.
2015-03-02cpu: o3 register renaming request handling improvedRekai
Now, prior to the renaming, the instruction requests the exact amount of registers it will need, and the rename_map decides whether the instruction is allowed to proceed or not.
2015-03-02mem: Split port retry for all different packet classesAndreas Hansson
This patch fixes a long-standing isue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios. The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting. The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fixes the previously seen deadlocks.
2015-03-02cpu: Add a PC-value to the traffic generator requestsStephan Diestelhorst
Have the traffic generator add its masterID as the PC address to the requests. That way, prefetchers (and other components) that use a PC for request classification will see per-tester streams of requests. This enables us to test strided prefetchers with the memchecker, too.
2015-02-16cpu: TrafficGen sinks snoops without complainingAndreas Hansson
To be able to use the TrafficGen in a system with caches we need to allow it to sink incoming snoop requests. By default the master port panics, so silently ignore any snoops.
2015-02-16arch: Make readMiscRegNoEffect const throughoutAndreas Hansson
Finally took the plunge and made this apply to all ISAs, not just ARM.
2015-02-16cpu: add support for outputing a protobuf formatted CPU traceAli Saidi
Doesn't support x86 due to static instruction representation. --HG-- rename : src/cpu/CPUTracers.py => src/cpu/InstPBTrace.py
2015-02-11cpu: Tidy up the MemTest and make false sharing more obviousAndreas Hansson
The MemTest class really only tests false sharing, and as such there was a lot of old cruft that could be removed. This patch cleans up the tester, and also makes it more clear what the assumptions are. As part of this simplification the reference functional memory is also removed. The regression configs using MemTest are updated to reflect the changes, and the stats will be bumped in a separate patch. The example config will be updated in a separate patch due to more extensive re-work. In a follow-on patch a new tester will be introduced that uses the MemChecker to implement true sharing.
2015-02-11sim: Move the BaseTLB to src/arch/generic/Andreas Sandberg
The TLB-related code is generally architecture dependent and should live in the arch directory to signify that. --HG-- rename : src/sim/BaseTLB.py => src/arch/generic/BaseTLB.py rename : src/sim/tlb.cc => src/arch/generic/tlb.cc rename : src/sim/tlb.hh => src/arch/generic/tlb.hh
2015-02-06cpu: Idle CPU status logic revisedAlexandru Dutu
This patch sets the CPU status to idle when the last active thread gets suspended.
2015-02-03cpu: Ensure timing CPU sinks response before sending new requestAndreas Hansson
This patch changes how the timing CPU deals with processing responses, always scheduling an event, even if it is for the current tick. This helps to avoid situations where a new request shows up before a response is finished in the crossbar, and also is more in line with any realistic behaviour.
2015-01-25arm: always set the IsFirstMicroop flagAli Saidi
While the IsFirstMicroop flag exists it was only occasionally used in the ARM instructions that gem5 microOps and therefore couldn't be relied on to be correct.
2015-01-25sim: Clean up InstRecordAli Saidi
Track memory size and flags as well as add some comments and consts.
2015-01-25cpu: Remove all notion that we know when the cpu is misspeculating.Ali Saidi
We have no way of knowing if a CPU model is on the wrong path with our execute-in-execute CPU models. Don't pretend that we do.
2015-01-25cpu: Put all CPU instruction tracers in a single fileAli Saidi
2015-01-25cpu: remove legion tracerAli Saidi
If someone wants to debug with legion again they can restore the code from the repository, but no need to have it hang around indefinately.
2015-01-22mem: Clean up Request initialisationAndreas Hansson
This patch tidies up how we create and set the fields of a Request. In essence it tries to use the constructor where possible (as opposed to setPhys and setVirt), thus avoiding spreading the information across a number of locations. In fact, setPhys is made private as part of this patch, and a number of places where we callede setVirt instead uses the appropriate constructor.
2015-01-20cpu: commit probe notification on every microop or macroopNikos Nikoleris
The ppCommit should notify the attached listener every time the cpu commits a microop or non microcoded insturction. The listener can then decide whether it will process only the last microop (eg. SimPoint probe). Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-20cpu: Fix retry bug in MinorCPU LSQAndreas Hansson
2015-01-10cpu: fix RetiredStores probe pointNikos Nikoleris
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-03minor: fixed LSQ MasterPortIDAndrew Lukefahr
Minor was reporting the data cache access as ".inst" accesses. This just switches the MasterPortID to dataMasterPortId. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-12-09Let other objects set up memory like regions in a KVM VM.Gabe Black
2014-12-05cpu: Only check for PC events on instruction boundaries.Gabe Black
Only the instruction address is actually checked, so there's no need to check repeatedly while we're working through the microops of a macroop and that's not changing.
2014-12-02cpu: Fix retries on barrier/store in Minor's store bufferAndrew Bardsley
This patch fixes a case where a store in Minor's store buffer never leaves the store buffer as it is pre-maturely counted as having been issued, leading to the store buffer idling. LSQ::StoreBuffer::numUnissuedAccesses should count the number of accesses either in memory, or still in the store buffer after being completed. For stores which are also barriers, the store will stay in the store buffer for a cycle after it is completed and will be cleaned up by the barrier clearing code (to ensure that barriers are completed in-order). To acheive this, numUnissuedAccesses is not decremented when a store-barrier is issued to memory, but when its barrier effect is cleared. Without this patch, the correct behaviour happens when a memory transaction is immediately accepted, but not if it needs a retry.
2014-12-02cpu: Fix memoryIssueLimit checking in MinorAndrew Bardsley
This patch fixes the checking of the number of memory instructions issued per cycles in the Minor CPU.