summaryrefslogtreecommitdiff
path: root/src/cpu/inorder/cpu.cc
AgeCommit message (Collapse)Author
2013-01-07cpu: Rename defer_registration->switched_outAndreas Sandberg
The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive.
2013-01-07cpu: Check that the memory system is in the correct modeAndreas Sandberg
This patch adds checks to all CPU models to make sure that the memory system is in the correct mode at startup and when resuming after a drain. Previously, we only checked that the memory system was in the right mode when resuming. This is inadequate since this is a configuration error that should be detected at startup as well as when resuming. Additionally, since the check was done using an assert, it wasn't performed when NDEBUG was set (e.g., the fast target).
2013-01-07arch: Make the ISA class inherit from SimObjectAndreas Sandberg
The ISA class on stores the contents of ID registers on many architectures. In order to make reset values of such registers configurable, we make the class inherit from SimObject, which allows us to use the normal generated parameter headers. This patch introduces a Python helper method, BaseCPU.createThreads(), which creates a set of ISAs for each of the threads in an SMT system. Although it is currently only needed when creating multi-threaded CPUs, it should always be called before instantiating the system as this is an obvious place to configure ID registers identifying a thread/CPU.
2013-01-07cpu: rename the misleading inSyscall to noSquashFromTCAli Saidi
isSyscall was originally created because during handling of a syscall in SE mode the threadcontext had to be updated. However, in many places this is used in FS mode (e.g. fault handlers) and the name doesn't make much sense. The boolean actually stops gem5 from squashing speculative and non-committed state when a write to a threadcontext happens, so re-name the variable to something more appropriate
2012-08-28Clock: Add a Cycles wrapper class and use where applicableAndreas Hansson
This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. This changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes.
2012-08-28Clock: Rework clocks to avoid tick-to-cycle transformationsAndreas Hansson
This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well.
2012-08-06process: add progName() virtual functionSteve Reinhardt
This replaces a (potentially uninitialized) string field with a virtual function so that we can have a safe interface without requiring changes to the eio code.
2012-07-09Port: Align port names in C++ and PythonAndreas Hansson
This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics. Most notably, the CPU, cache, and bus is addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus.
2012-06-05cpu: Don't init simple and inorder CPUs if they are defered.Anthony Gutierrez
initCPU() will be called to initialize switched out CPUs for the simple and inorder CPU models. this patch prevents those CPUs from being initialized because they should get their state from the active CPU when it is switched out.
2012-05-26CPU: Merge the predecoder and decoder.Gabe Black
These classes are always used together, and merging them will give the ISAs more flexibility in how they cache things and manage the process. --HG-- rename : src/arch/x86/predecoder_tables.cc => src/arch/x86/decoder_tables.cc
2012-05-25Decode: Make the Decoder class defined per ISA.Gabe Black
--HG-- rename : src/cpu/decode.cc => src/arch/generic/decoder.cc rename : src/cpu/decode.hh => src/arch/generic/decoder.hh
2012-05-01MEM: Separate requests and responses for timing accessesAndreas Hansson
This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself.
2012-04-14MEM: Separate snoops and normal memory requests/responsesAndreas Hansson
This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions.
2012-03-30CPU: Unify initMemProxies across CPUs and simulation modesAndreas Hansson
This patch unifies where initMemProxies is called, in the init() method of each BaseCPU subclass, before TheISA::initCPU is called. Moreover, it also ensures that initMemProxies is called in both full-system and syscall-emulation mode, thus unifying also across the modes. An additional check is added in the ThreadState to ensure that initMemProxies is only called once.
2012-03-02CPU: Check that the interrupt controller is created when neededAndreas Hansson
This patch adds a creation-time check to the CPU to ensure that the interrupt controller is created for the cases where it is needed, i.e. if the CPU is not being switched in later and not a checker CPU. The patch also adds the "createInterruptController" call to a number of the regression scripts.
2012-02-24CPU: Round-two unifying instr/data CPU ports across modelsAndreas Hansson
This patch continues the unification of how the different CPU models create and share their instruction and data ports. Most importantly, it forces every CPU to have an instruction and a data port, and gives these ports explicit getters in the BaseCPU (getDataPort and getInstPort). The patch helps in simplifying the code, make assumptions more explicit, andfurther ease future patches related to the CPU ports. The biggest changes are in the in-order model (that was not modified in the previous unification patch), which now moves the ports from the CacheUnit to the CPU. It also distinguishes the instruction fetch and load-store unit from the rest of the resources, and avoids the use of indices and casting in favour of keeping track of these two units explicitly (since they are always there anyways). The atomic, timing and O3 model simply return references to their already existing ports.
2012-02-12cpu: add separate stats for insts/ops both globally and per cpu modelAnthony Gutierrez
2012-01-31Merge with head, hopefully the last time for this batch.Gabe Black
2012-01-31clang: Enable compiling gem5 using clang 2.9 and 3.0Koan-Sin Tan
This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh). clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places.
2012-01-28Merge with the main repo.Gabe Black
--HG-- rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
2012-01-17MEM: Add port proxies instead of non-structural portsAndreas Hansson
Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy. The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy --HG-- rename : src/mem/vport.cc => src/mem/fs_translating_port_proxy.cc rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
2012-01-07Merge with main repository.Gabe Black
2011-11-18SE/FS: Get rid of FULL_SYSTEM in the CPU directory.Gabe Black
2011-11-01SE/FS: Expose the same methods on the CPUs in SE and FS modes.Gabe Black
2011-10-31SE/FS: Make the functions available from the TC consistent between SE and FS.Gabe Black
2011-10-31GCC: Get everything working with gcc 4.6.1.Gabe Black
And by "everything" I mean all the quick regressions.
2011-10-30SE/FS: Build the base process class in FS.Gabe Black
2011-09-09Decode: Pull instruction decoding out of the StaticInst class into its own.Gabe Black
This change pulls the instruction decoding machinery (including caches) out of the StaticInst class and puts it into its own class. This has a few intrinsic benefits. First, the StaticInst code, which has gotten to be quite large, gets simpler. Second, the code that handles decode caching is now separated out into its own component and can be looked at in isolation, making it easier to understand. I took the opportunity to restructure the code a bit which will hopefully also help. Beyond that, this change also lays some ground work for each ISA to have its own, potentially stateful decode object. We'd be able to include less contextualizing information in the ExtMachInst objects since that context would be applied at the decoder. Also, the decoder could "know" ahead of time that all the instructions it's going to see are going to be, for instance, 64 bit mode, and it will have one less thing to check when it decodes them. Because the decode caching mechanism has been separated out, it's now possible to have multiple caches which correspond to different types of decoding context. Having one cache for each element of the cross product of different configurations may become prohibitive, so it may be desirable to clear out the cache when relatively static state changes and not to have one for each setting. Because the decode function is no longer universally accessible as a static member of the StaticInst class, a new function was added to the ThreadContexts that returns the applicable decode object.
2011-06-19inorder: se: squash after syscallsKorey Sewell
2011-06-19inorder: cleanup dprintfs in cache unitKorey Sewell
2011-06-19inorder: se compile fixesKorey Sewell
2011-06-19inorder: add necessary debug flag header filesKorey Sewell
2011-06-19inorder: use trapPending flag to manage trapsKorey Sewell
2011-06-19inorder: dont handle multiple faults on same cycleKorey Sewell
if a faulting instruction reaches an execution unit, then ignore it and pass it through the pipeline. Once we recognize the fault in the graduation unit, dont allow a second fault to creep in on the same cycle.
2011-06-19inorder: check for interrupts each tickKorey Sewell
use a dummy instruction to facilitate the squash after the interrupts trap
2011-06-19inorder: explicit fault checkKorey Sewell
Before graduating an instruction, explicitly check fault by making the fault check it's own separate command that can be put on an instruction schedule.
2011-06-19inorder: squash and trap behind a tlb faultKorey Sewell
2011-06-19inorder: make InOrder CPU FS compilable/visibleKorey Sewell
make syscall a SE mode only functionality copy over basic FS functions (hwrei) to make FS compile
2011-06-19inorder: redefine DynInst FP result typeKorey Sewell
Sharing the FP value w/the integer values was giving inconsistent results esp. when their is a 32-bit integer register matched w/a 64-bit float value
2011-06-19inorder: treat SE mode syscalls as a trapping instructionKorey Sewell
define a syscallContext to schedule the syscall and then use syscall() to actually perform the action
2011-06-19inorder: cleanup events in resource poolKorey Sewell
remove events in the resource pool that can be called from the CPU event, since the CPU event is scheduled at the same time at the resource pool event. ---- Also, match the resPool event function names to the cpu event function names ----
2011-06-19inorder: branch predictor updateKorey Sewell
only update BTB on a taken branch and update branch predictor w/pcstate from instruction --- only pay attention to branch predictor updates if the the inst. is in fact a branch
2011-06-19inorder: no dep. tracking for zero regKorey Sewell
this causes forwarding a bad value register value
2011-06-19imported patch recoverPCfromTrapKorey Sewell
2011-06-19imported patch squash_from_next_stageKorey Sewell
2011-06-19inorder: update event prioritiesKorey Sewell
dont use offset to calculate this but rather an enum that can be updated
2011-06-19inorder: implement trap handlingKorey Sewell
2011-06-19inorder: use setupSquash for misspeculationKorey Sewell
implement a clean interface to handle branch misprediction and eventually all pipeline flushing
2011-06-19inorder: make marking of dest. regs an explicit requestKorey Sewell
formerly, this was implicit when you accessed the execution unit or the use-def unit but it's better that this just be something that a user can specify.
2011-06-19inorder: simplify handling of split accessesKorey Sewell