summaryrefslogtreecommitdiff
path: root/src/cpu/simple
AgeCommit message (Collapse)Author
2012-09-19AddrRange: Transition from Range<T> to AddrRangeAndreas Hansson
This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap. In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. --HG-- rename : src/base/range.hh => src/base/addr_range.hh rename : src/base/range_map.hh => src/base/addr_range_map.hh
2012-08-28Clock: Add a Cycles wrapper class and use where applicableAndreas Hansson
This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. This changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes.
2012-08-28Clock: Rework clocks to avoid tick-to-cycle transformationsAndreas Hansson
This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well.
2012-08-22Packet: Remove NACKs from packet and its use in endpointsAndreas Hansson
This patch removes the NACK frrom the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that). The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occured in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe. Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up.
2012-08-15O3,ARM: fix some problems with drain/switchout functionality and add Drain ↵Anthony Gutierrez
DPRINTFs This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements. This is an incremental fix as there are still a few bugs/mem leaks with the switchout code. Particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation.
2012-07-09Port: Align port names in C++ and PythonAndreas Hansson
This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics. Most notably, the CPU, cache, and bus is addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus.
2012-07-09Port: Move retry from port base class to Master/SlavePortAndreas Hansson
This patch is the last part of moving all protocol-related functionality out of the Port base class. All the send/recv functions are already moved, and the retry (which still governs all the timing transport functions) is the only part that remained in the base class. The only point where this currently causes a bit of inconvenience is in the bus where the retry list is global and holds Port pointers (not Master/SlavePort). This is about to change with the split into a request/response bus and will soon be removed anyway. The patch has no impact on any regressions.
2012-06-08Timing CPU: Remove a redundant port pointerAndreas Hansson
This patch is trivial and merely prunes a pointer that was never set or used.
2012-06-05cpu: Don't init simple and inorder CPUs if they are defered.Anthony Gutierrez
initCPU() will be called to initialize switched out CPUs for the simple and inorder CPU models. this patch prevents those CPUs from being initialized because they should get their state from the active CPU when it is switched out.
2012-05-26CPU: Merge the predecoder and decoder.Gabe Black
These classes are always used together, and merging them will give the ISAs more flexibility in how they cache things and manage the process. --HG-- rename : src/arch/x86/predecoder_tables.cc => src/arch/x86/decoder_tables.cc
2012-05-25Decode: Make the Decoder class defined per ISA.Gabe Black
--HG-- rename : src/cpu/decode.cc => src/arch/generic/decoder.cc rename : src/cpu/decode.hh => src/arch/generic/decoder.hh
2012-05-01MEM: Separate requests and responses for timing accessesAndreas Hansson
This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself.
2012-04-15CPU: Tidy up some formatting and a DPRINTF in the simple CPU base class.Gabe Black
Put the { on the same line as the if and put a space between the if and the open paren. Also, use the # format modifier which puts a 0x in front of hex values automatically. If the ExtMachInst type isn't integral and actually prints something more complicated, the # falls away harmlessly and we aren't left with a phantom 0x followed by a bunch of unrelated text.
2012-04-14MEM: Remove the Broadcast destination from the packetAndreas Hansson
This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field. Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before). The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class. In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing.
2012-04-14MEM: Separate snoops and normal memory requests/responsesAndreas Hansson
This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions.
2012-04-06MEM: Enable multiple distributed generalized memoriesAndreas Hansson
This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. --HG-- rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py rename : src/mem/physical.cc => src/mem/abstract_mem.cc rename : src/mem/physical.hh => src/mem/abstract_mem.hh rename : src/mem/physical.cc => src/mem/simple_mem.cc rename : src/mem/physical.hh => src/mem/simple_mem.hh
2012-04-03Atomic: Remove the physmem_port and access memory directlyAndreas Hansson
This patch removes the physmem_port from the Atomic CPU and instead uses the system pointer to access the physmem when using the fastmem option. The system already keeps track of the physmem and the valid memory address ranges, and with this patch we merely make use of that existing functionality. As a result of this change, the overloaded getMasterPort in the Atomic CPU can be removed, thus unifying the CPUs.
2012-03-30MEM: Introduce the master/slave port sub-classes in C++William Wang
This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects. The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches. The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal. The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again.
2012-03-30CPU: Unify initMemProxies across CPUs and simulation modesAndreas Hansson
This patch unifies where initMemProxies is called, in the init() method of each BaseCPU subclass, before TheISA::initCPU is called. Moreover, it also ensures that initMemProxies is called in both full-system and syscall-emulation mode, thus unifying also across the modes. An additional check is added in the ThreadState to ensure that initMemProxies is only called once.
2012-03-09CheckerCPU: Make CheckerCPU runtime selectable instead of compile selectableGeoffrey Blake
Enables the CheckerCPU to be selected at runtime with the --checker option from the configs/example/fs.py and configs/example/se.py configuration files. Also merges with the SE/FS changes.
2012-02-24CPU: Round-two unifying instr/data CPU ports across modelsAndreas Hansson
This patch continues the unification of how the different CPU models create and share their instruction and data ports. Most importantly, it forces every CPU to have an instruction and a data port, and gives these ports explicit getters in the BaseCPU (getDataPort and getInstPort). The patch helps in simplifying the code, make assumptions more explicit, andfurther ease future patches related to the CPU ports. The biggest changes are in the in-order model (that was not modified in the previous unification patch), which now moves the ports from the CacheUnit to the CPU. It also distinguishes the instruction fetch and load-store unit from the rest of the resources, and avoids the use of indices and casting in favour of keeping track of these two units explicitly (since they are always there anyways). The atomic, timing and O3 model simply return references to their already existing ports.
2012-02-13MEM: Introduce the master/slave port roles in the Python classesAndreas Hansson
This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves.
2012-02-12cpu: add separate stats for insts/ops both globally and per cpu modelAnthony Gutierrez
2012-02-12mem: Add a master ID to each request object.Ali Saidi
This change adds a master id to each request object which can be used identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python.
2012-02-10SE/FS: Record the system pointer all the time for the simple CPU.Gabe Black
This pointer was only being stored in code that came from SE mode. The system pointer is always meaningful and available, so it should always be stored.
2012-02-07Faults: Turn off arch/faults.hhGabe Black
Because there are no longer architecture independent but specialized functions in arch/XXX/faults.hh, code that isn't using the faults from a particular ISA no longer needs to be able to include them through the switching header file arch/faults.hh. By removing that header file (arch/faults.hh), the potential interface between ISA code and non ISA code is narrowed.
2012-01-31Merge with head, hopefully the last time for this batch.Gabe Black
2012-01-31clang: Enable compiling gem5 using clang 2.9 and 3.0Koan-Sin Tan
This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh). clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places.
2012-01-31CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5Geoffrey Blake
Brings the CheckerCPU back to life to allow FS and SE checking of the O3CPU. These changes have only been tested with the ARM ISA. Other ISAs potentially require modification.
2012-01-29Implement Ali's review feedback.Gabe Black
Try to decrease indentation, and remove some redundant FullSystem checks.
2012-01-28Merge with the main repo.Gabe Black
--HG-- rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
2012-01-17MEM: Separate queries for snooping and address rangesAndreas Hansson
This patch simplifies the address-range determination mechanism and also unifies the naming across ports and devices. It further splits the queries for determining if a port is snooping and what address ranges it responds to (aiming towards a separation of cache-maintenance ports and pure memory-mapped ports). Default behaviours are such that most ports do not have to define isSnooping, and master ports need not implement getAddrRanges.
2012-01-17MEM: Simplify ports by removing EventManagerAndreas Hansson
This patch removes the inheritance of EventManager from the ports and moves all responsibility for event queues to the owner. Eventually the event manager should be the interface block, which could either be the structural owner or a subblock like a LSQ in the O3 CPU for example.
2012-01-17CPU: Moving towards a more general port across CPU modelsAndreas Hansson
This patch performs minimal changes to move the instruction and data ports from specialised subclasses to the base CPU (to the largest degree possible). Ultimately it servers to make the CPU(s) have a well-defined interface to the memory sub-system.
2012-01-17MEM: Add port proxies instead of non-structural portsAndreas Hansson
Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy. The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy --HG-- rename : src/mem/vport.cc => src/mem/fs_translating_port_proxy.cc rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
2011-11-18SE/FS: Get rid of FULL_SYSTEM in the CPU directory.Gabe Black
2011-11-01SE/FS: Get rid of uses of FULL_SYSTEM in Alpha.Gabe Black
2011-11-01SE/FS: Expose the same methods on the CPUs in SE and FS modes.Gabe Black
2011-09-19Syscall: Make the syscall function available in both SE and FS modes.Gabe Black
In FS mode the syscall function will panic, but the interface will be consistent and code which calls syscall can be compiled in. This will allow, for instance, instructions that use syscall to be built unconditionally but then not returned by the decoder.
2011-09-09Decode: Pull instruction decoding out of the StaticInst class into its own.Gabe Black
This change pulls the instruction decoding machinery (including caches) out of the StaticInst class and puts it into its own class. This has a few intrinsic benefits. First, the StaticInst code, which has gotten to be quite large, gets simpler. Second, the code that handles decode caching is now separated out into its own component and can be looked at in isolation, making it easier to understand. I took the opportunity to restructure the code a bit which will hopefully also help. Beyond that, this change also lays some ground work for each ISA to have its own, potentially stateful decode object. We'd be able to include less contextualizing information in the ExtMachInst objects since that context would be applied at the decoder. Also, the decoder could "know" ahead of time that all the instructions it's going to see are going to be, for instance, 64 bit mode, and it will have one less thing to check when it decodes them. Because the decode caching mechanism has been separated out, it's now possible to have multiple caches which correspond to different types of decoding context. Having one cache for each element of the cross product of different configurations may become prohibitive, so it may be desirable to clear out the cache when relatively static state changes and not to have one for each setting. Because the decode function is no longer universally accessible as a static member of the StaticInst class, a new function was added to the ThreadContexts that returns the applicable decode object.
2011-08-07Translation: Use a pointer type as the template argument.Gabe Black
This allows regular pointers and reference counted pointers without having to use any shim structures or other tricks.
2011-07-02ExecContext: Rename the readBytes/writeBytes functions to readMem and writeMem.Gabe Black
readBytes and writeBytes had the word "bytes" in their names because they accessed blobs of bytes. This distinguished them from the read and write functions which handled higher level data types. Because those functions don't exist any more, this change renames readBytes and writeBytes to more general names, readMem and writeMem, which reflect the fact that they are how you read and write memory. This also makes their names more consistent with the register reading/writing functions, although those are still read and set for some reason.
2011-07-02ExecContext: Get rid of the now unused read/write templated functions.Gabe Black
2011-06-02scons: rename TraceFlags to DebugFlagsNathan Binkert
2011-05-04CPU: Add some useful debug message to the timing simple cpu.Ali Saidi
2011-05-04CPU: Fix a case where timing simple cpu faults can nest.Ali Saidi
If we fault, change the state to faulting so that we don't fault again in the same cycle.
2011-04-15trace: reimplement the DTRACE function so it doesn't use a vectorNathan Binkert
At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help
2011-04-15includes: sort all includesNathan Binkert
2011-03-17ARM: Detect and skip udelay() functions in linux kernel.Ali Saidi
This change speeds up booting, especially in MP cases, by not executing udelay() on the core but instead skipping ahead tha amount of time that is being delayed.
2011-03-01Spelling: Fix the a spelling error by changing mmaped to mmapped.Gabe Black
There may not be a formally correct spelling for the past tense of mmap, but mmapped is the spelling Google doesn't try to autocorrect. This makes sense because it mirrors the past tense of map->mapped and not the past tense of cape->caped. --HG-- rename : src/arch/alpha/mmaped_ipr.hh => src/arch/alpha/mmapped_ipr.hh rename : src/arch/arm/mmaped_ipr.hh => src/arch/arm/mmapped_ipr.hh rename : src/arch/mips/mmaped_ipr.hh => src/arch/mips/mmapped_ipr.hh rename : src/arch/power/mmaped_ipr.hh => src/arch/power/mmapped_ipr.hh rename : src/arch/sparc/mmaped_ipr.hh => src/arch/sparc/mmapped_ipr.hh rename : src/arch/x86/mmaped_ipr.hh => src/arch/x86/mmapped_ipr.hh