summaryrefslogtreecommitdiff
path: root/src/sim
AgeCommit message (Collapse)Author
2012-11-02ARM: dump stats and process info on context switchesDam Sunwoo
This patch enables dumping statistics and Linux process information on context switch boundaries (__switch_to() calls) that are used for Streamline integration (a graphical statistics viewer from ARM).
2012-11-02sim: Fix as issue where exit events on instr queues are used after freed.Ali Saidi
2012-10-25dev: Make default clock more reasonable for system and devicesAndreas Hansson
This patch changes the default system clock from 1THz to 1GHz. This clock is used by all modules that do not override the default (parent clock), and primarily affects the IO subsystem. Every DMA device uses its clock to schedule the next transfer, and the change will thus cause this inter-transfer delay to be longer. The default clock of the bus is removed, as the clock inherited from the system provides exactly the same value. A follow-on patch will bump the stats.
2012-10-15ruby: reset timing after cache warm upNilay Vaish
Ruby system was recently converted to a clocked object. Such objects maintain state related to the time that has passed so far. During the cache warmup, Ruby system changes its own time and the global time. Later on, the global time is restored. So Ruby system also needs to reset its own time.
2012-10-15Port: Add protocol-agnostic ports in the port hierarchyAndreas Hansson
This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.
2012-10-15Mem: Separate the host and guest views of memory backing storeAndreas Hansson
This patch moves all the memory backing store operations from the independent memory controllers to the global physical memory. The main reason for this patch is to allow address striping in a future set of patches, but at this point it already provides some useful functionality in that it is now possible to change the number of memory controllers and their address mapping in combination with checkpointing. Thus, the host and guest view of the memory backing store are now completely separate. With this patch, the individual memory controllers are far simpler as all responsibility for serializing/unserializing is moved to the physical memory. Currently, the functionality is more or less moved from AbstractMemory to PhysicalMemory without any major changes. However, in a future patch the physical memory will also resolve any ranges that are interleaved and properly assign the backing store to the memory controllers, and keep the host memory as a single contigous chunk per address range. Functionality for future extensions which involve CPU virtualization also enable the host to get pointers to the backing store.
2012-10-15Checkpoint: Make system serialize call childrenAndreas Hansson
This patch changes how the serialization of the system works. The base class had a non-virtual serialize and unserialize, that was hidden by a function with the same name for a number of subclasses (most likely not intentional as the base class should have been virtual). A few of the derived systems had no specialization at all (e.g. Power and x86 that simply called the System::serialize), but MIPS and Alpha adds additional symbol table entries to the checkpoint. Instead of overriding the virtual function, the additional entries are now printed through a virtual function (un)serializeSymtab. The reason for not calling System::serialize from the two related systems is that a follow up patch will require the system to also serialize the PhysicalMemory, and if this is done in the base class if ends up being between the general parts and the specialized symbol table. With this patch, the checkpoint is not modified, as the order of the segments is unchanged.
2012-10-15Clock: Inherit the clock from parent by defaultAndreas Hansson
This patch changes the default 1 Tick clock period to a proxy that resolves the parents clock. As a result of this, the caches and L1-to-L2 bus, for example, will automatically use the clock period of the CPU unless explicitly overridden. To ensure backwards compatibility, the System class overrides the proxy and specifies a 1 Tick clock. We could change this to something more reasonable in a follow-on patch, perhaps 1 GHz or something similar. With this patch applied, all clocked objects should have a reasonable clock period set, and could start specifying delays in Cycles instead of absolute time.
2012-09-25Statistics: Add a function to configure periodic stats dumpingSascha Bischoff
This patch adds a function, periodicStatDump(long long period), which will dump and reset the statistics every period. This function is designed to be called from the python configuration scripts. This allows the periodic stats dumping to be configured more easilly at run time. The period is currently specified as a long long as there are issues passing Tick into the C++ from the python as they have conflicting definitions. If the period is less than curTick, the first occurance occurs at curTick. If the period is set to 0, then the event is descheduled and the stats are not periodically dumped. Due to issues when resumung from a checkpoint, the StatDump event must be moved forward such that it occues AFTER the current tick. As the function is called from the python, the event is scheduled before the system resumes from the checkpoint. Therefore, the event is moved using the updateEvents() function. This is called from simulate.py once the system has resumed from the checkpoint. NOTE: It should be noted that this is a fairly temporary patch which re-adds the capability to extract temporal information from the communication monitors. It should not be used at the same time as anything that relies on dumping the statistics based on in simulation events i.e. a context switch.
2012-09-25ARM: Squash outstanding walks when instructions are squashed.Ali Saidi
2012-09-25sim: Move CPU-specific methods from SimObject to the BaseCPU classAndreas Sandberg
2012-09-25sim: Remove SimObject::setMemoryModeAndreas Sandberg
Remove SimObject::setMemoryMode from the main SimObject class since it is only valid for the System class. In addition to removing the method from the C++ sources, this patch also removes getMemoryMode and changeTiming from SimObject.py and updates the simulation code to call the (get|set)MemoryMode method on the System object instead.
2012-09-21SE: Ignore FUTEX_PRIVATE_FLAG of sys_futexLluc Alvarez
This patch ignores the FUTEX_PRIVATE_FLAG of the sys_futex system call in SE mode. With this patch, when sys_futex with the options FUTEX_WAIT_PRIVATE or FUTEX_WAKE_PRIVATE is emulated, the FUTEX_PRIVATE_FLAG is ignored and so their behaviours are the regular FUTEX_WAIT and FUTEX_WAKE. Emulating FUTEX_WAIT_PRIVATE and FUTEX_WAKE_PRIVATE as if they were non-private is safe from a functional point of view. The FUTEX_PRIVATE_FLAG does not change the semantics of the futex, it's just a mechanism to improve performance under certain circunstances that can be ignored in SE mode.
2012-09-10NetBSD: Build on NetBSDPalle Lyckegaard
Minor patch against so building on NetBSD is possible.
2012-09-07sim: Update the SimObject documentationAndreas Sandberg
Includes a small change in sim_object.cc that adds the name space to the output stream parameter in serializeAll. Leaving out the name space unfortunately confuses Doxygen.
2012-09-07sim: Remove the unused SimObject::regFormulas methodAndreas Sandberg
Simulation objects normally register derived statistics, presumably what regFormulas originally was meant for, in regStats(). This patch removes regRegformulas since there is no need to have a separate method call to register formulas.
2012-09-07sim: add validation to make sure there is memory where we're loading the kernelKrishnendra Nathella
2012-08-28Clock: Add a Cycles wrapper class and use where applicableAndreas Hansson
This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. This changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes.
2012-08-28Clock: Rework clocks to avoid tick-to-cycle transformationsAndreas Hansson
This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well.
2012-08-27sim: fix overflow check in simulate because Tick is now unsignedAnthony Gutierrez
2012-08-27System: Remove redundant call to startupCPUNilay Vaish
2012-08-21EventManager: Remove test for NULL pointer in constructorAndreas Hansson
This patch tidies up the EventManager constructor and prunes a corner case where the EventManager would initialise its eventq pointer to NULL. This would cause segmentation faults on actual use and should never happen.
2012-08-21Clock: Make Tick unsigned and remove UTickAndreas Hansson
This patch makes the Tick unsigned and removes the UTick typedef. The ticks should never be negative, and there was only one major issue with removing it, caused by the o3 CPU using a -1 as an initial value. The patch has no impact on any regressions.
2012-08-21Clock: Move the clock and related functions to ClockedObjectAndreas Hansson
This patch moves the clock of the CPU, bus, and numerous devices to the new class ClockedObject, that sits in between the SimObject and MemObject in the class hierarchy. Although there are currently a fair amount of MemObjects that do not make use of the clock, they potentially should do so, e.g. the caches should at some point have the same clock as the CPU, potentially with a 1:n ratio. This patch does not introduce any new clock objects or object hierarchies (clusters, clock domains etc), but is still a step in the direction of having a more structured approach clock domains. The most contentious part of this patch is the serialisation of clocks that some of the modules (but not all) did previously. This serialisation should not be needed as the clock is set through the parameters even when restoring from the checkpoint. In other words, the state is "stored" in the Python code that creates the modules. The nextCycle methods are also simplified and the clock phase parameter of the CPU is removed (this could be part of a clock object once they are introduced).
2012-08-15O3,ARM: fix some problems with drain/switchout functionality and add Drain ↵Anthony Gutierrez
DPRINTFs This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements. This is an incremental fix as there are still a few bugs/mem leaks with the switchout code. Particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation.
2012-08-08System: set kernel to null, if unspecified.Nilay Vaish
2012-08-06process: add progName() virtual functionSteve Reinhardt
This replaces a (potentially uninitialized) string field with a virtual function so that we can have a safe interface without requiring changes to the eio code.
2012-08-06syscall_emul: clean up open() code a bit.Steve Reinhardt
2012-08-06str: add an overloaded startswith() utility methodSteve Reinhardt
for various string types and use it in a few places.
2012-08-06syscall emulation: Clean up ioctl handling, and implement for x86.Marc Orr
Enable different whitelists for different OS/arch combinations, since some use the generic Linux definitions only, and others use definitions inherited from earlier Unix flavors on those architectures. Also update x86 function pointers so ioctl is no longer unimplemented on that platform. This patch is a revised version of Vince Weaver's earlier patch.
2012-07-10syscall emulation: Add the futex system call.Marc Orr
2012-07-10Add hook to call map() on Process from python.Steve Reinhardt
This enables configuration scripts to set up mappings from process virtual addresses to specific physical addresses in SE mode. This feature is needed to support modeling of user-accessible memories or devices in SE mode, avoiding the complexities of FS mode and the need to write a device driver.
2012-07-09EventManager: Rename queue accessor and remove cast operatorAndreas Hansson
This patch renames the queue() accessor to the less ambigious eventQueue, and also removes the cast operator. The queue() member function cause problems in derived classes that declare members with the same name, e.g. a MemObject subclass that has a packet queue on its own. The operator is not causing any harm at this point, but as it is not used there is little point in keeping it.
2012-07-09Fix: Address a few benign memory leaksAndreas Hansson
This patch is the result of static analysis identifying a number of memory leaks. The leaks are all benign as they are a result of not deallocating memory in the desctructor. The fix still has value as it removes false positives in the static analysis.
2012-06-05cpt: update some comments in the checkpoint migration scriptAli Saidi
2012-06-05Mem: add per-master stats to physmemDam Sunwoo
Added per-master stats (similar to cache stats) to physmem.
2012-06-05sim: Provide a framework for detecting out of data checkpoints and migrating ↵Ali Saidi
them.
2012-06-05sim: Remove FastAllocAli Saidi
While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe. After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc when running twolf for ARM.
2012-05-30gcc: Small fixes to compile with gcc 4.7Andreas Hansson
This patch makes two very minor changes to please gcc 4.7. The CopyData function no longer exists and this has been replaced. For some reason previous versions of gcc did not complain on the const char casting not having an implementation, but this is now addressed.
2012-05-19Syscalls: warn when the length argument to mmap is excessive.Gabe Black
If the length argument to mmap is larger than the arbitrary but reasonable limit of 4GB, there's a good chance that the value is nonsense and not intentional. Rather than attempting to satisfy the mmap anyway, this change makes gem5 warn to make it more apparent what's going wrong.
2012-05-14Mem: Fix size check when allocating physical memoryLena Olson
2012-05-10gem5: fix a number of use after free issuesAli Saidi
2012-05-10stats: track if the stats have been enabled and prevent requesting master idAli Saidi
Track the point in the initialization where statistics have been registered. After this point registering new masterIds can no longer work as some SimObjects may have sized stats vectors based on the previous value. If someone tries to register a masterId after this point the simulator executes fatal().
2012-05-01MEM: Separate requests and responses for timing accessesAndreas Hansson
This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself.
2012-04-14MEM: Separate snoops and normal memory requests/responsesAndreas Hansson
This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions.
2012-04-14clang/gcc: Fix compilation issues with clang 3.0 and gcc 4.6Andreas Hansson
This patch addresses a number of minor issues that cause problems when compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it avoids using the deprecated ext/hash_map and instead uses unordered_map (and similarly so for the hash_set). To make use of the new STL containers, g++ and clang has to be invoked with "-std=c++0x", and this is now added for all gcc versions >= 4.6, and for clang >= 3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1 unordered_map to avoid the deprecation warning. The addition of c++0x in turn causes a few problems, as the compiler is more stringent and adds a number of new warnings. Below, the most important issues are enumerated: 1) the use of namespaces is more strict, e.g. for isnan, and all headers opening the entire namespace std are now fixed. 2) another other issue caused by the more stringent compiler is the narrowing of the embedded python, which used to be a char array, and is now unsigned char since there were values larger than 128. 3) a particularly odd issue that arose with the new c++0x behaviour is found in range.hh, where the operator< causes gcc to complain about the template type parsing (the "<" is interpreted as the beginning of a template argument), and the problem seems to be related to the begin/end members introduced for the range-type iteration, which is a new feature in c++11. As a minor update, this patch also fixes the build flags for the clang debug target that used to be shared with gcc and incorrectly use "-ggdb".
2012-04-06MEM: Enable multiple distributed generalized memoriesAndreas Hansson
This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. --HG-- rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py rename : src/mem/physical.cc => src/mem/abstract_mem.cc rename : src/mem/physical.hh => src/mem/abstract_mem.hh rename : src/mem/physical.cc => src/mem/simple_mem.cc rename : src/mem/physical.hh => src/mem/simple_mem.hh
2012-03-30MEM: Introduce the master/slave port sub-classes in C++William Wang
This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects. The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches. The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal. The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again.
2012-03-19gcc: Clean-up of non-C++0x compliant code, first stepsAndreas Hansson
This patch cleans up a number of minor issues aiming to get closer to compliance with the C++0x standard as interpreted by gcc and clang (compile with std=c++0x and -pedantic-errors). In particular, the patch cleans up enums where the last item was succeded by a comma, namespaces closed by a curcly brace followed by a semi-colon, and the use of the GNU-extension typeof (replaced by templated functions). It does not address variable-length arrays, zero-size arrays, anonymous structs, range expressions in switch statements, and the use of long long. The generated CPU code also has a large number of issues that remain to be fixed, mainly related to overflows in implicit constant conversion (due to shifts).
2012-03-19clang: Fix recently introduced clang compilation errorsAndreas Hansson
This patch makes the code compile with clang 2.9 and 3.0 again by making two very minor changes. Firt, it maintains a strict typing in the forward declaration of the BaseCPUParams. Second, it adds a FullSystemInt flag of the type unsigned int next to the boolean FullSystem flag. The FullSystemInt variable can be used in decode-statements (expands to switch statements) in the instruction decoder.