gem5 - gem5

Age	Commit message (Collapse)	Author
2013-07-18	mem: Set the cache line size on a system level	Andreas Hansson
	This patch removes the notion of a peer block size and instead sets the cache line size on the system level. Previously the size was set per cache, and communicated through the interconnect. There were plenty checks to ensure that everyone had the same size specified, and these checks are now removed. Another benefit that is not yet harnessed is that the cache line size is now known at construction time, rather than after the port binding. Hence, the block size can be locally stored and does not have to be queried every time it is used. A follow-on patch updates the configuration scripts accordingly.
2012-11-02	sim: Include object header files in SWIG interfaces	Andreas Sandberg
	When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it.
2012-10-15	memtest: move check on outstanding requests	Nilay Vaish
	The Memtest tester allows for only one request to be outstanding for a particular physical address. The check has been written separately for reads and writes. This patch moves the check earlier than its current position so that it need not be written separately for reads and writes.
2012-10-15	Port: Add protocol-agnostic ports in the port hierarchy	Andreas Hansson
	This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.
2012-08-28	Clock: Add a Cycles wrapper class and use where applicable	Andreas Hansson
	This patch addresses the comments and feedback on the preceding patch that reworks the clocks and now more clearly shows where cycles (relative cycle counts) are used to express time. Instead of bumping the existing patch I chose to make this a separate patch, merely to try and focus the discussion around a smaller set of changes. The two patches will be pushed together though. This changes done as part of this patch are mostly following directly from the introduction of the wrapper class, and change enough code to make things compile and run again. There are definitely more places where int/uint/Tick is still used to represent cycles, and it will take some time to chase them all down. Similarly, a lot of parameters should be changed from Param.Tick and Param.Unsigned to Param.Cycles. In addition, the use of curTick is questionable as there should not be an absolute cycle. Potential solutions can be built on top of this patch. There is a similar situation in the o3 CPU where lastRunningCycle is currently counting in Cycles, and is still an absolute time. More discussion to be had in other words. An additional change that would be appropriate in the future is to perform a similar wrapping of Tick and probably also introduce a Ticks class along with suitable operators for all these classes.
2012-08-28	Clock: Rework clocks to avoid tick-to-cycle transformations	Andreas Hansson
	This patch introduces the notion of a clock update function that aims to avoid costly divisions when turning the current tick into a cycle. Each clocked object advances a private (hidden) cycle member and a tick member and uses these to implement functions for getting the tick of the next cycle, or the tick of a cycle some time in the future. In the different modules using the clocks, changes are made to avoid counting in ticks only to later translate to cycles. There are a few oddities in how the O3 and inorder CPU count idle cycles, as seen by a few locations where a cycle is subtracted in the calculation. This is done such that the regression does not change any stats, but should be revisited in a future patch. Another, much needed, change that is not done as part of this patch is to introduce a new typedef uint64_t Cycle to be able to at least hint at the unit of the variables counting Ticks vs Cycles. This will be done as a follow-up patch. As an additional follow up, the thread context still uses ticks for the book keeping of last activate and last suspend and this should probably also be changed into cycles as well.
2012-08-21	Clock: Move the clock and related functions to ClockedObject	Andreas Hansson
	This patch moves the clock of the CPU, bus, and numerous devices to the new class ClockedObject, that sits in between the SimObject and MemObject in the class hierarchy. Although there are currently a fair amount of MemObjects that do not make use of the clock, they potentially should do so, e.g. the caches should at some point have the same clock as the CPU, potentially with a 1:n ratio. This patch does not introduce any new clock objects or object hierarchies (clusters, clock domains etc), but is still a step in the direction of having a more structured approach clock domains. The most contentious part of this patch is the serialisation of clocks that some of the modules (but not all) did previously. This serialisation should not be needed as the clock is set through the parameters even when restoring from the checkpoint. In other words, the state is "stored" in the Python code that creates the modules. The nextCycle methods are also simplified and the clock phase parameter of the CPU is removed (this could be part of a clock object once they are introduced).
2012-06-05	sim: Remove FastAlloc	Ali Saidi
	While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe. After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc when running twolf for ARM.
2012-05-01	MEM: Separate requests and responses for timing accesses	Andreas Hansson
	This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself.
2012-04-14	MEM: Remove the Broadcast destination from the packet	Andreas Hansson
	This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field. Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before). The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class. In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing.
2012-04-14	MEM: Separate snoops and normal memory requests/responses	Andreas Hansson
	This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions.
2012-03-30	MEM: Introduce the master/slave port sub-classes in C++	William Wang
	This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects. The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches. The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal. The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again.
2012-02-24	MEM: Move all read/write blob functions from Port to PortProxy	Andreas Hansson
	This patch moves the readBlob/writeBlob/memsetBlob from the Port class to the PortProxy class, thus making a clear separation of the basic port functionality (recv/send functional/atomic/timing), and the higher-level functional accessors available on the port proxies. There are only a few places in the code base where the blob functions were used on ports, and they are all for peeking into the memory system without making a normal memory access (in the memtest, and the malta and tsunami pchip). The memtest also exemplifies how easy it is to create a non-translating proxy if desired. The malta and tsunami pchip used a slave port to perform a functional read, and this is now changed to rely on the physProxy of the system (to which they already have a pointer).
2012-02-13	MEM: Introduce the master/slave port roles in the Python classes	Andreas Hansson
	This patch classifies all ports in Python as either Master or Slave and enforces a binding of master to slave. Conceptually, a master (such as a CPU or DMA port) issues requests, and receives responses, and conversely, a slave (such as a memory or a PIO device) receives requests and sends back responses. Currently there is no differentiation between coherent and non-coherent masters and slaves. The classification as master/slave also involves splitting the dual role port of the bus into a master and slave port and updating all the system assembly scripts to use the appropriate port. Similarly, the interrupt devices have to have their int_port split into a master and slave port. The intdev and its children have minimal changes to facilitate the extra port. Note that this patch does not enforce any port typing in the C++ world, it merely ensures that the Python objects have a notion of the port roles and are connected in an appropriate manner. This check is carried when two ports are connected, e.g. bus.master = memory.port. The following patches will make use of the classifications and specialise the C++ ports into masters and slaves.
2012-02-12	mem: Add a master ID to each request object.	Ali Saidi
	This change adds a master id to each request object which can be used identify every device in the system that is capable of issuing a request. This is part of the way to removing the numCpus+1 stats in the cache and replacing them with the master ids. This is one of a series of changes that make way for the stats output to be changed to python.
2012-01-17	MEM: Separate queries for snooping and address ranges	Andreas Hansson
	This patch simplifies the address-range determination mechanism and also unifies the naming across ports and devices. It further splits the queries for determining if a port is snooping and what address ranges it responds to (aiming towards a separation of cache-maintenance ports and pure memory-mapped ports). Default behaviours are such that most ports do not have to define isSnooping, and master ports need not implement getAddrRanges.
2011-06-30	Ruby: Add support for functional accesses	Brad Beckmann ext:(%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E)
	This patch rpovides functional access support in Ruby. Currently only the M5Port of RubyPort supports functional accesses. The support for functional through the PioPort will be added as a separate patch.
2011-06-02	scons: rename TraceFlags to DebugFlags	Nathan Binkert

2011-04-15	trace: reimplement the DTRACE function so it doesn't use a vector	Nathan Binkert
	At the same time, rename the trace flags to debug flags since they have broader usage than simply tracing. This means that --trace-flags is now --debug-flags and --trace-help is now --debug-help
2011-04-15	includes: sort all includes	Nathan Binkert

2011-01-07	Replace curTick global variable with accessor functions.	Steve Reinhardt
	This step makes it easy to replace the accessor functions (which still access a global variable) with ones that access per-thread curTick values.
2010-12-21	memtest: delete some crufty dead code	Steve Reinhardt

2010-08-25	memtest: fix/cleanup functional access testing	Steve Reinhardt
	Don't assert that the response packet is marked as a response since it won't always be so for functional accesses. Also cleanup code to refer to functional accesses rather than "probes" (old terminology), and mention in the DPRINTF which type of access we're doing.
2010-08-24	testers: move testers to a new directory	Brad Beckmann
	This patch moves the testers to a new subdirectory under src/cpu and includes the necessary fixes to work with latest m5 initialization patches. --HG-- rename : configs/example/determ_test.py => configs/example/ruby_direct_test.py rename : src/cpu/directedtest/DirectedGenerator.cc => src/cpu/testers/directedtest/DirectedGenerator.cc rename : src/cpu/directedtest/DirectedGenerator.hh => src/cpu/testers/directedtest/DirectedGenerator.hh rename : src/cpu/directedtest/InvalidateGenerator.cc => src/cpu/testers/directedtest/InvalidateGenerator.cc rename : src/cpu/directedtest/InvalidateGenerator.hh => src/cpu/testers/directedtest/InvalidateGenerator.hh rename : src/cpu/directedtest/RubyDirectedTester.cc => src/cpu/testers/directedtest/RubyDirectedTester.cc rename : src/cpu/directedtest/RubyDirectedTester.hh => src/cpu/testers/directedtest/RubyDirectedTester.hh rename : src/cpu/directedtest/RubyDirectedTester.py => src/cpu/testers/directedtest/RubyDirectedTester.py rename : src/cpu/directedtest/SConscript => src/cpu/testers/directedtest/SConscript rename : src/cpu/directedtest/SeriesRequestGenerator.cc => src/cpu/testers/directedtest/SeriesRequestGenerator.cc rename : src/cpu/directedtest/SeriesRequestGenerator.hh => src/cpu/testers/directedtest/SeriesRequestGenerator.hh rename : src/cpu/memtest/MemTest.py => src/cpu/testers/memtest/MemTest.py rename : src/cpu/memtest/SConscript => src/cpu/testers/memtest/SConscript rename : src/cpu/memtest/memtest.cc => src/cpu/testers/memtest/memtest.cc rename : src/cpu/memtest/memtest.hh => src/cpu/testers/memtest/memtest.hh rename : src/cpu/rubytest/Check.cc => src/cpu/testers/rubytest/Check.cc rename : src/cpu/rubytest/Check.hh => src/cpu/testers/rubytest/Check.hh rename : src/cpu/rubytest/CheckTable.cc => src/cpu/testers/rubytest/CheckTable.cc rename : src/cpu/rubytest/CheckTable.hh => src/cpu/testers/rubytest/CheckTable.hh rename : src/cpu/rubytest/RubyTester.cc => src/cpu/testers/rubytest/RubyTester.cc rename : src/cpu/rubytest/RubyTester.hh => src/cpu/testers/rubytest/RubyTester.hh rename : src/cpu/rubytest/RubyTester.py => src/cpu/testers/rubytest/RubyTester.py rename : src/cpu/rubytest/SConscript => src/cpu/testers/rubytest/SConscript