summaryrefslogtreecommitdiff
path: root/src/mem/cache
AgeCommit message (Collapse)Author
2013-02-19mem: Add predecessor to SenderState base classAndreas Hansson
This patch adds a predecessor field to the SenderState base class to make the process of linking them up more uniform, and enable a traversal of the stack without knowing the specific type of the subclasses. There are a number of simplifications done as part of changing the SenderState, particularly in the RubyTest.
2013-02-15mem: Tighten up cache constness and scopingAndreas Hansson
This patch merely adopts a more strict use of const for the cache member functions and variables, and also moves a large portion of the member functions from public to protected.
2013-02-15sim: Add a system-global option to bypass cachesAndreas Sandberg
Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virutalized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches. To make memory mode tests cleaner, the following methods are added to the System class: * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed. The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.
2013-01-28cache: remove drainManager because it's not usedAnthony Gutierrez
the cache drainManager is set but never cleared, this is because the cache itself does not need to be drained and thus never triggers a signalDrainDone(). because the drainManager variable is not used properly and does not appear to be necessary it has been removed with this patch.
2013-01-08mem: Make LL/SC locks fine grainedMitch Hayenga
The current implementation in gem5 just keeps a list of locks per cacheline. Due to this, a store to a non-overlapping portion of the cacheline can cause an LL/SC pair to fail. This patch simply adds an address range to the lock structure, so that the lock is only invalidated if the store overlaps the lock range.
2013-01-07mem: Fix guest corruption when caches handle uncacheable accessesAndreas Sandberg
When the classic gem5 cache sees an uncacheable memory access, it used to ignore it or silently drop the cache line in case of a write. Normally, there shouldn't be any data in the cache belonging to an uncacheable address range. However, since some architecture models don't implement cache maintenance instructions, there might be some dirty data in the cache that is discarded when this happens. The reason it has mostly worked before is because such cache lines were most likely evicted by normal memory activity before a TLB flush was requested by the OS. Previously, the cache model would invalidate cache lines when they were accessed by an uncacheable write. This changeset alters this behavior so all uncacheable memory accesses cause a cache flush with an associated writeback if necessary. This is implemented by reusing the cache flushing machinery used when draining the cache, which implies that writebacks are performed using functional accesses.
2013-01-07mem: Remove the IIC replacement policyAndreas Sandberg
The IIC replacement policy seems to be unused and has probably gathered too much bit rot to be useful. This patch removes the IIC and its associated cache parameters.
2013-01-07sim: Fatal if a clocked object is set to have a clock of 0Andreas Hansson
This patch adds a check to the clocked object constructor to ensure it is not configured to have a clock period of 0.
2013-01-07cache: add note about where conflicts are handledAli Saidi
2012-11-02mem: Add support for writing back and flushing cachesAndreas Sandberg
This patch adds support for the following optional drain methods in the classical memory system's cache model: memWriteback() - Write back all dirty cache lines to memory using functional accesses. memInvalidate() - Invalidate all cache lines. Dirty cache lines are lost unless a writeback is requested. Since memWriteback() is called when checkpointing systems, this patch adds support for checkpointing systems with caches. The serialization code now checks whether there are any dirty lines in the cache. If there are dirty lines in the cache, the checkpoint is flagged as bad and a warning is printed.
2012-11-02sim: Move the draining interface into a separate base classAndreas Sandberg
This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager.
2012-11-02sim: Include object header files in SWIG interfacesAndreas Sandberg
When casting objects in the generated SWIG interfaces, SWIG uses classical C-style casts ( (Foo *)bar; ). In some cases, this can degenerate into the equivalent of a reinterpret_cast (mainly if only a forward declaration of the type is available). This usually works for most compilers, but it is known to break if multiple inheritance is used anywhere in the object hierarchy. This patch introduces the cxx_header attribute to Python SimObject definitions, which should be used to specify a header to include in the SWIG interface. The header should include the declaration of the wrapped object. We currently don't enforce header the use of the header attribute, but a warning will be generated for objects that do not use it.
2012-10-15Port: Add protocol-agnostic ports in the port hierarchyAndreas Hansson
This patch adds an additional level of ports in the inheritance hierarchy, separating out the protocol-specific and protocl-agnostic parts. All the functionality related to the binding of ports is now confined to use BaseMaster/BaseSlavePorts, and all the protocol-specific parts stay in the Master/SlavePort. In the future it will be possible to add other protocol-specific implementations. The functions used in the binding of ports, i.e. getMaster/SlavePort now use the base classes, and the index parameter is updated to use the PortID typedef with the symbolic InvalidPortID as the default.
2012-10-15Fix: Address a few minor issues identified by cppcheckAndreas Hansson
This patch addresses a number of smaller issues identified by the code inspection utility cppcheck. There are a number of identified leaks in the arm/linux/system.cc (although the function only get's called once so it is not a major problem), a few deletes in dev/x86/i8042.cc that were not array deletes, and sprintfs where the character array had one element less than needed. In the IIC tags there was a function allocating an array of longs which is in fact never used.
2012-10-15Mem: Use cycles to express cache-related latenciesAndreas Hansson
This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counter in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.
2012-09-25MEM: Put memory system document into doxygenDjordje Kovacevic
2012-09-25Cache: add a response latency to the cachesMrinmoy Ghosh
In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set of the cache for the backward path.
2012-09-19AddrRange: Transition from Range<T> to AddrRangeAndreas Hansson
This patch takes the final plunge and transitions from the templated Range class to the more specific AddrRange. In doing so it changes the obvious Range<Addr> to AddrRange, and also bumps the range_map to be AddrRangeMap. In addition to the obvious changes, including the removal of redundant includes, this patch also does some house keeping in preparing for the introduction of address interleaving support in the ranges. The Range class is also stripped of all the functionality that is never used. --HG-- rename : src/base/range.hh => src/base/addr_range.hh rename : src/base/range_map.hh => src/base/addr_range_map.hh
2012-09-11clang: Fix issues identified by the clang static analyzerAndreas Hansson
This patch addresses a few minor issues reported by the clang static analyzer. The analysis was run with: scan-build -disable-checker deadcode \ -enable-checker experimental.core \ -disable-checker experimental.core.CastToStruct \ -enable-checker experimental.cpluscplus
2012-09-11Cache: Split invalidateBlk up to seperate block vs. tagsLena Olson
This seperates the functionality to clear the state in a block into blk.hh and the functionality to udpate the tag information into the tags. This gets rid of the case where calling invalidateBlk on an already-invalid block does something different than calling it on a valid block, which was confusing.
2012-09-07Param: Transition to Cycles for relevant parametersAndreas Hansson
This patch is a first step to using Cycles as a parameter type. The main affected modules are the CPUs and the Ruby caches. There are definitely plenty more places that are affected, but this patch serves as a starting point to making the transition. An important part of this patch is to actually enable parameters to be specified as Param.Cycles which involves some changes to params.py.
2012-08-22Packet: Remove NACKs from packet and its use in endpointsAndreas Hansson
This patch removes the NACK frrom the packet as there is no longer any module in the system that issues them (the bridge was the only one and the previous patch removes that). The handling of NACKs was mostly avoided throughout the code base, by using e.g. panic or assert false, but in a few locations the NACKs were actually dealt with (although NACKs never occured in any of the regressions). Most notably, the DMA port will now never receive a NACK and the backoff time is thus never changed. As a consequence, the entire backoff mechanism (similar to a PCI bus) is now removed and the DMA port entirely relies on the bus performing the arbitration and issuing a retry when appropriate. This is more in line with e.g. PCIe. Surprisingly, this patch has no impact on any of the regressions. As mentioned in the patch that removes the NACK from the bridge, a follow-up patch should change the request and response buffer size for at least one regression to also verify that the system behaves as expected when the bridge fills up.
2012-08-22Port: Extend the QueuedPort interface and use where appropriateAndreas Hansson
This patch extends the queued port interfaces with methods for scheduling the transmission of a timing request/response. The methods are named similar to the corresponding sendTiming(Snoop)Req/Resp, replacing the "send" with "sched". As the queues are currently unbounded, the methods always succeed and hence do not return a value. This functionality was previously provided in the subclasses by calling PacketQueue::schedSendTiming with the appropriate parameters. With this change, there is no need to introduce these extra methods in the subclasses, and the use of the queued interface is more uniform and explicit.
2012-08-15O3,ARM: fix some problems with drain/switchout functionality and add Drain ↵Anthony Gutierrez
DPRINTFs This patch fixes some problems with the drain/switchout functionality for the O3 cpu and for the ARM ISA and adds some useful debug print statements. This is an incremental fix as there are still a few bugs/mem leaks with the switchout code. Particularly when switching from an O3CPU to a TimingSimpleCPU. However, when switching from O3 to O3 cores with the ARM ISA I haven't encountered any more assertion failures; now the kernel will typically panic inside of simulation.
2012-07-27cache: don't allow dirty data in the i-cacheAnthony Gutierrez
removes the optimization that forwards an exclusive copy to a requester on a read, only for the i-cache. this optimization isn't necessary because we typically won't be writing to the i-cache.
2012-07-09Port: Align port names in C++ and PythonAndreas Hansson
This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics. Most notably, the CPU, cache, and bus is addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus.
2012-07-09Port: Make getAddrRanges constAndreas Hansson
This patch makes getAddrRanges const throughout the code base. There is no reason why it should not be, and making it const prevents adding any unintentional side-effects.
2012-07-09Port: Add isSnooping to slave port (asking master port)Andreas Hansson
This patch adds isSnooping to the slave port, and thus avoids going through getMasterPort to be able to ask the master. Over the course of the next few patches, all getMasterPort/getSlavePort in Port and MemObject are to be protocol agnostic, and the snooping is part of the protocol layer. The function is already present on the master port, where it is implemented by the module itself, e.g. a cache. On the slave side, it is merely asking the connected master port. The same name is used by both functions despite their difference in behaviour. The initial design used isMasterSnooping on the slave port side, but the more verbose function name was later changed.
2012-07-09Port: Move retry from port base class to Master/SlavePortAndreas Hansson
This patch is the last part of moving all protocol-related functionality out of the Port base class. All the send/recv functions are already moved, and the retry (which still governs all the timing transport functions) is the only part that remained in the base class. The only point where this currently causes a bit of inconvenience is in the bus where the retry list is global and holds Port pointers (not Master/SlavePort). This is about to change with the split into a request/response bus and will soon be removed anyway. The patch has no impact on any regressions.
2012-07-09Fix: Address a few benign memory leaksAndreas Hansson
This patch is the result of static analysis identifying a number of memory leaks. The leaks are all benign as they are a result of not deallocating memory in the desctructor. The fix still has value as it removes false positives in the static analysis.
2012-06-29Cache: Fix the LRU policy for classic memory hierarchyLena Olson
The LRU policy always evicted the least recently touched way, even if it contained valid data and another way was invalid, as can happen if a block has been invalidated by coherance. This can result in caches never warming up even though they are replacing blocks. This modifies the LRU policy to move blocks to LRU position on invalidation.
2012-06-29Mem: fix master id assertion in cache_impl.hhDam Sunwoo
The assertion was applied to the wrong packet. This patch fixes the issue rerported by Xiang Jiang on the gem5-dev mailing list.
2012-06-29Cache: Only invalidate a line in the cache when an uncacheable write is seen.Ali Saidi
2012-06-07mem: Delay deleting of incoming packets by one call.Ali Saidi
This patch is a temporary fix until Andreas' four-phase patches get reviewed and committed. Removing FastAlloc seems to have exposed an issue which previously was reasonable rare in which packets are freed before the sending cache is done with them. This change puts incoming packets no a pendingDelete queue which are deleted at the start of the next call and thus breaks the dependency between when the caller returns true and when the packet is actually used by the sending cache. Running valgrind on a multi-core linux boot and the memtester results in no valgrind warnings.
2012-06-05sim: Remove FastAllocAli Saidi
While FastAlloc provides a small performance increase (~1.5%) over regular malloc it isn't thread safe. After removing FastAlloc and using tcmalloc I've seen a performance increase of 12% over libc malloc when running twolf for ARM.
2012-05-30Bus: Turn the PortId into a transport function parameterAndreas Hansson
The main aim of this patch is to arrive at a suitable port interface for vector ports, including both the packet and the port id. This patch changes the bus transport functions (recvFunctional/Atomic/Timing) to require a PortId parameter indicating the source port. Previously this information was passed by setting the source field of the packet, and this is only required in the case of a timing request. With this patch, the use of the source and destination field is also more restrictive, as they are only needed for timing accesses. The modifications to these fields for atomic snoops is now removed entirely, also making minor modifications to the cache.
2012-05-30Packet: Unify the use of PortID in packet and portAndreas Hansson
This patch removes the Packet::NodeID typedef and unifies it with the Port::PortId. The src and dest fields in the packet are used to hold a port id (e.g. in the bus), and thus the two should actually be the same. The typedef PortID is now global (in base/types.hh) and aligned with the ThreadID in terms of capitalisation and naming of the InvalidPortID constant. Before this patch, two flags were used for valid destination and source, rather than relying on a named value (InvalidPortID), and this is now redundant, as the src and dest field themselves are sufficient to tell whether the current value is a valid port identifier or not. Consequently, the VALID_SRC and VALID_DST are removed. As part of the cleaning up, a number of int parameters and local variables are updated to use PortID. Note that Ruby still has its own NodeID typedef. Furthermore, the MemObject getMaster/SlavePort still has an int idx parameter with a default value of -1 which should eventually change to PortID idx = InvalidPortID.
2012-05-24Cache: Remove dangling doWriteback declarationAndreas Hansson
This patch removes the declaration of doWriteback as there is no implementation for this member function.
2012-05-10Cache: restructure code that actually isn't a loopAli Saidi
2012-05-10gem5: fix some iterator use and erase bugsAli Saidi
2012-05-10gem5: Fix a number of incorrect case statementsAli Saidi
2012-05-10Cache: Panic if you attempt to create a checkpoint with a cache in the systemAli Saidi
2012-05-01MEM: Separate requests and responses for timing accessesAndreas Hansson
This patch moves send/recvTiming and send/recvTimingSnoop from the Port base class to the MasterPort and SlavePort, and also splits them into separate member functions for requests and responses: send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq, send/recvTimingSnoopResp. A master port sends requests and receives responses, and also receives snoop requests and sends snoop responses. A slave port has the reciprocal behaviour as it receives requests and sends responses, and sends snoop requests and receives snoop responses. For all MemObjects that have only master ports or slave ports (but not both), e.g. a CPU, or a PIO device, this patch merely adds more clarity to what kind of access is taking place. For example, a CPU port used to call sendTiming, and will now call sendTimingReq. Similarly, a response previously came back through recvTiming, which is now recvTimingResp. For the modules that have both master and slave ports, e.g. the bus, the behaviour was previously relying on branches based on pkt->isRequest(), and this is now replaced with a direct call to the apprioriate member function depending on the type of access. Please note that send/recvRetry is still shared by all the timing accessors and remains in the Port base class for now (to maintain the current bus functionality and avoid changing the statistics of all regressions). The packet queue is split into a MasterPort and SlavePort version to facilitate the use of the new timing accessors. All uses of the PacketQueue are updated accordingly. With this patch, the type of packet (request or response) is now well defined for each type of access, and asserts on pkt->isRequest() and pkt->isResponse() are now moved to the appropriate send member functions. It is also worth noting that sendTimingSnoopReq no longer returns a boolean, as the semantics do not alow snoop requests to be rejected or stalled. All these assumptions are now excplicitly part of the port interface itself.
2012-04-14MEM: Remove the Broadcast destination from the packetAndreas Hansson
This patch simplifies the packet by removing the broadcast flag and instead more firmly relying on (and enforcing) the semantics of transactions in the classic memory system, i.e. request packets are routed from a master to a slave based on the address, and when they are created they have neither a valid source, nor destination. On their way to the slave, the request packet is updated with a source field for all modules that multiplex packets from multiple master (e.g. a bus). When a request packet is turned into a response packet (at the final slave), it moves the potentially populated source field to the destination field, and the response packet is routed through any multiplexing components back to the master based on the destination field. Modules that connect multiplexing components, such as caches and bridges store any existing source and destination field in the sender state as a stack (just as before). The packet constructor is simplified in that there is no longer a need to pass the Packet::Broadcast as the destination (this was always the case for the classic memory system). In the case of Ruby, rather than using the parameter to the constructor we now rely on setDest, as there is already another three-argument constructor in the packet class. In many places where the packet information was printed as part of DPRINTFs, request packets would be printed with a numeric "dest" that would always be -1 (Broadcast) and that field is now removed from the printing.
2012-04-14MEM: Separate snoops and normal memory requests/responsesAndreas Hansson
This patch introduces port access methods that separates snoop request/responses from normal memory request/responses. The differentiation is made for functional, atomic and timing accesses and builds on the introduction of master and slave ports. Before the introduction of this patch, the packets belonging to the different phases of the protocol (request -> [forwarded snoop request -> snoop response]* -> response) all use the same port access functions, even though the snoop packets flow in the opposite direction to the normal packet. That is, a coherent master sends normal request and receives responses, but receives snoop requests and sends snoop responses (vice versa for the slave). These two distinct phases now use different access functions, as described below. Starting with the functional access, a master sends a request to a slave through sendFunctional, and the request packet is turned into a response before the call returns. In a system without cache coherence, this is all that is needed from the functional interface. For the cache-coherent scenario, a slave also sends snoop requests to coherent masters through sendFunctionalSnoop, with responses returned within the same packet pointer. This is currently used by the bus and caches, and the LSQ of the O3 CPU. The send/recvFunctional and send/recvFunctionalSnoop are moved from the Port super class to the appropriate subclass. Atomic accesses follow the same flow as functional accesses, with request being sent from master to slave through sendAtomic. In the case of cache-coherent ports, a slave can send snoop requests to a master through sendAtomicSnoop. Just as for the functional access methods, the atomic send and receive member functions are moved to the appropriate subclasses. The timing access methods are different from the functional and atomic in that requests and responses are separated in time and send/recvTiming are used for both directions. Hence, a master uses sendTiming to send a request to a slave, and a slave uses sendTiming to send a response back to a master, at a later point in time. Snoop requests and responses travel in the opposite direction, similar to what happens in functional and atomic accesses. With the introduction of this patch, it is possible to determine the direction of packets in the bus, and no longer necessary to look for both a master and a slave port with the requested port id. In contrast to the normal recvFunctional, recvAtomic and recvTiming that are pure virtual functions, the recvFunctionalSnoop, recvAtomicSnoop and recvTimingSnoop have a default implementation that calls panic. This is to allow non-coherent master and slave ports to not implement these functions.
2012-04-06MEM: Enable multiple distributed generalized memoriesAndreas Hansson
This patch removes the assumption on having on single instance of PhysicalMemory, and enables a distributed memory where the individual memories in the system are each responsible for a single contiguous address range. All memories inherit from an AbstractMemory that encompasses the basic behaviuor of a random access memory, and provides untimed access methods. What was previously called PhysicalMemory is now SimpleMemory, and a subclass of AbstractMemory. All future types of memory controllers should inherit from AbstractMemory. To enable e.g. the atomic CPU and RubyPort to access the now distributed memory, the system has a wrapper class, called PhysicalMemory that is aware of all the memories in the system and their associated address ranges. This class thus acts as an infinitely-fast bus and performs address decoding for these "shortcut" accesses. Each memory can specify that it should not be part of the global address map (used e.g. by the functional memories by some testers). Moreover, each memory can be configured to be reported to the OS configuration table, useful for populating ATAG structures, and any potential ACPI tables. Checkpointing support currently assumes that all memories have the same size and organisation when creating and resuming from the checkpoint. A future patch will enable a more flexible re-organisation. --HG-- rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py rename : src/mem/physical.cc => src/mem/abstract_mem.cc rename : src/mem/physical.hh => src/mem/abstract_mem.hh rename : src/mem/physical.cc => src/mem/simple_mem.cc rename : src/mem/physical.hh => src/mem/simple_mem.hh
2012-03-30MEM: Introduce the master/slave port sub-classes in C++William Wang
This patch introduces the notion of a master and slave port in the C++ code, thus bringing the previous classification from the Python classes into the corresponding simulation objects and memory objects. The patch enables us to classify behaviours into the two bins and add assumptions and enfore compliance, also simplifying the two interfaces. As a starting point, isSnooping is confined to a master port, and getAddrRanges to slave ports. More of these specilisations are to come in later patches. The getPort function is not getMasterPort and getSlavePort, and returns a port reference rather than a pointer as NULL would never be a valid return value. The default implementation of these two functions is placed in MemObject, and calls fatal. The one drawback with this specific patch is that it requires some code duplication, e.g. QueuedPort becomes QueuedMasterPort and QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort (avoiding multiple inheritance). With the later introduction of the port interfaces, moving the functionality outside the port itself, a lot of the duplicated code will disappear again.
2012-03-22MEM: Split SimpleTimingPort into PacketQueue and portsAndreas Hansson
This patch decouples the queueing and the port interactions to simplify the introduction of the master and slave ports. By separating the queueing functionality from the port itself, it becomes much easier to distinguish between master and slave ports, and still retain the queueing ability for both (without code duplication). As part of the split into a PacketQueue and a port, there is now also a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The QueuedPort is useful for ports that want to leave the packet transmission of outgoing packets to the queue and is used by both master and slave ports. The SimpleTimingPort inherits from the QueuedPort and adds the implemention of recvTiming and recvFunctional through recvAtomic. The PioPort and MessagePort are cleaned up as part of the changes. --HG-- rename : src/mem/tport.cc => src/mem/packet_queue.cc rename : src/mem/tport.hh => src/mem/packet_queue.hh
2012-03-09cache: Allow main memory to be at disjoint address ranges.Ali Saidi
2012-03-01Cache: Fix an issue with LRU when bonus block is used to complete transaction.Ali Saidi
The block is never inserted because it's the one extra block in the cache, but it can be invalidated twice in a row. In that case the block doesn't have a new master id (beacuse it was never inserted), however it is valid and the accounting goes wrong at that point.