gem5 - gem5

Age	Commit message (Collapse)	Author
2016-08-12	mem: Add a FromCache packet attribute	Andreas Hansson
	This patch adds a FromCache attribute to the packet, and updates a number of the existing request commands to reflect that the request originates from a cache. The attribute simplifies checking if a requests came from a cache or not, and this is used by both the cache and snoop filter in follow-on patches. Change-Id: Ib0a7a080bbe4d6036ddd84b46fd45bc7eb41cd8f Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Steve Reinhardt <stever@gmail.com>
2016-01-19	* * *	Tony Gutierrez
	mem: support for gpu-style RMWs in ruby This patch adds support for GPU-style read-modify-write (RMW) operations in ruby. Such atomic operations are traditionally executed at the memory controller (instead of through an L1 cache using cache-line locking). Currently, this patch works by propogating operation functors through the memory system.
2016-01-11	scons: Enable -Wextra by default	Andreas Hansson
	Make best use of the compiler, and enable -Wextra as well as -Wall. There are a few issues that had to be resolved, but they are all trivial.
2015-12-31	mem: Do not rely on the NeedsWritable flag for responses	Andreas Hansson
	This patch removes the NeedsWritable flag for all responses, as it is really only the request that needs a writable response. The response, on the other hand, should in these cases always provide the line in a writable state, as indicated by the hasSharers flag not being set. When we send requests that has NeedsWritable set, the response will always have the hasSharers flag not set. Additionally, there are cases where the request did not have NeedsWritable set, and we still get a writable response with the hasSharers flag not set. This never happens on snoops, but is used by downstream caches to pass ownership upstream. As part of this patch, the affected response types are updated, and the snoop filter is similarly modified to check only the hasSharers flag (as it should). A sanity check is also added to the packet class, asserting that we never look at the NeedsWritable flag for responses. No regressions are affected.
2015-12-31	mem: Do not allocate space for packet data if not needed	Andreas Hansson
	This patch looks at the request and response command to determine if either actually has any data payload, and if not, we do not allocate any space for packet data. The only tricky case is where the command type is changed as part of the MSHR functionality. In these cases where the original packet had no data, but the new packet does, we need to explicitly call allocate().
2015-12-31	mem: Make cache terminology easier to understand	Andreas Hansson
	This patch changes the name of a bunch of packet flags and MSHR member functions and variables to make the coherency protocol easier to understand. In addition the patch adds and updates lots of descriptions, explicitly spelling out assumptions. The following name changes are made: * the packet memInhibit flag is renamed to cacheResponding * the packet sharedAsserted flag is renamed to hasSharers * the packet NeedsExclusive attribute is renamed to NeedsWritable * the packet isSupplyExclusive is renamed responderHadWritable * the MSHR pendingDirty is renamed to pendingModified The cache states, Modified, Owned, Exclusive, Shared are also called out in the cache and MSHR code to make it easier to understand.
2015-12-09	mem: remove acq/rel cmds from packet and add mem fence req	Tony Gutierrez

2015-11-06	mem: Add an option to perform clean writebacks from caches	Andreas Hansson
	This patch adds the necessary commands and cache functionality to allow clean writebacks. This functionality is crucial, especially when having exclusive (victim) caches. For example, if read-only L1 instruction caches are not sending clean writebacks, there will never be any spills from the L1 to the L2. At the moment the cache model defaults to not sending clean writebacks, and this should possibly be re-evaluated. The implementation of clean writebacks relies on a new packet command WritebackClean, which acts much like a Writeback (renamed WritebackDirty), and also much like a CleanEvict. On eviction of a clean block the cache either sends a clean evict, or a clean writeback, and if any copies are still cached upstream the clean evict/writeback is dropped. Similarly, if a clean evict/writeback reaches a cache where there are outstanding MSHRs for the block, the packet is dropped. In the typical case though, the clean writeback allocates a block in the downstream cache, and marks it writable if the evicted block was writable. The patch changes the O3_ARM_v7a L1 cache configuration and the default L1 caches in config/common/Caches.py
2015-09-25	mem: Make the coherent crossbar account for timing snoops	Andreas Hansson
	This patch introduces the concept of a snoop latency. Given the requirement to snoop and forward packets in zero time (due to the coherency mechanism), the latency is accounted for later. On a snoop, we establish the latency, and later add it to the header delay of the packet. To allow multiple caches to contribute to the snoop latency, we use a separate variable in the packet, and then take the maximum before adding it to the header delay.
2015-08-24	mem: Revert requirement on packet addr/size always valid	Andreas Hansson
	This patch reverts part of (842f56345a42), as apparently there are use-cases outside the main repository relying on the late setting of the physical address.
2015-08-21	mem: Reflect that packet address and size are always valid	Andreas Hansson
	This patch simplifies the packet, and removes the possibility of creating a packet without a valid address and/or size. Under no circumstances are these fields set at a later point, and thus they really have to be provided at construction time. The patch also fixes a case there the MinorCPU creates a packet without a valid address and size, only to later delete it.
2015-08-07	mem: Cleanup packet accessor methods	Andreas Sandberg
	The Packet::get() and Packet::set() methods both have very strange semantics. Currently, they automatically convert between the guest system's endianness and the host system's endianness. This behavior is usually undesired and unexpected. This patch introduces three new method pairs to access data: * getLE() / setLE() - Get data stored as little endian. * getBE() / setBE() - Get data stored as big endian. * get(ByteOrder) / set(v, ByteOrder) - Configurable endianness For example, a little endian device that is receiving a write request will use teh getLE() method to get the data from the packet. The old interface will be deprecated once all existing devices have been ported to the new interface.
2015-08-07	mem: Remove extraneous acquire/release flags and attributes	Andreas Hansson
	This patch removes the extraneous flags and attributes from the request and packet, and simply leaves the new commands. The change introduced when adding acquire/release breaks all compatibility with existing traces, and there is really no need for any new flags and attributes. The commands should be sufficient. This patch fixes packet tracing (urgent), and also removes the unnecessary complexity.
2015-07-20	mem: add request types for acquire and release	David Hashe
	Add support for acquire and release requests. These synchronization operations are commonly supported by several modern instruction sets.
2015-07-30	mem: Transition away from isSupplyExclusive for writebacks	Andreas Hansson
	This patch changes how writebacks communicate whether the line is passed as modified or owned. Previously we relied on the isSupplyExclusive mechanism, which was originally designed to avoid unecessary snoops. For normal cache requests we use the sharedAsserted mechanism to determine if a block should be marked writeable or not, and with this patch we transition the writebacks to also use this mechanism. Conceptually this is cleaner and more consistent.
2015-07-30	mem: Tidy up packet	Andreas Hansson
	Some minor fixes and removal of dead code. Changing the flags to be enums rather than static const (to avoid any linking issues caused by the latter). Also adding a getBlockAddr member which hopefully can slowly finds its way into caches, snoop filters etc.
2015-07-04	mem: packet: Add const to constructor argument	Nilay Vaish

2015-07-03	mem: Split WriteInvalidateReq into write and invalidate	Andreas Hansson
	WriteInvalidateReq ensures that a whole-line write does not incur the cost of first doing a read exclusive, only to later overwrite the data. This patch splits the existing WriteInvalidateReq into a WriteLineReq, which is done locally, and an InvalidateReq that is sent out throughout the memory system. The WriteLineReq re-uses the normal WriteResp. The change allows us to better express the difference between the cache that is performing the write, and the ones that are merely invalidating. As a consequence, we no longer have to rely on the isTopLevel flag. Moreover, the actual memory in the system does not see the intitial write, only the writeback. We were marking the written line as dirty already, so there is really no need to also push the write all the way to the memory. The overall flow of the write-invalidate operation remains the same, i.e. the operation is only carried out once the response for the invalidate comes back. This patch adds the InvalidateResp for this very reason.
2015-07-03	mem: Add ReadCleanReq and ReadSharedReq packets	Andreas Hansson
	This patch adds two new read requests packets: ReadCleanReq - For a cache to explicitly request clean data. The response is thus exclusive or shared, but not owned or modified. The read-only caches (see previous patch) use this request type to ensure they do not get dirty data. ReadSharedReq - We add this to distinguish cache read requests from those issued by other masters, such as devices and CPUs. Thus, devices use ReadReq, and caches use ReadCleanReq, ReadExReq, or ReadSharedReq. For the latter, the response can be any state, shared, exclusive, owned or even modified. Both ReadCleanReq and ReadSharedReq re-use the normal ReadResp. The two transactions are aligned with the emerging cache-coherent TLM standard and the AMBA nomenclature. With this change, the normal ReadReq should never be used by a cache, and is reserved for the actual (non-caching) masters in the system. We thus have a way of identifying if a request came from a cache or not. The introduction of ReadSharedReq thus removes the need for the current isTopLevel hack, and also allows us to stop relying on checking the packet size to determine if the source is a cache or not. This is fixed in follow-on patches.
2015-07-03	mem: Add clean evicts to improve snoop filter tracking	Ali Jafri
	This patch adds eviction notices to the caches, to provide accurate tracking of cache blocks in snoop filters. We add the CleanEvict message to the memory heirarchy and use both CleanEvicts and Writebacks with BLOCK_CACHED flags to propagate notice of clean and dirty evictions respectively, down the memory hierarchy. Note that the BLOCK_CACHED flag indicates whether there exist any copies of the evicted block in the caches above the evicting cache. The purpose of the CleanEvict message is to notify snoop filters of silent evictions in the relevant caches. The CleanEvict message behaves much like a Writeback. CleanEvict is a write and a request but unlike a Writeback, CleanEvict does not have data and does not need exclusive access to the block. The cache generates the CleanEvict message on a fill resulting in eviction of a clean block. Before travelling downwards CleanEvict requests generate zero-time snoop requests to check if the same block is cached in upper levels of the memory heirarchy. If the block exists, the cache discards the CleanEvict message. The snoops check the tags, writeback queue and the MSHRs of upper level caches in a manner similar to snoops generated from HardPFReqs. Currently CleanEvicts keep travelling towards main memory unless they encounter the block corresponding to their address or reach main memory (since we have no well defined point of serialisation). Main memory simply discards CleanEvict messages. We have modified the behavior of Writebacks, such that they generate snoops to check for the presence of blocks in upper level caches. It is possible in our current implmentation for a lower level cache to be writing back a block while a shared copy of the same block exists in the upper level cache. If the snoops find the same block in upper level caches, we set the BLOCK_CACHED flag in the Writeback message. We have also added logic to account for interaction of other message types with CleanEvicts waiting in the writeback queue. A simple example is of a response arriving at a cache removing any CleanEvicts to the same address from the cache's writeback queue.
2015-06-09	mem: Add check for express snoop in packet destructor	Ali Jafri
	Snoop packets share the request pointer with the originating packets. We need to ensure that the snoop packet destruction does not delete the request. Snoops are used for reads, invalidations, HardPFReqs, Writebacks and CleansEvicts. Reads, invalidations, and HardPFReqs need a response so their snoops do not delete the request. For Writebacks and CleanEvicts we need to check explicitly for whethere the current packet is an express snoop, in whcih case do not delete the request.
2015-03-27	mem: Rename PREFETCH_SNOOP_SQUASH flag to BLOCK_CACHED	Ali Jafri
	This patch subsumes the PREFETCH_SNOOP_SQUASH flag with the more generic BLOCK_CACHED flag. Future patches implementing cache eviction messages can use the BLOCK_CACHED flag in almost the same manner as hardware prefetches use the PREFETCH_SNOOP_SQUASH flag. The PREFTECH_SNOOP_FLAG is set if the prefetch target is found in the tags or the MSHRs in any state, so we are simply replacing calls to setPrefetchSquashed() with setBlockCached(). The case of where the prefetch target is found in the writeback MSHRs of upper level caches continues to be covered by the MEM_INHIBIT flag.
2015-02-11	mem: restructure Packet cmd initialization a bit more	Steve Reinhardt
	Refactor the way that specific MemCmd values are generated for packets. The new approach is a little more elegant in that we assign the right value up front, and it's also more amenable to non-heap-allocated Packet objects. Also replaced the code in the Minor model that was still doing it the ad-hoc way. This is basically a refinement of http://repo.gem5.org/gem5/rev/711eb0e64249.
2015-03-02	mem: Add byte mask to Packet::checkFunctional	Andreas Hansson
	This patch changes the valid-bytes start/end to a proper byte mask. With the changes in timing introduced in previous patches there are more packets waiting in queues, and there are regressions using the checker CPU failing due to non-contigous read data being found in the various cache queues. This patch also adds some more comments explaining what is going on, and adds the fourth and missing case to Packet::checkFunctional.
2015-02-11	mem: Clarification of packet crossbar timings	Marco Balboni
	This patch clarifies the packet timings annotated when going through a crossbar. The old 'firstWordDelay' is replaced by 'headerDelay' that represents the delay associated to the delivery of the header of the packet. The old 'lastWordDelay' is replaced by 'payloadDelay' that represents the delay needed to processing the payload of the packet. For now the uses and values remain identical. However, going forward the payloadDelay will be additive, and not include the headerDelay. Follow-on patches will make the headerDelay capture the pipeline latency incurred in the crossbar, whereas the payloadDelay will capture the additional serialisation delay.
2015-01-22	mem: Remove unused Packet src and dest fields	Andreas Hansson
	This patch takes the final step in removing the src and dest fields in the packet. These fields were rather confusing in that they only remember a single multiplexing component, and pushed the responsibility to the bridge and caches to store the fields in a senderstate, thus effectively creating a stack. With the recent changes to the crossbar response routing the crossbar is now responsible without relying on the packet fields. Thus, these variables are now unused and can be removed.
2014-12-02	mem: Support WriteInvalidate (again)	Curtis Dunham
	This patch takes a clean-slate approach to providing WriteInvalidate (write streaming, full cache line writes without first reading) support. Unlike the prior attempt, which took an aggressive approach of directly writing into the cache before handling the coherence actions, this approach follows the existing cache flows as closely as possible.
2014-12-02	mem: Relax packet src/dest check and shift onus to crossbar	Andreas Hansson
	This patch allows objects to get the src/dest of a packet even if it is not set to a valid port id. This simplifies (ab)using the bridge as a buffer and latency adapter in situations where the neighbouring MemObjects are not crossbars. The checks that were done in the packet are now shifted to the crossbar where the fields are used to index into the port arrays. Thus, the carrier of the information is not burdened with checking, and the crossbar can check not only that the destination is set, but also that the port index is within limits.
2014-12-02	mem: Clean up packet data allocation	Andreas Hansson
	This patch attempts to make the rules for data allocation in the packet explicit, understandable, and easy to verify. The constructor that copies a packet is extended with an additional flag "alloc_data" to enable the call site to explicitly say whether the newly created packet is short-lived (a zero-time snoop), or has an unknown life-time and therefore should allocate its own data (or copy a static pointer in the case of static data). The tricky case is the static data. In essence this is a copy-avoidance scheme where the original source of the request (DMA, CPU etc) does not ask the memory system to return data as part of the packet, but instead provides a pointer, and then the memory system carries this pointer around, and copies the appropriate data to the location itself. Thus any derived packet actually never copies any data. As the original source does not copy any data from the response packet when arriving back at the source, we must maintain the copy of the original pointer to not break the system. We might want to revisit this one day and pay the price for a few extra memcpy invocations. All in all this patch should make it easier to grok what is going on in the memory system and how data is actually copied (or not).
2014-12-02	mem: Cleanup Packet::checkFunctional and hasData usage	Andreas Hansson
	This patch cleans up the use of hasData and checkFunctional in the packet. The hasData function is unfortunately suggesting that it checks if the packet has a valid data pointer, when it does in fact only check if the specific packet type is specified to have a data payload. The confusion led to a bug in checkFunctional. The latter function is also tidied up to avoid name overloading.
2014-12-02	mem: Make the requests carried by packets const	Andreas Hansson
	This adds a basic level of sanity checking to the packet by ensuring that a request is not modified once the packet is created. The only issue that had to be worked around is the relaying of software-prefetches in the cache. The specific situation is now solved by first copying the request, and then creating a new packet accordingly.
2014-12-02	mem: Add checks and explanation for assertMemInhibit usage	Andreas Hansson

2014-12-02	mem: Assume all dynamic packet data is array allocated	Andreas Hansson
	This patch simplifies how we deal with dynamically allocated data in the packet, always assuming that it is array allocated, and hence should be array deallocated (delete[] as opposed to delete). The only uses of dataDynamic was in the Ruby testers. The ARRAY_DATA flag in the packet is removed accordingly. No defragmentation of the flags is done at this point, leaving a gap in the bit masks. As the last part the patch, it renames dataDynamicArray to dataDynamic.
2014-12-02	mem: Remove redundant Packet::allocate calls	Andreas Hansson
	This patch cleans up the packet memory allocation confusion. The data is always allocated at the requesting side, when a packet is created (or copied), and there is never a need for any device to allocate any space if it is merely responding to a paket. This behaviour is in line with how SystemC and TLM works as well, thus increasing interoperability, and matching established conventions. The redundant calls to Packet::allocate are removed, and the checks in the function are tightened up to make sure data is only ever allocated once. There are still some oddities in the packet copy constructor where we copy the data pointer if it is static (without ownership), and allocate new space if the data is dynamic (with ownership). The latter is being worked on further in a follow-on patch.
2014-12-02	mem: Use const pointers for port proxy write functions	Andreas Hansson
	This patch changes the various write functions in the port proxies to use const pointers for all sources (similar to how memcpy works). The one unfortunate aspect is the need for a const_cast in the packet, to avoid having to juggle a const and a non-const data pointer. This design decision can always be re-evaluated at a later stage.
2014-12-02	mem: Add const getters for write packet data	Andreas Hansson
	This patch takes a first step in tightening up how we use the data pointer in write packets. A const getter is added for the pointer itself (getConstPtr), and a number of member functions are also made const accordingly. In a range of places throughout the memory system the new member is used. The patch also removes the unused isReadWrite function.
2014-12-02	mem: Remove null-check bypassing in Packet::getPtr	Andreas Hansson
	This patch removes the parameter that enables bypassing the null check in the Packet::getPtr method. A number of call sites assume the value to be non-null. The one odd case is the RubyTester, which issues zero-sized prefetches(!), and despite being reads they had no valid data pointer. This is now fixed, but the size oddity remains (unless anyone object or has any good suggestions). Finally, in the Ruby Sequencer, appropriate checks are made for flush packets as they have no valid data pointer.
2014-09-27	misc: Fix a bunch of minor issues identified by static analysis	Andreas Hansson
	Add some missing initialisation, and fix a handful benign resource leaks (including some false positives).
2014-09-20	mem: Rename Bus to XBar to better reflect its behaviour	Andreas Hansson
	This patch changes the name of the Bus classes to XBar to better reflect the actual timing behaviour. The actual instances in the config scripts are not renamed, and remain as e.g. iobus or membus. As part of this renaming, the code has also been clean up slightly, making use of range-based for loops and tidying up some comments. The only changes outside the bus/crossbar code is due to the delay variables in the packet. --HG-- rename : src/mem/Bus.py => src/mem/XBar.py rename : src/mem/coherent_bus.cc => src/mem/coherent_xbar.cc rename : src/mem/coherent_bus.hh => src/mem/coherent_xbar.hh rename : src/mem/noncoherent_bus.cc => src/mem/noncoherent_xbar.cc rename : src/mem/noncoherent_bus.hh => src/mem/noncoherent_xbar.hh rename : src/mem/bus.cc => src/mem/xbar.cc rename : src/mem/bus.hh => src/mem/xbar.hh
2014-09-09	misc: Fix a number of unitialised variables and members	Andreas Hansson
	Static analysis unearther a bunch of uninitialised variables and members, and this patch addresses the problem. In all cases these omissions seem benign in the end, but at least fixing them means less false positives next time round.
2014-06-27	mem: write streaming support via WriteInvalidate promotion	Curtis Dunham
	Support full-block writes directly rather than requiring RMW: * a cache line is allocated in the cache upon receipt of a WriteInvalidateReq, not the WriteInvalidateResp. * only top-level caches allocate the line; the others just pass the request along and invalidate as necessary. * to close a timing window between the Req and the Resp, a new metadata bit tracks whether another cache has read a copy of the new line before the writeback to memory.
2014-05-13	cpu, mem: Make software prefetches non-blocking	Curtis Dunham
	Previously, they were treated so much like loads that they could stall at the head of the ROB. Now they are always treated like L1 hits. If they actually miss, a new request is created at the L1 and tracked from the MSHRs there if necessary (i.e. if it didn't coalesce with an existing outstanding load).
2014-05-13	mem: Refactor assignment of Packet types	Curtis Dunham
	Put the packet type swizzling (that is currently done in a lot of places) into a refineCommand() member function.
2014-05-09	mem: Squash prefetch requests from downstream caches	Mitch Hayenga
	This patch squashes prefetch requests from downstream caches, so that they do not steal cachelines away from caches closer to the cpu. It was originally coded by Mitch Hayenga and modified by Aasheesh Kolli.
2014-01-24	mem: Add support for a security bit in the memory system	Giacomo Gabrielli
	This patch adds the basic building blocks required to support e.g. ARM TrustZone by discerning secure and non-secure memory accesses.
2013-10-31	mem: Add "const" attribute to Packet getters	Stephan Diestelhorst
	Add a "const" keywords to the getters in the Packet class so these can be invoked on const Packet objects.
2013-04-22	mem: Adding verbose debug output in the memory system	Uri Wiener
	This patch provides useful printouts throughut the memory system. This includes pretty-printed cache tags and function call messages (call-stack like).
2013-02-19	mem: Make packet bus-related time accounting relative	Andreas Hansson
	This patch changes the bus-related time accounting done in the packet to be relative. Besides making it easier to align the cache timing to cache clock cycles, it also makes it possible to create a Last-Level Cache (LLC) directly to a memory controller without a bus inbetween. The bus is unique in that it does not ever make the packets wait to reflect the time spent forwarding them. Instead, the cache is currently responsible for making the packets wait. Thus, the bus annotates the packets with the time needed for the first word to appear, and also the last word. The cache then delays the packets in its queues before passing them on. It is worth noting that every object attached to a bus (devices, memories, bridges, etc) should be doing this if we opt for keeping this way of accounting for the bus timing.
2013-02-19	mem: Add deferred packet class to prefetcher	Andreas Hansson
	This patch removes the time field from the packet as it was only used by the preftecher. Similar to the packet queue, the prefetcher now wraps the packet in a deferred packet, which also has a tick representing the absolute time when the packet should be sent.
2013-02-19	mem: Fix SenderState related cache deadlock	Sascha Bischoff
	This patch fixes a potential deadlock in the caches. This deadlock could occur when more than one cache is used in a system, and pkt->senderState is modified in between the two caches. This happened as the caches relied on the senderState remaining unchanged, and used it for instantaneous upstream communication with other caches. This issue has been addressed by iterating over the linked list of senderStates until we are either able to cast to a MSHR* or senderState is NULL. If the cast is successful, we know that the packet has previously passed through another cache, and therefore update the downstreamPending flag accordingly. Otherwise, we do nothing.