summaryrefslogtreecommitdiff
path: root/src/mem/coherent_bus.cc
AgeCommit message (Collapse)Author
2013-05-30mem: Make returning snoop responses occupy response layerAndreas Hansson
This patch introduces a mirrored internal snoop port to facilitate easy addition of flow control for the snoop responses that are turned into normal responses on their return. To perform this, the slave ports of the coherent bus are wrapped in internal master ports that are passed as the source ports to the response layer in question. As a result of this patch, there is more contention for the response resources, and as such system performance will decrease slightly. A consequence of the mirrored internal port is that the port the bus tells to retry (the internal one) and the port actually retrying (the mirrored) one are not the same. Thus, the existing check in tryTiming is not longer correct. In fact, the test is redundant as the layer is only in the retry state while calling sendRetry on the waiting port, and if the latter does not immediately call the bus then the retry state is left. Consequently the check is removed.
2013-05-30mem: Make the buses multi layeredAndreas Hansson
This patch makes the buses multi layered, and effectively creates a crossbar structure with distributed contention ports at the destination ports. Before this patch, a bus could have a single request, response and snoop response in flight at any time, and with these changes there can be as many requests as connected slaves (bus master ports), and as many responses as connected masters (bus slave ports). Together with address interleaving, this patch enables us to create high-throughput memory interconnects, e.g. 50+ GByte/s.
2013-05-30mem: Separate the two snoop response cases in the busAndreas Hansson
This patch makes the flow control and state updates of the coherent bus more clear by separating the two cases, i.e. forward as a snoop response, or turn it into a normal response. With this change it is also more clear what resources are being occupied, and that we effectively bypass the busy check for the second case. As a result of the change in resource usage some stats change.
2013-05-30mem: Tidy up a few variables in the busAndreas Hansson
This patch does some minor housekeeping on the bus code, removing redundant code, and moving the extraction of the destination id to the top of the functions using it.
2013-05-30mem: Add basic stats to the busesUri Wiener
This patch adds a basic set of stats which are hard to impossible to implement using only communication monitors, and are needed for insight such as bus utilization, transactions through the bus etc. Stats added include throughput and transaction distribution, and also a two-dimensional vector capturing how many packets and how much data is exchanged between the masters and slaves connected to the bus.
2013-04-22mem: Adding verbose debug output in the memory systemUri Wiener
This patch provides useful printouts throughut the memory system. This includes pretty-printed cache tags and function call messages (call-stack like).
2013-03-26mem: Separate waiting for the bus and waiting for a peerAndreas Hansson
This patch splits the retryList into a list of ports that are waiting for the bus itself to become available, and a map that tracks the ports where forwarding failed due to a peer not accepting the packet. Thus, when a retry reaches the bus, it can be sent to the appropriate port that initiated that transaction. As a consequence of this patch, only ports that are really ready to go will get a retry, thus reducing the amount of redundant failed attempts. This patch also makes it easier to reason about the order of servicing requests as the ports waiting for the bus are now clearly FIFO and much easier to change if desired.
2013-02-19mem: Enforce strict use of busFirst- and busLastWordTimeAndreas Hansson
This patch adds a check to ensure that the delay incurred by the bus is not simply disregarded, but accounted for by someone. At this point, all the modules do is to zero it out, and no additional time is spent. This highlights where the bus timing is simply dropped instead of being paid for. As a follow up, the locations identified in this patch should add this additional time to the packets in one way or another. For now it simply acts as a sanity check and highlights where the delay is simply ignored. Since no time is added, all regressions remain the same.
2013-02-19mem: Make packet bus-related time accounting relativeAndreas Hansson
This patch changes the bus-related time accounting done in the packet to be relative. Besides making it easier to align the cache timing to cache clock cycles, it also makes it possible to create a Last-Level Cache (LLC) directly to a memory controller without a bus inbetween. The bus is unique in that it does not ever make the packets wait to reflect the time spent forwarding them. Instead, the cache is currently responsible for making the packets wait. Thus, the bus annotates the packets with the time needed for the first word to appear, and also the last word. The cache then delays the packets in its queues before passing them on. It is worth noting that every object attached to a bus (devices, memories, bridges, etc) should be doing this if we opt for keeping this way of accounting for the bus timing.
2013-02-19sim: Make clock private and access using clockPeriod()Andreas Hansson
This patch makes the clock member private to the ClockedObject and forces all children to access it using clockPeriod(). This makes it impossible to inadvertently change the clock, and also makes it easier to transition to a situation where the clock is derived from e.g. a clock domain, or through a multiplier.
2013-02-15sim: Add a system-global option to bypass cachesAndreas Sandberg
Virtualized CPUs and the fastmem mode of the atomic CPU require direct access to physical memory. We currently require caches to be disabled when using them to prevent chaos. This is not ideal when switching between hardware virutalized CPUs and other CPU models as it would require a configuration change on each switch. This changeset introduces a new version of the atomic memory mode, 'atomic_noncaching', where memory accesses are inserted into the memory system as atomic accesses, but bypass caches. To make memory mode tests cleaner, the following methods are added to the System class: * isAtomicMode() -- True if the memory mode is 'atomic' or 'direct'. * isTimingMode() -- True if the memory mode is 'timing'. * bypassCaches() -- True if caches should be bypassed. The old getMemoryMode() and setMemoryMode() methods should never be used from the C++ world anymore.
2012-11-02sim: Move the draining interface into a separate base classAndreas Sandberg
This patch moves the draining interface from SimObject to a separate class that can be used by any object needing draining. However, objects not visible to the Python code (i.e., objects not deriving from SimObject) still depend on their parents informing them when to drain. This patch also gets rid of the CountedDrainEvent (which isn't really an event) and replaces it with a DrainManager.
2012-10-11Mem: Determine bus block size during initialisationAndreas Hansson
This patch moves the block size computation from findBlockSize to initialisation time, once all the neighbouring ports are connected. There is no need to dynamically update the block size, and the caching of the value effectively avoided that anyhow. This is very similar to what was already in place, just with a slightly leaner implementation.
2012-07-09Port: Align port names in C++ and PythonAndreas Hansson
This patch is a first step to align the port names used in the Python world and the C++ world. Ultimately it serves to make the use of config.json together with output from the simulation easier, including post-processing of statistics. Most notably, the CPU, cache, and bus is addressed in this patch, and there might be other ports that should be updated accordingly. The dash name separator has also been replaced with a "." which is what is used to concatenate the names in python, and a separation is made between the master and slave port in the bus.
2012-07-09Bus: Split the bus into separate request/response layersAndreas Hansson
This patch splits the existing buses into multiple layers. The non-coherent bus is split into a request and a response layer, and the coherent bus adds an additional layer for the snoop responses. The layer is modified to be templatised on the port type, such that the different layers can have retryLists with either master or slave ports. This patch also removes the dynamic cast from the retry, as previously promised when moving the recvRetry from the port base class to the master/slave port respectively. Overall, the split bus more closely reflects any modern on-chip bus and should be at step in the right direction. From this point, it would be reasonable straight forward to add separate layers (and thus contention points and arbitration) for each port and thus create a true crossbar. The regressions all produce the correct output, but have varying degrees of changes to their statistics. A separate patch will be pushed with the updates to the reference statistics.
2012-07-09Bus: Add a notion of layers to the busesAndreas Hansson
This patch moves all flow control, arbitration and state information into a bus layer. The layer is thus responsible for all the state transitions, and for keeping hold of the retry list. Consequently the layer is also responsible for the draining. With this change, the non-coherent and coherent bus are given a single layer to avoid changing any temporal behaviour, but the patch opens up for adding more layers.
2012-07-09Bus: Replace tickNextIdle and inRetry with a state variableAndreas Hansson
This patch adds a state enum and member variable in the bus, tracking the bus state, thus eliminating the need for tickNextIdle and inRetry, and fixing an issue that allowed the bus to be occupied by multiple packets at once (hopefully it also makes it easier to understand the code). The bus, in its current form, uses tickNextIdle and inRetry to keep track of the state of the bus. However, it only updates tickNextIdle _after_ forwarding a packet using sendTiming, and the result is that the bus is still seen as idle, and a module that receives the packet and starts transmitting new packets in zero time will still see the bus as idle (and this is done by a number of DMA devices). The issue can also be seen in isOccupied where the bus calls reschedule on an event instead of schedule. This patch addresses the problem by marking the bus as _not_ idle already by the time we conclude that the bus is not occupied and we will deal with the packet. As a result of not allowing multiple packets to occupy the bus, some regressions have slight changes in their statistics. A separate patch updates these accordingly. Further ahead, a follow-on patch will introduce a separate state variable for request/responses/snoop responses, and thus implement a split request/response bus with separate flow control for the different message types (even further ahead it will introduce a multi-layer bus).
2012-07-09Port: Add isSnooping to slave port (asking master port)Andreas Hansson
This patch adds isSnooping to the slave port, and thus avoids going through getMasterPort to be able to ask the master. Over the course of the next few patches, all getMasterPort/getSlavePort in Port and MemObject are to be protocol agnostic, and the snooping is part of the protocol layer. The function is already present on the master port, where it is implemented by the module itself, e.g. a cache. On the slave side, it is merely asking the connected master port. The same name is used by both functions despite their difference in behaviour. The initial design used isMasterSnooping on the slave port side, but the more verbose function name was later changed.
2012-05-31Bus: Split the bus into a non-coherent and coherent busAndreas Hansson
This patch introduces a class hierarchy of buses, a non-coherent one, and a coherent one, splitting the existing bus functionality. By doing so it also enables further specialisation of the two types of buses. A non-coherent bus connects a number of non-snooping masters and slaves, and routes the request and response packets based on the address. The request packets issued by the master connected to a non-coherent bus could still snoop in caches attached to a coherent bus, as is the case with the I/O bus and memory bus in most system configurations. No snoops will, however, reach any master on the non-coherent bus itself. The non-coherent bus can be used as a template for modelling PCI, PCIe, and non-coherent AMBA and OCP buses, and is typically used for the I/O buses. A coherent bus connects a number of (potentially) snooping masters and slaves, and routes the request and response packets based on the address, and also forwards all requests to the snoopers and deals with the snoop responses. The coherent bus can be used as a template for modelling QPI, HyperTransport, ACE and coherent OCP buses, and is typically used for the L1-to-L2 buses and as the main system interconnect. The configuration scripts are updated to use a NoncoherentBus for all peripheral and I/O buses. A bit of minor tidying up has also been done. --HG-- rename : src/mem/bus.cc => src/mem/coherent_bus.cc rename : src/mem/bus.hh => src/mem/coherent_bus.hh rename : src/mem/bus.cc => src/mem/noncoherent_bus.cc rename : src/mem/bus.hh => src/mem/noncoherent_bus.hh