path: root/src/mem
Age Commit message Author
2013-04-22sim: separate nextCycle() and clockEdge() in clockedObjectsDam Sunwoo
Previously, nextCycle() could return the *current* cycle if the current tick was already aligned with the clock edge. This behavior is not only confusing (it is not what the function name implies), but also caused problems in the drainResume() function. When exiting and re-entering the sim loop (e.g., to take checkpoints), the CPUs drain and resume. Due to the previous behavior of nextCycle(), the CPU tick events were being rescheduled in the same ticks that had already been processed before draining. This caused divergence from runs that did not exit and re-enter the sim loop (initially a cycle difference, but with a significant impact later on). This patch separates the two behaviors into nextCycle() and clockEdge(), uses nextCycle() in drainResume(), and uses clockEdge() everywhere else. Nothing (other than the name) should change except for the drainResume() timing.
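The distinction described above can be sketched with plain integer ticks. This is a minimal Python model, not the gem5 implementation: the function names are hypothetical, and the result of next_cycle() for an unaligned tick is an assumption of the sketch (the commit message only describes the aligned case).

```python
def clock_edge(cur_tick: int, period: int) -> int:
    """Earliest clock edge at or after cur_tick (cur_tick itself if aligned)."""
    return -(-cur_tick // period) * period  # ceiling division, then scale


def next_cycle(cur_tick: int, period: int) -> int:
    """One full period after clock_edge(), so never the current tick.

    Assumption: for an unaligned tick this returns the edge after the
    upcoming one; the source only states the aligned behaviour.
    """
    return clock_edge(cur_tick, period) + period


# Aligned tick: clock_edge() stays put, next_cycle() moves forward.
aligned_edge = clock_edge(2000, 1000)
aligned_next = next_cycle(2000, 1000)
```

With an aligned tick, rescheduling at next_cycle() rather than clock_edge() is what keeps a resumed CPU from re-running a tick it already processed.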
2013-04-17ruby: moesi cmp directory: add copyright noticeNilay Vaish
2013-04-09Ruby: Fix RubyPort evict packet memory leakJoel Hestness
When using the o3 or inorder CPUs with many Ruby protocols, the caches may need to forward invalidations to the CPUs. The RubyPort was instantiating a packet to be sent to the CPUs to signal the eviction, but the packets were not being freed by the CPUs. Consistent with the classic memory model, stack allocate the packet and heap allocate the request, so that when ruby_eviction_callback() completes, the packet destructor is called and deletes the request. (Note: stack allocating the request causes double deletion, since it will also be deleted in the packet destructor.) This results in the fewest memory allocations without memory errors.
2013-04-09Ruby: Delete packet requests during warmupJoel Hestness
When warming up caches in Ruby, the CacheRecorder sends fetch requests into Ruby Sequencers with packet types that require responses. Since responses are never generated for these CacheRecorder requests, the requests are not deleted in the packet destructor called from the Ruby hit callback. Free the request.
2013-04-09Ruby: Add field to slicc machine for generic typeJoel Hestness
This allows you to have, e.g., an L2 cache that is not named "L2Cache" but is still a GenericMachineType_L2Cache. This is particularly helpful if the protocol has multiple L2 controllers.
2013-04-09Ruby: Order profilers based on versionJoel Hestness
When Ruby stats are printed for events and transitions, they include stats for all of the controllers of the same type, but they are not necessarily printed in order of the controller ID "version", because of the way the profilers were added to the profiler vector. This patch fixes the push order problem so that the stats are printed in ascending order 0->(# controllers), so statistics parsers may correctly assume the controller to which the stats belong.
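The ordering fix can be illustrated with a small Python sketch. It assumes the problem was appending profilers in registration order; the helper name and the list-of-profilers layout are hypothetical, not the Ruby code.

```python
def register_profiler(profilers: list, version: int, profiler) -> None:
    """Store the profiler at the index given by its controller version."""
    if version >= len(profilers):
        # Grow the list so that index `version` exists.
        profilers.extend([None] * (version + 1 - len(profilers)))
    profilers[version] = profiler


profilers = []
# Controllers may register in any order...
for version in (2, 0, 1):
    register_profiler(profilers, version, f"profiler{version}")
# ...but iteration is now in ascending version order.
```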
2013-04-09Ruby: More descriptive message buffer connection fatalJason Power
When connecting message buffers between Ruby controllers, it is easy to mistakenly connect multiple controllers to the same message buffer. This patch prints a more descriptive fatal message than the previous assert statement in order to facilitate easier debugging.
2013-04-09Ruby: Fix typo in Slicc if-statement AST errorJason Power
Before this patch, the error in the SLICC code was hidden by a Python error in the SLICC parser.
2013-04-07Ruby System, Cache Recorder: Use delete [] for trace varsJoel Hestness
The cache trace variables are array allocated uint8_t* in the RubySystem and the Ruby CacheRecorder, but the code used delete to free the memory, resulting in Valgrind memory errors. Change these deletes to delete [] to get rid of the errors.
2013-03-27mem: Fix cache latency bugMitch Hayenga
Fixes a latency calculation bug for accesses during a cache line fill. Under a cache miss, before the line is filled, accesses to the cache are associated with a MSHR and marked as targets. Once the line fill completes, MSHR target packets pay an additional latency of "responseLatency + busSerializationLatency". However, the "whenReady" field of the cache line is only set to an additional delay of "busSerializationLatency". This lacks the responseLatency component of the fill. It is possible for accesses that occur on the cycle of (or briefly after) the line fill to respond without properly paying the responseLatency. This also creates the situation where two accesses to the same address may be serviced in an order opposite of how they were received by the cache. For stores to the same address, this means that although the cache performs the stores in the order they were received, acknowledgements may be sent in a different order. Adding the responseLatency component to the whenReady field preserves the penalty that should be paid and prevents these ordering issues. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
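The arithmetic behind the fix can be sketched as follows. This is an illustrative Python model with hypothetical names and made-up latency values, not the cache code itself.

```python
def when_ready(fill_tick: int, response_latency: int,
               bus_serialization_latency: int,
               include_response_latency: bool = True) -> int:
    """Tick at which a freshly filled line may service new accesses."""
    delay = bus_serialization_latency
    if include_response_latency:
        delay += response_latency  # the component the patch adds
    return fill_tick + delay


FILL, RESP, BUS = 1000, 2, 4           # example values only
mshr_target_done = FILL + RESP + BUS   # what queued MSHR targets pay
old_ready = when_ready(FILL, RESP, BUS, include_response_latency=False)
new_ready = when_ready(FILL, RESP, BUS)
```

In this model, an access arriving at the fill tick could previously be answered at old_ready, ahead of earlier MSHR targets; with the fix it cannot complete before new_ready, which equals mshr_target_done.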
2013-03-26mem: Cancel cache retry event when blocking portRene de Jong
This patch solves the corner case scenario where the sendRetryEvent could be scheduled twice, when an io device stresses the IOcache in the system. This should not be possible in the cache system.
2013-03-26mem: Separate waiting for the bus and waiting for a peerAndreas Hansson
This patch splits the retryList into a list of ports that are waiting for the bus itself to become available, and a map that tracks the ports where forwarding failed due to a peer not accepting the packet. Thus, when a retry reaches the bus, it can be sent to the appropriate port that initiated that transaction. As a consequence of this patch, only ports that are really ready to go will get a retry, thus reducing the amount of redundant failed attempts. This patch also makes it easier to reason about the order of servicing requests as the ports waiting for the bus are now clearly FIFO and much easier to change if desired.
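A minimal Python sketch of the two structures described above: a FIFO of ports waiting for the bus, and a map from a refusing peer to the port that must be retried. Class and method names are hypothetical, and tracking a single waiting port per peer is an assumption of the sketch.

```python
from collections import deque


class RetryTracker:
    def __init__(self):
        self.wait_for_bus = deque()  # ports waiting for the bus, in FIFO order
        self.wait_for_peer = {}      # refusing peer port -> waiting source port

    def bus_was_busy(self, src_port):
        self.wait_for_bus.append(src_port)

    def peer_refused(self, src_port, dest_port):
        self.wait_for_peer[dest_port] = src_port

    def bus_became_free(self):
        """Return the next port to retry, or None if nobody is waiting."""
        return self.wait_for_bus.popleft() if self.wait_for_bus else None

    def peer_sent_retry(self, dest_port):
        """Return the port whose packet dest_port refused, if any."""
        return self.wait_for_peer.pop(dest_port, None)


tracker = RetryTracker()
tracker.bus_was_busy("cpu0")
tracker.bus_was_busy("cpu1")
tracker.peer_refused("dma", "mem_ctrl")
```

Only the port that is actually able to make progress receives a retry, which is the reduction in redundant failed attempts that the commit describes.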
2013-03-26mem: Introduce a variable for the retrying portAndreas Hansson
This patch introduces a variable to keep track of the retrying port instead of relying on it being the front of the retryList. Besides the improvement in readability, this patch is a step towards separating out the two cases where a port is waiting for the bus to be free, and where the forwarding did not succeed and the bus is waiting for a retry to pass on to the original initiator of the transaction. The changes made are currently such that the regressions are not affected. This is ensured by always prioritizing the currently retrying port and putting it back at the front of the retry list.
2013-03-26mem: Add optional request flags to the packet traceAndreas Hansson
This patch adds an optional flags field to the packet trace to encode the request flags, which contain information about whether the request is (un)cacheable, an instruction fetch, a prefetch, etc.
2013-03-22ruby: slicc: set sender, receiver clock objs for optional queueNilay Vaish
2013-03-22ruby: message buffer: correct previous errorsNilay Vaish
A recent set of patches added support for multiple clock domains to ruby. I had made some errors while writing those patches. The sender was using the receiver side clock while enqueuing a message in the buffer. Those errors became visible while creating (or restoring from) checkpoints. The errors also become visible when a multi eventq scenario occurs.
2013-03-22ruby: message buffer: remove _ptr from some variablesNilay Vaish
The names were getting too long.
2013-03-22ruby: message buffer node: used Tick in place of CyclesNilay Vaish
The message buffer node used to keep time in terms of Cycles. Since the sender and the receiver can have different clock periods, storing node time in cycles requires some conversion. Instead store the time directly in Ticks.
2013-03-22ruby: consumer: avoid using receiver side clockNilay Vaish
A set of patches was recently committed to allow multiple clock domains in ruby. In those patches, I had inadvertently made an incorrect use of the clocks. Suppose object A needs to schedule an event on object B. It was possible for A to access B's clock to schedule the event. This is not possible in an actual system. Hence, changes are being made to the Consumer class so as to avoid such happenings. Note that in a multi eventq simulation, this can possibly lead to an incorrect simulation. There are two functions in the Consumer class that are used for scheduling events. The first function takes the relative delay over the current time as its argument and adds the current time to it for scheduling the event. The second function takes the absolute time (in ticks) for scheduling the event. The first function is now being moved to the protected section of the class so that only objects of the derived classes can use it. All other objects will have to specify an absolute time while scheduling an event for some consumer.
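The split between relative and absolute scheduling can be sketched in Python. The classes and method names below are hypothetical; the leading underscore stands in for the C++ protected section, and time is passed in explicitly to keep the sketch self-contained.

```python
class Consumer:
    def __init__(self, clock_period: int):
        self.clock_period = clock_period
        self.scheduled_ticks = []

    def _schedule_relative(self, cur_tick: int, delay_cycles: int) -> None:
        """Relative delay: meant for the consumer (or a subclass) only,
        because it converts cycles using the consumer's own clock."""
        self.schedule_absolute(cur_tick + delay_cycles * self.clock_period)

    def schedule_absolute(self, tick: int) -> None:
        """Absolute tick: safe for any other object to call."""
        self.scheduled_ticks.append(tick)


class Sender:
    def __init__(self, clock_period: int):
        self.clock_period = clock_period

    def enqueue(self, consumer: Consumer, cur_tick: int, delay_cycles: int) -> int:
        # The sender converts cycles to ticks with its *own* clock period
        # and never reads the consumer's clock.
        arrival = cur_tick + delay_cycles * self.clock_period
        consumer.schedule_absolute(arrival)
        return arrival


sender = Sender(clock_period=500)
receiver = Consumer(clock_period=1000)
arrival_tick = sender.enqueue(receiver, cur_tick=2000, delay_cycles=2)
```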
2013-03-22ruby: remove unused profile functionsNilay Vaish
2013-03-22ruby: keep histogram of outstanding requests in seqNilay Vaish
The histogram for tracking outstanding request counts per cycle is maintained in the profiler. For a parallel implementation of the memory system, this histogram needs to be maintained locally. Hence it will now be kept in the sequencer itself. The resulting histograms will be merged when the stats are printed.
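Keeping a histogram per sequencer and merging at print time can be sketched as below. This is a Python illustration with hypothetical class and attribute names, using a Counter as a stand-in for the histogram type.

```python
from collections import Counter


class Sequencer:
    def __init__(self):
        # Local histogram: outstanding-request count -> number of samples.
        self.outstanding_hist = Counter()

    def sample(self, outstanding: int) -> None:
        self.outstanding_hist[outstanding] += 1


def merge_histograms(sequencers) -> Counter:
    """Combine the per-sequencer histograms when stats are printed."""
    total = Counter()
    for seq in sequencers:
        total.update(seq.outstanding_hist)  # adds counts bucket by bucket
    return total


seq_a, seq_b = Sequencer(), Sequencer()
seq_a.sample(1)
seq_a.sample(2)
seq_b.sample(2)
merged = merge_histograms([seq_a, seq_b])
```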
2013-03-22slicc: remove check if the L1Cache has a sequencerNilay Vaish
2013-03-22ruby: move stall and wakeup functions to AbstractControllerNilay Vaish
These functions are currently implemented in one of the files related to Slicc. Since these are purely C++ functions, they are better suited to be in the base class.
2013-03-22ruby: connect two controllers using only message buffersNilay Vaish
This patch modifies ruby so that two controllers can be connected to each other with only message buffers in between. Before this patch, all the controllers had to be connected to the network for them to communicate with each other. With this patch, one can have protocols where a controller is not connected to the network, but communicates with another controller through a message buffer.
2013-03-22ruby: convert Topology to regular classNilay Vaish
The Topology class in Ruby does not need to inherit from SimObject class. This patch turns it into a regular class. The topology object is now created in the constructor of the Network class. All the parameters for the topology class have been moved to the network class.
2013-03-22ruby: network: move routers from topology to networkNilay Vaish
2013-03-18mem: Fix missing delete of packet in DRAM accessAndreas Hansson
This patch fixes a memory leak caused by not deleting packets that require no response.
2013-03-15ruby: set: corrects csprintf() call introduced by 7d95b650c9b6Nilay Vaish
2013-03-07ruby: Fix gcc 4.8 maybe-uninitialized compilation errorAndreas Hansson
This patch fixes the one-and-only gcc 4.8 compilation error, being a warning about "maybe uninitialized" in Orion.
2013-03-06ruby: remove the functional copy of memory in se modeNilay Vaish
This patch removes the functional copy of the memory that was maintained in the se mode. Now ruby itself will provide the data.
2013-03-06ruby: garnet: fixed: implement functional accessNilay Vaish
2013-03-02ruby: fixes functional writes to RubyRequestBlake Hechtman, Nilay Vaish <nilay@cs.wisc.edu>
The functional write code was assuming that all writes are block sized, which may not be true for Ruby Requests. This bug can lead to a buffer overflow. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-03-01mem: Add check if SimpleDRAM nextReqEvent is scheduledAndreas Hansson
This check covers a case where a retry is called from the SimpleDRAM causing a new request to appear before the DRAM itself schedules a nextReqEvent. By adding this check, the event is not scheduled twice.
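The guard can be sketched in Python as follows. The Event and controller classes are hypothetical stand-ins, not the SimpleDRAM code; the point is only the "schedule if not already scheduled" pattern.

```python
class Event:
    def __init__(self):
        self.when = None  # tick the event is scheduled for, or None

    def scheduled(self) -> bool:
        return self.when is not None


class DramControllerSketch:
    def __init__(self):
        self.next_req_event = Event()
        self.schedule_calls = 0

    def _schedule(self, event: Event, tick: int) -> None:
        # Mirrors the usual rule that an event may not be scheduled twice.
        assert not event.scheduled(), "event already scheduled"
        event.when = tick
        self.schedule_calls += 1

    def request_arrived(self, cur_tick: int) -> None:
        # The check: only schedule if no next-request event is pending.
        if not self.next_req_event.scheduled():
            self._schedule(self.next_req_event, cur_tick)


ctrl = DramControllerSketch()
ctrl.request_arrived(100)  # schedules the event
ctrl.request_arrived(100)  # a retry delivers another request: no reschedule
```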
2013-03-01mem: Add a method to build multi-channel DRAM configurationsAndreas Hansson
This patch adds a class method that allows easy creation of channel-interleaved multi-channel DRAM configurations. It is enabled by a class method to allow customisation of the class independent of the channel configuration. For example, the user can create a MyDDR subclass of e.g. SimpleDDR3, and then create a four-channel configuration of the subclass by calling MyDDR.makeMultiChannel(4, mem_start, mem_size).
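Why a class method helps here can be shown with a plain Python sketch. The classes below are hypothetical models, not gem5 SimObjects, and the constructor parameters are assumptions; only the call shape MyDDR.makeMultiChannel(4, mem_start, mem_size) comes from the commit message.

```python
class SimpleDDR3Sketch:
    burst_length = 8  # illustrative default parameter

    def __init__(self, start: int, size: int, channel: int, num_channels: int):
        self.start = start
        self.size = size
        self.channel = channel
        self.num_channels = num_channels

    @classmethod
    def makeMultiChannel(cls, num_channels: int, mem_start: int, mem_size: int):
        # `cls` is whichever class the method was called on, so a subclass
        # gets instances of itself, including its customised parameters.
        return [cls(mem_start, mem_size, ch, num_channels)
                for ch in range(num_channels)]


class MyDDR(SimpleDDR3Sketch):
    burst_length = 4  # user customisation, independent of channel count


controllers = MyDDR.makeMultiChannel(4, 0, 1 << 30)
```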
2013-03-01mem: SimpleDRAM variable naming and whitespace fixesAndreas Hansson
This patch fixes a number of small cosmetic issues in the SimpleDRAM module. The most important change is to move the accounting of received packets to after the check is made if the packet should be retried or not. Thus, packets are only counted if they are actually accepted.
2013-03-01mem: Add support for multi-channel DRAM configurationsAndreas Hansson
This patch adds support for multi-channel instances of the DRAM controller model by stripping away the channel bits in the address decoding. The patch relies on the availability of address interleaving and, at this time, it is up to the user to configure the interleaving appropriately. At the moment it is assumed that the channel interleaving bits immediately follow the column bits (the smallest sensible interleaving). Convenience methods for building multi-channel configurations will be added later.
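Stripping the channel bits can be sketched with bit operations. This is a Python illustration assuming the channel bits sit directly above a block of low-order bits (the column bits in the commit's wording); the function names and bit widths are hypothetical.

```python
def channel_of(addr: int, low_bits: int, channel_bits: int) -> int:
    """Extract the channel index from the bits just above the low bits."""
    return (addr >> low_bits) & ((1 << channel_bits) - 1)


def strip_channel_bits(addr: int, low_bits: int, channel_bits: int) -> int:
    """Remove the channel bits, closing the gap they leave behind."""
    low = addr & ((1 << low_bits) - 1)
    high = addr >> (low_bits + channel_bits)
    return (high << low_bits) | low


LOW_BITS, CHANNEL_BITS = 10, 2  # example widths only
addr = (5 << 12) | (3 << 10) | 0x155  # upper bits 5, channel 3, low bits 0x155
```

After stripping, each controller decodes a dense address as if it were the only channel.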
2013-03-01mem: Merge interleaved ranges when creating backing storeAndreas Hansson
This patch adds merging of interleaved ranges before creating the backing stores. The backing stores are always a contiguous chunk of the address space, and with this patch it is possible to have interleaved memories in the system.
2013-03-01mem: Merge ranges in bus before passing them onAndreas Hansson
This patch adds basic merging of address ranges to the bus, such that interleaved ranges are merged together before being passed on by the bus. As such, the bus aggregates the address ranges of the connected slave ports and then passes on the merged ranges through its master ports. The bus thus hides the complexity of the interleaved ranges and only exposes contiguous ranges to the surrounding system. As part of this patch, the bus ranges are also cached for any future queries.
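The merging step can be sketched in Python. The tuple layout (start, end, interleave count, interleave match) is an assumption made for this illustration and is not the gem5 address range class; treating an incomplete interleaved set as an error is likewise a choice of the sketch.

```python
from collections import defaultdict


def merge_ranges(ranges):
    """Collapse interleaved pieces into contiguous (start, end) ranges.

    Each input is (start, end, intlv_count, intlv_match); a count of 1
    (or less) means the range is not interleaved.
    """
    merged = []
    groups = defaultdict(set)
    for start, end, count, match in ranges:
        if count <= 1:
            merged.append((start, end))
        else:
            groups[(start, end, count)].add(match)
    for (start, end, count), matches in groups.items():
        # Only expose the range if every interleaved piece is present.
        if matches != set(range(count)):
            raise ValueError("incomplete set of interleaved ranges")
        merged.append((start, end))
    return sorted(merged)


four_channels = [(0x0, 0xFFFF, 4, m) for m in range(4)]
plain = [(0x10000, 0x1FFFF, 1, 0)]
exposed = merge_ranges(four_channels + plain)
```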
2013-02-28ruby: mesi coherence protocol: invalidate lockDibakar Gope, Nilay Vaish <nilay@cs.wisc.edu>
The MESI CMP directory coherence protocol, while transitioning from SM to IM, did not invalidate the lock that it might have taken on a cache line. This patch adds an action for doing so. The problem was found by Dibakar, but I was not happy with his proposed solution. So I implemented a different solution. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-02-19slicc: remove unused variable message_buffer_namesNilay Vaish
2013-02-19ruby: remove unused variable m_print_config in class TopologyNilay Vaish
2013-02-19mem: Fix sender state bug and delay poppingAndreas Hansson
This patch fixes a newly introduced bug where the sender state was popped before checking that it should be. Amazingly all regressions pass, but Linux fails to boot on the detailed CPU with caches enabled.
2013-02-19scons: Fix warnings issued by clang 3.2svn (XCode 4.6)Andreas Hansson
This patch fixes the warnings that clang 3.2svn emits due to the "-Wall" flag. There is one case of an uninitialised value in the ARM neon ISA description, and then a whole range of unused private fields that are pruned.
2013-02-19scons: Add warning for missing declarationsAndreas Hansson
This patch enables warnings for missing declarations. To avoid issues with SWIG-generated code, the warning is only applied to non-SWIG code.
2013-02-19scons: Fix up numerous warnings about name shadowingAndreas Hansson
This patch addresses the most important name shadowing warnings (as produced when using gcc/clang with -Wshadow). There are many locations where constructor parameters and function parameters shadow local variables, but these are left unchanged.
2013-02-19mem: Enforce strict use of busFirst- and busLastWordTimeAndreas Hansson
This patch adds a check to ensure that the delay incurred by the bus is not simply disregarded, but accounted for by someone. At this point, all the modules do is to zero it out, and no additional time is spent. This highlights where the bus timing is simply dropped instead of being paid for. As a follow up, the locations identified in this patch should add this additional time to the packets in one way or another. For now it simply acts as a sanity check and highlights where the delay is simply ignored. Since no time is added, all regressions remain the same.
2013-02-19mem: Change accessor function names to match the port interfaceAndreas Hansson
This patch changes the names of the cache accessor functions to be in line with those used by the ports. This is done to avoid confusion and get closer to a one-to-one correspondence between the interface of the memory object (the cache in this case) and the port itself. The member function timingAccess has been split into a snoop/non-snoop part to avoid branching on the isResponse() of the packet.
2013-02-19mem: Make packet bus-related time accounting relativeAndreas Hansson
This patch changes the bus-related time accounting done in the packet to be relative. Besides making it easier to align the cache timing to cache clock cycles, it also makes it possible to connect a Last-Level Cache (LLC) directly to a memory controller without a bus in between. The bus is unique in that it does not ever make the packets wait to reflect the time spent forwarding them. Instead, the cache is currently responsible for making the packets wait. Thus, the bus annotates the packets with the time needed for the first word to appear, and also the last word. The cache then delays the packets in its queues before passing them on. It is worth noting that every object attached to a bus (devices, memories, bridges, etc.) should be doing this if we opt for keeping this way of accounting for the bus timing.
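The relative accounting can be sketched in Python. The field names, the delay formula, and the helper functions are assumptions made for this illustration; the commit only states that the bus annotates first-word and last-word times and that the receiver is responsible for paying them.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size: int                      # payload size in bytes
    bus_first_word_delay: int = 0  # ticks, relative to "now"
    bus_last_word_delay: int = 0   # ticks, relative to "now"


def bus_annotate(pkt: Packet, header_ticks: int, bus_width: int,
                 clock_period: int) -> None:
    """The bus does not delay the packet; it only records relative delays."""
    beats = -(-pkt.size // bus_width)  # ceiling division
    pkt.bus_first_word_delay = header_ticks
    pkt.bus_last_word_delay = header_ticks + beats * clock_period


def cache_ready_tick(pkt: Packet, cur_tick: int) -> int:
    """The receiver pays the delay, then clears it so it is not paid twice."""
    ready = cur_tick + pkt.bus_last_word_delay
    pkt.bus_first_word_delay = 0
    pkt.bus_last_word_delay = 0
    return ready


pkt = Packet(size=64)
bus_annotate(pkt, header_ticks=1000, bus_width=16, clock_period=1000)
annotated = (pkt.bus_first_word_delay, pkt.bus_last_word_delay)
ready_at = cache_ready_tick(pkt, cur_tick=10000)
```

Because the delays are relative, the receiver can add them to whatever tick it considers "now", which is what makes aligning to its own clock straightforward.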
2013-02-19mem: Add deferred packet class to prefetcherAndreas Hansson
This patch removes the time field from the packet as it was only used by the prefetcher. Similar to the packet queue, the prefetcher now wraps the packet in a deferred packet, which also has a tick representing the absolute time when the packet should be sent.
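The wrapper idea can be sketched in Python. The class and method names are hypothetical, and the queue below assumes packets are pushed in non-decreasing send-time order.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DeferredPacket:
    tick: int  # absolute time at which the packet should be sent
    pkt: Any   # the wrapped packet, which no longer carries a time field


class PrefetchQueueSketch:
    def __init__(self):
        self.queue = deque()

    def push(self, pkt: Any, cur_tick: int, latency: int) -> None:
        self.queue.append(DeferredPacket(cur_tick + latency, pkt))

    def pop_ready(self, cur_tick: int) -> Optional[Any]:
        """Return the head packet if its send time has been reached."""
        if self.queue and self.queue[0].tick <= cur_tick:
            return self.queue.popleft().pkt
        return None


pf_queue = PrefetchQueueSketch()
pf_queue.push("prefetch A", cur_tick=100, latency=50)
```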
2013-02-19sim: Make clock private and access using clockPeriod()Andreas Hansson
This patch makes the clock member private to the ClockedObject and forces all children to access it using clockPeriod(). This makes it impossible to inadvertently change the clock, and also makes it easier to transition to a situation where the clock is derived from e.g. a clock domain, or through a multiplier.