summaryrefslogtreecommitdiff
path: root/src/mem
AgeCommit message (Collapse)Author
2013-04-23sim: Fix two bugs relating to software caching of PageTable entries.Mitch Hayenga
The existing implementation can read uninitialized data or stale information from the cached PageTable entries. 1) Add a valid bit for the cache entries. Simply using zero for the virtual address to signify invalid entries is not sufficient. Speculative, wrong-path accesses frequently access page zero. The current implementation would return a uninitialized TLB entry when address zero was accessed and the PageTable cache entry was invalid. 2) When unmapping/mapping/remaping a page, invalidate the corresponding PageTable cache entry if one already exists.
2013-04-23ruby: mesi coherence protocol: remove unused state M_MBNilay Vaish
2013-04-23ruby: patch checkpoint restore with garnetNilay Vaish
Due to recent changes to clocking system in Ruby and the way Ruby restores state from a checkpoint, garnet was failing to run from a checkpointed state. The problem is that Ruby resets the time to zero while warming up the caches. If any component records a local copy of the time (read calls curCycle()) before the simulation has started, then that component will not operate until that time is reached. In the context of this particular patch, the Garnet Network class calls curCycle() at multiple places. Any non-operational component can block in requests in the memory system, which the system interprets as a deadlock. This patch makes changes so that Garnet can successfully run from checkpointed state. It adds a globally visible time at which the actual execution started. This time is initialized in RubySystem::startup() function. This variable is only meant for components with in Ruby. This replaces the private variable that was maintained within Garnet since it is not possible to figure out the correct time when the value of this variable can be set. The patch also does away with all cases where curCycle() is called with in some Ruby component before the system has actually started executing. This is required due to the quirky manner in which ruby restores from a checkpoint.
2013-04-22mem: Address mapping with fine-grained channel interleavingAndreas Hansson
This patch adds an address mapping scheme where the channel interleaving takes place on a cache line granularity. It is similar to the existing RaBaChCo that interleaves on a DRAM page, but should give higher performance when there is less locality in the address stream.
2013-04-22mem: More descriptive enum names for address mappingAndreas Hansson
This patch changes the slightly ambigious names used for the address mapping scheme to be more descriptive, and actually spell out what they do. With this patch we also open up for adding more flavours of open- and close-type mappings, i.e. interleaving across channels with the open map.
2013-04-22mem: Add a WideIO DRAM configurationAndreas Hansson
This patch adds a WideIO 200 MHz configuration that can be used as a baseline to compare with DDRx and LPDDRx. Note that it is a single channel and that it should be replicated 4 times. It is based on publically available information and attempts to capture an envisioned 8 Gbit single-die part (i.e. without TSVs).
2013-04-22mem: Adding verbose debug output in the memory systemUri Wiener
This patch provides useful printouts throughut the memory system. This includes pretty-printed cache tags and function call messages (call-stack like).
2013-04-22mem: Replace check with panic where inhibited should not happenAndreas Hansson
This patch changes the SimpleTimingPort and RubyPort to panic on inhibited requests as this should never happen in either of the cases. The SimpleTimingPort is only used for the I/O devices PIO port and the DMA devices config port and should thus never see an inhibited request. Similarly, the SimpleTimingPort is also used for the MessagePort in x86, and there should also not be any cases where the port sees an inhibited request.
2013-04-22sim: separate nextCycle() and clockEdge() in clockedObjectsDam Sunwoo
Previously, nextCycle() could return the *current* cycle if the current tick was already aligned with the clock edge. This behavior is not only confusing (not quite what the function name implies), but also caused problems in the drainResume() function. When exiting/re-entering the sim loop (e.g., to take checkpoints), the CPUs will drain and resume. Due to the previous behavior of nextCycle(), the CPU tick events were being rescheduled in the same ticks that were already processed before draining. This caused divergence from runs that did not exit/re-entered the sim loop. (Initially a cycle difference, but a significant impact later on.) This patch separates out the two behaviors (nextCycle() and clockEdge()), uses nextCycle() in drainResume, and uses clockEdge() everywhere else. Nothing (other than name) should change except for the drainResume timing.
2013-04-17ruby: moesi cmp directory: add copyright noticeNilay Vaish
2013-04-09Ruby: Fix RubyPort evict packet memory leakJoel Hestness
When using the o3 or inorder CPUs with many Ruby protocols, the caches may need to forward invalidations to the CPUs. The RubyPort was instantiating a packet to be sent to the CPUs to signal the eviction, but the packets were not being freed by the CPUs. Consistent with the classic memory model, stack allocate the packet and heap allocate the request so on ruby_eviction_callback() completion, the packet deconstructor is called, and deletes the request (*Note: stack allocating the request causes double deletion, since it will be deleted in the packet destructor). This results in the least memory allocations without memory errors.
2013-04-09Ruby: Delete packet requests during warmupJoel Hestness
When warming up caches in Ruby, the CacheRecorder sends fetch requests into Ruby Sequencers with packet types that require responses. Since responses are never generated for these CacheRecorder requests, the requests are not deleted in the packet destructor called from the Ruby hit callback. Free the request.
2013-04-09Ruby: Add field to slicc machine for generic typeJoel Hestness
This allows you to have (i.e.) an L2 cache that is not named "L2Cache" but is still a GenericMachineType_L2Cache. This is particularly helpful if the protocol has multiple L2 controllers.
2013-04-09Ruby: Order profilers based on versionJoel Hestness
When Ruby stats are printed for events and transitions, they include stats for all of the controllers of the same type, but they are not necessarily printed in order of the controller ID "version", because of the way the profilers were added to the profiler vector. This patch fixes the push order problem so that the stats are printed in ascending order 0->(# controllers), so statistics parsers may correctly assume the controller to which the stats belong.
2013-04-09Ruby: More descriptive message buffer connection fatalJason Power
When connecting message buffers between Ruby controllers, it is easy to mistakenly connect multiple controllers to the same message buffer. This patch prints a more descriptive fatal message than the previous assert statement in order to facilitate easier debugging.
2013-04-09Ruby: Fix typo in Slicc if-statement AST errorJason Power
The error in the SLICC code was hidden by the python error in SLICC parser before this patch
2013-04-07Ruby System, Cache Recorder: Use delete [] for trace varsJoel Hestness
The cache trace variables are array allocated uint8_t* in the RubySystem and the Ruby CacheRecorder, but the code used delete to free the memory, resulting in Valgrind memory errors. Change these deletes to delete [] to get rid of the errors.
2013-03-27mem: Fix cache latency bugMitch Hayenga
Fixes a latency calculation bug for accesses during a cache line fill. Under a cache miss, before the line is filled, accesses to the cache are associated with a MSHR and marked as targets. Once the line fill completes, MSHR target packets pay an additional latency of "responseLatency + busSerializationLatency". However, the "whenReady" field of the cache line is only set to an additional delay of "busSerializationLatency". This lacks the responseLatency component of the fill. It is possible for accesses that occur on the cycle of (or briefly after) the line fill to respond without properly paying the responseLatency. This also creates the situation where two accesses to the same address may be serviced in an order opposite of how they were received by the cache. For stores to the same address, this means that although the cache performs the stores in the order they were received, acknowledgements may be sent in a different order. Adding the responseLatency component to the whenReady field preserves the penalty that should be paid and prevents these ordering issues. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-03-26mem: Cancel cache retry event when blocking portRene de Jong
This patch solves the corner case scenario where the sendRetryEvent could be scheduled twice, when an io device stresses the IOcache in the system. This should not be possible in the cache system.
2013-03-26mem: Separate waiting for the bus and waiting for a peerAndreas Hansson
This patch splits the retryList into a list of ports that are waiting for the bus itself to become available, and a map that tracks the ports where forwarding failed due to a peer not accepting the packet. Thus, when a retry reaches the bus, it can be sent to the appropriate port that initiated that transaction. As a consequence of this patch, only ports that are really ready to go will get a retry, thus reducing the amount of redundant failed attempts. This patch also makes it easier to reason about the order of servicing requests as the ports waiting for the bus are now clearly FIFO and much easier to change if desired.
2013-03-26mem: Introduce a variable for the retrying portAndreas Hansson
This patch introduces a variable to keep track of the retrying port instead of relying on it being the front of the retryList. Besides the improvement in readability, this patch is a step towards separating out the two cases where a port is waiting for the bus to be free, and where the forwarding did not succeed and the bus is waiting for a retry to pass on to the original initiator of the transaction. The changes made are currently such that the regressions are not affected. This is ensured by always prioritizing the currently retrying port and putting it back at the front of the retry list.
2013-03-26mem: Add optional request flags to the packet traceAndreas Hansson
This patch adds an optional flags field to the packet trace to encode the request flags that contain information about whether the request is (un)cacheable, instruction fetch, preftech etc.
2013-03-22ruby: slicc: set sender, receiver clock objs for optional queueNilay Vaish
2013-03-22ruby: message buffer: correct previous errorsNilay Vaish
A recent set of patches added support for multiple clock domains to ruby. I had made some errors while writing those patches. The sender was using the receiver side clock while enqueuing a message in the buffer. Those errors became visible while creating (or restoring from) checkpoints. The errors also become visible when a multi eventq scenario occurs.
2013-03-22ruby: message buffer: remove _ptr from some variablesNilay Vaish
The names were getting too long.
2013-03-22ruby: message buffer node: used Tick in place of CyclesNilay Vaish
The message buffer node used to keep time in terms of Cycles. Since the sender and the receiver can have different clock periods, storing node time in cycles requires some conversion. Instead store the time directly in Ticks.
2013-03-22ruby: consumer: avoid using receiver side clockNilay Vaish
A set of patches was recently committed to allow multiple clock domains in ruby. In those patches, I had inadvertently made an incorrect use of the clocks. Suppose object A needs to schedule an event on object B. It was possible that A accesses B's clock to schedule the event. This is not possible in actual system. Hence, changes are being to the Consumer class so as to avoid such happenings. Note that in a multi eventq simulation, this can possibly lead to an incorrect simulation. There are two functions in the Consumer class that are used for scheduling events. The first function takes in the relative delay over the current time as the argument and adds the current time to it for scheduling the event. The second function takes in the absolute time (in ticks) for scheduling the event. The first function is now being moved to protected section of the class so that only objects of the derived classes can use it. All other objects will have to specify absolute time while scheduling an event for some consumer.
2013-03-22ruby: remove unsued profile functionsNilay Vaish
2013-03-22ruby: keep histogram of outstanding requests in seqNilay Vaish
The histogram for tracking outstanding counts per cycle is maintained in the profiler. For a parallel implementation of the memory system, we need that this histogram is maintained locally. Hence it will now be kept in the sequencer itself. The resulting histograms will be merged when the stats are printed.
2013-03-22slicc: remove check if the L1Cache has a sequencerNilay Vaish
2013-03-22ruby: move stall and wakeup functions to AbstractControllerNilay Vaish
These functions are currently implemented in one of the files related to Slicc. Since these are purely C++ functions, they are better suited to be in the base class.
2013-03-22ruby: connect two controllers using only message buffersNilay Vaish
This patch modifies ruby so that two controllers can be connected to each other with only message buffers in between. Before this patch, all the controllers had to be connected to the network for them to communicate with each other. With this patch, one can have protocols where a controller is not connected to the network, but communicates with another controller through a message buffer.
2013-03-22ruby: convert Topology to regular classNilay Vaish
The Topology class in Ruby does not need to inherit from SimObject class. This patch turns it into a regular class. The topology object is now created in the constructor of the Network class. All the parameters for the topology class have been moved to the network class.
2013-03-22ruby: network: move routers from topology to networkNilay Vaish
2013-03-18mem: Fix missing delete of packet in DRAM accessAndreas Hansson
This patch fixes a memory leak caused by not deleting packets that require no response.
2013-03-15ruby: set: corrects csprintf() call introduced by 7d95b650c9b6Nilay Vaish
2013-03-07ruby: Fix gcc 4.8 maybe-uninitialized compilation errorAndreas Hansson
This patch fixes the one-and-only gcc 4.8 compilation error, being a warning about "maybe uninitialized" in Orion.
2013-03-06ruby: remove the functional copy of memory in se modeNilay Vaish
This patch removes the functional copy of the memory that was maintained in the se mode. Now ruby itself will provide the data.
2013-03-06ruby: garnet: fixed: implement functional accessNilay Vaish
2013-03-02ruby: fixes functional writes to RubyRequestBlake Hechtman ext:(%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E)
The functional write code was assuming that all writes are block sized, which may not be true for Ruby Requests. This bug can lead to a buffer overflow. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-03-01mem: Add check if SimpleDRAM nextReqEvent is scheduledAndreas Hansson
This check covers a case where a retry is called from the SimpleDRAM causing a new request to appear before the DRAM itself schedules a nextReqEvent. By adding this check, the event is not scheduled twice.
2013-03-01mem: Add a method to build multi-channel DRAM configurationsAndreas Hansson
This patch adds a class method that allows easy creation of channel-interleaved multi-channel DRAM configurations. It is enabled by a class method to allow customisation of the class independent of the channel configuration. For example, the user can create a MyDDR subclass of e.g. SimpleDDR3, and then create a four-channel configuration of the subclass by calling MyDDR.makeMultiChannel(4, mem_start, mem_size).
2013-03-01mem: SimpleDRAM variable naming and whitespace fixesAndreas Hansson
This patch fixes a number of small cosmetic issues in the SimpleDRAM module. The most important change is to move the accounting of received packets to after the check is made if the packet should be retried or not. Thus, packets are only counted if they are actually accepted.
2013-03-01mem: Add support for multi-channel DRAM configurationsAndreas Hansson
This patch adds support for multi-channel instances of the DRAM controller model by stripping away the channel bits in the address decoding. The patch relies on the availiability of address interleaving and, at this time, it is up to the user to configure the interleaving appropriately. At the moment it is assumed that the channel interleaving bits are immediately following the column bits (smallest sensible interleaving). Convenience methods for building multi-channel configurations will be added later.
2013-03-01mem: Merge interleaved ranges when creating backing storeAndreas Hansson
This patch adds merging of interleaved ranges before creating the backing stores. The backing stores are always a contigous chunk of the address space, and with this patch it is possible to have interleaved memories in the system.
2013-03-01mem: Merge ranges in bus before passing them onAndreas Hansson
This patch adds basic merging of address ranges to the bus, such that interleaved ranges are merged together before being passed on by the bus. As such, the bus aggregates the address ranges of the connected slave ports and then passes on the merged ranges through its master ports. The bus thus hides the complexity of the interleaved ranges and only exposes contigous ranges to the surrounding system. As part of this patch, the bus ranges are also cached for any future queries.
2013-02-28ruby: mesi coherence protocol: invalidate lockDibakar Gope ext:(%2C%20Nilay%20Vaish%20%3Cnilay%40cs.wisc.edu%3E)
The MESI CMP directory coherence protocol, while transitioning from SM to IM, did not invalidate the lock that it might have taken on a cache line. This patch adds an action for doing so. The problem was found by Dibakar, but I was not happy with his proposed solution. So I implemented a different solution. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-02-19slicc: remove unused variable message_buffer_namesNilay Vaish
2013-02-19ruby: remove unused variable m_print_config in class TopologyNilay Vaish
2013-02-19mem: Fix sender state bug and delay poppingAndreas Hansson
This patch fixes a newly introduced bug where the sender state was popped before checking that it should be. Amazingly all regressions pass, but Linux fails to boot on the detailed CPU with caches enabled.