gem5 - gem5

Age	Commit message	Author
2019-10-29	mem: Fix DRAM controller to operate on its own address space	Nikos Nikoleris
	Typically, a memory controller is assigned an address range of the form [start, end). This address range might be interleaved, in which case only a non-contiguous subset of the addresses in the range is handled by this controller. Prior to this patch, the DRAM controller was unaware of the interleaving, and as a result the address range could affect the mapping of addresses to DRAM ranks, rows and columns. This patch changes the DRAM controller to transform the input address to a contiguous range of the form [0, size). As a result, the DRAM controller always operates on a dense and contiguous address range regardless of the system configuration. Change-Id: I7d273a630928421d1854658c9bb0ab34e9360851 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19328 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Wendy Elsasser <wendy.elsasser@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
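The address transformation described above can be sketched as follows. This is a minimal illustration, not the code from the patch: `InterleavedRange` and `toDenseOffset` are hypothetical names, and the sketch assumes a power-of-two number of channels selected by one contiguous bit field, with `start` aligned to the interleaving stride.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of an interleaved range; not gem5's AddrRange.
struct InterleavedRange {
    uint64_t start;        // first address of the range (inclusive)
    uint64_t end;          // one past the last address (exclusive)
    unsigned intlvLowBit;  // lowest bit of the channel-select field
    unsigned intlvBits;    // width of the select field (0 = no interleaving)
};

// Map a global address inside the range to a dense offset by removing
// the channel-select bits. Assumes start is aligned to the stride.
uint64_t toDenseOffset(const InterleavedRange &r, uint64_t addr)
{
    assert(addr >= r.start && addr < r.end);
    const uint64_t offset = addr - r.start;
    if (r.intlvBits == 0)
        return offset;
    // Bits below the select field pass through unchanged.
    const uint64_t low = offset & ((uint64_t(1) << r.intlvLowBit) - 1);
    // Bits above the select field are shifted down to close the gap.
    const uint64_t high = offset >> (r.intlvLowBit + r.intlvBits);
    return (high << r.intlvLowBit) | low;
}
```

With two channels interleaved at 64-byte granularity, the second 64-byte block owned by a channel lands at dense offset 64, whatever the value of `start`.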
2019-10-03	mem: Remove unused variable	Tommaso Marinelli
	The variable *sys in dram_ctrl.cc was only used in an assert() check, therefore it has been removed to allow building gem5.fast without errors. A typo in a comment in abstract_mem.hh has also been corrected. Change-Id: I2663545449ecfdb5a27c3574b79dd42beb4a49c8 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21380 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
2019-09-30	mem: Convert DRAM controller to new-style stats	Andreas Sandberg
	Note that this changes the stat format used by the DRAM controller. Previously, it would have a structure looking a bit like this: - system - dram: Main DRAM controller - dram_0: Rank 0 - dram_1: Rank 1 This structure can't be replicated with new-world stats since stats are confined to the SimObject name space. This means that the new structure looks like this: - system - dram: Main DRAM controller - rank0: Rank 0 - rank1: Rank 1 Change-Id: I7435cfaf137c94b0c18de619d816362dd0da8125 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/21142 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Wendy Elsasser <wendy.elsasser@arm.com>
2019-06-06	mem: Option to toggle DRAM low-power states	Matthew Poremba
	Add an option to enable DRAM low-power states. The low-power states can have a significant impact on application performance (sim_ticks), on the order of 2-3x, especially for compute-gpu apps. The option allows them to be easily enabled or disabled to compare performance numbers. The option is disabled by default. Change-Id: Ib9bddbb792a1a6a4afb5339003472ff8f00a5859 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18548 Reviewed-by: Wendy Elsasser <wendy.elsasser@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-04-28	mem: Minimize the use of MemObject.	Gabe Black
	MemObject doesn't provide anything beyond its base ClockedObject any more, so this change removes it from most inheritance hierarchies. Occasionally MemObject is replaced with SimObject when I was fairly confident that the extra functionality of ClockedObject wasn't needed. Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Gabe Black <gabeblack@google.com>
2019-04-19	mem: Make DRAMCtrl::decodeAddr const	Daniel R. Carvalho
	DRAMCtrl's decodeAddr does not need to modify the packet it receives, nor should it modify the contents of the class, and therefore both the packet and the function are made const. Change-Id: I577f48d9a43611ba54878a9a793cb7b4fbb326f4 Signed-off-by: Daniel R. Carvalho <odanrc@yahoo.com.br> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17540 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-04-05	mem: Reverse order of write/read mem queue check	Jason Lowe-Power
	For atomic RMW instructions that go directly to memory, we want to put them on the write queue instead of the read queue. Swap the if/else condition to accomplish this. Note: This is ignoring the read latency of the RMW, but these instructions should usually be handled in caches anyway. Change-Id: I62dbfff3a16ac470f1ebdb489abe878962b20bb6 Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17828 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-03-19	arch, cpu, dev, gpu, mem, sim, python: start using getPort.	Gabe Black
	Replace the getMasterPort, getSlavePort, and getEthPort functions with getPort, and remove extraneous mechanisms that are no longer necessary. Change-Id: Iab7e3c02d2f3a0cf33e7e824e18c28646b5bc318 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17040 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2019-01-17	mem: Determine if a packet queue forces ordering at construction	Nikos Nikoleris
	A packet queue is typically used to hold on to packets that are scheduled to be sent in the future or that need to queue behind younger packets that have not been sent out yet. Due to memory order requirements, some MemObjects need to maintain the order for packets (mostly responses) that reference the same cache block. Prior to this patch the ordering requirements were determined when the packet was scheduled to be sent. This patch moves the parameter to the constructor. Change-Id: Ieb4d94e86bc7514f5036b313ec23ea47dd653164 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15555 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2018-09-07	mem: Make DRAMCtrl a QoS-aware Memory Controller	Matteo Andreozzi
	This patch turns DRAMCtrl into a QoS-aware Memory Controller with "no policy" as the default policy. Change-Id: I48163da8c8208498cf0398b07094cb840272507f Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/11973 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-07-23	mem: Rename Packet::checkFunctional to trySatisfyFunctional	Robert Kovacsics
	Packet::checkFunctional also wrote data to/from the packet depending on whether it was a read or a write, respectively, contrary to what the 'check' in the name suggests. This renames it to doFunctional, which is more suggestive. It also renames any function called checkFunctional which calls Packet::checkFunctional. These are - Bridge::BridgeMasterPort::checkFunctional - calls Packet::checkFunctional - MSHR::checkFunctional - calls Packet::checkFunctional - MSHR::TargetList::checkFunctional - calls Packet::checkFunctional - Queue<>::checkFunctional (of src/mem/cache/queue.hh, not src/cpu/minor/buffers.h) - Instantiated with Queue<WriteQueueEntry> and Queue<MSHR> - WriteQueueEntry - calls Packet::checkFunctional - WriteQueueEntry::TargetList - calls Packet::checkFunctional - MemDelay::checkFunctional - calls QueuedSlavePort/QueuedMasterPort::checkFunctional - Packet::checkFunctional - PacketQueue::checkFunctional - calls Packet::checkFunctional - QueuedSlavePort::checkFunctional - calls PacketQueue::doFunctional - QueuedMasterPort::checkFunctional - calls PacketQueue::doFunctional - SerialLink::SerialLinkMasterPort::checkFunctional - calls Packet::doFunctional Change-Id: Ieca2579c020c329040da053ba8e25820801b62c5 Reviewed-on: https://gem5-review.googlesource.com/11810 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2018-05-18	mem: Add support for more flexible DRAM timing and topologies	Wendy Elsasser
	This patch has 2 main aspects: 1) Add new parameter to adjust write-to-write delay 2) Enable support of more than 64 banks per controller Changes for new parameter: Incorporated a new parameter, tCCD_L_WR, which defaults to tCCD_L. This parameter can be used to set a unique delay between writes and between reads. To incorporate this parameter in the controller, modified the DRAMCtrl class to have separate variables for read and write column delays. Used these variables to account for tRTW, tWTR, tBURST, tCCD_L, and tCS as well as the new tCCD_L_WR parameter. Changes to support more than 64 banks: Modified the logic selecting the next command (reorderQueue and minBankPrep functions). Replaced the uint64_t variables with a vector of uint32_t elements. There is one uint32_t element defined per rank to allow up to 32 banks per rank. This will automatically scale with ranks without issue. The change will allow analysis of memory sub-systems beyond the current landscape. Change-Id: I0ce466efed58276f843ad90e9ecc0ece6c37d646 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10103 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
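The move from a single 64-bit mask to one 32-bit mask per rank can be illustrated with a small stand-alone class. `BankMask` is a hypothetical name used only for this sketch; the actual patch changes variables inside reorderQueue and minBankPrep, which are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One 32-bit mask per rank; bit b set means bank b of that rank is
// selected. The total bank count scales with the number of ranks.
class BankMask {
  public:
    explicit BankMask(unsigned ranks) : masks(ranks, 0) {}

    void set(unsigned rank, unsigned bank)
    {
        assert(rank < masks.size() && bank < 32);
        masks[rank] |= uint32_t(1) << bank;
    }

    bool test(unsigned rank, unsigned bank) const
    {
        assert(rank < masks.size() && bank < 32);
        return ((masks[rank] >> bank) & 1u) != 0;
    }

  private:
    std::vector<uint32_t> masks;
};
```

Four ranks of 32 banks give 128 addressable banks, which a single uint64_t could not represent.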
2018-05-18	mem: Optimize self-refresh entry	Wendy Elsasser
	Self-refresh is entered during a refresh event, when the rank was previously in a precharge power-down state. The original code would enter self-refresh after a refresh was issued. The device subsequently will issue a refresh on self-refresh entry. On self-refresh exit, the controller will issue another refresh command. Devices require at least one additional refresh to be issued between self-refresh exit and re-entry. This ensures that enough refreshes occur in the case when the device narrowly missed a refresh on self-refresh exit. To minimize the number of refresh operations and still maintain the device requirement, the current logic does the following: 1) The controller will still enter self-refresh from a refresh event, when the previous state was precharge power-down. However, the refresh itself will be bypassed and the controller will immediately issue a self-refresh entry. 2) On a self-refresh exit, the controller will immediately issue a refresh command (per the original logic). This ensures the device's requirements are met and is a convenient way to kick off the command state machine. Change-Id: I1c4b0dcbfa3bdafd755f3ccd65e267fcd700c491 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-on: https://gem5-review.googlesource.com/10102 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2018-04-06	mem: Remove unused 'using namespace'	Daniel R. Carvalho
	Removal of unused/barely used 'using namespace' from C++ files. Change-Id: I66dc548c04506db2e41180b9ea7ab5abd7d5375a Reviewed-on: https://gem5-review.googlesource.com/9601 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2017-11-16	ext, mem: Pull DRAMPower SHA 90d6290 and rebase	Radhika Jagtap
	This patch syncs the DRAMPower library of gem5 to the external github (https://github.com/ravenrd/DRAMPower). The version pulled in is the commit: 90d6290f802c29b3de9e10233ceee22290907ce6 from 30th Oct. 2016. This change also modifies the DRAM Ctrl interaction with the DRAMPower, due to changes in the lib API in the above version. Previously multiple functions were called to prepare the power lib before calling the function that would calculate the energy. With the new API, these functions are encompassed inside the function to calculate the energy and therefore should now be removed from the DRAM controller. The other key difference is the introduction of a new function called calcWindowEnergy which can be useful for any system that wants to do measurements over intervals. For gem5 DRAM ctrl that means we now need to accumulate the window energy measurements into the total stat. Change-Id: I3570fff2805962e166ff2a1a3217ebf2d5a197fb Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/5724 Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2017-06-20	mem: Replace EventWrapper use with EventFunctionWrapper	Sean Wilson
	NOTE: With this change there is a possibility for `DRAMCtrl::Rank`s event names to not properly match the rank they were generated by. This could occur if the public rank member is modified after the Rank's construction. A patch would mean refactoring Rank and `DRAMCtrl` to privatize many of the members of Rank behind getters. Change-Id: I7b8bd15086f4ffdfd3f40be4aeddac5e786fd78e Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3745 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2017-06-20	mem: Move the Rank construction logic to the Rank constructor	Sean Wilson
	This change was made so Rank objects have their name assigned when they are instantiated. Therefore, they can initialize their member objects with their name and it is less likely to change during runtime. (NOTE: I would recommend hiding the fields which would cause the name to change behind getters, since modification of `Rank.rank` during runtime will cause the `name()` to change.) Change-Id: Id51c3553b40e489792c57950e18b8ce927e43173 Signed-off-by: Sean Wilson <spwilson2@wisc.edu> Reviewed-on: https://gem5-review.googlesource.com/3742 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2017-02-15	mem: fix assertion in respondEvent	Wendy Elsasser
	Assertion in the respondEvent erroneously fired. The assertion verifies that the controller has not moved to a low-power state prior to receiving read data from the memory. The original assertion triggered if the state was not: PWR_IDLE or PWR_ACT. In the case that failed, a periodic refresh event occurred around the read. The REF is stalled until the final read burst is issued and the subsequent PRE closes the bank. While the PRE will temporarily move the state to PWR_IDLE, state will immediately transition to PWR_REF due to the pending refresh operation. This state does not match the assertion, which is subsequently triggered. Fixed the assertion by explicitly checking that the state is not a low power state !PWR_SREF && !PWR_PRE_PDN && !PWR_ACT_PDN Change-Id: I82921a733bbeac2bcb5a487c2f981448d41ed50b Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
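The difference between the original and the fixed assertion can be shown with two predicates over the power states named in the message. The enum ordering and the helper names `oldCheck` and `newCheck` are assumptions made for this sketch only.

```cpp
// Power states named in the commit message; the ordering is arbitrary here.
enum PowerState {
    PWR_IDLE, PWR_REF, PWR_SREF, PWR_PRE_PDN, PWR_ACT, PWR_ACT_PDN
};

// Original condition: only IDLE and ACT were accepted.
bool oldCheck(PowerState s)
{
    return s == PWR_IDLE || s == PWR_ACT;
}

// Fixed condition: anything that is not a low-power state is accepted.
bool newCheck(PowerState s)
{
    return s != PWR_SREF && s != PWR_PRE_PDN && s != PWR_ACT_PDN;
}
```

For the failing case, a rank in PWR_REF with a pending refresh, the old predicate is false while the new one is true.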
2016-11-09	style: [patch 1/22] use /r/3648/ to reorganize includes	Brandon Potter

2016-10-13	mem: Add DRAM low-power functionality	Wendy Elsasser
	Added power-down state transitions to the DRAM controller model. Added a per-rank parameter, outstandingEvents, which tracks the number of outstanding command events and is used to determine when the controller should transition to a low-power state. The controller will only transition when there are no outstanding events scheduled and the number of command entries for the given rank is 0. The outstandingEvents parameter is incremented for every RD/WR burst, PRE, and REF event scheduled. ACT is implicitly covered by RD/WR since a burst will always issue and complete after a required ACT. The parameter is decremented when the event is serviced (completed). The controller will automatically transition to ACT power-down, PRE power-down, or SREF. Transition to the ACT power-down state is scheduled from: 1) respondEvent, where read data is received from the memory. ACT power-down entry will be scheduled when one or more banks are open, all commands for the rank have completed (no more commands scheduled), and there are no commands in the queue for the rank. Transition to PRE power-down is scheduled from: 1) respondEvent, when all banks are closed, all commands have completed, and there are no commands in the queue for the rank; 2) prechargeEvent, when all banks are closed, all commands have completed, and there are no commands in the queue for the rank; 3) refreshEvent, after the refresh is complete, when the previous state was ACT power-down; 4) refreshEvent, after the refresh is complete, when the previous state was PRE power-down and there are commands in the queue. Transition to SREF is scheduled from: 1) refreshEvent, after the refresh completes, when the previous state was PRE power-down with no commands in the queue. Power-down exit commands are scheduled from: 1) refreshEvent, prior to issuing a refresh; 2) doDRAMAccess, to wake up the rank for RD/WR command issue. Self-refresh exit commands are scheduled from: 1) the next request event, when there are commands for the rank in the readQueue, or there are commands for the rank in the writeQueue and the bus state is WRITE. Change-Id: I6103f660776e36c686655e71d92ec7b5b752050a Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
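The power-down entry rule for the respondEvent case can be condensed into a small decision function. This is a sketch of the rule as stated in the message, not the controller code: `NextState` and `afterRespond` are hypothetical names, and only the respondEvent path is modelled.

```cpp
// Possible outcomes after a read response for one rank.
enum class NextState { Stay, ActPowerDown, PrePowerDown };

// outstandingEvents: scheduled RD/WR burst, PRE and REF events not yet serviced
// queuedCmds:        commands still queued for this rank
// openBanks:         number of banks with an open row
NextState afterRespond(unsigned outstandingEvents, unsigned queuedCmds,
                       unsigned openBanks)
{
    // No transition while anything is in flight or waiting for the rank.
    if (outstandingEvents != 0 || queuedCmds != 0)
        return NextState::Stay;
    // Idle rank: the target depends on whether any bank is still open.
    return openBanks > 0 ? NextState::ActPowerDown : NextState::PrePowerDown;
}
```

The refresh-driven transitions (to PRE power-down or SREF) would need the previous power state as an additional input and are left out of this sketch.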
2016-10-13	mem: Add callback to compute stats prior to dump event	Wendy Elsasser
	The per rank statistics are periodically updated based on state transition and refresh events. Add a method to update these when a dump event occurs to ensure they reflect accurate values. Specifically, need to ensure that the low-power state durations, power, and energy are logged correctly. Change-Id: Ib642a6668340de8f494a608bb34982e58ba7f1eb Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
2016-10-13	mem: Modify drain to ensure banks and power are idled	Wendy Elsasser
	Add a constraint that all ranks have to be in PWR_IDLE before signaling drain complete. This will ensure that the banks are all closed and the rank has exited any low-power states. On suspend, update the power stats to sync the DRAM power logic. The logic maintains the location of the signalDrainDone method, which is still triggered from either: 1) Read response event 2) Next request event. This ensures that the drain will complete in the READ bus state and minimizes the changes required. Change-Id: If1476e631ea7d5999fe50a0c9379c5967a90e3d1 Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
2016-10-13	mem: Sort memory commands and update DRAMPower	Wendy Elsasser
	Add a local variable to store commands to be issued. These commands are in order within a single bank but will be out of order across banks & ranks. A new procedure, flushCmdList, sorts commands across banks / ranks, and flushes the sorted list, up to curTick(), to DRAMPower. This is currently called in refresh, once all previous commands are guaranteed to have completed. It could be called in other events like the powerEvent as well. By only flushing commands up to curTick(), the list will not get out of sync when flushed at a periodic stats dump (done in a subsequent patch). Change-Id: I4ac65a52407f64270db1e16a1fb04cfe7f638851 Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
2016-10-13	mem: add DRAM powerdown timing	Omar Naji

2016-02-10	mem: Move the point of coherency to the coherent crossbar	Andreas Hansson
	This patch introduces the ability of making the coherent crossbar the point of coherency. If so, the crossbar does not forward packets where a cache with ownership has already committed to responding, and also does not forward any coherency-related packets that are not intended for a downstream memory controller. Thus, invalidations and upgrades are turned around in the crossbar, and the memory controller only sees normal reads and writes. In addition this patch moves the express snoop promotion of a packet to the crossbar, thus allowing the downstream cache to check the express snoop flag (as it should) for bypassing any blocking, rather than relying on whether a cache is responding or not.
2016-02-06	style: fix missing spaces in control statements	Steve Reinhardt
	Result of running 'hg m5style --skip-all --fix-control -a'.
2015-12-31	mem: Make cache terminology easier to understand	Andreas Hansson
	This patch changes the name of a bunch of packet flags and MSHR member functions and variables to make the coherency protocol easier to understand. In addition the patch adds and updates lots of descriptions, explicitly spelling out assumptions. The following name changes are made: * the packet memInhibit flag is renamed to cacheResponding * the packet sharedAsserted flag is renamed to hasSharers * the packet NeedsExclusive attribute is renamed to NeedsWritable * the packet isSupplyExclusive is renamed responderHadWritable * the MSHR pendingDirty is renamed to pendingModified The cache states, Modified, Owned, Exclusive, Shared are also called out in the cache and MSHR code to make it easier to understand.
2015-11-06	mem: Enforce insertion order on the cache response path	Ali Jafri
	This patch enforces insertion order transmission of packets on the response path in the cache. Note that the logic to enforce order is already present in the packet queue, this patch simply turns it on for queues in the response path. Without this patch, there are corner cases where a request-response is faster than a response-response forwarded through the cache. This violation of queuing order causes problems in the snoop filter leaving it with inaccurate information. This causes assert failures in the snoop filter later on. A follow on patch relaxes the order enforcement in the packet queue to limit the performance impact.
2015-11-06	mem: Align rules for sinking inhibited packets at the slave	Andreas Hansson
	This patch aligns how the memory-system slaves, i.e. the various memory controllers and the bridge, identify and deal with sinking of inhibited packets that are only useful within the coherent part of the memory system. In the future we could shift the onus to the crossbar, and add a parameter "is_point_of_coherence" that would allow it to sink the aforementioned packets.
2015-11-06	mem: Unify delayed packet deletion	Andreas Hansson
	This patch unifies how we deal with delayed packet deletion, where the receiving slave is responsible for deleting the packet, but the sending agent (e.g. a cache) is still relying on the pointer until the call to sendTimingReq completes. Previously we used a mix of a deletion vector and a construct using unique_ptr. With this patch we ensure all slaves use the latter approach.
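The unique_ptr construct mentioned above can be sketched with stand-in types. `Packet` here is a dummy that only counts live instances, and `Sink` is a hypothetical receiver; neither is the gem5 class of the same role.

```cpp
#include <memory>

// Dummy packet that tracks how many instances are alive.
struct Packet {
    static int live;
    Packet() { ++live; }
    ~Packet() { --live; }
};
int Packet::live = 0;

// Receiver that owns each accepted packet until the next one arrives,
// so the sender's pointer stays valid for the duration of its call.
class Sink {
  public:
    bool recvTimingReq(Packet *pkt)
    {
        // reset() frees the packet kept from the previous call and
        // takes ownership of the current one.
        pendingDelete.reset(pkt);
        return true;
    }

  private:
    std::unique_ptr<Packet> pendingDelete;
};
```

At most one packet is held per receiver, and it is released either by the next call or when the receiver is destroyed.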
2015-11-06	misc: Appease clang static analyzer	Andreas Hansson
	A few minor fixes to issues identified by the clang static analyzer.
2015-07-07	sim: Refactor and simplify the drain API	Andreas Sandberg
	The drain() call currently passes around a DrainManager pointer, which is now completely pointless since there is only ever one global DrainManager in the system. It also contains vestiges from the time when SimObjects had to keep track of their child objects that needed draining. This changeset moves all of the DrainState handling to the Drainable base class and changes the drain() and drainResume() calls to reflect this. Particularly, the drain() call has been updated to take no parameters (the DrainManager argument isn't needed) and return a DrainState instead of an unsigned integer (there is no point returning anything other than 0 or 1 any more). Drainable objects should return either DrainState::Draining (equivalent to returning 1 in the old system) if they need more time to drain or DrainState::Drained (equivalent to returning 0 in the old system) if they are already in a consistent state. Returning DrainState::Running is considered an error. Drain done signalling is now done through the signalDrainDone() method in the Drainable class instead of using the DrainManager directly. The new call checks if the state of the object is DrainState::Draining before notifying the drain manager. This means that it is safe to call signalDrainDone() without first checking if the simulator has requested draining. The intention here is to reduce the code needed to implement draining in simple objects.
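The shape of the refactored interface can be sketched with a self-contained mock. `DrainableSketch`, `requestDrain` and `QueueOwner` are hypothetical names; the real Drainable class also notifies the DrainManager, which this sketch replaces with a plain state change.

```cpp
#include <cassert>

enum class DrainState { Running, Draining, Drained };

// Stand-in for the Drainable base class; only the parts described above.
class DrainableSketch {
  public:
    virtual ~DrainableSketch() = default;

    // Plays the role of the drain manager asking the object to drain.
    DrainState requestDrain()
    {
        state = drain();
        assert(state != DrainState::Running);  // returning Running is an error
        return state;
    }

    DrainState drainState() const { return state; }

  protected:
    // Takes no parameters and returns a DrainState.
    virtual DrainState drain() = 0;

    // Safe to call at any time; acts only if a drain is in progress.
    void signalDrainDone()
    {
        if (state == DrainState::Draining)
            state = DrainState::Drained;
    }

  private:
    DrainState state = DrainState::Running;
};

// Simple object that is drained once its pending counter reaches zero.
class QueueOwner : public DrainableSketch {
  public:
    void push() { ++pending; }
    void pop()
    {
        assert(pending > 0);
        if (--pending == 0)
            signalDrainDone();
    }

  protected:
    DrainState drain() override
    {
        return pending == 0 ? DrainState::Drained : DrainState::Draining;
    }

  private:
    unsigned pending = 0;
};
```

Because signalDrainDone() checks the state itself, `pop()` does not need to know whether a drain was requested.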
2015-07-07	sim: Decouple draining from the SimObject hierarchy	Andreas Sandberg
	Draining is currently done by traversing the SimObject graph and calling drain()/drainResume() on the SimObjects. This is not ideal when non-SimObjects (e.g., ports) need draining since this means that SimObjects owning those objects need to be aware of this. This changeset moves the responsibility for finding objects that need draining from SimObjects and the Python-side of the simulator to the DrainManager. The DrainManager now maintains a set of all objects that need draining. To reduce the overhead in classes owning non-SimObjects that need draining, objects inheriting from Drainable now automatically register with the DrainManager. If such an object is destroyed, it is automatically unregistered. This means that drain() and drainResume() should never be called directly on a Drainable object. While implementing the new functionality, the DrainManager has now been made thread safe. In practice, this means that it takes a lock whenever it manipulates the set of Drainable objects since SimObjects in different threads may create Drainable objects dynamically. Similarly, the drain counter is now an atomic_uint, which ensures that it is manipulated correctly when objects signal that they are done draining. A nice side effect of these changes is that it makes the drain state changes stricter, which the simulation scripts can exploit to avoid redundant drains.
2015-07-07	sim: Make the drain state a global typed enum	Andreas Sandberg
	The drain state enum is currently a part of the Drainable interface. The same state machine will be used by the DrainManager to identify the global state of the simulator. Make the drain state a global typed enum to better cater for this usage scenario.
2015-07-03	mem: Update DRAM command scheduler for bank groups	Wendy Elsasser
	This patch updates the command arbitration so that bank group timing as well as rank-to-rank delays will be taken into account. The resulting arbitration no longer selects commands (prepped or not) that cannot issue seamlessly if there are commands that can issue back-to-back, minimizing the effect of rank-to-rank (tCS) & same bank group (tCCD_L) delays. The arbitration selects a new command based on the following priority. Within each priority band, the arbitration will use FCFS to select the appropriate command: 1) Bank is prepped and burst can issue seamlessly, without a bubble 2) Bank is not prepped, but can prep and issue seamlessly, without a bubble 3) Bank is prepped but burst cannot issue seamlessly. In this case, a bubble will occur on the bus. Thus, to enable more parallelism in subsequent selections, an unprepped packet is given higher priority if the bank prep can be hidden. If the bank prep cannot be hidden, the selection logic will choose a prepped packet that cannot issue seamlessly if one exists. Otherwise, the default selection will choose the packet with the minimum bank prep delay.
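The priority bands can be sketched as a selection over a queue held in arrival order. The `Candidate` fields and the functions `band` and `choose` are hypothetical; in particular, how "seamless" and the prep delay are computed from DRAM timing is outside the scope of this sketch.

```cpp
#include <cstddef>
#include <vector>

struct Candidate {
    bool prepped;        // bank already has the required row open
    bool seamless;       // burst (after any hidden prep) leaves no bus bubble
    unsigned prepDelay;  // hypothetical: time until the bank could be prepped
};

// Lower value means higher priority; 3 is the fallback band.
int band(const Candidate &c)
{
    if (c.prepped && c.seamless) return 0;   // 1) prepped, seamless
    if (!c.prepped && c.seamless) return 1;  // 2) prep can be hidden
    if (c.prepped) return 2;                 // 3) prepped, bubble on the bus
    return 3;                                // default: minimum prep delay
}

// Returns the index of the chosen candidate, or -1 for an empty queue.
// Earlier entries win ties within a band (FCFS).
int choose(const std::vector<Candidate> &queue)
{
    int best = -1;
    for (std::size_t i = 0; i < queue.size(); ++i) {
        if (best < 0) {
            best = static_cast<int>(i);
            continue;
        }
        const int b = band(queue[i]);
        const int bb = band(queue[best]);
        if (b < bb ||
            (b == 3 && bb == 3 &&
             queue[i].prepDelay < queue[best].prepDelay)) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```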
2015-07-03	mem: Avoid DRAM write queue iteration for merging and read lookup	Andreas Hansson
	This patch adds a simple lookup structure to avoid iterating over the write queue to find read matches, and for the merging of write bursts. Instead of relying on iteration we simply store a set of currently-buffered write-burst addresses and compare against these. For the reads we still perform the iteration if we have a match. For the writes, we rely entirely on the set. Note that there are corner-cases where sub-bursts would actually not be mergeable without a read-modify-write. We ignore these cases and opt for speed.
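A minimal sketch of such a lookup structure, assuming bursts are identified by their burst-aligned address. `WriteBurstIndex` and its members are hypothetical names, not the identifiers used in the patch.

```cpp
#include <cstdint>
#include <unordered_set>

// Tracks the burst-aligned addresses of writes currently buffered.
class WriteBurstIndex {
  public:
    explicit WriteBurstIndex(uint64_t burst_size) : burstSize(burst_size) {}

    // Returns true if the burst is new and must be queued, false if a
    // write to the same burst is already buffered (merge candidate).
    bool insert(uint64_t addr)
    {
        return buffered.insert(align(addr)).second;
    }

    // Quick test used before scanning the write queue for a read match.
    bool contains(uint64_t addr) const
    {
        return buffered.count(align(addr)) != 0;
    }

    // Called when the buffered burst is written out to the DRAM.
    void erase(uint64_t addr) { buffered.erase(align(addr)); }

  private:
    uint64_t align(uint64_t addr) const { return addr - (addr % burstSize); }

    uint64_t burstSize;
    std::unordered_set<uint64_t> buffered;
};
```

As the message notes, treating every hit as mergeable ignores sub-burst corner cases that would need a read-modify-write; the sketch makes the same simplification.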
2015-07-03	mem: Add clean evicts to improve snoop filter tracking	Ali Jafri
	This patch adds eviction notices to the caches, to provide accurate tracking of cache blocks in snoop filters. We add the CleanEvict message to the memory hierarchy and use both CleanEvicts and Writebacks with BLOCK_CACHED flags to propagate notice of clean and dirty evictions respectively, down the memory hierarchy. Note that the BLOCK_CACHED flag indicates whether there exist any copies of the evicted block in the caches above the evicting cache. The purpose of the CleanEvict message is to notify snoop filters of silent evictions in the relevant caches. The CleanEvict message behaves much like a Writeback. CleanEvict is a write and a request but unlike a Writeback, CleanEvict does not have data and does not need exclusive access to the block. The cache generates the CleanEvict message on a fill resulting in eviction of a clean block. Before travelling downwards CleanEvict requests generate zero-time snoop requests to check if the same block is cached in upper levels of the memory hierarchy. If the block exists, the cache discards the CleanEvict message. The snoops check the tags, writeback queue and the MSHRs of upper level caches in a manner similar to snoops generated from HardPFReqs. Currently CleanEvicts keep travelling towards main memory unless they encounter the block corresponding to their address or reach main memory (since we have no well defined point of serialisation). Main memory simply discards CleanEvict messages. We have modified the behavior of Writebacks, such that they generate snoops to check for the presence of blocks in upper level caches. It is possible in our current implementation for a lower level cache to be writing back a block while a shared copy of the same block exists in the upper level cache. If the snoops find the same block in upper level caches, we set the BLOCK_CACHED flag in the Writeback message. We have also added logic to account for interaction of other message types with CleanEvicts waiting in the writeback queue. A simple example is of a response arriving at a cache removing any CleanEvicts to the same address from the cache's writeback queue.
2015-04-29	mem: Simplify page close checks for adaptive policies	Rizwana Begum
	Both open_adaptive and close_adaptive page policies keep the page open if a row hit is found. If a row hit is not found, the close_adaptive page policy precharges the row, and the open_adaptive policy precharges the row only if there is a bank conflict request waiting in the queue. This patch makes the checks for the above conditions simpler. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
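The two adaptive policies reduce to a short decision function. `PagePolicy` and `shouldPrecharge` are hypothetical names for this sketch; the inputs stand for "a queued request hits the open row" and "a queued request conflicts with the open row in the same bank".

```cpp
enum class PagePolicy { OpenAdaptive, CloseAdaptive };

// Decide whether to precharge (close) the row after the current access.
bool shouldPrecharge(PagePolicy policy, bool rowHitQueued,
                     bool bankConflictQueued)
{
    // Both policies keep the page open when a row hit is waiting.
    if (rowHitQueued)
        return false;
    // No row hit: close_adaptive always closes the row.
    if (policy == PagePolicy::CloseAdaptive)
        return true;
    // open_adaptive closes the row only if a bank conflict is waiting.
    return bankConflictQueued;
}
```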
2015-03-02	mem: Downstream components consume new crossbar delays	Marco Balboni
	This patch makes the caches and memory controllers consume the delay that is annotated to a packet by the crossbar. Previously many components simply threw these delays away. Note that the devices still do not pay for these delays.
2015-03-02	mem: Split port retry for all different packet classes	Andreas Hansson
	This patch fixes a long-standing issue with the port flow control. Before this patch the retry mechanism was shared between all different packet classes. As a result, a snoop response could get stuck behind a request waiting for a retry, even if the send/recv functions were split. This caused message-dependent deadlocks in stress-test scenarios. The patch splits the retry into one per packet (message) class. Thus, sendTimingReq has a corresponding recvReqRetry, sendTimingResp has recvRespRetry etc. Most of the changes to the code involve simply clarifying what type of request a specific object was accepting. The biggest change in functionality is in the cache downstream packet queue, facing the memory. This queue was shared by requests and snoop responses, and it is now split into two queues, each with their own flow control, but the same physical MasterPort. These changes fix the previously seen deadlocks.
2015-02-11	mem: Clarification of packet crossbar timings	Marco Balboni
	This patch clarifies the packet timings annotated when going through a crossbar. The old 'firstWordDelay' is replaced by 'headerDelay', which represents the delay associated with the delivery of the header of the packet. The old 'lastWordDelay' is replaced by 'payloadDelay', which represents the delay needed to process the payload of the packet. For now the uses and values remain identical. However, going forward the payloadDelay will be additive, and not include the headerDelay. Follow-on patches will make the headerDelay capture the pipeline latency incurred in the crossbar, whereas the payloadDelay will capture the additional serialisation delay.
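	The difference between the current and the planned additive meaning of the two delays can be shown with a little arithmetic. This sketch assumes that the old lastWordDelay is measured from the same origin as firstWordDelay and therefore includes it; the function names are hypothetical.

```python
def to_additive(first_word_delay, last_word_delay):
    """Convert old-style delays to the planned additive pair.

    Assumes last_word_delay already includes first_word_delay.
    Returns (header_delay, payload_delay).
    """
    if last_word_delay < first_word_delay:
        raise ValueError("last word cannot arrive before the first word")
    return first_word_delay, last_word_delay - first_word_delay


def total_delay(header_delay, payload_delay):
    """Time until the whole packet is delivered, with additive delays."""
    return header_delay + payload_delay
```

	For example, an old pair of (2, 10) ticks would become a header delay of 2 and a payload delay of 8, which still sums to 10.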
2015-01-20	mem: Move DRAM interleaving check to init	Andreas Hansson
	This patch fixes a bug where the DRAM controller tried to access the system cacheline size before the system pointer was initialised. It also fixes a bug where the granularity is 0 (no interleaving).
2014-12-23	config: Expose the DRAM ranks as a command-line option	Andreas Hansson
	This patch gives the user direct influence over the number of DRAM ranks to make it easier to tune the memory density without affecting the bandwidth (previously the only means of scaling the device count was through the number of channels). The patch also adds some basic sanity checks to ensure that the number of ranks is a power of two (since we rely on bit slices in the address decoding).
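	The power-of-two requirement follows from decoding the rank as a bit slice of the address. A minimal sketch of such a check and slice is given below; the helper names and the `low_bit` position are hypothetical and do not reflect the actual address mapping.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; False for zero, negatives and the rest."""
    return n > 0 and (n & (n - 1)) == 0


def rank_bits(ranks):
    """Number of address bits needed to select one of `ranks` ranks."""
    if not is_power_of_two(ranks):
        raise ValueError(f"number of ranks must be a power of two, got {ranks}")
    return ranks.bit_length() - 1


def slice_rank(addr, low_bit, ranks):
    """Extract the rank index from addr, starting at bit position low_bit."""
    rank_bits(ranks)  # reuse the sanity check
    return (addr >> low_bit) & (ranks - 1)
```

	With three ranks, for instance, the mask `ranks - 1` would be 0b10 and rank 1 could never be selected, which is why the check rejects such values.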
2014-12-23	mem: Ensure DRAM controller is idle when in atomic mode	Andreas Hansson
	This patch addresses an issue seen with the KVM CPU where the refresh events scheduled by the DRAM controller force the simulator to switch out of the KVM mode, thus killing performance. The current patch works around the fact that we currently have no proper API to inform a SimObject of the mode switches. Instead we rely on drainResume being called after any switch, and cache the previous mode locally to be able to decide on appropriate actions. The switcheroo regression requires a minor stats bump as a result.
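	The workaround of caching the previous mode can be sketched as follows. This is a toy illustration, not the controller's code: the class, its attributes and the assumption that the object starts out in a non-timing mode are all hypothetical.

```python
class ToyDramCtrl:
    """Toy controller that keeps refresh scheduled only in timing mode."""

    def __init__(self):
        self.was_timing = False         # mode seen at the previous resume
        self.refresh_scheduled = False

    def drain_resume(self, is_timing_now):
        """Called after any mode switch; compare with the cached mode."""
        if is_timing_now and not self.was_timing:
            # Entering timing mode: start the refresh machinery.
            self.refresh_scheduled = True
        elif not is_timing_now and self.was_timing:
            # Leaving timing mode (e.g. for KVM): stay idle, no events.
            self.refresh_scheduled = False
        self.was_timing = is_timing_now
```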
2014-12-23	mem: Add rank-wise refresh to the DRAM controller	Omar Naji
	This patch adds rank-wise refresh to the controller, as opposed to the channel-wide refresh currently in place. In essence each rank can be refreshed independently, and for this to be possible the controller is extended with a state machine per rank. Without this patch the data bus is always idle during a refresh, as all the ranks are refreshing at the same time. With the rank-wise refresh it is possible to use one rank while another one is refreshing, and thus the data bus can be kept busy. The patch introduces a Rank class to encapsulate the state per rank, and also shifts all the relevant banks, activation tracking etc to the rank. The arbitration is also updated to consider the state of the rank.
2014-12-23	mem: Fix a bug in the DRAM controller arbitration	Omar Naji
	Fix a minor issue that affects multi-rank systems.
2014-12-02	mem: Add a GDDR5 DRAM config	Omar Naji
	This patch adds a first cut GDDR5 config to accommodate the users combining gem5 and GPUSim. The config is based on a SK Hynix datasheet, and the Nvidia GTX580 specification. Someone from the GPUSim user-camp should tweak the default page-policy and static frontend and backend latencies.
2014-10-29	arm, mem: Fix drain bug and provide drain prints for more components.	Ali Saidi

2014-10-20	mem: Fix DRAM activationlLimit bug	Omar Naji
	Ensure that we do the proper event scheduling also when the activation limit is disabled.
2014-10-20	mem: Add DRAM device size and check against config	Omar Naji
	This patch adds the size of the DRAM device to the DRAM config. It also compares the actual DRAM size (calculated using information from the config) to the size defined in the system. If these two values do not match, gem5 will print a warning. To do correct DRAM research, the size of the memory defined in the system should match the size of the DRAM in the config. The timing and current parameters found in the DRAM configs are defined for a DRAM device of a specific size and would differ for a device of a different size.
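	A size check of this kind could look roughly like the sketch below. The capacity formula (device size times devices per rank times ranks per channel) and all names are assumptions made for illustration; the message itself only says the size is calculated from the config.

```python
def channel_capacity_mib(device_size_mib, devices_per_rank, ranks_per_channel):
    """Capacity implied by the DRAM config, in MiB (assumed formula)."""
    return device_size_mib * devices_per_rank * ranks_per_channel


def capacity_warning(system_size_mib, device_size_mib,
                     devices_per_rank, ranks_per_channel):
    """Return a warning string on mismatch, or None if the sizes agree."""
    actual = channel_capacity_mib(device_size_mib, devices_per_rank,
                                  ranks_per_channel)
    if actual != system_size_mib:
        return (f"DRAM device capacity ({actual} MiB) does not match "
                f"the size defined in the system ({system_size_mib} MiB)")
    return None
```

	For example, eight 512 MiB devices per rank and two ranks give 8192 MiB, so a system configured with 4096 MiB would trigger the warning in this sketch.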