gem5 - gem5

Age	Commit message (Collapse)	Author
2017-04-05	ruby: Fix MOESI_CMP_directory for new DMA status changes.	Javier Cano-Cano
	Multiple outstanding DMA requests introduced new DMA states that didn't be considered into slicc code. This patch implements the missed DMA state changes on MOESI_CMP_directory protocol. Change-Id: I700d441d76556b7e77e0d507904af6ec6ba59cc2 Signed-off-by: Michael LeBeane <michael.lebeane@amd.com> Reviewed-on: https://gem5-review.googlesource.com/2380 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Michael LeBeane <Michael.Lebeane@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2017-03-09	misc: add missing copyright/author information in previous commit	Pierre-Yves Péneau
	See a06a46f and a854373. Change-Id: Id66427db22b7d7764c218b9cd78d95db929f4127 Signed-off-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> Reviewed-on: https://gem5-review.googlesource.com/2224 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2017-03-09	ruby: fix MOESI_hammer directory to work with > 3GB memory	Lena Olson
	The MOESI_hammer directory assumes a contiguous address space, but X86 has an IO gap from 3-4GB. This patch allows the directory to work with more than 3GB of memory on X86. Assumptions: the physical address space (range of possible physical addresses) is 0-XGB when X <= 3GB, and 0-(X+1)GB when X > 3GB. If there is no IO gap this patch should still work. Change-Id: I5453a09e953643cada2c096a91d339a3676f55ee Reviewed-on: https://gem5-review.googlesource.com/2169 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2017-03-07	gpu-compute: Fix Python/C++ object hierarchy discrepancies	Andreas Sandberg
	The GPUCoalescer and the Shader classes have different base classes in C++ and Python. This causes subtle bugs in SWIG and compilation errors for PyBind. Change-Id: I1ddd2a8ea43f083470538ddfea891347b21d14d8 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-on: https://gem5-review.googlesource.com/2228 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
2017-03-03	mem: Make blkAlign a common function between all tag classes	Nikos Nikoleris
	blkAlign was defined as a separate function in the base associative and fully-associative tags classes although both functions implemented identical functionality. This patch moves the blkAlign in the base tags class. Change-Id: I3d415d0e62bddeec7ce0d559667e40a8c5fdc2d4 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2017-03-03	mem: Use pkt::getBlockAddr instead of BaseCace::blockAlign	Nikos Nikoleris
	Change-Id: I0ed4e528cb750a323facdc811dde7f0ed1ff228e Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2017-03-01	ruby: fix and/or precedence in slicc	Lena Olson
	The slicc compiler currently treats && and \|\| with the same precedence. This is highly non-intuitive to people used to C, and was probably an error. This patch makes && bind tighter than \|\|. For example, previously: if (A \|\| B && C) compiled to: if ((A \|\| B) && C) With this patch, it compiles to: if (A \|\| (B && C)) Change-Id: Idbbd5b50cc86a8d6601045adc14a253284d7b791 Signed-off-by: Lena Olson (leolson@google.com) Reviewed-on: https://gem5-review.googlesource.com/2168 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Joe Gross <criusx@gmail.com> Reviewed-by: Sooraj Puthoor <puthoorsooraj@gmail.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> [ Rebased onto master ] Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-27	syscall_emul: [PATCH 15/22] add clone/execve for threading and multiprocess ↵	Brandon Potter
	simulations Modifies the clone system call and adds execve system call. Requires allowing processes to steal thread contexts from other processes in the same system object and the ability to detach pieces of process state (such as MemState) to allow dynamic sharing.
2017-02-21	mem: Remove unused size field from the CacheBlk class	Nikos Nikoleris
	Change-Id: I6149290d6d2ac1a4bd6165871c93d7b7d6a980ad Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-21	mem: Remove the unused asid field from the CacheBlk class	Nikos Nikoleris
	Change-Id: I29f45733c5fad822bdd0d8dcc7939d86b2e8c97b Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-21	mem: Remove unused arguments (asid/contex_id) from accessBlock	Nikos Nikoleris
	Change-Id: I79c2662fc81630ab321db8a75be6cd15fa07d372 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-21	mem: Remove unused type BlkList from the cache and the tags	Nikos Nikoleris
	Change-Id: If9ebb8488e8db587482ecfa99d2c12cfe5734fb9 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-21	mem: Remove unused functions from the tag classes	Nikos Nikoleris
	Change-Id: I4f3c2c027b1acaaf791a4c71086f34a9b9fbf4df Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-21	mem: Always use the helper function to invalidate a block	Nikos Nikoleris
	Policies like the LRU need to be notified when a block is invalidated, the helper function does this along with invalidating the block. Change-Id: I3ed59cf07938caa7f394ee6054b0af9e00b267ea Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-21	mem: Fix MSHR assert triggering for invalidated prefetches	Sascha Bischoff
	This changeset updates an assert in src/mem/cache/mshr.cc which was erroneously catching invalidated prefetch requests. These requests can become invalidated if another component writes (an exclusive access) to this location during the time that the read request is in flight. The original assert made the assumption that these cases can only occur for reads generated by the CPU, and hence prefetcher-generated requests would sometimes trip the assert. Change-Id: If4f043273a688c2bab8f7a641192a2b583e7b20e Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-21	mem: Populate the secure flag in the writeback visitor	Nikos Nikoleris
	Previously the writeback visitor would not consider and set the secure flag for the blocks that are written back to memory. This patch fixes this. Change-Id: Ie1a425fa9211407a70a4343f2c6b3d073371378f Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-21	mem: Remove stale argument from a panic statement	Nikos Nikoleris
	Change-Id: I7ae5fa44a937f641a2ddd242a49e0cd23f68b9f2 Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-19	sim: Ensure draining is deterministic	Andreas Hansson
	The traversal of drainable objects could potentially be non-deterministic when using an unordered set containing object pointers. To ensure that the iteration is deterministic, we switch to a vector. Note that the lookup and traversal of the drainable objects is not performance critical, so the change has no negative consequences.
2017-02-19	mem: Ensure deferred snoops are cache-line aligned	Andreas Hansson
	This patch fixes a bug where a deferred snoop ended up being to a partial cache line, and not cache-line aligned, all due to how we copy the packet.
2017-02-19	mem: Fix memory footprint includes	Andreas Hansson
	Fix compilation errors due to missing include.
2017-02-15	mem, stats: fix typos in CommMonitor and Stats	Pierre-Yves Péneau
	Signed-off-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed at http://reviews.gem5.org/r/3802/
2017-02-15	mem, misc: fix building issue with CommMonitor (unused variables)	Pierre-Yves Péneau
	Signed-off-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed at http://reviews.gem5.org/r/3801/
2017-02-15	mem: fix assertion in respondEvent	Wendy Elsasser
	Assertion in the respondEvent erroneously fired. The assertion verifies that the controller has not moved to a low-power state prior to receiving read data from the memory. The original assertion triggered if the state was not: PWR_IDLE or PWR_ACT. In the case that failed, a periodic refresh event occurred around the read. The REF is stalled until the final read burst is issued and the subsequent PRE closes the bank. While the PRE will temporarily move the state to PWR_IDLE, state will immediately transition to PWR_REF due to the pending refresh operation. This state does not match the assertion, which is subsequently triggered. Fixed the assertion by explicitly checking that the state is not a low power state !PWR_SREF && !PWR_PRE_PDN && !PWR_ACT_PDN Change-Id: I82921a733bbeac2bcb5a487c2f981448d41ed50b Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
2017-02-14	mem: Update DRAM configuration names	Wendy Elsasser
	Names of DRAM configurations were updated to reflect both the channel and device data width. Previous naming format was: <DEVICE_TYPE>_<DATA_RATE>_<CHANNEL_WIDTH> The following nomenclature is now used: <DEVICE_TYPE>_<DATA_RATE>_<n>x<w> where n = The number of devices per rank on the channel x = Device width Total channel width can be calculated by n*w Example: A 64-bit DDR4, 2400 channel consisting of 4-bit devices: n = 16 w = 4 The resulting configuration name is: DDR4_2400_16x4 Updated scripts to match new naming convention. Added unique configurations for DDR4 for: 1) 16x4 2) 8x8 3) 4x16 Change-Id: Ibd7f763b7248835c624309143cb9fc29d56a69d1 Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2017-02-12	ruby: fix round robin arbiter in garnet2.0	Tushar Krishna
	The rr arbiter pointer in garnet was getting updated on every request, even if there is no grant. This was leading to a huge variance in wait time at a router at high injection rates. This patch corrects it to update upon a grant.
2017-02-11	mem: fix printing of 1st cache tags line	Bjoern A. Zeeb
	Rather than having the 1st line on the Log line and every other line on its own, add a new line to have a common format for all of them. Makes parsing a lot easier. Reviewed at http://reviews.gem5.org/r/3808/ Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-09	misc: add a MasterId to the ExternalPort	Christian Menard
	The Request constructor requires a MasterID. However, an external transactor has no chance of getting a MasterID as it does not have a pointer to the System. This patch adds a MasterID to ExternalMaster to allow external modules to easily genrerate new Requests. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-27	mem: Refactor CommMonitor stats, add basic atomic mode stats	Rahul Thakur
	Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-27	mem: Add memory footprint probe	Rahul Thakur
	Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-11-09	style: [patch 3/22] reduce include dependencies in some headers	Brandon Potter
	Used cppclean to help identify useless includes and removed them. This involved erroneously included headers, but also cases where forward declarations could have been used rather than a full include.
2017-01-19	ruby: guard usage of GPUCoalescer code in Profiler	Tony Gutierrez
	the GPUCoalescer code is used in the ruby profiler regardless of whether or not the coalescer code has been compiled, which can lead to link/run time errors. here we add #ifdefs to guard the usage of GPUCoalescer code. eventually we should refactor this code to use probe points.
2017-01-19	ruby: Check MessageBuffer space in garnet NetworkInterface	Matthew Poremba
	Garnet's NetworkInterface does not consider the size of MessageBuffers when ejecting a Message from the network. Add a size check for the MessageBuffer and only enqueue if space is available. If space is not available, the message if placed in a queue and the credit is held. A callback from the MessageBuffer is implemented to wake the NetworkInterface. If there are messages in the stalled queue, they are processed first, in a FIFO manner and if succesfully ejected, the credit is finally sent back upstream. The maximum size of the stall queue is equal to the number of valid VNETs with MessageBuffers attached.
2017-01-19	ruby: Add occupancy stats to MessageBuffers	Matthew Poremba
	This patch is an updated version of /r/3297. "The most important statistic for measuring memory hierarchy performance is throughput, which is affected by independent variables, buffer sizing and communication latency. It is difficult/impossible to debug performance issues through series buffers without knowing which are the bottlenecks. For finite buffers, this patch adds statistics for the average number of messages in the buffer, the occupancy of the buffer slots, and number of message stalls."
2017-01-19	ruby: Check all VNETs for injection in garnet NetworkInterface	Matthew Poremba
	The NetworkInterface wakeup currently iterates over all VNETs and breaks the loop if a VNET is unable to allocate a VC. This can cause a deadlock if a lower numbered VNET is unable to allocate a VC while a higher numbered VNET has idle VCs. This seems like a bug as Garnet 1.0 uses a while loop over an if-statement, suggesting the break was intended for this while loop. This patch removes the break statement, which allows up to one message to be dequeued from a VNET and injected into the network.
2016-11-09	style: [patch 1/22] use /r/3648/ to reorganize includes	Brandon Potter

2016-12-20	ruby: Make MessageBuffers actually finite sized	Joel Hestness
	When Ruby controllers stall messages in MessageBuffers, the buffer moves those messages off the priority heap and into a per-address stall map. When buffers are finite-sized, the test areNSlotsAvailable() only checks the size of the priority heap, but ignores the stall map, so the map is allowed to grow unbounded if the controller stalls numerous messages. This patch fixes the problem by tracking the stall map size and testing the total number of messages in the buffer appropriately.
2016-12-20	ruby: fix typo in DMASequencer::ackCallback()	Tony Gutierrez

2016-12-20	ruby: fix issue with unused var in DMASequencer	Tony Gutierrez
	the iterator declared in DMASequencer::ackCallback() is only used in an assert, this causes clang to fail when building fast. here we move the find call on the request table directly into the assert.
2016-12-19	mem: Make the BaseXBar public to not confuse Python wrappers	Andreas Sandberg
	The Python wrappers generally assume that destructors are public. Make the BaseXBar destructor public to avoid confusing the Python wrapper. Change-Id: If958802409c0be74e875dd6e279742abfdb3ede1 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2016-12-15	ruby: Detect garnet network-level deadlock.	Jieming Yin
	This patch detects garnet network deadlock by monitoring network interfaces. If a network interface continuously fails to allocate virtual channels for a message, a possible deadlock is detected.
2016-12-05	ruby: Remove RubyMemoryControl and associated files	Andreas Hansson
	This patch removes the deprecated RubyMemoryControl. The DRAMCtrl module should be used instead.
2016-12-05	mem: Respond to InvalidateReq when the block is (pending) dirty	Nikos Nikoleris
	Previously when an InvalidateReq snooped a cache with a dirty block or a pending modified MSHR, it would invalidate the block or set the postInv flag. The cache would not send an InvalidateResp. though, causing memory order violations. This patches changes this behavior, making the cache with the dirty block or pending modified MSHR the ordering point. Change-Id: Ib4c31012f4f6693ffb137cd77258b160fbc239ca Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05	mem: Invalidate a blk when servicing the 1st invalidating target	Nikos Nikoleris
	Previously an MSHR with one or more invalidating targets would first service all targets in the MSHR TargetList and then invalidate the block. As a result any service snooping targets would lookup in the cache and incorrectly find the block. This patch forces the invalidation to happen when the first invalidating target is encountered. Change-Id: I9df15de24e1d351cd96f5a2c424d9a03d81c2cce Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05	mem: Allow non invalidating snoops on an InvalidateReq MSHR	Nikos Nikoleris
	This patch changes an assertion that previously assumed that a non invalidating snoop request should never be serviced by an InvalidateReq MSHR. The MSHR serves as the ordering point for the snooping packet. When the InvalidateResp reaches the cache the snooping packet snoops the caches above to find the requested block. One or more of the caches above will have the block since earlier it has seen a WriteLineReq. Change-Id: I0c147c8b5d5019e18bd34adf9af0fccfe431ae07 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05	mem: Don't use hasSharers in the snoopFilter for memory responses	Nikos Nikoleris
	When the snoopFilter receives a response, it updates its state using the hasSharers flag (indicates whether there are more than one copies of the block in the caches above). The hasSharers flag of the packet was previously populated when the request was traversing and snooping the caches looking for the block. 1) When the response is coming from the memory-side port, its order with respect to other responses is not necessarily preserved (e.g., a request that arrived second to the xbar can get its response first). As a result the snoopFilter might process responses out of order updating its residency information using the non valid hasSharers flag which was populated much earlier. 2) When the response is from an on-chip, the MSHRs preserve a well defined order and the hasSharers flag should contain valid information. This patch changes the snoopFilter by avoiding the hasSharers flag when the response is from the memory-side port. Change-Id: Ib2d22a5b7bf3eccac64445127d2ea20ee74bb25b Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05	mem: Always use InvalidateReq to service WriteLineReq misses	Nikos Nikoleris
	Previously, a WriteLineReq that missed in a cache would send out an InvalidateReq if the block lookup failed or an UpgradeReq if the block lookup succeeded but the block had sharers. This changes ensures that a WriteLineReq always sends an InvalidateReq to invalidate all copies of the block and satisfy the WriteLineReq. Change-Id: I207ff5b267663abf02bc0b08aeadde69ad81be61 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05	mem: Assert that the responderHadWritable is set only once	Nikos Nikoleris
	Change-Id: Ie3beeef25331f84a0a5bcc17f7a791f4a829695b Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05	mem: Ensure InvalidateReq is considered isForward by MSHRs	Andreas Hansson
	This patch fixes an issue where an MSHR would incorrectly be perceived to provide data to targets arriving after an InvalidateReq. To address this the InvalidateReq is now treated as isForward, much like an UpgradeReq that did not hit in the cache. Change-Id: Ia878444d949539b5c33fd19f3e12b0b8a872275e Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05	mem: Make packet debug printing more uniform	Nikos Nikoleris
	Previously DPRINTFs printing information about a packet would use ad hoc formats. This patch changes all DPRINTFs to use the print function defined by the packet class, making the packet printing format more uniform and easier to change. Change-Id: Idd436a9758d4bf70c86a574d524648b2a2580970 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05	mem: Service only the 1st FromCPU MSHR target on ReadRespWithInv	Nikos Nikoleris
	A response to a ReadReq can either be a ReadResp or a ReadRespWithInvalidate. As we add targets to an MSHR for a ReadReq we assume that the response will be a ReadResp. When the response is invalidating (ReadRespWithInvalidate) servicing more than one targets can potentially violate the memory ordering. This change fixes the way we handle a ReadRespWithInvalidate. When a cache receives a ReadRespWithInvalidate we service only the first FromCPU target and all the FromSnoop targets from the MSHR target list. The rest of the FromCPU targets are deferred and serviced by a new request. Change-Id: I75c30c268851987ee5f8644acb46f440b4eeeec2 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>