summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2017-01-20syscall_emul: #ifdef new system calls to allow builds on OSX and BSDBrandon Potter
2017-01-19ruby: guard usage of GPUCoalescer code in ProfilerTony Gutierrez
the GPUCoalescer code is used in the ruby profiler regardless of whether or not the coalescer code has been compiled, which can lead to link/run time errors. here we add #ifdefs to guard the usage of GPUCoalescer code. eventually we should refactor this code to use probe points.
2017-01-19ruby: Check MessageBuffer space in garnet NetworkInterfaceMatthew Poremba
Garnet's NetworkInterface does not consider the size of MessageBuffers when ejecting a Message from the network. Add a size check for the MessageBuffer and only enqueue if space is available. If space is not available, the message if placed in a queue and the credit is held. A callback from the MessageBuffer is implemented to wake the NetworkInterface. If there are messages in the stalled queue, they are processed first, in a FIFO manner and if succesfully ejected, the credit is finally sent back upstream. The maximum size of the stall queue is equal to the number of valid VNETs with MessageBuffers attached.
2017-01-19ruby: Add occupancy stats to MessageBuffersMatthew Poremba
This patch is an updated version of /r/3297. "The most important statistic for measuring memory hierarchy performance is throughput, which is affected by independent variables, buffer sizing and communication latency. It is difficult/impossible to debug performance issues through series buffers without knowing which are the bottlenecks. For finite buffers, this patch adds statistics for the average number of messages in the buffer, the occupancy of the buffer slots, and number of message stalls."
2017-01-19ruby: Check all VNETs for injection in garnet NetworkInterfaceMatthew Poremba
The NetworkInterface wakeup currently iterates over all VNETs and breaks the loop if a VNET is unable to allocate a VC. This can cause a deadlock if a lower numbered VNET is unable to allocate a VC while a higher numbered VNET has idle VCs. This seems like a bug as Garnet 1.0 uses a while loop over an if-statement, suggesting the break was intended for this while loop. This patch removes the break statement, which allows up to one message to be dequeued from a VNET and injected into the network.
2016-11-09syscall_emul: [patch 2/22] move SyscallDesc into its own .hh and .ccBrandon Potter
The class was crammed into syscall_emul.hh which has tons of forward declarations and template definitions. To clean it up a bit, moved the class into separate files and commented the class with doxygen style comments. Also, provided some encapsulation by adding some accessors and a mutator. The syscallreturn.hh file was renamed syscall_return.hh to make it consistent with other similarly named files in the src/sim directory. The DPRINTF_SYSCALL macro was moved into its own header file with the include the Base and Verbose flags as well. --HG-- rename : src/sim/syscallreturn.hh => src/sim/syscall_return.hh
2016-11-09style: [patch 1/22] use /r/3648/ to reorganize includesBrandon Potter
2017-01-03sim: Remove declaration of unused CountedDrainEventAndreas Sandberg
The CountedDrainEvent event was used to keep track of objects that required additional simulation to drain. It was removed as a part of the great drain rewrite, but the declaration remained. Change-Id: I767a3213669040d3f27e2afafa2e4a5bb997e325 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2017-01-03python: Don't use Swig to cast statsAndreas Sandberg
Call the stat visitor from the stat itself rather than casting stats in Python. This reduces the number of ways visitors are called. Change-Id: Ic4d0b7b32e3ab9897b9a34cd22d353f4da62d738 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Joe Gross <joseph.gross@amd.com>
2017-01-03sim: Remove redundant export_method_cxx_predeclsAndreas Sandberg
The headers declared in export_method_cxx_predecls are redundant since a SimObject's main header is automatically included. Change-Id: Ied9e84630b36960e54efe91d16f8c66fba7e0da0 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Joe Gross <joseph.gross@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-23sim: Fix SE mode checkpoint restore file handlingJoel Hestness
When restoring from a checkpoint, the simulation used to use file handles from the checkpoint. This disallows multiple separate restore simulations from using separate input and output files and directories, and plays havoc when the checkpointed file locations may have changed. Add handling to allow the command line specified files to be used as input/output for the restored simulation (Note: this is the similar functionality to FS mode for output and error).
2016-12-21cpu: implement an L-TAGE branch predictorArthur Perais
This patch implements an L-TAGE predictor, based on André Seznec's code available from CBP-2 (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). Signed-off-by Jason Lowe-Power <jason@lowepower.com>
2016-12-21cpu: disallow speculative update of branch predictor tables (o3)Arthur Perais
The Minor and o3 cpu models share the branch prediction code. Minor relies on the BPredUnit::squash() function to update the branch predictor tables on a branch mispre- diction. This is fine because Minor executes in-order, so the update is on the correct path. However, this causes the branch predictor to be updated on out-of-order branch mispredictions when using the o3 model, which should not be the case. This patch guards against speculative update of the branch prediction tables. On a branch misprediction, BPredUnit::squash() calls BpredUnit::update(..., squashed = true). The underlying branch predictor tests against the value of squashed. If it is true, it restores any speculatively updated internal state it might have (e.g., global/local branch history), then returns. If false, it updates its prediction tables. Previously, exist- ing predictors did not test against the "squashed" parameter. To accomodate for this change, the Minor model must now call BPredUnit::squash() then BPredUnit::update(..., squashed = false) on branch mispredictions. Before, calling BpredUnit::squash() performed the prediction tables update. The effect is a slight MPKI improvement when using the o3 model. A further patch should perform the same modifications for the indirect target predictor and BTB (less critical). Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21cpu: correct comments in tournament branch predictorArthur Perais
The tournament predictor is presented as doing speculative update of the global history and non-speculative update of the local history used to generate the branch prediction. However, the code does speculative update of both histories. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21cpu: Resolve targets of predicted 'taken' decode for O3Arthur Perais
The target of taken conditional direct branches does not need to be resolved in IEW: the target can be computed at decode, usually using the decoded instruction word and the PC. The higher-than-necessary penalty is taken only on conditional branches that are predicted taken but miss in the BTB. Thus, this is mostly inconsequential on IPC if the BTB is big/associative enough (fewer capacity/conflict misses). Nonetheless, what gem5 simulates is not representative of how conditional branch targets can be handled. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21cpu: Clarify meaning of cachePorts variable in lsq_unit.hh of O3Arthur Perais
cachePorts currently constrains the number of store packets written to the D-Cache each cycle), but loads currently affect this variable. This leads to unexpected congestion (e.g., setting cachePorts to a realistic 1 will in fact allow a store to WB only if no loads have accessed the D-Cache this cycle). In the absence of arbitration, this patch decouples how many loads can be done per cycle from how many stores can be done per cycle. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-20ruby: Make MessageBuffers actually finite sizedJoel Hestness
When Ruby controllers stall messages in MessageBuffers, the buffer moves those messages off the priority heap and into a per-address stall map. When buffers are finite-sized, the test areNSlotsAvailable() only checks the size of the priority heap, but ignores the stall map, so the map is allowed to grow unbounded if the controller stalls numerous messages. This patch fixes the problem by tracking the stall map size and testing the total number of messages in the buffer appropriately.
2016-12-20ruby: fix typo in DMASequencer::ackCallback()Tony Gutierrez
2016-12-20ruby: fix issue with unused var in DMASequencerTony Gutierrez
the iterator declared in DMASequencer::ackCallback() is only used in an assert, this causes clang to fail when building fast. here we move the find call on the request table directly into the assert.
2016-12-19arm: provide correct timer availability in ID_PFR1 registerCurtis Dunham
Change-Id: Id4cd839c12b70616017a5830e3f9bbb59b0f97ba Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19arm: compute ID_AA64PFR{0,1}_EL1 registersCurtis Dunham
Compute the proper values of the aforementioned registers from the system configuration rather than configuring the values themselves. Change-Id: If9774b6610a29568b80ae4866107b9a6a5b5be0f Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19arm: compute ID_PFR{0,1} registersCurtis Dunham
Compute the proper values of the aforementioned registers from the system configuration rather than configuring the values themselves. Change-Id: Ie7685b5d8b5f2dd9d6380b4af74f16d596b2bfd1 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19arm: miscreg refactoringCurtis Dunham
Change-Id: I4e9e8f264a4a4239dd135a6c7a1c8da213b6d345 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19arm: audit SCTLRCurtis Dunham
Change-Id: I814f1431a5f754f75721c9ac51171f860a714d24 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19arm: remove SCTLR.FICurtis Dunham
Removed from ARMARM. Change-Id: Ie8f28e4fa6e1b46dfd9c8c4b379e5b42fe25421d Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19arm: update AArch{64,32} register mappingsCurtis Dunham
Change-Id: Idaaaeb3f7b1a0bdbf18d8e2d46686c78bb411317 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-19mem: Make the BaseXBar public to not confuse Python wrappersAndreas Sandberg
The Python wrappers generally assume that destructors are public. Make the BaseXBar destructor public to avoid confusing the Python wrapper. Change-Id: If958802409c0be74e875dd6e279742abfdb3ede1 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2016-12-19python: Export periodicStatDumpAndreas Sandberg
Some configuration scripts need periodic stat dumps. One of the ways this can be achieved is by using the pariodicStatDump helper function. This function was previously only exported in the internal name space. Export it as a normal function in m5.stat instead. Change-Id: Ic88bf1fd33042a62ab436d5944d8ed778264ac98 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com>
2016-12-19dev: Include DmaDevice in NULL buildsAndreas Sandberg
Builds for the NULL ISA include Device.py, which contains the Python declaration of DmaDevice, but don't include the actual C++ implementation. Add dma_device.cc to the NULL build to the Python and C++ worlds consistent again. Change-Id: I47a57181a1f4d5a7276467678bf16fbc7f161681 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com>
2016-12-19python: Fix incorrect header in the DmaDevice wrapperAndreas Sandberg
The header declared in the DmaDevice wrapper doesn't actually contain the DmaDevice class. This can potentially lead to incorrect type cases in Swig. Change-Id: If2266d4180d1d6fd13359a81067068854c5e96fe Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com>
2016-12-19sim: Remove redundant buildEnv importAndreas Sandberg
Change-Id: Id6bdbc0c988aa92b96e292cabc913e6b974f14bb Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2016-12-15ruby: Detect garnet network-level deadlock.Jieming Yin
This patch detects garnet network deadlock by monitoring network interfaces. If a network interface continuously fails to allocate virtual channels for a message, a possible deadlock is detected.
2016-11-09base: remove header file to prevent a macro name collisionBrandon Potter
2016-12-15syscall_emul: implement fallocateBrandon Potter
2016-12-15syscall_emul: add support for x86 statfs system callsBrandon Potter
2016-12-15syscall_emul: extend sysinfo system call to include mem_unitBrandon Potter
2016-12-06dev: Fix race conditions at terminating dist-gem5 simulationsGabor Dozsa
Two problems may arise when a distributed gem5 simulation terminates: (i) simulation thread(s) may get stuck in an incomplete synchronisation event which prohibits processing the simulation exit event; and (ii) a stale receiver thread may try to access objects that have already been deleted while exiting gem5. This patch terminates receive threads properly and aborts the processing of any incomplete synchronisation event. Change-Id: I72337aa12c7926cece00309640d478b61e55a429 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-12-05ruby: Remove RubyMemoryControl and associated filesAndreas Hansson
This patch removes the deprecated RubyMemoryControl. The DRAMCtrl module should be used instead.
2016-12-05mem: Respond to InvalidateReq when the block is (pending) dirtyNikos Nikoleris
Previously when an InvalidateReq snooped a cache with a dirty block or a pending modified MSHR, it would invalidate the block or set the postInv flag. The cache would not send an InvalidateResp. though, causing memory order violations. This patches changes this behavior, making the cache with the dirty block or pending modified MSHR the ordering point. Change-Id: Ib4c31012f4f6693ffb137cd77258b160fbc239ca Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05mem: Invalidate a blk when servicing the 1st invalidating targetNikos Nikoleris
Previously an MSHR with one or more invalidating targets would first service all targets in the MSHR TargetList and then invalidate the block. As a result any service snooping targets would lookup in the cache and incorrectly find the block. This patch forces the invalidation to happen when the first invalidating target is encountered. Change-Id: I9df15de24e1d351cd96f5a2c424d9a03d81c2cce Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05mem: Allow non invalidating snoops on an InvalidateReq MSHRNikos Nikoleris
This patch changes an assertion that previously assumed that a non invalidating snoop request should never be serviced by an InvalidateReq MSHR. The MSHR serves as the ordering point for the snooping packet. When the InvalidateResp reaches the cache the snooping packet snoops the caches above to find the requested block. One or more of the caches above will have the block since earlier it has seen a WriteLineReq. Change-Id: I0c147c8b5d5019e18bd34adf9af0fccfe431ae07 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05mem: Don't use hasSharers in the snoopFilter for memory responsesNikos Nikoleris
When the snoopFilter receives a response, it updates its state using the hasSharers flag (indicates whether there are more than one copies of the block in the caches above). The hasSharers flag of the packet was previously populated when the request was traversing and snooping the caches looking for the block. 1) When the response is coming from the memory-side port, its order with respect to other responses is not necessarily preserved (e.g., a request that arrived second to the xbar can get its response first). As a result the snoopFilter might process responses out of order updating its residency information using the non valid hasSharers flag which was populated much earlier. 2) When the response is from an on-chip, the MSHRs preserve a well defined order and the hasSharers flag should contain valid information. This patch changes the snoopFilter by avoiding the hasSharers flag when the response is from the memory-side port. Change-Id: Ib2d22a5b7bf3eccac64445127d2ea20ee74bb25b Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05mem: Always use InvalidateReq to service WriteLineReq missesNikos Nikoleris
Previously, a WriteLineReq that missed in a cache would send out an InvalidateReq if the block lookup failed or an UpgradeReq if the block lookup succeeded but the block had sharers. This changes ensures that a WriteLineReq always sends an InvalidateReq to invalidate all copies of the block and satisfy the WriteLineReq. Change-Id: I207ff5b267663abf02bc0b08aeadde69ad81be61 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com>
2016-12-05mem: Assert that the responderHadWritable is set only onceNikos Nikoleris
Change-Id: Ie3beeef25331f84a0a5bcc17f7a791f4a829695b Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05mem: Ensure InvalidateReq is considered isForward by MSHRsAndreas Hansson
This patch fixes an issue where an MSHR would incorrectly be perceived to provide data to targets arriving after an InvalidateReq. To address this the InvalidateReq is now treated as isForward, much like an UpgradeReq that did not hit in the cache. Change-Id: Ia878444d949539b5c33fd19f3e12b0b8a872275e Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05mem: Make packet debug printing more uniformNikos Nikoleris
Previously DPRINTFs printing information about a packet would use ad hoc formats. This patch changes all DPRINTFs to use the print function defined by the packet class, making the packet printing format more uniform and easier to change. Change-Id: Idd436a9758d4bf70c86a574d524648b2a2580970 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05cpu: Change traffic generators to use different values for writesNikos Nikoleris
Previously all traffic generators would use the same value for write requests. With this change traffic generators use their master id as the payload of write requests making them more useful for the memchecker. Change-Id: Id1a6b8f02853789b108ef6003f4c32ab929bb123 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05mem: Service only the 1st FromCPU MSHR target on ReadRespWithInvNikos Nikoleris
A response to a ReadReq can either be a ReadResp or a ReadRespWithInvalidate. As we add targets to an MSHR for a ReadReq we assume that the response will be a ReadResp. When the response is invalidating (ReadRespWithInvalidate) servicing more than one targets can potentially violate the memory ordering. This change fixes the way we handle a ReadRespWithInvalidate. When a cache receives a ReadRespWithInvalidate we service only the first FromCPU target and all the FromSnoop targets from the MSHR target list. The rest of the FromCPU targets are deferred and serviced by a new request. Change-Id: I75c30c268851987ee5f8644acb46f440b4eeeec2 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05mem: Keep track of allocOnFill in the TargetListNikos Nikoleris
Previously the information of whether a response was allocating or not was a property of the MSHR. This change makes this flag a property of the TargetList. Differernt TargetLists, e.g. the targets and the deferred targets lists might have different values. Additionally, the information about whether each of the target expects an allocating response is stored inside the TargetList container. This allows for repopulating the flag in case some of the targets are removed. Change-Id: If3ec2516992f42a6d9da907009ffe3ab8d0d2021 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>
2016-12-05mem: Add support for repopulating the flags of an MSHR TargetListNikos Nikoleris
This patch adds support for repopulating the flags of an MSHR TargetList. The added functionality makes it possible to remove targets from a TargetList without leaving it in an inconsistent state. Change-Id: I3f7a8e97bfd3e2e49bebad056d11bbfb087aad91 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Stephan Diestelhorst <stephan.diestelhorst@arm.com>