path: root/src/cpu/pred
Age | Commit message | Author
2018-12-11 | cpu: Fixed typos in parameter/stats descriptions | Pau Cabre
Change-Id: I7b3274a3e37128da35f497da150af08343e97ee6 Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14795 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Ilias Vougioukas <ilias.vougioukas@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2018-12-11 | cpu: Added parameters to enable/disable features in LTAGE | Pau Cabre
They are for the following features in the LTAGE loop predictor:
- Hashing for calculating the loop table entry
- Add direction information
- Add speculative iteration number information
Change-Id: I395f4526163ee0d0229d1e87cde2bb046f1dd43a Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14597 Reviewed-by: Ilias Vougioukas <ilias.vougioukas@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Louis Delhez <ldelhez@ucla.edu> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2018-11-28 | cpu: Added new stats to TAGE and LTAGE branch predictors | Pau Cabre
They are used to tell which component of the predictor is providing the prediction and whether it is correct or wrong. Change-Id: I7b3db66535f159091f1b37d70c2d942d50b20fb2 Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14535 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2018-11-28 | cpu: split LTAGE implementation into a base TAGE and a derived LTAGE | Pau Cabre
The new derived LTAGE is equivalent to the original LTAGE implementation. The default values of the TAGE branch predictor match the settings of the 8C-TAGE configuration described in https://www.jilp.org/vol8/v8paper1.pdf Change-Id: I8323adbfd5c9a45db23cfff234218280e639f9ed Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14435 Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2018-11-22 | cpu: Made LTAGE parameters configurable | Pau Cabre
This includes TAGE tag sizes, TAGE table sizes, U counters reset period, loop predictor associativity, path history size, the USE_ALT_ON_NA size and the WITHLOOP size Change-Id: I935823f0a5794f5d55b744263798897a813dc1bd Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14417 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2018-11-22 | cpu: Fixed useful counter handling in LTAGE | Pau Cabre
Increased to 2 bits of useful counter per TAGE entry as described in the LTAGE paper (and made the size configurable). Changed how the useful counters are incremented/decremented as described in the LTAGE paper. Change-Id: I8c692cc7c180d29897cb77781681ff498a1d16c8 Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14215 Reviewed-by: Ilias Vougioukas <ilias.vougioukas@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2018-11-22 | cpu: Fixes on the loop predictor part of LTAGE | Pau Cabre
Fixed the following fields of the loop predictor entries as described in the LTAGE paper:
- Age counter (it was 3 bits and it should be 8 bits)
- Tag (it was 16 bits and it should be 14 bits). Also, it sometimes used int variables and sometimes uint16_t, leading to wrong behaviour
- Confidence counter (it was 2 bits in some parts of the code and 3 bits in some other parts. It should be 2 bits)
- Iteration counters (they were 16 bits and they should be 14 bits)
All the new sizes are now configurable.
Change-Id: I8884c7454c1e510b65160eb4d5749d3259d34096 Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14216 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
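The field widths listed above can be illustrated with a small sketch. The struct layout and helper names below are hypothetical (they are not taken from the gem5 sources); only the widths (8-bit age, 14-bit tag, 2-bit confidence, 14-bit iteration counters) come from the commit message.

```cpp
#include <cstdint>

// Hypothetical loop-predictor entry; widths in comments follow the commit text.
struct LoopEntry {
    uint16_t numIter = 0;      // 14 bits: iteration count learned for the loop
    uint16_t currentIter = 0;  // 14 bits: iterations seen in the current run
    uint16_t tag = 0;          // 14 bits
    uint8_t confidence = 0;    // 2 bits
    uint8_t age = 0;           // 8 bits
};

// Largest value representable in `bits` bits (bits must be in 1..31).
constexpr uint32_t maxValue(unsigned bits) { return (1u << bits) - 1; }

// Keep only the low `bits` bits, so a 14-bit tag is always compared as 14 bits
// regardless of the C++ type that happens to hold it.
constexpr uint16_t truncate(uint32_t value, unsigned bits) {
    return static_cast<uint16_t>(value & maxValue(bits));
}

// Saturating increment for a counter that is `bits` wide.
inline void satIncrement(uint8_t &ctr, unsigned bits) {
    if (ctr < maxValue(bits)) {
        ++ctr;
    }
}
```

Passing the width explicitly, instead of relying on the storage type, is one way to avoid the int/uint16_t mismatch the commit describes.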
2018-11-17 | cpu: Fix LTAGE max number of allocations on update | Pau Cabre
The LTAGE paper states that only one TAGE entry can be allocated when updating Change-Id: I6cfb4d80ce835e93d4bf5099ef88a7d425abaddd Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14195 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Ilias Vougioukas <ilias.vougioukas@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2018-11-17 | configs: Added an option for choosing branch predictor type | Pau Cabre
Added the parameter "--bp-type" to set the branch predictor type. Added the parameter "--list-bp-types" to list all the available branch predictor types. Change-Id: Ia6aae90c784aef359b6d8233c8383cd7a871aca1 Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14015 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2018-11-14 | cpu: Fixed ratio of pred to hyst bits for LTAGE Bimodal | Pau Cabre
The LTAGE paper states 1 hyst bit shared for 4 pred bits. Made this ratio configurable, with 4 as the default. Also changed the Bimodal structure to use two std::vector<bool> (one for pred and one for hyst bits). Change-Id: I6793e8e358be01b75b8fd181ddad50f259862d79 Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14120 Reviewed-by: Ilias Vougioukas <ilias.vougioukas@arm.com> Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
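A minimal sketch of a bimodal table with shared hysteresis bits, stored in two std::vector<bool> as the commit describes. The class name, the initial values, and the update rule (treating pred and hyst as the high and low bit of a 2-bit counter) are assumptions for illustration, not the gem5 code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bimodal table: one prediction bit per entry and one hysteresis
// bit shared by 2^logRatio neighbouring entries (logRatio = 2 gives 1:4).
class SharedHystBimodal {
  public:
    // `entries` must be at least 1.
    SharedHystBimodal(std::size_t entries, unsigned logRatio)
        : logRatio_(logRatio),
          pred_(entries, false),
          hyst_(((entries - 1) >> logRatio) + 1, true) {}

    bool predict(std::size_t idx) const { return pred_[idx]; }

    void update(std::size_t idx, bool taken) {
        const std::size_t h = idx >> logRatio_;
        // Rebuild the 2-bit counter from the two separate bit vectors.
        int ctr = (pred_[idx] ? 2 : 0) + (hyst_[h] ? 1 : 0);
        if (taken) {
            if (ctr < 3) ++ctr;
        } else {
            if (ctr > 0) --ctr;
        }
        pred_[idx] = (ctr >> 1) != 0;
        hyst_[h] = (ctr & 1) != 0;
    }

  private:
    unsigned logRatio_;
    std::vector<bool> pred_;
    std::vector<bool> hyst_;
};
```

Because neighbouring entries share a hysteresis bit, an update to one entry can change how quickly its neighbours flip direction; that is the storage/accuracy trade-off behind the configurable ratio.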
2018-11-13 | cpu: Fixed PC shifting on LTAGE branch predictor | Pau Cabre
The PC needs to be shifted according to the instShiftAmt parameter Change-Id: I272619c093695b56cf7f8ff7163e3b5d23205d16 Signed-off-by: Pau Cabre <pau.cabre@metempsy.com> Reviewed-on: https://gem5-review.googlesource.com/c/14035 Reviewed-by: Sudhanshu Jha <sudhanshu.jha@arm.com> Reviewed-by: Ilias Vougioukas <ilias.vougioukas@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
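To illustrate why the shift matters, here is a hedged sketch of a table index computation. The function name and the `indexBits` parameter are hypothetical; only `instShiftAmt` is named in the commit message.

```cpp
#include <cstdint>

// Hypothetical helper: drop the low PC bits that are identical for every
// instruction, then keep `indexBits` bits as the table index.
// `indexBits` is assumed to be smaller than 64.
inline uint64_t tableIndex(uint64_t pc, unsigned instShiftAmt,
                           unsigned indexBits) {
    const uint64_t mask = (uint64_t{1} << indexBits) - 1;
    return (pc >> instShiftAmt) & mask;
}
```

With 4-byte instructions (a shift of 2), consecutive instructions map to consecutive indices. Without the shift, only every fourth table entry would ever be used.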
2017-12-04 | misc: Rename misc.(hh|cc) to logging.(hh|cc) | Gabe Black
These files aren't a collection of miscellaneous stuff, they're the definition of the Logger interface, and a few utility macros for calling into that interface (panic, warn, etc.). Change-Id: I84267ac3f45896a83c0ef027f8f19c5e9a5667d1 Reviewed-on: https://gem5-review.googlesource.com/6226 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Gabe Black <gabeblack@google.com>
2017-09-06 | cpu: Fix bi-mode branch predictor thresholds | Rico Amslinger
When different sizes were set for the choice and global saturation counter (e.g. ex5_big), the threshold calculation used the wrong size. Thus the branch predictor always predicted "not taken" for choice > global. Change-Id: I076549ff1482e2280cef24a0d16b7bb2122d4110 Reviewed-on: https://gem5-review.googlesource.com/4560 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
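The failure mode can be reproduced with a small sketch. The helper names are hypothetical, and the threshold formula is an assumption (the midpoint of an n-bit saturating counter).

```cpp
// Threshold above which a ctrBits-wide saturating counter predicts "taken".
// ctrBits is assumed to be in 1..31.
constexpr unsigned takenThreshold(unsigned ctrBits) {
    return (1u << (ctrBits - 1)) - 1;
}

// A counter predicts "taken" when it is above the threshold for ITS OWN width.
constexpr bool predictsTaken(unsigned counter, unsigned ctrBits) {
    return counter > takenThreshold(ctrBits);
}
```

For example, with a 3-bit choice counter and a 2-bit global counter, the 3-bit threshold is 3. A 2-bit counter saturates at 3, so comparing it against the 3-bit threshold can never yield "taken", which matches the symptom described above.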
2016-11-09 | style: [patch 1/22] use /r/3648/ to reorganize includes | Brandon Potter
2016-12-21 | cpu: implement an L-TAGE branch predictor | Arthur Perais
This patch implements an L-TAGE predictor, based on André Seznec's code available from CBP-2 (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21 | cpu: disallow speculative update of branch predictor tables (o3) | Arthur Perais
The Minor and o3 cpu models share the branch prediction code. Minor relies on the BPredUnit::squash() function to update the branch predictor tables on a branch misprediction. This is fine because Minor executes in-order, so the update is on the correct path. However, this causes the branch predictor to be updated on out-of-order branch mispredictions when using the o3 model, which should not be the case. This patch guards against speculative update of the branch prediction tables. On a branch misprediction, BPredUnit::squash() calls BPredUnit::update(..., squashed = true). The underlying branch predictor tests against the value of squashed. If it is true, it restores any speculatively updated internal state it might have (e.g., global/local branch history), then returns. If false, it updates its prediction tables. Previously, existing predictors did not test against the "squashed" parameter. To accommodate this change, the Minor model must now call BPredUnit::squash() then BPredUnit::update(..., squashed = false) on branch mispredictions. Before, calling BPredUnit::squash() performed the prediction tables update. The effect is a slight MPKI improvement when using the o3 model. A further patch should perform the same modifications for the indirect target predictor and BTB (less critical). Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
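The squashed/non-squashed split described above can be sketched as follows. This is a toy global-history predictor with hypothetical names and signatures; it only mirrors the control flow of update(..., squashed) and is not gem5's BPredUnit interface.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical toy predictor; historyBits is assumed to be in 1..31.
class ToyPredictor {
  public:
    explicit ToyPredictor(unsigned historyBits)
        : mask_((1u << historyBits) - 1),
          table_(std::size_t{1} << historyBits, 1) {}

    // Predict, then speculatively shift the prediction into the history.
    // `savedHistory` receives the pre-branch history for later recovery.
    bool lookup(uint32_t &savedHistory) {
        savedHistory = history_;
        const bool taken = table_[history_] >= 2;
        history_ = ((history_ << 1) | (taken ? 1u : 0u)) & mask_;
        return taken;
    }

    void update(bool taken, uint32_t savedHistory, bool squashed) {
        if (squashed) {
            // Misprediction: repair the speculative history and return
            // without touching the prediction table.
            history_ = ((savedHistory << 1) | (taken ? 1u : 0u)) & mask_;
            return;
        }
        // Non-speculative path: train the 2-bit counter that was indexed
        // by the pre-branch history.
        uint8_t &ctr = table_[savedHistory];
        if (taken) {
            if (ctr < 3) ++ctr;
        } else {
            if (ctr > 0) --ctr;
        }
    }

    uint32_t history() const { return history_; }
    uint8_t counterAt(uint32_t idx) const { return table_[idx]; }

  private:
    uint32_t mask_;
    uint32_t history_ = 0;
    std::vector<uint8_t> table_;
};
```

Under this scheme an in-order model would, on a misprediction, issue the squashed update first and then a second update with squashed = false, which corresponds to the Minor change described in the commit.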
2016-12-21 | cpu: correct comments in tournament branch predictor | Arthur Perais
The tournament predictor is presented as doing speculative update of the global history and non-speculative update of the local history used to generate the branch prediction. However, the code does speculative update of both histories. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-11-30 | cpu: Remove branch predictor function predictInOrder | Jason Lowe-Power
This function was used by the now-defunct InOrderCPU model. Since this model is no longer in gem5, this function was not called from anywhere in the code.
2016-06-06 | stats: Fixing regStats function for some SimObjects | David Guillen Fandos
Fixing an issue with regStats not calling the parent class method for most SimObjects in gem5. This causes issues if one adds new stats in the base class (since they are never initialized properly!). Change-Id: Iebc5aa66f58816ef4295dc8e48a357558d76a77c Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
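The pattern behind this fix is the usual "call the base class first" idiom for virtual overrides. The classes below are hypothetical stand-ins, not gem5 SimObjects; the stat names are made up for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical base class that registers its own stats.
struct BaseObject {
    std::vector<std::string> registered;
    virtual ~BaseObject() = default;
    virtual void regStats() { registered.push_back("base.numLookups"); }
};

// Hypothetical derived class adding one more stat.
struct DerivedPredictor : BaseObject {
    void regStats() override {
        // Without this call the base-class stats are never registered.
        BaseObject::regStats();
        registered.push_back("derived.numHits");
    }
};
```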
2016-04-05 | cpu: Implement per-thread GHRs | Mitch Hayenga
Branch predictors that use GHRs should index them on a per-thread basis. This makes that so. This is a re-spin of fb51231 after the revert (bd1c6789).
2016-04-05 | cpu: Add an indirect branch target predictor | Mitch Hayenga
This patch adds a configurable indirect branch predictor that can be indexed by a combination of GHR and path history hashes. Implements the functionality described in: "Target prediction for indirect jumps" by Chang, Hao, and Patt http://dl.acm.org/citation.cfm?id=264209 This is a re-spin of fb9d142 after the revert (bd1c6789).
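As a rough illustration of indexing by a combination of GHR and path history hashes, the sketch below XORs the shifted PC with both values. The function name, parameters, and the choice of XOR as the hash are assumptions; the commit message does not specify the actual hash.

```cpp
#include <cstdint>

// Hypothetical index function; indexBits is assumed to be in 1..31.
inline uint32_t indirectIndex(uint64_t pc, uint32_t ghr, uint32_t pathHash,
                              unsigned indexBits, unsigned instShift) {
    const uint32_t mask = (1u << indexBits) - 1;
    // Fold the branch address, global history, and path hash together.
    return static_cast<uint32_t>((pc >> instShift) ^ ghr ^ pathHash) & mask;
}
```

Mixing in history lets the same indirect branch map to different entries depending on how it was reached, so one branch can hold several targets.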
2016-04-05 | cpu: Fix BTB threading oversight | Mitch Hayenga
The extant BTB code doesn't hash on the thread id but does check the thread id for 'btb hits'. This results in one thread of a multi-threaded workload taking a BTB entry, and all other threads missing for the same branch.
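One way to avoid this collision is to fold the thread id into the index, sketched below. The function name, parameters, and the placement of the thread id in the top index bits are assumptions for illustration, not the actual gem5 change.

```cpp
#include <cstdint>

// Hypothetical BTB index; requires log2Threads <= indexBits < 32.
inline uint32_t btbIndex(uint64_t pc, unsigned tid, unsigned instShift,
                         unsigned indexBits, unsigned log2Threads) {
    const uint32_t mask = (1u << indexBits) - 1;
    // Place the thread id in the top bits of the index so that threads
    // running the same code land in different entries.
    const uint32_t tidBits = static_cast<uint32_t>(tid)
                             << (indexBits - log2Threads);
    return (static_cast<uint32_t>(pc >> instShift) ^ tidBits) & mask;
}
```

With two threads and a 4-bit index, the same PC maps to entry 0 for thread 0 and entry 8 for thread 1, so a hit check on the thread id no longer forces one thread to miss.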
2016-04-06 | Revert power patch sets with unexpected interactions | Andreas Sandberg
The following patches had unexpected interactions with the current upstream code and have been reverted for now:
e07fd01651f3: power: Add support for power models
831c7f2f9e39: power: Low-power idle power state for idle CPUs
4f749e00b667: power: Add power states to ClockedObject
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
--HG-- extra : amend_source : 0b6fb073c6bbc24be533ec431eb51fbf1b269508
2016-04-05 | cpu: Implement per-thread GHRs | Curtis Dunham
Branch predictors that use GHRs should index them on a per-thread basis. This makes that so.
2016-04-05 | cpu: Add an indirect branch target predictor | Mitch Hayenga
This patch adds a configurable indirect branch predictor that can be indexed by a combination of GHR and path history hashes. Implements the functionality described in: "Target prediction for indirect jumps" by Chang, Hao, and Patt http://dl.acm.org/citation.cfm?id=264209
2016-04-05 | cpu: Fix BTB threading oversight | Mitch Hayenga
The extant BTB code doesn't hash on the thread id but does check the thread id for 'btb hits'. This results in one thread of a multi-threaded workload taking a BTB entry, and all other threads missing for the same branch.
2016-02-06 | style: fix missing spaces in control statements | Steve Reinhardt
Result of running 'hg m5style --skip-all --fix-control -a'.
2016-01-11 | scons: Enable -Wextra by default | Andreas Hansson
Make best use of the compiler, and enable -Wextra as well as -Wall. There are a few issues that had to be resolved, but they are all trivial.
2015-10-12 | misc: Add explicit overrides and fix other clang >= 3.5 issues | Andreas Hansson
This patch adds explicit overrides as this is now required when using "-Wall" with clang >= 3.5, the latter now part of the most recent Xcode. The patch consequently removes "virtual" for those methods where "override" is added. The latter should be enough of an indication. As part of this patch, a few minor issues that clang >= 3.5 complains about are also resolved (unused methods and variables).
2015-10-12 | misc: Remove redundant compiler-specific defines | Andreas Hansson
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap (and similar) abstractions, as these are no longer needed with gcc 4.7 and clang 3.1 as minimum compiler versions.
2015-09-15 | cpu: pred: Local Predictor Reset in Tournament Predictor | Andrew Lukefahr
When a branch gets squashed, its speculative branch predictor state should get rolled back in squash(). However, only the globalHistory state was being rolled back. This patch adds (at least some) support for rolling back the local predictor state also. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-13 | cpu: re-organizes the branch predictor structure. | Dibakar Gope
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-10-16 | cpu: Add branch predictor PMU probe points | Andreas Sandberg
This changeset adds probe points that can be used to implement PMU counters for branch predictor stats. The following probes are supported:
* BPredUnit::ppBranches / Branches
* BPredUnit::ppMisses / Misses
2014-09-27 | arch: Use const StaticInstPtr references where possible | Andreas Hansson
This patch optimises the passing of StaticInstPtr by avoiding copying the reference-counting pointer. This avoids first incrementing and then decrementing the reference-counting pointer.
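The saving can be demonstrated with std::shared_ptr standing in for the reference-counting pointer. This is an assumption for illustration: StaticInstPtr is gem5's own pointer type, and the struct and function names below are hypothetical.

```cpp
#include <memory>

struct Inst {
    int opcode;
};

// Stand-in for a reference-counting instruction pointer.
using InstPtr = std::shared_ptr<Inst>;

// Pass by value: the copy increments the count on entry and decrements it
// again when the parameter is destroyed.
inline long countByValue(InstPtr inst) { return inst.use_count(); }

// Pass by const reference: no copy, so the count is left untouched.
inline long countByConstRef(const InstPtr &inst) { return inst.use_count(); }
```

For a pointer with a single owner, the by-value call observes a count of 2 inside the function while the const-reference call observes 1. The extra increment/decrement pair is the overhead the patch removes.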
2014-09-03 | cpu: fix bimodal predictor to use correct global history reg | Dam Sunwoo
A small bug in the bimodal predictor caused significant degradation in performance on some benchmarks. This was caused by using the wrong globalHistoryReg during the update phase. This patch fixes the bug and brings performance back to a normal level.
2014-09-03 | cpu: Fix incorrect speculative branch predictor behavior | Mitch Hayenga
When a branch mispredicted, gem5 would squash all history after and including the mispredicted branch. However, the mispredicted branch is still speculative and its history is required to roll back state if another, older, branch mispredicts. This leads to things like RAS corruption.
2014-08-13 | scons: Build the branch predictor for all CPUs | Andreas Sandberg
The branch predictor is normally only built when a CPU that uses a branch predictor is built. The list of CPUs is currently incomplete as the simple CPUs support branch predictors (for warming, branch stats, etc). In practice, all CPU models now use branch predictors, so this changeset removes the CPU model check and replaces it with a check for the NULL ISA.
2014-08-13 | cpu: Modernise the branch predictor (STL and C++11) | Andreas Hansson
This patch does some minor housekeeping of the branch predictor by adopting STL containers and switching some iterator-based loops to range-based for loops. The predictor history is also changed from a list to a deque, as we never do insertion/deletion other than at the front and back.
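A sketch of a front/back-only history kept in a std::deque, together with a range-based for loop. The entry layout, the function names, and the convention that the newest entry sits at the front are assumptions for illustration.

```cpp
#include <cstdint>
#include <deque>

// Hypothetical history record for one predicted branch.
struct HistoryEntry {
    uint64_t seqNum;
    bool predTaken;
};

// Newest entries are pushed at the front, so the oldest sit at the back.
// Committing retires the oldest entries from the back.
inline void commitUpTo(std::deque<HistoryEntry> &hist, uint64_t doneSeqNum) {
    while (!hist.empty() && hist.back().seqNum <= doneSeqNum) {
        hist.pop_back();
    }
}

// Squashing discards the youngest entries from the front.
inline void squashAfter(std::deque<HistoryEntry> &hist, uint64_t squashSeqNum) {
    while (!hist.empty() && hist.front().seqNum > squashSeqNum) {
        hist.pop_front();
    }
}

// Range-based for loop over the remaining history.
inline int countTaken(const std::deque<HistoryEntry> &hist) {
    int taken = 0;
    for (const auto &entry : hist) {
        if (entry.predTaken) ++taken;
    }
    return taken;
}
```

Since both operations only touch the ends, a deque gives constant-time push/pop there without the per-node allocation of a list.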
2014-07-23 | cpu: `Minor' in-order CPU model | Andrew Bardsley
This patch contains a new CPU model named `Minor'. Minor models a four stage in-order execution pipeline (fetch lines, decompose into macroops, decompose macroops into microops, execute). The model was developed to support the ARM ISA but should be fixable to support all the remaining gem5 ISAs. It currently also works for Alpha, and regressions are included for ARM and Alpha (including Linux boot). Documentation for the model can be found in src/doc/inside-minor.doxygen and its internal operations can be visualised using the Minorview tool utils/minorview.py. Minor was designed to be fairly simple and not to engage in a lot of instruction annotation. As such, it currently has very few gathered stats and may lack other gem5 features. Minor is faster than the o3 model. Sample results:
     Benchmark | Stat host_seconds (s)
---------------+--------v--------v--------
 (on ARM, opt) | simple | o3     | minor
               | timing | timing | timing
---------------+--------+--------+--------
 10.linux-boot |    169 |   1883 |   1075
 10.mcf        |    117 |    967 |    491
 20.parser     |    668 |   6315 |   3146
 30.eon        |    542 |   3413 |   2414
 40.perlbmk    |   2339 |  20905 |  11532
 50.vortex     |    122 |   1094 |    588
 60.bzip2      |   2045 |  18061 |   9662
 70.twolf      |    207 |   2736 |   1036
2014-06-30 | cpu: implement a bi-mode branch predictor | Anthony Gutierrez
2013-10-17 | cpu: add consistent guarding to *_impl.hh files. | Matt Horsnell
2013-05-14 | cpu: remove local/globalHistoryBits params from branch pred | Anthony Gutierrez
Having separate params for the local/globalHistoryBits and the local/globalPredictorSize can lead to inconsistencies if they are not carefully set. This patch derives the number of bits necessary to index into the local/global predictors based on their size. The value of the localHistoryTableSize for the ARM O3 CPU has been increased to 1024 from 64, which is more accurate for an A15 based on some correlation against A15 hardware.
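Deriving the index width from the table size removes the chance of the two parameters disagreeing. A minimal sketch, using a hypothetical helper name (a ceiling log2):

```cpp
#include <cstdint>

// Number of bits needed to index a table of `size` entries (size >= 1).
// Written as a single-return constexpr so it also compiles as C++11.
constexpr unsigned indexBitsFor(uint64_t size) {
    return size <= 1 ? 0 : 1 + indexBitsFor((size + 1) / 2);
}

// Mask that selects exactly those index bits.
constexpr uint64_t indexMaskFor(uint64_t size) {
    return (uint64_t{1} << indexBitsFor(size)) - 1;
}
```

For the sizes mentioned in the commit, a 64-entry table needs 6 index bits and a 1024-entry table needs 10.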
2013-01-24 | branch predictor: move out of o3 and inorder cpus | Nilay Vaish, Timothy Jones <timothy.jones@cl.cam.ac.uk>
This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository.
--HG--
rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc
rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh
rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh
rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh
2012-12-06 | TournamentBP: Fix some bugs with table sizes and counters | Erik Tomusk
globalHistoryBits, globalPredictorSize, and choicePredictorSize are decoupled. globalHistoryBits controls how much history is kept, global and choice predictor sizes control how much of that history is used when accessing predictor tables. This way, global and choice predictors can actually be different sizes, and it is no longer possible to walk off the predictor arrays and cause a seg fault. There are now individual thresholds for choice, global, and local saturating counters, so that taken/not taken decisions are correct even when the predictors' counters' sizes are different. The interface for localPredictorSize has been removed from TournamentBP because the value can be calculated from localHistoryBits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2012-11-02 | o3: Fix a couple of issues with the local predictor. | Mrinmoy Ghosh
Fix some issues with the local predictor and the way it's indexed.
2012-02-13 | BPred: Fix RAS to handle predicated call/return instructions. | Mrinmoy Ghosh
Change RAS to fix issues with predicated call/return instructions. Handled all cases in the life of a predicated call and return instruction.
2012-02-13 | BP: Fix several Branch Predictor issues. | Mrinmoy Ghosh
1. Updates the Branch Predictor correctly to the state just after a mispredicted branch, if a squash occurs.
2. If a BTB does not find an entry, the branch is predicted not taken. The global history is modified to correctly reflect this prediction.
3. Local history is now updated at the fetch stage instead of execute stage.
4. In the Update stage of the branch predictor the local predictors are now correctly updated according to the state of local history during fetch stage.
This patch also improves performance by as much as 17% on some benchmarks.
2011-08-07 | O3: Fix uninitialized variable in the tournament branch predictor. | Ali Saidi
2011-07-10 | Branch predictor: Fixes the tournament branch predictor. | Mrinmoy Ghosh
Branch predictor could not predict a branch in a nested loop because:
1. The global history was not updated after a mispredict squash.
2. The global history was updated in the fetch stage. The choice predictors that were updated used the changed global history. This is incorrect, as it incorporates the state of global history after the branch is encountered. Fixed update to choice predictor using the global history state before the branch happened.
3. The global predictor table was also updated using the global history state before the branch happened as above.
Additionally, parameters to initialize ctr and history size were reversed.
2011-06-02 | scons: rename TraceFlags to DebugFlags | Nathan Binkert