summaryrefslogtreecommitdiff
path: root/configs/common/O3_ARM_v7a.py
AgeCommit message (Collapse)Author
2015-05-05arch, cpu: Do not forward snoops to table walkerAndreas Hansson
This patch simplifies the overall CPU by changing the TLB caches such that they do not forward snoops to the table walker port(s). Note that only ARM and X86 are affected. There is no reason for the ports to snoop as they do not actually take any action, and from a performance point of view we are better of not snooping more than we have to. Should it at a later point be required to snoop for a particular TLB design it is easy enough to add it back.
2015-04-29cpu: o3: replace issueLatency with bool pipelinedNilay Vaish
Currently, each op class has a parameter issueLat that denotes the cycles after which another op of the same class can be issued. As of now, this latency can either be one cycle (fully pipelined) or same as execution latency of the op (not at all pipelined). The fact that issueLat is a parameter of type Cycles makes one believe that it can be set to any value. To avoid the confusion, the parameter is being renamed as 'pipelined' with type boolean. If set to true, the op would execute in a fully pipelined fashion. Otherwise, it would execute in an unpipelined fashion.
2015-04-13cpu: re-organizes the branch predictor structure.Dibakar Gope
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-03-27arm, configs: Do not forward snoops from I cacheAndreas Hansson
This fix simply tells the I cache to not forward snoops to the fetch unit (since there is really no reason to do so).
2014-09-03cpu: Change writeback modeling for outstanding instructionsMitch Hayenga
As highlighed on the mailing list gem5's writeback modeling can impact performance. This patch removes the limitation on maximum outstanding issued instructions, however the number that can writeback in a single cycle is still respected in instToCommit().
2014-07-28arm: make the PseudoLRU tags the default for the O3_ARM_v7aL2Anthony Gutierrez
the Cortex-A15 has a random replacement policy for its L2 cache. see the Cortex-A15 Technical Reference Manual 1.7 About the L2 memory system. this patch makes the PseudoLRU tags the default for the ARM O3 CPU's L2 cache.
2014-06-30arm: make the bi-mode predictor the default for O3_ARM_v7a_BPAnthony Gutierrez
the branch predictor used in the Cortex-A15 is a bi-mode style predictor, see: http://arm.com/files/pdf/at-exploring_the_design_of_the_cortex-a15.pdf and http://nvidia.com/docs/IO/116757/NVIDIA_Quad_a15_whitepaper_FINALv2.pdf this patch makes the bi-mode predictor the default for the ARM O3 CPU.
2014-01-24arm: Add support for ARMv8 (AArch64 & AArch32)ARM gem5 Developers
Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64 kernel you are restricted to AArch64 user-mode binaries. This will be addressed in a later patch. Note: Virtualization is only supported in AArch32 mode. This will also be fixed in a later patch. Contributors: Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation) Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation) Mbou Eyole (AArch64 NEON, validation) Ali Saidi (AArch64 Linux support, code integration, validation) Edmund Grimley-Evans (AArch64 FP) William Wang (AArch64 Linux support) Rene De Jong (AArch64 Linux support, performance opt.) Matt Horsnell (AArch64 MP, validation) Matt Evans (device models, code integration, validation) Chris Adeniyi-Jones (AArch64 syscall-emulation) Prakash Ramrakhyani (validation) Dam Sunwoo (validation) Chander Sudanthi (validation) Stephan Diestelhorst (validation) Andreas Hansson (code integration, performance opt.) Eric Van Hensbergen (performance opt.) Gabe Black
2013-11-15cpu: allow the fetch buffer to be smaller than a cache lineAnthony Gutierrez
the current implementation of the fetch buffer in the o3 cpu is only allowed to be the size of a cache line. some architectures, e.g., ARM, have fetch buffers smaller than a cache line, see slide 22 at: http://www.arm.com/files/pdf/at-exploring_the_design_of_the_cortex-a15.pdf this patch allows the fetch buffer to be set to values smaller than a cache line.
2013-07-18config: Update script to set cache line size on systemAndreas Hansson
This patch changes the config scripts such that they do not set the cache line size per cache instance, but rather for the system as a whole.
2013-05-14cpu: remove local/globalHistoryBits params from branch predAnthony Gutierrez
having separate params for the local/globalHistoryBits and the local/globalPredictorSize can lead to inconsistencies if they are not carefully set. this patch dervies the number of bits necessary to index into the local/global predictors based on their size. the value of the localHistoryTableSize for the ARM O3 CPU has been increased to 1024 from 64, which is more accurate for an A15 based on some correlation against A15 hardware.
2013-01-24branch predictor: move out of o3 and inorder cpusNilay Vaish ext:(%2C%20Timothy%20Jones%20%3Ctimothy.jones%40cl.cam.ac.uk%3E)
This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository. --HG-- rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh
2013-01-07cpu: Rename defer_registration->switched_outAndreas Sandberg
The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive.
2012-12-06TournamentBP: Fix some bugs with table sizes and countersErik Tomusk
globalHistoryBits, globalPredictorSize, and choicePredictorSize are decoupled. globalHistoryBits controls how much history is kept, global and choice predictor sizes control how much of that history is used when accessing predictor tables. This way, global and choice predictors can actually be different sizes, and it is no longer possible to walk off the predictor arrays and cause a seg fault. There are now individual thresholds for choice, global, and local saturating counters, so that taken/not taken decisions are correct even when the predictors' counters' sizes are different. The interface for localPredictorSize has been removed from TournamentBP because the value can be calculated from localHistoryBits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2012-10-15Mem: Use cycles to express cache-related latenciesAndreas Hansson
This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counter in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.
2012-09-25Cache: add a response latency to the cachesMrinmoy Ghosh
In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set of the cache for the backward path.
2012-02-12prefetcher: Make prefetcher a sim object instead of it being a parameter on ↵Mrinmoy Ghosh
cache
2012-01-26configs: actually add ARMv7a-like cpu/cache fileRonald Dreslinski