summaryrefslogtreecommitdiff
path: root/src/cpu/o3
AgeCommit message (Collapse)Author
2019-08-07cpu-o3: fix atomic instructions non-speculativeJordi Vaquero
Fix problem with O3 and AMO instructions. At initial stages amo instruction is considered a type of non-speculative store. After the instruction has been commited and during the squash step, acquire_release version of the AMO operation is considered speculative, that differents results in an assert fault. This fix ensures that AMO instructions are always considered non-speculative, during early stages and during squas/removal of the instruction. Change-Id: Ia0c5fbb9dc44a9991337b57eb759b1ed08e4149e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19815 Maintainer: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-08-07cpu-o3: added _amo_op parameter in o3 LSQJordi Vaquero
Fix bug with AMO (or RMW) instructions where the amo_op variable is not being propagated to the LSQ request. Change-Id: I60c59641d9b497051376f638e27f3c4cc361f615 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19814 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com> Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
2019-07-28cpu-o3: Fix too strict assert condition in writeback()Gabor Dozsa
The assert() in the LSQ writeback() only allowed ReExec faults. However, a SplitRequest which completed the translation in PartialFault state (i.e. any but the very first cacheline translation failed) may end up here. The assert() condition is extended accordingly. The patch also removes the superfluous/unused Complete/Squashed states from the LSQ request. (The completion of the request is recorded in the flags still.) Change-Id: Ie575f4d3b4d5295585828ad8c7d3f4c7c1fe15d0 Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19174 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-07-27cpu: Add first-/non-faulting load support to Minor and O3Gabor Dozsa
Some architectures allow masking faults of memory load instructions in some specific circumstances (e.g. first-faulting and non-faulting loads in Arm SVE). This patch adds support for such loads in the Minor and O3 CPU models. Change-Id: I264a81a078f049127779aa834e89f0e693ba0bea Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19178 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-07-16cpu: isDrained renamed to isCpuDrainedGiacomo Travaglini
cpu models inheriting from BaseCPU implement a draining checker called isDrained. This hides the base Drainable::isDrained method and might create confusion in the reader. This patch is renaming it to isCpuDrained in order to avoid any ambiguity Change-Id: Ie5221da6a4673432c2403996e42d451cae960bbf Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19468 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-07-13cpu-o3: Set packet data type for IPR readPouya Fotouhi
This change assigns packet data type to static for IPR read. Caused by change (e13d6dc9c0d7a4ae0215f1ee6793eb32570c5169), and has been reported a few times in the mailing list. Change-Id: I0f02c20a16824e220df876e9e552bbc1c9636f95 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19449 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-07-08cpu-o3: Reset fault status for mem access in pushRequestGabor Dozsa
Reset the fault status always before translation is initiated in pushRequest() in the LSQ. This avoids the problem when a strictly ordered load needs to be re-executed multiple times. If the translation is delayed at one of those attempts then the internal panicFault (from the previous execution attempt) can get fired at commit. Change-Id: I0c22b2f7afd6e2cb00bc359a4a01042efd2d01d2 Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/19388 Reviewed-by: Ciro Santilli <ciro.santilli@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-31cpu-o3: Increase LSQ buffer sizes to match max vector lengthGabor Dozsa
Change-Id: I5890c7cfa147125ce3389001f85d56d4b5a9911d Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13525 Tested-by: kokoro <noreply+kokoro@google.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Michael LeBeane <Michael.Lebeane@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2019-05-30cpu-o3: Add support for pinned writesGiacomo Gabrielli
This patch adds support for pinning registers for a certain number of consecutive writes. This is only relevant for timing CPU models (functional-only models are unaffected), and it is primarily needed to provide a realistic execution model for micro-coded operations whose microops can write to non-overlapping portions of a destination register, e.g. vector gather loads. In those cases, this mechanism can disable renaming for a sequence of consecutive writes, thus making the resulting execution more efficient: allocating a new physical register for each microop would introduce a read-modify-write chain of dependencies, while with these modifications the microops can write back in parallel. Please note that this new feature is only leveraged by O3CPU for the time being. Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> Change-Id: I07eb5fdbd1fa0b748c9bdc1174d9f330fda34f81 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13520 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-30arch, base, cpu, gpu, sim: Merge getMemProxy and getVirtProxy.Gabe Black
These two functions were performing the same function but had two different names for historical reasons. This change merges them together, keeping the getVirtProxy name to be consistent with the getPhysProxy method used to get a non-translating proxy port. Change-Id: Idd83c6b899f9343795075b030ccbc723a79e52a4 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18581 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2019-05-30cpu, sim: Return PortProxy &s from all the proxy accessors.Gabe Black
This is a step towards merging the accessors for SE and FS modes. Change-Id: I76818ab88b97097ac363e243be9cc1911b283090 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18579 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2019-05-29cpu: Added correct return type for ROB::countInstsAndrea Mondelli
- return size_t (unsigned) according to the .size() return type - fixed typo in doc (source of warning with some compilers) Change-Id: I48ee2e317cf41011a6fcb5ca45aef67e75329bfa Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18948 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-18arch, base, cpu, dev, mem, sim: Remove #if 0-ed out code.Gabe Black
This code will be preserved through version control, but otherwise creates clutter and will rot in place since it's never compiled. Change-Id: Id265f6deac445116843956ea5cf1210d8127274e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18608 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-05-11cpu,mem: Add support for partial loads/stores and wide mem. accessesGiacomo Gabrielli
This changeset adds support for partial (or masked) loads/stores, i.e. loads/stores that can disable accesses to individual bytes within the target address range. In addition, this changeset extends the code to crack memory accesses across most CPU models (TimingSimpleCPU still TBD), so that arbitrarily wide memory accesses are supported. These changes are required for supporting ISAs with wide vectors. Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> - Tiago Muck <tiago.muck@arm.com> Change-Id: Ibad33541c258ad72925c0b1d5abc3e5e8bf92d92 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13518 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-05-11cpu: Add a memory access predicateGiacomo Gabrielli
This changeset introduces a new predicate to guard memory accesses. The most immediate use for this is to allow proper handling of predicated-false vector contiguous loads and predicated-false micro-ops of vector gather loads (added in separate changesets). Change-Id: Ice6894fe150faec2f2f7ab796a00c99ac843810a Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17991 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Bradley Wang <radwang@ucdavis.edu> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Maintainer: Nikos Nikoleris <nikos.nikoleris@arm.com>
2019-04-30cpu: alpha: Delete all occurrances of the simPalCheck function.Gabe Black
This is now handled within the ISA description. Change-Id: Ie409bb46d102e59d4eb41408d9196fe235626d32 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18434 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-04-30cpu: Remove hwrei from the generic interfaces.Gabe Black
This mechanism is specific to Alpha and doesn't belong sprinkled around the CPU's generic mechanisms. Change-Id: I87904d1a08df2b03eb770205e2c4b94db25201a1 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18432 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-04-30arch: cpu: Track kernel stats using the base ISA agnostic type.Gabe Black
Then cast to the ISA specific type when necessary. This removes (mostly) an ISA specific aspect to some of the interfaces. The ISA specific version of the kernel stats still needs to be constructed and stored in a few places which means that kernel_stats.hh still needs to be a switching arch header, for instance. In the future, I'd like to make the kernel its own object like the Process objects in SE mode, and then it would be able to instantiate and maintain its own stats. Change-Id: I8309d49019124f6bea1482aaea5b5b34e8c97433 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18429 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2019-04-29cpu: Get rid of the (read|set)RegOtherThread methods.Gabe Black
These are implemented by MIPS internally now. Change-Id: If7465e1666e51e1314968efb56a5a814e62ee2d1 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18436 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-04-28mem: Minimize the use of MemObject.Gabe Black
MemObject doesn't provide anything beyond its base ClockedObject any more, so this change removes it from most inheritance hierarchies. Occasionally MemObject is replaced with SimObject when I was fairly confident that the extra functionality of ClockedObject wasn't needed. Change-Id: Ic014ab61e56402e62548e8c831eb16e26523fdce Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18289 Tested-by: kokoro <noreply+kokoro@google.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Gabe Black <gabeblack@google.com>
2019-04-22cpu: Eliminate the ProxyThreadContext class.Gabe Black
Replace it with direct inheritance from the ThreadContext class in the SimpleThread class which was the only place it was used. Also take the opportunity to use some specialized types instead of ints, etc., add some consts, and fix some style issues. Change-Id: I5d2cfa87b20dc43615e33e6755c9d016564e9c0e Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/18048 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Gabe Black <gabeblack@google.com> Tested-by: kokoro <noreply+kokoro@google.com>
2019-04-10cpu: O3 switchFreeList checking VecElems instead of FloatRegsGiacomo Travaglini
Vector elements should be checked instead of floats since those are the ones mapped to the vector registers. Change-Id: I36088ab90e63720d846fcf5b43360da105b6c736 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17850 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-04-03misc: Removed inconsistency in O3* debug msgsAndrea Mondelli
Added consistency in the DEBUG message form, to allow a better parsing. Fixed sn/tid type parameter. Removed some annoying newlines Change-Id: I4761c49fc12b874a7d8b46779475b606865cad4b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17248 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-04-03arch-mips: added missing override specifier (o3)Andrea Mondelli
Change-Id: Ic538825a2964fd62def672b933a83067a15bd12a Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17648 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-03-28cpu: Added a probe to notify the address of retired instructionsJavier Bueno
A probe is added to notify the address of each retired instruction. Change-Id: Iefc1b09d74b3aa0aa5773b17ba637bf51f5a59c9 Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/17632 Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2019-03-14cpu: Refactor of Physical Register implementationAndrea Mondelli
The implementation of the PhyRegId class is shared between multiple cpu models. The o3/misc.hh should only be included in o3 models. This patch removes the dependencies between different model implementations, allowing to add new O3-like CPU model. Change-Id: Ibb812517043befe75c48fab3ce9605a0d272870b Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/16908 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Bradley Wang <radwang@ucdavis.edu> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2019-03-14arch-arm,cpu: Add initial support for Arm SVEGiacomo Gabrielli
This changeset adds initial support for the Arm Scalable Vector Extension (SVE) by implementing: - support for most data-processing instructions (no loads/stores yet); - basic system-level support. Additional authors: - Javier Setoain <javier.setoain@arm.com> - Gabor Dozsa <gabor.dozsa@arm.com> - Giacomo Travaglini <giacomo.travaglini@arm.com> Thanks to Pau Cabre for his contribution of bugfixes. Change-Id: I1808b5ff55b401777eeb9b99c9a1129e0d527709 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/13515 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2019-02-27misc: Segmentation Fault during O3PipeView executionAndrea Mondelli
During the O3PipeView execution, a potential invalid iterator is used to Update the instruction storeTick field. If the store_idx iterator is the first() of the StoreQueue, the corresponding instruction is removed from the queue, leaving the iterator invalid and not usable in the TRACING_ON block. This patch uses the store_inst variable to access (and update) the instruction tick, instead of the (potential) invalid one. Change-Id: I671052ef282b9048e5239da8629b89e8afa86bf0 Reviewed-on: https://gem5-review.googlesource.com/c/16322 Maintainer: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2019-02-22cpu-o3: Add cache read ports limit to LSQGabor Dozsa
This change introduces cache read ports to limit the number of per-cycle loads. Previously only the number of per-cycle stores could be limited. Change-Id: I39bbd984056c5a696725ee2db462a55b2079e2d4 Signed-off-by: Gabor Dozsa <gabor.dozsa@arm.com> Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13517 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2019-02-19cpu: Add ISA* getter in Thread interfaceGiacomo Gabrielli
This patch is adding a ISA* getter to the TC interface Change-Id: Ib8ddc5d8fdd44e782f50a2ad15878a6bcf931e58 Reviewed-on: https://gem5-review.googlesource.com/c/16462 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
2019-02-15cpu: Fix fast build broken due to unused variableGiacomo Travaglini
This fixes fast build for commit 25dc765889d948693995cfa622f001aa94b5364b (fast build is striping out assertions) Change-Id: I9536ad58a3d85990b16a1f8c2515f6bf5d3acf71 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/16463 Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-02-12python: Don't assume SimObjects live in the global namespaceAndreas Sandberg
The importer in Python 3 doesn't like the way we import SimObjects from the global namespace. Convert the existing SimObject declarations to import from m5.objects. As a side-effect, this makes these files consistent with configuration files. Change-Id: I11153502b430822130722839e1fa767b82a027aa Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15981 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
2019-02-08cpu: support atomic memory request type with AtomicOpFunctorTuan Ta
This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU, MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory system. Atomic memory instruction is treated as a special store instruction in all CPU models. In simple CPUs, an AMO request with an associated AtomicOpFunctor is simply sent to L1 dcache. In MinorCPU, an AMO request bypasses store buffer and waits for any conflicting store request(s) currently in the store buffer to retire before the AMO request is sent to the cache. AMO requests are not buffered in the store buffer, so their effects appear immediately in the cache. In DerivO3CPU, an AMO request is inserted in the store buffer so that it is delivered to the cache only after all previous stores are issued to the cache. Data forwarding between between an outstanding AMO in the store buffer and a subsequent load is not allowed since the AMO request does not hold valid data until it's executed in the cache. This implementation assumes that a target ISA implementation must insert enough memory fences as micro-ops around an atomic instruction to enforce a correct order of memory instructions with respect to its memory consistency model. Without extra memory fences, this implementation can allow AMOs and other memory instructions that do not conflict (i.e., not target the same address) to reorder. This implementation also assumes that atomic instructions execute within a cache line boundary since the cache for now is not able to execute an operation on two different cache lines in one single step. Therefore, ISAs like x86 that require multi-cache-line atomic instructions need to either use a pair of locking load and unlocking store or change the cache implementation to guarantee the atomicity of an atomic instruction. Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a Reviewed-on: https://gem5-review.googlesource.com/c/8188 Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-02-08sim,cpu: make exit_group halt all threads in a groupTuan Ta
When a thread calls exit_group, in addition to halting the thread itself, it needs to halt all other threads in its group (i.e., threads sharing the same thread group ID). This patch enables threads to do that. Change-Id: Ib2e158fb27cf98843f177a64a2d643b1bbc94d03 Reviewed-on: https://gem5-review.googlesource.com/c/9623 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-02-08cpu: fixed how O3 CPU executes an exit system callTuan Ta
When a thread executed an exit syscall in SE mode, the thread context was removed immediately in the same cycle, which left inflight squash operations and trap event incomplete. The problem happened when a new thread was assigned to the CPU later. The new thread started with some incomplete transactions of the previous thread (e.g., squashing). This problem could cause incorrect execution flow for the new thread (i.e., pc was not reset properly at the exit point), deadlock (i.e., some stage-to-stage signals were not reset) and incorrect rename map between logical and physical registers. This patch adds a new state called 'Halting' to the thread context and defers removing thread context from a CPU until a trap event initiated by an exit syscall execution is processed. This patch also makes sure that the removal of a thread context happens after all inflight transactions of the to-be-removed thread in the pipeline complete. Change-Id: If7ef1462fb8864e22b45371ee7ae67e2a5ad38b8 Reviewed-on: https://gem5-review.googlesource.com/c/8184 Reviewed-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-02-05misc: added missing override specifierAndrea Mondelli
Added missing specifier for various virtual functions. Change-Id: I4783e92d78789a9ae182fad79aadceafb00b2458 Reviewed-on: https://gem5-review.googlesource.com/c/16103 Reviewed-by: Hoa Nguyen <hoanguyen@ucdavis.edu> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-02-01cpu, arch: Replace the CCReg type with RegVal.Gabe Black
Most architectures weren't using the CCReg type, and in x86 and arm it was already a uint64_t. Change-Id: I0b3d5e690e6b31db6f2627f449c89bde0f6750a6 Reviewed-on: https://gem5-review.googlesource.com/c/14515 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com>
2019-01-31arch: cpu: Rename *FloatRegBits* to *FloatReg*.Gabe Black
Now that there's no plain FloatReg, there's no reason to distinguish FloatRegBits with a special suffix since it's the only way to read or write FP registers. Change-Id: I3a60168c1d4302aed55223ea8e37b421f21efded Reviewed-on: https://gem5-review.googlesource.com/c/14460 Reviewed-by: Brandon Potter <Brandon.Potter@amd.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Maintainer: Gabe Black <gabeblack@google.com>
2019-01-30arch,cpu: Add vector predicate registersGiacomo Gabrielli
Latest-gen. vector/SIMD extensions, including the Arm Scalable Vector Extension (SVE), introduce the notion of a predicate register file. This changeset adds this feature across architectures and CPU models. Change-Id: Iebcadbad89c0a582ff8b1b70de353305db603946 Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13715 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2019-01-25cpu, arch, arch-arm: Wire unused VecElem code in the O3 modelGiacomo Travaglini
VecElem code had been introduced in order to simulate change of renaming for vector registers. Most of the work is happening on the rename_map switchRenameMode. Change of renaming can happen after a squash in the pipeline. This patch is also changing the interface to the ISA part so that a PCState is used instead of ISA in order to check if rename mode has changed. Change-Id: I8af795d771b958e0a0d459abfeceff5f16b4b5d4 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15601
2019-01-25cpu: O3 rename using the flatIndex instead of indexGiacomo Travaglini
This patch is replacing the RegId::index with RegId::flatIndex so that it provides a valid register number when used by a VecElem register. Change-Id: I5b000abb9457cd325c2a3021e772a75ea33d8a4c Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15600 Maintainer: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2019-01-25cpu: Fix VecElemClass bugs in cpu modelsGiacomo Travaglini
This patch is: * Adding a missing VecElemClass entry * Fixing assertion in rename map which was checking the number of free vector registers rather than free vector element registers * Fixing assertion in read/setVecElemOperand APIs. * Using the right register index in SimpleThread * Using VecElem instead of VecReg on O3 readArchVecElem Change-Id: I265320dcbe35eb47075991301dfc99333c5190c4 Signed-off-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15598 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
2019-01-24cpu-o3: O3 LSQ GeneralisationRekai Gonzalez-Alberquilla
This patch does a large modification of the LSQ in the O3 model. The main goal of the patch is to remove the 'an operation can be served with one or two memory requests' assumption that is present in the LSQ and the instruction with the req, reqLow, reqHigh triplet, and generalising it to operations that can be addressed with one request, and operations that require many requests, embodied in the SingleDataRequest and the SplitDataRequest. This modification has been done mimicking the minor model to an extent, shifting the responsibilities of dealing with VtoP translation and tracking the status and resources from the DynInst to the LSQ via the LSQRequest. The LSQRequest models the information concerning the operation, handles the creation of fragments for translation and request as well as assembling/splitting the data accordingly. With this modifications, the implementation of vector ISAs, particularly on the memory side, become more rich, as the new model permits a dissociation of the ISA characteristics as vector length, from the microarchitectural characteristics that govern how contiguous loads are executing, allowing exploration of different LSQ to DL1 bus widths to understand the tradeoffs in complexity and performance. Part of the complexities introduced stem from the fact that gem5 keeps a large amount of metadata regarding, in particular, memory operations, thus, when an instruction is squashed while some operation as TLB lookup or cache access is ongoing, when the relevant structure communicates to the LSQ that the operation is over, it tries to access some pieces of data that should have died when the instruction is squashed, leading to asserts, panics, or memory corruption. To ensure the correct behaviour, the LSQRequest rely on assesing who is their owner, and self-destroying if they detect their owner is done with the request, and there will be no subsequent action. For example, in the case of an instruction squashed whal the TLB is doing a walk to serve the translation, when the translation is served by the TLB, the LSQRequest detects that the instruction was squashed, and as the translation is done, no one else expect to access its information, and therefore, it self-destructs. Having destroyed the LSQRequest earlier, would lead to wrong behaviour as the TLB walk may access some fields of it. Additional authors: - Gabor Dozsa <gabor.dozsa@arm.com> Change-Id: I9578a1a3f6b899c390cdd886856a24db68ff7d0c Signed-off-by: Giacomo Gabrielli <giacomo.gabrielli@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/13516 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
2019-01-22arch: cpu: Stop passing around misc registers by reference.Gabe Black
These values are all basic integers (specifically uint64_t now), and so passing them by const & is actually less efficient since there's a extra level of indirection and an extra value, and the same sized value (a 64 bit pointer vs. a 64 bit int) is being passed around. Change-Id: Ie9956b8dc4c225068ab1afaba233ec2b42b76da3 Reviewed-on: https://gem5-review.googlesource.com/c/13626 Maintainer: Gabe Black <gabeblack@google.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2019-01-17cpu-o3: Make the smtCommitPolicy a Param.ScopedEnumNikos Nikoleris
The smtCommitPolicy is a parameter in the o3 cpu that can have 3 different values. Previously this setting was done through a string and a parser function would turn it into a c++ enum value. This changeset turns the string into a python Param.ScopedEnum. Change-Id: I3625f2c08a1ae0c3b0dce7a641c6ae1ce3fd79a5 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15400 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-01-17cpu-o3: Make the smtROBPolicy a Param.ScopedEnumNikos Nikoleris
The smtROBPolicy is a parameter in the o3 cpu that can have 3 different values. Previously this setting was done through a string and a parser function would turn it into a c++ enum value. This changeset turns the string into a python Param.ScopedEnum. Change-Id: Ie104d055dbbc6e44997ae0c1470de714239be5a3 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15399 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-01-17cpu-o3: Make the smtIQPolicy a Param.ScopedEnumNikos Nikoleris
The smtIQPolicy is a parameter in the o3 cpu that can have 3 different values. Previously this setting was done through a string and a parser function would turn it into a c++ enum value. This changeset turns the string into a python Param.ScopedEnum. Change-Id: Ieecf0a19427dd250b0d5ae3d531ab46a37326ae5 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15398 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-01-17cpu-o3: Make the smtLSQPolicy a Param.ScopedEnumNikos Nikoleris
The smtLSQPolicy is a parameter in the o3 cpu that can have 3 different values. Previously this setting was done through a string and a parser function would turn it into a c++ enum value. This changeset turns the string into a python Param.ScopedEnum. Change-Id: I82041b88bd914c5dc660058d9e3998e3114e7c35 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15397 Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Maintainer: Jason Lowe-Power <jason@lowepower.com>
2019-01-17cpu-o3: Make the smtFetchPolicy a Param.ScopedEnumNikos Nikoleris
The smtFetchPolicy is a parameter in the o3 cpu that can have 5 different values. Previously this setting was done through a string and a parser function would turn it into a c++ enum value. This changeset turns the string into a python Param.ScopedEnum. Change-Id: Iafb4b4b27587541185ea912e5ed581bce09695f5 Signed-off-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com> Reviewed-on: https://gem5-review.googlesource.com/c/15396 Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com> Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
2019-01-16cpu: dev: sim: gpu-compute: Banish some ISA specific register types.Gabe Black
These types are IntReg, FloatReg, FloatRegBits, and MiscReg. There are some remaining types, specifically the vector registers and the CCReg. I'm less familiar with these new types of registers, and so will look at getting rid of them at some later time. Change-Id: Ide8f76b15c531286f61427330053b44074b8ac9b Reviewed-on: https://gem5-review.googlesource.com/c/13624 Reviewed-by: Gabe Black <gabeblack@google.com> Maintainer: Gabe Black <gabeblack@google.com>