summaryrefslogtreecommitdiff
path: root/src/arch
AgeCommit message (Collapse)Author
2016-10-26gpu-compute: support in-order data delivery in GM pipeTony Gutierrez
this patch adds an ordered response buffer to the GM pipeline to ensure in-order data delivery. the buffer is implemented as a stl ordered map, which sorts the request in program order by using their sequence ID. when requests return to the GM pipeline they are marked as done. only the oldest request may be serviced from the ordered buffer, and only if is marked as done. the FIFO response buffers are kept and used in OoO delivery mode
2016-10-26gpu-compute, hsail: pass GPUDynInstPtr to getRegisterIndex()Tony Gutierrez
for HSAIL an operand's indices into the register files may be calculated trivially, because the operands are always read from a register file, or are an immediate. for machine ISA, however, an op selector may specify special registers, or may specify special SGPRs with an alias op selector value. the location of some of the special registers values are dependent on the size of the RF in some cases. here we add a way for the underlying getRegisterIndex() method to know about the size of the RFs, so that it may find the relative positions of the special register values.
2016-10-26gpu-compute, hsail: make the PC a byte address, not an instruction indexTony Gutierrez
currently the PC is incremented on an instruction granularity, and not as an instruction's byte address. machine ISA instructions assume the PC is a byte address, and is incremented accordingly. here we make the GPU model, and the HSAIL instructions treat the PC as a byte address as well.
2016-10-26gpu-compute: add gpu_isa.hh to switch hdrs, add GPUISA to WFTony Gutierrez
the GPUISA class is meant to encapsulate any ISA-specific behavior - special register accesses, isa-specific WF/kernel state, etc. - in a generic enough way so that it may be used in ISA-agnostic code. gpu-compute: use the GPUISA object to advance the PC the GPU model treats the PC as a pointer to individual instruction objects - which are store in a contiguous array - and not a byte address to be fetched from the real memory system. this is ok for HSAIL because all instructions are considered by the model to be the same size. in machine ISA, however, instructions may be 32b or 64b, and branches are calculated by advancing the PC by the number of words (4 byte chunks) it needs to advance in the real instruction stream. because of this there is a mismatch between the PC we use to index into the instruction array, and the actual byte address PC the ISA expects. here we move the PC advance calculation to the ISA so that differences in the instrucion sizes may be accounted for in generic way.
2016-10-26gpu-compute, hsail: call discardFetch() from the WFTony Gutierrez
because every taken branch causes fetch to be discarded, we move the call to the WF to avoid to have to call it from each and every branch instruction type.
2016-10-26hsail, gpu-compute: remove doGm/SmReturn add completeAccTony Gutierrez
we are removing doGmReturn from the GM pipe, and adding completeAcc() implementations for the HSAIL mem ops. the behavior in doGmReturn is dependent on HSAIL and HSAIL mem ops, however the completion phase of memory ops in machine ISA can be very different, even amongst individual machine ISA mem ops. so we remove this functionality from the pipeline and allow it to be implemented by the individual instructions.
2016-10-26gpu-compute: remove inst enums and use bit flag for attributesTony Gutierrez
this patch removes the GPUStaticInst enums that were defined in GPU.py. instead, a simple set of attribute flags that can be set in the base instruction class are used. this will help unify the attributes of HSAIL and machine ISA instructions within the model itself. because the static instrution now carries the attributes, a GPUDynInst must carry a pointer to a valid GPUStaticInst so a new static kernel launch instruction is added, which carries the attributes needed to perform a the kernel launch.
2016-10-26gpu-compute: move disassemle() implementation to GPUStaticInstTony Gutierrez
2016-10-26gpu-compute, arch: add some methods to the base inst classes for ISA supportTony Gutierrez
2016-10-15cpu, arm: Distinguish Float* and SimdFloat*, create FloatMem* opClassFernando Endo
Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which distinguishes writes to the INT and FP register banks. Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72, where the "latency" of FMADD is 3 if the next instruction is a FMADD and has only the augend to destination dependency, otherwise it's 7 cycles. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-10-13isa,arm: Add missing AArch32 FP instructionsMitch Hayenga
This commit adds missing non-predicated, scalar floating point instructions. Specifically VRINT* floating point integer rounding instructions and VSEL* floating point conditional selects. Change-Id: I23cbd1389f151389ac8beb28a7d18d5f93d000e7 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
2016-10-04kvm: Adding details to kvm page fault in x86Alexandru Dutu
Adding details, e.g. rip, rsp etc. to the kvm pagefault exit when in SE mode.
2016-09-16hsail: Fix disassembly of load instruction with 3 destination operandsAlexandru Dutu
2016-09-16gpu-compute: Refactoring Wavefront::dynWaveIdAlexandru Dutu
2016-09-16gpu-compute: Wavefront refactoringAlexandru Dutu
Renaming members of the Wavefront class in accordance with the style guide.
2016-09-15arm: Add m5_fail support for aarch64Ricardo Alves
Change-Id: Id2acbc09772be310a0eb9e33295afab07e08a4fa Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-09-13x86: Force strict ordering for memory mapped m5opsMichael LeBeane
Normal MMAPPED_IPR requests are allowed to execute speculatively under the assumption that they have no side effects. The special case of m5ops that are treated like MMAPPED_IPR should not be allowed to execute speculatively, since they can have side-effects. Adding the STRICT_ORDER flag to these requests blocks execution until the associated instruction hits the ROB head.
2016-08-15cpu, arch: fix the type used for the request flagsNikos Nikoleris
Change-Id: I183b9942929c873c3272ce6d1abd4ebc472c7132 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-08-05sim: fix issues with pwrite(); don't enable fstatfsTony Gutierrez
this patch fixes issues with changeset 11593 use the host's pwrite() syscall for pwrite64Func(), as opposed to pwrite64(), because pwrite64() does not work well on all distros. undo the enabling of fstatfs, as we will add this in a separate pate.
2016-08-04x86, sim: add some syscalls to X86Tony Gutierrez
this patch adds an implementation for the pwrite64 syscall and enables it for x86_64, and enables fstatfs for x86_64.
2016-08-02arm: refactor page table walkingCurtis Dunham
Introduce and use a lookup table. Using fetchDescriptor() rather than DMA cleanly handles nested paging. Change-Id: I69ec762f176bd752ba1040890e731826b58d15a6
2016-08-02arm: warn not fail on use of missing miscreg CNTHCTL_EL2Dylan Johnson
During host bootup, KVM reads/writes to CNTHCTL_EL2. Because this miscreg has not been implemented, the simulation would end there. This patch causes the simulation to warn about the read/write instead of fail. Change-Id: If034bfd0818a9a5e50c5fe86609e945258c96fa3
2016-08-02arm: Check TLB stage 2 permissions in AArch64Dylan Johnson
This fixes a bug where stage 2 lookups used the AArch32 permissions rules even if we were executing in AArch64 mode. Change-Id: Ia40758f0599667ca7ca15268bd3bf051342c24c1
2016-08-02arm: correctly assign faulting IPA's to HPFAR_EL2Dylan Johnson
This patch corrects IPA reporting if the translation faults in a stage 2 lookup. Change-Id: I0b914527f8a9f98a5e980a131cf9d03e5584b4e9
2016-08-02arm: Add TLBI instruction for stage 2 IPA'sDylan Johnson
This patch adds support for stage 2 TLBI instructions such as TLBI IPAS2E1_Xt. Change-Id: I0cd5e8055b0c1003e03439aa5183252f50ea0a88
2016-08-02arm: Fix stage 2 memory attribute checking in AArch64Dylan Johnson
Change-Id: I14c93a5460550051a12129e792a9a9bd522a145c
2016-08-02arm: Fix trapping to Hypervisor during MSR/MRS read/writeDylan Johnson
This patch restricts trapping to hypervisor only if we are in the correct exception level for the trap to happen. Change-Id: I0a382b6a572ef835ea36d2702b8a81b633bd3df0
2016-08-02arm: Fix secure state checking in various placesDylan Johnson
Faults that could potentially be routed to the hypervisor checked whether or not they were in a secure state without checking if security was enabled or not. This caused faults not to be routed correctly. This patch causes secure state checking to first ask if security is enabled. Change-Id: I179e9b181b27f552734c9bab2b18d05ac579a119
2016-08-02arm: Fix stage 2 determination in table walkerDylan Johnson
We recompute if we are doing a stage 2 walk inside of the table walker but we have already figured it out in the tlb. Pass the information in to the walk instead of recomputing it. Change-Id: I39637ce99309b2ddbc30344d45ac9ebf6a203401
2016-08-02arm: Refactor aarch64 table walk logic to remove redundancyDylan Johnson
The functional case is already handled within the fetchDescriptor() function. We can thus use that function for both atomic and functional mode when we start the table walk. Change-Id: Iacaed28cd9024d259fd37a58150efd00ff94d86e
2016-08-02arm: Add check to fault routing for hypervisor/virtualizationDylan Johnson
This patch adds the option for faults to be routed to the hypervisor using the pre-existing routeToHyp() functions that are present in each fault type. Change-Id: I9735512c094457636b9870456a5be5432288e004
2016-08-02arm: Fix EL perceived at TLB for address translation instructionsDylan Johnson
During address translation instructions (such as AT S1E1R_Xt) the exception level can be different than the current exception level. This patch fixes how the TLB determines what EL to use during these instructions. Change-Id: Ia9ce229404de9e284bc1f7479fd2c580efd55f8f
2016-08-02arm: Add AArch64 hypervisor call instruction 'hvc'Dylan Johnson
This patch adds the AArch64 instruction hvc which raises an exception from EL1 into EL2. The host OS uses this instruction to world switch into the guest. Change-Id: I930ee43f4f0abd4b35a68eb2a72e44e3ea6570be
2016-08-02arm: add stage2 translation supportDylan Johnson
Change-Id: I8f7c09c7ec3a97149ebebf4b21471b244e6cecc1
2016-08-02arm: enable EL2 supportCurtis Dunham
Change-Id: I59fa4fae98c33d9e5c2185382e1411911d27d341
2016-08-02arm: invalidate TLB miscreg cache on modification of HSCTLRDylan Johnson
Change-Id: I5212c91c56435fe008950ed99feacc6921609226
2016-08-02arm: change instruction classes to catch hyp trapsDylan Johnson
Change-Id: I122918d0e3dfd01ae1a4ca4f19240a069115c8b7
2016-07-21isa: Modify get/check interrupt routinesMitch Hayenga
Make it so that getInterrupt *always* returns an interrupt if checkInterrupts() returns true. This fixes/simplifies handling of interrupts on the SMT FS CPUs (currently minor).
2016-07-11arm: Don't consult the TLB test iface for functional translationsAndreas Sandberg
Don't consult the TLB test interface for PA's returned by functional translations by the AT instruction. We implement this by chaning the ISA code to synthesize 0-length functional reads for the TLB lookup. The TLB then bypasses the final PA check in the tester if the size is zero. Change-Id: I2487b7f829cea88c37e229e9fc7a4543aced961b Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2016-06-20arm: Mark uninitialized new TLB entries as not validNikos Nikoleris
Previously when we initialized the TLB we would allocate a number of TLB entries which would be marked as valid. As a result the TLB contained an entry which would be considered a valid entry for the 0 page. Change-Id: I23ace86426a171a4f6200ebeb29ad57c21647036 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-06-20kern, arm: Dump dmesg on kernel panic/oopsAndreas Sandberg
Add helper functions to dump the guest kernel's dmesg buffer to a text file in m5out. This functionality is split into two parts. First, a dmesg dump function that can be used in other places: void Linux::dumpDmesg(ThreadContext *, std::ostream &) This function is used to implement two PCEvents: DmesgDumpEvent and KernelPanic event. The only difference between the two is that the latter produces a gem5 panic instead of a warning in addition to dumping the kernel log. Change-Id: I6d2af1d666ace57124089648ea906f6c787ac63c Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
2016-06-18gpu-compute: Fixed a bug in decoding Atomic STTuan Ta
There is a mismatch between DataType and SrcDataType in constructing Atomic ST instruction. The mismatch causes atomic_store and atomic_store_explicit function to store incorrect value in memory.
2016-06-09gpu-compute: parametrize Wavefront sizejkalamat
Eliminate the VSZ constant that defined the Wavefront size (in numbers of work items); replaced it with a parameter in the GPU.py configuration script. Changed all data structures dependent on the Wavefront size to be dynamically sized. Legal values of Wavefront size are 16, 32, 64 for now and checked at initialization time.
2016-06-06stats: Fixing regStats function for some SimObjectsDavid Guillen Fandos
Fixing an issue with regStats not calling the parent class method for most SimObjects in Gem5. This causes issues if one adds new stats in the base class (since they are never initialized properly!). Change-Id: Iebc5aa66f58816ef4295dc8e48a357558d76a77c Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2016-06-06sim: Call regStats of base-class as wellStephan Diestelhorst
We want to extend the stats of objects hierarchically and thus it is necessary to register the statistics of the base-class(es), as well. For now, these are empty, but generic stats will be added there. Patch originally provided by Akash Bagdia at ARM Ltd.
2016-06-02arm: refactor page table format determinationCurtis Dunham
In particular, when EL0 is in AArch32 but EL1 is AArch64, AArch64 memory translation must be used. This is essential for typical AArch64/32 interworking use cases.
2016-06-02arm: Rewrite ERET to behave according to the ARMv8 ARMAndreas Sandberg
The ERET instruction doesn't set PSTATE correctly in some cases (particularly when returning to aarch32 code). Among other things, this breaks EL0 thumb code when using a 64-bit kernel. This changeset updates the ERET implementation to match the ARM ARM. Change-Id: I408e7c69a23cce437859313dfe84e68744b07c98 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
2016-06-02arm: Correctly check FP/SIMD access permission in aarch32Andreas Sandberg
The current implementation of aarch32 FP/SIMD in gem5 assumes that EL1 and higher are all 32-bit. This breaks interprocessing since an aarch64 EL1 uses different enable/disable bits. This change updates the permission checks to according to what is prescribed by the ARM ARM. Change-Id: Icdcef31b00644cfeebec00216b3993aa1de12b88 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com> Reviewed-by: Nathanael Premillieu <nathanael.premillieu@arm.com>
2016-05-31arm: Enable LPAE support by defaultAndreas Sandberg
LPAE has been tested with Linux 4.4 and seems to work just fine. Let's enable it by default. Change-Id: Id88c6e3c91ae9c353279d42f2aa1f8a78485bd32 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
2016-05-31arm: Correctly check translation mode (aarch64/aarch32)Andreas Sandberg
According to the ARM ARM (see AArch32.TranslateAddress in the pseudocode library), the TLB should be operating in aarch64 mode if the EL0 is aarch32 and EL1 is aarch64. This is currently not the case in gem5, which breaks 64/32 interprocessing. Update the check to match the reference manual. Change-Id: I6f1444d57c0e2eb5f8880f513f33a9197b7cb2ce Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>