path: root/src/gpu-compute/compute_unit.cc
2017-07-12  gpu-compute: Refactor some Event subclasses to lambdas  (Sean Wilson)

Change-Id: Ic1332b8e8ba0afacbe591c80f4d06afbf5f04bd9
Signed-off-by: Sean Wilson <spwilson2@wisc.edu>
Reviewed-on: https://gem5-review.googlesource.com/3922
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Anthony Gutierrez <anthony.gutierrez@amd.com>
Maintainer: Anthony Gutierrez <anthony.gutierrez@amd.com>
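The gist of this refactor, sketched standalone below: instead of declaring a dedicated Event subclass per callback, the callback is captured in a lambda and handed to a function-wrapping event (gem5 provides EventFunctionWrapper for this; the classes here are simplified stand-ins, not the model's actual types).

    #include <functional>
    #include <iostream>

    // Simplified stand-in for gem5's Event base class.
    struct Event {
        virtual ~Event() = default;
        virtual void process() = 0;
    };

    // Old style: one dedicated subclass per callback.
    struct TickEvent : Event {
        void process() override { std::cout << "tick\n"; }
    };

    // New style: a generic wrapper holding any callable, so call sites
    // just pass a lambda (gem5's EventFunctionWrapper plays this role).
    struct FunctionEvent : Event {
        std::function<void()> cb;
        explicit FunctionEvent(std::function<void()> f) : cb(std::move(f)) {}
        void process() override { cb(); }
    };

    int main() {
        FunctionEvent e([] { std::cout << "tick\n"; });
        e.process();  // the event queue would normally invoke this
    }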
2016-10-26  gpu-compute: support in-order data delivery in GM pipe  (Tony Gutierrez)
This patch adds an ordered response buffer to the GM pipeline to ensure in-order data delivery. The buffer is implemented as an STL ordered map (std::map), which sorts requests in program order by their sequence ID. When requests return to the GM pipeline they are marked as done. Only the oldest request may be serviced from the ordered buffer, and only if it is marked as done. The FIFO response buffers are kept and used in out-of-order (OoO) delivery mode.
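A minimal standalone sketch of that mechanism (the Response struct and member names are illustrative, not the model's actual types): keying a std::map by sequence ID keeps entries in program order, and only the head entry is drained, and only once it is marked done.

    #include <cstdint>
    #include <map>
    #include <optional>

    struct Response {
        bool done = false;   // set when the memory system returns data
        int  data = 0;
    };

    class OrderedResponseBuffer {
        // std::map keeps entries sorted by sequence ID, i.e. program order.
        std::map<uint64_t, Response> buf;
      public:
        void insert(uint64_t seqId) { buf.emplace(seqId, Response{}); }

        void markDone(uint64_t seqId, int data) {
            auto it = buf.find(seqId);
            if (it != buf.end()) { it->second.done = true; it->second.data = data; }
        }

        // Service only the oldest request, and only if it has completed.
        std::optional<Response> serviceOldest() {
            if (buf.empty() || !buf.begin()->second.done)
                return std::nullopt;
            Response r = buf.begin()->second;
            buf.erase(buf.begin());
            return r;
        }
    };

    int main() {
        OrderedResponseBuffer b;
        b.insert(1); b.insert(2);
        b.markDone(2, 42);                    // younger request finished first...
        bool blocked = !b.serviceOldest();    // ...but cannot be drained yet
        return blocked ? 0 : 1;
    }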
2016-10-26  gpu-compute: use System cache line size in the GPU  (Tony Gutierrez)

2016-10-26  gpu-compute: add instruction mix stats for the gpu  (Tony Gutierrez)

2016-10-26  gpu-compute: remove inst enums and use bit flag for attributes  (Tony Gutierrez)
This patch removes the GPUStaticInst enums that were defined in GPU.py. Instead, a simple set of attribute flags that can be set in the base instruction class is used. This will help unify the attributes of HSAIL and machine ISA instructions within the model itself. Because the static instruction now carries the attributes, a GPUDynInst must carry a pointer to a valid GPUStaticInst, so a new static kernel launch instruction is added, which carries the attributes needed to perform the kernel launch.
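Roughly what replacing per-type enums with attribute bit flags looks like, as a standalone sketch (the flag names are illustrative, not the model's actual attribute list):

    #include <cstdint>

    // Each attribute is one bit, so an instruction can carry any combination.
    enum InstAttr : uint32_t {
        ATTR_GLOBAL_MEM  = 1u << 0,
        ATTR_SHARED_MEM  = 1u << 1,
        ATTR_MEM_FENCE   = 1u << 2,
        ATTR_KERN_LAUNCH = 1u << 3,
    };

    struct StaticInstBase {
        uint32_t attrs = 0;
        void setFlag(InstAttr a)         { attrs |= a; }
        bool isFlagSet(InstAttr a) const { return attrs & a; }
    };

    int main() {
        StaticInstBase kernLaunch;
        kernLaunch.setFlag(ATTR_KERN_LAUNCH);
        return kernLaunch.isFlagSet(ATTR_GLOBAL_MEM) ? 1 : 0;
    }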
2016-10-04  gpu-compute: Added method to compute the actual workgroup size  (Alexandru Dutu)
This patch adds a method to the Wavefront class to compute the actual workgroup size. This can differ from the maximum workgroup size specified when launching the kernel through the NDRange object. The current solution is still not optimal: the computation is repeated for each wavefront, and the dispatcher also needs this information but cannot call Wavefront::computeActuallWgSz before the wavefronts are created. A long-term solution would be a Workgroup class that handles all these details.
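The underlying computation, sketched standalone (function and variable names are illustrative): a workgroup at the edge of the NDRange can be partial, so its actual size in each dimension is the work remaining after the preceding workgroups, clamped by the maximum workgroup size.

    #include <algorithm>
    #include <cstdio>

    // Actual size of workgroup wgId along one dimension.
    int actualWgSize(int gridSize, int maxWgSize, int wgId)
    {
        int remaining = gridSize - wgId * maxWgSize;
        return std::max(0, std::min(maxWgSize, remaining));
    }

    int main()
    {
        // Grid of 100 work-items, workgroups of up to 64: the second
        // workgroup only gets the remaining 36 work-items.
        std::printf("%d %d\n", actualWgSize(100, 64, 0), actualWgSize(100, 64, 1));
    }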
2016-09-16  gpu-compute: Refactoring Wavefront::dynWaveId  (Alexandru Dutu)

2016-09-16  gpu-compute: Wavefront refactoring  (Alexandru Dutu)
Renaming members of the Wavefront class in accordance with the style guide.
2016-09-16  gpu-compute: Remove WFContext  (Alexandru Dutu)
The WFContext struct is currently unused and is no longer useful for saving and restoring the context of a Wavefront. The Wavefront class should be sufficient for that purpose, and the runtime can determine the memory size it needs to allocate for a Wavefront through an IOCTL.
2016-06-09  gpu-compute: parametrize Wavefront size  (jkalamat)
Eliminate the VSZ constant that defined the Wavefront size (in number of work items) and replace it with a parameter in the GPU.py configuration script. All data structures that depend on the Wavefront size are now sized dynamically. Legal values of the Wavefront size are currently 16, 32, and 64, and are checked at initialization time.
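A standalone sketch of the kind of initialization-time check and dynamic sizing this implies (wfSize and the container names are illustrative, not the model's actual members):

    #include <cstdio>
    #include <cstdlib>
    #include <vector>

    struct Wavefront {
        int wfSize;
        std::vector<bool> execMask;   // sized by wfSize instead of a VSZ constant

        explicit Wavefront(int size) : wfSize(size), execMask(size, true) {
            if (wfSize != 16 && wfSize != 32 && wfSize != 64) {
                std::fprintf(stderr, "unsupported wavefront size %d\n", wfSize);
                std::exit(1);
            }
        }
    };

    int main() { Wavefront wf(64); return wf.execMask.size() == 64 ? 0 : 1; }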
2016-06-06  stats: Fixing regStats function for some SimObjects  (David Guillen Fandos)
Fix an issue with regStats not calling the parent class method for most SimObjects in gem5. This causes problems if new stats are added to the base class, since they are never initialized properly.

Change-Id: Iebc5aa66f58816ef4295dc8e48a357558d76a77c
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
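The pattern being fixed, as a standalone sketch (SimObjectLike and ComputeUnitLike are simplified stand-ins, not gem5 classes): an overriding regStats must chain to the parent class so base-class stats are registered too.

    #include <iostream>

    struct SimObjectLike {
        virtual ~SimObjectLike() = default;
        virtual void regStats() { std::cout << "register base stats\n"; }
    };

    struct ComputeUnitLike : SimObjectLike {
        void regStats() override {
            // The bug: forgetting this call leaves base-class stats uninitialized.
            SimObjectLike::regStats();
            std::cout << "register compute-unit stats\n";
        }
    };

    int main() { ComputeUnitLike cu; cu.regStats(); }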
2016-04-07  mem: Remove threadId from memory request class  (Mitch Hayenga)
In general, the ThreadID parameter is unnecessary in the memory system, as the ContextID is what is used for locks and wakeups. Since we allocate sequential ContextIDs for each thread on MT-enabled CPUs, ThreadID is unnecessary: the CPUs can identify the requesting thread through sideband info (SenderState / LSQ entries) or the ContextID offset from the CPU's base ContextID. This is a re-spin of 20264eb after the revert (bd1c6789) and includes some fixes of that commit.
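The "offset from the base ContextID" idea, in a standalone sketch (the names contextId and baseContextId are illustrative, not the request class's API):

    #include <cassert>

    // On an MT-enabled CPU whose threads got sequential context IDs,
    // the hardware thread can be recovered from the request's context ID alone.
    int threadFromContext(int contextId, int baseContextId)
    {
        return contextId - baseContextId;
    }

    int main()
    {
        const int baseContextId = 4;                        // first thread of this CPU
        assert(threadFromContext(6, baseContextId) == 2);   // third thread of this CPU
    }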
2016-03-04  base: Fix gpu-compute output stream creation  (Andreas Hansson)
Match changes in output stream.
2016-02-18  gpu: fix bugs with MemFence, Flat Instrs and Resource utilization  (John Kalamatianos)
Memory Fence instructions are now flagged as Global Memory only, to avoid oversubscribing resources. Flat instructions now check whether the Shared Memory resource is busy, also to avoid oversubscribing resources. All WaitClass resources now use cycles (not ticks) to register the number of pipe stages between Scoreboard and Execute, to be consistent with the instruction scheduling logic, which has always used clock cycles.
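A standalone sketch of the cycles-vs-ticks distinction (pipeDepthCycles and clockPeriod are illustrative): the pipe depth is a fixed number of stages, so it is stored in cycles and only converted to ticks when an absolute time is needed.

    #include <cstdint>
    #include <cstdio>

    using Tick = uint64_t;

    // Pipe depth between Scoreboard and Execute, in clock cycles: this is what
    // the scheduling logic reasons about, independent of the clock period.
    constexpr int pipeDepthCycles = 4;

    Tick cyclesToTicks(int cycles, Tick clockPeriod) { return Tick(cycles) * clockPeriod; }

    int main()
    {
        const Tick clockPeriod = 1000;  // ticks per cycle in this clock domain
        std::printf("ready after %llu ticks\n",
                    (unsigned long long)cyclesToTicks(pipeDepthCycles, clockPeriod));
    }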
2016-01-19  gpu-compute: AMD's baseline GPU model  (Tony Gutierrez)