diff options
author | Tuan Ta <qtt2@cornell.edu> | 2018-01-22 13:12:50 -0500 |
---|---|---|
committer | Tuan Ta <qtt2@cornell.edu> | 2019-02-08 15:27:04 +0000 |
commit | 25dc765889d948693995cfa622f001aa94b5364b (patch) | |
tree | 38a8e93881ad150a482020a1fd706d664ee0c061 /src/cpu/base_dyn_inst.hh | |
parent | 165a7dab558c8118622a387683521bea1ebf2e6c (diff) | |
download | gem5-25dc765889d948693995cfa622f001aa94b5364b.tar.xz |
cpu: support atomic memory request type with AtomicOpFunctor
This patch enables all 4 CPU models (AtomicSimpleCPU, TimingSimpleCPU,
MinorCPU and DerivO3CPU) to issue atomic memory (AMO) requests to memory
system.
Atomic memory instruction is treated as a special store instruction in
all CPU models.
In simple CPUs, an AMO request with an associated AtomicOpFunctor is
simply sent to L1 dcache.
In MinorCPU, an AMO request bypasses store buffer and waits for any
conflicting store request(s) currently in the store buffer to retire
before the AMO request is sent to the cache. AMO requests are not buffered
in the store buffer, so their effects appear immediately in the cache.
In DerivO3CPU, an AMO request is inserted in the store buffer so that it
is delivered to the cache only after all previous stores are issued to
the cache. Data forwarding between between an outstanding AMO in the
store buffer and a subsequent load is not allowed since the AMO request
does not hold valid data until it's executed in the cache.
This implementation assumes that a target ISA implementation must insert
enough memory fences as micro-ops around an atomic instruction to
enforce a correct order of memory instructions with respect to its
memory consistency model. Without extra memory fences, this implementation
can allow AMOs and other memory instructions that do not conflict
(i.e., not target the same address) to reorder.
This implementation also assumes that atomic instructions execute within
a cache line boundary since the cache for now is not able to execute an
operation on two different cache lines in one single step. Therefore,
ISAs like x86 that require multi-cache-line atomic instructions need to
either use a pair of locking load and unlocking store or change the
cache implementation to guarantee the atomicity of an atomic
instruction.
Change-Id: Ib8a7c81868ac05b98d73afc7d16eb88486f8cf9a
Reviewed-on: https://gem5-review.googlesource.com/c/8188
Reviewed-by: Giacomo Travaglini <giacomo.travaglini@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Diffstat (limited to 'src/cpu/base_dyn_inst.hh')
-rw-r--r-- | src/cpu/base_dyn_inst.hh | 19 |
1 files changed, 19 insertions, 0 deletions
diff --git a/src/cpu/base_dyn_inst.hh b/src/cpu/base_dyn_inst.hh index c24517937..9a1ab062c 100644 --- a/src/cpu/base_dyn_inst.hh +++ b/src/cpu/base_dyn_inst.hh @@ -303,6 +303,9 @@ class BaseDynInst : public ExecContext, public RefCounted Fault writeMem(uint8_t *data, unsigned size, Addr addr, Request::Flags flags, uint64_t *res); + Fault initiateMemAMO(Addr addr, unsigned size, Request::Flags flags, + AtomicOpFunctor *amo_op); + /** True if the DTB address translation has started. */ bool translationStarted() const { return instFlags[TranslationStarted]; } void translationStarted(bool f) { instFlags[TranslationStarted] = f; } @@ -920,4 +923,20 @@ BaseDynInst<Impl>::writeMem(uint8_t *data, unsigned size, Addr addr, /* st */ false, data, size, addr, flags, res); } +template<class Impl> +Fault +BaseDynInst<Impl>::initiateMemAMO(Addr addr, unsigned size, + Request::Flags flags, + AtomicOpFunctor *amo_op) +{ + // atomic memory instructions do not have data to be written to memory yet + // since the atomic operations will be executed directly in cache/memory. + // Therefore, its `data` field is nullptr. + // Atomic memory requests need to carry their `amo_op` fields to cache/ + // memory + return cpu->pushRequest( + dynamic_cast<typename DynInstPtr::PtrType>(this), + /* atomic */ false, nullptr, size, addr, flags, nullptr, amo_op); +} + #endif // __CPU_BASE_DYN_INST_HH__ |