Age | Commit message (Collapse) | Author |
|
This makes it explicit which type of serialization you want, and also
makes it possible to make a macroop serialize before. The old
serializing directive was renamed .serialize_after in the microcode
assembler, and throughout the microcode implementation, and its
behavior is unchanged. More specifically, it still marks the last
microop within the macroop as IsSerializing and IsSerializeAfter.
The new .serialize_before directive does something similar and marks
the first microop as IsSerializing and IsSerializeBefore.
Change-Id: Ia53466c734c651c65400809de7ef903c4a6c3e7e
Reviewed-on: https://gem5-review.googlesource.com/9041
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Gabe Black <gabeblack@google.com>
|
|
This patch adds support for cache flushing instructions in x86.
It piggybacks on support for similar instructions in arm ISA
added by Nikos Nikoleris. I have tested each instruction using
microbenchmarks.
Change-Id: I72b6b8dc30c236a21eff7958fa231f0663532d7d
Reviewed-on: https://gem5-review.googlesource.com/7401
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
|
|
This means the instruction is treated as cmpxchg8b when the effective
operand size is 16 bits.
Change-Id: I4d9bb295f96097e1746a9bbccb2c579d14738fab
Reviewed-on: https://gem5-review.googlesource.com/6603
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Gabe Black <gabeblack@google.com>
|
|
The microcode for those instructions needs a directive which overrides
that setting in the instructions emulation environment.
Reported-by: Matt Sinclair <mattdsinclair@gmail.com>
Change-Id: I474d938c0b3cf01da92ec817a58b08de783f1967
Reviewed-on: https://gem5-review.googlesource.com/6301
Reviewed-by: Gabe Black <gabeblack@google.com>
Maintainer: Gabe Black <gabeblack@google.com>
|
|
When in 64-bit mode, if the stack is accessed implicitly by an instruction
the alternate address prefix should be ignored if present.
This patch adds an extra flag to the ldstop which signifies when the
address override should be ignored. Then, for all of the affected
instructions, this patch adds two options to the ld and st opcode to use
the current stack addressing mode for all addresses and to ignore the
AddressSizeFlagBit. Finally, this patch updates the x86 TLB to not
truncate the address if it is in 64-bit mode and the IgnoreAddrSizeFlagBit
is set.
This fixes a problem when calling __libc_start_main with a binary that is
linked with a recent version of ld. This version of ld uses the address
override prefix (0x67) on the call instruction instead of a nop.
Note: This has not been tested in compatibility mode and only the call
instruction with the address override prefix has been tested.
See [1] page 9 (pdf page 45)
For instructions that are affected see [1] page 519 (pdf page 555).
[1] http://support.amd.com/TechDocs/24594.pdf
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
|
|
The previous implementation did a pair of nested RMW operations,
which isn't compatible with the way that locked RMW operations are
implemented in the cache models. It was convenient though in that
it didn't require any new micro-ops, and supported cmpxchg16b using
64-bit memory ops. It also worked in AtomicSimpleCPU where
atomicity was guaranteed by the core and not by the memory system.
It did not work with timing CPU models though.
This new implementation defines new 'split' load and store micro-ops
which allow a single memory operation to use a pair of registers as
the source or destination, then uses a single ldsplit/stsplit RMW
pair to implement cmpxchg. This patch requires support for 128-bit
memory accesses in the ISA (added via a separate patch) to support
cmpxchg16b.
|
|
Result of running 'hg m5style --skip-all --fix-white -a'.
|
|
Added explicit data sizes and an opcode type for correct execution.
|
|
The data size used for actually writing the base value for the segment was the
default size, but really it should set the entire value without any possible
truncation.
|
|
The far pointer should be shifted right to get the selector value, not left.
Also, when calculating the width of the offset, the wrong register was used in
one spot.
|
|
This patch takes quite a large step in transitioning from the ad-hoc
RefCountingPtr to the c++11 shared_ptr by adopting its use for all
Faults. There are no changes in behaviour, and the code modifications
are mostly just replacing "new" with "make_shared".
|
|
This is an implementation of the x86 int3 and int immediate
instructions for long mode according to 'AMD64 Programmers Manual
Volume 3'.
|
|
Currently call and return instructions are marked as IsCall and IsReturn. Thus, the
branch predictor does not use RAS for these instructions. Similarly, the number of
function calls that took place is recorded as 0. This patch marks these instructions
as they should be.
|
|
The 'lret' instruction reloads instruction pointer and code segment from the
stack and then pops them. But the popping part is missing from the current
implementation. This caused incorrect behavior in some code related to the
Fiasco OS. Microops are being added to rectify the behavior of the instruction.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
|
The disp displacement was left off the load microop so the wrong value was
used.
|
|
|
|
This patch adds a new microop for memory barrier. The microop itself does
nothing, but since it is marked as a memory barrier, the O3 CPU should flush
all the pending loads and stores before the fence to the memory system.
|
|
|
|
During SYSCALL_64, use dataSize=8 when handling new rip (ref
http://www.intel.com/Assets/PDF/manual/253668.pdf 5.8.8 IA32_LSTAR is a 64-bit
address)
|
|
JMP_FAR_I was unpacking its far pointer operand using sll instead of srl like
it should, and also putting the components in the wrong registers for use by
other microcode.
|
|
During iret access LDT/GDT at CPL0 rather than after transition to user mode
(if I'm reading the Intel IA-64 architecture spec correctly, the contents of
the descriptor table are read before the CPL is updated).
|
|
|
|
|
|
|
|
|
|
|
|
Unfortunately my implementation of the movd instruction had two bugs.
In one case, when moving a 32-bit value into an xmm register, the
lower half of the xmm register was not zero extended.
The other case is that xmm was used instead of xmmlm as the source
for a register move. My test case didn't notice this at first
as it moved xmm0 to eax, which both have the same register
number.
|
|
This patch implements the movd_Vo_Edp series of instructions.
It addresses various concerns by Gabe Black about which file the
instruction belonged in, as well as supporting REX prefixed
instructions properly.
This instruction is needed for some of the spec2k benchmarks, most
notably bzip2.
|
|
|
|
|
|
variants.
|
|
|
|
|
|
|
|
|
|
|
|
CMOVcc.
The manuals from both AMD and Intel say that when writing to a 32 bit
destination in 64 bit mode, the upper 32 bits of the register are filled with
zeros. They also both say that the CMOV instructions leave their destination
alone when their condition fails. Unfortunately, it seems that CMOV will zero
extend its destination register whether or not it was supposed to actually do
a move on both platforms. This seems to be the only case where this happens,
but it would be hard to say for sure.
|
|
|
|
|
|
operations zero extend.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|