gem5 - gem5

Age	Commit message	Author
2017-02-10	x86: Fix implicit stack addressing in 64-bit mode	Jason Lowe-Power
	In 64-bit mode, if an instruction accesses the stack implicitly, the address override prefix should be ignored if present. This patch adds an extra flag to the ldstop which signifies when the address override should be ignored. Then, for all of the affected instructions, this patch adds two options to the ld and st opcodes: one to use the current stack addressing mode for all addresses, and one to ignore the AddressSizeFlagBit. Finally, this patch updates the x86 TLB to not truncate the address if it is in 64-bit mode and the IgnoreAddrSizeFlagBit is set. This fixes a problem when calling __libc_start_main with a binary that is linked with a recent version of ld. This version of ld uses the address override prefix (0x67) on the call instruction instead of a nop. Note: This has not been tested in compatibility mode, and only the call instruction with the address override prefix has been tested. See [1] page 9 (pdf page 45). For the affected instructions, see [1] page 519 (pdf page 555). [1] http://support.amd.com/TechDocs/24594.pdf Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
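The rule described in this commit can be sketched as a small address-sizing helper. This is a minimal illustration, not gem5's TLB code: the function name and both boolean parameters are hypothetical stand-ins for the decoded prefix and the new "ignore address size" flag.

```cpp
#include <cstdint>

// Hypothetical helper (not gem5 code): size a linear address in 64-bit mode.
//   hasAddrPrefix - the 0x67 address-size override prefix was decoded
//   implicitStack - the access goes through the stack implicitly
//                   (for example the return-address push of a call)
inline uint64_t sizeAddress64(uint64_t vaddr, bool hasAddrPrefix,
                              bool implicitStack)
{
    // The prefix selects 32-bit addressing, except for implicit stack
    // accesses, which keep the full 64-bit address.
    const bool truncateTo32 = hasAddrPrefix && !implicitStack;
    return truncateTo32 ? (vaddr & 0xffffffffULL) : vaddr;
}
```

With a stack address above 4 GiB, a prefixed call keeps the full address under this rule, whereas an ordinary prefixed memory operand would be truncated to its low 32 bits.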
2016-02-06	x86: revamp cmpxchg8b/cmpxchg16b implementation	Alexandru Dutu
	The previous implementation did a pair of nested RMW operations, which isn't compatible with the way that locked RMW operations are implemented in the cache models. It was convenient though in that it didn't require any new micro-ops, and supported cmpxchg16b using 64-bit memory ops. It also worked in AtomicSimpleCPU where atomicity was guaranteed by the core and not by the memory system. It did not work with timing CPU models though. This new implementation defines new 'split' load and store micro-ops which allow a single memory operation to use a pair of registers as the source or destination, then uses a single ldsplit/stsplit RMW pair to implement cmpxchg. This patch requires support for 128-bit memory accesses in the ISA (added via a separate patch) to support cmpxchg16b.
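As a rough functional model of the single load/store pair described above, the sketch below treats the 128-bit operand as one value held in a register pair. The type and function names are hypothetical, and the model is single-threaded: it shows the data flow only, not the locking done by the memory system.

```cpp
#include <cstdint>

// Hypothetical 128-bit value held in a pair of 64-bit registers.
struct Pair128 {
    uint64_t lo;
    uint64_t hi;
};

// Functional sketch of cmpxchg16b: one "split" load fills both halves, one
// "split" store writes both halves. 'expected' models RDX:RAX and 'desired'
// models RCX:RBX. Returns the value ZF would take.
inline bool cmpxchg16bModel(Pair128 &mem, Pair128 &expected,
                            const Pair128 &desired)
{
    const Pair128 loaded = mem;            // single split load
    if (loaded.lo == expected.lo && loaded.hi == expected.hi) {
        mem = desired;                     // single split store
        return true;
    }
    mem = loaded;                          // store back the unchanged value
    expected = loaded;                     // RDX:RAX receive the memory value
    return false;
}
```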
2016-02-06	style: remove trailing whitespace	Steve Reinhardt
	Result of running 'hg m5style --skip-all --fix-white -a'.
2015-10-06	x86: implement rcpps and rcpss SSE insts	Steve Reinhardt
	These are packed single-precision approximate reciprocal operations, vector and scalar versions, respectively. This code was basically developed by copying the code for sqrtps and sqrtss. The mrcp micro-op was simplified relative to msqrt since there are no double-precision versions of this operation.
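The difference between the vector and scalar forms can be sketched as follows. The names are hypothetical, and an exact `1.0f / x` stands in for the hardware's approximate reciprocal, which is an assumption made only to keep the sketch simple.

```cpp
#include <array>
#include <cstddef>

using Vec4f = std::array<float, 4>;

// Stand-in for the hardware approximation (assumption: exact reciprocal).
inline float approxRecip(float x) { return 1.0f / x; }

// rcpps: reciprocal of all four single-precision lanes.
inline Vec4f rcppsModel(const Vec4f &src)
{
    Vec4f out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = approxRecip(src[i]);
    return out;
}

// rcpss: reciprocal of lane 0 only; the upper lanes of dest are kept.
inline Vec4f rcpssModel(const Vec4f &dest, const Vec4f &src)
{
    Vec4f out = dest;
    out[0] = approxRecip(src[0]);
    return out;
}
```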
2015-10-06	x86: implement fild, fucomi, and fucomip x87 insts	Steve Reinhardt
	fild loads an integer value into the x87 top of stack register. fucomi/fucomip compare two x87 register values (the latter also doing a stack pop). These instructions are used by some versions of GNU libstdc++.
2015-07-20	x86: x86 instruction-implementation bug fixes	David Hashe
	Added explicit data sizes and an opcode type for correct execution.
2015-07-04	x86: Adjust the size of the values written to the x87 misc registers	Nikos Nikoleris
	All x87 misc registers are implemented in an array of 64-bit values, but in real hardware some of these registers are smaller. Previously all 64 bits were incorrectly set and then later read. To ensure correctness we mask the value in setMiscRegNoEffect to write only the valid bits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
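The masking idea can be sketched like this. It is an illustration only: `maskToWidth`, `MiscRegs`, and `setFsw` are hypothetical names, and the only architectural fact assumed is that the x87 status word is 16 bits wide.

```cpp
#include <cstdint>

// Keep only the low 'validBits' bits of a value stored in a 64-bit slot.
inline uint64_t maskToWidth(uint64_t value, unsigned validBits)
{
    if (validBits >= 64)
        return value;
    return value & ((uint64_t{1} << validBits) - 1);
}

// Hypothetical register file with a 64-bit slot for a 16-bit register.
struct MiscRegs {
    uint64_t fsw = 0;  // x87 status word: 16 valid bits

    // Write only the valid bits so later reads never see stale upper bits.
    void setFsw(uint64_t value) { fsw = maskToWidth(value, 16); }
};
```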
2015-04-13	x86: implements x87 mult/div instructions	Nilay Vaish

2015-01-10	x86 : fxsave and fxrestore missing template code	Emilio Castillo
	This patch corrects the FXSAVE and FXRSTOR macroops. The code for saving and restoring the FP registers was already in the file but was not used. The FXSAVE and FXRSTOR instructions are used in the kernel for saving and loading the state of the MMX, XMM, and FPU registers. This operation is triggered in FS mode by issuing a Device Not Available fault. The cr0 register has a TS flag that is set upon each context change. Every time a task accesses any FP-related register (SIMD as well) while the TS flag is set to one, the Device Not Available fault is issued. The kernel saves the current state of the registers and restores the previous state of the currently running task. Right now gem5 lacks this capability: the Device Not Available fault is never issued, leading to several problems when different threads share the same CPU and SMT is not used. The PARSEC Ferret benchmark is an example of this behavior. In order to test this, a hack was made in the atomic CPU code to detect whether a static instruction has any FP operands and the cr0 TS bit is set. This check must be done in the ISA-dependent code, but it seems to be tricky to access the cr0 register while executing an instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-01-03	x86: implements the simd128 ADDSUBPD instruction	Maxime Martinasso
	This patch implements the simd128 ADDSUBPD instruction for the x86 architecture. Tested with a simple assembly-language program that executes the instruction. Checked that different versions of the instruction are executed by using the execution tracing option. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-11-17	x86: Fix setting segment bases in real mode.	Gabe Black
	The data size used for actually writing the base value for the segment was the default size, but really it should set the entire value without any possible truncation.
2014-11-17	x86: Fix some bugs in the real mode far jmp instruction.	Gabe Black
	The far pointer should be shifted right to get the selector value, not left. Also, when calculating the width of the offset, the wrong register was used in one spot.
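The shift direction matters because the selector sits above the offset in the packed far pointer. A minimal sketch, with hypothetical names and the offset width passed in as a parameter:

```cpp
#include <cstdint>

struct FarPointer {
    uint16_t selector;
    uint64_t offset;
};

// Unpack a far pointer whose low 'offsetBits' bits (16 or 32) hold the
// offset and whose next 16 bits hold the selector.
inline FarPointer unpackFarPointer(uint64_t packed, unsigned offsetBits)
{
    const uint64_t offsetMask = (uint64_t{1} << offsetBits) - 1;
    FarPointer fp;
    // Shift right to bring the selector down; a left shift would discard it.
    fp.selector = static_cast<uint16_t>(packed >> offsetBits);
    fp.offset = packed & offsetMask;
    return fp;
}
```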
2014-10-16	arch: Use shared_ptr for all Faults	Andreas Hansson
	This patch takes quite a large step in transitioning from the ad-hoc RefCountingPtr to the c++11 shared_ptr by adopting its use for all Faults. There are no changes in behaviour, and the code modifications are mostly just replacing "new" with "make_shared".
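The kind of change described here might look like the sketch below. The class names and the `translate` function are hypothetical and do not reproduce gem5's fault hierarchy; the point is only that faults are created with `std::make_shared` and passed around as `std::shared_ptr`.

```cpp
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical fault classes for illustration only.
class FaultBase {
  public:
    virtual ~FaultBase() = default;
    virtual std::string name() const = 0;
};

class PageFault : public FaultBase {
  public:
    explicit PageFault(uint64_t addr) : addr_(addr) {}
    std::string name() const override { return "page fault"; }
    uint64_t addr() const { return addr_; }

  private:
    uint64_t addr_;
};

using Fault = std::shared_ptr<FaultBase>;

// In this sketch a null pointer stands for "no fault".
inline Fault translate(uint64_t addr, bool mapped)
{
    if (mapped)
        return nullptr;
    // Previously this would have been a "new PageFault(addr)" wrapped in an
    // ad-hoc reference-counting pointer.
    return std::make_shared<PageFault>(addr);
}
```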
2014-01-27	x86: use lfpimm instead of limm for fptan	Nilay Vaish

2014-01-27	x86: implements x87 add/sub instructions	Nilay Vaish

2014-01-27	x86: implements fxch instruction.	Nilay Vaish

2013-11-26	x86: Implementation of Int3 and Int_Ib in long mode	Christian Menard
	This is an implementation of the x86 int3 and int immediate instructions for long mode according to 'AMD64 Programmers Manual Volume 3'.
2013-09-30	x86: Add support for FXSAVE, FXSAVE64, FXRSTOR, and FXRSTOR64	Andreas Sandberg

2013-09-30	x86: Add support for FLDENV & FNSTENV	Andreas Sandberg

2013-09-30	x86: Add support for loading 32-bit and 80-bit floats in the x87	Andreas Sandberg
	The x87 FPU supports three floating point formats: 32-bit, 64-bit, and 80-bit floats. The current gem5 implementation supports 32-bit and 64-bit floats, but only works correctly for 64-bit floats. This changeset fixes the 32-bit float handling by correctly loading and rounding (using truncation) 32-bit floats instead of simply truncating the bit pattern. 80-bit floats are loaded by first loading the 80 bits of the float into two temporary integer registers. A micro-op (cvtint_fp80) then converts the contents of the two integer registers to the internal FP representation (double). Similarly, when storing an 80-bit float, there are two conversion routines (cvtfp80h_int and cvtfp80l_int) that convert an internal FP register to 80-bit format and store the upper 64 bits or lower 32 bits in an integer register, which is then written to memory using normal integer stores.
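The two load paths can be sketched as plain conversion functions. These are illustrations with hypothetical names, not the gem5 micro-ops. The 80-bit sketch assumes the standard x87 extended layout (a 64-bit significand with an explicit integer bit, then a 15-bit exponent and a sign bit) and ignores NaNs, infinities, and denormals.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Interpret a 32-bit pattern as a float, then widen the value to double.
// Copying the bit pattern into a double directly would give a wrong value.
inline double loadFloat32Bits(uint32_t bits)
{
    float f;
    static_assert(sizeof(f) == sizeof(bits), "float must be 32 bits");
    std::memcpy(&f, &bits, sizeof(f));
    return static_cast<double>(f);
}

// Combine the two integer halves of an 80-bit float into a double.
//   significand - low 64 bits of the 80-bit value
//   signExp     - high 16 bits: sign bit plus 15-bit biased exponent
inline double fp80ToDouble(uint64_t significand, uint16_t signExp)
{
    const bool negative = (signExp & 0x8000) != 0;
    const int exponent = signExp & 0x7fff;
    // The bias is 16383 and the significand has 63 fraction bits.
    const double magnitude =
        std::ldexp(static_cast<double>(significand), exponent - 16383 - 63);
    return negative ? -magnitude : magnitude;
}
```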
2013-09-30	x86: Fix re-entrancy problems in x87 store instructions	Andreas Sandberg
	X87 store instructions typically load and pop the top value of the stack and store it in memory. The current implementation pops the stack at the same time as the floating point value is loaded to a temporary register. This will corrupt the state of the x87 stack if the store fails. This changeset introduces a pop87 micro-instruction that pops the stack and uses this instruction in the affected macro-instructions to pop the stack after storing the value to memory.
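The ordering fix can be sketched with a toy stack model. All names are hypothetical, and the store is passed in as a callable that returns false to model a faulting access.

```cpp
#include <array>

// Toy x87 register stack: 'top' indexes the current ST(0).
struct X87Stack {
    std::array<double, 8> regs{};
    unsigned top = 0;

    double st0() const { return regs[top]; }
    void pop() { top = (top + 1) & 7; }
};

// Store ST(0), and pop only after the store has succeeded. If the store
// faults, the stack is untouched and the instruction can be re-executed.
template <typename StoreFn>
bool storeAndPop(X87Stack &stack, StoreFn store)
{
    if (!store(stack.st0()))
        return false;
    stack.pop();
    return true;
}
```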
2013-06-18	x86: Fix loading of floating point constants	Andreas Sandberg
	This changeset actually fixes two issues: * The lfpimm instruction didn't work correctly when applied to a floating point constant (it did work for integers containing the bit string representation of a constant) since it used reinterpret_cast to convert a double to a uint64_t. This caused a compilation error, at least, in gcc 4.6.3. * The instructions loading floating point constants in the x87 processor didn't work correctly since they just stored a truncated integer instead of a double in the floating point register. This changeset fixes the old microcode by using lfpimm instruction instead of the limm instructions.
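A common portable way to get the bit pattern of a double, instead of `reinterpret_cast` on the value, is to copy the bytes with `std::memcpy`. The helpers below are a sketch with hypothetical names and are not taken from the gem5 change itself.

```cpp
#include <cstdint>
#include <cstring>

// Return the IEEE 754 bit pattern of a double.
inline uint64_t doubleToBits(double value)
{
    uint64_t bits;
    static_assert(sizeof(bits) == sizeof(value), "double must be 64 bits");
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Inverse: rebuild a double from its bit pattern.
inline double bitsToDouble(uint64_t bits)
{
    double value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

By contrast, converting the value with `static_cast<uint64_t>` yields a truncated integer (3.75 becomes 3), which is the kind of result the second issue describes.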
2013-06-18	x86: Make fprem like the fprem on a real x87	Andreas Sandberg
	The current implementation of fprem simply does an fmod and doesn't simulate any of the iterative behavior of a real fprem. This isn't normally a problem; however, it can lead to problems when switching between CPU models. If switching from a real CPU in the middle of an fprem loop to a simulated CPU, the output of the fprem loop becomes corrupted. This changeset changes the fprem implementation to work like the one on real hardware.
2013-06-18	x86: Fix the flag handling code in FABS and FCHS	Andreas Sandberg
	This changeset fixes two problems in the FABS and FCHS implementation. First, the ISA parser expects the assignment in flag_code to be a pure assignment and not an and-assignment, which leads to the isa_parser omitting the misc reg update. Second, the FCHS and FABS macro-ops don't set the SetStatus flag, which means that the default micro-op version, which doesn't update FSW, is executed.
2013-05-21	x86: mark instructions for being function call/return	Nilay Vaish
	Currently call and return instructions are not marked as IsCall and IsReturn. Thus, the branch predictor does not use the RAS for these instructions. Similarly, the number of function calls that took place is recorded as 0. This patch marks these instructions as they should be.
2013-04-23	x86: increment the stack pointer in lret inst	Christian Menard
	The 'lret' instruction reloads instruction pointer and code segment from the stack and then pops them. But the popping part is missing from the current implementation. This caused incorrect behavior in some code related to the Fiasco OS. Microops are being added to rectify the behavior of the instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-03-11	x86: implement some of the x87 instructions	Nilay Vaish
	This patch implements the fptan, fprem, fyl2x, and fld* floating-point instructions.
2013-01-15	x86: implements fsin, fcos instructions	Nilay Vaish

2013-01-15	x86: implements emms instruction	Nilay Vaish

2013-01-15	x86: implement fabs, fchs instructions	Nilay Vaish

2012-12-30	x86: implement x87 fp instruction fnstsw	Nilay Vaish
	This patch implements the fnstsw instruction. The code was originally written by Vince Weaver. Gabe had made some comments about the code, but those were never addressed. This patch addresses those comments.
2012-12-30	x86: implement x87 fp instruction fsincos	Nilay Vaish
	This patch implements the fsincos instruction. The code was originally written by Vince Weaver. Gabe had made some comments about the code, but those were never addressed. This patch addresses those comments.
2012-05-19	x86 ISA: Implement the sse3 haddps instruction.	Marc Orr
	Shuffle the 32 bit values into position, and then add in parallel.
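The shuffle-then-add approach can be sketched on four-element arrays. The function name is hypothetical. The lane order follows the haddps definition: the low two results come from the destination operand and the high two from the source.

```cpp
#include <array>
#include <cstddef>

using Vec4f = std::array<float, 4>;

// haddps sketch: gather the first and second element of each adjacent pair
// into two vectors, then add them lane by lane.
inline Vec4f haddpsModel(const Vec4f &dest, const Vec4f &src)
{
    const Vec4f first{{dest[0], dest[2], src[0], src[2]}};
    const Vec4f second{{dest[1], dest[3], src[1], src[3]}};
    Vec4f out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = first[i] + second[i];
    return out;
}
```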
2012-04-29	X86: Fix the IMUL_R_P_I macroop.	Gabe Black
	The disp displacement was left off the load microop so the wrong value was used.
2012-01-09	X86: Add memory fence to I/O instructions	Nilay Vaish

2011-11-03	x86: Add microop for fence	Nilay Vaish
	This patch adds a new microop for memory barrier. The microop itself does nothing, but since it is marked as a memory barrier, the O3 CPU should flush all the pending loads and stores before the fence to the memory system.
2011-05-06	X86: Fix the Lldt instructions so they load the ldtr and not the tr.	Gabe Black

2011-03-01	X86: Mark IO reads and writes as non-speculative.	Gabe Black

2011-02-07	X86: Use all 64 bits of the lstar register in the SYSCALL_64 macroop.	Tim Harris
	During SYSCALL_64, use dataSize=8 when handling the new rip (ref: http://www.intel.com/Assets/PDF/manual/253668.pdf, section 5.8.8; IA32_LSTAR is a 64-bit address).
2011-02-07	X86: Fix JMP_FAR_I to unpack a far pointer correctly.	Tim Harris
	JMP_FAR_I was unpacking its far pointer operand using sll instead of srl like it should, and also putting the components in the wrong registers for use by other microcode.
2011-02-07	X86: Read the LDT/GDT at CPL0 when executing an iret.	Tim Harris
	During iret, access the LDT/GDT at CPL0 rather than after the transition to user mode (if I'm reading the Intel 64 architecture spec correctly, the contents of the descriptor table are read before the CPL is updated).
2011-02-02	X86: Replace the stupd microop with a store/update sequence.	Gabe Black

2010-09-29	X86: Fix the RIP relative versions of the BT, BTC, BTR, and BTS instructions.	Gabe Black

2010-08-23	X86: Mark serializing macroops and regular instructions as such.	Gabe Black

2010-07-21	Fix x86 XCHG macro-op to use locked micro-ops for all memory accesses	Tushar Krishna

2010-05-23	copyright: Change HP copyright on x86 code to be more friendly	Nathan Binkert

2009-12-19	X86: Add a common named flag for signed media operations.	Gabe Black

2009-12-19	X86: Create a common flag with a name to indicate high multiplies.	Gabe Black

2009-12-19	X86: Create a common flag with a name to indicate scalar media instructions.	Gabe Black

2009-11-10	X86: Fix bugs in movd implementation.	Vince Weaver
	Unfortunately my implementation of the movd instruction had two bugs. In one case, when moving a 32-bit value into an xmm register, the lower half of the xmm register was not zero extended. The other case is that xmm was used instead of xmmlm as the source for a register move. My test case didn't notice this at first as it moved xmm0 to eax, which both have the same register number.
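The zero-extension behavior can be sketched with a toy 128-bit register. The names are hypothetical, and the register is modeled as two 64-bit halves.

```cpp
#include <cstdint>

// Toy xmm register made of two 64-bit halves.
struct Xmm {
    uint64_t lo;
    uint64_t hi;
};

// movd into an xmm register: the 32-bit value is zero-extended, so every
// bit above bit 31 ends up cleared, including stale data in the low half.
inline void movdToXmm(Xmm &dest, uint32_t value)
{
    dest.lo = value;  // implicit zero-extension to 64 bits
    dest.hi = 0;
}

// movd out of an xmm register: take the low 32 bits.
inline uint32_t movdFromXmm(const Xmm &src)
{
    return static_cast<uint32_t>(src.lo);
}
```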