gem5 - gem5

Age	Commit message (Collapse)	Author
2014-01-24	mem: Remove explict cast from memhelper.	Ali Saidi
	Previously we were casting the result type to the the memory type which is incorrect for things like dual-memory operations which still return a single result.
2014-01-24	mem: per-thread cache occupancy and per-block ages	Dam Sunwoo
	This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump.
2014-01-24	x86: Fix memory leak in table walker	Andreas Hansson
	This patch fixes a memory leak in the table walker, by ensuring that the sender state is deleted again if the request packet cannot be successfully sent.
2013-12-29	mips: Floating point convert bug fix	Christopher Torng
	In mips architecture, floating point convert instructions use the FloatConvertOp format defined in src/arch/mips/isa/formats/fp.isa. The type of the operands in the ISA description file (_sw for signed word, or _sf for signed float, etc.) is used to create a type for the operand in C++. Then the operand is converted using the fpConvert() function in src/arch/mips/utility.cc. If we are converting from a word to a float, and we want to convert 0xffffffff, we expect -1 to be passed into fpConvert(). Instead, we see MAX_INT passed in. Then fpConvert() converts _val_ to MAX_INT in single-precision floating point, and we get the wrong value. To fix it, the signs of the convert operands are being changed from unsigned to signed in the MIPS ISA description. Then, the FloatConvertOp format is being changed to insert a int32_t into the C++ code instead of a uint32_t. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-11-26	x86: Implementation of Int3 and Int_Ib in long mode	Christian Menard
	This is an implementation of the x86 int3 and int immediate instructions for long mode according to 'AMD64 Programmers Manual Volume 3'.
2013-10-31	ARM: add support for TEEHBR access	Chander Sudanthi
	Thumb2 ARM kernels may access the TEEHBR via thumbee_notifier in arch/arm/kernel/thumbee.c. The Linux kernel code just seems to be saving and restoring the register. This patch adds support for the TEEHBR cp14 register. Note, this may be a special case when restoring from an image that was run on a system that supports ThumbEE.
2013-10-31	mem: Add privilege info to request class	Prakash Ramrakhyani
	This patch adds a flag in the request class that indicates if the request was made in privileged mode.
2013-10-17	arm: Accomodate function name changes in newer linux kernels	Eric Van Hensbergen

2013-10-15	arch/x86: add support for explicit CC register file	Yasuko Eckert
	Convert condition code registers from being specialized ("pseudo") integer registers to using the recently added CC register class. Nilay Vaish also contributed to this patch.
2013-10-15	cpu: add a condition-code register class	Yasuko Eckert
	Add a third register class for condition codes, in parallel with the integer and FP classes. No ISAs use the CC class at this point though.
2013-10-15	cpu: rename _DepTag constants to _Reg_Base	Steve Reinhardt
	Make these names more meaningful. Specifically, made these substitutions: s/FP_Base_DepTag/FP_Reg_Base/g; s/Ctrl_Base_DepTag/Misc_Reg_Base/g; s/Max_DepTag/Max_Reg_Index/g;
2013-10-15	isa: clean up register constants	Steve Reinhardt
	Clean up and add some consistency to the *_Base_DepTag constants as well as some related register constants: - Get rid of NumMiscArchRegs, TotalArchRegs, and TotalDataRegs since they're never used and not always defined - Set FP_Base_DepTag = NumIntRegs when possible (i.e., every case except x86) - Set Ctrl_Base_DepTag = FP_Base_DepTag + NumFloatRegs (this was true before, but wasn't always expressed that way) - Drastically reduce the number of arbitrary constants appearing in these calculations
2013-10-15	cpu: clean up architectural register classification	Steve Reinhardt
	Move from a poorly documented scheme where the mapping of unified architectural register indices to register classes is hardcoded all over to one where there's an enum for the register classes and a function that encapsulates the mapping.
2013-10-15	mem: Rename the ASI_BITS flag field in Request	Andreas Sandberg
	ASI_BITS in the Request object were originally used to store a memory request's ASI on SPARC. This is not the case any more since other ISAs use the ASI bits to store architecture-dependent information. This changeset renames the ASI_BITS to ARCH_BITS which better describes their use. Additionally, the getAsi() accessor is renamed to getArchFlags().
2013-10-15	mem: Use a flag instead of address bit 63 for generic IPRs	Andreas Sandberg
	Using address bit 63 to identify generic IPRs caused problems on SPARC, where IPRs are heavily used. This changeset redefines how generic IPRs are identified. Instead of using bit 63, we now use a separate flag (GENERIC_IPR) a memory request.
2013-10-07	x86: enables lstat and readlink syscalls	Nilay Vaish

2013-09-30	x86: Add support for m5ops through a memory mapped interface	Andreas Sandberg
	In order to support m5ops in virtualized environments, we need to use a memory mapped interface. This changeset adds support for that by reserving 0xFFFF0000-0xFFFFFFFF and mapping those to the generic IPR interface for m5ops. The mapping is done in the X86ISA::TLB::finalizePhysical() which means that it just works for all of the CPU models, including virtualized ones.
2013-09-30	arch: Add support for m5ops using mmapped IPRs	Andreas Sandberg
	In order to support m5ops on virtualized CPUs, we need to either intercept hypercall instructions or provide a memory mapped m5ops interface. Since KVM does not normally pass the results of hypercalls to userspace, which makes that method unfeasible. This changeset introduces support for m5ops using memory mapped mmapped IPRs. This is implemented by adding a class of "generic" IPRs which are handled by architecture-independent code. Such IPRs always have bit 63 set and are handled by handleGenericIprRead() and handleGenericIprWrite(). Platform specific impementations of handleIprRead and handleIprWrite should use GenericISA::isGenericIprAccess to determine if an IPR address should be handled by the generic code instead of the architecture-specific code. Platforms that don't need their own IPR support can reuse GenericISA::handleIprRead() and GenericISA::handleIprWrite().
2013-09-30	x86: Add support for FXSAVE, FXSAVE64, FXRSTOR, and FXRSTOR64	Andreas Sandberg

2013-09-30	x86: Add support for FLDENV & FNSTENV	Andreas Sandberg

2013-09-30	x86: Add support for loading 32-bit and 80-bit floats in the x87	Andreas Sandberg
	The x87 FPU supports three floating point formats: 32-bit, 64-bit, and 80-bit floats. The current gem5 implementation supports 32-bit and 64-bit floats, but only works correctly for 64-bit floats. This changeset fixes the 32-bit float handling by correctly loading and rounding (using truncation) 32-bit floats instead of simply truncating the bit pattern. 80-bit floats are loaded by first loading the 80-bits of the float to two temporary integer registers. A micro-op (cvtint_fp80) then converts the contents of the two integer registers to the internal FP representation (double). Similarly, when storing an 80-bit float, there are two conversion routines (ctvfp80h_int and cvtfp80l_int) that convert an internal FP register to 80-bit and stores the upper 64-bits or lower 32-bits to an integer register, which is the written to memory using normal integer stores.
2013-09-30	x86: Fix re-entrancy problems in x87 store instructions	Andreas Sandberg
	X87 store instructions typically loads and pops the top value of the stack and stores it in memory. The current implementation pops the stack at the same time as the floating point value is loaded to a temporary register. This will corrupt the state of the x87 stack if the store fails. This changeset introduces a pop87 micro-instruction that pops the stack and uses this instruction in the affected macro-instructions to pop the stack after storing the value to memory.
2013-09-30	x86: Add support routines to load and store 80-bit floats	Andreas Sandberg
	The x87 FPU on x86 supports extended floating point. We currently handle all floating point on x86 as double and don't support 80-bit loads/stores. This changeset add a utility function to load and convert 80-bit floats to doubles (loadFloat80) and another function to store doubles as 80-bit floats (storeFloat80). Both functions use libfputils to do the conversion in software. The functions are currently not used, but are required to handle floating point in KVM and to properly support all x87 loads/stores.
2013-09-30	x86: Add limited support for extracting function call arguments	Andreas Sandberg
	Add support for extracting the first 6 64-bit integer argumements to a function call in X86ISA::getArgument().
2013-09-19	x86: Add support routines to convert between x87 tag formats	Andreas Sandberg
	This changeset adds the convX87XTagsToTags() and convX87TagsToXTags() which convert between the tag formats in the FTW register and the format used in the xsave area. The conversion from to the x87 FTW representation is currently loses some information since it does not reconstruct the valid/zero/special flags which are not included in the xsave representation.
2013-09-18	x86: Expose the raw hash map of MSRs	Andreas Sandberg
	This patch allows the KVM CPU module to initialize it's MSRs by enumerating the MSRs in the gem5 x86 implementation.
2013-09-18	x86: Add support for checking the raw state of an interrupt	Andreas Sandberg
	In order to support hardware virtualization, we need to be able to check if there are any interrupts pending irregardless of the rflags.intf value. This changeset adds the checkInterruptsRaw() method to the x86 interrupt control. It returns true if there are pending interrupts that can be delivered as soon as the CPU is ready for interrupt delivery.
2013-09-18	x86: Expose the interrupt vector in faults	Andreas Sandberg
	This patch allows a hardware virtualized CPU to discover which interrupt to deliver to the guest.
2013-09-04	arch: Resurrect the NOISA build target and rename it NULL	Andreas Hansson
	This patch makes it possible to once again build gem5 without any ISA. The main purpose is to enable work around the interconnect and memory system without having to build any CPU models or device models. The regress script is updated to include the NULL ISA target. Currently no regressions make use of it, but all the testers could (and perhaps should) transition to it. --HG-- rename : build_opts/NOISA => build_opts/NULL rename : src/arch/noisa/SConsopts => src/arch/null/SConsopts rename : src/arch/noisa/cpu_dummy.hh => src/arch/null/cpu_dummy.hh rename : src/cpu/intr_control.cc => src/cpu/intr_control_noisa.cc
2013-09-04	alpha: Move system virtProxy to Alpha only	Andreas Hansson
	This patch moves the system virtual port proxy to the Alpha system only to make the resurrection of the NOISA slightly less painful. Alpha is the only ISA that is actually using it.
2013-08-19	alpha: Check interrupts before quiesce	Andreas Hansson
	This patch adds a check to the quiesce operation to ensure that the CPU does not suspend itself when there are unmasked interrupts pending. Without this patch there are corner cases when the CPU gets an interrupt before the quiesce is executed and then never wakes up again.
2013-08-07	x86: add tlb checkpointing	Nilay Vaish
	This patch adds checkpointing support to x86 tlb. It upgrades the cpt_upgrader.py script so that previously created checkpoints can be updated. It moves the checkpoint version to 6.
2013-07-18	mem: Set the cache line size on a system level	Andreas Hansson
	This patch removes the notion of a peer block size and instead sets the cache line size on the system level. Previously the size was set per cache, and communicated through the interconnect. There were plenty checks to ensure that everyone had the same size specified, and these checks are now removed. Another benefit that is not yet harnessed is that the cache line size is now known at construction time, rather than after the port binding. Hence, the block size can be locally stored and does not have to be queried every time it is used. A follow-on patch updates the configuration scripts accordingly.
2013-07-11	dev: make BasicPioDevice take size in constructor	Steve Reinhardt
	Instead of relying on derived classes explicitly assigning to the BasicPioDevice pioSize field, require them to pass a size value in to the constructor. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-11	dev: consistently end device classes in 'Device'	Steve Reinhardt
	PciDev and IntDev stuck out as the only device classes that ended in 'Dev' rather than 'Device'. This patch takes care of that inconsistency. Note that you may need to delete pre-existing files matching build//python/m5/internal/param_ as scons does not pick up indirect dependencies on imported python modules when generating params, and the PciDev -> PciDevice rename takes place in a file (dev/Device.py) that gets imported quite a bit. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-11	devices: make more classes derive from BasicPioDevice	Steve Reinhardt
	A couple of devices that have single fixed memory mapped regions were not derived from BasicPioDevice, when that's exactly the functionality that BasicPioDevice provides. This patch gets rid of a little bit of redundant code by making those devices actually do so. Also fixed the weird case of X86ISA::Interrupts, where the class already did derive from BasicPioDevice but didn't actually use all the features it could have. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-06-27	sim: Add the notion of clock domains to all ClockedObjects	Akash Bagdia
	This patch adds the notion of source- and derived-clock domains to the ClockedObjects. As such, all clock information is moved to the clock domain, and the ClockedObjects are grouped into domains. The clock domains are either source domains, with a specific clock period, or derived domains that have a parent domain and a divider (potentially chained). For piece of logic that runs at a derived clock (a ratio of the clock its parent is running at) the necessary derived clock domain is created from its corresponding parent clock domain. For now, the derived clock domain only supports a divider, thus ensuring a lower speed compared to its parent. Multiplier functionality implies a PLL logic that has not been modelled yet (create a separate clock instead). The clock domains should be used as a mechanism to provide a controllable clock source that affects clock for every clocked object lying beneath it. The clock of the domain can (in a future patch) be controlled by a handler responsible for dynamic frequency scaling of the respective clock domains. All the config scripts have been retro-fitted with clock domains. For the System a default SrcClockDomain is created. For CPUs that run at a different speed than the system, there is a seperate clock domain created. This domain incorporates the CPU and the associated caches. As before, Ruby runs under its own clock domain. The clock period of all domains are pre-computed, such that no virtual functions or multiplications are needed when calling clockPeriod. Instead, the clock period is pre-computed when any changes occur. For this to be possible, each clock domain tracks its children.
2013-06-18	x86: Add support for maintaining the x87 tag word	Andreas Sandberg
	The current implementation of the x87 never updates the x87 tag word. This is currently not a big issue since the simulated x87 never checks for stack overflows, however this becomes an issue when switching between a virtualized CPU and a simulated CPU. This changeset adds support, which is enabled by default, for updating the tag register to every floating point microop that updates the stack top using the spm mechanism. The new tag words is generated by the helper function X86ISA::genX87Tags(). This function is currently limited to flagging a stack position as valid or invalid and does not try to distinguish between the valid, zero, and special states.
2013-06-18	x86: Fix loading of floating point constants	Andreas Sandberg
	This changeset actually fixes two issues: * The lfpimm instruction didn't work correctly when applied to a floating point constant (it did work for integers containing the bit string representation of a constant) since it used reinterpret_cast to convert a double to a uint64_t. This caused a compilation error, at least, in gcc 4.6.3. * The instructions loading floating point constants in the x87 processor didn't work correctly since they just stored a truncated integer instead of a double in the floating point register. This changeset fixes the old microcode by using lfpimm instruction instead of the limm instructions.
2013-06-18	x86: Initialize the MXCSR register	Andreas Sandberg

2013-06-18	x86: Make the boot state VMX compliant	Andreas Sandberg
	This patch allows the default x86 state to be used when by CPUs that use hardware virtualization.
2013-06-18	x86: Make fprem like the fprem on a real x87	Andreas Sandberg
	The current implementation of fprem simply does an fmod and doesn't simulate any of the iterative behavior in a real fprem. This isn't normally a problem, however, it can lead to problems when switching between CPU models. If switching from a real CPU in the middle of an fprem loop to a simulated CPU, the output of the fprem loop becomes correupted. This changeset changes the fprem implementation to work like the one on real hardware.
2013-06-18	x86: Add helper functions to access rflags	Andreas Sandberg
	The rflags register is spread across several different registers. Most of the flags are stored in MISCREG_RFLAGS, but some are stored in microcode registers. When accessing RFLAGS, we need to reconstruct it from these registers. This changeset adds two functions, X86ISA::getRFlags() and X86ISA::setRFlags(), that take care of this magic.
2013-06-18	x86: Fix the flag handling code in FABS and FCHS	Andreas Sandberg
	This changeset fixes two problems in the FABS and FCHS implementation. First, the ISA parser expects the assignment in flag_code to be a pure assignment and not an and-assignment, which leads to the isa_parser omitting the misc reg update. Second, the FCHS and FABS macro-ops don't set the SetStatus flag, which means that the default micro-op version, which doesn't update FSW, is executed.
2013-06-11	x86: Fix bug when copying TSC on CPU handover	Andreas Sandberg
	The TSC value stored in MISCREG_TSC is actually just an offset from the current CPU cycle to the actual TSC value. Writes with side-effects to the TSC subtract the current cycle count before storing the new value, while reads add the current cycle count. When switching CPUs, the current value is copied without side-effects. This works as long as the source and the destination CPUs have the same clock frequencies. The TSC will jump, sometimes backwards, if they have different clock frequencies. Most OSes assume the TSC to be monotonic and break when this happens. This changeset makes sure that the TSC is copied with side-effects to ensure that the offset is updated to match the new CPU.
2013-06-03	arch: Create a method to finalize physical addresses	Andreas Sandberg
	in the TLB Some architectures (currently only x86) require some fixing-up of physical addresses after a normal address translation. This is usually to remap devices such as the APIC, but could be used for other memory mapped devices as well. When running the CPU in a using hardware virtualization, we still need to do these address fix-ups before inserting the request into the memory system. This patch moves this patch allows that code to be used by such CPUs without doing full address translations.
2013-05-21	x86: Squash outstanding walks when instructions are squashed.	Gedare Bloom
	This is the x86 version of the ARM changeset baa17ba80e06. In case an instruction has been squashed by the o3 cpu, this patch allows page table walker to avoid carrying out a pending translation that the instruction requested for.
2013-05-21	x86: mark instructions for being function call/return	Nilay Vaish
	Currently call and return instructions are marked as IsCall and IsReturn. Thus, the branch predictor does not use RAS for these instructions. Similarly, the number of function calls that took place is recorded as 0. This patch marks these instructions as they should be.
2013-05-21	x86: add op class for int and fp microops in isa description	Nilay Vaish
	Currently all the integer microops are marked as IntAluOp and the floating point microops are marked as FloatAddOp. This patch adds support for marking different microops differently. Now IntMultOp, IntDivOp, FloatDivOp, FloatMultOp, FloatCvtOp, FloatSqrtOp classes will be used as well. This will help in providing different latencies for different op class.
2013-05-14	arm: Add support for the m5fail pseudo-op	Andreas Sandberg