summaryrefslogtreecommitdiff
path: root/src/arch
AgeCommit message (Collapse)Author
2014-01-24mem: Remove explict cast from memhelper.Ali Saidi
Previously we were casting the result type to the the memory type which is incorrect for things like dual-memory operations which still return a single result.
2014-01-24mem: per-thread cache occupancy and per-block agesDam Sunwoo
This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump.
2014-01-24x86: Fix memory leak in table walkerAndreas Hansson
This patch fixes a memory leak in the table walker, by ensuring that the sender state is deleted again if the request packet cannot be successfully sent.
2013-12-29mips: Floating point convert bug fixChristopher Torng
In mips architecture, floating point convert instructions use the FloatConvertOp format defined in src/arch/mips/isa/formats/fp.isa. The type of the operands in the ISA description file (_sw for signed word, or _sf for signed float, etc.) is used to create a type for the operand in C++. Then the operand is converted using the fpConvert() function in src/arch/mips/utility.cc. If we are converting from a word to a float, and we want to convert 0xffffffff, we expect -1 to be passed into fpConvert(). Instead, we see MAX_INT passed in. Then fpConvert() converts _val_ to MAX_INT in single-precision floating point, and we get the wrong value. To fix it, the signs of the convert operands are being changed from unsigned to signed in the MIPS ISA description. Then, the FloatConvertOp format is being changed to insert a int32_t into the C++ code instead of a uint32_t. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-11-26x86: Implementation of Int3 and Int_Ib in long modeChristian Menard
This is an implementation of the x86 int3 and int immediate instructions for long mode according to 'AMD64 Programmers Manual Volume 3'.
2013-10-31ARM: add support for TEEHBR accessChander Sudanthi
Thumb2 ARM kernels may access the TEEHBR via thumbee_notifier in arch/arm/kernel/thumbee.c. The Linux kernel code just seems to be saving and restoring the register. This patch adds support for the TEEHBR cp14 register. Note, this may be a special case when restoring from an image that was run on a system that supports ThumbEE.
2013-10-31mem: Add privilege info to request classPrakash Ramrakhyani
This patch adds a flag in the request class that indicates if the request was made in privileged mode.
2013-10-17arm: Accomodate function name changes in newer linux kernelsEric Van Hensbergen
2013-10-15arch/x86: add support for explicit CC register fileYasuko Eckert
Convert condition code registers from being specialized ("pseudo") integer registers to using the recently added CC register class. Nilay Vaish also contributed to this patch.
2013-10-15cpu: add a condition-code register classYasuko Eckert
Add a third register class for condition codes, in parallel with the integer and FP classes. No ISAs use the CC class at this point though.
2013-10-15cpu: rename *_DepTag constants to *_Reg_BaseSteve Reinhardt
Make these names more meaningful. Specifically, made these substitutions: s/FP_Base_DepTag/FP_Reg_Base/g; s/Ctrl_Base_DepTag/Misc_Reg_Base/g; s/Max_DepTag/Max_Reg_Index/g;
2013-10-15isa: clean up register constantsSteve Reinhardt
Clean up and add some consistency to the *_Base_DepTag constants as well as some related register constants: - Get rid of NumMiscArchRegs, TotalArchRegs, and TotalDataRegs since they're never used and not always defined - Set FP_Base_DepTag = NumIntRegs when possible (i.e., every case except x86) - Set Ctrl_Base_DepTag = FP_Base_DepTag + NumFloatRegs (this was true before, but wasn't always expressed that way) - Drastically reduce the number of arbitrary constants appearing in these calculations
2013-10-15cpu: clean up architectural register classificationSteve Reinhardt
Move from a poorly documented scheme where the mapping of unified architectural register indices to register classes is hardcoded all over to one where there's an enum for the register classes and a function that encapsulates the mapping.
2013-10-15mem: Rename the ASI_BITS flag field in RequestAndreas Sandberg
ASI_BITS in the Request object were originally used to store a memory request's ASI on SPARC. This is not the case any more since other ISAs use the ASI bits to store architecture-dependent information. This changeset renames the ASI_BITS to ARCH_BITS which better describes their use. Additionally, the getAsi() accessor is renamed to getArchFlags().
2013-10-15mem: Use a flag instead of address bit 63 for generic IPRsAndreas Sandberg
Using address bit 63 to identify generic IPRs caused problems on SPARC, where IPRs are heavily used. This changeset redefines how generic IPRs are identified. Instead of using bit 63, we now use a separate flag (GENERIC_IPR) a memory request.
2013-10-07x86: enables lstat and readlink syscallsNilay Vaish
2013-09-30x86: Add support for m5ops through a memory mapped interfaceAndreas Sandberg
In order to support m5ops in virtualized environments, we need to use a memory mapped interface. This changeset adds support for that by reserving 0xFFFF0000-0xFFFFFFFF and mapping those to the generic IPR interface for m5ops. The mapping is done in the X86ISA::TLB::finalizePhysical() which means that it just works for all of the CPU models, including virtualized ones.
2013-09-30arch: Add support for m5ops using mmapped IPRsAndreas Sandberg
In order to support m5ops on virtualized CPUs, we need to either intercept hypercall instructions or provide a memory mapped m5ops interface. Since KVM does not normally pass the results of hypercalls to userspace, which makes that method unfeasible. This changeset introduces support for m5ops using memory mapped mmapped IPRs. This is implemented by adding a class of "generic" IPRs which are handled by architecture-independent code. Such IPRs always have bit 63 set and are handled by handleGenericIprRead() and handleGenericIprWrite(). Platform specific impementations of handleIprRead and handleIprWrite should use GenericISA::isGenericIprAccess to determine if an IPR address should be handled by the generic code instead of the architecture-specific code. Platforms that don't need their own IPR support can reuse GenericISA::handleIprRead() and GenericISA::handleIprWrite().
2013-09-30x86: Add support for FXSAVE, FXSAVE64, FXRSTOR, and FXRSTOR64Andreas Sandberg
2013-09-30x86: Add support for FLDENV & FNSTENVAndreas Sandberg
2013-09-30x86: Add support for loading 32-bit and 80-bit floats in the x87Andreas Sandberg
The x87 FPU supports three floating point formats: 32-bit, 64-bit, and 80-bit floats. The current gem5 implementation supports 32-bit and 64-bit floats, but only works correctly for 64-bit floats. This changeset fixes the 32-bit float handling by correctly loading and rounding (using truncation) 32-bit floats instead of simply truncating the bit pattern. 80-bit floats are loaded by first loading the 80-bits of the float to two temporary integer registers. A micro-op (cvtint_fp80) then converts the contents of the two integer registers to the internal FP representation (double). Similarly, when storing an 80-bit float, there are two conversion routines (ctvfp80h_int and cvtfp80l_int) that convert an internal FP register to 80-bit and stores the upper 64-bits or lower 32-bits to an integer register, which is the written to memory using normal integer stores.
2013-09-30x86: Fix re-entrancy problems in x87 store instructionsAndreas Sandberg
X87 store instructions typically loads and pops the top value of the stack and stores it in memory. The current implementation pops the stack at the same time as the floating point value is loaded to a temporary register. This will corrupt the state of the x87 stack if the store fails. This changeset introduces a pop87 micro-instruction that pops the stack and uses this instruction in the affected macro-instructions to pop the stack after storing the value to memory.
2013-09-30x86: Add support routines to load and store 80-bit floatsAndreas Sandberg
The x87 FPU on x86 supports extended floating point. We currently handle all floating point on x86 as double and don't support 80-bit loads/stores. This changeset add a utility function to load and convert 80-bit floats to doubles (loadFloat80) and another function to store doubles as 80-bit floats (storeFloat80). Both functions use libfputils to do the conversion in software. The functions are currently not used, but are required to handle floating point in KVM and to properly support all x87 loads/stores.
2013-09-30x86: Add limited support for extracting function call argumentsAndreas Sandberg
Add support for extracting the first 6 64-bit integer argumements to a function call in X86ISA::getArgument().
2013-09-19x86: Add support routines to convert between x87 tag formatsAndreas Sandberg
This changeset adds the convX87XTagsToTags() and convX87TagsToXTags() which convert between the tag formats in the FTW register and the format used in the xsave area. The conversion from to the x87 FTW representation is currently loses some information since it does not reconstruct the valid/zero/special flags which are not included in the xsave representation.
2013-09-18x86: Expose the raw hash map of MSRsAndreas Sandberg
This patch allows the KVM CPU module to initialize it's MSRs by enumerating the MSRs in the gem5 x86 implementation.
2013-09-18x86: Add support for checking the raw state of an interruptAndreas Sandberg
In order to support hardware virtualization, we need to be able to check if there are any interrupts pending irregardless of the rflags.intf value. This changeset adds the checkInterruptsRaw() method to the x86 interrupt control. It returns true if there are pending interrupts that can be delivered as soon as the CPU is ready for interrupt delivery.
2013-09-18x86: Expose the interrupt vector in faultsAndreas Sandberg
This patch allows a hardware virtualized CPU to discover which interrupt to deliver to the guest.
2013-09-04arch: Resurrect the NOISA build target and rename it NULLAndreas Hansson
This patch makes it possible to once again build gem5 without any ISA. The main purpose is to enable work around the interconnect and memory system without having to build any CPU models or device models. The regress script is updated to include the NULL ISA target. Currently no regressions make use of it, but all the testers could (and perhaps should) transition to it. --HG-- rename : build_opts/NOISA => build_opts/NULL rename : src/arch/noisa/SConsopts => src/arch/null/SConsopts rename : src/arch/noisa/cpu_dummy.hh => src/arch/null/cpu_dummy.hh rename : src/cpu/intr_control.cc => src/cpu/intr_control_noisa.cc
2013-09-04alpha: Move system virtProxy to Alpha onlyAndreas Hansson
This patch moves the system virtual port proxy to the Alpha system only to make the resurrection of the NOISA slightly less painful. Alpha is the only ISA that is actually using it.
2013-08-19alpha: Check interrupts before quiesceAndreas Hansson
This patch adds a check to the quiesce operation to ensure that the CPU does not suspend itself when there are unmasked interrupts pending. Without this patch there are corner cases when the CPU gets an interrupt before the quiesce is executed and then never wakes up again.
2013-08-07x86: add tlb checkpointingNilay Vaish
This patch adds checkpointing support to x86 tlb. It upgrades the cpt_upgrader.py script so that previously created checkpoints can be updated. It moves the checkpoint version to 6.
2013-07-18mem: Set the cache line size on a system levelAndreas Hansson
This patch removes the notion of a peer block size and instead sets the cache line size on the system level. Previously the size was set per cache, and communicated through the interconnect. There were plenty checks to ensure that everyone had the same size specified, and these checks are now removed. Another benefit that is not yet harnessed is that the cache line size is now known at construction time, rather than after the port binding. Hence, the block size can be locally stored and does not have to be queried every time it is used. A follow-on patch updates the configuration scripts accordingly.
2013-07-11dev: make BasicPioDevice take size in constructorSteve Reinhardt
Instead of relying on derived classes explicitly assigning to the BasicPioDevice pioSize field, require them to pass a size value in to the constructor. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-11dev: consistently end device classes in 'Device'Steve Reinhardt
PciDev and IntDev stuck out as the only device classes that ended in 'Dev' rather than 'Device'. This patch takes care of that inconsistency. Note that you may need to delete pre-existing files matching build/*/python/m5/internal/param_* as scons does not pick up indirect dependencies on imported python modules when generating params, and the PciDev -> PciDevice rename takes place in a file (dev/Device.py) that gets imported quite a bit. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-11devices: make more classes derive from BasicPioDeviceSteve Reinhardt
A couple of devices that have single fixed memory mapped regions were not derived from BasicPioDevice, when that's exactly the functionality that BasicPioDevice provides. This patch gets rid of a little bit of redundant code by making those devices actually do so. Also fixed the weird case of X86ISA::Interrupts, where the class already did derive from BasicPioDevice but didn't actually use all the features it could have. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-06-27sim: Add the notion of clock domains to all ClockedObjectsAkash Bagdia
This patch adds the notion of source- and derived-clock domains to the ClockedObjects. As such, all clock information is moved to the clock domain, and the ClockedObjects are grouped into domains. The clock domains are either source domains, with a specific clock period, or derived domains that have a parent domain and a divider (potentially chained). For piece of logic that runs at a derived clock (a ratio of the clock its parent is running at) the necessary derived clock domain is created from its corresponding parent clock domain. For now, the derived clock domain only supports a divider, thus ensuring a lower speed compared to its parent. Multiplier functionality implies a PLL logic that has not been modelled yet (create a separate clock instead). The clock domains should be used as a mechanism to provide a controllable clock source that affects clock for every clocked object lying beneath it. The clock of the domain can (in a future patch) be controlled by a handler responsible for dynamic frequency scaling of the respective clock domains. All the config scripts have been retro-fitted with clock domains. For the System a default SrcClockDomain is created. For CPUs that run at a different speed than the system, there is a seperate clock domain created. This domain incorporates the CPU and the associated caches. As before, Ruby runs under its own clock domain. The clock period of all domains are pre-computed, such that no virtual functions or multiplications are needed when calling clockPeriod. Instead, the clock period is pre-computed when any changes occur. For this to be possible, each clock domain tracks its children.
2013-06-18x86: Add support for maintaining the x87 tag wordAndreas Sandberg
The current implementation of the x87 never updates the x87 tag word. This is currently not a big issue since the simulated x87 never checks for stack overflows, however this becomes an issue when switching between a virtualized CPU and a simulated CPU. This changeset adds support, which is enabled by default, for updating the tag register to every floating point microop that updates the stack top using the spm mechanism. The new tag words is generated by the helper function X86ISA::genX87Tags(). This function is currently limited to flagging a stack position as valid or invalid and does not try to distinguish between the valid, zero, and special states.
2013-06-18x86: Fix loading of floating point constantsAndreas Sandberg
This changeset actually fixes two issues: * The lfpimm instruction didn't work correctly when applied to a floating point constant (it did work for integers containing the bit string representation of a constant) since it used reinterpret_cast to convert a double to a uint64_t. This caused a compilation error, at least, in gcc 4.6.3. * The instructions loading floating point constants in the x87 processor didn't work correctly since they just stored a truncated integer instead of a double in the floating point register. This changeset fixes the old microcode by using lfpimm instruction instead of the limm instructions.
2013-06-18x86: Initialize the MXCSR registerAndreas Sandberg
2013-06-18x86: Make the boot state VMX compliantAndreas Sandberg
This patch allows the default x86 state to be used when by CPUs that use hardware virtualization.
2013-06-18x86: Make fprem like the fprem on a real x87Andreas Sandberg
The current implementation of fprem simply does an fmod and doesn't simulate any of the iterative behavior in a real fprem. This isn't normally a problem, however, it can lead to problems when switching between CPU models. If switching from a real CPU in the middle of an fprem loop to a simulated CPU, the output of the fprem loop becomes correupted. This changeset changes the fprem implementation to work like the one on real hardware.
2013-06-18x86: Add helper functions to access rflagsAndreas Sandberg
The rflags register is spread across several different registers. Most of the flags are stored in MISCREG_RFLAGS, but some are stored in microcode registers. When accessing RFLAGS, we need to reconstruct it from these registers. This changeset adds two functions, X86ISA::getRFlags() and X86ISA::setRFlags(), that take care of this magic.
2013-06-18x86: Fix the flag handling code in FABS and FCHSAndreas Sandberg
This changeset fixes two problems in the FABS and FCHS implementation. First, the ISA parser expects the assignment in flag_code to be a pure assignment and not an and-assignment, which leads to the isa_parser omitting the misc reg update. Second, the FCHS and FABS macro-ops don't set the SetStatus flag, which means that the default micro-op version, which doesn't update FSW, is executed.
2013-06-11x86: Fix bug when copying TSC on CPU handoverAndreas Sandberg
The TSC value stored in MISCREG_TSC is actually just an offset from the current CPU cycle to the actual TSC value. Writes with side-effects to the TSC subtract the current cycle count before storing the new value, while reads add the current cycle count. When switching CPUs, the current value is copied without side-effects. This works as long as the source and the destination CPUs have the same clock frequencies. The TSC will jump, sometimes backwards, if they have different clock frequencies. Most OSes assume the TSC to be monotonic and break when this happens. This changeset makes sure that the TSC is copied with side-effects to ensure that the offset is updated to match the new CPU.
2013-06-03arch: Create a method to finalize physical addressesAndreas Sandberg
in the TLB Some architectures (currently only x86) require some fixing-up of physical addresses after a normal address translation. This is usually to remap devices such as the APIC, but could be used for other memory mapped devices as well. When running the CPU in a using hardware virtualization, we still need to do these address fix-ups before inserting the request into the memory system. This patch moves this patch allows that code to be used by such CPUs without doing full address translations.
2013-05-21x86: Squash outstanding walks when instructions are squashed.Gedare Bloom
This is the x86 version of the ARM changeset baa17ba80e06. In case an instruction has been squashed by the o3 cpu, this patch allows page table walker to avoid carrying out a pending translation that the instruction requested for.
2013-05-21x86: mark instructions for being function call/returnNilay Vaish
Currently call and return instructions are marked as IsCall and IsReturn. Thus, the branch predictor does not use RAS for these instructions. Similarly, the number of function calls that took place is recorded as 0. This patch marks these instructions as they should be.
2013-05-21x86: add op class for int and fp microops in isa descriptionNilay Vaish
Currently all the integer microops are marked as IntAluOp and the floating point microops are marked as FloatAddOp. This patch adds support for marking different microops differently. Now IntMultOp, IntDivOp, FloatDivOp, FloatMultOp, FloatCvtOp, FloatSqrtOp classes will be used as well. This will help in providing different latencies for different op class.
2013-05-14arm: Add support for the m5fail pseudo-opAndreas Sandberg