gem5 - gem5

Age	Commit message (Collapse)	Author
2013-09-30	x86: Add support for FXSAVE, FXSAVE64, FXRSTOR, and FXRSTOR64	Andreas Sandberg

2013-09-30	x86: Add support for FLDENV & FNSTENV	Andreas Sandberg

2013-09-30	x86: Add support for loading 32-bit and 80-bit floats in the x87	Andreas Sandberg
	The x87 FPU supports three floating point formats: 32-bit, 64-bit, and 80-bit floats. The current gem5 implementation supports 32-bit and 64-bit floats, but only works correctly for 64-bit floats. This changeset fixes the 32-bit float handling by correctly loading and rounding (using truncation) 32-bit floats instead of simply truncating the bit pattern. 80-bit floats are loaded by first loading the 80-bits of the float to two temporary integer registers. A micro-op (cvtint_fp80) then converts the contents of the two integer registers to the internal FP representation (double). Similarly, when storing an 80-bit float, there are two conversion routines (ctvfp80h_int and cvtfp80l_int) that convert an internal FP register to 80-bit and stores the upper 64-bits or lower 32-bits to an integer register, which is the written to memory using normal integer stores.
2013-09-30	x86: Fix re-entrancy problems in x87 store instructions	Andreas Sandberg
	X87 store instructions typically loads and pops the top value of the stack and stores it in memory. The current implementation pops the stack at the same time as the floating point value is loaded to a temporary register. This will corrupt the state of the x87 stack if the store fails. This changeset introduces a pop87 micro-instruction that pops the stack and uses this instruction in the affected macro-instructions to pop the stack after storing the value to memory.
2013-09-30	kvm: Add support for thread-specific instruction events	Andreas Sandberg
	Instruction events are currently ignored when executing in KVM. This changeset adds support for triggering KVM exits based on instruction counts using hardware performance counters. Depending on the underlying performance counter implementation, there might be some inaccuracies due to instructions being counted in the host kernel when entering/exiting KVM. Due to limitations/bugs in Linux's performance counter interface, we can't reliably change the period of an overflow counter. We work around this issue by detaching and reattaching the counter if we need to reconfigure it.
2013-09-30	kvm: FPU synchronization support on x86	Andreas Sandberg
	This changeset adds support for synchronizing the FPU and SIMD state of a virtual x86 CPU with gem5. It supports both the XSave API and the KVM_(GET\|SET)_FPU kernel API. The XSave interface can be disabled using the useXSave parameter (in case of kernel issues). Unfortunately, KVM_(GET\|SET)_FPU interface seems to be buggy in some kernels (specifically, the MXCSR register isn't always synchronized), which means that it might not be possible to synchronize MXCSR on old kernels without the XSave interface. This changeset depends on the __float80 type in gcc and might not build using llvm.
2013-09-30	x86: Add support routines to load and store 80-bit floats	Andreas Sandberg
	The x87 FPU on x86 supports extended floating point. We currently handle all floating point on x86 as double and don't support 80-bit loads/stores. This changeset add a utility function to load and convert 80-bit floats to doubles (loadFloat80) and another function to store doubles as 80-bit floats (storeFloat80). Both functions use libfputils to do the conversion in software. The functions are currently not used, but are required to handle floating point in KVM and to properly support all x87 loads/stores.
2013-09-30	x86: Add limited support for extracting function call arguments	Andreas Sandberg
	Add support for extracting the first 6 64-bit integer argumements to a function call in X86ISA::getArgument().
2013-09-30	kvm: x86: Fix segment registers to make them VMX compatible	Andreas Sandberg
	There are cases when the segment registers in gem5 are not compatible with VMX. This changeset works around all known such issues. Specifically: * The accessed bits in CS, SS, DD, ES, FS, GS are forced to 1. * The busy bit in TR is forced to 1. * The protection level of SS is forced to the same protection level as CS. The difference /seems/ to be caused by a bug in gem5's x86 implementation.
2013-09-25	kvm: Add x86 segment register verification to help debugging	Andreas Sandberg

2013-09-25	kvm: Initial x86 support	Andreas Sandberg
	This changeset adds support for KVM on x86. Full support is split across a number of commits since some features are relatively complex. This changeset includes support for: * Integer state synchronization (including segment regs) * CPUID (gem5's CPUID values are inserted into KVM) * x86 legacy IO (remapped and handled by gem5's memory system) * Memory mapped IO * PCI * MSRs * State dumping Most of the functionality is fairly straight forward. There are some quirks to support PCI enumerations since this is done in the TLB(!) in the simulated CPUs. We currently replicate some of that code. Unlike the ARM implementation, the x86 implementation of the virtual CPU does not use the cycles hardware counter. KVM on x86 simulates the time stamp counter (TSC) in the kernel. If we just measure host cycles using perfevent, we might end up measuring a slightly different number of cycles. If we don't get the cycle accounting right, we might end up rewinding the TSC, with all kinds of chaos as a result. An additional feature of the KVM CPU on x86 is extended state dumping. This enables Python scripts controlling the simulator to request dumping of a subset of the processor state. The following methods are currenlty supported: * dumpFpuRegs * dumpIntRegs * dumpSpecRegs * dumpDebugRegs * dumpXCRs * dumpXSave * dumpVCpuEvents * dumpMSRs Known limitations: * M5 ops are currently not supported. * FPU synchronization is not supported (only affects CPU switching). Both of the limitations will be addressed in separate commits.
2013-09-19	kvm: Correctly handle the return value from handleIpr(Read\|Write)	Andreas Sandberg
	The KVM base class incorrectly assumed that handleIprRead and handleIprWrite both return ticks. This is not the case, instead they return cycles. This changeset converts the returned cycles to ticks when handling IPR accesses.
2013-09-19	kvm: Fix a case where the run timers weren't armed properly	Andreas Sandberg
	There is a possibility that the timespec used to arm a timer becomes zero if the number of ticks used when arming a timer is close to the resolution of the timer. Due to the semantics of POSIX timers, this actually disarms the timer. This changeset fixes this issue by eliminating the rounding error (we always round away from zero now). It also reuses the minimum number of cycles, which were previously only used for cycle-based timers, to calculate a more useful resolution.
2013-09-19	x86: Add support routines to convert between x87 tag formats	Andreas Sandberg
	This changeset adds the convX87XTagsToTags() and convX87TagsToXTags() which convert between the tag formats in the FTW register and the format used in the xsave area. The conversion from to the x87 FTW representation is currently loses some information since it does not reconstruct the valid/zero/special flags which are not included in the xsave representation.
2013-09-18	sim: Fix undefined behavior in the pseudo-inst interface	Andreas Sandberg
	The order between updating and using arg_num in PseudoInst::pseudoInst() is currently undefined. This changeset explicitly updates arg_num after it has been used to extract an argument. --HG-- extra : rebase_source : 67c46dc3333d16ce56687ee8aea41ce6c6d133bb
2013-09-18	mem: Fix scheduling bug in SimpleMemory	Andreas Hansson
	This patch ensures that a dequeue event is not scheduled if the memory controller is waiting for a retry already. Without this check it is possible for the controller to attempt sending something whilst already having one packet that is in retry, thus causing the bus to have an assertion failure.
2013-09-18	swig: Fix issue with circular import in 2.0.9/2.0.10	Andreas Hansson
	This patch fixes an issue which prevented gem5 from running when built using swig 2.0.9 and 2.0.10. The generated event.py tried to import m5.internal which in turn relied on importing event. This patch seems to fix the problem, and so far has not caused any other issues.
2013-09-18	x86: Expose the raw hash map of MSRs	Andreas Sandberg
	This patch allows the KVM CPU module to initialize it's MSRs by enumerating the MSRs in the gem5 x86 implementation.
2013-09-18	x86: Add support for checking the raw state of an interrupt	Andreas Sandberg
	In order to support hardware virtualization, we need to be able to check if there are any interrupts pending irregardless of the rflags.intf value. This changeset adds the checkInterruptsRaw() method to the x86 interrupt control. It returns true if there are pending interrupts that can be delivered as soon as the CPU is ready for interrupt delivery.
2013-09-18	x86: Expose the interrupt vector in faults	Andreas Sandberg
	This patch allows a hardware virtualized CPU to discover which interrupt to deliver to the guest.
2013-09-11	ruby: Fix Topology throttle connections	Joel Hestness
	The Topology source sets up input and output buffers for each of the external nodes of a topology by indexing on Ruby's generated controller unique IDs. These unique IDs are found by adding the MachineType_base_number to the version number of each controller (see any generated *_Controller.cc - init() calls getToNetQueue and getFromNetQueue using m_version + base). However, the Topology object used the cntrl_id - which is required to be unique across all controllers - to index the controllers list as they are being connected to their input and output buffers. If the cntrl_ids did not match the Ruby unique ID, the throttles end up connected to incorrectly indexed nodes in the network, resulting in packets traversing incorrect network paths. This patch fixes the Topology indexing scheme by using the Ruby unique ID to match that of the SimpleNetwork buffer vectors.
2013-09-11	cpu: Dynamically instantiate O3 CPU LSQUnits	Joel Hestness
	Previously, the LSQ would instantiate MaxThreads LSQUnits in the body of it's object, but it would only initialize numThreads LSQUnits as specified by the user. This had the effect of leaving some LSQUnits uninitialized when the number of threads was less than MaxThreads, and when adding statistics to the LSQUnit that must be initialized, this caused the stats initialization check to fail. By dynamically instantiating LSQUnits, they are all initialized and this avoids uninitialized LSQUnits from floating around during runtime.
2013-09-11	ruby: Statically allocate stats in SimpleNetwork, Switch, Throttle	Joel Hestness
	The previous changeset (9863:9483739f83ee) used STL vector containers to dynamically allocate stats in the Ruby SimpleNetwork, Switch and Throttle. For gcc versions before at least 4.6.3, this causes the standard vector allocator to call Stats copy constructors (a no-no, since stats should be allocated in the body of each SimObject instance). Since the size of these stats arrays is known at compile time (NOTE: after code generation), this patch changes their allocation to be static rather than using an STL vector.
2013-09-09	stats: add operator= for DataWrapVec class	Nilay Vaish
	gcc/g++ 4.4.7 complained about the operator= being undefined. This changeset adds the operator.
2013-09-06	ruby: network: convert to gem5 style stats	Nilay Vaish

2013-09-06	ruby: profiler: removes function resourceUsage()	Nilay Vaish

2013-09-06	ruby: remove undefined message size type	Nilay Vaish
	This message size type does not work well with one of the statistical variables. It also seems unnecessary.
2013-09-06	ruby: network: removes reset functionality	Nilay Vaish

2013-09-06	ruby: network: shorten variable names	Nilay Vaish

2013-09-06	stats: adds a Formula operator for division	Nilay Vaish

2013-09-06	ruby: converts sparse memory stats to gem5 style	Nilay Vaish

2013-09-05	sim: Fix clang warning for unused variable	Andreas Hansson
	This patch ensures the NULL ISA can build without causing issues with an unused variable.
2013-09-04	util: Add ini string as tooltip info in dot output	Andreas Hansson
	This patch adds the config ini string as a tooltip that can be displayed in most browsers rendering the resulting svg. Certain characters are modified for HTML output. Tested on chrome and firefox.
2013-09-04	util: Add colours to the dot output	Andreas Hansson
	This patch is adding a splash of colour to the dot output to make it easier to distinguish objects of different types. As a bonus, the pastel-colour palette also makes the output look like a something from the 21st century.
2013-09-04	util: Add class name to dot graph and output to svg	Andreas Hansson
	This patch adds the class name to the label, creates some more space by increasing the rank separation, and additionally outputs the graph as an editable SVG in addition to the PDF.
2013-09-04	arch: Resurrect the NOISA build target and rename it NULL	Andreas Hansson
	This patch makes it possible to once again build gem5 without any ISA. The main purpose is to enable work around the interconnect and memory system without having to build any CPU models or device models. The regress script is updated to include the NULL ISA target. Currently no regressions make use of it, but all the testers could (and perhaps should) transition to it. --HG-- rename : build_opts/NOISA => build_opts/NULL rename : src/arch/noisa/SConsopts => src/arch/null/SConsopts rename : src/arch/noisa/cpu_dummy.hh => src/arch/null/cpu_dummy.hh rename : src/cpu/intr_control.cc => src/cpu/intr_control_noisa.cc
2013-09-04	cpu: Move the branch predictor out of the BaseCPU	Andreas Hansson
	The branch predictor is guarded by having either the in-order or out-of-order CPU as one of the available CPU models and therefore should not be used in the BaseCPU. This patch moves the parameter to the relevant CPU classes.
2013-09-04	arch: Header clean up for NOISA resurrection	Andreas Hansson
	This patch is a first step to getting NOISA working again. A number of redundant includes make life more difficult than it has to be and this patch simply removes them. There are also some redundant forward declarations removed.
2013-09-04	alpha: Move system virtProxy to Alpha only	Andreas Hansson
	This patch moves the system virtual port proxy to the Alpha system only to make the resurrection of the NOISA slightly less painful. Alpha is the only ISA that is actually using it.
2013-09-04	scons: Enable build on OSX	Andreas Hansson
	This patch changes the SConscript to build gem5 with libc++ on OSX as the conventional libstdc++ does not have the C++11 constructs that the current code base makes use of (e.g. std::forward). Since this was the last use of the transitional TR1, the unordered map and set header can now be simplified as well.
2013-08-20	cpu: Fix timing CPU isDrained comment formatting	Andreas Hansson
	This patch fixes up the comment formatting for isDrained in the timing CPU.
2013-08-20	base: Fix VectorPrint initialisation	Andreas Hansson
	This patch changes how the initialisation of the VectorPrint struct is done so that gcc 4.4 is happy again.
2013-08-19	stats: Cumulative stats update	Andreas Hansson
	This patch updates the stats to reflect the: 1) addition of the internal queue in SimpleMemory, 2) moving of the memory class outside FSConfig, 3) fixing up of the 2D vector printing format, 4) specifying burst size and interface width for the DRAM instead of relying on cache-line size, 5) performing merging in the DRAM controller write buffer, and 6) fixing how idle cycles are counted in the atomic and timing CPU models. The main reason for bundling them up is to minimise the changeset size.
2013-08-19	cpu: Accurately count idle cycles for simple cpu	Lena Olson
	Added a couple missing updates to the notIdleFraction stat. Without these, it sometimes gives a (not) idle fraction that is greater than 1 or less than 0.
2013-08-19	config: Command line support for multi-channel memory	Andreas Hansson
	This patch adds support for specifying multi-channel memory configurations on the command line, e.g. 'se/fs.py --mem-type=ddr3_1600_x64 --mem-channels=4'. To enable this, it enhances the functionality of MemConfig and moves the existing makeMultiChannel class method from SimpleDRAM to the support scripts. The se/fs.py example scripts are updated to make use of the new feature.
2013-08-19	mem: Change AbstractMemory defaults to match the common case	Andreas Hansson
	This patch changes the default parameter value of conf_table_reported to match the common case. It also simplifies the regression and config scripts to reflect this change.
2013-08-19	cpu: Fix TrafficGen trace playback	Sascha Bischoff
	This patch addresses an issue with trace playback in the TrafficGen where the trace was reset but the header was not read from the trace when a captured trace was played back for a second time. This resulted in parsing errors as the expected message was not found in the trace file. The header check is moved to an init funtion which is called by the constructor and when the trace is reset. This ensures that the trace header is read each time when the trace is replayed. This patch also addresses a small formatting issue in a panic.
2013-08-19	mem: Use STL deque in favour of list for DRAM queues	Andreas Hansson
	This patch changes the data structure used for the DRAM read, write and response queues from an STL list to deque. This optimisation is based on the observation that the size is small (and fixed), and that the structures are frequently iterated over in a linear fashion.
2013-08-19	mem: Perform write merging in the DRAM write queue	Andreas Hansson
	This patch implements basic write merging in the DRAM to avoid redundant bursts. When a new access is added to the queue it is compared against the existing entries, and if it is either intersecting or immediately succeeding/preceeding an existing item it is merged. There is currently no attempt made at avoiding iterating over the existing items in determining whether merging is possible or not.
2013-08-19	mem: Replacing bytesPerCacheLine with DRAM burstLength in SimpleDRAM	Amin Farmahini
	This patch gets rid of bytesPerCacheLine parameter and makes the DRAM configuration separate from cache line size. Instead of bytesPerCacheLine, we define a parameter for the DRAM called burst_length. The burst_length parameter shows the length of a DRAM device burst in bits. Also, lines_per_rowbuffer is replaced with device_rowbuffer_size to improve code portablity. This patch adds a burst length in beats for each memory type, an interface width for each memory type, and the memory controller model is extended to reason about "system" packets vs "dram" packets and assemble the responses properly. It means that system packets larger than a full burst are split into multiple dram packets.