gem5 - gem5

Age	Commit message (Collapse)	Author
2014-09-03	arm: Fix v8 neon latency issue for loads/stores	Mitch Hayenga
	Neon memory ops that operate on multiple registers currently have very poor performance because of interleave/deinterleave micro-ops. This patch marks the deinterleave/interleave micro-ops as "No_OpClass" such that they take minumum cycles to execute and are never resource constrained. Additionaly the micro-ops over-read registers. Although one form may need to read up to 20 sources, not all do. This adds in new forms so false dependencies are not modeled. Instructions read their minimum number of sources.
2014-04-29	arm: use condition code registers for ARM ISA	Curtis Dunham
	Analogous to ee049bf (for x86). Requires a bump of the checkpoint version and corresponding upgrader code to move the condition code register values to the new register file.
2014-09-03	arm: ISA X31 destination register fix	Andrew Bardsley
	This patch substituted the zero register for X31 used as a destination register. This prevents false dependencies based on X31.
2014-09-03	arm: Mark v7 cbz instructions as direct branches	Mitch Hayenga
	v7 cbz/cbnz instructions were improperly marked as indirect branches.
2014-05-27	arm: support 16kb vm granules	Curtis Dunham

2014-09-03	arch, cpu: Factor out the ExecContext into a proper base class	Andreas Sandberg
	We currently generate and compile one version of the ISA code per CPU model. This is obviously wasting a lot of resources at compile time. This changeset factors out the interface into a separate ExecContext class, which also serves as documentation for the interface between CPUs and the ISA code. While doing so, this changeset also fixes up interface inconsistencies between the different CPU models. The main argument for using one set of ISA code per CPU model has always been performance as this avoid indirect branches in the generated code. However, this argument does not hold water. Booting Linux on a simulated ARM system running in atomic mode (opt/10.linux-boot/realview-simple-atomic) is actually 2% faster (compiled using clang 3.4) after applying this patch. Additionally, compilation time is decreased by 35%.
2014-09-03	arch: Cleanup unused ISA traits constants	Andreas Hansson
	This patch prunes unused values, and also unifies how the values are defined (not using an enum for ALPHA), aligning the use of int vs Addr etc. The patch also removes the duplication of PageBytes/PageShift and VMPageSize/LogVMPageSize. For all ISAs the two pairs had identical values and the latter has been removed.
2014-09-03	config: Change parsing of Addr so hex values work from scripts	Mitch Hayenga
	When passed from a configuration script with a hexadecimal value (like "0x80000000"), gem5 would error out. This is because it would call "toMemorySize" which requires the argument to end with a size specifier (like 1MB, etc). This modification makes it so raw hex values can be passed through Addr parameters from the configuration scripts.
2014-09-03	arm: Fix ExtMachInst hash operator underlying type	Andreas Hansson
	This patch fixes the hash operator used for ARM ExtMachInst, which incorrectly was still using uint32_t. Instead of changing it to uint64_t it is not using the underlying data type of the BitUnion.
2014-08-28	mem: adding architectural page table support for SE mode	Alexandru
	This patch enables the use of page tables that are stored in system memory and respect x86 specification, in SE mode. It defines an architectural page table for x86 as a MultiLevelPageTable class and puts a placeholder class for other ISAs page tables, giving the possibility for future implementation.
2014-08-13	arm: change MISCREG_L2ERRSR to warn not fail	Dam Sunwoo
	Some newer binaries compiled for Versatile Express TC2 contain access to implementation specific L2MERRSR registers. This causes an infinite loop of undefined exceptions. This patch changes the behavior to "warn not fail" to keep the workloads going.
2014-03-11	arm: remove dead code fplib mul64x64	Curtis Dunham

2014-05-12	syscall emulation: clean up & comment SyscallReturn	Steve Reinhardt

2014-04-17	arm: Make sure UndefinedInstructions are properly initialized	Ali Saidi

2014-04-17	arm: allow DC instructions by default so SE mode works	Ali Saidi

2014-04-17	sim, arm: implement more of the at variety syscalls	Ali Saidi
	Needed for new AArch64 binaries
2014-05-09	arm: Add branch flags onto macroops	Andrew Bardsley
	Mark branch flags onto macroops to allow branch prediction before microop decomposition
2014-05-09	arm: add preliminary ISA splits for ARM arch	Curtis Dunham

2014-05-09	arch: teach ISA parser how to split code across files	Curtis Dunham
	This patch encompasses several interrelated and interdependent changes to the ISA generation step. The end goal is to reduce the size of the generated compilation units for instruction execution and decoding so that batch compilation can proceed with all CPUs active without exhausting physical memory. The ISA parser (src/arch/isa_parser.py) has been improved so that it can accept 'split [output_type];' directives at the top level of the grammar and 'split(output_type)' python calls within 'exec {{ ... }}' blocks. This has the effect of "splitting" the files into smaller compilation units. I use air-quotes around "splitting" because the files themselves are not split, but preprocessing directives are inserted to have the same effect. Architecturally, the ISA parser has had some changes in how it works. In general, it emits code sooner. It doesn't generate per-CPU files, and instead defers to the C preprocessor to create the duplicate copies for each CPU type. Likewise there are more files emitted and the C preprocessor does more substitution that used to be done by the ISA parser. Finally, the build system (SCons) needs to be able to cope with a dynamic list of source files coming out of the ISA parser. The changes to the SCons{cript,truct} files support this. In broad strokes, the targets requested on the command line are hidden from SCons until all the build dependencies are determined, otherwise it would try, realize it can't reach the goal, and terminate in failure. Since build steps (i.e. running the ISA parser) must be taken to determine the file list, several new build stages have been inserted at the very start of the build. First, the build dependencies from the ISA parser will be emitted to arch/$ISA/generated/inc.d, which is then read by a new SCons builder to finalize the dependencies. (Once inc.d exists, the ISA parser will not need to be run to complete this step.) Once the dependencies are known, the 'Environments' are made by the makeEnv() function. This function used to be called before the build began but now happens during the build. It is easy to see that this step is quite slow; this is a known issue and it's important to realize that it was already slow, but there was no obvious cause to attribute it to since nothing was displayed to the terminal. Since new steps that used to be performed serially are now in a potentially-parallel build phase, the pathname handling in the SCons scripts has been tightened up to deal with chdir() race conditions. In general, pathnames are computed earlier and more likely to be stored, passed around, and processed as absolute paths rather than relative paths. In the end, some of these issues had to be fixed by inserting serializing dependencies in the build. Minor note: For the null ISA, we just provide a dummy inc.d so SCons is never compelled to try to generate it. While it seems slightly wrong to have anything in src/arch/*/generated (i.e. a non-generated 'generated' file), it's by far the simplest solution.
2014-05-09	arch, arm: Preserve TLB bootUncacheability when switching CPUs	Geoffrey Blake
	The ARM TLBs have a bootUncacheability flag used to make some loads and stores become uncacheable when booting in FS mode. Later the flag is cleared to let those loads and stores operate as normal. When doing a takeOverFrom(), this flag's state is not preserved and is momentarily reset until the CPSR is touched. On single core runs this is a non-issue. On multi-core runs this can lead to crashes on the O3 CPU model from the following series of events: 1) takeOverFrom executed to switch from Atomic -> O3 2) All bootUncacheability flags are reset to true 3) Core2 tries to execute a load covered by bootUncacheability, it is flagged as uncacheable 4) Core2's load needs to replay due to a pipeline flush 3) Core1 core does an action on CPSR 4) The handling code for CPSR then checks all other cores to determine if bootUncacheability can be set to false 5) Asynchronously set bootUncacheability on all cores to false 6) Core2 replays load previously set as uncacheable and notices it is now flagged as cacheable, leads to a panic. This patch implements takeOverFrom() functionality for the ARM TLBs to preserve flag values when switching from atomic -> detailed.
2014-05-09	cpu, arm: Allow the specification of a socket field	Akash Bagdia
	Allow the specification of a socket ID for every core that is reflected in the MPIDR field in ARM systems. This allows studying multi-socket / cluster systems with ARM CPUs.
2014-05-09	arm: Panics in miscreg read functions can be tripped by O3 model	Geoffrey Blake
	Unimplemented miscregs for the generic timer were guarded by panics in arm/isa.cc which can be tripped by the O3 model if it speculatively executes a wrong path containing a mrs instruction with a bad miscreg index. These registers were flagged as implemented and accessible. This patch changes the miscreg info bit vector to flag them as unimplemented and inaccessible. In this case, and UndefinedInst fault will be generated if the register access is not trapped by a hypervisor.
2014-05-09	arch: remove inline specifiers on all inst constrs, all ISAs	Curtis Dunham
	With (upcoming) separate compilation, they are useless. Only link-time optimization could re-inline them, but ideally feedback-directed optimization would choose to do so only for profitable (i.e. common) instructions.
2014-05-09	arm: cleanup ARM ISA definition	Curtis Dunham

2014-04-23	arm: Correctly display disassembly of vldmia/vstmia	Curtis Dunham
	The MicroMemOp class generates the disassembly for both integer and floating point instructions, but it would always print its first operand as an integer register without considering that the op may be a floating instruction in which case a float register should be displayed instead.
2014-04-23	arm: Don't use a stack allocated mnemonic	Mitchell Hayenga
	FailUnimplemented passed a stack created mnemonic as a const char * which causes some grief when the stack goes away.
2014-03-23	arm: m5ops readfile64 args broken, offset coming through garbage	Eric Van Hensbergen
	There were several sections of the m5ops code which were essentially copy/pasted versions of the 32-bit code. The problem is that some of these didn't account fo4 64-bit registers leading to arguments being in the wrong registers. This patch addresses the args for readfile64, writefile64, and addsymbol64 -- all of which seemed to suffer from a similar set of problems when moving to 64-bit.
2014-03-07	arm: Handle functional TLB walks properly	Geoffrey Blake
	The table walker code currently accounts for two types of walks, Atomic and Timing, and treats them differently. Atomic walks keep a single instance of WalkerState around for all walks to use in currState. Timing mode keeps a queue of in-flight WalkerStates and maintains currState as NULL between walks. If a functional walk is done during Timing mode, it is treated as an atomic walk and either creates a persistent WalkerState if in between Timing walks, or stomps an existing currState for an in-progress Timing walk. This patch distinguishes functional walks as being able to exist at any time and sets up a temporary WalkerState for its exclusive use and then cleans up when finished, leaving any in progress Atomic or Timing walks undisturbed.
2014-03-07	scons: Fixes uninitialized warnings issued by clang	Mitch Hayenga
	Small fixes to appease recent clang versions.
2014-03-07	arm: Fix uninitialised warning with gcc 4.8	Stephan Diestelhorst
	Small fix for a warning that prevents compilation with gcc 4.8.1 due to detecting that a variable might be uninitialised. The fix is to assign a safe default.
2014-01-28	arm: Enable umask syscall in SE mode	Mitch Hayenga
	Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-24	arm: Add support for ARMv8 (AArch64 & AArch32)	ARM gem5 Developers
	Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64 kernel you are restricted to AArch64 user-mode binaries. This will be addressed in a later patch. Note: Virtualization is only supported in AArch32 mode. This will also be fixed in a later patch. Contributors: Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation) Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation) Mbou Eyole (AArch64 NEON, validation) Ali Saidi (AArch64 Linux support, code integration, validation) Edmund Grimley-Evans (AArch64 FP) William Wang (AArch64 Linux support) Rene De Jong (AArch64 Linux support, performance opt.) Matt Horsnell (AArch64 MP, validation) Matt Evans (device models, code integration, validation) Chris Adeniyi-Jones (AArch64 syscall-emulation) Prakash Ramrakhyani (validation) Dam Sunwoo (validation) Chander Sudanthi (validation) Stephan Diestelhorst (validation) Andreas Hansson (code integration, performance opt.) Eric Van Hensbergen (performance opt.) Gabe Black
2014-01-24	arch: Make all register index flattening const	Andreas Hansson
	This patch makes all the register index flattening methods const for all the ISAs. As part of this, readMiscRegNoEffect for ARM is also made const.
2014-01-24	cpu: Add CPU support for generatig wake up events when LLSC adresses are ↵	Ali Saidi
	snooped. This patch add support for generating wake-up events in the CPU when an address that is currently in the exclusive state is hit by a snoop. This mechanism is required for ARMv8 multi-processor support.
2014-01-24	mem: per-thread cache occupancy and per-block ages	Dam Sunwoo
	This patch enables tracking of cache occupancy per thread along with ages (in buckets) per cache blocks. Cache occupancy stats are recalculated on each stat dump.
2013-10-31	ARM: add support for TEEHBR access	Chander Sudanthi
	Thumb2 ARM kernels may access the TEEHBR via thumbee_notifier in arch/arm/kernel/thumbee.c. The Linux kernel code just seems to be saving and restoring the register. This patch adds support for the TEEHBR cp14 register. Note, this may be a special case when restoring from an image that was run on a system that supports ThumbEE.
2013-10-31	mem: Add privilege info to request class	Prakash Ramrakhyani
	This patch adds a flag in the request class that indicates if the request was made in privileged mode.
2013-10-17	arm: Accomodate function name changes in newer linux kernels	Eric Van Hensbergen

2013-10-15	cpu: add a condition-code register class	Yasuko Eckert
	Add a third register class for condition codes, in parallel with the integer and FP classes. No ISAs use the CC class at this point though.
2013-10-15	cpu: rename _DepTag constants to _Reg_Base	Steve Reinhardt
	Make these names more meaningful. Specifically, made these substitutions: s/FP_Base_DepTag/FP_Reg_Base/g; s/Ctrl_Base_DepTag/Misc_Reg_Base/g; s/Max_DepTag/Max_Reg_Index/g;
2013-10-15	cpu: clean up architectural register classification	Steve Reinhardt
	Move from a poorly documented scheme where the mapping of unified architectural register indices to register classes is hardcoded all over to one where there's an enum for the register classes and a function that encapsulates the mapping.
2013-09-30	arch: Add support for m5ops using mmapped IPRs	Andreas Sandberg
	In order to support m5ops on virtualized CPUs, we need to either intercept hypercall instructions or provide a memory mapped m5ops interface. Since KVM does not normally pass the results of hypercalls to userspace, which makes that method unfeasible. This changeset introduces support for m5ops using memory mapped mmapped IPRs. This is implemented by adding a class of "generic" IPRs which are handled by architecture-independent code. Such IPRs always have bit 63 set and are handled by handleGenericIprRead() and handleGenericIprWrite(). Platform specific impementations of handleIprRead and handleIprWrite should use GenericISA::isGenericIprAccess to determine if an IPR address should be handled by the generic code instead of the architecture-specific code. Platforms that don't need their own IPR support can reuse GenericISA::handleIprRead() and GenericISA::handleIprWrite().
2013-07-18	mem: Set the cache line size on a system level	Andreas Hansson
	This patch removes the notion of a peer block size and instead sets the cache line size on the system level. Previously the size was set per cache, and communicated through the interconnect. There were plenty checks to ensure that everyone had the same size specified, and these checks are now removed. Another benefit that is not yet harnessed is that the cache line size is now known at construction time, rather than after the port binding. Hence, the block size can be locally stored and does not have to be queried every time it is used. A follow-on patch updates the configuration scripts accordingly.
2013-06-03	arch: Create a method to finalize physical addresses	Andreas Sandberg
	in the TLB Some architectures (currently only x86) require some fixing-up of physical addresses after a normal address translation. This is usually to remap devices such as the APIC, but could be used for other memory mapped devices as well. When running the CPU in a using hardware virtualization, we still need to do these address fix-ups before inserting the request into the memory system. This patch moves this patch allows that code to be used by such CPUs without doing full address translations.
2013-05-14	arm: Add support for the m5fail pseudo-op	Andreas Sandberg

2013-04-22	arm: Add a method to query interrupt state ignoring CPSR masks	Andreas Sandberg
	Add the method checkRaw to ArmISA::Interrupts. This method can be used to query the raw state (ignoring CPSR masks) of an interrupt. It is primarily intended for hardware virtualized CPUs.
2013-04-22	arm: Enable support for triggering a sim panic on kernel panics	Andreas Sandberg
	Add the options 'panic_on_panic' and 'panic_on_oops' to the LinuxArmSystem SimObject. When these option are enabled, the simulator panics when the guest kernel panics or oopses. Enable panic on panic and panic on oops in ARM-based test cases.
2013-04-22	sim: Add helper functions that add PCEvents with custom arguments	Andreas Sandberg
	This changeset adds support for forwarding arguments to the PC event constructors to following methods: addKernelFuncEvent addFuncEvent Additionally, this changeset adds the following helper method to the System base class: addFuncEventOrPanic - Hook a PCEvent to a symbol, panic on failure. addKernelFuncEventOrPanic - Hook a PCEvent to a kernel symbol, panic on failure. System implementations have been updated to use the new functionality where appropriate.
2013-04-17	arm: set ldr_ret_uop as conditional or unconditional control	Nathanael Premillieu
	This patch adds a missing flag to the ldr_ret_uop microop instruction. The flag is added when the instruction is used, not directly in the constructor of the instruction. Committed by: Nilay Vaish <nilay@cs.wisc.edu>"
2013-03-04	ARM: fix some cases where instructions that write to fp reg 15 are ↵	Ali Saidi
	accidently branches.