gem5 - gem5

Age	Commit message (Collapse)	Author
2013-01-31	mem: Separate out the different cases for DRAM bus busy time	Andreas Hansson
	This patch changes how the data bus busy time is calculated such that it is delayed to the actual scheduling time of the request as opposed to being done as soon as possible. This patch changes a bunch of statistics, and the stats update is bundled together with the introruction of tFAW/tTAW and the named DRAM configurations like DDR3 and LPDDR2.
2013-01-28	cache: remove drainManager because it's not used	Anthony Gutierrez
	the cache drainManager is set but never cleared, this is because the cache itself does not need to be drained and thus never triggers a signalDrainDone(). because the drainManager variable is not used properly and does not appear to be necessary it has been removed with this patch.
2013-01-28	merged 798c2cec8e37 and e96ff45795bc	Nilay Vaish

2013-01-28	stats: Fix naming (BPredUnit to branchPred) for 20.parser ARM o3	Andreas Hansson
	This patch bumps the stats for 20.parser for ARM o3-timing to reflect a namechange of the branch predictor.
2013-01-28	ruby: remove get_time()	Nilay Vaish
	This patch replaces get_time() in *.sm files with curCycle() which is now possible since controllers are clocked objects.
2013-01-28	ruby: remove call to curCycle in panic()	Nilay Vaish
	The panic() function already prints the current tick value. This call to curCycle() is as such redundant. Since we are trying to move towards multiple clock domains, this call will print misleading time.
2013-01-24	regressions: update stats due to branch predictor changes	Nilay Vaish
	The actual statistical values are being updated for only two tests belonging to sparc architecture and inorder cpu: 00.hello and 02.insttest. For others the patch updates config.ini and name changes to statistical variables.
2013-01-24	branch predictor: move out of o3 and inorder cpus	Nilay Vaish ext:(%2C%20Timothy%20Jones%20%3Ctimothy.jones%40cl.cam.ac.uk%3E)
	This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository. --HG-- rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh
2013-01-22	o3 cpu: fix zero reg problem	Andrea Pellegrini
	There was an issue w/ the rename logic, which would assign a previous physical register to the ZeroReg architectural register in x86. This issue was giving problems for instructions squashed in threads w/ ID different from 0, sometimes allowing non-mispredicted instructions to obtain a value different from zero when reading the zeroReg.
2013-01-22	x86, cpu: corrects 270c9a75e91f, take over decoder on cpu switch	Nilay Vaish
	The changes made by the changeset 270c9a75e91f do not work well with switching of cpus. The problem is that decoder for the old thread context holds state that is not taken over by the new decoder. This patch adds a takeOverFrom() function to Decoder class in each ISA. Except for x86, functions in other ISAs are blank. For x86, the function copies state from the old decoder to the new decoder.
2013-01-21	scons: Disable protobuf if pkg-config and CheckLib fails	Andreas Hansson
	This patch changes the use of pkg-config such that protobuf is still evaluated with CheckLib even if it fails. This is to allow setups where libprotobuf is available, but not configured through protobuf. Moreover, if CheckLib fails to use libprotobuf then all the tracing is disabled, but scons is allowed to continue with a warning.
2013-01-19	O3 IEW: Make incrWb and decrWb clearer	Joel Hestness
	Move the increment/decrement of wbOutstanding outside of the comparison in incrWb and decrWb in the IEW. This also fixes a compiler bug with gcc 4.4.7, which incorrectly optimizes "-- ==" as "-=".
2013-01-17	ruby: remove calls to g_system_ptr->getTime()	Nilay Vaish
	This patch further removes calls to g_system_ptr->getTime() where ever other clocked objects are available for providing current time.
2013-01-15	x86 regressions: updates due to new instructions and cpuid	Nilay Vaish

2013-01-15	x86 cpuid: enable clflush	Nilay Vaish
	Note that clflush is only being enabled. It is not implemented in actual. A warning is printed if the cpu encounters a clflush instruction. We need to enable this instruction in cpuid since JRE 1.7 tests for it.
2013-01-15	x86: implements fsin, fcos instructions	Nilay Vaish

2013-01-15	x86: implements emms instruction	Nilay Vaish

2013-01-15	x86: implement fabs, fchs instructions	Nilay Vaish

2013-01-14	regressions: update stats due to changes in ruby obj hierarchy	Nilay Vaish

2013-01-14	config: move ruby objects under ruby_system in obj hierarchy	Malek Musleh
	This patch moves the contollers to be children of the ruby_system instead of 'system' under the python object hierarchy. This is so that these objects can inherit some of the ruby_system's parameter values without resorting to calling a global system pointer during run-time. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-01-14	ruby sequencer: converts cycles to ticks in deadlock panic()	Malek Musleh
	This patch converts the panic() print outs in the Sequencer::wakeup() call from ruby cycles to Ticks(). This makes it easier to debug deadlocks with the ProtocolTrace flag so the issue time indicated in the panic message can be quickly searched for. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-01-14	Ruby: remove reference to g_system_ptr from class Message	Nilay Vaish
	This patch was initiated so as to remove reference to g_system_ptr, the pointer to Ruby System that is used for getting the current time. That simple change actual requires changing a lot many things in slicc and garnet. All these changes are related to how time is handled. In most of the places, g_system_ptr has been replaced by another clock object. The changes have been done under the assumption that all the components in the memory system are on the same clock frequency, but the actual clocks might be distributed.
2013-01-14	Ruby: use ClockedObject in Consumer class	Nilay Vaish
	Many Ruby structures inherit from the Consumer, which is used for scheduling events. The Consumer used to relay on an Event Manager for scheduling events and on g_system_ptr for time. With this patch, the Consumer will now use a ClockedObject to schedule events and to query for current time. This resulted in several structures being converted from SimObjects to ClockedObjects. Also, the MessageBuffer class now requires a pointer to a ClockedObject so as to query for time.
2013-01-14	scons: Address clang 3.2 compilation error	Andreas Hansson
	This patch fixes a compilation error encountered using clang 3.2 on OSX.
2013-01-14	stats: Bump failing x86 regression stats	Andreas Hansson
	This patch bumps the stats of mcf and twolf for the o3 CPU such that the regressions pass.
2013-01-12	base simple cpu: removes commented out code about cache ops	Nilay Vaish

2013-01-12	x86: Changes to decoder, corrects 9376	Nilay Vaish
	The changes made by the changeset 9376 were not quite correct. The patch made changes to the code which resulted in decoder not getting initialized correctly when the state was restored from a checkpoint. This patch adds a startup function to each ISA object. For x86, this function sets the required state in the decoder. For other ISAs, the function is empty right now.
2013-01-08	config: Fix issue with changeset: a4739b6f799d.	Ali Saidi

2013-01-08	stats: update stats for previous six changes	Ali Saidi

2013-01-08	util: add writefile to m5 util program for x86	Lluís Vilanova

2013-01-08	util: add m5_fail op.	Lluís Vilanova
	Used as a command in full-system scripts helps the user ensure the benchmarks have finished successfully. For example, one can use: /path/to/benchmark args \|\| /sbin/m5 fail 1 and thus ensure gem5 will exit with an error if the benchmark fails.
2013-01-08	sim: Fix early termination in multi-core simulation under SE mode.	Tao Zhang
	When "-I" (maximum instruction number) and "-F" (fastforward instruction number) are applied together, gem5 immediately exits after the cpu switching. The reason is that multiple exit events may be generated in the same cycle by Atomic CPU and inserted to mainEventQueue. However, mainEventQueue can only serve one exit event in one cycle. Therefore, the rest exit events are left in mainEventQueue without being descheduled or deleted, which causes gem5 exits immediately after the system resumes by cpu switching.
2013-01-08	arm: add access syscall for ARM SE mode	Mitch Hayenga
	This patch adds the "access" syscall for ARM SE as required by some spec2006 benchmarks.
2013-01-08	mem: Make LL/SC locks fine grained	Mitch Hayenga
	The current implementation in gem5 just keeps a list of locks per cacheline. Due to this, a store to a non-overlapping portion of the cacheline can cause an LL/SC pair to fail. This patch simply adds an address range to the lock structure, so that the lock is only invalidated if the store overlaps the lock range.
2013-01-08	mem: Fix use-after-free bug	Mitch Hayenga
	Running with valgrind I noticed a use after free originating from simple_mem.cc. It looks like this is a known issue and this additional call site was missed in an earlier patch.
2013-01-07	dev: Fix infinite recursion in DMA devices	Andreas Sandberg
	The DMA device sometimes calls the process() method on a completion event directly instead of scheduling it on the current tick. This breaks some devices that assume that the completion handler won't be called until the current event handler has returned. Specifically, it causes infinite recursion in the IdeDisk component because it does not advance its chunk generator until after a dmaRead()/dmaWrite() has returned. This changeset removes this mico-optimization and schedules the event in the current tick instead. This way the semantics event handling stay the same even when the delay is 0.
2013-01-07	util: Fix stack corruption in the m5 util	Andreas Sandberg
	The number of arguments specified when calling parse_int_args() in do_exit() is incorrect. This leads to stack corruption since it causes writes past the end of the ints array.
2013-01-07	stats: Fix swig wrapping for Tick in stats	Sascha Bischoff
	Tick was not correctly wrapped for the stats system, and therefore it was not possible to configure the stats dumping from the python scripts without defining Ticks as long long. This patch fixes the wrapping of Tick by copying the typemap of uint64_t to Tick.
2013-01-07	stats: update stats for previous changes.	Ali Saidi

2013-01-07	cpu: Unify the serialization code for all of the CPU models	Andreas Sandberg
	Cleanup the serialization code for the simple CPUs and the O3 CPU. The CPU-specific code has been replaced with a (un)serializeThread that serializes the thread state / context of a specific thread. Assuming that the thread state class uses the CPU-specific thread state uses the base thread state serialization code, this allows us to restore a checkpoint with any of the CPU models.
2013-01-07	tests: Add CPU switching tests	Andreas Sandberg
	This changeset adds a set of tests that stress the CPU switching code. It adds the following test configurations: * tsunami-switcheroo-full -- Alpha system (atomic, timing, O3) * realview-switcheroo-atomic -- ARM system (atomic<->atomic) * realview-switcheroo-timing -- ARM system (timing<->timing) * realview-switcheroo-o3 -- ARM system (O3<->O3) * realview-switcheroo-full -- ARM system (atomic, timing, O3) Reference data is provided for the 10.linux-boot test case. All of the tests trigger a CPU switch once per millisecond during the boot process. The in-order CPU model was not included in any of the tests as it does not support CPU handover.
2013-01-07	cpu: Flush TLBs on switchOut()	Andreas Sandberg
	This changeset inserts a TLB flush in BaseCPU::switchOut to prevent stale translations when doing repeated switching. Additionally, the TLB flushing functionality is exported to the Python to make debugging of switching/checkpointing easier. A simulation script will typically use the TLB flushing functionality to generate a reference trace. The following sequence can be used to simulate a handover (this depends on how drain is implemented, but is generally the case) between identically configured CPU models: m5.drain(test_sys) [ cpu.flushTLBs() for cpu in test_sys.cpu ] m5.resume(test_sys) The generated trace should normally be identical to a trace generated when switching between identically configured CPU models or checkpointing and resuming.
2013-01-07	mem: Fix guest corruption when caches handle uncacheable accesses	Andreas Sandberg
	When the classic gem5 cache sees an uncacheable memory access, it used to ignore it or silently drop the cache line in case of a write. Normally, there shouldn't be any data in the cache belonging to an uncacheable address range. However, since some architecture models don't implement cache maintenance instructions, there might be some dirty data in the cache that is discarded when this happens. The reason it has mostly worked before is because such cache lines were most likely evicted by normal memory activity before a TLB flush was requested by the OS. Previously, the cache model would invalidate cache lines when they were accessed by an uncacheable write. This changeset alters this behavior so all uncacheable memory accesses cause a cache flush with an associated writeback if necessary. This is implemented by reusing the cache flushing machinery used when draining the cache, which implies that writebacks are performed using functional accesses.
2013-01-07	cpu: Rewrite O3 draining to avoid stopping in microcode	Andreas Sandberg
	Previously, the O3 CPU could stop in the middle of a microcode sequence. This patch makes sure that the pipeline stops when it has committed a normal instruction or exited from a microcode sequence. Additionally, it makes sure that the pipeline has no instructions in flight when it is drained, which should make draining more robust. Draining is controlled in the commit stage, which checks if the next PC after a committed instruction is in microcode. If this isn't the case, it requests a squash of all instructions after that the instruction that just committed and immediately signals a drain stall to the fetch stage. The CPU then continues to execute until the pipeline and all associated buffers are empty.
2013-01-07	cpu: Make sure that a drained atomic CPU isn't executing ucode	Andreas Sandberg
	Currently, the atomic CPU can be in the middle of a microcode sequence when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * Since curMacroStaticInst is populated when executing microcode, repeated switching between CPUs executing microcode leads to incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset fixes a bug where the multiple switches to the same atomic CPU sometimes corrupts the target state because of dangling pointers to the currently executing microinstruction. Note: This changeset moves tick event descheduling from switchOut() to drain(), which makes timing consistent between just draining a system and draining /and/ switching between two atomic CPUs. This makes debugging quite a lot easier (execution traces get the same timing), but the latency of the last instruction before a drain will not be accounted for correctly (it will always be 1 cycle). Note 2: This changeset removes so_state variable, the locked variable, and the tickEvent from checkpoints since none of them contain state that needs to be preserved across checkpoints. The so_state is made redundant because we don't use the drain state variable anymore, the lock variable should never be set when the system is drained, and the tick event isn't scheduled.
2013-01-07	cpu: Make sure that a drained timing CPU isn't executing ucode	Andreas Sandberg
	Currently, the timing CPU can be in the middle of a microcode sequence or multicycle (stayAtPC is true) instruction when it is drained. This leads to two problems: * When switching to a hardware virtualized CPU, we obviously can't execute gem5 microcode. * If stayAtPC is true we might execute half of an instruction twice when restoring a checkpoint or switching CPUs, which leads to an incorrect execution. After applying this patch, the CPU will be on a proper instruction boundary, which means that it is safe to switch to any CPU model (including hardware virtualized ones). This changeset also fixes a bug where the timing CPU sometimes switches out with while stayAtPC is true, which corrupts the target state after a CPU switch or checkpoint. Note: This changeset removes the so_state variable from checkpoints since the drain state isn't used anymore.
2013-01-07	cpu: Fix broken thread context handover	Andreas Sandberg
	The thread context handover code used to break when multiple handovers were performed during the same quiesce period. Previously, the thread contexts would assign the TC pointer in the old quiesce event to the new TC. This obviously broke in cases where multiple switches were performed within the same quiesce period, in which case the TC pointer in the quiesce event would point to an old CPU. The new implementation deschedules pending quiesce events in the old TC and schedules a new quiesce event in the new TC. The code has been refactored to remove most of the code duplication.
2013-01-07	cpu: Fix O3 LSQ debug dumping constness and formatting	Andreas Sandberg

2013-01-07	arm: Invalidate cached TLB configuration in drainResume	Andreas Sandberg
	Currently, we invalidate the cached miscregs in TLB::unserialize(). The intended use of the drainResume() method is to invalidate cached state and prepare the system to resume after a CPU handover or (un)serialization. This patch moves the TLB miscregs invalidation code to the drainResume() method to avoid surprising behavior.
2013-01-07	arm: Fix draining of the pagetable walker when squashing	Andreas Sandberg
	Since the page table walker only checks if a drain has completed in doL1DescriptorWrapper() and doL2DescriptorWrapper(), it sometimes looses track of a drain request if there is a squash. This changeset adds a completeDrain() call after squashing requests in the pending queue, which fixes this issue.