gem5 - gem5

Age	Commit message	Author
2014-03-12	alpha: Small removal of dead comments/code from alpha ISA	Paul Rosenfeld
	Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-03-07	cpu: Make CPU and ThreadContext getters const	Andreas Hansson
	This patch merely tidies up the CPU and ThreadContext getters by making them const where appropriate.
2014-03-07	arm: Handle functional TLB walks properly	Geoffrey Blake
	The table walker code currently accounts for two types of walks, Atomic and Timing, and treats them differently. Atomic walks keep a single instance of WalkerState around for all walks to use in currState. Timing mode keeps a queue of in-flight WalkerStates and maintains currState as NULL between walks. If a functional walk is done during Timing mode, it is treated as an atomic walk and either creates a persistent WalkerState if in between Timing walks, or stomps an existing currState for an in-progress Timing walk. This patch distinguishes functional walks as being able to exist at any time and sets up a temporary WalkerState for its exclusive use and then cleans up when finished, leaving any in progress Atomic or Timing walks undisturbed.
2014-03-07	mem: Fix incorrect assert failure in the Cache	Prakash Ramrakhyani
	This patch fixes an assert condition that is not true at all times. There are valid situations that arise in dual-core dual-workload runs where the assert condition is false. The function call following the assert, however, needs to be called only when the condition is true (a block cannot be invalidated in the tags structure if it has not been allocated in the structure, and the tempBlock is never allocated). Hence the 'assert' has been replaced with an 'if'.
2014-03-07	mem: Edit proto Packet and enhance the python script	Radhika Jagtap
	This patch changes the decode script to output the optional fields of the proto message Packet, namely id and flags. The flags field is set by the communication monitor. The id field is useful for CPU trace experiments, e.g. linking the fetch side to decode side. It had to be renamed because it clashes with a built in python function id() for getting the "identity" of an object. This patch also takes a few common function definitions out from the multiple scripts and adds them to a protolib python module.
2014-03-07	misc: Add panic_if / fatal_if / chatty_assert	Stephan Diestelhorst
	This snippet can be used to replace if + {panics, fatals, asserts} constructs. The idea is to have both the condition checking and a verbose printout in a single statement. The interface is as follows: panic_if(foo != bar, "These should be equal: foo %i bar %i", foo, bar); fatal_if(foo != bar, "These should be equal: foo %i bar %i", foo, bar); chatty_assert(foo == bar, "These should be equal: foo %i bar %i", foo, bar);
2014-03-07	scons: Fixes uninitialized warnings issued by clang	Mitch Hayenga
	Small fixes to appease recent clang versions.
2014-03-07	arm: Fix uninitialised warning with gcc 4.8	Stephan Diestelhorst
	Small fix for a warning that prevents compilation with gcc 4.8.1 due to detecting that a variable might be uninitialised. The fix is to assign a safe default.
2014-03-07	mem: Wakeup sleeping CPUs without caches on LLSC	Ali Saidi
	For systems without caches, the LLSC code does not get snoops for wake-ups. We add the LLSC code in the abstract memory to do the job for us.
2014-03-06	sim: Schedule the global sync event at curTick() + simQuantum	Andreas Sandberg
	The global synchronization event used to be scheduled at simQuantum. This prevented repeated entries into gem5 from Python as it can be scheduled in the past. This changeset ensures that the first global synchronization happens at curTick() + simQuantum instead.
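The scheduling problem can be illustrated with a minimal Python sketch. This is not gem5 code: `first_sync_tick_old`, `first_sync_tick_new`, and `can_schedule` are hypothetical helpers that only model the arithmetic, assuming an event may never be scheduled before the current tick.

```python
def first_sync_tick_old(cur_tick, sim_quantum):
    """Old behaviour: the first sync is placed at the absolute tick simQuantum."""
    return sim_quantum


def first_sync_tick_new(cur_tick, sim_quantum):
    """New behaviour: the first sync is placed relative to the current tick."""
    return cur_tick + sim_quantum


def can_schedule(when, cur_tick):
    """Assumption: an event queue rejects events that lie in the past."""
    return when >= cur_tick


# Re-entering the simulator after it has already advanced to tick 5000:
cur_tick, sim_quantum = 5000, 1000
old_ok = can_schedule(first_sync_tick_old(cur_tick, sim_quantum), cur_tick)
new_ok = can_schedule(first_sync_tick_new(cur_tick, sim_quantum), cur_tick)
```

On the first entry (`cur_tick == 0`) both variants give the same tick, which is why the problem only shows up on repeated entries.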
2014-03-03	x86: Setup correct TSL/TR segment attributes on INIT	Andreas Sandberg
	The TSL/LDT & TR/TSS segments didn't contain valid attributes. This caused problems when transferring the state into KVM, where invalid state is a no-go. Fix up the attributes with values from AMD's architecture programmer's manual.
2014-03-03	kvm: x86: Always assume segments to be usable	Andreas Sandberg
	When transferring segment registers into kvm, we need to find the value of the unusable bit. We used to assume that this could be inferred from the selector since segments are generally unusable if their selector is 0. This assumption breaks in some weird corner cases. Instead, we just assume that segments are always usable. This is what qemu does so it should work.
2014-03-03	kvm: Initialize signal handlers from startupThread()	Andreas Sandberg
	Signal handlers in KVM are controlled per thread and should be initialized from the thread that is going to execute the CPU. This changeset moves the initialization call from startup() to startupThread().
2014-03-01	ruby: message buffer: changes related to tracking push/pop times	Nilay Vaish
	The last pop operation is now tracked as a Tick instead of in Cycles. This helps in avoiding use of the receiver's clock during the enqueue operation.
2014-03-01	ruby: make the max_size variable of the MessageBuffer unsigned	Nilay Vaish

2014-03-01	cpu: Enable fast-forwarding for MIPS InOrderCPU and O3CPU	Christopher Torng
	A copyRegs() function is added to MIPS utilities to copy architectural state from the old CPU to the new CPU during fast-forwarding. This addition alone enables fast-forwarding for the o3 cpu model running MIPS. The patch also adds takeOverFrom() and drainResume() functions to the InOrderCPU to enable it to take over from another CPU. This change enables fast-forwarding for the inorder cpu model running MIPS, but not for Alpha. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-03-01	ruby: profiler: statically allocate stats variable	Nilay Vaish
	A couple of users observed a segmentation fault when the simulator tries to register the statistical variable m_IncompleteTimes. It seems that there is some problem with the initialization of these variables when they are allocated in the constructor.
2014-02-23	ruby: route all packets through ruby port	Nilay Vaish
	Currently, the interrupt controller in x86 is connected to the io bus directly. Therefore the packets between the io devices and the interrupt controller do not go through ruby. This patch changes ruby port so that these packets arrive at the ruby port first, which then routes them to their destination. Note that the patch does not make these packets go through the ruby network. That would happen in a subsequent patch.
2014-02-23	ruby: Simplify RubyPort flow control and routing	Andreas Hansson
	This patch simplifies the retry logic in the RubyPort, avoiding redundant attributes, and enforcing more stringent checks on the interactions with the normal ports. The patch also simplifies the routing done by the RubyPort, using the port identifiers instead of a heavy-weight sender state. The patch also fixes a bug in the sending of responses from PIO ports. Previously these responses bypassed the queue in the queued port, and ignored the return value, potentially leading to response packets being lost. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-02-23	ruby: message buffer: refactor code	Nilay Vaish
	Code in two of the functions was exactly the same. This patch moves this code to a new function which is called from the two functions mentioned initially.
2014-02-23	ruby: remove few not required #includes	Nilay Vaish

2014-02-23	ruby: slicc: remove unused COPY_HEAD functionality	Nilay Vaish

2014-02-23	ruby: protocols: remove unused action z_stall	Nilay Vaish

2014-02-21	ruby: network: move message buffers to base network class.	Nilay Vaish

2014-02-21	ruby: network: garnet: fixed: removes net_ptr from links	Nilay Vaish

2014-02-21	ruby: cache: remove not required variable m_cache_name	Nilay Vaish

2014-02-20	ruby: network: garnet: fixed: removes next cycle functions	Nilay Vaish
	In several places, there are functions that take a cycle value as input and perform some computation. Along with each such function, another function was being defined that simply added one more cycle to the input and computed the same function. This patch removes this second copy of the function. Places where these functions were being called have been updated to use the original function with the argument being current cycle + 1.
2014-02-20	ruby: controller: slight code refactoring	Nilay Vaish

2014-02-20	ruby: mesi three level: rename incorrectly named files	Nilay Vaish
	Two files had been incorrectly named with a .cache suffix. --HG-- rename : src/mem/protocol/MESI_Three_Level-L0.cache => src/mem/protocol/MESI_Three_Level-L0cache.sm rename : src/mem/protocol/MESI_Three_Level-L1.cache => src/mem/protocol/MESI_Three_Level-L1cache.sm
2014-02-20	ruby: network: removes unused code.	Nilay Vaish

2014-02-20	ruby: slicc: slight code refactoring	Nilay Vaish

2014-02-20	ruby: message buffer: removes some unnecessary functions.	Nilay Vaish

2014-02-20	kvm: Add support for multi-system simulation	Andreas Sandberg
	The introduction of parallel event queues added most of the support needed to run multiple VMs (systems) within the same gem5 instance. This changeset fixes up signal delivery so that KVM's control signals are delivered to the thread that executes the CPU's event queue. Specifically: * Timers and counters are now initialized from a separate method (startupThread) that is scheduled as the first event in the thread-specific event queue. This ensures that they are initialized from the thread that is going to execute the CPU's event queue and enables signal delivery to the right thread when exiting from KVM. * The POSIX-timer-based KVM timer (used to force exits from KVM) has been updated to deliver signals to the thread that's executing KVM instead of the process (thread is undefined in that case). This assumes that the timer is instantiated from the thread that is going to execute the KVM vCPU. * Signal masking is now done using pthread_sigmask instead of sigprocmask. The behavior of the latter is undefined in threaded applications. * Since signal masks can be inherited, make sure to actively unmask the control signals when setting up the KVM signal mask. There are currently no facilities to multiplex between multiple KVM CPUs in the same event queue, so we are limited to configurations where there is only one KVM CPU per event queue. In practice, this means that multi-system configurations can be simulated, but not multiple CPUs in a shared-memory configuration.
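The point about inherited signal masks can be demonstrated with a small Python sketch built on the standard `signal.pthread_sigmask` wrapper. This is not the gem5 KVM code: `CONTROL_SIGNAL` and `worker_masks` are hypothetical names, and SIGUSR1 is only an assumed stand-in for a control signal. The sketch requires a POSIX platform.

```python
import signal
import threading

# Hypothetical stand-in for a KVM control signal; the real signal is not
# specified here.
CONTROL_SIGNAL = signal.SIGUSR1


def worker_masks():
    """Return (inherited_mask, mask_after_unblock) as seen by a new thread."""
    result = {}

    def run():
        # Blocking an empty set changes nothing and returns the current mask.
        result["inherited"] = signal.pthread_sigmask(signal.SIG_BLOCK, [])
        # Actively unblock the control signal for this thread only.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, [CONTROL_SIGNAL])
        result["after"] = signal.pthread_sigmask(signal.SIG_BLOCK, [])

    # Block the signal in the creating thread; the new thread inherits a copy.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, [CONTROL_SIGNAL])
    try:
        thread = threading.Thread(target=run)
        thread.start()
        thread.join()
    finally:
        # Restore the creating thread's original mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return result["inherited"], result["after"]


inherited, after = worker_masks()
```

Here `inherited` contains the control signal because the creating thread had it blocked, while `after` does not, which is the reason for unmasking explicitly rather than relying on the default mask.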
2014-02-18	mem: Fix bug in PhysicalMemory use of mmap and munmap	Andreas Hansson
	This patch fixes a bug in how physical memory used to be mapped and unmapped. Previously we unmapped and re-mapped if restoring from a checkpoint. However, we never checked that the new mapping was actually the same, it was just magically working as the OS seems to fairly reliably give us the same chunk back. This patch fixes this issue by relying entirely on the mmap call in the constructor.
2014-02-18	dev: Include basic devices in NULL ISA build	Andreas Hansson
	This patch enables use of the basic PIO devices as part of the NULL build. Although it might seem counterintuitive to have a PIO device without being able to execute a driver, this change enables us to break a device class hierarchy into an ISA-agnostic part, and an ISA-specific part, without requiring multiple inheritance. The ISA-agnostic base class is a PIO device, but does not make use of the port.
2014-02-18	mem: Filter cache snoops based on address ranges	Andreas Hansson
	This patch adds a filter to the cache to drop snoop requests that are not for a range covered by the cache. This fixes an issue observed when multiple caches are placed in parallel, covering different address ranges. Without this patch, all the caches will forward the snoop upwards, when only one should do so.
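A minimal Python sketch of the filtering idea follows. It is an illustration only, not the gem5 cache implementation: the class `RangeFilteredCache`, its methods, and the half-open `(start, end)` range representation are all assumptions made for this example.

```python
class RangeFilteredCache:
    """Toy cache that only reacts to snoops inside its own address ranges."""

    def __init__(self, name, ranges):
        self.name = name
        # Each range is a half-open interval (start, end); an assumption of
        # this sketch, not a statement about gem5's AddrRange type.
        self.ranges = list(ranges)

    def covers(self, addr):
        return any(start <= addr < end for start, end in self.ranges)

    def handle_snoop(self, addr):
        """Forward the snoop upwards only if the address is covered."""
        return "forward" if self.covers(addr) else "drop"


# Two caches placed in parallel, covering different address ranges.
low = RangeFilteredCache("low", [(0x0000, 0x1000)])
high = RangeFilteredCache("high", [(0x1000, 0x2000)])
decisions = {c.name: c.handle_snoop(0x1800) for c in (low, high)}
```

With the filter in place, exactly one of the parallel caches forwards a given snoop; without it, both would.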
2014-02-18	mem: Add a wrapped DRAMSim2 memory controller	Andreas Hansson
	This patch adds DRAMSim2 as a memory controller by wrapping the external library and creating a subclass of AbstractMemory that bridges between the semantics of gem5 and the DRAMSim2 interface. The DRAMSim2 wrapper extracts the clock period from the config file. There is no way of extracting this information from DRAMSim2 itself, so we simply read the same config file and get it from there. To properly model the response queue, the wrapper keeps track of how many transactions are in the actual controller, and how many are stacking up waiting to be sent back as responses (in the wrapper). The latter requires us to move away from the queued port and manage the packets ourselves. This is due to DRAMSim2 not having any flow control on the response path. DRAMSim2 assumes that the transactions it is given match the burst size of the chosen memory. The wrapper checks to ensure the cache line size of the system matches the burst size of DRAMSim2, as there are currently no provisions to split the system requests. In theory we could allow a cache line size smaller than the burst size, but that would lead to inefficient use of the DRAM, so for now we fatal also in this case.
2014-02-18	mem: Fix input to DPRINTF in CommMonitor	Andreas Hansson
	Minor fix of the debug message parameters.
2014-02-09	cpu: simple: Add support for using branch predictors	Andreas Sandberg
	This changeset adds branch predictor support to the BaseSimpleCPU. The simple CPUs normally don't need a branch predictor; however, there are at least two cases where it can be desirable: 1) A simple CPU can be used to warm the branch predictor of an O3 CPU before switching to the slower O3 model. 2) The simple CPU can be used as a quick way of evaluating/debugging new branch predictors since it exposes branch predictor statistics. Limitations: * Since the simple CPU doesn't speculate, only one instruction will be active in the branch predictor at a time (i.e., the branch predictor will never see speculative branches). * The outcome of a branch prediction does not affect the performance of the simple CPU.
2014-02-06	base: calls abort() from fatal	Nilay Vaish
	Currently fatal() ends the simulation in a normal fashion. This results in the call stack getting lost when using a debugger, and it is not always possible to debug the simulation just from the information provided by the printed error message. Even though the error is likely due to a user's fault, the information available should not be thrown away. Hence, this patch calls abort() from fatal().
2014-02-06	ruby: memory controller: use MemoryNode *	Nilay Vaish

2014-02-05	x86: Fix x87 state transfer bug	Andreas Sandberg
	Changeset 7274310be1bb (isa: clean up register constants) increased the value of NumFloatRegs, which triggered a bug in X86ISA::copyRegs(). This bug is caused by the x87 stack being copied twice since register indexes past NUM_FLOATREGS are mapped into the x87 stack relative to the top of the stack, which is undefined when the copy takes place. This changeset updates the copyRegs() function to access registers using the non-flattening interface, which guarantees that undesirable register folding does not happen.
2014-02-02	x86, kvm: Fix bug in the RFlags get and set functions	Nikos Nikoleris
	The getRFlags and setRFlags utility functions were not updated correctly when condition registers were separated into their own register class. This led to incorrect state transfer in calls from kvm into the simulator (e.g., m5 readfile ended up in an infinite loop) and when switching CPUs. This patch makes these utility functions use getCCReg and setCCReg instead of getIntReg and setIntReg, which read and write the integer registers. Reviewed-by: Andreas Sandberg <andreas@sandberg.pp.se>
2014-01-30	unittest: Fix build errors	Ola Jeppsson
	Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-29	mem: Add additional tolerance to stride prefetcher	Mitch Hayenga
	Forces the prefetcher to mispredict twice in a row before resetting the confidence of prefetching. This helps cases where a load PC strides by a constant factor, however it may operate on different arrays at times. Avoids the cost of retraining. Primarily helps with small iteration loops. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
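The effect of requiring two consecutive mispredictions can be sketched in Python as follows. This is a toy model, not the gem5 stride prefetcher: the class `StrideEntry`, its fields, and the `tolerance` parameter are hypothetical names chosen for this illustration.

```python
class StrideEntry:
    """Toy per-PC training state for a stride predictor."""

    def __init__(self, tolerance=2):
        self.tolerance = tolerance   # mispredictions in a row before a reset
        self.last_addr = None
        self.stride = 0
        self.confidence = 0
        self.misses_in_a_row = 0

    def train(self, addr):
        if self.last_addr is None:
            self.last_addr = addr
            return
        new_stride = addr - self.last_addr
        self.last_addr = addr
        if new_stride == self.stride:
            self.confidence += 1
            self.misses_in_a_row = 0
            return
        self.misses_in_a_row += 1
        if self.misses_in_a_row >= self.tolerance:
            # Only now give up on the old stride and start retraining.
            self.stride = new_stride
            self.confidence = 0
            self.misses_in_a_row = 0


def run(addresses, tolerance):
    entry = StrideEntry(tolerance)
    for addr in addresses:
        entry.train(addr)
    return entry.stride, entry.confidence


# The same load PC strides by 8 over one array, then jumps to another array.
pattern = [0, 8, 16, 24, 32, 1000, 1008]
tolerant = run(pattern, tolerance=2)
strict = run(pattern, tolerance=1)
```

With `tolerance=2` the single jump between arrays leaves the learned stride and its confidence intact, whereas `tolerance=1` resets the confidence at the jump and has to retrain.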
2014-01-29	mem: Allowed tagged instruction prefetching in stride prefetcher	Mitch Hayenga
	For systems with a tightly coupled L2, a stride-based prefetcher may observe access requests from both instruction and data L1 caches. However, the PC address of an instruction miss gives no relevant training information to the stride-based prefetcher (there is no stride to train). In these cases, it is better if the L2 stride prefetcher simply reverts to a simple N-block-ahead prefetcher. This patch enables this option. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-29	mem: prefetcher: add options, support for unaligned addresses	Mitch Hayenga, Amin Farmahini <aminfar@gmail.com>
	This patch extends the classic prefetcher to work on non-block-aligned addresses. Because the existing prefetchers in gem5 mask off the lower address bits of cache accesses, many predictable strides fail to be detected. For example, if a load were to stride by 48 bytes, with 64-byte cache lines, the current stride-based prefetcher would see an access pattern of 0, 64, 64, 128, 192, ... and thus would not detect a constant stride pattern. This patch fixes this by training the prefetcher on access and not masking off the lower address bits. It also adds the following configuration options: 1) Training/prefetching only on cache misses, 2) Training/prefetching only on data accesses, 3) Optionally tagging prefetches with a PC address. #3 allows prefetchers to train off of prefetch requests in systems with multiple cache levels and PC-based prefetchers present at multiple levels. It also effectively allows a pipelining of prefetch requests (like in POWER4) across multiple levels of cache hierarchy. Improves performance on my gem5 configuration by 4.3% for SPECINT and 4.7% for SPECFP (geomean).
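The 48-byte stride example can be reproduced with a few lines of Python. This is only an arithmetic illustration, not prefetcher code; the helper names are hypothetical, and the starting address of 16 is an assumption chosen so that the block-aligned sequence matches the pattern quoted above.

```python
def block_aligned(addresses, block_size=64):
    """Mask off the low address bits, as a block-granular prefetcher would."""
    return [addr - (addr % block_size) for addr in addresses]


def deltas(sequence):
    """Differences between consecutive addresses."""
    return [b - a for a, b in zip(sequence, sequence[1:])]


def has_constant_stride(sequence):
    return len(set(deltas(sequence))) == 1


# A load striding by 48 bytes; the start address 16 is an assumption.
raw = [16 + 48 * i for i in range(5)]
aligned = block_aligned(raw)

raw_is_constant = has_constant_stride(raw)
aligned_is_constant = has_constant_stride(aligned)
```

The unmasked addresses have a constant delta of 48, while the masked sequence has deltas of 64, 0, 64, 64, so a detector that only sees block addresses cannot lock onto the stride.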
2014-01-29	cpu: fix bug when TrafficGen deschedules event	Xiangyu Dong
	Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-28	arm: Enable umask syscall in SE mode	Mitch Hayenga
	Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2014-01-28	base: Fix race condition in the socket listen function	Mitch Hayenga
	gem5 makes the incorrect assumption that by binding a socket, it effectively has allocated a port. Linux only allocates ports once you call listen on the given socket, not when you call bind. So even if the port was free when bind was called, another process (gem5 instance) could race in between the bind & listen calls and steal the port. In the current code, if the call to bind fails due to the port being in use (EADDRINUSE), gem5 retries for a different port. However if listen fails, gem5 just panics. The fix is testing the return value of listen and re-trying if it was due to EADDRINUSE. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
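The retry-on-listen idea can be sketched in Python with the standard `socket` module. This is not gem5's socket code: `listen_with_retry`, its parameters, and the choice to walk upwards through consecutive port numbers are assumptions made for this illustration.

```python
import errno
import socket


def listen_with_retry(start_port, max_tries=32, host="127.0.0.1"):
    """Bind and listen, moving to the next port if either step reports
    EADDRINUSE. Returns (listening_socket, port)."""
    for port in range(start_port, start_port + max_tries):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            # listen() is checked as well: another process may take the
            # port between the bind and listen calls.
            sock.listen(1)
            return sock, port
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRINUSE:
                raise  # unrelated failure, do not hide it
    raise RuntimeError("no free port found in the scanned range")


# Usage: occupy a port first, then ask for the same port.
blocker = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
blocker.bind(("127.0.0.1", 0))   # let the OS pick a free port
blocker.listen(1)
busy_port = blocker.getsockname()[1]

server, chosen_port = listen_with_retry(busy_port)
server.close()
blocker.close()
```

Treating `bind` and `listen` as one retryable step keeps the failure handling in a single place, so a port lost to a race is handled the same way as a port that was busy from the start.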