gem5 - gem5

Age	Commit message	Author
2011-02-06	dev: fixed bugs to extend interrupt capability beyond 15 cores	Brad Beckmann

2011-02-06	x86: Timing support for pagetable walker	Joel Hestness
	Move page table walker state to its own object type, and make the walker instantiate state for each outstanding walk. By storing the states in a queue, the walker is able to handle multiple outstanding timing requests. Note that functional walks use separate state elements.
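The per-walk state queue described above can be sketched as follows. This is a minimal illustration, not the gem5 walker: the names `WalkState`, `Walker`, `start`, `step`, and the assumed four-level table depth are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WalkState:
    """Bookkeeping for one outstanding walk (hypothetical fields)."""
    vaddr: int
    level: int = 0      # page-table levels visited so far
    done: bool = False


class Walker:
    LEVELS = 4  # assumed page-table depth, for illustration only

    def __init__(self):
        # One state object per outstanding timing request, oldest first.
        self.pending = deque()

    def start(self, vaddr):
        """Instantiate fresh state for a new walk and queue it."""
        state = WalkState(vaddr)
        self.pending.append(state)
        return state

    def step(self):
        """Advance the oldest walk by one level; retire it when finished."""
        if not self.pending:
            return None
        state = self.pending[0]
        state.level += 1
        if state.level == self.LEVELS:
            state.done = True
            return self.pending.popleft()
        return None


walker = Walker()
first = walker.start(0x1000)
second = walker.start(0x2000)  # a second walk is outstanding at the same time
```

Because each walk owns its state object, starting the second walk does not disturb the first; a single shared state would have to be finished before it could be reused.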
2011-02-06	TimingSimpleCPU: split data sender state fix	Joel Hestness
	In sendSplitData, keep a pointer to the senderState that may be updated after the call to handle*Packet. This way, if the receiver updates the packet senderState, it can still be accessed in sendSplitData.
2011-02-06	ruby: Fix RubyPort to properly handle retries	Brad Beckmann

2011-02-06	Ruby: Fix to return cache block size to CPU for split data transfers	Joel Hestness

2011-02-06	Ruby: Add support for locked memory accesses in X86_FS	Joel Hestness

2011-02-06	Ruby: Update the Ruby request type names for LL/SC	Joel Hestness

2011-02-06	ruby: Assert for x86 misaligned access	Brad Beckmann
	This patch ensures only aligned accesses are passed to ruby and includes a fix to the DPRINTF address print.
2011-02-06	MOESI_hammer: Added full-bit directory support	Brad Beckmann

2011-02-06	x86: Add checkpointing capability to devices	Joel Hestness
	Add checkpointing capability to the Intel 8254 timer, CMOS, I8042, PS2 Keyboard and Mouse, I82094AA, I8237, I8254, I8259, and speaker devices
2011-02-06	x86: Add checkpointing capability to arch components	Joel Hestness
	Add checkpointing capability to the x86 interrupt device and the TLBs
2011-02-06	x86: implements vtophys	Joel Hestness
	Calls the walker to look up the virtual-to-physical page mapping
2011-02-06	IntDev: packet latency fix	Joel Hestness
	The x86 local apic now includes a separate latency parameter for interrupts.
2011-02-06	MessagePort: implement the virtual recvTiming function to avoid double pkt delete	Joel Hestness
	Double packet delete problem is due to an interrupt device deleting a packet that the SimpleTimingPort also deletes. Since MessagePort descends from SimpleTimingPort, simply reimplement the failing code from SimpleTimingPort: recvTiming.
2011-02-06	MOESI_hammer: trigger queue fix.	Joel Hestness

2011-02-06	mcpat: Adds McPAT performance counters	Joel Hestness
	Updated patches from Rick Strong's set that modify performance counters for McPAT
2011-02-06	garnet: added orion2.0 for network power calculation	Tushar Krishna

2011-02-06	garnet: separate data and ctrl VCs	Tushar Krishna
	Separate data VCs and ctrl VCs in garnet, as ctrl VCs have 1 buffer per VC, while data VCs have > 1 buffers per VC. This is for correct power estimations.
2011-02-06	x86: set IsCondControl flag for the appropriate microops	Brad Beckmann

2011-02-03	Fault: Forgot to refresh to grab these header guard updates.	Gabe Black

2011-02-04	inorder: fault handling	Korey Sewell
	Maintain all information about an instruction's fault in the DynInst object rather than any cpu-request object. Also, if there is a fault during the execution stage then just save the fault inside the instruction and trap once the instruction tries to graduate
2011-02-04	inorder: pcstate and delay slots bug	Korey Sewell
	not taken delay slots were not being advanced correctly to pc+8, so for those ISAs we 'advance()' the pcstate one more time for the desired effect
2011-02-04	inorder: add a fetch buffer to fetch unit	Korey Sewell
	Give fetch unit its own parameterizable fetch buffer to read from. It is very inefficient (architecturally and in simulation) to continually fetch at the granularity of the wordsize. As expected, the number of fetch memory requests drops dramatically
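The drop in fetch requests can be illustrated with a small counting sketch. It assumes a straight-line instruction stream, 4-byte instructions, and a 64-byte buffer; the function name `count_fetch_requests` and these sizes are hypothetical and are not taken from the inorder CPU code.

```python
def count_fetch_requests(pcs, fetch_bytes):
    """Count memory requests when each request fills an aligned block of fetch_bytes."""
    requests = 0
    buffered_block = None  # base address of the block currently held
    for pc in pcs:
        block = pc - (pc % fetch_bytes)
        if block != buffered_block:
            # The PC falls outside the buffered block, so go to memory.
            requests += 1
            buffered_block = block
    return requests


# 64 sequential 4-byte instructions starting at an aligned address.
pcs = [0x1000 + 4 * i for i in range(64)]
word_requests = count_fetch_requests(pcs, 4)       # one request per instruction
buffered_requests = count_fetch_requests(pcs, 64)  # one request per 64-byte block
```

Under these assumptions the word-granularity fetch issues 64 requests while the buffered fetch issues 4. Real streams with taken branches would see a smaller reduction.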
2011-02-04	inorder: overload find-req fn	Korey Sewell
	no need to have separate function name findSplitRequest, just overload the function
2011-02-04	inorder: implement separate fetch unit	Korey Sewell
	instead of having one cache-unit class be responsible for both data and code accesses, separate code that is just for fetch into its own class derived from the original base class. This makes the code easier to manage as well as handle future cases of special fetch handling
2011-02-04	inorder: cache port blocking	Korey Sewell
	set the request to false when the cache port blocks so we don't deadlock. also, comment out the outstanding address list sanity check for now.
2011-02-04	inorder: stage width as a python parameter	Korey Sewell
	allow the user to specify how many instructions a pipeline stage can process on any given cycle (stageWidth, i.e. bandwidth) by setting the parameter through the python interface rather than recompiling the code after changing the *.cc file. (we always had the parameter there, but still used the static 'ThePipeline::StageWidth' instead) - Since StageWidth is now dynamically defined, change the interstage communication structure to use a vector and get rid of the array and array handling index (toNextStageIndex) since we can just make calls to the list for the same information
2011-02-04	inorder: multi-issue branch resolution	Korey Sewell
	Only execute (resolve) one branch per cycle because handling more than one is a little more complicated
2011-02-04	inorder: pipe. stage inst. buffering	Korey Sewell
	use skidbuffer as only location for instructions between stages. before, we had the insts queue from the prior stage and the skidbuffer for the current stage, but that gets confusing and this consolidation helps when handling squash cases
2011-02-04	inorder: change skidBuffer to list instead of queue	Korey Sewell
	manage insertion and deletion like a queue but will need access to internal elements for future changes. Currently, skidbuffer manages any instruction that was in a stage but could not complete processing, however we will want to manage all blocked instructions (from prev stage and from cur. stage) in just one buffer.
2011-02-04	inorder: activity tracking bug	Korey Sewell
	Previous code was marking CPU activity on almost every cycle due to a bug in tracking the status of pipeline stages. This prevents the CPU from sleeping on long latency stalls and increases simulation time
2011-02-03	Fault: Rename sim/fault.hh to fault_fwd.hh to distinguish it from faults.hh.	Gabe Black
	--HG-- rename : src/sim/fault.hh => src/sim/fault_fwd.hh
2011-02-03	Config: Keep track of uncached and cached ports separately.	Gabe Black
	This makes sure that the address ranges requested for caches and uncached ports don't conflict with each other, and that accesses which are always uncached (message signaled interrupts for instance) don't waste time passing through caches.
2011-02-02	O3: Fix a style bug in O3.	Gabe Black

2011-02-02	X86: Get rid of the stupd microop.	Gabe Black

2011-02-02	X86: Replace the stupd microop with a store/update sequence.	Gabe Black

2011-02-02	Time: Add serialization functions to the Time class.	Gabe Black

2011-02-01	X86: Add L1 caches for the TLB walkers.	Gabe Black
	Small L1 caches are connected to the TLB walkers when caches are used. This allows them to participate in the coherence protocol properly.
2011-01-31	Fault: Move the definition of NoFault from faults.hh to fault.hh.	Gabe Black
	Moving the definition of NoFault into fault.hh doesn't bring any new dependencies with it, and allows some files to include just fault.hh which has less baggage. NoFault will still be available to everything that includes faults.hh because it includes fault.hh.
2011-01-22	refcnt: Change things around so that we handle constness correctly.	Nathan Binkert
	To use a non const pointer: typedef RefCountingPtr<Foo> FooPtr; To use a const pointer: typedef RefCountingPtr<const Foo> ConstFooPtr;
2011-01-20	checkpointing: fix bug from curTick accessor conversion.	Steve Reinhardt
	Regex replacement of curTick with curTick() accidentally changed checkpoint key string for serialization but not for unserialization.
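A minimal sketch of how a textual rename can break a checkpoint round trip. The `write(...)` source line, the `naive_rename` helper, and the dictionary standing in for a checkpoint are hypothetical illustrations; they do not reproduce the actual serialization code or the regex that was used.

```python
import re


def naive_rename(source):
    """Rewrite every bare `curTick` into `curTick()`, including inside string literals."""
    return re.sub(r"\bcurTick\b(?!\()", "curTick()", source)


# Hypothetical source line that stores the tick under a string key.
line = 'write(checkpoint, "curTick", curTick)'
rewritten = naive_rename(line)

# Effect on a round trip: the writer now uses the rewritten key,
# while a reader that was not rewritten still looks up the old one.
checkpoint = {"curTick()": 1000}  # what the rewritten writer produces
reader_key = "curTick"            # what the untouched reader asks for
found = reader_key in checkpoint
```

The regex cannot tell the identifier from the key string, so both occurrences on the line change and the untouched reader no longer finds its key.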
2011-01-19	TimeSync: Use the new setTick and getTick functions.	Gabe Black

2011-01-19	Time: Add setTick and getTick functions to the Time class.	Gabe Black

2011-01-19	Time: Add a mechanism to prevent M5 from running faster than real time.	Gabe Black
	M5 skips over any simulated time where it doesn't have any work to do. When the simulation is active, the time skipped is short and the work done at any point in time is relatively substantial. If the time between events is long and/or the work to do at each event is small, it's possible for simulated time to pass faster than real time. When running a benchmark that can be good because it means the simulation will finish sooner in real time. When interacting with the real world through, for instance, a serial terminal or bridge to a real network, this can be a problem. Human or network response time could be greatly exaggerated from the perspective of the simulation and make simulated events happen "too soon" from an external perspective. This change adds the capability to force the simulation to run no faster than real time. It does so by scheduling a periodic event that checks to see if its simulated period is shorter than its real period. If it is, it stalls the simulation until they're equal. This is called time syncing. A future change could add pseudo instructions which turn time syncing on and off from within the simulation. That would allow time syncing to be used for the interactive parts of a session but then turned off when running a benchmark using the m5 utility program inside a script. Time syncing would probably not happen anyway while running a benchmark because there would be plenty of work for M5 to do, but the event overhead could be avoided.
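The periodic check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the M5 implementation: the `TimeSync` class, its `check` method, and the injectable `clock` and `sleep` callables are hypothetical, and a fake clock is used so the example is deterministic.

```python
import time


class TimeSync:
    """Stall whenever one simulated period took less real time than the period itself."""

    def __init__(self, sim_period, clock=time.monotonic, sleep=time.sleep):
        self.sim_period = sim_period  # simulated seconds between checks
        self.clock = clock
        self.sleep = sleep
        self.last_real = clock()

    def check(self):
        """Call once per simulated period; returns the seconds stalled."""
        real_elapsed = self.clock() - self.last_real
        stall = max(0.0, self.sim_period - real_elapsed)
        if stall > 0:
            self.sleep(stall)  # wait until real time catches up
        self.last_real = self.clock()
        return stall


class FakeClock:
    """Deterministic stand-in for the wall clock, for demonstration only."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


clock = FakeClock()
sync = TimeSync(sim_period=1.0, clock=clock, sleep=clock.sleep)

clock.now += 0.25            # one simulated second took 0.25 real seconds
first_stall = sync.check()   # simulation ran ahead, so it stalls

clock.now += 2.0             # the next simulated second took 2 real seconds
second_stall = sync.check()  # simulation is slower than real time, no stall
```

When the simulator is busy, as in the benchmark case mentioned above, every check takes the second path and only the overhead of the periodic event remains.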
2011-01-18	O3: Fix itstate prediction and recovery.	Matt Horsnell
	Any change of control flow now resets the itstate to 0 mask and 0 condition, except where the control flow alteration writes into the CPSR register. These cases, for example a return from an interrupt, require the predecoder to recover the itstate. As there is a window of opportunity between the return from an interrupt changing the control flow at the head of the pipe and the commit of the update to the CPSR, the predecoder needs to be able to grab the ITstate early. This is now handled by setting the forcedItState inside a PCstate for the control flow altering instruction. That instruction will have the correct mask/cond, but will not have a valid itstate until advancePC is called (note this happens to advance the execution). When the new PCstate is copy constructed it gets the itstate cond/mask, and upon advancing the PC the itstate becomes valid. Subsequent advancing invalidates the state and zeroes the cond/mask. This is handled in isolation for the ARM ISA and should have no impact on other ISAs. Refer to arch/arm/types.hh and arch/arm/predecoder.cc for the details.
2011-01-18	O3: Fix some variable length instruction issues with the O3 CPU and ARM ISA.	Matt Horsnell

2011-01-18	O3: Don't test misprediction on load instructions until executed.	Matt Horsnell

2011-01-18	O3: Keep around the last committed instruction and use for squashing.	Ali Saidi
	Without this change, 0 is always used for the youngest sequence number if a squash occurred and the ROB was empty (e.g. an instruction is marked serializeAfter or a fetch stall prevents other instructions from issuing). When 0 is used, there is a race with rename where an instruction that committed in the same cycle as the squashing instruction can have its renamed state undone by the squash using sequence number 0.
2011-01-18	O3: Don't try to scoreboard misc registers.	Ali Saidi
	I'm not positive this is the correct fix, but it's working right now. Either we need to do something like this, prevent the misc reg from being renamed at all, or there is something else going on. We need to find the root cause as to why this is only a problem sometimes.
2011-01-18	ARM: The ARM decoder should not panic when decoding undefined holes in arch.	Matt Horsnell
	This can abort simulations when the fetch unit runs ahead and speculatively decodes instructions that are off the execution path.