gem5 - gem5

Age	Commit message (Collapse)	Author
2013-06-27	sim: Add the notion of clock domains to all ClockedObjects	Akash Bagdia
	This patch adds the notion of source- and derived-clock domains to the ClockedObjects. As such, all clock information is moved to the clock domain, and the ClockedObjects are grouped into domains. The clock domains are either source domains, with a specific clock period, or derived domains that have a parent domain and a divider (potentially chained). For piece of logic that runs at a derived clock (a ratio of the clock its parent is running at) the necessary derived clock domain is created from its corresponding parent clock domain. For now, the derived clock domain only supports a divider, thus ensuring a lower speed compared to its parent. Multiplier functionality implies a PLL logic that has not been modelled yet (create a separate clock instead). The clock domains should be used as a mechanism to provide a controllable clock source that affects clock for every clocked object lying beneath it. The clock of the domain can (in a future patch) be controlled by a handler responsible for dynamic frequency scaling of the respective clock domains. All the config scripts have been retro-fitted with clock domains. For the System a default SrcClockDomain is created. For CPUs that run at a different speed than the system, there is a seperate clock domain created. This domain incorporates the CPU and the associated caches. As before, Ruby runs under its own clock domain. The clock period of all domains are pre-computed, such that no virtual functions or multiplications are needed when calling clockPeriod. Instead, the clock period is pre-computed when any changes occur. For this to be possible, each clock domain tracks its children.
2013-06-27	config: Rename clock option to Ruby clock	Akash Bagdia
	This patch changes the 'clock' option to 'ruby-clock' as it is only used by Ruby.
2013-06-27	config: Add a system clock command-line option	Akash Bagdia
	This patch adds a 'sys_clock' command-line option and use it to assign clocks to the system during instantiation. As part of this change, the default clock in the System class is removed and whenever a system is instantiated a system clock value must be set. A default value is provided for the command-line option. The configs and tests are updated accordingly.
2013-06-27	config: Add a CPU clock command-line option	Akash Bagdia
	This patch adds a 'cpu_clock' command-line option and uses the value to assign clocks to components running at the CPU speed (L1 and L2 including the L2-bus). The configuration scripts are updated accordingly. The 'clock' option is left unchanged in this patch as it is still used by a number of components. In follow-on patches the latter will be disambiguated further.
2013-06-03	config: Add missing CPUs to --restore-with-cpu	Andreas Sandberg
	The --restore-with-cpu option didn't use CpuConfig.cpu_names() to determine which CPU names are valid, instead it used a static list of known CPU names. This changeset makes the option parsing code use the CPU list from the CpuConfig module instead.
2013-05-30	mem: More descriptive DRAM config names	Andreas Hansson
	This patch changes the class names of the variuos DRAM configurations to better reflect what memory they are based on. The speed and interface width is now part of the name, and also the alias that is used to select them on the command line. Some minor changes are done to the actual parameters, to better reflect the named configurations. As a result of these changes the regressions change slightly and the stats will be bumped in a separate patch.
2013-05-30	mem: Add a LPDDR3-1600 configuration	Andreas Hansson
	This patch adds a typical (leaning towards fast) LPDDR3 configuration based on publically available data. As expected, it looks very similar to the LPDDR2-S4 configuration, only with a slightly lower burst time.
2013-05-30	mem: Avoid explicitly zeroing the memory backing store	Andreas Hansson
	This patch removes the explicit memset as it is redundant and causes the simulator to touch the entire space, forcing the host system to allocate the pages. Anonymous pages are mapped on the first access, and the page-fault handler is responsible for zeroing them. Thus, the pages are still zeroed, but we avoid touching the entire allocated space which enables us to use much larger memory sizes as long as not all the memory is actually used.
2013-05-14	cpu: remove local/globalHistoryBits params from branch pred	Anthony Gutierrez
	having separate params for the local/globalHistoryBits and the local/globalPredictorSize can lead to inconsistencies if they are not carefully set. this patch dervies the number of bits necessary to index into the local/global predictors based on their size. the value of the localHistoryTableSize for the ARM O3 CPU has been increased to 1024 from 64, which is more accurate for an A15 based on some correlation against A15 hardware.
2013-04-22	config: Add a mem-type config option to se/fs scripts	Andreas Hansson
	This patch enables selection of the memory controller class through a mem-type command-line option. Behind the scenes, this option is treated much like the cpu-type, and a similar framework is used to resolve the valid options, and translate the short-hand description to a valid class. The regression scripts are updated with a hardcoded memory class for the moment. The best solution going forward is probably to get the memory out of the makeSystem functions, but Ruby complicates things as it does not connect the memory controller to the membus. --HG-- rename : configs/common/CpuConfig.py => configs/common/MemConfig.py
2013-04-22	cpu: generate SimPoint basic block vector profiles	Dam Sunwoo
	This patch is based on http://reviews.m5sim.org/r/1474/ originally written by Mitch Hayenga. Basic block vectors are generated (simpoint.bb.gz in simout folder) based on start and end addresses of basic blocks. Some comments to the original patch are addressed and hooks are added to create and resume from checkpoints based on instruction counts dictated by external SimPoint analysis tools. SimPoint creation/resuming options will be implemented as a separate patch.
2013-04-09	Configs: Fix handling of maxtick and take_checkpoints	Joel Hestness
	In Simulation.py, calls to m5.simulate(num_ticks) will run the simulated system for num_ticks after the current tick. Fix calls to m5.simulate in scriptCheckpoints() and benchCheckpoints() to appropriately handle the maxticks variable.
2013-03-28	x86: create space in bios memory map	Nilay Vaish
	As of now, we mark the top 1MB of memory space as unusable. Part of it is actually usable and is required to be marked so by some of the newer versions of linux kernel. This patch marks the top 639KB as usable. This value was chosen by looking at QEMU's output for bios memory map.
2013-03-22	config: return exit event instead of cause	Nilay Vaish
	changeset: a4739b6f799d made some changes that where an exit event should have been returned in place of exit cause. This patch corrects the error.
2013-02-20	config: Fix --prog-interval command line option	Ali Saidi

2013-02-15	options: add command line option for dtb file	Anthony Gutierrez

2013-02-15	config: Remove O3 dependencies	Andreas Sandberg
	The default cache configuration script currently import the O3_ARM_v7a model configuration, which depends on the O3 CPU. This breaks if gem5 has been compiled without O3 support. This changeset removes the dependency by only importing the model if it is requested by the user. As a bonus, it actually removes some code duplication in the configuration scripts.
2013-02-15	config: Move CPU handover logic to m5.switchCpus()	Andreas Sandberg
	CPU switching consists of the following steps: 1. Drain the system 2. Switch out old CPUs (cpu.switchOut()) 3. Change the system timing mode to the mode the new CPUs require 4. Flush caches if switching to hardware virtualization 5. Inform new CPUs of the handover (cpu.takeOverFrom()) 6. Resume the system m5.switchCpus() previously only did step 2 & 5. Since information about the new processors' memory system requirements is now exposed, do all of the steps above. This patch adds automatic memory system switching and flush (if needed) to switchCpus(). Additionally, it adds optional draining to switchCpus(). This has the following implications: * changeToTiming and changeToAtomic are no longer needed, so they have been removed. * changeMemoryMode is only used internally, so it is has been renamed to be private. * switchCpus requires a reference to the system containing the CPUs as its first parameter. WARNING: This changeset breaks compatibility with existing configuration scripts since it changes the signature of m5.switchCpus().
2013-02-15	config: Cleanup CPU configuration	Andreas Sandberg
	The CPUs supported by the configuration scripts used to be hard-coded. This was not ideal for several reasons. For example, the configuration scripts depend on all CPU models even though only a subset might have been compiled. This changeset adds a new module to the configuration scripts that automatically discovers the available CPU models from the compiled SimObjects. As a nice bonus, the use of introspection allows us to automatically generate a list of available CPU models suitable for printing. This list is augmented with the Python doc string from the underlying class if available.
2013-02-15	cpu: Add CPU metadata om the Python classes	Andreas Sandberg
	The configuration scripts currently hard-code the requirements of each CPU. This is clearly not optimal as it makes writing new configuration scripts painful and adding new CPU models requires existing scripts to be updated. This patch adds the following class methods to the base CPU and all relevant CPUs: * memory_mode -- Return a string describing the current memory mode (invalid/atomic/timing). * require_caches -- Does the CPU model require caches? * support_take_over -- Does the CPU support CPU handover?
2013-02-10	config: Don't call sys.exit in interactive mode in run()	Andreas Sandberg
	The run() method in Simulation.py used to call sys.exit() when the simulator exits. This is undesirable when user has requested the simulator to be run in interactive mode since it causes the simulator to exit rather than entering the interactive Python environment.
2013-01-31	mem: Add DDR3 and LPDDR2 DRAM controller configurations	Andreas Hansson
	This patch moves the default DRAM parameters from the SimpleDRAM class to two different subclasses, one for DDR3 and one for LPDDR2. More can be added as we go forward. The regressions that previously used the SimpleDRAM are now using SimpleDDR3 as this is the most similar configuration.
2013-01-24	branch predictor: move out of o3 and inorder cpus	Nilay Vaish ext:(%2C%20Timothy%20Jones%20%3Ctimothy.jones%40cl.cam.ac.uk%3E)
	This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository. --HG-- rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh
2013-01-08	config: Fix issue with changeset: a4739b6f799d.	Ali Saidi

2013-01-08	util: add m5_fail op.	Lluís Vilanova
	Used as a command in full-system scripts helps the user ensure the benchmarks have finished successfully. For example, one can use: /path/to/benchmark args \|\| /sbin/m5 fail 1 and thus ensure gem5 will exit with an error if the benchmark fails.
2013-01-07	cpu: Rename defer_registration->switched_out	Andreas Sandberg
	The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive.
2013-01-07	config: Do not use hardcoded physmem in fs script	Andreas Hansson
	This patch generalises the address range resolution for the I/O cache and I/O bridge such that they do not assume a single memory. The patch involves adding a parameter to the system which is then defined based on the memories that are to be visible from the I/O subsystem, whether behind a cache or a bridge. The change is needed to allow interleaved memory controllers in the system.
2012-12-06	TournamentBP: Fix some bugs with table sizes and counters	Erik Tomusk
	globalHistoryBits, globalPredictorSize, and choicePredictorSize are decoupled. globalHistoryBits controls how much history is kept, global and choice predictor sizes control how much of that history is used when accessing predictor tables. This way, global and choice predictors can actually be different sizes, and it is no longer possible to walk off the predictor arrays and cause a seg fault. There are now individual thresholds for choice, global, and local saturating counters, so that taken/not taken decisions are correct even when the predictors' counters' sizes are different. The interface for localPredictorSize has been removed from TournamentBP because the value can be calculated from localHistoryBits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2012-11-19	config: Fix description of checkpoint option from cycle to tick	Andreas Hansson
	This patch merely updates the description of the "take-checkpoints" option to reflect that it is specified in ticks and not in cycles.
2012-11-02	python: Rename doDrain()->drain() and make it do the right thing	Andreas Sandberg
	There is no point in exporting the old drain() method in Simulate.py. It should only be used internally by doDrain(). This patch moves the old drain() method into doDrain() and renames doDrain() to drain().
2012-11-02	Partly revert [4f54b0f229b5] and move draining to m5.changeToTiming	Andreas Sandberg
	Changeset 4f54b0f229b5 removed the call to doDrain in changeToTiming based on the assumption that the system does not need draining when running in atomic mode. This is a false assumption since at least the System class requires the system to be drained before it allows switching of memory modes. This patch reverts that part of the changeset.
2012-10-30	config: Unify caches used in regressions and adjust L2 MSHRs	Andreas Hansson
	This patch unified the L1 and L2 caches used throughout the regressions instead of declaring different, but very similar, configurations in the different scripts. The patch also changes the default L2 configuration to match what it used to be for the fs and se scripts (until the last patch that updated the regressions to also make use of the cache config). The MSHRs and targets per MSHR are now set to a more realistic default of 20 and 12, respectively. As a result of both the aforementioned changes, many of the regression stats are changed. A follow-on patch will bump the stats.
2012-10-26	config: Fix the cache class naming in regression scripts	Andreas Hansson
	This patch unifies the naming of the default L1 and L2 caches in the regression configs to be in line with what is used in the se and fs scripts.
2012-10-25	config: Use SimpleDRAM in full-system, and with o3 and inorder	Andreas Hansson
	This patch favours using SimpleDRAM with the default timing instead of SimpleMemory for all regressions that involve the o3 or inorder CPU, or are full system (in other words, where the actual performance of the memory is important for the overall performance). Moving forward, the solution for FSConfig and the users of fs.py and se.py is probably something similar to what we use to choose the CPU type. I envision a few pre-set configurations SimpleLPDDR2, SimpleDDR3, etc that can be choosen by a dram_type option. Feedback on this part is welcome. This patch changes plenty stats and adds all the DRAM controller related stats. A follow-on patch updates the relevant statistics. The total run-time for the entire regression goes up with ~5% with this patch due to the added complexity of the SimpleDRAM model. This is a concious trade-off to ensure that the model is properly tested.
2012-10-25	config: Use shared cache config for regressions	Andreas Hansson
	This patch uses the common L1, L2 and IOCache configuration for the regressions that all share the same cache parameters. There are a few regressions that use a slightly different configuration (memtest, o3-timing=mp, simple-atomic-mp and simple-timing-mp), and the latter are not changed in this patch. They will be updated in a future patch. The common cache configurations are changed to match the ones used in the regressions, and are slightly changed with respect to what they were. Hopefully this means we can converge on a common base configuration, used both in the normal user configurations and regressions. As only regressions that shared the same cache configuration are updated, no regressions are affected.
2012-10-15	Mem: Use cycles to express cache-related latencies	Andreas Hansson
	This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counter in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.
2012-10-15	Regression: Use CPU clock and 32-byte width for L1-L2 bus	Andreas Hansson
	This patch changes the CoherentBus between the L1s and L2 to use the CPU clock and also four times the width compared to the default bus. The parameters are not intending to fit every single scenario, but rather serve as a better startingpoint than what we previously had. Note that the scripts that do not use the addTwoLevelCacheHiearchy are not affected by this change. A separate patch will update the stats.
2012-09-25	Cache: add a response latency to the caches	Mrinmoy Ghosh
	In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set of the cache for the backward path.
2012-09-19	AddrRange: Simplify AddrRange params Python hierarchy	Andreas Hansson
	This patch simplifies the Range object hierarchy in preparation for an address range class that also allows striping (e.g. selecting a few bits as matching in addition to the range). To extend the AddrRange class to an AddrRegion, the first step is to simplify the hierarchy such that we can make it as lean as possible before adding the new functionality. The only class using Range and MetaRange is AddrRange, and the three classes are now collapsed into one.
2012-09-12	Standard Switch: Drain the system before switching CPUs	Joel Hestness
	When switching from an atomic CPU to any of the timing CPUs, a drain is unnecessary since no events are scheduled in atomic mode. However, when trying to switch CPUs starting with a timing CPU, there may be events scheduled. This change ensures that all events are drained from the system by calling m5.drain before switching CPUs.
2012-09-11	Checkpoint: Pass maxtick to avoid undefined variable	Andreas Hansson
	This patch fixes a bug in scriptCheckpoints, where maxtick was used undefined. The bug caused checkpointing by means of --take-checkpoints to fail.
2012-09-09	se.py: support specifying multiple programs via command line	Nilay Vaish
	This patch allows for specifying multiple programs via command line. It also adds an option for specifying whether to use of SMT. But SMT does not work for the o3 cpu as of now.
2012-08-22	Bridge: Remove NACKs in the bridge and unify with packet queue	Andreas Hansson
	This patch removes the NACKing in the bridge, as the split request/response busses now ensure that protocol deadlocks do not occur, i.e. the message-dependency chain is broken by always allowing responses to make progress without being stalled by requests. The NACKs had limited support in the system with most components ignoring their use (with a suitable call to panic), and as the NACKs are no longer needed to avoid protocol deadlocks, the cleanest way is to simply remove them. The bridge is the starting point as this is the only place where the NACKs are created. A follow-up patch will remove the code that deals with NACKs in the endpoints, e.g. the X86 table walker and DMA port. Ultimately the type of packet can be complete removed (until someone sees a need for modelling more complex protocols, which can now be done in parts of the system since the port and interface is split). As a consequence of the NACK removal, the bridge now has to send a retry to a master if the request or response queue was full on the first attempt. This change also makes the bridge ports very similar to QueuedPorts, and a later patch will change the bridge to use these. A first step in this direction is taken by aligning the name of the member functions, as done by this patch. A bit of tidying up has also been done as part of the simplifications. Surprisingly, this patch has no impact on any of the regressions. Hence, there was never any NACKs issued. In a follow-up patch I would suggest changing the size of the bridge buffers set in FSConfig.py to also test the situation where the bridge fills up.
2012-08-21	Checkpoint: Fix broken checkpointing functionality	Andreas Hansson
	This patch fixes the checkpointing by ensuring that the directory is passer to the scriptCheckpoints function, and that the num_checkpoints is not used before it is initialised.
2012-08-15	configs: add option for repeatedly switching back-and-forth between cpu types.	Anthony Gutierrez
	This patch adds a --repeat-switch option that will enable repeat core switching at a user defined period (set with --switch-freq option). currently, a switch can only occur between like CPU types. inorder CPU switching is not supported. note this patch simply allows a config that will perform repeat switching, it does not fix drain/switchout functionality. if you run with repeat switching you will hit assertion failures and/or your workload with hang or die.
2012-08-06	Simulation.py: move code related to checkpointing to functions	Nilay Vaish
	This patch moves the code related to checkpointing from the run() function to several different functions. The aim is to make the code more manageable. No functionality changes are expected, but since the code is kind of unruly, it is possible that some change might have creeped in.
2012-08-06	Config: change how cpu class is set	Nilay Vaish
	This changes the way in which the cpu class while restoring from a checkpoint is set. Earlier it was assumed if cpu type with which to restore is not same as the cpu type with the which to run the simulation, then the checkpoint should be restored with the atomic cpu. This assumption is being dropped. The checkpoint can now be restored with any cpu type, the default being atomic cpu.
2012-07-23	Config: Use clock option in se/fs script and pass to switch_cpus	Andreas Hansson
	This patch changes the se and fs script to use the clock option and not simply set the CPUs clock to 2 GHz. It also makes a minor change to the assignment of the switch_cpus clock to allow different clocks.
2012-06-11	configs: add run scripts for ics/gb versions of android and bbench	Anthony Gutierrez
	1) Modifies Benchmarks.py to add support for Android ICS and BBench on Android ICS. 2) An rcS script is added for BBench on ICS. 3) Separates benchmark entries and rcS scripts for GB/ICS 4) Removes the debugging output from the existing BBench run script. These print statements were used for debugging and they seemed to confuse users into believing they should see some terminal output.
2012-06-07	Config: Remove setMipsOptions	Nilay Vaish
	As status matrix, MIPS fs does not work. Hence, these options are not required. Secondly, the function is setting param values for a CPU class. This seems strange, should probably be done in a different way.