gem5 - gem5

Age	Commit message (Collapse)	Author
2013-08-26	ARM: Fix configuration files for bare-metal binaries.	Ali Saidi

2013-08-19	config: Command line support for multi-channel memory	Andreas Hansson
	This patch adds support for specifying multi-channel memory configurations on the command line, e.g. 'se/fs.py --mem-type=ddr3_1600_x64 --mem-channels=4'. To enable this, it enhances the functionality of MemConfig and moves the existing makeMultiChannel class method from SimpleDRAM to the support scripts. The se/fs.py example scripts are updated to make use of the new feature.
2013-08-19	power: Add voltage domains to the clock domains	Akash Bagdia
	This patch adds the notion of voltage domains, and groups clock domains that operate under the same voltage (i.e. power supply) into domains. Each clock domain is required to be associated with a voltage domain, and the latter requires the voltage to be explicitly set. A voltage domain is an independently controllable voltage supply being provided to section of the design. Thus, if you wish to perform dynamic voltage scaling on a CPU, its clock domain should be associated with a separate voltage domain. The current implementation of the voltage domain does not take into consideration cases where there are derived voltage domains running at ratio of native voltage domains, as with the case where there can be on-chip buck/boost (charge pumps) voltage regulation logic. The regression and configuration scripts are updated with a generic voltage domain for the system, and one for the CPUs.
2013-08-19	config: Move the memory instantiation outside FSConfig	Andreas Hansson
	This patch moves the instantiation of the memory controller outside FSConfig and instead relies on the mem_ranges to pass the information to the caller (e.g. fs.py or one of the regression scripts). The main motivation for this change is to expose the structural composition of the memory system and allow more tuning and configuration without adding a large number of options to the makeSystem functions. The patch updates the relevant example scripts to maintain the current functionality. As the order that ports are connected to the memory bus changes (in certain regresisons), some bus stats are shuffled around. For example, what used to be layer 0 is now layer 1. Going forward, options will be added to support the addition of multi-channel memory controllers.
2013-07-18	Configs: Fix up maxtick and maxtime	Joel Hestness
	This patch contains three fixes to max tick options handling in Options.py and Simulation.py: 1) Since the global simulator frequency isn't bound until m5.instantiate() is called, the maxtick resolution needs to happen after this call, since changes to the global frequency will cause m5.simulate() to misinterpret the maxtick value. Shuffling this also requires tweaking the checkpoint directory handling to signal the checkpoint restore tick back to run(). Fixing this completely and correctly will require storing the simulation frequency into checkpoints, which is beyond the scope of this patch. 2) The maxtick option in Options.py was defaulted to MaxTicks, so the old code would always skip over the maxtime part of the conditionals at the beginning of run(). Change the maxtick default to None, and set the maxtick local variable in run() appropriately. 3) To clarify whether max ticks settings are relative or absolute, split the maxtick option into separate options, for relative and absolute. Ensure that these two options and the maxtime option are handled appropriately to set the maxtick variable in Simulation.py.
2013-07-18	config: Update script to set cache line size on system	Andreas Hansson
	This patch changes the config scripts such that they do not set the cache line size per cache instance, but rather for the system as a whole.
2013-06-28	configs: rearrange the available options in Options.py	Nilay Vaish
	It also changes the instantiation of physmem in se.py so as to make use of the memory size supplied by the mem_size option.
2013-06-27	sim: Add the notion of clock domains to all ClockedObjects	Akash Bagdia
	This patch adds the notion of source- and derived-clock domains to the ClockedObjects. As such, all clock information is moved to the clock domain, and the ClockedObjects are grouped into domains. The clock domains are either source domains, with a specific clock period, or derived domains that have a parent domain and a divider (potentially chained). For piece of logic that runs at a derived clock (a ratio of the clock its parent is running at) the necessary derived clock domain is created from its corresponding parent clock domain. For now, the derived clock domain only supports a divider, thus ensuring a lower speed compared to its parent. Multiplier functionality implies a PLL logic that has not been modelled yet (create a separate clock instead). The clock domains should be used as a mechanism to provide a controllable clock source that affects clock for every clocked object lying beneath it. The clock of the domain can (in a future patch) be controlled by a handler responsible for dynamic frequency scaling of the respective clock domains. All the config scripts have been retro-fitted with clock domains. For the System a default SrcClockDomain is created. For CPUs that run at a different speed than the system, there is a seperate clock domain created. This domain incorporates the CPU and the associated caches. As before, Ruby runs under its own clock domain. The clock period of all domains are pre-computed, such that no virtual functions or multiplications are needed when calling clockPeriod. Instead, the clock period is pre-computed when any changes occur. For this to be possible, each clock domain tracks its children.
2013-06-27	config: Rename clock option to Ruby clock	Akash Bagdia
	This patch changes the 'clock' option to 'ruby-clock' as it is only used by Ruby.
2013-06-27	config: Add a system clock command-line option	Akash Bagdia
	This patch adds a 'sys_clock' command-line option and use it to assign clocks to the system during instantiation. As part of this change, the default clock in the System class is removed and whenever a system is instantiated a system clock value must be set. A default value is provided for the command-line option. The configs and tests are updated accordingly.
2013-06-27	config: Add a CPU clock command-line option	Akash Bagdia
	This patch adds a 'cpu_clock' command-line option and uses the value to assign clocks to components running at the CPU speed (L1 and L2 including the L2-bus). The configuration scripts are updated accordingly. The 'clock' option is left unchanged in this patch as it is still used by a number of components. In follow-on patches the latter will be disambiguated further.
2013-06-03	config: Add missing CPUs to --restore-with-cpu	Andreas Sandberg
	The --restore-with-cpu option didn't use CpuConfig.cpu_names() to determine which CPU names are valid, instead it used a static list of known CPU names. This changeset makes the option parsing code use the CPU list from the CpuConfig module instead.
2013-05-30	mem: More descriptive DRAM config names	Andreas Hansson
	This patch changes the class names of the variuos DRAM configurations to better reflect what memory they are based on. The speed and interface width is now part of the name, and also the alias that is used to select them on the command line. Some minor changes are done to the actual parameters, to better reflect the named configurations. As a result of these changes the regressions change slightly and the stats will be bumped in a separate patch.
2013-05-30	mem: Add a LPDDR3-1600 configuration	Andreas Hansson
	This patch adds a typical (leaning towards fast) LPDDR3 configuration based on publically available data. As expected, it looks very similar to the LPDDR2-S4 configuration, only with a slightly lower burst time.
2013-05-30	mem: Avoid explicitly zeroing the memory backing store	Andreas Hansson
	This patch removes the explicit memset as it is redundant and causes the simulator to touch the entire space, forcing the host system to allocate the pages. Anonymous pages are mapped on the first access, and the page-fault handler is responsible for zeroing them. Thus, the pages are still zeroed, but we avoid touching the entire allocated space which enables us to use much larger memory sizes as long as not all the memory is actually used.
2013-05-14	cpu: remove local/globalHistoryBits params from branch pred	Anthony Gutierrez
	having separate params for the local/globalHistoryBits and the local/globalPredictorSize can lead to inconsistencies if they are not carefully set. this patch dervies the number of bits necessary to index into the local/global predictors based on their size. the value of the localHistoryTableSize for the ARM O3 CPU has been increased to 1024 from 64, which is more accurate for an A15 based on some correlation against A15 hardware.
2013-04-22	config: Add a mem-type config option to se/fs scripts	Andreas Hansson
	This patch enables selection of the memory controller class through a mem-type command-line option. Behind the scenes, this option is treated much like the cpu-type, and a similar framework is used to resolve the valid options, and translate the short-hand description to a valid class. The regression scripts are updated with a hardcoded memory class for the moment. The best solution going forward is probably to get the memory out of the makeSystem functions, but Ruby complicates things as it does not connect the memory controller to the membus. --HG-- rename : configs/common/CpuConfig.py => configs/common/MemConfig.py
2013-04-22	cpu: generate SimPoint basic block vector profiles	Dam Sunwoo
	This patch is based on http://reviews.m5sim.org/r/1474/ originally written by Mitch Hayenga. Basic block vectors are generated (simpoint.bb.gz in simout folder) based on start and end addresses of basic blocks. Some comments to the original patch are addressed and hooks are added to create and resume from checkpoints based on instruction counts dictated by external SimPoint analysis tools. SimPoint creation/resuming options will be implemented as a separate patch.
2013-04-09	Configs: Fix handling of maxtick and take_checkpoints	Joel Hestness
	In Simulation.py, calls to m5.simulate(num_ticks) will run the simulated system for num_ticks after the current tick. Fix calls to m5.simulate in scriptCheckpoints() and benchCheckpoints() to appropriately handle the maxticks variable.
2013-03-28	x86: create space in bios memory map	Nilay Vaish
	As of now, we mark the top 1MB of memory space as unusable. Part of it is actually usable and is required to be marked so by some of the newer versions of linux kernel. This patch marks the top 639KB as usable. This value was chosen by looking at QEMU's output for bios memory map.
2013-03-22	config: return exit event instead of cause	Nilay Vaish
	changeset: a4739b6f799d made some changes that where an exit event should have been returned in place of exit cause. This patch corrects the error.
2013-02-20	config: Fix --prog-interval command line option	Ali Saidi

2013-02-15	options: add command line option for dtb file	Anthony Gutierrez

2013-02-15	config: Remove O3 dependencies	Andreas Sandberg
	The default cache configuration script currently import the O3_ARM_v7a model configuration, which depends on the O3 CPU. This breaks if gem5 has been compiled without O3 support. This changeset removes the dependency by only importing the model if it is requested by the user. As a bonus, it actually removes some code duplication in the configuration scripts.
2013-02-15	config: Move CPU handover logic to m5.switchCpus()	Andreas Sandberg
	CPU switching consists of the following steps: 1. Drain the system 2. Switch out old CPUs (cpu.switchOut()) 3. Change the system timing mode to the mode the new CPUs require 4. Flush caches if switching to hardware virtualization 5. Inform new CPUs of the handover (cpu.takeOverFrom()) 6. Resume the system m5.switchCpus() previously only did step 2 & 5. Since information about the new processors' memory system requirements is now exposed, do all of the steps above. This patch adds automatic memory system switching and flush (if needed) to switchCpus(). Additionally, it adds optional draining to switchCpus(). This has the following implications: * changeToTiming and changeToAtomic are no longer needed, so they have been removed. * changeMemoryMode is only used internally, so it is has been renamed to be private. * switchCpus requires a reference to the system containing the CPUs as its first parameter. WARNING: This changeset breaks compatibility with existing configuration scripts since it changes the signature of m5.switchCpus().
2013-02-15	config: Cleanup CPU configuration	Andreas Sandberg
	The CPUs supported by the configuration scripts used to be hard-coded. This was not ideal for several reasons. For example, the configuration scripts depend on all CPU models even though only a subset might have been compiled. This changeset adds a new module to the configuration scripts that automatically discovers the available CPU models from the compiled SimObjects. As a nice bonus, the use of introspection allows us to automatically generate a list of available CPU models suitable for printing. This list is augmented with the Python doc string from the underlying class if available.
2013-02-15	cpu: Add CPU metadata om the Python classes	Andreas Sandberg
	The configuration scripts currently hard-code the requirements of each CPU. This is clearly not optimal as it makes writing new configuration scripts painful and adding new CPU models requires existing scripts to be updated. This patch adds the following class methods to the base CPU and all relevant CPUs: * memory_mode -- Return a string describing the current memory mode (invalid/atomic/timing). * require_caches -- Does the CPU model require caches? * support_take_over -- Does the CPU support CPU handover?
2013-02-10	config: Don't call sys.exit in interactive mode in run()	Andreas Sandberg
	The run() method in Simulation.py used to call sys.exit() when the simulator exits. This is undesirable when user has requested the simulator to be run in interactive mode since it causes the simulator to exit rather than entering the interactive Python environment.
2013-01-31	mem: Add DDR3 and LPDDR2 DRAM controller configurations	Andreas Hansson
	This patch moves the default DRAM parameters from the SimpleDRAM class to two different subclasses, one for DDR3 and one for LPDDR2. More can be added as we go forward. The regressions that previously used the SimpleDRAM are now using SimpleDDR3 as this is the most similar configuration.
2013-01-24	branch predictor: move out of o3 and inorder cpus	Nilay Vaish ext:(%2C%20Timothy%20Jones%20%3Ctimothy.jones%40cl.cam.ac.uk%3E)
	This patch moves the branch predictor files in the o3 and inorder directories to src/cpu/pred. This allows sharing the branch predictor across different cpu models. This patch was originally posted by Timothy Jones in July 2010 but never made it to the repository. --HG-- rename : src/cpu/o3/bpred_unit.cc => src/cpu/pred/bpred_unit.cc rename : src/cpu/o3/bpred_unit.hh => src/cpu/pred/bpred_unit.hh rename : src/cpu/o3/bpred_unit_impl.hh => src/cpu/pred/bpred_unit_impl.hh rename : src/cpu/o3/sat_counter.hh => src/cpu/pred/sat_counter.hh
2013-01-08	config: Fix issue with changeset: a4739b6f799d.	Ali Saidi

2013-01-08	util: add m5_fail op.	Lluís Vilanova
	Used as a command in full-system scripts helps the user ensure the benchmarks have finished successfully. For example, one can use: /path/to/benchmark args \|\| /sbin/m5 fail 1 and thus ensure gem5 will exit with an error if the benchmark fails.
2013-01-07	cpu: Rename defer_registration->switched_out	Andreas Sandberg
	The defer_registration parameter is used to prevent a CPU from initializing at startup, leaving it in the "switched out" mode. The name of this parameter (and the help string) is confusing. This patch renames it to switched_out, which should be more descriptive.
2013-01-07	config: Do not use hardcoded physmem in fs script	Andreas Hansson
	This patch generalises the address range resolution for the I/O cache and I/O bridge such that they do not assume a single memory. The patch involves adding a parameter to the system which is then defined based on the memories that are to be visible from the I/O subsystem, whether behind a cache or a bridge. The change is needed to allow interleaved memory controllers in the system.
2012-12-06	TournamentBP: Fix some bugs with table sizes and counters	Erik Tomusk
	globalHistoryBits, globalPredictorSize, and choicePredictorSize are decoupled. globalHistoryBits controls how much history is kept, global and choice predictor sizes control how much of that history is used when accessing predictor tables. This way, global and choice predictors can actually be different sizes, and it is no longer possible to walk off the predictor arrays and cause a seg fault. There are now individual thresholds for choice, global, and local saturating counters, so that taken/not taken decisions are correct even when the predictors' counters' sizes are different. The interface for localPredictorSize has been removed from TournamentBP because the value can be calculated from localHistoryBits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2012-11-19	config: Fix description of checkpoint option from cycle to tick	Andreas Hansson
	This patch merely updates the description of the "take-checkpoints" option to reflect that it is specified in ticks and not in cycles.
2012-11-02	python: Rename doDrain()->drain() and make it do the right thing	Andreas Sandberg
	There is no point in exporting the old drain() method in Simulate.py. It should only be used internally by doDrain(). This patch moves the old drain() method into doDrain() and renames doDrain() to drain().
2012-11-02	Partly revert [4f54b0f229b5] and move draining to m5.changeToTiming	Andreas Sandberg
	Changeset 4f54b0f229b5 removed the call to doDrain in changeToTiming based on the assumption that the system does not need draining when running in atomic mode. This is a false assumption since at least the System class requires the system to be drained before it allows switching of memory modes. This patch reverts that part of the changeset.
2012-10-30	config: Unify caches used in regressions and adjust L2 MSHRs	Andreas Hansson
	This patch unified the L1 and L2 caches used throughout the regressions instead of declaring different, but very similar, configurations in the different scripts. The patch also changes the default L2 configuration to match what it used to be for the fs and se scripts (until the last patch that updated the regressions to also make use of the cache config). The MSHRs and targets per MSHR are now set to a more realistic default of 20 and 12, respectively. As a result of both the aforementioned changes, many of the regression stats are changed. A follow-on patch will bump the stats.
2012-10-26	config: Fix the cache class naming in regression scripts	Andreas Hansson
	This patch unifies the naming of the default L1 and L2 caches in the regression configs to be in line with what is used in the se and fs scripts.
2012-10-25	config: Use SimpleDRAM in full-system, and with o3 and inorder	Andreas Hansson
	This patch favours using SimpleDRAM with the default timing instead of SimpleMemory for all regressions that involve the o3 or inorder CPU, or are full system (in other words, where the actual performance of the memory is important for the overall performance). Moving forward, the solution for FSConfig and the users of fs.py and se.py is probably something similar to what we use to choose the CPU type. I envision a few pre-set configurations SimpleLPDDR2, SimpleDDR3, etc that can be choosen by a dram_type option. Feedback on this part is welcome. This patch changes plenty stats and adds all the DRAM controller related stats. A follow-on patch updates the relevant statistics. The total run-time for the entire regression goes up with ~5% with this patch due to the added complexity of the SimpleDRAM model. This is a concious trade-off to ensure that the model is properly tested.
2012-10-25	config: Use shared cache config for regressions	Andreas Hansson
	This patch uses the common L1, L2 and IOCache configuration for the regressions that all share the same cache parameters. There are a few regressions that use a slightly different configuration (memtest, o3-timing=mp, simple-atomic-mp and simple-timing-mp), and the latter are not changed in this patch. They will be updated in a future patch. The common cache configurations are changed to match the ones used in the regressions, and are slightly changed with respect to what they were. Hopefully this means we can converge on a common base configuration, used both in the normal user configurations and regressions. As only regressions that shared the same cache configuration are updated, no regressions are affected.
2012-10-15	Mem: Use cycles to express cache-related latencies	Andreas Hansson
	This patch changes the cache-related latencies from an absolute time expressed in Ticks, to a number of cycles that can be scaled with the clock period of the caches. Ultimately this patch serves to enable future work that involves dynamic frequency scaling. As an immediate benefit it also makes it more convenient to specify cache performance without implicitly assuming a specific CPU core operating frequency. The stat blocked_cycles that actually counter in ticks is now updated to count in cycles. As the timing is now rounded to the clock edges of the cache, there are some regressions that change. Plenty of them have very minor changes, whereas some regressions with a short run-time are perturbed quite significantly. A follow-on patch updates all the statistics for the regressions.
2012-10-15	Regression: Use CPU clock and 32-byte width for L1-L2 bus	Andreas Hansson
	This patch changes the CoherentBus between the L1s and L2 to use the CPU clock and also four times the width compared to the default bus. The parameters are not intending to fit every single scenario, but rather serve as a better startingpoint than what we previously had. Note that the scripts that do not use the addTwoLevelCacheHiearchy are not affected by this change. A separate patch will update the stats.
2012-09-25	Cache: add a response latency to the caches	Mrinmoy Ghosh
	In the current caches the hit latency is paid twice on a miss. This patch lets a configurable response latency be set of the cache for the backward path.
2012-09-19	AddrRange: Simplify AddrRange params Python hierarchy	Andreas Hansson
	This patch simplifies the Range object hierarchy in preparation for an address range class that also allows striping (e.g. selecting a few bits as matching in addition to the range). To extend the AddrRange class to an AddrRegion, the first step is to simplify the hierarchy such that we can make it as lean as possible before adding the new functionality. The only class using Range and MetaRange is AddrRange, and the three classes are now collapsed into one.
2012-09-12	Standard Switch: Drain the system before switching CPUs	Joel Hestness
	When switching from an atomic CPU to any of the timing CPUs, a drain is unnecessary since no events are scheduled in atomic mode. However, when trying to switch CPUs starting with a timing CPU, there may be events scheduled. This change ensures that all events are drained from the system by calling m5.drain before switching CPUs.
2012-09-11	Checkpoint: Pass maxtick to avoid undefined variable	Andreas Hansson
	This patch fixes a bug in scriptCheckpoints, where maxtick was used undefined. The bug caused checkpointing by means of --take-checkpoints to fail.
2012-09-09	se.py: support specifying multiple programs via command line	Nilay Vaish
	This patch allows for specifying multiple programs via command line. It also adds an option for specifying whether to use of SMT. But SMT does not work for the o3 cpu as of now.
2012-08-22	Bridge: Remove NACKs in the bridge and unify with packet queue	Andreas Hansson
	This patch removes the NACKing in the bridge, as the split request/response busses now ensure that protocol deadlocks do not occur, i.e. the message-dependency chain is broken by always allowing responses to make progress without being stalled by requests. The NACKs had limited support in the system with most components ignoring their use (with a suitable call to panic), and as the NACKs are no longer needed to avoid protocol deadlocks, the cleanest way is to simply remove them. The bridge is the starting point as this is the only place where the NACKs are created. A follow-up patch will remove the code that deals with NACKs in the endpoints, e.g. the X86 table walker and DMA port. Ultimately the type of packet can be complete removed (until someone sees a need for modelling more complex protocols, which can now be done in parts of the system since the port and interface is split). As a consequence of the NACK removal, the bridge now has to send a retry to a master if the request or response queue was full on the first attempt. This change also makes the bridge ports very similar to QueuedPorts, and a later patch will change the bridge to use these. A first step in this direction is taken by aligning the name of the member functions, as done by this patch. A bit of tidying up has also been done as part of the simplifications. Surprisingly, this patch has no impact on any of the regressions. Hence, there was never any NACKs issued. In a follow-up patch I would suggest changing the size of the bridge buffers set in FSConfig.py to also test the situation where the bridge fills up.