summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-08-19mem: Perform write merging in the DRAM write queueAndreas Hansson
This patch implements basic write merging in the DRAM to avoid redundant bursts. When a new access is added to the queue it is compared against the existing entries, and if it is either intersecting or immediately succeeding/preceeding an existing item it is merged. There is currently no attempt made at avoiding iterating over the existing items in determining whether merging is possible or not.
2013-08-19mem: Replacing bytesPerCacheLine with DRAM burstLength in SimpleDRAMAmin Farmahini
This patch gets rid of bytesPerCacheLine parameter and makes the DRAM configuration separate from cache line size. Instead of bytesPerCacheLine, we define a parameter for the DRAM called burst_length. The burst_length parameter shows the length of a DRAM device burst in bits. Also, lines_per_rowbuffer is replaced with device_rowbuffer_size to improve code portablity. This patch adds a burst length in beats for each memory type, an interface width for each memory type, and the memory controller model is extended to reason about "system" packets vs "dram" packets and assemble the responses properly. It means that system packets larger than a full burst are split into multiple dram packets.
2013-08-19cpu: Fix timing CPU drain checkAndreas Hansson
This patch modifies the SimpleTimingCPU drain check to also consider the fetch event. Previously, there was an assumption that there is never a fetch event scheduled if the CPU is not executing microcode. However, when a context is activated, a fetch even is scheduled, and microPC() is zero.
2013-08-19alpha: Check interrupts before quiesceAndreas Hansson
This patch adds a check to the quiesce operation to ensure that the CPU does not suspend itself when there are unmasked interrupts pending. Without this patch there are corner cases when the CPU gets an interrupt before the quiesce is executed and then never wakes up again.
2013-08-19stats: Fix issue when printing 2D vectorsSascha Bischoff
This patch addresses an issue with the text-based stats output which resulted in Vector2D stats being printed without subnames in the event that one of the dimensions was of length 1. This patch also fixes the total printing for the 2D vector. Previously totals were printed without explicitly stating that a total was being printed. This has been rectified in this patch.
2013-08-19power: Add voltage domains to the clock domainsAkash Bagdia
This patch adds the notion of voltage domains, and groups clock domains that operate under the same voltage (i.e. power supply) into domains. Each clock domain is required to be associated with a voltage domain, and the latter requires the voltage to be explicitly set. A voltage domain is an independently controllable voltage supply being provided to section of the design. Thus, if you wish to perform dynamic voltage scaling on a CPU, its clock domain should be associated with a separate voltage domain. The current implementation of the voltage domain does not take into consideration cases where there are derived voltage domains running at ratio of native voltage domains, as with the case where there can be on-chip buck/boost (charge pumps) voltage regulation logic. The regression and configuration scripts are updated with a generic voltage domain for the system, and one for the CPUs.
2013-08-19config: Move the memory instantiation outside FSConfigAndreas Hansson
This patch moves the instantiation of the memory controller outside FSConfig and instead relies on the mem_ranges to pass the information to the caller (e.g. fs.py or one of the regression scripts). The main motivation for this change is to expose the structural composition of the memory system and allow more tuning and configuration without adding a large number of options to the makeSystem functions. The patch updates the relevant example scripts to maintain the current functionality. As the order that ports are connected to the memory bus changes (in certain regresisons), some bus stats are shuffled around. For example, what used to be layer 0 is now layer 1. Going forward, options will be added to support the addition of multi-channel memory controllers.
2013-08-19mem: Warn instead of panic for tXAW violationAndreas Hansson
Until the performance bug is fixed, avoid killing simulations.
2013-08-19mem: Allow disabling of tXAW through a 0 activation limitAndreas Hansson
This patch fixes an issue where an activation limit of 0 was not allowed. With this patch, setting the limit to 0 simply disables the tXAW constraint.
2013-08-19mem: Add an internal packet queue in SimpleMemoryAndreas Hansson
This patch adds a packet queue in SimpleMemory to avoid using the packet queue in the port (and thus have no involvement in the flow control). The port queue was bound to 100 packets, and as the SimpleMemory is modelling both a controller and an actual RAM, it potentially has a large number of packets in flight. There is currently no limit on the number of packets in the memory controller, but this could easily be added in a follow-on patch. As a result of the added internal storage, the functional access and draining is updated. Some minor cleaning up and renaming has also been done. The memtest regression changes as a result of this patch and the stats will be updated.
2013-08-19cpu: Fix a bug in the O3 CPU introduced by the cache line patchAndreas Hansson
This patch fixes a bug in the O3 fetch stage that was introduced when the cache line size was moved to the system. By mistake, the initialisation and resetting of the fetch stage was merged and put in the constructor. The resetting is now re-added where it should be.
2013-08-14arm: use -march when compiling m5op_arm.SAnthony Gutierrez
Using arm-linux-gnueabi-gcc 4.7.3-1ubuntu1 on Ubuntu 13.04 to compiled the m5 binary yields the error: m5op_arm.S: Assembler messages: m5op_arm.S:85: Error: selected processor does not support ARM mode `bxj lr' For each of of the SIMPLE_OPs. Apparently, this compiler doesn't like the interworking of these code types for the default arch. Adding -march=armv7-a makes it compile. Another alternative that I found to work is replacing the bxj lr instruction with mov pc, lr, but I don't know how that affects the KVM stuff and if bxj is needed.
2013-08-07ruby: slicc: remove double trigger, continueProcessingNilay Vaish
These constructs are not in use and are not being maintained by any one. In addition, it is not known if doubleTrigger works correctly with Ruby now.
2013-08-07ruby: slicc: move some code to AbstractControllerNilay Vaish
Some of the code in StateMachine.py file is added to all the controllers and is independent of the controller definition. This code is being moved to the AbstractController class which is the parent class of all controllers.
2013-08-07x86: add tlb checkpointingNilay Vaish
This patch adds checkpointing support to x86 tlb. It upgrades the cpt_upgrader.py script so that previously created checkpoints can be updated. It moves the checkpoint version to 6.
2013-07-19cpu: Remove unused getBranchPred() method from BaseCPUAndreas Sandberg
Remove unused virtual getBranchPred() method from BaseCPU as it is not implemented by any of the CPU models. It used to always return NULL.
2013-07-18Configs: Fix up maxtick and maxtimeJoel Hestness
This patch contains three fixes to max tick options handling in Options.py and Simulation.py: 1) Since the global simulator frequency isn't bound until m5.instantiate() is called, the maxtick resolution needs to happen after this call, since changes to the global frequency will cause m5.simulate() to misinterpret the maxtick value. Shuffling this also requires tweaking the checkpoint directory handling to signal the checkpoint restore tick back to run(). Fixing this completely and correctly will require storing the simulation frequency into checkpoints, which is beyond the scope of this patch. 2) The maxtick option in Options.py was defaulted to MaxTicks, so the old code would always skip over the maxtime part of the conditionals at the beginning of run(). Change the maxtick default to None, and set the maxtick local variable in run() appropriately. 3) To clarify whether max ticks settings are relative or absolute, split the maxtick option into separate options, for relative and absolute. Ensure that these two options and the maxtime option are handled appropriately to set the maxtick variable in Simulation.py.
2013-07-18config: Update script to set cache line size on systemAndreas Hansson
This patch changes the config scripts such that they do not set the cache line size per cache instance, but rather for the system as a whole.
2013-07-18mem: Set the cache line size on a system levelAndreas Hansson
This patch removes the notion of a peer block size and instead sets the cache line size on the system level. Previously the size was set per cache, and communicated through the interconnect. There were plenty checks to ensure that everyone had the same size specified, and these checks are now removed. Another benefit that is not yet harnessed is that the cache line size is now known at construction time, rather than after the port binding. Hence, the block size can be locally stored and does not have to be queried every time it is used. A follow-on patch updates the configuration scripts accordingly.
2013-07-18mem: Add cache class destructor to avoid memory leaksXiangyu Dong
Make valgrind a little bit happier
2013-07-18scons: Use python-config instead of distutilsAndreas Hansson
This patch changes how we determine the Python-related compiler and linker flags. The previous approach used the internal LINKFORSHARED which is not intended as part of the external API (http://bugs.python.org/issue3588) and causes failures on recent OSX installations. Instead of using distutils we now rely on python-config and scons ParseConfig. For backwards compatibility we also parse out the includes and libs although this could safely be dropped. The drawback of this patch is that Python 2.5 is now required, but hopefully that is an acceptable compromise as any system with gcc 4.4 most likely will have Python >= 2.5.
2013-07-18sim: Make MaxTick in Python match the one in C++Andreas Hansson
This patch aligns the MaxTick in Python with the one in C++. Thus, both reflect the maximum value that an unsigned 64-bit integer can have.
2013-07-15loader: Load weak symbols for function tracingDeyuan Guo
2013-07-15debug : Fixes the issue wherein Debug symbols were not getting dumped into ↵Umesh Bhaskar
trace files for SE mode
2013-07-11dev: make BasicPioDevice take size in constructorSteve Reinhardt
Instead of relying on derived classes explicitly assigning to the BasicPioDevice pioSize field, require them to pass a size value in to the constructor. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-11dev: consistently end device classes in 'Device'Steve Reinhardt
PciDev and IntDev stuck out as the only device classes that ended in 'Dev' rather than 'Device'. This patch takes care of that inconsistency. Note that you may need to delete pre-existing files matching build/*/python/m5/internal/param_* as scons does not pick up indirect dependencies on imported python modules when generating params, and the PciDev -> PciDevice rename takes place in a file (dev/Device.py) that gets imported quite a bit. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-11dev/arm: get rid of AmbaDev namespaceSteve Reinhardt
It was confusing having an AmbaDev namespace along with an AmbaDevice class. The namespace stuff is now moved in to a new base AmbaDevice class, which is a mixin for classes AmbaPioDevice (the former AmbaDevice) and AmbaDmaDevice to provide the readId function as an inherited member function. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-11devices: make more classes derive from BasicPioDeviceSteve Reinhardt
A couple of devices that have single fixed memory mapped regions were not derived from BasicPioDevice, when that's exactly the functionality that BasicPioDevice provides. This patch gets rid of a little bit of redundant code by making those devices actually do so. Also fixed the weird case of X86ISA::Interrupts, where the class already did derive from BasicPioDevice but didn't actually use all the features it could have. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-11ruby: removed the very old double trigger hackBrad Beckmann
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2013-07-02regressions: update a couple stats.txtNilay Vaish
The statistics for 30.eio-mp, pc-simple-timing-ruby tests are being updated to incorporate the changes due to recent patches.
2013-07-02regressions: update a couple of configsNilay Vaish
The configs for pc-simple-timing-ruby, t1000-simple-atomic had not been updated correctly in the patch 6e6cefc1db1f.
2013-06-28ruby: append transition comment only when in opt/debugNilay Vaish
2013-06-28configs: rearrange the available options in Options.pyNilay Vaish
It also changes the instantiation of physmem in se.py so as to make use of the memory size supplied by the mem_size option.
2013-06-28ruby: network: remove reconfiguration codeNilay Vaish
This code seems not to be of any use now. There is no path in the simulator that allows for reconfiguring the network. A better approach would be to take a checkpoint and start the simulation from the checkpoint with the new configuration.
2013-06-28ruby: check for compatibility between mem size and num dirsNilay Vaish
The configuration scripts provided for ruby assume that the available physical memory is equally distributed amongst the directory controllers. But there is no check to ensure this assumption has been adhered to. This patch adds the required check.
2013-06-27stats: Update stats for monitor, cache and bus changesAndreas Hansson
This patch removes the sparse histogram total from the CommMonitor stats. It also bumps the stats after the unit fixes in the atomic cache access. Lastly, it updates the stats to match the new port ordering. All numbers are the same, and the only thing that changes is which master corresponds to what port index.
2013-06-27mem: Reorganize cache tags and make them a SimObjectPrakash Ramrakhyani
This patch reorganizes the cache tags to allow more flexibility to implement new replacement policies. The base tags class is now a clocked object so that derived classes can use a clock if they need one. Also having deriving from SimObject allows specialized Tag classes to be swapped in/out in .py files. The cache set is now templatized to allow it to contain customized cache blocks with additional informaiton. This involved moving code to the .hh file and removing cacheset.cc. The statistics belonging to the cache tags are now including ".tags" in their name. Hence, the stats need an update to reflect the change in naming.
2013-06-27mem: Remove the cache builderAndreas Hansson
This patch removes the redundant cache builder class.
2013-06-27config: Remove Clock parameter multiplicationAndreas Hansson
This patch removes the multiplication operator support for Clock parameters as this functionality is now achieved by creating derived clock domains. Nate, this one is for you.
2013-06-27sim: Add the notion of clock domains to all ClockedObjectsAkash Bagdia
This patch adds the notion of source- and derived-clock domains to the ClockedObjects. As such, all clock information is moved to the clock domain, and the ClockedObjects are grouped into domains. The clock domains are either source domains, with a specific clock period, or derived domains that have a parent domain and a divider (potentially chained). For piece of logic that runs at a derived clock (a ratio of the clock its parent is running at) the necessary derived clock domain is created from its corresponding parent clock domain. For now, the derived clock domain only supports a divider, thus ensuring a lower speed compared to its parent. Multiplier functionality implies a PLL logic that has not been modelled yet (create a separate clock instead). The clock domains should be used as a mechanism to provide a controllable clock source that affects clock for every clocked object lying beneath it. The clock of the domain can (in a future patch) be controlled by a handler responsible for dynamic frequency scaling of the respective clock domains. All the config scripts have been retro-fitted with clock domains. For the System a default SrcClockDomain is created. For CPUs that run at a different speed than the system, there is a seperate clock domain created. This domain incorporates the CPU and the associated caches. As before, Ruby runs under its own clock domain. The clock period of all domains are pre-computed, such that no virtual functions or multiplications are needed when calling clockPeriod. Instead, the clock period is pre-computed when any changes occur. For this to be possible, each clock domain tracks its children.
2013-06-27config: Add a BaseSESystem builder for re-use in regressionsAndreas Hansson
This patch extends the existing system builders to also include a syscall-emulation builder. This builder is deployed in all syscall-emulation regressions that do not involve Ruby, i.e. o3-timing, simple-timing and simple-atomic, as well as the multi-processor regressions o3-timing-mp, simple-timing-mp and simple-atomic-mp (the latter are only used by SPARC at this point). The values chosen for the cache sizes match those that were used in the existing config scripts (despite being on the large side). Similarly, a mem_class parameter is added to the builder base class to enable simple-atomic to use SimpleMemory and o3-timing to use the default DDR3 configuration. Due to the different order the ports are connected, the bus stats get shuffled around for the multi-processor regressions. A separate patch bumps the port indices. Besides this, all behaviour is exactly the same.
2013-06-27config: Rename clock option to Ruby clockAkash Bagdia
This patch changes the 'clock' option to 'ruby-clock' as it is only used by Ruby.
2013-06-27config: Add a system clock command-line optionAkash Bagdia
This patch adds a 'sys_clock' command-line option and use it to assign clocks to the system during instantiation. As part of this change, the default clock in the System class is removed and whenever a system is instantiated a system clock value must be set. A default value is provided for the command-line option. The configs and tests are updated accordingly.
2013-06-27config: Add a CPU clock command-line optionAkash Bagdia
This patch adds a 'cpu_clock' command-line option and uses the value to assign clocks to components running at the CPU speed (L1 and L2 including the L2-bus). The configuration scripts are updated accordingly. The 'clock' option is left unchanged in this patch as it is still used by a number of components. In follow-on patches the latter will be disambiguated further.
2013-06-27config: Remove redundant explicit setting of default clocksAkash Bagdia
This patch removes the explicit setting of the clock period for certain instances of CoherentBus, NonCoherentBus and IOCache where the specified clock is same as the default value of the system clock. As all the values used are the defaults, there are no performance changes. There are similar cases where the toL2Bus is set to use the parent CPU clock which is already the default behaviour. The main motivation for these simplifications is to ease the introduction of clock domains.
2013-06-27tests: Prune 00.gzip from the regressionsAndreas Hansson
This patch prunes the 00.gzip regressions with the main motivation being that it adds little (or no) coverage and requires a substantial amount of run time. A complete regression run, including compilation from a clean repo, is almost 20% faster(!).
2013-06-27mem: Tidy up the bridge with const and additional checksAndreas Hansson
This patch does a bit of tidying up in the bridge code, adding const where appropriate and also removing redundant checks and adding a few new ones. There are no changes to the behaviour of any regressions.
2013-06-27mem: Fix CommMonitor style and response checkAndreas Hansson
This patch fixes the CommMonitor local variable names, and also introduces a variable to capture if it expects to see a response. The latter check considers both needsResponse and memInhibitAsserted.
2013-06-27mem: Align cache timing to clock edgesAndreas Hansson
This patch changes the cache timing calculations such that the results are aligned to clock edges. Plenty stats change as a results of this patch.
2013-06-27cpu: Consider instructions waiting for FU completion in drainingAndreas Hansson
This patch changes the IEW drain check to include the FU pool as there can be instructions that are "stored" in FU completion events and thus not covered by the existing checks. With this patch, we simply include a check to see if all the FUs are considered non-busy in the next tick. Without this patch, the pc-switcheroo-full regression fails after minor changes to the cache timing (aligning to clock edge).