path: root/src
Age | Commit message | Author
2014-08-13 | cpu: Don't forward declare RefCountingPtr | Andreas Sandberg
RefCountingPtr is sometimes forward declared to avoid having to include refcnt.hh. This does not work since we typically return instances of RefCountingPtr rather than references to instances. The only reason this currently works is that we include refcnt.hh in cprintf.hh, which "leaks" the header to most other source files. This changeset replaces such forward declarations with an include of refcnt.hh.
2014-08-13 | mem: Properly set cache block status fields on writebacks | Mitch Hayenga
When a cacheline is written back to a lower-level cache, tags->insertBlock() sets various status parameters. However, these status bits were cleared immediately after the call. This patch ensures that these status fields are not cleared by moving them outside of the tags->insertBlock() call.
2014-08-13 | cpu: Modernise the branch predictor (STL and C++11) | Andreas Hansson
This patch does some minor housekeeping of the branch predictor by adopting STL containers and switching some iterator loops to range-based for loops. The predictor history is also changed from a list to a deque, as we never do insertion/deletion other than at the front and back.
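The container choice can be illustrated with a small Python sketch. The predictor itself is C++ code; the entry layout and the function names below are hypothetical and only show why a deque fits a history that is touched solely at its two ends.

```python
from collections import deque

# Hypothetical history entries: (sequence number, predicted-taken flag).
# The newest entry lives at the front, the oldest at the back.
history = deque()

def record(seq_num, taken):
    """Add the newest prediction at the front of the history."""
    history.appendleft((seq_num, taken))

def retire_up_to(seq_num):
    """Drop committed entries from the back (oldest first)."""
    while history and history[-1][0] <= seq_num:
        history.pop()

def squash_after(seq_num):
    """Drop mis-speculated entries from the front (newest first)."""
    while history and history[0][0] > seq_num:
        history.popleft()
```

Both ends support constant-time insertion and removal, which is all this access pattern needs.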
2014-03-11 | arm: remove dead code fplib mul64x64 | Curtis Dunham
2014-08-10 | config: Add SubSystem container for simobjects | Geoffrey Blake
This patch adds the SubSystem container for grouping simobjects together in logical subsystems to facilitate building a larger system from constituent parts. The container is simply a non-abstract empty simobject to hold the components that will be connected as its children. The object does not participate in simulation; its only use is during configuration of the system.
2014-08-10 | config: Add hooks to enable new config sys | Geoffrey Blake
This patch adds helper functions to SimObject.py, params.py and simulate.py to enable the new configuration system. Functions like enumerateParams() in SimObject let the config system auto-generate command line options so that simobjects can be modified on the command line. Params in params.py have __call__() added to their definition to allow the argparse module to use them as a type that checks whether command-line input is in the proper format.
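The argparse mechanism this relies on can be sketched as follows. argparse accepts any callable as `type`, so an object with `__call__()` can validate and convert the raw string. `LatencyParam`, the `--latency` option, and the "ns" suffix rule are hypothetical stand-ins, not gem5's actual param classes.

```python
import argparse

class LatencyParam:
    """Hypothetical stand-in for a param class; not gem5's actual code."""

    def __init__(self, default):
        self.default = default

    def __call__(self, text):
        # argparse calls the 'type' object with the raw string; raising
        # ValueError makes argparse report a usage error to the user.
        if not text.endswith("ns"):
            raise ValueError("expected a value such as '10ns'")
        return float(text[:-2])

parser = argparse.ArgumentParser()
param = LatencyParam("1ns")
parser.add_argument("--latency", type=param, default=param.default)

# The string is passed through LatencyParam.__call__ before being stored.
args = parser.parse_args(["--latency", "10ns"])
```

String defaults are passed through the same callable, so the default and the user-supplied value end up with the same type.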
2014-08-10 | cpu: Ensure the traffic generator suppresses non-memory packets | Andreas Hansson
This patch adds a check to ensure that packets which are not going to a memory range are suppressed in the traffic generator. Thus, if a trace is collected in full-system, the packets destined for devices are not played back.
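A minimal sketch of such a check, assuming a trace is a list of records with an address and that the memory ranges are half-open intervals. `AddrRange`, `is_memory_addr`, and `playable_packets` are illustrative names, not the traffic generator's API.

```python
from typing import NamedTuple

class AddrRange(NamedTuple):
    start: int  # inclusive
    end: int    # exclusive

def is_memory_addr(addr, mem_ranges):
    """Return True if the address falls inside any memory range."""
    return any(r.start <= addr < r.end for r in mem_ranges)

def playable_packets(trace, mem_ranges):
    """Yield only the trace entries that target a memory range."""
    for pkt in trace:
        if is_memory_addr(pkt["addr"], mem_ranges):
            yield pkt
```

Entries that target device addresses are simply skipped, so a full-system trace can be replayed against a memory-only setup.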
2014-08-10 | base: Remove unused files | Andreas Hansson
A bit of pruning
2014-07-28 | mem: refactor LRU cache tags and add random replacement tags | Anthony Gutierrez
This patch implements a new tags class that uses a random replacement policy. These tags prefer to evict invalid blocks first; if none are available, a replacement candidate is chosen at random. This patch also factors out the common code in the LRU class and creates a new abstract class, BaseSetAssoc. Any set-associative tag class must implement the functionality related to the actual replacement policy in the following methods: accessBlock(), findVictim(), insertBlock(), and invalidate().
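The victim selection described above can be sketched in a few lines. This is a Python model under simple assumptions (a set is a list of blocks with a `valid` flag); `Block` and `find_victim` are illustrative names and not the C++ classes in the patch.

```python
import random

class Block:
    """Hypothetical cache block with only the fields this sketch needs."""

    def __init__(self, valid=False, tag=None):
        self.valid = valid
        self.tag = tag

def find_victim(cache_set, rng=random):
    """Pick a block to replace: an invalid one if any, else a random one."""
    for blk in cache_set:
        if not blk.valid:
            return blk
    return rng.choice(cache_set)
```

Passing the random source as a parameter keeps the sketch easy to test with a seeded `random.Random` instance.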
2014-07-23 | cpu: `Minor' in-order CPU model | Andrew Bardsley
This patch contains a new CPU model named `Minor'. Minor models a four stage in-order execution pipeline (fetch lines, decompose into macroops, decompose macroops into microops, execute). The model was developed to support the ARM ISA but should be fixable to support all the remaining gem5 ISAs. It currently also works for Alpha, and regressions are included for ARM and Alpha (including Linux boot). Documentation for the model can be found in src/doc/inside-minor.doxygen and its internal operations can be visualised using the Minorview tool utils/minorview.py. Minor was designed to be fairly simple and not to engage in a lot of instruction annotation. As such, it currently has very few gathered stats and may lack other gem5 features. Minor is faster than the o3 model. Sample results:

    Benchmark      |   Stat host_seconds (s)
    ---------------+--------v--------v--------
     (on ARM, opt) | simple | o3     | minor
                   | timing | timing | timing
    ---------------+--------+--------+--------
    10.linux-boot  |    169 |   1883 |   1075
    10.mcf         |    117 |    967 |    491
    20.parser      |    668 |   6315 |   3146
    30.eon         |    542 |   3413 |   2414
    40.perlbmk     |   2339 |  20905 |  11532
    50.vortex      |    122 |   1094 |    588
    60.bzip2       |   2045 |  18061 |   9662
    70.twolf       |    207 |   2736 |   1036
2014-07-19 | syscall emulation: fix fast build issue | Steve Reinhardt
Surprisingly gcc will complain about unused variables even inside an 'if (false)' block. I thought I had tested this previously, but apparently not.
2014-07-18 | x86: make PioBus return BadAddress errors | Binh Pham
Stop setting the use_default_range flag in PioBus in order to have random bad addresses result in a BadAddress response and not a gem5 fatal error. This is necessary in Ruby as Ruby is connected directly to PioBus, so misspeculated addresses will be sent there directly. For the classic memory system, this change has no effect, as bad addresses are caught by the memory bus before being sent to the PioBus. This work was done while Binh was an intern at AMD Research.
2014-07-18 | sim: remove unused MemoryModeStrings array | Steve Reinhardt
The System object has a static MemoryModeStrings array that's (1) unused and (2) redundant, since there's an auto-generated version in the Enums namespace. No point in leaving it in.
2014-07-18 | kern: get rid of unused linux syscall files | Steve Reinhardt
2014-07-18 | syscall emulation: fix DPRINTF arg ordering bug | Steve Reinhardt
When we switched getSyscallArg() from explicit arg indices to the implicit method, some DPRINTF arguments were left as calls to getSyscallArg(), even though C/C++ doesn't guarantee anything about the order of invocation of these calls. As a result, the args could be printed out in arbitrary orders. Interestingly, this bug has been around since 2009: http://repo.gem5.org/gem5/rev/4842482e1bd1
2014-07-09 | base: fix operator== for comparing EthAddr objects | Anthony Gutierrez
This operator uses memcmp() to detect whether two EthAddr objects have the same address; however, memcmp() returns 0 if all bytes are equal. operator== returns the return value of memcmp() to indicate whether or not two addresses are equal. This is incorrect, as it always gives the opposite of the intended behavior. This patch fixes that problem.
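The inverted logic is easy to reproduce in a small model. The sketch below mimics the sign convention of C's memcmp() in Python; `memcmp`, `buggy_eq`, and `fixed_eq` are illustrative helpers and not the code in the patch.

```python
def memcmp(a: bytes, b: bytes) -> int:
    """Mimic C's memcmp(): negative, zero, or positive; zero means equal."""
    return (a > b) - (a < b)

def buggy_eq(a: bytes, b: bytes) -> bool:
    # Treats the memcmp() result as a truth value: 0 (equal) becomes False.
    return bool(memcmp(a, b))

def fixed_eq(a: bytes, b: bytes) -> bool:
    # Equal addresses are exactly those for which memcmp() returns 0.
    return memcmp(a, b) == 0
```

With the buggy version, identical addresses compare as unequal and differing addresses compare as equal, which is the inversion the commit message describes.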
2014-07-02 | base: fix some bugs in EthAddr | Anthony Gutierrez
Per the IEEE 802 spec:
1) fixed broadcast() to ensure that all bytes are equal to 0xff.
2) fixed unicast() to ensure that bit 0 of the first byte is equal to 0.
3) fixed multicast() to ensure that bit 0 of the first byte is equal to 1, and that it is not a broadcast.
Also, the constructors in EthAddr are fixed so that all bytes of data are initialized.
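The three rules above can be written down directly. This is a Python sketch operating on a 6-byte address; the function names are illustrative and do not mirror the EthAddr member functions exactly.

```python
def is_broadcast(addr: bytes) -> bool:
    """All six bytes must be 0xff."""
    return len(addr) == 6 and all(b == 0xFF for b in addr)

def is_unicast(addr: bytes) -> bool:
    """Bit 0 of the first byte is 0."""
    return (addr[0] & 0x01) == 0

def is_multicast(addr: bytes) -> bool:
    """Bit 0 of the first byte is 1, and the address is not broadcast."""
    return (addr[0] & 0x01) == 1 and not is_broadcast(addr)
```

The broadcast address also has bit 0 of the first byte set, which is why the multicast check has to exclude it explicitly.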
2014-07-01 | util: Add DVFS perfLevel to checkpoint upgrade script | Radhika Jagtap
This patch updates the checkpoint upgrader script. It adds the _perfLevel variable in the clock domain and voltage domain simObjects used for DVFS.
2014-06-30 | power: Add basic DVFS support for gem5 | Stephan Diestelhorst
Adds DVFS capabilities to gem5, by allowing users to specify lists for frequencies and voltages in SrcClockDomains and VoltageDomains respectively. A separate component, DVFSHandler, provides a small interface to change operating points of the associated domains. Clock domains will be linked to voltage domains and thus allow separate clock, but shared voltage lines. Currently all the valid performance-level updates are performed with a fixed transition latency as specified for the domain. Config file example:

    ...
    vd = VoltageDomain(voltage = ['1V','0.95V','0.90V','0.85V'])
    tsys.cluster1.clk_domain.clock = ['1GHz','700MHz','400MHz','230MHz']
    tsys.cluster2.clk_domain.clock = ['1GHz','700MHz','400MHz','230MHz']
    tsys.cluster1.clk_domain.domain_id = 0
    tsys.cluster2.clk_domain.domain_id = 1
    tsys.cluster1.clk_domain.voltage_domain = vd
    tsys.cluster2.clk_domain.voltage_domain = vd
    tsys.dvfs_handler.domains = [tsys.cluster1.clk_domain, tsys.cluster2.clk_domain]
    tsys.dvfs_handler.enable = True
2014-06-30 | mem: DRAMPower trace output | Andreas Hansson
This patch adds a DRAMPower flag to enable off-line DRAM power analysis using the DRAMPower tool. A follow-on patch adds a Python script to post-process the output and order it based on time stamps. The long-term goal is to link DRAMPower as a library and provide the commands through function calls to the model rather than first printing and then parsing the commands. At the moment it is also up to the user to ensure that the same DRAM configuration is used by the gem5 controller model and DRAMPower.
2014-06-30 | mem: Add bank and rank indices as fields to the DRAM bank | Andreas Hansson
This patch adds the index of the bank and rank as a field so that we can determine the identity of a given bank (reference or pointer) for the power tracing. We also take the opportunity to clean up the arguments used for identifying the bank when activating.
2014-06-30 | mem: Extend DRAM row bits from 16 to 32 for larger densities | Andreas Hansson
This patch extends the DRAM row bits to 32 to support larger density memories. Additional checks are also added to ensure the row fits in the 32 bits.
2014-06-30 | cpu: implement a bi-mode branch predictor | Anthony Gutierrez
2014-06-21 | x86: fix table walker assertion | Binh Pham
In a single cycle, we could see an R and a W request corresponding to the same page walk being sent to memory. During the cycle in which the assertion fires, we receive the two responses corresponding to that R and W. We also have a 'read' variable that keeps track of the in-flight read request; it is reset to NULL right after we send out any R request, and is set to the next R in the page walk when a response comes back. The issue is that when the response for the W request arrives, assert(!read) fires because the response for the R request arrived just before it and set 'read' to a non-NULL value pointing to the next R request in the page walk. This work was done while Binh was an intern at AMD Research.
2014-06-21 | o3: make dispatch LSQ full check more selective | Binh Pham
Dispatch should not check LSQ size/LSQ stall for non load/store instructions. This work was done while Binh was an intern at AMD Research.
2014-06-21 | o3: split load & store queue full cases in rename | Binh Pham
Check for free entries in the Load Queue and the Store Queue separately to avoid cases where a load cannot be renamed due to a full Store Queue, and vice versa. This work was done while Binh was an intern at AMD Research.
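The split check can be sketched as a small predicate. This is a Python model with hypothetical names (`can_rename`, `free_lq`, `free_sq`); the actual rename stage is C++ and tracks more state than shown here.

```python
def can_rename(is_load, is_store, free_lq, free_sq):
    """Return True if the instruction is not blocked by a full LQ or SQ.

    Each queue is only consulted for the instruction type that uses it.
    """
    if is_load and free_lq == 0:
        return False
    if is_store and free_sq == 0:
        return False
    return True
```

A load is therefore no longer held back by a full Store Queue, and a store is no longer held back by a full Load Queue.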
2014-06-10 | scons: Bump the compiler version to gcc 4.6 and clang 3.0 | Andreas Hansson
This patch bumps the supported version of gcc from 4.4 to 4.6, and clang from 2.9 to 3.0. This enables, amongst other things, range-based for loops, lambda expressions, etc. The STL implementation shipping with 4.6 also has a fully functional implementation of unique_ptr and shared_ptr.
2014-06-09 | sim: More rigorous clocking comments | Joel Hestness
The language describing the clockEdge and nextCycle functions was ambiguous, and so was prone to misinterpretation/misuse. Clear up the comments to more rigorously describe their functionality.
2014-05-31 | style: eliminate equality tests with true and false | Steve Reinhardt
Using '== true' in a boolean expression is totally redundant, and using '== false' is pretty verbose (and arguably less readable in most cases) compared to '!'. It's somewhat of a pet peeve, perhaps, but I had some time waiting for some tests to run and decided to clean these up. Unfortunately, SLICC appears not to have the '!' operator, so I had to leave the '== false' tests in the SLICC code.
2014-05-23 | ruby: slicc: remove unused ids DNUCA* | Nilay Vaish
2014-05-23 | ruby: remove old protocol documentation | Nilay Vaish
2014-05-23 | ruby: message buffer: drop dequeue_getDelayCycles() | Nilay Vaish
The functionality of updating and returning the delay cycles is now performed by the dequeue() function itself.
2014-05-23 | cpu: o3: remove stat totalCommittedInsts | Nilay Vaish
This patch removes the stat totalCommittedInsts. This variable was used for recording the total number of instructions committed across all the threads of a core. The instructions committed by each thread are recorded individually. The total is now generated by summing these individual counts.
2014-05-12 | syscall emulation: clean up & comment SyscallReturn | Steve Reinhardt
2014-05-09 | mem: Update DDR3 and DDR4 based on datasheets | Andreas Hansson
This patch makes a firmer connection between the DDR3-1600 configuration and the corresponding datasheet, and also adds a DDR3-2133 and a DDR4-2400 configuration. At the moment there is also an ongoing effort to align the choice of datasheets to what is available in DRAMPower.
2014-05-09 | mem: Add DRAM cycle time | Andreas Hansson
This patch extends the current timing parameters with the DRAM cycle time. This is needed as the DRAMPower tool expects timestamps in DRAM cycles. At the moment we could get away with doing this in a post-processing step as the DRAMPower execution is separate from the simulation run. However, in the long run we want the tool to be called during the simulation, and then the cycle time is needed.
2014-05-09 | mem: Simplify DRAM response scheduling | Andreas Hansson
This patch simplifies the DRAM response scheduling based on the assumption that they are always returned in order.
2014-05-09 | mem: Add precharge all (PREA) to the DRAM controller | Andreas Hansson
This patch adds the basic ingredients for a precharge all operation, to be used in conjunction with DRAM power modelling. Currently we do not try and apply any cleverness when precharging all banks, thus even if only a single bank is open we use PREA as opposed to PRE. At the moment we only have a single tRP (tRPpb), and do not model the slightly longer all-bank precharge constraint (tRPab).
2014-05-09 | mem: Remove printing of DRAM params | Andreas Hansson
This patch removes the redundant printing of DRAM params.
2014-05-09 | mem: Add tRTP to the DRAM controller | Andreas Hansson
This patch adds the tRTP timing constraint, governing the minimum time between a read command and a precharge. Default values are provided for the existing DRAM types.
2014-05-09 | mem: Merge DRAM latency calculation and bank state update | Andreas Hansson
This patch merges the two control paths used to estimate the latency and update the bank state. As a result of this merging the computation is now in one place only, and should be easier to follow as it is all done in absolute (rather than relative) time. As part of this change, the scheduling is also refined to ensure that we look at a sensible estimate of the bank ready time in choosing the next request. The bank latency stat is removed as it ends up being misleading when the DRAM access code gets evaluated ahead of time (due to the eagerness of waking the model up for scheduling the next request).
2014-05-09 | mem: Add tWR to DRAM activate and precharge constraints | Andreas Hansson
This patch adds the write recovery time to the DRAM timing constraints, and changes the current tRASDoneAt to a more generic preAllowedAt, capturing when a precharge is allowed to take place. The part of the DRAM access code that accounts for the precharge and activate constraints is updated accordingly.
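One way to picture a single "precharge allowed at" timestamp is the Python sketch below. It assumes times are plain integer ticks and that each constraint simply pushes the timestamp later; `Bank`, `on_activate`, `on_write`, and `precharge_tick` are hypothetical names, and how the real controller combines these terms is not spelled out in the commit message.

```python
from dataclasses import dataclass

@dataclass
class Bank:
    pre_allowed_at: int = 0  # earliest tick at which a precharge may be issued

def on_activate(bank, act_tick, tRAS):
    """An activate keeps the row open for at least tRAS."""
    bank.pre_allowed_at = max(bank.pre_allowed_at, act_tick + tRAS)

def on_write(bank, write_done_tick, tWR):
    """A write must be followed by the write recovery time tWR."""
    bank.pre_allowed_at = max(bank.pre_allowed_at, write_done_tick + tWR)

def precharge_tick(bank, requested_tick):
    """Return the tick at which a requested precharge can actually happen."""
    return max(requested_tick, bank.pre_allowed_at)
```

Keeping one timestamp per bank means further constraints of the same kind can be added as extra `max()` updates without changing the precharge path.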
2014-05-09 | mem: Merge DRAM page-management calculations | Andreas Hansson
This patch treats the closed page policy as yet another case of auto-precharging, and thus merges the code with that used for the other policies.
2014-05-09 | mem: Add DRAM power states to the controller | Andreas Hansson
This patch adds power states to the controller. These states and the transitions can be used together with the Micron power model. As a more elaborate use-case, the transitions can be used to drive the DRAMPower tool. At the moment, the power-down modes are not used, and this patch simply serves to capture the idle, auto refresh and active modes. The patch adds a third state machine that interacts with the refresh state machine.
2014-05-09 | mem: Ensure DRAM refresh respects timings | Andreas Hansson
This patch adds a state machine for the refresh scheduling to ensure that no accesses are allowed while the refresh is in progress, and that all banks are properly precharged. As part of this change, the precharging of banks is broken out into a method of its own, making it similar to how activations are dealt with. The idle accounting is also updated to ensure that the refresh duration is not added to the time that the DRAM is in the idle state with all banks precharged.
2014-05-09 | mem: Make DRAM read/write switching less conservative | Andreas Hansson
This patch changes the read/write event loop to use a single event (nextReqEvent), along with a state variable, thus joining the two control flows. This change makes it easier to follow the state transitions, and control what happens when. With the new loop we modify the overly conservative switching times such that the write-to-read switch allows bank preparation to happen in parallel with the bus turn around. Similarly, the read-to-write switch uses the introduced tRTW constraint.
2014-04-17 | arm: Make sure UndefinedInstructions are properly initialized | Ali Saidi
2014-04-17 | arm: allow DC instructions by default so SE mode works | Ali Saidi
2014-04-17 | sim, arm: implement more of the at variety syscalls | Ali Saidi
Needed for new AArch64 binaries
2014-05-09 | cpu: Useful getters for ActivityRecorder | Andrew Bardsley
Add some useful getters to ActivityRecorder