Age | Commit message (Collapse) | Author |
|
This patch allows the KVM CPU module to initialize it's MSRs by
enumerating the MSRs in the gem5 x86 implementation.
|
|
In order to support hardware virtualization, we need to be able to
check if there are any interrupts pending irregardless of the
rflags.intf value. This changeset adds the checkInterruptsRaw() method
to the x86 interrupt control. It returns true if there are pending
interrupts that can be delivered as soon as the CPU is ready for
interrupt delivery.
|
|
This patch allows a hardware virtualized CPU to discover which interrupt
to deliver to the guest.
|
|
The Topology source sets up input and output buffers for each of the external
nodes of a topology by indexing on Ruby's generated controller unique IDs.
These unique IDs are found by adding the MachineType_base_number to the version
number of each controller (see any generated *_Controller.cc - init() calls
getToNetQueue and getFromNetQueue using m_version + base). However, the
Topology object used the cntrl_id - which is required to be unique across all
controllers - to index the controllers list as they are being connected to
their input and output buffers. If the cntrl_ids did not match the Ruby unique
ID, the throttles end up connected to incorrectly indexed nodes in the network,
resulting in packets traversing incorrect network paths. This patch fixes the
Topology indexing scheme by using the Ruby unique ID to match that of the
SimpleNetwork buffer vectors.
|
|
Previously, the LSQ would instantiate MaxThreads LSQUnits in the body of it's
object, but it would only initialize numThreads LSQUnits as specified by the
user. This had the effect of leaving some LSQUnits uninitialized when the
number of threads was less than MaxThreads, and when adding statistics to the
LSQUnit that must be initialized, this caused the stats initialization check to
fail. By dynamically instantiating LSQUnits, they are all initialized and this
avoids uninitialized LSQUnits from floating around during runtime.
|
|
The previous changeset (9863:9483739f83ee) used STL vector containers to
dynamically allocate stats in the Ruby SimpleNetwork, Switch and Throttle. For
gcc versions before at least 4.6.3, this causes the standard vector allocator
to call Stats copy constructors (a no-no, since stats should be allocated in
the body of each SimObject instance). Since the size of these stats arrays is
known at compile time (NOTE: after code generation), this patch changes their
allocation to be static rather than using an STL vector.
|
|
gcc/g++ 4.4.7 complained about the operator= being undefined.
This changeset adds the operator.
|
|
|
|
|
|
This message size type does not work well with one of the statistical
variables. It also seems unnecessary.
|
|
|
|
|
|
|
|
|
|
This patch ensures the NULL ISA can build without causing issues with
an unused variable.
|
|
This patch adds the config ini string as a tooltip that can be
displayed in most browsers rendering the resulting svg. Certain
characters are modified for HTML output.
Tested on chrome and firefox.
|
|
This patch is adding a splash of colour to the dot output to make it
easier to distinguish objects of different types. As a bonus, the
pastel-colour palette also makes the output look like a something from
the 21st century.
|
|
This patch adds the class name to the label, creates some more space
by increasing the rank separation, and additionally outputs the graph
as an editable SVG in addition to the PDF.
|
|
This patch makes it possible to once again build gem5 without any
ISA. The main purpose is to enable work around the interconnect and
memory system without having to build any CPU models or device models.
The regress script is updated to include the NULL ISA target. Currently
no regressions make use of it, but all the testers could (and perhaps
should) transition to it.
--HG--
rename : build_opts/NOISA => build_opts/NULL
rename : src/arch/noisa/SConsopts => src/arch/null/SConsopts
rename : src/arch/noisa/cpu_dummy.hh => src/arch/null/cpu_dummy.hh
rename : src/cpu/intr_control.cc => src/cpu/intr_control_noisa.cc
|
|
The branch predictor is guarded by having either the in-order or
out-of-order CPU as one of the available CPU models and therefore
should not be used in the BaseCPU. This patch moves the parameter to
the relevant CPU classes.
|
|
This patch is a first step to getting NOISA working again. A number of
redundant includes make life more difficult than it has to be and this
patch simply removes them. There are also some redundant forward
declarations removed.
|
|
This patch moves the system virtual port proxy to the Alpha system
only to make the resurrection of the NOISA slightly less
painful. Alpha is the only ISA that is actually using it.
|
|
This patch changes the SConscript to build gem5 with libc++ on OSX as
the conventional libstdc++ does not have the C++11 constructs that the
current code base makes use of (e.g. std::forward).
Since this was the last use of the transitional TR1, the unordered map
and set header can now be simplified as well.
|
|
This patch fixes up the comment formatting for isDrained in the timing
CPU.
|
|
This patch changes how the initialisation of the VectorPrint struct is
done so that gcc 4.4 is happy again.
|
|
This patch updates the stats to reflect the: 1) addition of the
internal queue in SimpleMemory, 2) moving of the memory class outside
FSConfig, 3) fixing up of the 2D vector printing format, 4) specifying
burst size and interface width for the DRAM instead of relying on
cache-line size, 5) performing merging in the DRAM controller write
buffer, and 6) fixing how idle cycles are counted in the atomic and
timing CPU models.
The main reason for bundling them up is to minimise the changeset
size.
|
|
Added a couple missing updates to the notIdleFraction stat. Without
these, it sometimes gives a (not) idle fraction that is greater than 1
or less than 0.
|
|
This patch adds support for specifying multi-channel memory
configurations on the command line, e.g. 'se/fs.py
--mem-type=ddr3_1600_x64 --mem-channels=4'. To enable this, it
enhances the functionality of MemConfig and moves the existing
makeMultiChannel class method from SimpleDRAM to the support scripts.
The se/fs.py example scripts are updated to make use of the new
feature.
|
|
This patch changes the default parameter value of conf_table_reported
to match the common case. It also simplifies the regression and config
scripts to reflect this change.
|
|
This patch addresses an issue with trace playback in the TrafficGen
where the trace was reset but the header was not read from the trace
when a captured trace was played back for a second time. This resulted
in parsing errors as the expected message was not found in the trace
file.
The header check is moved to an init funtion which is called by the
constructor and when the trace is reset. This ensures that the trace
header is read each time when the trace is replayed.
This patch also addresses a small formatting issue in a panic.
|
|
This patch changes the data structure used for the DRAM read, write
and response queues from an STL list to deque. This optimisation is
based on the observation that the size is small (and fixed), and that
the structures are frequently iterated over in a linear fashion.
|
|
This patch implements basic write merging in the DRAM to avoid
redundant bursts. When a new access is added to the queue it is
compared against the existing entries, and if it is either
intersecting or immediately succeeding/preceeding an existing item it
is merged.
There is currently no attempt made at avoiding iterating over the
existing items in determining whether merging is possible or not.
|
|
This patch gets rid of bytesPerCacheLine parameter and makes the DRAM
configuration separate from cache line size. Instead of
bytesPerCacheLine, we define a parameter for the DRAM called
burst_length. The burst_length parameter shows the length of a DRAM
device burst in bits. Also, lines_per_rowbuffer is replaced with
device_rowbuffer_size to improve code portablity.
This patch adds a burst length in beats for each memory type, an
interface width for each memory type, and the memory controller model
is extended to reason about "system" packets vs "dram" packets and
assemble the responses properly. It means that system packets larger
than a full burst are split into multiple dram packets.
|
|
This patch modifies the SimpleTimingCPU drain check to also consider
the fetch event. Previously, there was an assumption that there is
never a fetch event scheduled if the CPU is not executing
microcode. However, when a context is activated, a fetch even is
scheduled, and microPC() is zero.
|
|
This patch adds a check to the quiesce operation to ensure that the
CPU does not suspend itself when there are unmasked interrupts
pending. Without this patch there are corner cases when the CPU gets
an interrupt before the quiesce is executed and then never wakes up
again.
|
|
This patch addresses an issue with the text-based stats output which
resulted in Vector2D stats being printed without subnames in the event
that one of the dimensions was of length 1.
This patch also fixes the total printing for the 2D vector. Previously
totals were printed without explicitly stating that a total was being
printed. This has been rectified in this patch.
|
|
This patch adds the notion of voltage domains, and groups clock
domains that operate under the same voltage (i.e. power supply) into
domains. Each clock domain is required to be associated with a voltage
domain, and the latter requires the voltage to be explicitly set.
A voltage domain is an independently controllable voltage supply being
provided to section of the design. Thus, if you wish to perform
dynamic voltage scaling on a CPU, its clock domain should be
associated with a separate voltage domain.
The current implementation of the voltage domain does not take into
consideration cases where there are derived voltage domains running at
ratio of native voltage domains, as with the case where there can be
on-chip buck/boost (charge pumps) voltage regulation logic.
The regression and configuration scripts are updated with a generic
voltage domain for the system, and one for the CPUs.
|
|
Until the performance bug is fixed, avoid killing simulations.
|
|
This patch fixes an issue where an activation limit of 0 was not
allowed. With this patch, setting the limit to 0 simply disables the
tXAW constraint.
|
|
This patch adds a packet queue in SimpleMemory to avoid using the
packet queue in the port (and thus have no involvement in the flow
control). The port queue was bound to 100 packets, and as the
SimpleMemory is modelling both a controller and an actual RAM, it
potentially has a large number of packets in flight. There is
currently no limit on the number of packets in the memory controller,
but this could easily be added in a follow-on patch.
As a result of the added internal storage, the functional access and
draining is updated. Some minor cleaning up and renaming has also been
done.
The memtest regression changes as a result of this patch and the stats
will be updated.
|
|
This patch fixes a bug in the O3 fetch stage that was introduced when
the cache line size was moved to the system. By mistake, the
initialisation and resetting of the fetch stage was merged and put in
the constructor. The resetting is now re-added where it should be.
|
|
These constructs are not in use and are not being maintained by any one.
In addition, it is not known if doubleTrigger works correctly with Ruby now.
|
|
Some of the code in StateMachine.py file is added to all the controllers and
is independent of the controller definition. This code is being moved to the
AbstractController class which is the parent class of all controllers.
|
|
This patch adds checkpointing support to x86 tlb. It upgrades the
cpt_upgrader.py script so that previously created checkpoints can
be updated. It moves the checkpoint version to 6.
|
|
Remove unused virtual getBranchPred() method from BaseCPU as it is not
implemented by any of the CPU models. It used to always return NULL.
|
|
This patch removes the notion of a peer block size and instead sets
the cache line size on the system level.
Previously the size was set per cache, and communicated through the
interconnect. There were plenty checks to ensure that everyone had the
same size specified, and these checks are now removed. Another benefit
that is not yet harnessed is that the cache line size is now known at
construction time, rather than after the port binding. Hence, the
block size can be locally stored and does not have to be queried every
time it is used.
A follow-on patch updates the configuration scripts accordingly.
|
|
Make valgrind a little bit happier
|
|
This patch aligns the MaxTick in Python with the one in C++. Thus,
both reflect the maximum value that an unsigned 64-bit integer can
have.
|
|
|
|
trace files for SE mode
|