Age | Commit message (Collapse) | Author |
|
This patch changes the name of a bunch of packet flags and MSHR member
functions and variables to make the coherency protocol easier to
understand. In addition the patch adds and updates lots of
descriptions, explicitly spelling out assumptions.
The following name changes are made:
* the packet memInhibit flag is renamed to cacheResponding
* the packet sharedAsserted flag is renamed to hasSharers
* the packet NeedsExclusive attribute is renamed to NeedsWritable
* the packet isSupplyExclusive is renamed responderHadWritable
* the MSHR pendingDirty is renamed to pendingModified
The cache states, Modified, Owned, Exclusive, Shared are also called
out in the cache and MSHR code to make it easier to understand.
|
|
|
|
Move the IDE controller and the disk implementations to
src/dev/storage.
--HG--
rename : src/dev/DiskImage.py => src/dev/storage/DiskImage.py
rename : src/dev/Ide.py => src/dev/storage/Ide.py
rename : src/dev/SimpleDisk.py => src/dev/storage/SimpleDisk.py
rename : src/dev/disk_image.cc => src/dev/storage/disk_image.cc
rename : src/dev/disk_image.hh => src/dev/storage/disk_image.hh
rename : src/dev/ide_atareg.h => src/dev/storage/ide_atareg.h
rename : src/dev/ide_ctrl.cc => src/dev/storage/ide_ctrl.cc
rename : src/dev/ide_ctrl.hh => src/dev/storage/ide_ctrl.hh
rename : src/dev/ide_disk.cc => src/dev/storage/ide_disk.cc
rename : src/dev/ide_disk.hh => src/dev/storage/ide_disk.hh
rename : src/dev/ide_wdcreg.h => src/dev/storage/ide_wdcreg.h
rename : src/dev/simple_disk.cc => src/dev/storage/simple_disk.cc
rename : src/dev/simple_disk.hh => src/dev/storage/simple_disk.hh
|
|
--HG--
rename : src/dev/Ethernet.py => src/dev/net/Ethernet.py
rename : src/dev/etherbus.cc => src/dev/net/etherbus.cc
rename : src/dev/etherbus.hh => src/dev/net/etherbus.hh
rename : src/dev/etherdevice.cc => src/dev/net/etherdevice.cc
rename : src/dev/etherdevice.hh => src/dev/net/etherdevice.hh
rename : src/dev/etherdump.cc => src/dev/net/etherdump.cc
rename : src/dev/etherdump.hh => src/dev/net/etherdump.hh
rename : src/dev/etherint.cc => src/dev/net/etherint.cc
rename : src/dev/etherint.hh => src/dev/net/etherint.hh
rename : src/dev/etherlink.cc => src/dev/net/etherlink.cc
rename : src/dev/etherlink.hh => src/dev/net/etherlink.hh
rename : src/dev/etherobject.hh => src/dev/net/etherobject.hh
rename : src/dev/etherpkt.cc => src/dev/net/etherpkt.cc
rename : src/dev/etherpkt.hh => src/dev/net/etherpkt.hh
rename : src/dev/ethertap.cc => src/dev/net/ethertap.cc
rename : src/dev/ethertap.hh => src/dev/net/ethertap.hh
rename : src/dev/i8254xGBe.cc => src/dev/net/i8254xGBe.cc
rename : src/dev/i8254xGBe.hh => src/dev/net/i8254xGBe.hh
rename : src/dev/i8254xGBe_defs.hh => src/dev/net/i8254xGBe_defs.hh
rename : src/dev/multi_etherlink.cc => src/dev/net/multi_etherlink.cc
rename : src/dev/multi_etherlink.hh => src/dev/net/multi_etherlink.hh
rename : src/dev/multi_iface.cc => src/dev/net/multi_iface.cc
rename : src/dev/multi_iface.hh => src/dev/net/multi_iface.hh
rename : src/dev/multi_packet.cc => src/dev/net/multi_packet.cc
rename : src/dev/multi_packet.hh => src/dev/net/multi_packet.hh
rename : src/dev/ns_gige.cc => src/dev/net/ns_gige.cc
rename : src/dev/ns_gige.hh => src/dev/net/ns_gige.hh
rename : src/dev/ns_gige_reg.h => src/dev/net/ns_gige_reg.h
rename : src/dev/pktfifo.cc => src/dev/net/pktfifo.cc
rename : src/dev/pktfifo.hh => src/dev/net/pktfifo.hh
rename : src/dev/sinic.cc => src/dev/net/sinic.cc
rename : src/dev/sinic.hh => src/dev/net/sinic.hh
rename : src/dev/sinicreg.hh => src/dev/net/sinicreg.hh
rename : src/dev/tcp_iface.cc => src/dev/net/tcp_iface.cc
rename : src/dev/tcp_iface.hh => src/dev/net/tcp_iface.hh
|
|
--HG--
rename : src/dev/I2C.py => src/dev/i2c/I2C.py
rename : src/dev/i2cbus.cc => src/dev/i2c/bus.cc
rename : src/dev/i2cbus.hh => src/dev/i2c/bus.hh
rename : src/dev/i2cdev.hh => src/dev/i2c/device.hh
|
|
--HG--
rename : src/dev/CopyEngine.py => src/dev/pci/CopyEngine.py
rename : src/dev/copy_engine.cc => src/dev/pci/copy_engine.cc
rename : src/dev/copy_engine.hh => src/dev/pci/copy_engine.hh
rename : src/dev/copy_engine_defs.hh => src/dev/pci/copy_engine_defs.hh
|
|
Move pcidev.(hh|cc) to src/dev/pci/device.(hh|cc) and update existing
devices to use the new header location. This also renames the PCIDEV
debug flag to have a capitalization that is consistent with the PCI
host and other devices.
--HG--
rename : src/dev/Pci.py => src/dev/pci/PciDevice.py
rename : src/dev/pcidev.cc => src/dev/pci/device.cc
rename : src/dev/pcidev.hh => src/dev/pci/device.hh
rename : src/dev/pcireg.h => src/dev/pci/pcireg.h
|
|
Previous ARM-based simulations were limited to 8 cores due to
limitations in GICv2 and earlier. This changeset adds a set of
gem5-specific extensions that enable support for up to 256 cores.
When the gem5 extensions are enabled, the GIC uses CPU IDs instead of
a CPU bitmask in the GIC's register interface. To OS can enable the
extensions by setting bit 0x200 in ICDICTR.
This changeset is based on previous work by Matt Evans.
|
|
The gem5's current PCI host functionality is very ad hoc. The current
implementations require PCI devices to be hooked up to the
configuration space via a separate configuration port. Devices query
the platform to get their config-space address range. Un-mapped parts
of the config space are intercepted using the XBar's default port
mechanism and a magic catch-all device (PciConfigAll).
This changeset redesigns the PCI host functionality to improve code
reuse and make config-space and interrupt mapping more
transparent. Existing platform code has been updated to use the new
PCI host and configured to stay backwards compatible (i.e., no
guest-side visible changes). The current implementation does not
expose any new functionality, but it can easily be extended with
features such as automatic interrupt mapping.
PCI devices now register themselves with a PCI host controller. The
host controller interface is defined in the abstract base class
PciHost. Registration is done by PciHost::registerDevice() which takes
the device, its bus position (bus/dev/func tuple), and its interrupt
pin (INTA-INTC) as a parameter. The registration interface returns a
PciHost::DeviceInterface that the PCI device can use to query memory
mappings and signal interrupts.
The host device manages the entire PCI configuration space. Accesses
to devices decoded into the devices bus position and then forwarded to
the correct device.
Basic PCI host functionality is implemented in the GenericPciHost base
class. Most platforms can use this class as a basic PCI controller. It
provides the following functionality:
* Configurable configuration space decoding. The number of bits
dedicated to a device is a prameter, making it possible to support
both CAM, ECAM, and legacy mappings.
* Basic interrupt mapping using the interruptLine value from a
device's configuration space. This behavior is the same as in the
old implementation. More advanced controllers can override the
interrupt mapping method to dynamically assign host interrupts to
PCI devices.
* Simple (base + addr) remapping from the PCI bus's address space to
physical addresses for PIO, memory, and DMA.
|
|
The HDLCD model implements a workaround that swaps the red and blue
channels. This works around an issue in certain old kernels. The new
driver doesn't seem to have this behavior, so disable the workaround
by default and enable it in the affected platforms.
|
|
Devices behind the Versatile Express configuration controllers are
currently all lumped into one SimObject. This will make DTB generation
challenging since the DTB assumes them to be in different parts of the
hierarchy. It also makes it hard to model other CoreTiles without also
replicating devices from the motherboard.
This changeset splits the VExpressCoreTileCtrl into two subsystems:
VExpressMCC for all motherboard-related devices and CoreTile2A15DCC
for Core Tile specific devices.
|
|
The MaltaPChip class is currently unused and identical (except for the
class name) to the TsunamiPChip. If someone decides to implement PCI
for Malta, they should make sure to share code with the Tsunami
implementation if they are similar.
|
|
--HG--
extra : rebase_source : 64046371962e98413757bc3ab0c0d48dfb11ff1e
|
|
The flash model has typos in its serialization code for
unknownPages, locationTable, blockValidEntries, and blockEmptyEntries
arrays where it would save each entry in the array under the same
name in the checkpoint. This patch fixes these typos.
|
|
The IICRPR register in the GIC is currently not being initialized when
the GIC is instantiated. Initialize to the value mandated by the
architecture specification.
|
|
This patch adds very basic checkpoint support for the VirtIO9PProxy
device. Previously, attempts to checkpoint gem5 with a present 9P
device caused gem5 to fatal as none of the state is tracked. We still
do not track any state, but we replace the fatal with a warning which
is triggered if the device has been used by the guest system. In the
event that it has not been used, we assume that no state is lost
during checkpointing. The warning is triggered on both a serialize and
an unserialize to ensure maximum visibility for the user.
|
|
Devices should never need to include dev/pciconfall.hh.
--HG--
extra : amend_source : 3a6e56485d432b49e2af22407982fa785c0ccb68
|
|
Cleanup PCI devices to avoid using the PciDevice::platform pointer
directly. The PCI-specific functionality provided by the Platform
should be accessed through the wrappers in PciDevice.
|
|
This patch updates the I/O devices, bridge and simple memory to take
the packet header and payload delay into account in their latency
calculations. In all cases we add the header delay, i.e. the
accumulated pipeline delay of any crossbars, and the payload delay
needed for deserialisation of any payload.
Due to the additional unknown latency contribution, the packet queue
of the simple memory is changed to use insertion sorting based on the
time stamp. Moreover, since the memory hands out exclusive (non
shared) responses, we also need to ensure ordering for reads to the
same address.
|
|
Fix a bug in which the flash device would write out of bounds and
could either trigger a segfault and corrupt the memory of other
objects. This was caused by using pageSize in the place of
pagesPerBlock when running the garbage collector.
Also, added an assert to flag this condition in the future.
|
|
This patch fixes the drain logic for the UFSHostDevice and the
FlashDevice. In the case of the FlashDevice, the logic for CheckDrain
needed to be reversed, whilst in the case of the UFSHostDevice check
drain was never being called. In both cases the system would never
complete draining if the initial attampt to drain failed.
|
|
Make clang >= 3.5 happy when compiling build/X86/gem5.opt on OSX.
|
|
Make clang >= 3.5 happy when compiling build/ARM/gem5.opt on OSX.
|
|
This patch adds explicit overrides as this is now required when using
"-Wall" with clang >= 3.5, the latter now part of the most recent
XCode. The patch consequently removes "virtual" for those methods
where "override" is added. The latter should be enough of an
indication.
As part of this patch, a few minor issues that clang >= 3.5 complains
about are also resolved (unused methods and variables).
|
|
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap
(and similar) abstractions, as these are no longer needed with gcc 4.7
and clang 3.1 as minimum compiler versions.
|
|
Adds per-thread interrupt controllers and thread/context logic
so that interrupts properly get routed in SMT systems.
|
|
IntDevice::recvResponse is called from two places in current mainline: (1) the
short circuit path of X86ISA::IntDevice::IntMasterPort::sendMessage for atomic
mode, and (2) the full request->response path to and from the x86 interrupts
device (finally called from MessageMasterPort::recvTimingResp). In the former
case, the packet was deleted correctly, but in the latter case, the packet and
request leak. To fix the leak, move request and packet deletion into IntDevice
inherited class implementations of recvResponse.
|
|
Handle bad IDE disk image size 0. When image size is 0, gem5 will cause an
exception with log "Floating point exception (core dumped)".
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
|
Add a stat that counts buffer underruns in the HDLCD controller. The
stat counts at most one underrun per frame since the controller aborts
the current frame if it underruns.
|
|
Rewrite the HDLCD controller to use the new DMA engine and pixel
pump. This fixes several bugs in the current implementation:
* Broken/missing interrupt support (VSync, underrun, DMA end)
* Fragile resolution changes (changing resolutions used
to cause assertion errors).
* Support for resolutions with a width that isn't divisible by 32.
* The pixel clock can now be set dynamically.
This breaks checkpoint compatibility. Checkpoints can be upgraded with
the checkpoint conversion script. However, upgraded checkpoints won't
contain the state of the current frame. That means that HDLCD
controllers restoring from a converted checkpoint immediately start
drawing a new frame (i.e, expect timing differences).
|
|
EtherLink currently uses a fire-and-forget link delay event that
delays sending of packets by a fixed number of ticks. In order to
serialize this event, it relies on the event queue's auto
serialization support. However, support for event auto serialization
has been broken for more than two years, which means that checkpoints
of multi-system setups are likely to drop in-flight packets.
This changeset the replaces rewrites this part of the EtherLink to use
a packet queue instead. The queue contains a (tick, packet) tuple. The
tick indicates when the packet will be ready. Instead of relying on
event autoserialization, we now explicitly serialize the packet queue
in the EhterLink::Link class.
Note that this changeset changes the way in-flight packages are
serialized. Old checkpoints will still load, but in-flight packets
will be dropped (just as before). There has been no attempt to upgrade
checkpoints since this would actually change the behavior of existing
checkpoints.
|
|
Timing generator for a pixel-based display. The timing generator is
intended for display processors driving a standard rasterized
display. The simplest possible display processor needs to derive from
this class and override the nextPixel() method to feed the display
with pixel data.
Pixels are ordered relative to the top left corner of the
display. Scan lines appear in the following order:
* Vertical Sync (starting at line 0)
* Vertical back porch
* Visible lines
* Vertical front porch
Pixel order within a scan line:
* Horizontal Sync
* Horizontal Back Porch
* Visible pixels
* Horizontal Front Porch
All events in the timing generator are automatically suspended on a
drain() request and restarted on drainResume(). This is conceptually
equivalent to clock gating when the pixel clock while the system is
draining. By gating the pixel clock, we prevent display controllers
from disturbing a memory system that is about to drain.
|
|
Add support for oscillators that can be programmed using the RealView
/ Versatile Express configuration interface. These oscillators are
typically used for things like the pixel clock in the display
controller.
The default configurations support the oscillators from a Versatile
Express motherboard (V2M-P1) with a CoreTile Express A15x2.
|
|
Add a simple DMA engine that sits behind a FIFO. This engine can be
used by devices that need to read large amounts of data (e.g., display
controllers). Most aspects of the controller, such as FIFO size,
maximum number of in-flight accesses, and maximum request sizes can be
configured.
The DMA copies blocks of data into its FIFO. Transfers are initiated
with a call to startFill() command that takes a start address and a
size. Advanced users can create a derived class that overrides the
onEndOfBlock() callback that is triggered when the last request to a
block has been issued. At this point, the DMA engine is ready to start
fetching a new block of data, potentially from a different address
range.
The DMA engine stops issuing new requests while it is draining. Care
must be taken to ensure that devices that are fed by a DMA engine are
suspended while the system is draining to avoid buffer underruns.
|
|
The CircleBuf class has at least one bug causing it to overwrite the
wrong elements when wrapping. The current code has a lot of unused
functionality and duplicated code. This changeset replaces the old
implementation with a new version that supports serialization and
arbitrary types in the buffer (not just char).
|
|
The i8042 device drops the contents of a PS2 device's buffer when
serializing, which results in corrupted PS2 state when continuing
simulation after a checkpoint. This changeset fixes this bug and
transitions the i8042 model to use the new serialization API that
requires the serialize() method to be const.
|
|
This changeset transitions the Sinic device to the new serialization
framework that requires the serialization method to be constant.
|
|
Context IDs used to be declared as ad hoc (usually as int). This
changeset introduces a typedef for ContextIDs and a constant for
invalid context IDs.
|
|
Multi gem5 is an extension to gem5 to enable parallel simulation of a
distributed system (e.g. simulation of a pool of machines
connected by Ethernet links). A multi gem5 run consists of seperate gem5
processes running in parallel (potentially on different hosts/slots on
a cluster). Each gem5 process executes the simulation of a component of the
simulated distributed system (e.g. a multi-core board with an Ethernet NIC).
The patch implements the "distributed" Ethernet link device
(dev/src/multi_etherlink.[hh.cc]). This device will send/receive
(simulated) Ethernet packets to/from peer gem5 processes. The interface
to talk to the peer gem5 processes is defined in dev/src/multi_iface.hh and
in tcp_iface.hh.
There is also a central message server process (util/multi/tcp_server.[hh,cc])
which acts like an Ethernet switch and transfers messages among the gem5 peers.
A multi gem5 simulations can be kicked off by the util/multi/gem5-multi.sh
wrapper script.
Checkpoints are supported by multi-gem5. The checkpoint must be
initiated by a single gem5 process. E.g., the gem5 process with rank 0
can take a checkpoint from the bootscript just before it invokes
'mpirun' to launch an MPI test. The message server process will notify
all the other peer gem5 processes and make them take a checkpoint, too
(after completing a global synchronisation to ensure that there are no
inflight messages among gem5).
|
|
Add a simple device shim that interfaces with the NoMali model
library. The gem5 side of the interface supports Mali T60x/T62x/T760
GPUs. This device model pretends to be a Mali GPU, but doesn't render
anything and executes in zero time.
|
|
The drain() call currently passes around a DrainManager pointer, which
is now completely pointless since there is only ever one global
DrainManager in the system. It also contains vestiges from the time
when SimObjects had to keep track of their child objects that needed
draining.
This changeset moves all of the DrainState handling to the Drainable
base class and changes the drain() and drainResume() calls to reflect
this. Particularly, the drain() call has been updated to take no
parameters (the DrainManager argument isn't needed) and return a
DrainState instead of an unsigned integer (there is no point returning
anything other than 0 or 1 any more). Drainable objects should return
either DrainState::Draining (equivalent to returning 1 in the old
system) if they need more time to drain or DrainState::Drained
(equivalent to returning 0 in the old system) if they are already in a
consistent state. Returning DrainState::Running is considered an
error.
Drain done signalling is now done through the signalDrainDone() method
in the Drainable class instead of using the DrainManager directly. The
new call checks if the state of the object is DrainState::Draining
before notifying the drain manager. This means that it is safe to call
signalDrainDone() without first checking if the simulator has
requested draining. The intention here is to reduce the code needed to
implement draining in simple objects.
|
|
Draining is currently done by traversing the SimObject graph and
calling drain()/drainResume() on the SimObjects. This is not ideal
when non-SimObjects (e.g., ports) need draining since this means that
SimObjects owning those objects need to be aware of this.
This changeset moves the responsibility for finding objects that need
draining from SimObjects and the Python-side of the simulator to the
DrainManager. The DrainManager now maintains a set of all objects that
need draining. To reduce the overhead in classes owning non-SimObjects
that need draining, objects inheriting from Drainable now
automatically register with the DrainManager. If such an object is
destroyed, it is automatically unregistered. This means that drain()
and drainResume() should never be called directly on a Drainable
object.
While implementing the new functionality, the DrainManager has now
been made thread safe. In practice, this means that it takes a lock
whenever it manipulates the set of Drainable objects since SimObjects
in different threads may create Drainable objects
dynamically. Similarly, the drain counter is now an atomic_uint, which
ensures that it is manipulated correctly when objects signal that they
are done draining.
A nice side effect of these changes is that it makes the drain state
changes stricter, which the simulation scripts can exploit to avoid
redundant drains.
|
|
The drain state enum is currently a part of the Drainable
interface. The same state machine will be used by the DrainManager to
identify the global state of the simulator. Make the drain state a
global typed enum to better cater for this usage scenario.
|
|
Events expected to be unserialized using an event-specific
unserializeEvent call. This call was never actually used, which meant
the events relying on it never got unserialized (or scheduled after
unserialization).
Instead of relying on a custom call, we now use the normal
serialization code again. In order to schedule the event correctly,
the parrent object is expected to use the
EventQueue::checkpointReschedule() call. This happens automatically
for events that are serialized using the AutoSerialize mechanism.
|
|
Objects that are can be serialized are supposed to inherit from the
Serializable class. This class is meant to provide a unified API for
such objects. However, so far it has mainly been used by SimObjects
due to some fundamental design limitations. This changeset redesigns
to the serialization interface to make it more generic and hide the
underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current
object. Previously, objects that needed this functionality would
use ad-hoc solutions using nameOut() and section name
generation. In the new world, an object that implements the
interface has the methods serializeSection() and
unserializeSection() that serialize into a named /subsection/ of
the current object. Calling serialize() serializes an object into
the current section.
* Move the name() method from Serializable to SimObject as it is no
longer needed for serialization. The fully qualified section name
is generated by the main serialization code on the fly as objects
serialize sub-objects.
* Add a scoped ScopedCheckpointSection helper class. Some objects
need to serialize data structures, that are not deriving from
Serializable, into subsections. Previously, this was done using
nameOut() and manual section name generation. To simplify this,
this changeset introduces a ScopedCheckpointSection() helper
class. When this class is instantiated, it adds a new /subsection/
and subsequent serialization calls during the lifetime of this
helper class happen inside this section (or a subsection in case
of nested sections).
* The serialize() call is now const which prevents accidental state
manipulation during serialization. Objects that rely on modifying
state can use the serializeOld() call instead. The default
implementation simply calls serialize(). Note: The old-style calls
need to be explicitly called using the
serializeOld()/serializeSectionOld() style APIs. These are used by
default when serializing SimObjects.
* Both the input and output checkpoints now use their own named
types. This hides underlying checkpoint implementation from
objects that need checkpointing and makes it easier to change the
underlying checkpoint storage code.
|
|
Make it possible to specify the size of the PIO space for an AMBA DMA
device. Maintain backwards compatibility and default to zero.
|
|
There are cases when we don't want to use a system register mapped
generic timer, but can't use the SP804. For example, when using KVM on
aarch64, we want to intercept accesses to the generic timer, but can't
do so if it is using the system register interface. In such cases,
we need to use a memory-mapped generic timer.
This changeset adds a device model that implements the memory mapped
generic timer interface. The current implementation only supports a
single frame (i.e., one virtual timer and one physical timer).
|
|
The generic timer model currently does not support virtual
counters. Virtual and physical counters both tick with the same
frequency. However, virtual timers allow a hypervisor to set an offset
that is subtracted from the counter when it is read. This enables the
hypervisor to present a time base that ticks with virtual time in the
VM (i.e., doesn't tick when the VM isn't running). Modern Linux
kernels generally assume that virtual counters exist and try to use
them by default.
|
|
This changeset cleans up the generic timer a bit and moves most of the
register juggling from the ISA code into a separate class in the same
source file as the rest of the generic timer. It also removes the
assumption that there is always 8 or fewer CPUs in the system. Instead
of having a fixed limit, we now instantiate per-core timers as they
are requested. This is all in preparation for other patches that add
support for virtual timers and a memory mapped interface.
|
|
Some versions of the kernel incorrectly swap the red and blue color
select registers. This changeset adds a workaround for that by
swapping them when instantiating a PixelConverter.
|