Age | Commit message (Collapse) | Author |
|
When using a Ruby memory system, the Ruby configuration scripts expect
to get a list of DMA ports to create the necessary DMA sequencers. Add
support in the utility functions that wire up devices to append DMA
ports to a list instead of connecting them to the IO bus. These
functions are currently only used by the VExpress_GEM5_V1 platform.
Change-Id: I46059e46b0f69e7be5f267e396811bd3caa3ed63
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Brad Beckmann <brad.beckmann@amd.com>
|
|
The boot ROM shouldn't be used as a memory by the kernel. Memories
have a flag to indicate this which is set for some platforms. Update
all platforms to consistently set this flag to indicate that the boot
ROM shouldn't be reported as normal memory.
Change-Id: I2bf0273e99d2a668e4e8d59f535c1910c745aa7b
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Brad Beckmann <brad.beckmann@amd.com>
--HG--
extra : amend_source : c2cbda38636ea37cbe9ae6977a06b923eab5ba56
|
|
This patch enables Linux to read the temperature using hwmon infrastructure.
In order to use this in your gem5 you need to compile the kernel using the
following configs:
CONFIG_HWMON=y
CONFIG_SENSORS_VEXPRESS=y
And a proper dts file (containing an entry such as):
dcc {
compatible = "arm,vexpress,config-bus";
arm,vexpress,config-bridge = <&v2m_sysreg>;
temp@0 {
compatible = "arm,vexpress-temp";
arm,vexpress-sysreg,func = <4 0>;
label = "DCC";
};
};
|
|
Add a platform with support for both aarch32 and aarch64. This
platform implements a subset of the devices in a real Versatile
Express and extends it with some gem5-specific functionality. It is in
many ways similar to the old VExpress_EMM64 platform, but supports the
following new features:
* Automatic PCI interrupt assignment
* PCI interrupts allocated in a contiguous range.
* Automatic boot loader selection (32-bit / 64-bit)
* Cleaner memory map where gem5-specific devices live in CS5 which
isn't used by current Versatile Express platforms.
* No fake devices. Devices that were previously faked will be
removed from the device tree instead.
* Support for 510 GiB contiguous memory
|
|
Add support for automatic PCI interrupt routing using a device's ID on
the PCI bus. Our current DTBs typically tell the kernel that we do
this or something similar when declaring the PCI controller. This
changeset adds an option to make the simulator behave in the same way.
Interrupt routing can be selected by setting the int_policy parameter
in the GenericArmPciHost. The following values are supported:
* ARM_PCI_INT_STATIC: Use the old static routing policy using the
interrupt line from a device's configurtion space.
* ARM_PCI_INT_DEV: Use device number on the PCI bus to map to an
interrupt in the GIC. The interrupt is computed as:
gic_int = int_base + (pci_dev % int_count)
* ARM_PCI_INT_PIN: Use device interrupt pin on the PCI bus to map to
an interrupt in the GIC. The PCI specification reserves pin ID 0
for devices without interrupts, the interrupt therefore computed
as:
gic_int = int_base + ((pin - 1) % int_count)
|
|
The gem5's current PCI host functionality is very ad hoc. The current
implementations require PCI devices to be hooked up to the
configuration space via a separate configuration port. Devices query
the platform to get their config-space address range. Un-mapped parts
of the config space are intercepted using the XBar's default port
mechanism and a magic catch-all device (PciConfigAll).
This changeset redesigns the PCI host functionality to improve code
reuse and make config-space and interrupt mapping more
transparent. Existing platform code has been updated to use the new
PCI host and configured to stay backwards compatible (i.e., no
guest-side visible changes). The current implementation does not
expose any new functionality, but it can easily be extended with
features such as automatic interrupt mapping.
PCI devices now register themselves with a PCI host controller. The
host controller interface is defined in the abstract base class
PciHost. Registration is done by PciHost::registerDevice() which takes
the device, its bus position (bus/dev/func tuple), and its interrupt
pin (INTA-INTC) as a parameter. The registration interface returns a
PciHost::DeviceInterface that the PCI device can use to query memory
mappings and signal interrupts.
The host device manages the entire PCI configuration space. Accesses
to devices decoded into the devices bus position and then forwarded to
the correct device.
Basic PCI host functionality is implemented in the GenericPciHost base
class. Most platforms can use this class as a basic PCI controller. It
provides the following functionality:
* Configurable configuration space decoding. The number of bits
dedicated to a device is a prameter, making it possible to support
both CAM, ECAM, and legacy mappings.
* Basic interrupt mapping using the interruptLine value from a
device's configuration space. This behavior is the same as in the
old implementation. More advanced controllers can override the
interrupt mapping method to dynamically assign host interrupts to
PCI devices.
* Simple (base + addr) remapping from the PCI bus's address space to
physical addresses for PIO, memory, and DMA.
|
|
The HDLCD model implements a workaround that swaps the red and blue
channels. This works around an issue in certain old kernels. The new
driver doesn't seem to have this behavior, so disable the workaround
by default and enable it in the affected platforms.
|
|
Devices behind the Versatile Express configuration controllers are
currently all lumped into one SimObject. This will make DTB generation
challenging since the DTB assumes them to be in different parts of the
hierarchy. It also makes it hard to model other CoreTiles without also
replicating devices from the motherboard.
This changeset splits the VExpressCoreTileCtrl into two subsystems:
VExpressMCC for all motherboard-related devices and CoreTile2A15DCC
for Core Tile specific devices.
|
|
Rewrite the HDLCD controller to use the new DMA engine and pixel
pump. This fixes several bugs in the current implementation:
* Broken/missing interrupt support (VSync, underrun, DMA end)
* Fragile resolution changes (changing resolutions used
to cause assertion errors).
* Support for resolutions with a width that isn't divisible by 32.
* The pixel clock can now be set dynamically.
This breaks checkpoint compatibility. Checkpoints can be upgraded with
the checkpoint conversion script. However, upgraded checkpoints won't
contain the state of the current frame. That means that HDLCD
controllers restoring from a converted checkpoint immediately start
drawing a new frame (i.e, expect timing differences).
|
|
Add support for oscillators that can be programmed using the RealView
/ Versatile Express configuration interface. These oscillators are
typically used for things like the pixel clock in the display
controller.
The default configurations support the oscillators from a Versatile
Express motherboard (V2M-P1) with a CoreTile Express A15x2.
|
|
There are cases when we don't want to use a system register mapped
generic timer, but can't use the SP804. For example, when using KVM on
aarch64, we want to intercept accesses to the generic timer, but can't
do so if it is using the system register interface. In such cases,
we need to use a memory-mapped generic timer.
This changeset adds a device model that implements the memory mapped
generic timer interface. The current implementation only supports a
single frame (i.e., one virtual timer and one physical timer).
|
|
The generic timer model currently does not support virtual
counters. Virtual and physical counters both tick with the same
frequency. However, virtual timers allow a hypervisor to set an offset
that is subtracted from the counter when it is read. This enables the
hypervisor to present a time base that ticks with virtual time in the
VM (i.e., doesn't tick when the VM isn't running). Modern Linux
kernels generally assume that virtual counters exist and try to use
them by default.
|
|
This changeset cleans up the generic timer a bit and moves most of the
register juggling from the ISA code into a separate class in the same
source file as the rest of the generic timer. It also removes the
assumption that there is always 8 or fewer CPUs in the system. Instead
of having a fixed limit, we now instantiate per-core timers as they
are requested. This is all in preparation for other patches that add
support for virtual timers and a memory mapped interface.
|
|
Some versions of the kernel incorrectly swap the red and blue color
select registers. This changeset adds a workaround for that by
swapping them when instantiating a PixelConverter.
|
|
This patch adds an example configuration in ext/sst/tests/ that allows
an SST/gem5 instance to simulate a 4-core AArch64 system with SST's
memHierarchy components providing all the caches and memories.
|
|
Tie in the newly created energy controller components in the default
configurations.
|
|
|
|
This change adds support for a generic pci host bus driver that
has been included in recent Linux kernel instead of the more
bespoke one we've been using to date. It also works with
aarch64 so it provides PCI support for 64-bit ARM Linux.
To make this work a new configuration option pci_io_base is added
to the RealView platform that should be set to the start of
the memory used as memory mapped IO ports (IO ports that are
memory mapped, not regular memory mapped IO). And a parameter
pci_cfg_gen_offsets which specifies if the config space
offsets should be used that the generic driver expects.
To use the pci-host-generic device you need to:
pci_io_base = 0x2f000000 (Valid for VExpress EMM)
pci_cfg_gen_offsets = True
and add the following to your device tree:
pci {
compatible = "pci-host-ecam-generic";
device_type = "pci";
#address-cells = <0x3>;
#size-cells = <0x2>;
#interrupt-cells = <0x1>;
//bus-range = <0x0 0x1>;
// CPU_PHYSICAL(2) SIZE(2)
// Note, some DTS blobs only support 1 size
reg = <0x0 0x30000000 0x0 0x10000000>;
// IO (1), no bus address (2), cpu address (2), size (2)
// MMIO (1), at address (2), cpu address (2), size (2)
ranges = <0x01000000 0x0 0x00000000 0x0 0x2f000000 0x0 0x10000>,
<0x02000000 0x0 0x40000000 0x0 0x40000000 0x0 0x10000000>;
// With gem5 we typically use INTA/B/C/D one per device
interrupt-map = <0x0000 0x0 0x0 0x1 0x1 0x0 0x11 0x1
0x0000 0x0 0x0 0x2 0x1 0x0 0x12 0x1
0x0000 0x0 0x0 0x3 0x1 0x0 0x13 0x1
0x0000 0x0 0x0 0x4 0x1 0x0 0x14 0x1>;
// Only match INTA/B/C/D and not BDF
interrupt-map-mask = <0x0000 0x0 0x0 0x7>;
};
|
|
This eliminates some default devices and adds in helper functions
to connect the devices defined here to associate with the proper
clock domains.
|
|
This patch changes the default pixel clock to effectively generate
1080p resolution at 60 frames per second. It is dependent upon the
kernel device tree file using the specified resolution / display
string in the comments.
|
|
Note: AArch64 and AArch32 interworking is not supported. If you use an AArch64
kernel you are restricted to AArch64 user-mode binaries. This will be addressed
in a later patch.
Note: Virtualization is only supported in AArch32 mode. This will also be fixed
in a later patch.
Contributors:
Giacomo Gabrielli (TrustZone, LPAE, system-level AArch64, AArch64 NEON, validation)
Thomas Grocutt (AArch32 Virtualization, AArch64 FP, validation)
Mbou Eyole (AArch64 NEON, validation)
Ali Saidi (AArch64 Linux support, code integration, validation)
Edmund Grimley-Evans (AArch64 FP)
William Wang (AArch64 Linux support)
Rene De Jong (AArch64 Linux support, performance opt.)
Matt Horsnell (AArch64 MP, validation)
Matt Evans (device models, code integration, validation)
Chris Adeniyi-Jones (AArch64 syscall-emulation)
Prakash Ramrakhyani (validation)
Dam Sunwoo (validation)
Chander Sudanthi (validation)
Stephan Diestelhorst (validation)
Andreas Hansson (code integration, performance opt.)
Eric Van Hensbergen (performance opt.)
Gabe Black
|
|
There is an option to enable/disable all framebuffer dumps, but the
last frame always gets dumped in the run folder with no other way to
disable it. These files can add up very quickly running many experiments.
This patch adds an option to disable them. The default behavior
remains unchanged.
|
|
This patch changes the default parameter value of conf_table_reported
to match the common case. It also simplifies the regression and config
scripts to reflect this change.
|
|
It was confusing having an AmbaDev namespace along with an
AmbaDevice class. The namespace stuff is now moved in to
a new base AmbaDevice class, which is a mixin for classes
AmbaPioDevice (the former AmbaDevice) and AmbaDmaDevice
to provide the readId function as an inherited member function.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
|
This patch removes the explicit setting of the clock period for
certain instances of CoherentBus, NonCoherentBus and IOCache where the
specified clock is same as the default value of the system clock. As
all the values used are the defaults, there are no performance
changes. There are similar cases where the toL2Bus is set to use the
parent CPU clock which is already the default behaviour.
The main motivation for these simplifications is to ease the
introduction of clock domains.
|
|
This patch removes the explicit memset as it is redundant and causes
the simulator to touch the entire space, forcing the host system to
allocate the pages.
Anonymous pages are mapped on the first access, and the page-fault
handler is responsible for zeroing them. Thus, the pages are still
zeroed, but we avoid touching the entire allocated space which enables
us to use much larger memory sizes as long as not all the memory is
actually used.
|
|
Newer core tiles / daughterboards for the Versatile Express platform have an
HDLCD controller that supports HD-quality output. This patch adds an
implementation of the controller.
|
|
This patch moves the GIC interface to a separate base class and makes
all interrupt devices use that base class instead of a pointer to the
PL390 implementation. This allows us to have multiple GIC
implementations. Future implementations will allow in-kernel GIC
implementations when using hardware virtualization.
--HG--
rename : src/dev/arm/gic.cc => src/dev/arm/gic_pl390.cc
rename : src/dev/arm/gic.hh => src/dev/arm/gic_pl390.hh
|
|
This patch fixes the Pl111 timings by creating a separate clock for
the pixel timings. The device clock is used for all interactions with
the memory system, just like the AHB clock on the actual module.
The result without this patch is that the module only is allowed to
send one request every tick of the 24MHz clock which causes a huge
backlog.
|
|
The platform has two KMI devices that are both setup to be keyboards. This
patch changes the second keyboard to a mouse. This patch will allow keyboard
input as usual and additionally provide mouse support.
|
|
When casting objects in the generated SWIG interfaces, SWIG uses
classical C-style casts ( (Foo *)bar; ). In some cases, this can
degenerate into the equivalent of a reinterpret_cast (mainly if only a
forward declaration of the type is available). This usually works for
most compilers, but it is known to break if multiple inheritance is
used anywhere in the object hierarchy.
This patch introduces the cxx_header attribute to Python SimObject
definitions, which should be used to specify a header to include in
the SWIG interface. The header should include the declaration of the
wrapped object. We currently don't enforce header the use of the
header attribute, but a warning will be generated for objects that do
not use it.
|
|
This patch adds a VncInput base class which VncServer inherits from.
Another class can implement the same interface and be used instead
of the VncServer, for example a class that replays Vnc traffic.
--HG--
rename : src/base/vnc/VncServer.py => src/base/vnc/Vnc.py
rename : src/base/vnc/vncserver.cc => src/base/vnc/vncinput.cc
rename : src/base/vnc/vncserver.hh => src/base/vnc/vncinput.hh
|
|
|
|
This patch moves the clock of the CPU, bus, and numerous devices to
the new class ClockedObject, that sits in between the SimObject and
MemObject in the class hierarchy. Although there are currently a fair
amount of MemObjects that do not make use of the clock, they
potentially should do so, e.g. the caches should at some point have
the same clock as the CPU, potentially with a 1:n ratio. This patch
does not introduce any new clock objects or object hierarchies
(clusters, clock domains etc), but is still a step in the direction of
having a more structured approach clock domains.
The most contentious part of this patch is the serialisation of clocks
that some of the modules (but not all) did previously. This
serialisation should not be needed as the clock is set through the
parameters even when restoring from the checkpoint. In other words,
the state is "stored" in the Python code that creates the modules.
The nextCycle methods are also simplified and the clock phase
parameter of the CPU is removed (this could be part of a clock object
once they are introduced).
|
|
|
|
|
|
0x40000000 is reservered for external AXI addresses. This address
range is not used currently. Removed the range from the bridge.
|
|
This patch removes the assumption on having on single instance of
PhysicalMemory, and enables a distributed memory where the individual
memories in the system are each responsible for a single contiguous
address range.
All memories inherit from an AbstractMemory that encompasses the basic
behaviuor of a random access memory, and provides untimed access
methods. What was previously called PhysicalMemory is now
SimpleMemory, and a subclass of AbstractMemory. All future types of
memory controllers should inherit from AbstractMemory.
To enable e.g. the atomic CPU and RubyPort to access the now
distributed memory, the system has a wrapper class, called
PhysicalMemory that is aware of all the memories in the system and
their associated address ranges. This class thus acts as an
infinitely-fast bus and performs address decoding for these "shortcut"
accesses. Each memory can specify that it should not be part of the
global address map (used e.g. by the functional memories by some
testers). Moreover, each memory can be configured to be reported to
the OS configuration table, useful for populating ATAG structures, and
any potential ACPI tables.
Checkpointing support currently assumes that all memories have the
same size and organisation when creating and resuming from the
checkpoint. A future patch will enable a more flexible
re-organisation.
--HG--
rename : src/mem/PhysicalMemory.py => src/mem/AbstractMemory.py
rename : src/mem/PhysicalMemory.py => src/mem/SimpleMemory.py
rename : src/mem/physical.cc => src/mem/abstract_mem.cc
rename : src/mem/physical.hh => src/mem/abstract_mem.hh
rename : src/mem/physical.cc => src/mem/simple_mem.cc
rename : src/mem/physical.hh => src/mem/simple_mem.hh
|
|
|
|
|
|
Also clean up how we create boot loader memory a bit.
|
|
This patch cleans up a number of remaining uses of bus.port which
is now split into bus.master and bus.slave. The only non-trivial change
is the memtest where the level building now has to be aware of the role
of the ports used in the previous level.
|
|
This patch classifies all ports in Python as either Master or Slave
and enforces a binding of master to slave. Conceptually, a master (such
as a CPU or DMA port) issues requests, and receives responses, and
conversely, a slave (such as a memory or a PIO device) receives
requests and sends back responses. Currently there is no
differentiation between coherent and non-coherent masters and slaves.
The classification as master/slave also involves splitting the dual
role port of the bus into a master and slave port and updating all the
system assembly scripts to use the appropriate port. Similarly, the
interrupt devices have to have their int_port split into a master and
slave port. The intdev and its children have minimal changes to
facilitate the extra port.
Note that this patch does not enforce any port typing in the C++
world, it merely ensures that the Python objects have a notion of the
port roles and are connected in an appropriate manner. This check is
carried when two ports are connected, e.g. bus.master =
memory.port. The following patches will make use of the
classifications and specialise the C++ ports into masters and slaves.
|
|
|
|
--HG--
rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh
rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc
rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
|
|
In preparation for the introduction of Master and Slave ports, this
patch removes the default port parameter in the Python port and thus
forces the argument list of the Port to contain only the
description. The drawback at this point is that the config port and
dma port of PCI and DMA devices have to be connected explicitly. This
is key for future diversification as the pio and config port are
slaves, but the dma port is a master.
|
|
This patch makes the bus bridge uni-directional and specialises the
bus ports to be a master port and a slave port. This greatly
simplifies the assumptions on both sides as either port only has to
deal with requests or responses. The following patches introduce the
notion of master and slave ports, and would not be possible without
this split of responsibilities.
In making the bridge unidirectional, the address range mechanism of
the bridge is also changed. For the cases where communication is
taking place both ways, an additional bridge is needed. This causes
issues with the existing mechanism, as the busses cannot determine
when to stop iterating the address updates from the two bridges. To
avoid this issue, and also greatly simplify the specification, the
bridge now has a fixed set of address ranges, specified at creation
time.
|
|
Not all objects need a platform pointer, and having one creates a dependence
on their being a platform object. This change removes the platform pointer to
from the base device object and moves it into subclasses that actually need
it.
|
|
|
|
|