Age | Commit message (Collapse) | Author |
|
This patch keeps the logic behind the HMC model implementation untouched.
Additional changes:
- simple hello world script using HMC (SE simulation)
Usage examples:
./build/ARM/gem5.opt configs/example/hmctest.py
./build/ARM/gem5.opt configs/example/hmctest.py --enable-global-monitor --enable-link-monitor --arch=same
./build/ARM/gem5.opt configs/example/hmctest.py --enable-global-monitor --enable-link-monitor --arch=mixed
./build/ARM/gem5.opt configs/example/hmc_hello.py
./build/ARM/gem5.opt configs/example/hmc_hello.py --enable-global-monitor --enable-link-monitor
Change-Id: I64eb6c9abb45376b6ed72722926acddd50765394
Reviewed-on: https://gem5-review.googlesource.com/6061
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
This particular functor looks in the config root, not in the path
specified by M5_ROOT like binary and disk.
Change-Id: Ib007c36934c65ca9f808e995a2e0c71f0b338788
Reviewed-on: https://gem5-review.googlesource.com/5641
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Maintainer: Gabe Black <gabeblack@google.com>
|
|
These functions were already being treated as psuedo objects and had
properties assigned to them setting what their paths were. That's a bit
unusual and made it less obvious what the code was doing, but also
forced the "system" function to know what all the possible path
searching functions were so that they'd have their "path" property
initialized properly in a central location.
This change introduces a PathSearcFunc class which encapsulates the
mechanisms of the old code and makes it implicitly extensible so that
other path searching functions which might look in other directories
can be added in other places.
Change-Id: I7be28e51481a06ec83997677af99927709b18003
Reviewed-on: https://gem5-review.googlesource.com/5341
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
The CpuConfig helper currently assumes that all timing models live in
the cores.arm package. This ignores the potential mismatch between the
target ISA and the ISA assumptions made by the timing models.
Instead of unconditionally listing all CPU models in cores.arm, list
timing models from cores.generic and cores.${TARGET_ISA}. This ensures
that the listed timing models support the ISA that gem5 is targeting.
Change-Id: If6235af2118889638f56ac4151003f38edfe9485
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3947
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
The High-Performance In-order (HPI) CPU timing model is tuned to be
representative of a modern in-order ARMv8-A implementation. The HPI
core and its supporting simulation scripts, namely starter_se.py and
starter_fs.py (under /configs/example/arm/) are part of the ARM
Research Starter Kit on System Modeling. More information can be found
at: http://www.arm.com/ResearchEnablement/SystemModeling
Change-Id: I124bd06ba42d20abff09d447542b031d17eabe22
Signed-off-by: Ashkan Tousi <ashkan.tousimojarad@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/4201
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
|
|
This patch adds some more functionality to the cpu model and the arch to
interface with the vector register file.
This change consists mainly of augmenting ThreadContexts and ExecContexts
with calls to get/set full vectors, underlying microarchitectural elements
or lanes. Those are meant to interface with the vector register file. All
classes that implement this interface also get an appropriate implementation.
This requires implementing the vector register file for the different
models using the VecRegContainer class.
This change set also updates the Result abstraction to contemplate the
possibility of having a vector as result.
The changes also affect how the remote_gdb connection works.
There are some (nasty) side effects, such as the need to define dummy
numPhysVecRegs parameter values for architectures that do not implement
vector extensions.
Nathanael Premillieu's work with an increasing number of fixes and
improvements of mine.
Change-Id: Iee65f4e8b03abfe1e94e6940a51b68d0977fd5bb
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
[ Fix RISCV build issues and CC reg free list initialisation ]
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2705
|
|
When importing the cores.arm package, we currently throw an exception
if a timing model can't be imported due to a missing dependency (e.g.,
the required CPU model wasn't included in the build). This is
undesirable since it prevents other, working, timing models from being
added to the package. Wrap the import_module call in a try-except
block and skip timing models that have missing dependencies.
Change-Id: I92bab62c989f433a8a4a7bf59207d9d81b3d19e1
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3946
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
Instead of hard-coding timing models in CpuConfig.py, use
introspection to find them in the cores.arm model package.
Change-Id: I6642dc9cbc3f5beeeec748e716c9426c233d51ea
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3944
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
Change-Id: I189b6462cc64f7cc6c1b7a6c2af1abb60e1854de
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3943
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
The ex5_LITTLE and ex5_big configs currently depend on Caches.py and
O3_ARM_v7a.py. These aren't actual dependencies since all of the
params from the caches and the old O3 model are overridden. This
changeset updates the ex5 models to derive from the base SimObjects
instead.
Change-Id: I999e73bb9cc21ad96865c1bc0dd5973faa48ab61
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3942
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
Change-Id: I7762d344cb964c3e010135ff928c6ea12538912c
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3941
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
MemConfig currently assumes that all callers include the its full set
of options in the command line parser. This is unnecessary and
sometimes confusing. Make most of the options optional to avoid having
to add all of them to example scripts.
Change-Id: I2d73be2454427b00db16716edcfd96a47133c888
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/3940
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
Change-Id: I0c839bb649a5d2d73080b7e718da3c9b5839cf8c
Signed-off-by: Gedare Bloom <gedare@rtems.org>
Reviewed-on: https://gem5-review.googlesource.com/3264
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
Ruby for ARM systems is not fully supported but certain configurations
are expected to work. This change removes the more general fail
statement and warns or fails depending on the particular
configuration.
Change-Id: Ic24799aff966ba15858b93482e0f24a8672d9483
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2905
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
|
|
This patch enables using calibrated big and LITTLE cores, ex5_big and
ex5_LITTLE instead of the default 'arm_detailed' and 'minor' cpus. The ex5
model is based on the Samsung Exynos 5 Octa (5422) SoC. Operation and memory
hierarchy latencies have been calibrated using the lmbench micro-benchmark
suite. The preliminary validation results have been published as: 'Full-System
Simulation of big.LITTLE Multicore Architecture for Performance and Energy
Exploration', in International Symposium on Embedded Multicore/Many-core
Systems-on-Chip (MCSoC'16), Lyon, France (Sep, 2016).
From http://reviews.gem5.org/r/3666
Change-Id: I4935dee0a9222bd1bf7adfccb9443014945bb2d7
Signed-off-by: Anastasiia Butko <abutko@lbl.gov>
Signed-off-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr>
Reviewed-on: https://gem5-review.googlesource.com/2464
Reviewed-by: Gabor Dozsa <gabor.dozsa@arm.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
necessary kernel command line options in FSConfig.py
Change-Id: Id66f640b6beb4efa9c23080c3d2516eda688c72d
Reviewed-on: https://gem5-review.googlesource.com/3320
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Support for CPU aliases were removed recently.
Change-Id: I3c1173dc34170d8639d95e52bf660f248848f77f
Reviewed-on: https://gem5-review.googlesource.com/3100
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
This was added for backwards compatability, but it adds a decent amount
of complexity.
The table below shows what CPU class name to use in place of a given
alias.
+==========+========================================================+
| Alias | CPU class |
+==========+========================================================+
| timing | TimingSimpleCPU |
| atomic | AtomicSimpleCPU |
| minor | MinorCPU |
| detailed | DrivO3CPU |
| kvm | ArmKvmCPU, ArmV8KvmCPU or X86KvmCPU, depending on arch |
| trace | TraceCPU |
+==========+========================================================+
Change-Id: I251c4f64b7869c6b64dd25b36967ae240f01ef08
Reviewed-on: https://gem5-review.googlesource.com/2940
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
When the change below removed the hard coded disk name for the SPARC FS
configuration, it broke the regression which had not specified a disk name.
This change adds a default disk name so that the regression will continue to
work like it used to, but preserving the effect of this other change.
commit 86a25bbcee88f6e69299867b6264885d738f636e
Author: Jakub Jermar <jakub@jermar.eu>
Date: Tue Jul 19 09:52:46 2016 -0500
config: Allow SPARC FS image to be specified on the command line
Change-Id: Ieb317b2bf573a4f2fc435d34cccd1f246c28d84c
Reviewed-on: https://gem5-review.googlesource.com/2645
Maintainer: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
|
|
If output redirection is activated, the error message is printed in
simout. This change ensure it will be printed in simerr.
Change-Id: Ie661ac6b6978bf2e4aaaccdf23134795d764d459
Signed-off-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr>
Reviewed-on: https://gem5-review.googlesource.com/2221
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
The EIOProcess class was removed recently and it was the only other class
which derived from Process. Since every Process invocation is also a
LiveProcess invocation, it makes sense to simplify the organization by
combining the fields from LiveProcess into Process.
|
|
Names of DRAM configurations were updated to reflect both
the channel and device data width.
Previous naming format was:
<DEVICE_TYPE>_<DATA_RATE>_<CHANNEL_WIDTH>
The following nomenclature is now used:
<DEVICE_TYPE>_<DATA_RATE>_<n>x<w>
where n = The number of devices per rank on the channel
x = Device width
Total channel width can be calculated by n*w
Example:
A 64-bit DDR4, 2400 channel consisting of 4-bit devices:
n = 16
w = 4
The resulting configuration name is:
DDR4_2400_16x4
Updated scripts to match new naming convention.
Added unique configurations for DDR4 for:
1) 16x4
2) 8x8
3) 4x16
Change-Id: Ibd7f763b7248835c624309143cb9fc29d56a69d1
Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
|
|
The current TLM bridge only provides a Slave Port that allows the gem5
world to send request to the SystemC world. This patch series refractors
and cleans up the existing code, and adds a Master Port that allows the
SystemC world to send requests to the gem5 world.
This patch:
* Restructure the existing sources in preparation of the addition of the
* new
Master Port.
* Refractor names to allow for distinction of the slave and master port.
* Replace the Makefile by a SConstruct.
Testing Done: The examples provided in util/tlm (now
util/tlm/examples/slave_port) still compile and run error free.
Reviewed at http://reviews.gem5.org/r/3527/
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
|
|
Change-Id: Id6bdbc0c988aa92b96e292cabc913e6b974f14bb
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
|
|
If the cache access mode is parallel, i.e. "sequential_access" parameter
is set to "False", tags and data are accessed in parallel. Therefore,
the hit_latency is the maximum latency between tag_latency and
data_latency. On the other hand, if the cache access mode is
sequential, i.e. "sequential_access" parameter is set to "True",
tags and data are accessed sequentially. Therefore, the hit_latency
is the sum of tag_latency plus data_latency.
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
|
|
This patch adds the ability for an application to request dist-gem5 to begin/
end synchronization using an m5 op. When toggling on sync, all nodes agree
on the next sync point based on the maximum of all nodes' ticks. CPUs are
suspended until the sync point to avoid sending network messages until sync has
been enabled. Toggling off sync acts like a global execution barrier, where
all CPUs are disabled until every node reaches the toggle off point. This
avoids tricky situations such as one node hitting a toggle off followed by a
toggle on before the other nodes hit the first toggle off.
|
|
This patch breaks out the most basic configuration options into a set
of base options, to allow them to be used also by scripts that do not
involve any ISA, and thus no actual CPUs or devices.
The patch also fixes a few modules so that they can be imported in a
NULL build, and avoid dragging in FSConfig every time Options is
imported.
|
|
Modify the opClass assigned to AArch64 FP instructions from SimdFloat* to
Float*. Also create the FloatMemRead and FloatMemWrite opClasses, which
distinguishes writes to the INT and FP register banks.
Change the latency of (Simd)FloatMultAcc to 5, based on the Cortex-A72,
where the "latency" of FMADD is 3 if the next instruction is a FMADD and
has only the augend to destination dependency, otherwise it's 7 cycles.
Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
|
|
Continue along the same line as the recent patch that made the
Ruby-related config scripts Python packages and make also the
configs/common directory a package.
All affected config scripts are updated (hopefully).
Note that this change makes it apparent that the current organisation
and naming of the config directory and its subdirectories is rather
chaotic. We mix scripts that are directly invoked with scripts that
merely contain convenience functions. While it is not addressed in
this patch we should follow up with a re-organisation of the
config structure, and renaming of some of the packages.
|
|
This patch moves the addition of network options into the Ruby module
to avoid the regressions all having to add it explicitly. Doing this
exposes an issue in our current config system though, namely the fact
that addtoPath is relative to the Python script being executed. Since
both example and regression scripts use the Ruby module we would end
up with two different (relative) paths being added. Instead we take a
first step at turning the config modules into Python packages, simply
by adding a __init__.py in the configs/ruby, configs/topologies and
configs/network subdirectories.
As a result, we can now add the top-level configs directory to the
Python search path, and then use the package names in the various
modules. The example scripts are also updated, and the messy
path-deducing variations in the scripts are unified.
|
|
dist-gem5 should not be restricted to FullSystem mode.
|
|
This patch changes the default behaviour of the SystemXBar, adding a
snoop filter. With the recent updates to the snoop filter allocation
behaviour this change no longer causes problems for the regressions
without caches.
Change-Id: Ibe0cd437b71b2ede9002384126553679acc69cc1
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com>
|
|
Ruby on ARM is currently very experimental. Fail with a fatal error
that explains this to make sure users are aware of the limitations (it
doesn't actually work yet!).
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
Add initial support for creating an ARM system with a Ruby-based
memory system. This support is currently experimental and limited to
the new VExpress_GEM5_V1 platform.
Change-Id: I36baeb68b0d891e34ea46aafe17b5e55217b4bfa
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Brad Beckmann <brad.beckmann@amd.com>
|
|
At the moment the SPARC FS machine configuration comes with a hardcoded
value for using the Solaris 10 disk image from the OpenSPARC tarball. The
--disk-image option is completely ignored for SPARC. This simple patch
modifies the behavior so that --disk-image option is both taken into
account and also required. This makes it possible to easily change SPARC FS
images without having to modify the configuration files.
|
|
In this new hmc configuration we have used the existing components in gem5
mainly [SerialLink] [NoncoherentXbar]& [DRAMCtrl] to define 3 different
architecture for HMC.
Highlights
1- It explores 3 different HMC architectures
2- It creates 4-HMC crossbars and attaches 16 vault controllers with it.
This will connect vaults to serial links
3- From the previous version, HMCController with round robin funtionality
is being removed and all the serial links are being accessible directly
from user ports
4- Latency incorporated by HMCController (in previous version) is being
added to SerialLink
Committed by Jason Lowe-Power <jason@lowepower.com>
|
|
This patch ensures a walker cache is instantiated if specfied.
Change-Id: I2c6b4bf3454d56bb19558c73b406e1875acbd986
Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
Reviewed-by: Mitch Hayenga <mitch.hayenga@arm.com>
|
|
Disable the default snoop filter in the SystemXBar so that the
typical membus does not have a snoop filter by default. Instead,
add the snoop filter only when there are caches added to the system
(with the caches / l2cache options).
The underlying problem is that the snoop filter grows without
bounds (for now) if there are no caches to tell it that lines have
been evicted. This causes slow regression runs for all the atomic
regressions. This patch fixes this behaviour.
--HG--
extra : source : f97c20511828209757440839ed48d741d02d428f
|
|
According to the Intel Multi Processor Specification rev 1.4 (-006) (*),
section 4.3.2 Bus Entries, Bus type strings are >>6-character ASCII
(blank-filled) strings<<.
This patch properly pads the entries with the missing spaces at the end.
(*) http://www.intel.com/design/pentium/datashts/24201606.pdf
Committed by Jason Lowe-Power <power.jg@gmail.com>
|
|
This patch changes how the cache determines if snoops should be
forwarded from the memory side to the CPU side. Instead of having a
parameter, the cache now looks at the port connected on the CPU side,
and if it is a snooping port, then snoops are forwarded. Less error
prone, and less parameters to worry about.
The patch also tidies up the CPU classes to ensure that their I-side
port is not snooping by removing overrides to the snoop request
handler, such that snoop requests will panic via the default
MasterPort implement
|
|
Result of running 'hg m5style --skip-all --fix-white -a'.
|
|
|
|
Add a platform with support for both aarch32 and aarch64. This
platform implements a subset of the devices in a real Versatile
Express and extends it with some gem5-specific functionality. It is in
many ways similar to the old VExpress_EMM64 platform, but supports the
following new features:
* Automatic PCI interrupt assignment
* PCI interrupts allocated in a contiguous range.
* Automatic boot loader selection (32-bit / 64-bit)
* Cleaner memory map where gem5-specific devices live in CS5 which
isn't used by current Versatile Express platforms.
* No fake devices. Devices that were previously faked will be
removed from the device tree instead.
* Support for 510 GiB contiguous memory
|
|
Minor fix to ensure the HMCSystem can actually be instantiated
(SimObject cannot be created). Also address some spacing issues.
|
|
|
|
This patch adds changes to the configuration scripts to support elastic
tracing and replay.
The patch adds a command line option to enable elastic tracing in SE mode
and FS mode. When enabled the Elastic Trace cpu probe is attached to O3CPU
and a few O3 CPU parameters are tuned. The Elastic Trace probe writes out
both instruction fetch and data dependency traces. The patch also enables
configuring the TraceCPU to replay traces using the SE and FS script.
The replay run is designed to resume from checkpoint using atomic cpu to
restore state keeping it consistent with FS run flow. It then switches to
TraceCPU to replay the input traces.
|
|
The gem5's current PCI host functionality is very ad hoc. The current
implementations require PCI devices to be hooked up to the
configuration space via a separate configuration port. Devices query
the platform to get their config-space address range. Un-mapped parts
of the config space are intercepted using the XBar's default port
mechanism and a magic catch-all device (PciConfigAll).
This changeset redesigns the PCI host functionality to improve code
reuse and make config-space and interrupt mapping more
transparent. Existing platform code has been updated to use the new
PCI host and configured to stay backwards compatible (i.e., no
guest-side visible changes). The current implementation does not
expose any new functionality, but it can easily be extended with
features such as automatic interrupt mapping.
PCI devices now register themselves with a PCI host controller. The
host controller interface is defined in the abstract base class
PciHost. Registration is done by PciHost::registerDevice() which takes
the device, its bus position (bus/dev/func tuple), and its interrupt
pin (INTA-INTC) as a parameter. The registration interface returns a
PciHost::DeviceInterface that the PCI device can use to query memory
mappings and signal interrupts.
The host device manages the entire PCI configuration space. Accesses
to devices decoded into the devices bus position and then forwarded to
the correct device.
Basic PCI host functionality is implemented in the GenericPciHost base
class. Most platforms can use this class as a basic PCI controller. It
provides the following functionality:
* Configurable configuration space decoding. The number of bits
dedicated to a device is a prameter, making it possible to support
both CAM, ECAM, and legacy mappings.
* Basic interrupt mapping using the interruptLine value from a
device's configuration space. This behavior is the same as in the
old implementation. More advanced controllers can override the
interrupt mapping method to dynamically assign host interrupts to
PCI devices.
* Simple (base + addr) remapping from the PCI bus's address space to
physical addresses for PIO, memory, and DMA.
|
|
Add support for automatically discover available platforms. The
Python-side uses functionality similar to what we use when
auto-detecting available CPU models. The machine IDs have been updated
to match the platform configurations. If there isn't a matching
machine ID, the configuration scripts default to -1 which Linux uses
for device tree only platforms.
|
|
This patch adds the necessary commands and cache functionality to
allow clean writebacks. This functionality is crucial, especially when
having exclusive (victim) caches. For example, if read-only L1
instruction caches are not sending clean writebacks, there will never
be any spills from the L1 to the L2. At the moment the cache model
defaults to not sending clean writebacks, and this should possibly be
re-evaluated.
The implementation of clean writebacks relies on a new packet command
WritebackClean, which acts much like a Writeback (renamed
WritebackDirty), and also much like a CleanEvict. On eviction of a
clean block the cache either sends a clean evict, or a clean
writeback, and if any copies are still cached upstream the clean
evict/writeback is dropped. Similarly, if a clean evict/writeback
reaches a cache where there are outstanding MSHRs for the block, the
packet is dropped. In the typical case though, the clean writeback
allocates a block in the downstream cache, and marks it writable if
the evicted block was writable.
The patch changes the O3_ARM_v7a L1 cache configuration and the
default L1 caches in config/common/Caches.py
|
|
This patch adds a parameter to control the cache clusivity, that is if
the cache is mostly inclusive or exclusive. At the moment there is no
intention to support strict policies, and thus the options are: 1)
mostly inclusive, or 2) mostly exclusive.
The choice of policy guides the behaviuor on a cache fill, and a new
helper function, allocOnFill, is created to encapsulate the decision
making process. For the timing mode, the decision is annotated on the
MSHR on sending out the downstream packet, and in atomic we directly
pass the decision to handleFill. We (ab)use the tempBlock in cases
where we are not allocating on fill, leaving the rest of the cache
unaffected. Simple and effective.
This patch also makes it more explicit that multiple caches are
allowed to consider a block writable (this is the case
also before this patch). That is, for a mostly inclusive cache,
multiple caches upstream may also consider the block exclusive. The
caches considering the block writable/exclusive all appear along the
same path to memory, and from a coherency protocol point of view it
works due to the fact that we always snoop upwards in zero time before
querying any downstream cache.
Note that this patch does not introduce clean writebacks. Thus, for
clean lines we are essentially removing a cache level if it is made
mostly exclusive. For example, lines from the read-only L1 instruction
cache or table-walker cache are always clean, and simply get dropped
rather than being passed to the L2. If the L2 is mostly exclusive and
does not allocate on fill it will thus never hold the line. A follow
on patch adds the clean writebacks.
The patch changes the L2 of the O3_ARM_v7a CPU configuration to be
mostly exclusive (and stats are affected accordingly).
|