Age | Commit message (Collapse) | Author |
|
Refactor the TLB and page table walker test interface to use a dynamic
registration mechanism. Instead of patching a couple of empty methods
to wire up a TLB tester, this change allows such testers to register
themselves using the setTestInterface() method.
|
|
Libraries are loaded into the process address space using the
mmap system call. Conveniently, this happens to be a good
time to update the process symbol table with the library's
incoming symbols so we handle the table update from within the
system call.
This works just like an application's normal symbols. The only
difference between a dynamic library and a main executable is
when the symbol table update occurs. The symbol table update for
an executable happens at program load time and is finished before
the process ever begins executing. Since dynamic linking happens
at runtime, the symbol loading happens after the library is
first loaded into the process address space. The library binary
is examined at this time for a symbol section and that section
is parsed for symbol types with specific bindings (global,
local, weak). Subsequently, these symbols are added to the table
and are available for use by gem5 for things like trace
generation.
Checkpointing should work just as it did previously. The address
space (and therefore the library) will be recorded and the symbol
table will be entirely recorded. (It's not possible to do anything
clever like checkpoint a program and then load the program back
with different libraries with LD_LIBRARY_PATH, because the
library becomes part of the address space after being loaded.)
|
|
|
|
The mmapGrowsDown() method was a static method on the OperatingSystem
class (and derived classes), which worked OK for the templated syscall
emulation methods, but made it hard to access elsewhere. This patch
moves the method to be a virtual function on the LiveProcess method,
where it can be overridden for specific platforms (for now, Alpha).
This patch also changes the value of mmapGrowsDown() from being false
by default and true only on X86Linux32 to being true by default and
false only on Alpha, which seems closer to reality (though in reality
most people use ASLR and this doesn't really matter anymore).
In the process, also got rid of the unused mmap_start field on
LiveProcess and OperatingSystem mmapGrowsUp variable.
|
|
For O3, which has a stat that counts reg reads, there is an additional
reg read per mmap() call since there's an arg we no longer ignore.
Otherwise, stats should not be affected.
|
|
|
|
The structure definition only had the open system call flag set in mind when
it was named, so we rename it here with the intention of using it to define
additional tables to translate flags for other system calls in the future.
|
|
Fix the printDataInst function to properly print the immediate value.
|
|
This changeset adds support for changing the simulator output
directory. This can be useful when the simulation goes through several
stages (e.g., a warming phase, a simulation phase, and a verification
phase) since it allows the output from each stage to be located in a
different directory. Relocation is done by calling core.setOutputDir()
from Python or simout.setOutputDirectory() from C++.
This change affects several parts of the design of the gem5's output
subsystem. First, files returned by an OutputDirectory instance (e.g.,
simout) are of the type OutputStream instead of a std::ostream. This
allows us to do some more book keeping and control re-opening of files
when the output directory is changed. Second, new subdirectories are
OutputDirectory instances, which should be used to create files in
that sub-directory.
Signed-off-by: Andreas Sandberg <andreas@sandberg.pp.se>
[sascha.bischoff@arm.com: Rebased patches onto a newer gem5 version]
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
|
|
This patch adds assertions that enforce that only invalidating snoops
will ever reach into the logic that tracks in-order load completion and
also invalidation of LL/SC (and MONITOR / MWAIT) monitors. Also adds
some comments to MSHR::replaceUpgrades().
|
|
Properly done for the ERET instruction in v8, but not for v7.
Many control register changes are only visible after explicit
instruction synchronization barriers or exception entry/exit.
This means mode changing instructions should squash any
younger in-flight speculative instructions.
|
|
This patch implements the clock_getres() system call for arm and x86 in linux
SE mode.
|
|
Result of running 'hg m5style --skip-all --fix-control -a'.
|
|
Result of running 'hg m5style --skip-all --fix-white -a'.
|
|
For historical reasons, the ExecContext interface had a single
function, readMem(), that did two different things depending on
whether the ExecContext supported atomic memory mode (i.e.,
AtomicSimpleCPU) or timing memory mode (all the other models).
In the former case, it actually performed a memory read; in the
latter case, it merely initiated a read access, and the read
completion did not happen until later when a response packet
arrived from the memory system.
This led to some confusing things, including timing accesses
being required to provide a pointer for the return data even
though that pointer was only used in atomic mode.
This patch splits this interface, adding a new initiateMemRead()
function to the ExecContext interface to replace the timing-mode
use of readMem().
For consistency and clarity, the readMemTiming() helper function
in the ISA definitions is renamed to initiateMemRead() as well.
For x86, where the access size is passed in explicitly, we can
also get rid of the data parameter at this level. For other ISAs,
where the access size is determined from the type of the data
parameter, we have to keep the parameter for that purpose.
|
|
|
|
Make best use of the compiler, and enable -Wextra as well as
-Wall. There are a few issues that had to be resolved, but they are
all trivial.
|
|
The key parameter can be used to read out various config parameters from
within the simulated software.
|
|
Currently, the wire format of register values in g- and G-packets is
modelled using a union of uint8/16/32/64 arrays. The offset positions
of each register are expressed as a "register count" scaled according
to the width of the register in question. This results in counter-
intuitive and error-prone "register count arithmetic", and some
formats would even be altogether unrepresentable in such model, e.g.
a 64-bit register following a 32-bit one would have a fractional index
in the regs64 array.
Another difficulty is that the array is allocated before the actual
architecture of the workload is known (and therefore before the correct
size for the array can be calculated).
With this patch I propose a simpler mechanism for expressing the
register set structure. In the new code, GdbRegCache is an abstract
class; its subclasses contain straightforward structs reflecting the
register representation. The determination whether to use e.g. the
AArch32 vs. AArch64 register set (or SPARCv8 vs SPARCv9, etc.) is made
by polymorphically dispatching getregs() to the concrete subclass.
The subclass is not instantiated until it is needed for actual
g-/G-packet processing, when the mode is already known.
This patch is not meant to be merged in on its own, because it changes
the contract between src/base/remote_gdb.* and src/arch/*/remote_gdb.*,
so as it stands right now, it would break the other architectures.
In this patch only the base and the ARM code are provided for review;
once we agree on the structure, I will provide src/arch/*/remote_gdb.*
for the other architectures; those patches could then be merged in
together.
Review Request: http://reviews.gem5.org/r/3207/
Pushed by Joel Hestness <jthestness@gmail.com>
|
|
Add support for automatically discover available platforms. The
Python-side uses functionality similar to what we use when
auto-detecting available CPU models. The machine IDs have been updated
to match the platform configurations. If there isn't a matching
machine ID, the configuration scripts default to -1 which Linux uses
for device tree only platforms.
|
|
Add support for automatically selecting a boot loader that matches the
guest system's kernel. Instead of accepting a single boot loader, the
ArmSystem class now accepts a vector of boot loaders. When
initializing a system, the we now look for the first boot loader with
an architecture that matches the kernel.
This changeset makes it possible to use the same system for both
64-bit and 32-bit kernels.
|
|
Appease clang.
|
|
|
|
The checkpoint changes, along with the SMT patches have changed a
number of APIs. Adapt the ArmKvmCPU accordingly.
|
|
This patch adds explicit overrides as this is now required when using
"-Wall" with clang >= 3.5, the latter now part of the most recent
XCode. The patch consequently removes "virtual" for those methods
where "override" is added. The latter should be enough of an
indication.
As part of this patch, a few minor issues that clang >= 3.5 complains
about are also resolved (unused methods and variables).
|
|
This patch moves away from using M5_ATTR_OVERRIDE and the m5::hashmap
(and similar) abstractions, as these are no longer needed with gcc 4.7
and clang 3.1 as minimum compiler versions.
|
|
The decoder is responsible for splitting instructions in micro
operations (uops). Given that different micro architectures may split
operations differently, this patch allows to specify which micro
architecture each isa implements, so different cores in the system can
split instructions differently, also decoupling uop splitting
(microArch) from ISA (Arch). This is done making the decodification
calls templates that receive a type 'DecoderFlavour' that maps the
name of the operation to the class that implements it. This way there
is only one selection point (converting the command line enum to the
appropriate DecodeFeatures object). In addition, there is no explicit
code replication: template instantiation hides that, and the compiler
should be able to resolve a number of things at compile-time.
|
|
In ARM, certain variables are only updated when a necessary change is
detected. Having 2 SMT threads share a TLB resulted in these not being
updated as required. This patch adds a thread context identifer to
assist in the invalidation of these variables.
|
|
Changes wakeup functionality so that only specific threads on SMT
capable cpus are woken.
|
|
Adds per-thread interrupt controllers and thread/context logic
so that interrupts properly get routed in SMT systems.
|
|
Changes assignment of the MPIDR for multi-threaded systems only.
|
|
Cleaning up dead code. The CLREX stores zero directly to
MISCREG_LOCKFLAG and so the request flag is no longer needed. The
corresponding functionality in the cache tags is also removed.
|
|
A more natural home for this constant.
|
|
|
|
This adds a vector register type. The type is defined as a std::array of a
fixed number of uint64_ts. The isa_parser.py has been modified to parse vector
register operands and generate the required code. Different cpus have vector
register files now.
|
|
The drain() call currently passes around a DrainManager pointer, which
is now completely pointless since there is only ever one global
DrainManager in the system. It also contains vestiges from the time
when SimObjects had to keep track of their child objects that needed
draining.
This changeset moves all of the DrainState handling to the Drainable
base class and changes the drain() and drainResume() calls to reflect
this. Particularly, the drain() call has been updated to take no
parameters (the DrainManager argument isn't needed) and return a
DrainState instead of an unsigned integer (there is no point returning
anything other than 0 or 1 any more). Drainable objects should return
either DrainState::Draining (equivalent to returning 1 in the old
system) if they need more time to drain or DrainState::Drained
(equivalent to returning 0 in the old system) if they are already in a
consistent state. Returning DrainState::Running is considered an
error.
Drain done signalling is now done through the signalDrainDone() method
in the Drainable class instead of using the DrainManager directly. The
new call checks if the state of the object is DrainState::Draining
before notifying the drain manager. This means that it is safe to call
signalDrainDone() without first checking if the simulator has
requested draining. The intention here is to reduce the code needed to
implement draining in simple objects.
|
|
Draining is currently done by traversing the SimObject graph and
calling drain()/drainResume() on the SimObjects. This is not ideal
when non-SimObjects (e.g., ports) need draining since this means that
SimObjects owning those objects need to be aware of this.
This changeset moves the responsibility for finding objects that need
draining from SimObjects and the Python-side of the simulator to the
DrainManager. The DrainManager now maintains a set of all objects that
need draining. To reduce the overhead in classes owning non-SimObjects
that need draining, objects inheriting from Drainable now
automatically register with the DrainManager. If such an object is
destroyed, it is automatically unregistered. This means that drain()
and drainResume() should never be called directly on a Drainable
object.
While implementing the new functionality, the DrainManager has now
been made thread safe. In practice, this means that it takes a lock
whenever it manipulates the set of Drainable objects since SimObjects
in different threads may create Drainable objects
dynamically. Similarly, the drain counter is now an atomic_uint, which
ensures that it is manipulated correctly when objects signal that they
are done draining.
A nice side effect of these changes is that it makes the drain state
changes stricter, which the simulation scripts can exploit to avoid
redundant drains.
|
|
The drain state enum is currently a part of the Drainable
interface. The same state machine will be used by the DrainManager to
identify the global state of the simulator. Make the drain state a
global typed enum to better cater for this usage scenario.
|
|
Objects that are can be serialized are supposed to inherit from the
Serializable class. This class is meant to provide a unified API for
such objects. However, so far it has mainly been used by SimObjects
due to some fundamental design limitations. This changeset redesigns
to the serialization interface to make it more generic and hide the
underlying checkpoint storage. Specifically:
* Add a set of APIs to serialize into a subsection of the current
object. Previously, objects that needed this functionality would
use ad-hoc solutions using nameOut() and section name
generation. In the new world, an object that implements the
interface has the methods serializeSection() and
unserializeSection() that serialize into a named /subsection/ of
the current object. Calling serialize() serializes an object into
the current section.
* Move the name() method from Serializable to SimObject as it is no
longer needed for serialization. The fully qualified section name
is generated by the main serialization code on the fly as objects
serialize sub-objects.
* Add a scoped ScopedCheckpointSection helper class. Some objects
need to serialize data structures, that are not deriving from
Serializable, into subsections. Previously, this was done using
nameOut() and manual section name generation. To simplify this,
this changeset introduces a ScopedCheckpointSection() helper
class. When this class is instantiated, it adds a new /subsection/
and subsequent serialization calls during the lifetime of this
helper class happen inside this section (or a subsection in case
of nested sections).
* The serialize() call is now const which prevents accidental state
manipulation during serialization. Objects that rely on modifying
state can use the serializeOld() call instead. The default
implementation simply calls serialize(). Note: The old-style calls
need to be explicitly called using the
serializeOld()/serializeSectionOld() style APIs. These are used by
default when serializing SimObjects.
* Both the input and output checkpoints now use their own named
types. This hides underlying checkpoint implementation from
objects that need checkpointing and makes it easier to change the
underlying checkpoint storage code.
|
|
Break the dependency on dma_device.hh by forward-declaring DmaPort in
the relevant header.
|
|
There seems to have been a debug print left in when the original ARMv8
support was merged in. This printout is performed every time you
initialize a hardware thread, and it prints raw pointers, so it always
causes diffs in the regression. This patch removes the debug print.
|
|
ldrsh was typoed as hdrsh, which is a bit annoying when printing
instructions. This patch fixes it.
|
|
put O_DIRECT under ifdefs -- this fixes build for MacOSX.
Also use correct class for arm64 openFlagTable.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
|
This changeset adds support for aarch64 in kvm. The CPU module
supports both checkpointing and online CPU model switching as long as
no devices are simulated by the host kernel. It currently has the
following limitations:
* The system register based generic timer can only be simulated by
the host kernel. Workaround: Use a memory mapped timer instead to
simulate the timer in gem5.
* Simulating devices (e.g., the generic timer) in the host kernel
requires that the host kernel also simulates the GIC.
* ID registers in the host and in gem5 must match for switching
between simulated CPUs and KVM. This is particularly important
for ID registers describing memory system capabilities (e.g.,
ASID size, physical address size).
* Switching between a virtualized CPU and a simulated CPU is
currently not supported if in-kernel device emulation is
used. This could be worked around by adding support for switching
to the gem5 (e.g., the KvmGic) side of the device models. A
simpler workaround is to avoid in-kernel device models
altogether.
|
|
This changeset adds a GIC implementation that uses the kernel's
built-in support for simulating the interrupt controller. Since there
is currently no support for state transfer between gem5 and the
kernel, the device model does not support serialization and CPU
switching (which would require switching to a gem5-simulated GIC).
|
|
This changeset moves the ARM-specific KVM CPU implementation to
arch/arm/kvm/. This change is expected to keep the source tree
somewhat cleaner as we start adding support for ARMv8 and KVM
in-kernel interrupt controller simulation.
--HG--
rename : src/cpu/kvm/ArmKvmCPU.py => src/arch/arm/kvm/ArmKvmCPU.py
rename : src/cpu/kvm/arm_cpu.cc => src/arch/arm/kvm/arm_cpu.cc
rename : src/cpu/kvm/arm_cpu.hh => src/arch/arm/kvm/arm_cpu.hh
|
|
|
|
This patch adds better caching of the sys regs for AArch64, thus
avoiding unnecessary calls to tc->readMiscReg(MISCREG_CPSR) in the
non-faulting case.
|
|
Adding a few syscalls that were previously considered unimplemented.
|
|
The ArmSystem class has a parameter to indicate whether it is
configured to use the generic timer extension or not. This parameter
doesn't affect any feature flags in the current implementation and is
therefore completely unnecessary. In fact, we usually don't set it
even if a system has a generic timer. If we ever need to check if
there is a generic timer present, we should just request a pointer and
check if it is non-null instead.
|