gem5 - gem5

Age	Commit message	Author
2015-07-07	sim: Refactor the serialization base class	Andreas Sandberg
	Objects that can be serialized are supposed to inherit from the Serializable class. This class is meant to provide a unified API for such objects. However, so far it has mainly been used by SimObjects due to some fundamental design limitations. This changeset redesigns the serialization interface to make it more generic and hide the underlying checkpoint storage. Specifically:
	* Add a set of APIs to serialize into a subsection of the current object. Previously, objects that needed this functionality would use ad-hoc solutions using nameOut() and section name generation. In the new world, an object that implements the interface has the methods serializeSection() and unserializeSection() that serialize into a named /subsection/ of the current object. Calling serialize() serializes an object into the current section.
	* Move the name() method from Serializable to SimObject as it is no longer needed for serialization. The fully qualified section name is generated by the main serialization code on the fly as objects serialize sub-objects.
	* Add a ScopedCheckpointSection helper class. Some objects need to serialize data structures that do not derive from Serializable into subsections. Previously, this was done using nameOut() and manual section name generation. To simplify this, this changeset introduces a ScopedCheckpointSection helper class. When this class is instantiated, it adds a new /subsection/, and subsequent serialization calls during the lifetime of the helper object happen inside this section (or a subsection in case of nested sections).
	* The serialize() call is now const, which prevents accidental state manipulation during serialization. Objects that rely on modifying state can use the serializeOld() call instead. The default implementation simply calls serialize(). Note: The old-style calls need to be explicitly called using the serializeOld()/serializeSectionOld() style APIs. These are used by default when serializing SimObjects.
	* Both the input and output checkpoints now use their own named types. This hides the underlying checkpoint implementation from objects that need checkpointing and makes it easier to change the underlying checkpoint storage code.
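The scoped-section idea from the commit above can be illustrated with a small, self-contained sketch. This is not the gem5 implementation: `SketchCheckpoint` and `SketchScopedSection` are hypothetical names, and the dot-separated section path is an assumption made purely for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a checkpoint writer; not the gem5 API.
class SketchCheckpoint
{
  public:
    void pushSection(const std::string &name) { path.push_back(name); }
    void popSection() { path.pop_back(); }

    // Fully qualified name of the current section, e.g. "cpu.tlb".
    std::string currentSection() const
    {
        std::string full;
        for (const auto &part : path) {
            if (!full.empty())
                full += ".";
            full += part;
        }
        return full;
    }

  private:
    std::vector<std::string> path;
};

// RAII helper: enters a subsection on construction and leaves it again
// on destruction, so nested scopes produce nested sections.
class SketchScopedSection
{
  public:
    SketchScopedSection(SketchCheckpoint &cp, const std::string &name)
        : checkpoint(cp)
    {
        checkpoint.pushSection(name);
    }

    ~SketchScopedSection() { checkpoint.popSection(); }

    // Copying would pop the same section twice, so forbid it.
    SketchScopedSection(const SketchScopedSection &) = delete;
    SketchScopedSection &operator=(const SketchScopedSection &) = delete;

  private:
    SketchCheckpoint &checkpoint;
};
```

Tying the section to an object's lifetime means the section is left on every exit path from the scope, which is the main attraction over manual name generation.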
2015-07-07	sim: Add serialization macros for std containers	Andreas Sandberg

2015-07-06	mem: Cleanup CommMonitor in preparation for probe support	Andreas Sandberg
	Make configuration parameters constant and get rid of an unnecessary dependency on the Time class.
2015-07-04	x86: Adjust the size of the values written to the x87 misc registers	Nikos Nikoleris
	All x87 misc registers are implemented in an array of 64 bit values but in real hardware the size of some of these registers is smaller. Previously all 64 bits were incorrectly set and then later read. To ensure correctness we mask the value in setMiscRegNoEffect to write only the valid bits. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
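A minimal sketch of the masking step, assuming a hypothetical helper `maskToWidth` (this is not the code in setMiscRegNoEffect). As an example of a narrower register, the x87 control and status words are 16 bits wide on real hardware.

```cpp
#include <cstdint>

// Keep only the low `validBits` bits of a value held in a 64-bit slot.
// `validBits` is expected to be in the range [1, 64]; the explicit check
// avoids an undefined 64-bit shift when the register uses the full slot.
constexpr std::uint64_t
maskToWidth(std::uint64_t value, unsigned validBits)
{
    return validBits >= 64
        ? value
        : value & ((std::uint64_t(1) << validBits) - 1);
}
```

Writing `maskToWidth(val, 16)` into a 16-bit register slot guarantees that a later read cannot observe stray upper bits.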
2015-07-04	o3: correct the number of cc registers in rename map	Nilay Vaish

2015-07-04	mem: packet: Add const to constructor argument	Nilay Vaish

2015-07-04	ruby: drop NetworkMessage class	Nilay Vaish
	This patch drops the NetworkMessage class. The relevant data members and functions have been moved to the Message class, which was the parent of NetworkMessage.
2015-07-04	ruby: mesi three level: name change to avoid clash	Nilay Vaish
	The accessor function getDestination() for Destination variable in the coherence message clashes with the getDestination() that is part of the Message class. Hence the name change.
2015-07-04	ruby: remove message buffer node	Nilay Vaish
	This structure's only purpose was to provide a comparison function for ordering messages in the MessageBuffer. The comparison function is now being moved to the Message class itself. So we no longer require this structure.
2015-07-03	mem: Increase the default buffer sizes for the DDR4 controller	Andreas Hansson
	This patch increases the default read/write buffer sizes for the DDR4 controller config to values that are more suitable for the high bandwidth and high bank count.
2015-07-03	mem: Update DRAM command scheduler for bank groups	Wendy Elsasser
	This patch updates the command arbitration so that bank group timing as well as rank-to-rank delays will be taken into account. The resulting arbitration no longer selects commands (prepped or not) that cannot issue seamlessly if there are commands that can issue back-to-back, minimizing the effect of rank-to-rank (tCS) and same bank group (tCCD_L) delays. The arbitration selects a new command based on the following priority. Within each priority band, the arbitration will use FCFS to select the appropriate command:
	1) Bank is prepped and burst can issue seamlessly, without a bubble
	2) Bank is not prepped, but can prep and issue seamlessly, without a bubble
	3) Bank is prepped but burst cannot issue seamlessly. In this case, a bubble will occur on the bus
	Thus, to enable more parallelism in subsequent selections, an unprepped packet is given higher priority if the bank prep can be hidden. If the bank prep cannot be hidden, the selection logic will choose a prepped packet that cannot issue seamlessly if one exists. Otherwise, the default selection will choose the packet with the minimum bank prep delay.
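The priority bands can be sketched as follows. This is one reading of the commit message, not the DRAM controller code: `SketchCandidate`, its fields, and both functions are hypothetical, and the timing checks that decide whether a burst is "seamless" are assumed to have been done elsewhere.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical view of one queued DRAM packet; field names are illustrative.
struct SketchCandidate
{
    bool bankPrepped;    // the bank is already prepared for this access
    bool seamless;       // the burst can issue back-to-back, without a bubble
    unsigned prepDelay;  // remaining bank preparation delay, arbitrary units
};

// Priority band of a candidate: lower is better, 3 means no band applies.
inline int
priorityBand(const SketchCandidate &c)
{
    if (c.bankPrepped && c.seamless)
        return 0;  // 1) prepped, issues without a bubble
    if (!c.bankPrepped && c.seamless)
        return 1;  // 2) not prepped, but the prep can be hidden
    if (c.bankPrepped)
        return 2;  // 3) prepped, but a bubble will occur
    return 3;      // fallback: decided by minimum prep delay
}

// Returns the index of the selected candidate, or -1 for an empty queue.
// The queue is assumed to be in arrival order, so keeping the first
// candidate found in a band gives FCFS within that band.
inline int
selectCandidate(const std::vector<SketchCandidate> &queue)
{
    int best = -1;
    for (std::size_t i = 0; i < queue.size(); ++i) {
        const int idx = static_cast<int>(i);
        if (best < 0) {
            best = idx;
            continue;
        }
        const int band = priorityBand(queue[i]);
        const int bestBand = priorityBand(queue[best]);
        const bool betterBand = band < bestBand;
        const bool shorterPrep = band == 3 && bestBand == 3 &&
            queue[i].prepDelay < queue[best].prepDelay;
        if (betterBand || shorterPrep)
            best = idx;
    }
    return best;
}
```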
2015-07-03	mem: Avoid DRAM write queue iteration for merging and read lookup	Andreas Hansson
	This patch adds a simple lookup structure to avoid iterating over the write queue to find read matches, and for the merging of write bursts. Instead of relying on iteration we simply store a set of currently-buffered write-burst addresses and compare against these. For the reads we still perform the iteration if we have a match. For the writes, we rely entirely on the set. Note that there are corner-cases where sub-bursts would actually not be mergeable without a read-modify-write. We ignore these cases and opt for speed.
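A sketch of the lookup idea, under the assumption that burst addresses are aligned to a fixed burst size. `SketchWriteIndex` is a hypothetical class and not the structure used in the controller; as in the commit message, sub-burst corner cases are ignored.

```cpp
#include <cstdint>
#include <unordered_set>

// Hypothetical index over the burst-aligned addresses of buffered writes.
class SketchWriteIndex
{
  public:
    // `burst_size` is the burst size in bytes and must be non-zero.
    explicit SketchWriteIndex(std::uint64_t burst_size)
        : burstSize(burst_size) {}

    // Record a buffered write. Returns false if a write to the same burst
    // is already buffered, i.e. the new write can simply be merged.
    bool insert(std::uint64_t addr)
    {
        return buffered.insert(align(addr)).second;
    }

    // Forget the burst once it has left the write queue.
    void erase(std::uint64_t addr) { buffered.erase(align(addr)); }

    // Quick filter for reads: only on a hit is a queue walk needed.
    bool mayMatch(std::uint64_t addr) const
    {
        return buffered.count(align(addr)) != 0;
    }

  private:
    std::uint64_t align(std::uint64_t addr) const
    {
        return addr - (addr % burstSize);
    }

    std::uint64_t burstSize;
    std::unordered_set<std::uint64_t> buffered;
};
```

The set lookup takes constant time on average, so the common case (no match) no longer pays for a walk over the whole write queue.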
2015-07-03	mem: Delay responses in the crossbar before forwarding	Andreas Hansson
	This patch changes how the crossbar classes deal with responses. Instead of forwarding responses directly and burdening the neighbouring modules with paying for the latency (through the pkt->headerDelay), we now queue them before sending them. The coherency protocol is not affected as requests and any snoop requests/responses are still passed on in zero time. Thus, the responses end up paying for any header delay accumulated when passing through the crossbar. Any latency incurred on the request path will be paid for on the response side, if no other module has dealt with it. As a result of this patch, responses are returned at a later point. This affects the number of outstanding transactions, and quite a few regressions see an impact in blocking due to no MSHRs, increased cache-miss latencies, etc. Going forward we should be able to use the same concept also for snoop responses, and any request that is not an express snoop.
2015-07-03	mem: Remove redundant is_top_level cache parameter	Andreas Hansson
	This patch takes the final step in removing the is_top_level parameter from the cache. With the recent changes to read requests and write invalidations, the parameter is no longer needed, and consequently removed. This also means that asymmetric cache hierarchies are now fully supported (and we are actually using them already with L1 caches, but no table-walker caches, connected to a shared L2).
2015-07-03	mem: Split WriteInvalidateReq into write and invalidate	Andreas Hansson
	WriteInvalidateReq ensures that a whole-line write does not incur the cost of first doing a read exclusive, only to later overwrite the data. This patch splits the existing WriteInvalidateReq into a WriteLineReq, which is done locally, and an InvalidateReq that is sent out throughout the memory system. The WriteLineReq re-uses the normal WriteResp. The change allows us to better express the difference between the cache that is performing the write, and the ones that are merely invalidating. As a consequence, we no longer have to rely on the isTopLevel flag. Moreover, the actual memory in the system does not see the initial write, only the writeback. We were marking the written line as dirty already, so there is really no need to also push the write all the way to the memory. The overall flow of the write-invalidate operation remains the same, i.e. the operation is only carried out once the response for the invalidate comes back. This patch adds the InvalidateResp for this very reason.
2015-07-03	mem: Add ReadCleanReq and ReadSharedReq packets	Andreas Hansson
	This patch adds two new read request packets:
	ReadCleanReq - For a cache to explicitly request clean data. The response is thus exclusive or shared, but not owned or modified. The read-only caches (see previous patch) use this request type to ensure they do not get dirty data.
	ReadSharedReq - We add this to distinguish cache read requests from those issued by other masters, such as devices and CPUs. Thus, devices use ReadReq, and caches use ReadCleanReq, ReadExReq, or ReadSharedReq. For the latter, the response can be any state, shared, exclusive, owned or even modified.
	Both ReadCleanReq and ReadSharedReq re-use the normal ReadResp. The two transactions are aligned with the emerging cache-coherent TLM standard and the AMBA nomenclature. With this change, the normal ReadReq should never be used by a cache, and is reserved for the actual (non-caching) masters in the system. We thus have a way of identifying if a request came from a cache or not. The introduction of ReadSharedReq thus removes the need for the current isTopLevel hack, and also allows us to stop relying on checking the packet size to determine if the source is a cache or not. This is fixed in follow-on patches.
2015-07-03	mem: Allow read-only caches and check compliance	Andreas Hansson
	This patch adds a parameter to the BaseCache to enable a read-only cache, for example for the instruction cache, or table-walker cache (not for x86). A number of checks are put in place in the code to ensure a read-only cache does not end up with dirty data. A follow-on patch adds suitable read requests to allow a read-only cache to explicitly ask for clean data.
2015-07-03	mem: Add clean evicts to improve snoop filter tracking	Ali Jafri
	This patch adds eviction notices to the caches, to provide accurate tracking of cache blocks in snoop filters. We add the CleanEvict message to the memory hierarchy and use both CleanEvicts and Writebacks with BLOCK_CACHED flags to propagate notice of clean and dirty evictions respectively, down the memory hierarchy. Note that the BLOCK_CACHED flag indicates whether there exist any copies of the evicted block in the caches above the evicting cache.
	The purpose of the CleanEvict message is to notify snoop filters of silent evictions in the relevant caches. The CleanEvict message behaves much like a Writeback. CleanEvict is a write and a request but unlike a Writeback, CleanEvict does not have data and does not need exclusive access to the block. The cache generates the CleanEvict message on a fill resulting in eviction of a clean block. Before travelling downwards CleanEvict requests generate zero-time snoop requests to check if the same block is cached in upper levels of the memory hierarchy. If the block exists, the cache discards the CleanEvict message. The snoops check the tags, writeback queue and the MSHRs of upper level caches in a manner similar to snoops generated from HardPFReqs. Currently CleanEvicts keep travelling towards main memory unless they encounter the block corresponding to their address or reach main memory (since we have no well defined point of serialisation). Main memory simply discards CleanEvict messages.
	We have modified the behavior of Writebacks, such that they generate snoops to check for the presence of blocks in upper level caches. It is possible in our current implementation for a lower level cache to be writing back a block while a shared copy of the same block exists in the upper level cache. If the snoops find the same block in upper level caches, we set the BLOCK_CACHED flag in the Writeback message. We have also added logic to account for interaction of other message types with CleanEvicts waiting in the writeback queue. A simple example is of a response arriving at a cache removing any CleanEvicts to the same address from the cache's writeback queue.
2015-07-03	mem: Convert Request static const flags to enums	Andreas Hansson
	This patch fixes an issue which is very wide spread in the codebase, causing sporadic linking failures. The issue is that we declare static const class variables in the header, without any definition (as part of a source file). In most cases the compiler propagates the value and we have no issues. However, especially for less optimising builds such as debug, we get sporadic linking failures due to undefined references. This patch fixes the Request class, by turning the static const flags and master IDs into C++11 typed enums.
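A small sketch of the enum approach. The struct, flag names and values below are hypothetical and chosen only for illustration; they are not the Request class definitions.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical flag names and values, chosen only for illustration.
struct SketchRequest
{
    typedef std::uint32_t FlagsType;

    // A C++11 enum with a fixed underlying type. Enumerators are plain
    // values with no storage, so no definition in a source file is needed.
    enum : FlagsType
    {
        SKETCH_UNCACHEABLE = 0x00000400,
        SKETCH_PREFETCH    = 0x00010000
    };
};

// std::max takes its arguments by const reference. With a
// `static const FlagsType` member that has no definition in a source
// file, this kind of use can fail to link in less optimising builds;
// with enumerators a temporary is created and the link always succeeds.
inline SketchRequest::FlagsType
largerFlag()
{
    return std::max<SketchRequest::FlagsType>(
        SketchRequest::SKETCH_UNCACHEABLE, SketchRequest::SKETCH_PREFETCH);
}
```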
2015-07-03	base: remove fd from object loaders	Curtis Dunham
	All the object loaders directly examine the (already completely loaded by object_file.cc) memory image. There is no current motivation to keep the fd around.
2015-07-03	scons: Bump compiler requirement to gcc >= 4.7 and clang >= 3.1	Andreas Hansson
	This patch updates the compiler minimum requirement to gcc 4.7 and clang 3.1, thus allowing:
	1. Explicit virtual overrides (no need for M5_ATTR_OVERRIDE)
	2. Non-static data member initializers
	3. Template aliases
	4. Delegating constructors
	This patch also enables a transition from --std=c++0x to --std=c++11.
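The four language features listed in this commit can be seen together in a short sketch. All type and member names are hypothetical and unrelated to gem5 classes.

```cpp
#include <map>
#include <string>

// Template alias.
template <typename V>
using SketchNameMap = std::map<std::string, V>;

struct SketchBase
{
    virtual ~SketchBase() {}
    virtual int latency() const { return 1; }
};

struct SketchDevice : public SketchBase
{
    // Delegating constructor: reuses the constructor below.
    SketchDevice() : SketchDevice(4) {}
    explicit SketchDevice(int ports) : numPorts(ports) {}

    // Explicit virtual override: the compiler rejects this declaration
    // if it stops matching a virtual function in the base class.
    int latency() const override { return baseLatency * numPorts; }

    // Non-static data member initializers: defaults that apply unless a
    // constructor's initializer list says otherwise.
    int baseLatency = 2;
    int numPorts = 1;
};
```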
2015-06-25	ruby: slicc: remove README	Nilay Vaish
	No longer maintained. Updates are only made to the wiki page. So being dropped.
2015-06-25	ruby: message: remove a data member added by mistake	Nilay Vaish
	I (Nilay) had mistakenly added a data member to the Message class in revision c1694b4032a6. The data member is being removed.
2015-06-25	Ruby: Remove assert in RubyPort retry list logic	Jason Power
	Remove the assert when adding a port to the RubyPort retry list. Instead of asserting, just ignore the added port, since it's already on the list. Without this patch, Ruby+detailed fails for even the simplest tests.
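The "ignore instead of assert" behaviour can be sketched with a hypothetical helper; `addToRetryList` and the vector-based list are assumptions for illustration, not the RubyPort code.

```cpp
#include <algorithm>
#include <vector>

// Add `port` to the retry list unless it is already queued.
// Returns true if the list changed.
template <typename Port>
bool
addToRetryList(std::vector<Port *> &retryList, Port *port)
{
    if (std::find(retryList.begin(), retryList.end(), port) !=
        retryList.end()) {
        return false;  // already waiting for a retry: nothing to do
    }
    retryList.push_back(port);
    return true;
}
```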
2015-06-21	base: Add a warn_if macro	Andreas Sandberg
	Add a warn_if macro that is analogous to the panic_if and fatal_if macros.
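A minimal sketch of a conditional warning macro in this spirit. `SKETCH_WARN_IF` and `sketchWarn` are hypothetical names, and the two-argument form (a condition followed by a plain message) is an assumption; this is not the gem5 macro.

```cpp
#include <iostream>

// Number of warnings emitted so far; it exists only to make the sketch
// easy to check.
static int sketchWarnCount = 0;

inline void
sketchWarn(const char *file, int line, const char *msg)
{
    ++sketchWarnCount;
    std::cerr << "warn: " << file << ":" << line << ": " << msg << std::endl;
}

// Warn, but keep running, when `cond` is true. The do/while wrapper makes
// the macro behave as a single statement, e.g. inside an unbraced if/else.
#define SKETCH_WARN_IF(cond, msg)                       \
    do {                                                \
        if ((cond))                                     \
            sketchWarn(__FILE__, __LINE__, (msg));      \
    } while (0)
```

Unlike a panic or fatal variant, execution continues after the message is printed.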
2015-06-21	arm: Cleanup arch headers to remove dma_device.hh dependency	Andreas Sandberg
	Break the dependency on dma_device.hh by forward-declaring DmaPort in the relevant header.
2015-06-09	mem: Add check for express snoop in packet destructor	Ali Jafri
	Snoop packets share the request pointer with the originating packets. We need to ensure that the snoop packet destruction does not delete the request. Snoops are used for reads, invalidations, HardPFReqs, Writebacks and CleanEvicts. Reads, invalidations, and HardPFReqs need a response so their snoops do not delete the request. For Writebacks and CleanEvicts we need to check explicitly whether the current packet is an express snoop, in which case we do not delete the request.
2015-06-09	mem: Fix snoop packet data allocation bug	Andreas Hansson
	This patch fixes an issue where the snoop packet did not properly forward the data pointer in case of static data.
2015-06-09	arm: Delete debug print in initialization of hardware thread	Rune Holm
	There seems to have been a debug print left in when the original ARMv8 support was merged in. This printout is performed every time you initialize a hardware thread, and it prints raw pointers, so it always causes diffs in the regression. This patch removes the debug print.
2015-06-09	arm: Fix typo in ldrsh instruction name	Rune Holm
	ldrsh was typoed as hdrsh, which is a bit annoying when printing instructions. This patch fixes it.
2015-06-09	base: Reset CircleBuf size on flush()	Andreas Sandberg
	The flush() method in CircleBuf resets the state of the circular buffer, but fails to set size to zero. This obviously confuses code that tries to determine the amount of data in the buffer. Set the size to zero on flush.
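The bug class can be illustrated with a hypothetical minimal circular buffer (not gem5's CircleBuf): if flush() resets only the indices, size() keeps reporting stale data.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal circular buffer; capacity must be non-zero.
class SketchCircleBuf
{
  public:
    explicit SketchCircleBuf(std::size_t capacity)
        : buf(capacity), start(0), stop(0), count(0) {}

    // Append one byte, overwriting the oldest one when the buffer is full.
    void write(char c)
    {
        buf[stop] = c;
        stop = (stop + 1) % buf.size();
        if (count == buf.size())
            start = (start + 1) % buf.size();
        else
            ++count;
    }

    // Discard all buffered data. The element count has to be cleared
    // together with the indices, otherwise size() is wrong after a flush.
    void flush()
    {
        start = 0;
        stop = 0;
        count = 0;
    }

    std::size_t size() const { return count; }
    bool empty() const { return count == 0; }

  private:
    std::vector<char> buf;
    std::size_t start;
    std::size_t stop;
    std::size_t count;
};
```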
2015-06-09	dev, arm: Include PIO size in AmbaDmaDevice constructor	Andreas Sandberg
	Make it possible to specify the size of the PIO space for an AMBA DMA device. Maintain backwards compatibility and default to zero.
2015-06-07	ruby: Fix MESI consistency bug	Marco Elver
	Fixes missed forward eviction to CPU. With the O3CPU this can lead to load-load reordering, as the LQ is never notified of the invalidate. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-06-07	mem: Add HMC Timing Parameters	Matthias Jung
	A single HMC-2500 x32 model based on:
	[1] DRAMSpec: a high-level DRAM bank modelling tool developed at the University of Kaiserslautern. This high level tool uses RC (resistance-capacitance) and CV (capacitance-voltage) models to estimate the DRAM bank latency and power numbers.
	[2] A Logic-base Interconnect for Supporting Near Memory Computation in the Hybrid Memory Cube (E. Azarkhish et al.)
	Assumed for the HMC model is a 30 nm technology node. The modelled HMC consists of a 4 Gbit part with 4 layers connected with TSVs. Each layer has 16 vaults and each vault consists of 2 banks per layer. In order to be able to use the same controller used for 2D DRAM generations for HMC, the following analogy is made:
	Channel (DDR) => Vault (HMC)
	device_size (DDR) => size of a single layer in a vault
	ranks per channel (DDR) => number of layers
	banks per rank (DDR) => banks per layer
	devices per rank (DDR) => devices per layer (1 for HMC)
	The parameters for which no input is available are inherited from the DDR3 configuration.
2015-06-07	arch: fix build under MacOSX	Ruslan Bukin, Zhang Guoye
	Put O_DIRECT under ifdefs; this fixes the build for MacOSX. Also use the correct class for the arm64 openFlagTable. Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-05-30	mem: addr_mapper: restore old address if request not sent	Christoph Pfister
	Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-06-01	sim, arm: add checkpoint upgrader for d02b45a5	Curtis Dunham
	The insertion of CONTEXTIDR_EL2 in the ARM miscellaneous registers obsoletes old checkpoints.
2015-06-01	kvm, arm: Add support for aarch64	Andreas Sandberg
	This changeset adds support for aarch64 in kvm. The CPU module supports both checkpointing and online CPU model switching as long as no devices are simulated by the host kernel. It currently has the following limitations:
	* The system register based generic timer can only be simulated by the host kernel. Workaround: Use a memory mapped timer instead to simulate the timer in gem5.
	* Simulating devices (e.g., the generic timer) in the host kernel requires that the host kernel also simulates the GIC.
	* ID registers in the host and in gem5 must match for switching between simulated CPUs and KVM. This is particularly important for ID registers describing memory system capabilities (e.g., ASID size, physical address size).
	* Switching between a virtualized CPU and a simulated CPU is currently not supported if in-kernel device emulation is used. This could be worked around by adding support for switching to the gem5 (e.g., the KvmGic) side of the device models. A simpler workaround is to avoid in-kernel device models altogether.
2015-06-01	kvm, arm, dev: Add an in-kernel GIC implementation	Andreas Sandberg
	This changeset adds a GIC implementation that uses the kernel's built-in support for simulating the interrupt controller. Since there is currently no support for state transfer between gem5 and the kernel, the device model does not support serialization and CPU switching (which would require switching to a gem5-simulated GIC).
2015-06-01	kvm: Handle inst events at the current instruction count	Andreas Sandberg
	There are cases (particularly when attaching GDB) when instruction events are scheduled at the current instruction tick. This used to trigger an assertion error in kvm. This changeset adds a check for this condition and forces KVM to do a quick entry that completes any pending IO operations, but does not execute any new instructions, before servicing the event. We could check if we need to enter KVM at all, but forcing a quick entry makes the code slightly cleaner and does not hurt correctness (performance is hardly an issue in these cases).
2015-06-01	kvm, arm: Move ARM-specific files to arch/arm/kvm/	Andreas Sandberg
	This changeset moves the ARM-specific KVM CPU implementation to arch/arm/kvm/. This change is expected to keep the source tree somewhat cleaner as we start adding support for ARMv8 and KVM in-kernel interrupt controller simulation. --HG-- rename : src/cpu/kvm/ArmKvmCPU.py => src/arch/arm/kvm/ArmKvmCPU.py rename : src/cpu/kvm/arm_cpu.cc => src/arch/arm/kvm/arm_cpu.cc rename : src/cpu/kvm/arm_cpu.hh => src/arch/arm/kvm/arm_cpu.hh
2015-05-26	arm: implement the CONTEXTIDR_EL2 system reg.	Curtis Dunham

2015-05-26	arm: Make address translation faster with better caching	Nathanael Premillieu
	This patch adds better caching of the sys regs for AArch64, thus avoiding unnecessary calls to tc->readMiscReg(MISCREG_CPSR) in the non-faulting case.
2015-05-26	base: Allow multiple interleaved ranges	Andreas Hansson
	This patch changes how the address range calculates intersection such that a system can have a number of non-overlapping interleaved ranges without complaining. Without this patch we end up with a panic.
2015-05-26	cpu: Fix a bug in counting issued instructions in MinorCPU	Andrew Bardsley
	The MinorCPU would count bubbles in Execute::issue as part of the num_insts_issued and so sometimes reach the instruction issue limit incorrectly. Fixed by checking for a bubble in one new place.
2015-05-26	arm: Implement some missing syscalls (SE mode)	Giacomo Gabrielli
	Adding a few syscalls that were previously considered unimplemented.
2015-05-26	ruby: Deprecation warning for RubyMemoryControl	Andreas Hansson
	A step towards removing RubyMemoryControl and shifting users to DRAMCtrl. The latter is faster, more representative, very versatile, and is integrated with power models.
2015-05-23	arm, dev: Add support for a memory mapped generic timer	Andreas Sandberg
	There are cases when we don't want to use a system register mapped generic timer, but can't use the SP804. For example, when using KVM on aarch64, we want to intercept accesses to the generic timer, but can't do so if it is using the system register interface. In such cases, we need to use a memory-mapped generic timer. This changeset adds a device model that implements the memory mapped generic timer interface. The current implementation only supports a single frame (i.e., one virtual timer and one physical timer).
2015-05-23	arm: Get rid of pointless have_generic_timer param	Andreas Sandberg
	The ArmSystem class has a parameter to indicate whether it is configured to use the generic timer extension or not. This parameter doesn't affect any feature flags in the current implementation and is therefore completely unnecessary. In fact, we usually don't set it even if a system has a generic timer. If we ever need to check if there is a generic timer present, we should just request a pointer and check if it is non-null instead.
2015-05-23	dev, arm: Add virtual timers to the generic timer model	Andreas Sandberg
	The generic timer model currently does not support virtual counters. Virtual and physical counters both tick with the same frequency. However, virtual timers allow a hypervisor to set an offset that is subtracted from the counter when it is read. This enables the hypervisor to present a time base that ticks with virtual time in the VM (i.e., doesn't tick when the VM isn't running). Modern Linux kernels generally assume that virtual counters exist and try to use them by default.
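The relationship between the two counters can be written as a one-line sketch. `virtualCount` is a hypothetical helper, not part of the timer model, and the counters are assumed to be 64 bits wide.

```cpp
#include <cstdint>

// Value observed when reading the virtual counter: the physical count
// minus the offset programmed by the hypervisor. Unsigned arithmetic
// wraps modulo 2^64, so an offset larger than the count is well defined.
constexpr std::uint64_t
virtualCount(std::uint64_t physicalCount, std::uint64_t offset)
{
    return physicalCount - offset;
}
```

For instance, if a hypervisor adds the number of ticks a VM spent descheduled to the offset, the guest sees a counter that did not advance during that time.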