Age  Commit message  Author
2015-05-23  kvm: Fix dumping code for large registers  Andreas Sandberg
The register dumping code in kvm prints the raw bytes of large registers (128 bits and larger) instead of formatting them as hex. This changeset fixes that.
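As a rough illustration of the intended behaviour (not the gem5 code itself), a wide register held as raw bytes can be rendered as a hex string. The function name `format_register` and the little-endian storage order are assumptions made for this sketch.

```python
def format_register(raw: bytes) -> str:
    """Render a register stored least-significant byte first as hex.

    Hypothetical helper: gem5's actual dumping code is C++ and may
    format the value differently.
    """
    # Reverse the bytes so the most significant byte is printed first.
    return "0x" + raw[::-1].hex()


# A made-up 128-bit register value, 16 bytes long.
xmm0 = bytes(range(16))
print(format_register(xmm0))
```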
2015-05-23  kvm, x86: Guard x86-specific APIs in KvmVM  Andreas Sandberg
Protect x86-specific APIs in KvmVM with compile-time guards to avoid breaking ARM builds.
2015-05-23  build: Don't test for KVM xsave support on ARM  Andreas Sandberg
The current build tests for KVM unconditionally check for xsave support. This obviously never works on ARM since xsave is x86-specific. This changeset refactors the build tests probing for KVM support and moves the xsave test to an x86-specific section of is_isa_kvm_compatible().
2015-05-23  arm: Workaround incorrect HDLCD register order in kernel  Andreas Sandberg
Some versions of the kernel incorrectly swap the red and blue color select registers. This changeset adds a workaround for that by swapping them when instantiating a PixelConverter.
2015-05-23  base: Redesign internal frame buffer handling  Andreas Sandberg
Currently, frame buffer handling in gem5 is quite ad hoc. In practice, we pass around naked pointers to raw pixel data and expect consumers to convert frame buffers using the (broken) VideoConverter. This changeset completely redesigns the way we handle frame buffers internally. In summary, it fixes several color conversion bugs, adds support for more color formats (e.g., big endian), and makes the code base easier to follow.
In the new world, gem5 always represents pixel data using the Pixel struct when pixels need to be passed between different classes (e.g., a display controller and the VNC server). Producers of entire frames (e.g., display controllers) should use the FrameBuffer class to represent a frame. Frame producers are expected to create one instance of the FrameBuffer class in their constructors and register it with its consumers once. Consumers are expected to check the dimensions of the frame buffer when they consume it.
Conversion between the external representation and the internal representation is supported for all common "true color" RGB formats of up to 32-bit color depth. The external pixel representation is expected to be between 1 and 4 bytes in either big endian or little endian. Color channels are assumed to be contiguous ranges of bits within each pixel word. The external pixel value is scaled to an 8-bit internal representation using a floating-point multiplication to map it to the entire 8-bit range.
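The conversion described above can be sketched as follows. This is a minimal illustration, not the gem5 PixelConverter: the names `Channel`, `to_8bit`, and `to_rgb`, the rounding mode, and the RGB565 layout used in the example are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    """A contiguous bit range within a pixel word (hypothetical type)."""
    offset: int  # bit position of the channel's least significant bit
    width: int   # number of bits in the channel, assumed to be >= 1

    def to_8bit(self, word: int) -> int:
        mask = (1 << self.width) - 1
        value = (word >> self.offset) & mask
        # Floating-point scale so the channel maximum maps to 255.
        # Rounding to nearest is an assumption of this sketch.
        return int(round(value * (255.0 / mask)))


def to_rgb(data: bytes, byte_order: str, r: Channel, g: Channel, b: Channel):
    """Convert a 1 to 4 byte external pixel to an 8-bit (R, G, B) tuple."""
    word = int.from_bytes(data, byte_order)  # "little" or "big"
    return (r.to_8bit(word), g.to_8bit(word), b.to_8bit(word))


# Example layout: 16-bit RGB565, red in the top five bits.
RED, GREEN, BLUE = Channel(11, 5), Channel(5, 6), Channel(0, 5)
print(to_rgb(b"\x00\xf8", "little", RED, GREEN, BLUE))
```

Scaling by `255.0 / mask` rather than shifting left means a fully saturated 5-bit channel becomes 255 instead of 248, which is what mapping to the entire 8-bit range requires.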
2015-05-23  base: Clean up bitmap generation code  Andreas Sandberg
The bitmap generation code is hard to follow and incorrectly uses the size of an enum member to calculate the size of a pixel. This changeset cleans up the code and adds some documentation.
2015-05-19  ruby: Fix RubySystem warm-up and cool-down scope  Joel Hestness
The processes of warming up and cooling down Ruby caches are simulation-wide processes, not just RubySystem instance-specific processes. Thus, the warm-up and cool-down variables should be globally visible to any Ruby components participating in either process. Make these variables static members and track the warm-up and cool-down processes as appropriate. This patch also has two side benefits:
1) It removes references to the RubySystem g_system_ptr, which are problematic for allowing multiple RubySystem instances in a single simulation. Warmup and cooldown variables being static (global) reduces the need for instance-specific dereferences through the RubySystem.
2) From the AbstractController, it removes local RubySystem pointers, which are used inconsistently with other uses of the RubySystem: 11 other uses reference the RubySystem with the g_system_ptr. Only sequencers have local pointers.
2015-05-15  arm: Identify table-walker requests  Andreas Hansson
This patch ensures all page-table walks are flagged as such.
2015-05-15  misc: Appease gcc 5.1  Andreas Hansson
Three minor issues are resolved:
1. Apparently gcc 5.1 does not like negation of booleans followed by bitwise AND.
2. Somehow the compiler also gets confused and warns about NoopMachInst being unused (removing it causes compilation errors though). Most likely a compiler bug.
3. There seem to be a number of instances where loop unrolling causes false positives for the array-bounds check. For now, switch to std::array. Potentially we could disable the warning for newer gcc versions, but switching to std::array is probably a good move in any case.
2015-05-15  sim: Don't clear the active CPU vector in System::initState  Andreas Sandberg
The system class currently clears the vector of active CPUs in initState(). CPUs are added to the list by registerThreadContext() which is called from BaseCPU::init(). This obviously breaks when the System object is initialized after the CPUs. This changeset removes the offending clear() call since the list will be empty after it has been instantiated anyway.
2015-05-15  config: Use null memory for DRAM sweep script  Andreas Hansson
Do not waste time when we do not care about the data.
2015-05-15  config: Add new MemConfig options to DRAM sweep script  Wendy Elsasser
Update script to match current MemConfig options with external_memory_system option set to 0.
2015-05-05  syscall_emul: fix warn_once behavior  Steve Reinhardt
The current ignoreWarnOnceFunc doesn't really work as expected, since it will only generate one warning total, for whichever "warn-once" syscall is invoked first. This patch fixes that behavior by keeping a "warned" flag in the SyscallDesc object, allowing suitably flagged syscalls to warn exactly once per syscall.
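The difference between one warning in total and one warning per syscall can be sketched as below. This is an illustrative Python model rather than the gem5 C++ code: the class layout, the `ignore` method, and the list used as a log are assumptions. Only the idea of a per-descriptor "warned" flag comes from the commit message.

```python
class SyscallDesc:
    """Hypothetical stand-in for a syscall descriptor."""

    def __init__(self, name: str, warn_once: bool = False):
        self.name = name
        self.warn_once = warn_once
        self.warned = False  # kept per descriptor, not globally

    def ignore(self, log: list) -> int:
        # Suppress the message only if this particular syscall has
        # already warned and is flagged as warn-once.
        if not (self.warn_once and self.warned):
            log.append(f"ignoring syscall {self.name}(...)")
            self.warned = True
        return 0


log = []
first = SyscallDesc("syscall_a", warn_once=True)
second = SyscallDesc("syscall_b", warn_once=True)
for desc in (first, first, second, second):
    desc.ignore(log)
print(log)
```

With a single global flag, only the first of the two syscalls would have produced a message.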
2015-05-05  stats, arm: Update stats for missing FPEXC.EN check  Andreas Hansson
Only one regression is affected.
2015-05-05  arm: Add missing FPEXC.EN check  Andreas Hansson
Add a missing check to ensure that exceptions are generated properly.
2015-05-05  arm: enable DCZVA by default in SE mode  Giacomo Gabrielli
2015-05-05  stats: Update stats to reflect cache changes  Andreas Hansson
2015-03-17  mem: Create a request copy for deferred snoops  Stephan Diestelhorst
Sometimes, we need to defer an express snoop in an MSHR, but the original request might complete and deallocate the original pkt->req. In those cases, create a copy of the request so that someone who is inspecting the delayed snoop can still inspect the request. All of this is rather hacky, but the allocation / linking and general life-time management of Packet and Request is rather tricky. Deleting the copy is another tricky area; testing so far has shown that the right copy is deleted at the right time.
2015-05-05  arm: Relax ordering for some uncacheable accesses  Andreas Sandberg
We currently assume that all uncacheable memory accesses are strictly ordered. Instead of always enforcing strict ordering, we now only enforce it if the required memory type is device memory or strongly ordered memory.
2015-05-05  mem, cpu: Add a separate flag for strictly ordered memory  Andreas Sandberg
The Request::UNCACHEABLE flag currently has two different functions. The first, and obvious, function is to prevent the memory system from caching data in the request. The second function is to prevent reordering and speculation in CPU models. This changeset gives the order/speculation requirement a separate flag (Request::STRICT_ORDER). This flag prevents CPU models from doing the following optimizations:
* Speculation: CPU models are not allowed to issue speculative loads.
* Write combining: CPU models and caches are not allowed to merge writes to the same cache line.
Note: The memory system may still reorder accesses unless the UNCACHEABLE flag is set. It is therefore expected that the STRICT_ORDER flag is combined with the UNCACHEABLE flag to prevent this behavior.
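The split between the two flags can be modelled as independent bits. The sketch below is only an illustration: the bit values, the `ReqFlags` enum, and the three helper functions are hypothetical and do not reproduce gem5's Request class.

```python
from enum import IntFlag


class ReqFlags(IntFlag):
    # Bit positions are made up for this example.
    UNCACHEABLE = 0x1
    STRICT_ORDER = 0x2


def may_cache(flags: ReqFlags) -> bool:
    """The memory system may cache data unless UNCACHEABLE is set."""
    return not (flags & ReqFlags.UNCACHEABLE)


def may_speculate(flags: ReqFlags) -> bool:
    """CPU models may issue speculative loads unless STRICT_ORDER is set."""
    return not (flags & ReqFlags.STRICT_ORDER)


def may_merge_writes(flags: ReqFlags) -> bool:
    """Write combining is allowed unless STRICT_ORDER is set."""
    return not (flags & ReqFlags.STRICT_ORDER)


# Strictly ordered accesses are expected to carry both flags, since the
# memory system may otherwise still reorder them.
device_access = ReqFlags.UNCACHEABLE | ReqFlags.STRICT_ORDER
relaxed_uncached = ReqFlags.UNCACHEABLE
print(may_speculate(device_access), may_speculate(relaxed_uncached))
```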
2015-05-05  mem, alpha: Move Alpha-specific request flags  Andreas Sandberg
Move Alpha-specific memory request flags to an architecture-specific header and map them to the architecture-specific flag bit range.
2015-05-05  arm: Remove unnecessary boot uncachability  Andreas Hansson
With the recent patches addressing how we deal with uncacheable accesses, there is no longer any need for the workarounds put in place to force certain sections of memory to be uncacheable during boot.
2015-05-05  mem: Snoop into caches on uncacheable accesses  Andreas Hansson
This patch takes a last step in fixing issues related to uncacheable accesses. We do not separate uncacheable memory from uncacheable devices, and in cases where it is really memory, there are valid scenarios where we need to snoop since we do not support cache maintenance instructions (yet). On snooping an uncacheable access we thus provide data if possible. In essence this makes uncacheable accesses IO coherent. The snoop filter is also queried to steer the snoops, but not updated since the uncacheable accesses do not allocate a block.
2015-05-05  arch, cpu: Do not forward snoops to table walker  Andreas Hansson
This patch simplifies the overall CPU by changing the TLB caches such that they do not forward snoops to the table walker port(s). Note that only ARM and X86 are affected. There is no reason for the ports to snoop as they do not actually take any action, and from a performance point of view we are better off not snooping more than we have to. Should it at a later point be required to snoop for a particular TLB design, it is easy enough to add it back.
2015-05-05  mem: Pass shared downstream through caches  Andreas Hansson
This patch ensures that we pass on information about a packet being shared (rather than exclusive), when forwarding a packet downstream. Without this patch there is a risk that a downstream cache considers the line exclusive when it really isn't.
2015-05-05  mem: Add forward snoop check for HardPFReqs  Ali Jafri
We should always check whether the cache is supposed to be forwarding snoops before generating snoops.
2015-05-05  mem: Add missing stats update for uncacheable MSHRs  Andreas Hansson
This patch adds a missing counter update for the uncacheable accesses. By updating this counter we also get a meaningful average latency for uncacheable accesses (previously inf).
2015-05-05  mem: Tidy up BaseCache parameters  Andreas Hansson
This patch simply tidies up the BaseCache parameters and removes the unused "two_queue" parameter.
2015-05-05  mem: Remove templates in cache model  David Guillen
This patch changes the cache implementation to rely on virtual methods rather than using the replacement policy as a template argument. There is no impact on the simulation performance, and overall the changes make it easier to modify (and subclass) the cache and/or replacement policy.
2015-05-05  cpu: Work around gcc 4.9 issues with Num_OpClasses  Andreas Hansson
This patch fixes a recent issue with gcc 4.9 (and possibly more) being convinced that indices outside the array bounds are used when initialising the FUPool members.
2015-05-05  stats: Bring regression stats in line with actual behaviour  Andreas Hansson
2015-04-30  stats: arm: updates  Nilay Vaish
2015-04-29  stats: x86: updates due to change in div latency  Nilay Vaish
2015-04-29  arch, base, dev, kern, sym: FreeBSD support  Ruslan Bukin
This adds support for FreeBSD/aarch64 FS and SE mode (basic set of syscalls only).
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
2015-04-29  mem: Simplify page close checks for adaptive policies  Rizwana Begum
Both the open_adaptive and close_adaptive page policies keep the page open if a row hit is found. If a row hit is not found, the close_adaptive page policy precharges the row, and the open_adaptive policy precharges the row only if there is a bank conflict request waiting in the queue. This patch makes the checks for the above conditions simpler.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
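The two policies reduce to a small decision function. The sketch below restates the rules from the commit message in Python; the function name `should_precharge` and its boolean parameters are assumptions, and the real DRAM controller works on its request queues rather than on precomputed flags.

```python
def should_precharge(policy: str, row_hit_pending: bool,
                     bank_conflict_pending: bool) -> bool:
    """Decide whether to close (precharge) the open row after an access.

    Hypothetical helper: the two booleans summarise what a controller
    would find by scanning its queue.
    """
    if row_hit_pending:
        # Both adaptive policies keep the page open on a row hit.
        return False
    if policy == "close_adaptive":
        # No row hit: always precharge.
        return True
    if policy == "open_adaptive":
        # No row hit: precharge only if a conflicting request waits.
        return bank_conflict_pending
    raise ValueError(f"not an adaptive page policy: {policy}")


print(should_precharge("open_adaptive", False, False))
print(should_precharge("close_adaptive", False, False))
```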
2015-04-29  ruby: set: replace long by unsigned long  Nilay Vaish
UBSan complains about a negative value being shifted.
2015-04-29  cpu: o3: replace issueLatency with bool pipelined  Nilay Vaish
Currently, each op class has a parameter issueLat that denotes the cycles after which another op of the same class can be issued. As of now, this latency can either be one cycle (fully pipelined) or same as execution latency of the op (not at all pipelined). The fact that issueLat is a parameter of type Cycles makes one believe that it can be set to any value. To avoid the confusion, the parameter is being renamed as 'pipelined' with type boolean. If set to true, the op would execute in a fully pipelined fashion. Otherwise, it would execute in an unpipelined fashion.
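The two cases the old parameter could actually express map onto a boolean as in the sketch below. The function name and signature are hypothetical; only the two behaviours (one cycle when fully pipelined, the full execution latency otherwise) come from the commit message.

```python
def cycles_until_next_issue(op_latency: int, pipelined: bool) -> int:
    """Cycles before another op of the same class can issue.

    Hypothetical helper for illustration only.
    """
    # Fully pipelined units accept a new op every cycle; unpipelined
    # units are busy for the whole execution latency.
    return 1 if pipelined else op_latency


print(cycles_until_next_issue(op_latency=20, pipelined=True))
print(cycles_until_next_issue(op_latency=20, pipelined=False))
```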
2015-04-29  cpu: o3: single cycle default div microop latency on x86  Nilay Vaish
This patch sets the default latency of the division microop to a single cycle on x86. This is because the division instructions DIV and IDIV have been implemented as loops of div microops, where each microop computes a single bit of the quotient.
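To see why a one-cycle step latency is natural here, consider a division built from steps that each produce one quotient bit. The sketch below uses textbook unsigned restoring division as a stand-in; it is not the gem5 microcode, and the names `div_step` and `divide` are assumptions.

```python
def div_step(remainder: int, quotient: int, dividend_bit: int, divisor: int):
    """One iteration: shift in a dividend bit, emit one quotient bit."""
    remainder = (remainder << 1) | dividend_bit
    quotient <<= 1
    if remainder >= divisor:
        remainder -= divisor
        quotient |= 1
    return remainder, quotient


def divide(dividend: int, divisor: int, width: int = 32):
    """Unsigned division as a loop of single-bit steps."""
    if divisor == 0:
        raise ZeroDivisionError("divide error")
    remainder = quotient = 0
    # Walk the dividend from its most significant bit downwards.
    for bit in reversed(range(width)):
        remainder, quotient = div_step(
            remainder, quotient, (dividend >> bit) & 1, divisor)
    return quotient, remainder


print(divide(100, 7))
```

A `width`-bit division runs `width` steps, so the total latency scales with operand size even when each step takes one cycle.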
2015-04-29  x86: change divide-by-zero fault to divide-error  Nilay Vaish
The same exception is raised whether a division by zero is performed or the quotient is greater than the maximum value that the provided space can hold. Divide-by-Zero is the AMD terminology, while Divide-Error is Intel's.
2015-04-24  misc: Appease gcc 5.1 without moving GDB_REG_BYTES  Andreas Hansson
This patch rolls back the move of the GDB_REG_BYTES constant, and instead adds M5_VAR_USED.
2015-04-23  config: enable setting SE-mode environment variables from file  bpotter
2015-04-23  arm, dev: Add a UFS device  Rene de Jong
This patch introduces a UFS host controller and a UFS device. More information about the UFS standard can be found at the JEDEC site: http://www.jedec.org/standards-documents/results/jesd220 Note that the model does not implement the complete standard, and as such is not an actual implementation of UFS. The following SCSI commands are implemented: inquiry, read, read capacity, report LUNs, start/stop, test unit ready, verify, write, format unit, send diagnostic, synchronize cache, mode select, mode sense, request sense, unmap, write buffer and read buffer. This is sufficient for usage with Linux and Android. To interact with this model a kernel version 3.9 or above is needed.
2015-04-23  arm, dev: Add a NAND flash timing model  Rene de Jong
This adds a NAND flash timing model. This model takes the number of planes into account and is ultimately intended to be used as a high-level performance model for any device using flash. To access the memory, use either readMemory or writeMemory. To make use of the model you will need an interface model such as UFSHostDevice, which is part of a separate patch. At the moment the flash device is part of the ARM device tree since the only user is the UFSHostDevice, and that in turn relies on the ARM GIC.
2015-04-23  dev: Add support for i2c devices  Peter Enns
This patch adds an I2C bus and base device. I2C is used to connect a variety of sensors, and this patch serves as a starting point to enable a range of I2C devices.
2015-04-23  misc: Appease gcc 5.1  Andreas Hansson
This patch fixes a few small issues to ensure gem5 compiles when using gcc 5.1. First, the GDB_REG_BYTES in the RemoteGDB header are, rather surprisingly, flagged as unused for both ARM and X86. Removing them, however, causes compilation errors as they are actually used in the source file. Moving the constant into the class definition fixes the issue. Possibly a gcc bug. Second, we have an unused EthPktData constructor using auto_ptr, and the latter is deprecated. Since the code is never used it is simply removed.
2015-04-22  stats: update for previous changeset  Steve Reinhardt
Very small differences in IQ-specific O3 stats.
2015-04-22  cpu: remove conditional check (count > 0) on o3 IQ squashes  Brandon Potter
The o3 cpu instruction queue model uses the count variable to track the number of unissued instructions in the queue. Previously, the squash method used this variable to avoid executing the doSquash method when there were no unissued instructions in the pipeline. A corner case problem exists when only issued instructions exist in the pipeline and a squash occurs; the doSquash code is not invoked and subsequently does not clean up state properly.
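The corner case can be reproduced with a toy queue. The model below is a deliberately simplified sketch: the class `InstQueue`, its fields, and the `guard_on_count` switch are hypothetical and only mimic the relationship between the count of unissued instructions and the state that a squash must clean up.

```python
class InstQueue:
    """Toy instruction queue (not the gem5 O3 implementation)."""

    def __init__(self, guard_on_count: bool):
        self.guard_on_count = guard_on_count  # True mimics the old check
        self.count = 0      # number of unissued instructions
        self.tracked = []   # state that a squash has to clean up

    def insert(self, inst: str) -> None:
        self.tracked.append(inst)
        self.count += 1

    def issue(self, inst: str) -> None:
        # Issued instructions stay tracked until they complete.
        self.count -= 1

    def squash(self) -> None:
        if self.guard_on_count and self.count == 0:
            return  # old behaviour: clean-up skipped entirely
        self.tracked.clear()
        self.count = 0


old, new = InstQueue(guard_on_count=True), InstQueue(guard_on_count=False)
for queue in (old, new):
    queue.insert("ld")
    queue.issue("ld")   # only issued instructions remain
    queue.squash()
print(old.tracked, new.tracked)
```

With the guard in place the issued instruction is left behind after the squash; without it the state is cleaned up.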
2015-04-22  syscall_emul: implement clock_gettime system call  Brandon Potter
2015-04-22  syscall_emul: update x86 syscall table  Monir Mozumder
Update table with additional definitions through Linux 3.13.
2015-04-22  syscall_emul: update getrlimit to use warn  Brandon Potter
Don't use std::cerr directly, and just return EINVAL instead of aborting.