Age | Commit message | Author
2014-12-02 | stats: Bump stats for fixes, mostly TLB and WriteInvalidate | Andreas Hansson
2014-12-02 | scons: Ensure dictionary iteration is sorted by key | Andreas Hansson
This patch adds sorting based on the SimObject name or parameter name for all situations where we iterate over dictionaries. This should ensure a deterministic and consistent order across the host systems and hopefully avoid regression results differing across python versions.
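A minimal sketch of the idea, assuming a plain dictionary keyed by parameter name (the names and values below are hypothetical and not taken from the patch): iterating over `sorted(d)` visits the keys in a fixed order, independent of how the dictionary happens to be laid out on a given host or Python version.

```python
# Hypothetical parameter dictionary; the keys stand in for SimObject
# or parameter names.
params = {"size": "32kB", "assoc": 4, "clock": "1GHz"}


def iter_sorted(d):
    """Yield (key, value) pairs in ascending key order."""
    for key in sorted(d):
        yield key, d[key]


ordered = list(iter_sorted(params))
for name, value in ordered:
    print(name, value)
```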
2014-12-02 | mem: Support WriteInvalidate (again) | Curtis Dunham
This patch takes a clean-slate approach to providing WriteInvalidate (write streaming, full cache line writes without first reading) support. Unlike the prior attempt, which took an aggressive approach of directly writing into the cache before handling the coherence actions, this approach follows the existing cache flows as closely as possible.
2014-12-02 | mem: Remove WriteInvalidate support | Curtis Dunham
Prepare for a different implementation following in the next patch
2014-12-02 | cpu: Fix retries on barrier/store in Minor's store buffer | Andrew Bardsley
This patch fixes a case where a store in Minor's store buffer never leaves the store buffer as it is prematurely counted as having been issued, leading to the store buffer idling. LSQ::StoreBuffer::numUnissuedAccesses should count the number of accesses either in memory, or still in the store buffer after being completed. For stores which are also barriers, the store will stay in the store buffer for a cycle after it is completed and will be cleaned up by the barrier clearing code (to ensure that barriers are completed in-order). To achieve this, numUnissuedAccesses is not decremented when a store-barrier is issued to memory, but when its barrier effect is cleared. Without this patch, the correct behaviour happens when a memory transaction is immediately accepted, but not if it needs a retry.
2014-12-02 | cpu: Fix memoryIssueLimit checking in Minor | Andrew Bardsley
This patch fixes the checking of the number of memory instructions issued per cycle in the Minor CPU.
2014-12-02 | arm: Fix TLB ignoring faults when table walking | Andrew Bardsley
This patch fixes a case where the Minor CPU can deadlock due to the lack of a response to a TLB request because of a bug in fault handling in the ARM table walker. TableWalker::processWalkWrapper is the scheduler-called wrapper that handles deferred walks which calls to TableWalker::wait cannot immediately process. The handling of faults generated by processWalk{AArch64,LPAE,} calls in those two functions is different. processWalkWrapper ignores fault returns from processWalk..., which can lead to ::finish not being called on a translation. This fix provides fault handling in processWalkWrapper similar to that found in the leaf functions which call BaseTLB::Translation::finish.
2014-12-02 | config: Fix to SystemC example's event handling | Andrew Bardsley
This patch fixes checkpoint restore in the SystemC hosting example by handling early PollEvent events correctly before any EventQueue events are posted. The SystemC event queue handler (SCEventQueue) reports an error if the event loop is entered with no Events posted. It is possible for this to happen after instantiate due to PollEvent events. This patch separates out 'external' events into a different handler in sc_module.cc to prevent the error from occurring. This fix also improves the event handling of asynchronous events by: 1) making asynchronous events 'catch up' gem5 time to SystemC time to avoid the appearance that events have been lost while servicing an asynchronous event that schedules an event loop exit event, 2) adding an in_simulate data member to Module to allow the event loop to check whether events should be processed or deferred until the next time Module::simulate is entered, 3) cancelling pending events around the entry/exit of the event loop in Module::simulate, and 4) moving the state initialisation of the example entirely into run to correct a problem with early events in checkpoint restore. It is still possible to schedule asynchronous events (and take PollQueue actions) while simulate is not running. This behaviour may still cause some problems.
2014-12-02 | config: SystemC Gem5Control top level additions | Andrew Bardsley
This patch cleans up a few style issues and adds a few capabilities to the SystemC top level 'Gem5Control/Gem5System' mechanism. These include: 1) A space to store/retrieve a version string for a model 2) A mechanism for registering functions to be called at the end of elaboration to perform simulation setup tasks in SystemC 3) Adding setGDBRemotePort to the Gem5Control 4) Changing the sc_set_time_resolution behaviour to instead check that the SystemC time resolution is already acceptable
2014-12-02 | stats: Bump stats for o3 LSQ changes | Andreas Hansson
2014-12-02 | cpu, o3: Ignored invalidate causing same-address load reordering | Marco Elver
In case the memory subsystem sends a combined response with invalidate (e.g. ReadRespWithInvalidate), we cannot ignore the invalidate part of the response. If we were to ignore the invalidate part, under certain circumstances this effectively leads to reordering of loads to the same address which is not permitted under any memory consistency model implemented in gem5. Consider the case where a later load's address is computed before an earlier load in program order, and is therefore sent to the memory subsystem first. At some point the earlier load's address is computed and in doing so correctly marks the later load as a possibleLoadViolation. In the meantime some other node writes and sends invalidations to all other nodes. The invalidation races with the later load's ReadResp, and arrives before ReadResp and is deferred. Upon receipt of the ReadResp, the response is changed to ReadRespWithInvalidate, and sent to the CPU. If we ignore the invalidate part of the packet, we let the later load read the old value of the address. Eventually the earlier load's ReadResp arrives, but with new data. As there was no invalidate snoop (sunk into the ReadRespWithInvalidate), and if we did not process the invalidate of the ReadRespWithInvalidate, we obtain a load reordering. A similar scenario can be constructed where the earlier load's address is computed after ReadRespWithInvalidate arrives for the younger load. In this case hitExternalSnoop needs to be set to true on the ReadRespWithInvalidate, so that upon knowing the address of the earlier load, checkViolations will cause the later load to be squashed. Finally we must account for the case where both loads are sent to the memory subsystem (reordered), a snoop invalidate arrives and correctly sets the later load's fault to ReExec. However, before the CPU processes the fault, the later load's ReadResp arrives and the writeback discards the outstanding fault.
We must add a check to ensure that we do not skip any unprocessed faults.
2014-12-02 | cpu: Always mask the snoop address when performing lock check | Andreas Hansson
Ensure the snoop address check is always using a cache-block aligned address. This patch updates Alpha and Mips to match the other ISAs.
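As a rough illustration of what "cache-block aligned" means here, the sketch below masks off the low bits of an address. The function name and the 64-byte default block size are assumptions made for the example, not values taken from the patch.

```python
def block_align(addr, block_size=64):
    """Return addr rounded down to the start of its cache block.

    block_size must be a power of two so that the low bits can be
    cleared with a single mask.
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a positive power of two")
    return addr & ~(block_size - 1)


# Two addresses in the same 64-byte block compare equal once masked,
# which is what a lock check against a snooped address needs.
same_block = block_align(0x1234) == block_align(0x1201)
```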
2014-12-02 | cpu: Move packet deallocation to recvTimingResp in the O3 CPU | Stephan Diestelhorst
Move the packet deallocations in the O3 CPU so that the completeDataAccess deals only with the LSQ specific parts and the generic recvTimingResp frees the packet in all other cases.
2014-12-02 | mem: Relax packet src/dest check and shift onus to crossbar | Andreas Hansson
This patch allows objects to get the src/dest of a packet even if it is not set to a valid port id. This simplifies (ab)using the bridge as a buffer and latency adapter in situations where the neighbouring MemObjects are not crossbars. The checks that were done in the packet are now shifted to the crossbar where the fields are used to index into the port arrays. Thus, the carrier of the information is not burdened with checking, and the crossbar can check not only that the destination is set, but also that the port index is within limits.
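The division of responsibility described above might be sketched as follows. This is a simplified model, not gem5 code: the `Packet` and `Crossbar` classes, the `INVALID_PORT_ID` sentinel, and the `forward` method are all hypothetical names chosen for the example.

```python
INVALID_PORT_ID = -1  # hypothetical sentinel for "destination not set"


class Packet:
    """Carrier of the information: reading dest never raises."""

    def __init__(self, dest=INVALID_PORT_ID):
        self.dest = dest


class Crossbar:
    """User of the information: validates dest before indexing."""

    def __init__(self, num_ports):
        self.ports = [[] for _ in range(num_ports)]

    def forward(self, pkt):
        # Check that the destination is set at all...
        if pkt.dest == INVALID_PORT_ID:
            raise ValueError("packet destination is not set")
        # ...and that it is a valid index into the port array.
        if not 0 <= pkt.dest < len(self.ports):
            raise IndexError("packet destination out of range")
        self.ports[pkt.dest].append(pkt)


xbar = Crossbar(num_ports=2)
xbar.forward(Packet(dest=1))
```

A component that only buffers packets, such as a bridge, can then read `pkt.dest` freely without tripping a check it has no use for.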
2014-12-02 | mem: Clean up packet data allocation | Andreas Hansson
This patch attempts to make the rules for data allocation in the packet explicit, understandable, and easy to verify. The constructor that copies a packet is extended with an additional flag "alloc_data" to enable the call site to explicitly say whether the newly created packet is short-lived (a zero-time snoop), or has an unknown life-time and therefore should allocate its own data (or copy a static pointer in the case of static data). The tricky case is the static data. In essence this is a copy-avoidance scheme where the original source of the request (DMA, CPU etc) does not ask the memory system to return data as part of the packet, but instead provides a pointer, and then the memory system carries this pointer around, and copies the appropriate data to the location itself. Thus any derived packet actually never copies any data. As the original source does not copy any data from the response packet when arriving back at the source, we must maintain the copy of the original pointer to not break the system. We might want to revisit this one day and pay the price for a few extra memcpy invocations. All in all this patch should make it easier to grok what is going on in the memory system and how data is actually copied (or not).
2014-12-02 | mem: Cleanup Packet::checkFunctional and hasData usage | Andreas Hansson
This patch cleans up the use of hasData and checkFunctional in the packet. The hasData function is unfortunately suggesting that it checks if the packet has a valid data pointer, when it does in fact only check if the specific packet type is specified to have a data payload. The confusion led to a bug in checkFunctional. The latter function is also tidied up to avoid name overloading.
2014-12-02 | mem: Make the requests carried by packets const | Andreas Hansson
This adds a basic level of sanity checking to the packet by ensuring that a request is not modified once the packet is created. The only issue that had to be worked around is the relaying of software-prefetches in the cache. The specific situation is now solved by first copying the request, and then creating a new packet accordingly.
2014-12-02 | mem: Make Request getters const | Andreas Hansson
This patch tidies up the Request class, making all getters const. The odd one out is incAccessDepth which is called by the memory system as packets carry the request around. This is also const to enable the packet to hold on to a const Request.
2014-12-02 | mem: Add checks and explanation for assertMemInhibit usage | Andreas Hansson
2014-12-02 | mem: Assume all dynamic packet data is array allocated | Andreas Hansson
This patch simplifies how we deal with dynamically allocated data in the packet, always assuming that it is array allocated, and hence should be array deallocated (delete[] as opposed to delete). The only uses of dataDynamic were in the Ruby testers. The ARRAY_DATA flag in the packet is removed accordingly. No defragmentation of the flags is done at this point, leaving a gap in the bit masks. As the last part of the patch, it renames dataDynamicArray to dataDynamic.
2014-12-02 | mem: Remove redundant Packet::allocate calls | Andreas Hansson
This patch cleans up the packet memory allocation confusion. The data is always allocated at the requesting side, when a packet is created (or copied), and there is never a need for any device to allocate any space if it is merely responding to a packet. This behaviour is in line with how SystemC and TLM work as well, thus increasing interoperability, and matching established conventions. The redundant calls to Packet::allocate are removed, and the checks in the function are tightened up to make sure data is only ever allocated once. There are still some oddities in the packet copy constructor where we copy the data pointer if it is static (without ownership), and allocate new space if the data is dynamic (with ownership). The latter is being worked on further in a follow-on patch.
2014-12-02 | mem: Use const pointers for port proxy write functions | Andreas Hansson
This patch changes the various write functions in the port proxies to use const pointers for all sources (similar to how memcpy works). The one unfortunate aspect is the need for a const_cast in the packet, to avoid having to juggle a const and a non-const data pointer. This design decision can always be re-evaluated at a later stage.
2014-12-02 | mem: Add const getters for write packet data | Andreas Hansson
This patch takes a first step in tightening up how we use the data pointer in write packets. A const getter is added for the pointer itself (getConstPtr), and a number of member functions are also made const accordingly. In a range of places throughout the memory system the new member is used. The patch also removes the unused isReadWrite function.
2014-12-02 | mem: Remove null-check bypassing in Packet::getPtr | Andreas Hansson
This patch removes the parameter that enables bypassing the null check in the Packet::getPtr method. A number of call sites assume the value to be non-null. The one odd case is the RubyTester, which issues zero-sized prefetches(!), and despite being reads they had no valid data pointer. This is now fixed, but the size oddity remains (unless anyone objects or has any good suggestions). Finally, in the Ruby Sequencer, appropriate checks are made for flush packets as they have no valid data pointer.
2014-12-02 | mem: Add a GDDR5 DRAM config | Omar Naji
This patch adds a first cut GDDR5 config to accommodate the users combining gem5 and GPUSim. The config is based on a SK Hynix datasheet, and the Nvidia GTX580 specification. Someone from the GPUSim user-camp should tweak the default page-policy and static frontend and backend latencies.
2014-11-24 | stats: Bump stats after static analysis fixes | Andreas Hansson
Fixing up the uninitialised values changes two of the x86 Linux boot regressions slightly.
2014-11-24 | misc: Another round of static analysis fixups | Andreas Hansson
Mostly addressing uninitialised members.
2014-11-23 | mem: Page Table map api modification | Alexandru Dutu
This patch adds uncacheable/cacheable and read-only/read-write attributes to the map method of PageTableBase. It also modifies the constructor of TlbEntry structs for all architectures to consider the new attributes.
2014-11-23 | mem: Multi Level Page Table bug fix | Alexandru Dutu
The multi level page table was giving false positives for already mapped translations. This patch fixes the bogus behavior.
2014-11-23 | mem: Page Table long lines | Alexandru Dutu
Trimmed down all the lines greater than 78 characters.
2014-11-23 | config, kvm: Enabling KvmCPU in SE mode | Alexandru Dutu
This patch modifies se.py so that it can now use the KVM CPU model.
2014-11-23 | x86: Segment initialization to support KvmCPU in SE | Alexandru Dutu
This patch sets up low and high privilege code and data segments and places them in the following order: cs low, ds low, ds, cs, in the GDT. Additionally, a syscall and page fault handler for KvmCPU in SE mode are defined. The order of the segment selectors in GDT is required in this manner for interrupt handling to work properly. Segment initialization is done for all the thread contexts.
2014-11-23 | kvm, x86: Adding support for SE mode execution | Alexandru Dutu
This patch adds methods to the KvmCPU model to handle KVM exits caused by syscall instructions and page faults. These types of exits will be encountered if KvmCPU is run in SE mode.
2014-11-23 | cpuid, x86: Enabling more features in CPUid | Alexandru Dutu
Adding more features in the CPUid with the purpose of supporting running the KvmCPU in SE mode.
2014-11-23 | Backed out prior changeset f9fb64a72259 | Steve Reinhardt
Back out use of importlib to avoid implicitly creating a dependency on Python 2.7.
2014-11-23 | config: ruby: Get rid of an "eval" and an "exec" operating on generated code. | Gabe Black
We can get the same result using importlib.
2014-11-21 | x86: Update stats for the new Linux delay port. | Gabe Black
2014-11-21 | x86: pc: Put a stub IO device at port 0xed which the kernel can use for delays. | Gabe Black
There was already a stub device at 0x80, the port traditionally used for an IO delay. 0x80 is also the port used for POST codes sent by firmware, and that may have prompted adding this port as a second option.
2014-11-18 | configs: small fix to ruby portion of fs.py and se.py | Nilay Vaish
In fs.py the io port controller was being attached to the iobus multiple times. This should be done only once. In se.py, the option use_map, which no longer exists, was being set.
2014-11-18 | dev: Use fixed size member variables to describe fixed size PL111 registers. | Gabe Black
2014-11-17 | vnc: Add a conversion function for bgr888. | Gabe Black
2014-11-17 | x86: Fix setting segment bases in real mode. | Gabe Black
The data size used for actually writing the base value for the segment was the default size, but really it should set the entire value without any possible truncation.
2014-11-17 | x86: Fix some bugs in the real mode far jmp instruction. | Gabe Black
The far pointer should be shifted right to get the selector value, not left. Also, when calculating the width of the offset, the wrong register was used in one spot.
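To see why the shift direction matters, here is a small sketch that assumes the far pointer is held as a single integer with the offset in the low bits and the 16-bit selector directly above it. The function names and the 16-bit default offset width are assumptions for the example; the patch's microcode is not reproduced here.

```python
def split_far_pointer(far_ptr, offset_bits=16):
    """Split a packed far pointer into (selector, offset).

    The selector sits above the offset, so it is recovered by
    shifting right by the offset width.
    """
    offset = far_ptr & ((1 << offset_bits) - 1)
    selector = (far_ptr >> offset_bits) & 0xFFFF
    return selector, offset


def real_mode_linear(selector, offset):
    """Real-mode address translation: segment base is selector * 16."""
    return (selector << 4) + offset


selector, offset = split_far_pointer(0xF000FFF0)
linear = real_mode_linear(selector, offset)
```

Shifting left instead would push the selector bits out of the value rather than bringing them down into the low 16 bits.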
2014-11-17 | x86: APIC: Only set deliveryStatus if our IPI is going somewhere. | Gabe Black
Otherwise the IPI which isn't sent will never arrive, and the deliveryStatus bit will never be cleared.
2014-11-17 | x86: APIC: Fix the getRegArrayBit function. | Gabe Black
The getRegArrayBit function extracts a bit from a series of registers which are treated as a single large bit array. A previous change had modified the logic which figured out which bit to extract from ">> 5" to "% 5" which seems wrong, especially when other, similar functions were changed to use "% 32".
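The indexing arithmetic can be sketched as below, assuming 32-bit registers stored contiguously in a list. The function name mirrors the one in the commit, but the signature and register layout are assumptions for illustration only.

```python
REG_BITS = 32  # assumed register width


def get_reg_array_bit(regs, index):
    """Return bit `index` of a bit array spread over 32-bit registers."""
    reg = regs[index >> 5]                 # which register: index // 32
    return (reg >> (index % REG_BITS)) & 1  # which bit inside it


# Bit 35 lives in register 1 (35 >> 5 == 1) at position 3 (35 % 32 == 3).
regs = [0x0, 0x8]
bit_35 = get_reg_array_bit(regs, 35)

# Using "% 5" to pick the register would select register 0 here.
wrong_register = 35 % 5
```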
2014-11-17 | x86: Update the stats for the x86 FS o3 boot test. | Gabe Black
2014-11-16 | x86: Fix the CPUID Long Mode Address Size function. | Gabe Black
The value in EAX has an 8 bit field for the linear address size and one for the physical address size when calling that function. A recent change implemented it but returned 0xff for both of those fields. That implies that linear and physical addresses are 255 bits wide which is wrong. When using the KVM CPU model this causes an error, presumably because some of those bits are actually reserved, or the CPU or kernel realizes 255 bits is a bad value. This change makes those values 48.
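A short sketch of the EAX layout, assuming the usual arrangement for this CPUID function in which bits 7:0 hold the physical address size and bits 15:8 hold the linear address size. The helper names are hypothetical.

```python
def pack_address_sizes(phys_bits, linear_bits):
    """Pack the two 8-bit address-size fields into an EAX value."""
    if not (0 <= phys_bits <= 0xFF and 0 <= linear_bits <= 0xFF):
        raise ValueError("each field is 8 bits wide")
    return phys_bits | (linear_bits << 8)


def unpack_address_sizes(eax):
    """Return (phys_bits, linear_bits) from an EAX value."""
    return eax & 0xFF, (eax >> 8) & 0xFF


fixed = pack_address_sizes(48, 48)         # both fields set to 48
broken = unpack_address_sizes(0xFFFF)      # 0xff in both fields
print(hex(fixed), broken)
```

With 0xff in both fields the decoded widths come out as 255 bits each, which is the implausible value the commit message describes.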
2014-11-14 | config: Fix checkpoint restore in C++ config example | Andrew Bardsley
This patch fixes the checkpoint restore option in the example of C++ configuration (util/cxx_config). The fix introduces a call to config_manager->startup() (which calls startup on all SimObjects managed by that manager) to replicate the loop of SimObject::startup calls in src/python/m5/simulate.py::simulate guarded by need_startup. As util/cxx_config/main.cc is a C++ analogue of src/python/m5/simulate.py, it should make a similar set of calls.
2014-11-14 | arm: Fixes based on UBSan and static analysis | Andreas Hansson
Another churn to clean up undefined behaviour, mostly ARM, but some parts also touching the generic part of the code base. Most of the fixes simply ensure proper initialisation. One of the more subtle changes is the return type of the sign-extension, which is changed to uint64_t. This is to avoid shifting negative values (undefined behaviour) in the ISA code.
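The arithmetic of a sign-extension that yields an unsigned 64-bit result can be illustrated as follows. Python integers have no undefined behaviour on shifts, so this only models the values involved; the function name and signature are assumptions, not the gem5 helper.

```python
MASK64 = (1 << 64) - 1


def sext_u64(value, bits):
    """Sign-extend the low `bits` of value, returned as unsigned 64-bit."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    # (value ^ sign) - sign is negative when the sign bit was set;
    # masking wraps it into the unsigned 64-bit range.
    return ((value ^ sign) - sign) & MASK64


extended = sext_u64(0x80, 8)
# Because the result is never negative, shifting it afterwards stays
# within unsigned arithmetic.
shifted = (extended << 4) & MASK64
```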
2014-11-14 | mem: Clarify unit of DRAM controller buffer size | Andreas Hansson