summaryrefslogtreecommitdiff
path: root/src/cpu/o3
AgeCommit message (Collapse)Author
2012-02-10O3 CPU: Strengthen condition for handling interruptsNilay Vaish
The condition for handling interrupts is to check whether or not the cpu's instruction list is empty. As observed, this can lead to cases in which even though the instruction list is empty, interrupts are handled when they should not be. The condition is being strengthened so that interrupts get handled only when the last committed microop did not had IsDelayedCommit set.
2012-02-10O3 CPU: Provide the squashing instructionNilay Vaish
This patch adds a function to the ROB that will get the squashing instruction from the ROB's list of instructions. This squashing instruction is used for figuring out the macroop from which the fetch stage should fetch the microops. Further, a check has been added that if the instructions are to be fetched from the cache maintained by the fetch stage, then the data in the cache should be valid and the PC of the thread being fetched from is same as the address of the cache block.
2012-02-10O3 Fetch: Check if PC is pointing to Microcode ROMNilay Vaish
2012-02-07Faults: Turn off arch/faults.hhGabe Black
Because there are no longer architecture independent but specialized functions in arch/XXX/faults.hh, code that isn't using the faults from a particular ISA no longer needs to be able to include them through the switching header file arch/faults.hh. By removing that header file (arch/faults.hh), the potential interface between ISA code and non ISA code is narrowed.
2012-01-31Merge with head, hopefully the last time for this batch.Gabe Black
2012-01-31clang: Enable compiling gem5 using clang 2.9 and 3.0Koan-Sin Tan
This patch adds the necessary flags to the SConstruct and SConscript files for compiling using clang 2.9 and later (on Ubuntu et al and OSX XCode 4.2), and also cleans up a bunch of compiler warnings found by clang. Most of the warnings are related to hidden virtual functions, comparisons with unsigneds >= 0, and if-statements with empty bodies. A number of mismatches between struct and class are also fixed. clang 2.8 is not working as it has problems with class names that occur in multiple namespaces (e.g. Statistics in kernel_stats.hh). clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which causes confusion between the container std::set and the function Packet::set, and this is currently addressed by not including the entire namespace std, but rather selecting e.g. "using std::vector" in the appropriate places.
2012-01-31CheckerCPU: Re-factor CheckerCPU to be compatible with current gem5Geoffrey Blake
Brings the CheckerCPU back to life to allow FS and SE checking of the O3CPU. These changes have only been tested with the ARM ISA. Other ISAs potentially require modification.
2012-01-30Merge with main repository.Gabe Black
2012-01-30MEM: Clean-up of Functional/Virtual/TranslatingPort remnantsAndreas Hansson
This patch cleans up forward declarations and a member-function prototype that still referred to the old FunctionalPort, VirtualPort and TranslatingPort. There is no change in functionality.
2012-01-29Yet another merge with the main repository.Gabe Black
--HG-- rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/config.ini => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/config.ini rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/simout => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/simout rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/stats.txt => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/stats.txt rename : tests/long/10.linux-boot/ref/x86/linux/pc-o3-timing/system.pc.com_1.terminal => tests/long/fs/10.linux-boot/ref/x86/linux/pc-o3-timing/system.pc.com_1.terminal rename : tests/long/00.gzip/ref/x86/linux/o3-timing/config.ini => tests/long/se/00.gzip/ref/x86/linux/o3-timing/config.ini rename : tests/long/00.gzip/ref/x86/linux/o3-timing/simout => tests/long/se/00.gzip/ref/x86/linux/o3-timing/simout rename : tests/long/00.gzip/ref/x86/linux/o3-timing/stats.txt => tests/long/se/00.gzip/ref/x86/linux/o3-timing/stats.txt rename : tests/long/10.mcf/ref/x86/linux/o3-timing/config.ini => tests/long/se/10.mcf/ref/x86/linux/o3-timing/config.ini rename : tests/long/10.mcf/ref/x86/linux/o3-timing/simout => tests/long/se/10.mcf/ref/x86/linux/o3-timing/simout rename : tests/long/10.mcf/ref/x86/linux/o3-timing/stats.txt => tests/long/se/10.mcf/ref/x86/linux/o3-timing/stats.txt rename : tests/long/20.parser/ref/x86/linux/o3-timing/config.ini => tests/long/se/20.parser/ref/x86/linux/o3-timing/config.ini rename : tests/long/20.parser/ref/x86/linux/o3-timing/simout => tests/long/se/20.parser/ref/x86/linux/o3-timing/simout rename : tests/long/20.parser/ref/x86/linux/o3-timing/stats.txt => tests/long/se/20.parser/ref/x86/linux/o3-timing/stats.txt rename : tests/long/70.twolf/ref/x86/linux/o3-timing/config.ini => tests/long/se/70.twolf/ref/x86/linux/o3-timing/config.ini rename : tests/long/70.twolf/ref/x86/linux/o3-timing/simout => tests/long/se/70.twolf/ref/x86/linux/o3-timing/simout rename : tests/long/70.twolf/ref/x86/linux/o3-timing/stats.txt => tests/long/se/70.twolf/ref/x86/linux/o3-timing/stats.txt rename : tests/quick/00.hello/ref/x86/linux/o3-timing/config.ini => tests/quick/se/00.hello/ref/x86/linux/o3-timing/config.ini rename : tests/quick/00.hello/ref/x86/linux/o3-timing/simout => tests/quick/se/00.hello/ref/x86/linux/o3-timing/simout rename : tests/quick/00.hello/ref/x86/linux/o3-timing/stats.txt => tests/quick/se/00.hello/ref/x86/linux/o3-timing/stats.txt
2012-01-29Implement Ali's review feedback.Gabe Black
Try to decrease indentation, and remove some redundant FullSystem checks.
2012-01-28O3 CPU LSQ: Implement TSONilay Vaish
This patch makes O3's LSQ maintain total order between stores. Essentially only the store at the head of the store buffer is allowed to be in flight. Only after that store completes, the next store is issued to the memory system. By default, the x86 architecture will have TSO.
2012-01-28Merge with the main repo.Gabe Black
--HG-- rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
2012-01-16Merge yet again with the main repository.Gabe Black
2012-01-17MEM: Separate queries for snooping and address rangesAndreas Hansson
This patch simplifies the address-range determination mechanism and also unifies the naming across ports and devices. It further splits the queries for determining if a port is snooping and what address ranges it responds to (aiming towards a separation of cache-maintenance ports and pure memory-mapped ports). Default behaviours are such that most ports do not have to define isSnooping, and master ports need not implement getAddrRanges.
2012-01-17CPU: Moving towards a more general port across CPU modelsAndreas Hansson
This patch performs minimal changes to move the instruction and data ports from specialised subclasses to the base CPU (to the largest degree possible). Ultimately it servers to make the CPU(s) have a well-defined interface to the memory sub-system.
2012-01-17MEM: Add port proxies instead of non-structural portsAndreas Hansson
Port proxies are used to replace non-structural ports, and thus enable all ports in the system to correspond to a structural entity. This has the advantage of accessing memory through the normal memory subsystem and thus allowing any constellation of distributed memories, address maps, etc. Most accesses are done through the "system port" that is used for loading binaries, debugging etc. For the entities that belong to the CPU, e.g. threads and thread contexts, they wrap the CPU data port in a port proxy. The following replacements are made: FunctionalPort > PortProxy TranslatingPort > SETranslatingPortProxy VirtualPort > FSTranslatingPortProxy --HG-- rename : src/mem/vport.cc => src/mem/fs_translating_port_proxy.cc rename : src/mem/vport.hh => src/mem/fs_translating_port_proxy.hh rename : src/mem/translating_port.cc => src/mem/se_translating_port_proxy.cc rename : src/mem/translating_port.hh => src/mem/se_translating_port_proxy.hh
2012-01-10DPRINTF: Improve some dprintf messages.Nilay Vaish
2012-01-09O3: Remove some asserts that no longer seem to be valid.Ali Saidi
2012-01-09O3: Add support of function tracing with O3 CPU.Ali Saidi
2012-01-07Another merge with the main repository.Gabe Black
2012-01-07Merge with the main repository again.Gabe Black
2012-01-07Merge with main repository.Gabe Black
2011-12-13gcc: fix unused variable warnings from GCC 4.6.1Nathan Binkert
--HG-- extra : rebase_source : f9e22de341493a25ac6106c16ac35c61c128a080
2011-12-01O3: Remove hardcoded tgts_per_mshr in O3CPU.py.Chander Sudanthi
There are two lines in O3CPU.py that set the dcache and icache tgts_per_mshr to 20, ignoring any pre-configured value of tgts_per_mshr. This patch removes these hardcoded lines from O3CPU.py and sets the default L1 cache mshr targets to 20. --HG-- extra : rebase_source : 6f92d950e90496a3102967442814e97dc84db08b
2011-12-01O3: Add stat that counts how many cycles the O3 cpu was quiesced.Ali Saidi
--HG-- extra : rebase_source : 043b9307eef3c5b87f8e6370765641e016ed1fa7
2011-11-18SE/FS: Get rid of includes of config/full_system.hh.Gabe Black
2011-11-18SE/FS: Get rid of FULL_SYSTEM in the CPU directory.Gabe Black
2011-11-01SE/FS: Expose the same methods on the CPUs in SE and FS modes.Gabe Black
2011-10-31SE/FS: Make the functions available from the TC consistent between SE and FS.Gabe Black
2011-10-31GCC: Get everything working with gcc 4.6.1.Gabe Black
And by "everything" I mean all the quick regressions.
2011-10-30SE/FS: Make getProcessPtr available in both modes, and get rid of FULL_SYSTEMs.Gabe Black
2011-10-30SE/FS: Build the base process class in FS.Gabe Black
2011-10-16SE/FS: Include getMemPort in FS.Gabe Black
2011-10-16SE/FS: Build/expose vport in SE mode.Gabe Black
2011-10-16CPU: Make physPort and getPhysPort available in SE mode.Gabe Black
2011-09-27O3: Tidy up some DPRINTFs in the LSQ.Gabe Black
2011-09-27Faults: Replace calls to genMachineCheckFault with M5PanicFault.Gabe Black
2011-09-26LSQ: Moved a couple of lines to enable O3 + RubyNilay Vaish
This patch makes O3 CPU work along with the Ruby memory model. Ruby overwrites the senderState pointer with another pointer. The pointer is restored only when Ruby gets done with the packet. LSQ makes use of senderState just after sendTiming() returns. But the dynamic_cast returns a NULL pointer since Ruby's senderState pointer is from a different class. Storing the senderState pointer before calling sendTiming() does away with the problem.
2011-09-22event: minor cleanupSteve Reinhardt
Initialize flags via the Event constructor instead of calling setFlags() in the body of the derived class's constructor. I forget exactly why, but this made life easier when implementing multi-queue support. Also rename Event::getFlags() to isFlagSet() to better match common usage, and get rid of some unused Event methods.
2011-09-19Syscall: Make the syscall function available in both SE and FS modes.Gabe Black
In FS mode the syscall function will panic, but the interface will be consistent and code which calls syscall can be compiled in. This will allow, for instance, instructions that use syscall to be built unconditionally but then not returned by the decoder.
2011-09-13LSQ: Only trigger a memory violation with a load/load if the value changes.Ali Saidi
Only create a memory ordering violation when the value could have changed between two subsequent loads, instead of just when loads go out-of-order to the same address. While not very common in the case of Alpha, with an architecture with a hardware table walker this can happen reasonably frequently beacuse a translation will miss and start a table walk and before the CPU re-schedules the faulting instruction another one will pass it to the same address (or cache block depending on the dendency checking). This patch has been tested with a couple of self-checking hand crafted programs to stress ordering between two cores. The performance improvement on SPEC benchmarks can be substantial (2-10%).
2011-09-09Decode: Pull instruction decoding out of the StaticInst class into its own.Gabe Black
This change pulls the instruction decoding machinery (including caches) out of the StaticInst class and puts it into its own class. This has a few intrinsic benefits. First, the StaticInst code, which has gotten to be quite large, gets simpler. Second, the code that handles decode caching is now separated out into its own component and can be looked at in isolation, making it easier to understand. I took the opportunity to restructure the code a bit which will hopefully also help. Beyond that, this change also lays some ground work for each ISA to have its own, potentially stateful decode object. We'd be able to include less contextualizing information in the ExtMachInst objects since that context would be applied at the decoder. Also, the decoder could "know" ahead of time that all the instructions it's going to see are going to be, for instance, 64 bit mode, and it will have one less thing to check when it decodes them. Because the decode caching mechanism has been separated out, it's now possible to have multiple caches which correspond to different types of decoding context. Having one cache for each element of the cross product of different configurations may become prohibitive, so it may be desirable to clear out the cache when relatively static state changes and not to have one for each setting. Because the decode function is no longer universally accessible as a static member of the StaticInst class, a new function was added to the ThreadContexts that returns the applicable decode object.
2011-08-19LSQ: Set store predictor to periodically clear itself as recommended in the ↵Ali Saidi
storesets paper. This patch improves performance by as much as 10% on some spec benchmarks.
2011-08-19Fix bugs due to interaction between SEV instructions and O3 pipelineGeoffrey Blake
SEV instructions were originally implemented to cause asynchronous squashes via the generateTCSquash() function in the O3 pipeline when updating the SEV_MAILBOX miscReg. This caused race conditions between CPUs in an MP system that would lead to a pipeline either going inactive indefinitely or not being able to commit squashed instructions. Fixed SEV instructions to behave like interrupts and cause synchronous sqaushes inside the pipeline, eliminating the race conditions. Also fixed up the semantics of the WFE instruction to behave as documented in the ARMv7 ISA description to not sleep if SEV_MAILBOX=1 or unmasked interrupts are pending.
2011-08-19LSQ: Add some better dprintfs for storeset predictor.Mrinmoy Ghosh
2011-08-19LSQ: Fix a few issues with the storeset predictor.Mrinmoy Ghosh
Two issues are fixed in this patch: 1. The load and store pc passed to the predictor are passed in reverse order. 2. The flag indicating that a barrier is inflight was never cleared when the barrier was squashed instead of committed. This made all load insts dependent on a non-existent barrier in-flight.
2011-08-19O3: Squash the violator and younger instructions instead not all insts.Giacomo Gabrielli
Change the way instructions are squashed on memory ordering violations to squash the violator and younger instructions, not all instructions that are younger than the instruction they violated (no reason to throw away valid work).
2011-08-16O3: Make lsq_unit.hh include arch/isa_traits.hh directly, not transitively.Gabe Black
2011-08-14O3: When squashing, restore the macroop that should be used for fetching.Gabe Black