gem5 - gem5

Age	Commit message	Author
2016-11-09	syscall_emul: [patch 7/22] remove numCpus method	Brandon Potter
	The numCpus method is misleading in that it's not really a measure of how many CPUs might be executing a process, but how many thread contexts are assigned to the process at any given point in time. It's nice to highlight this distinction because thread contexts are never reused in the same way that a CPU can be reused for multiple processes. The reason that there is no reuse is that there is no CPU scheduler for SE. The tru64 code intends to use this method and the accompanying contextIDs field to support SMT and track the number of threads with some system calls. With the upcoming clone and exec patches, this paradigm must change. There needs to be a 1:1 mapping between the thread contexts and processes so that the process state between threads is allowed to vary when needed by Linux. This should not break SMT for tru64 if the Process class is refactored so that multiple Processes can share state between themselves. The following patches will do the refactoring incrementally as features are added.
2016-11-09	syscall_emul: [patch 6/22] remove unused fields from Process class	Brandon Potter
	It looks like tru64 has some nxm* system calls, but the two fields that are defined in the Process class are unused by any of the code. There doesn't appear to be any reference in the tru64 code.
2016-11-09	syscall_emul: [patch 5/22] remove LiveProcess class and use Process instead	Brandon Potter
	The EIOProcess class was removed recently and it was the only other class which derived from Process. Since every Process invocation is also a LiveProcess invocation, it makes sense to simplify the organization by combining the fields from LiveProcess into Process.
2017-02-17	sparc: fix bugs caused by cd7f3a1dbf55	Brandon Potter
	Turns out that SPARC SE mode relied on M5_pid being "0" in all cases. The entries in the SPARC TLBs are accessed with M5_pid as their context. This is buggy in the sense that it will never work with more than one process or any initialization that doesn't have the M5_pid value passed in as "0". cd7f3a1dbf55 broke the SPARC build because it deletes M5_pid and uses a _pid with a default of "100" instead. This caused the SPARC TLB to never return any valid lookups for any request; the program never moved past the first instruction with SPARC SE in the regression tester. The solution proposed in this changeset is to initialize the address space identification register with the PID value that is passed into the process class as a parameter from Python. This should return the correct responses from the TLB since the insertions and lookups into the page table will be using the same PID. Furthermore, there are corner cases in the code which elevate privileges and revert to using context "0" as the context in the TLB. I believe that these are related to kernel level traps and hypervisor privilege escalations, but I'm not completely sure. I've tried to address the corner cases properly, but it would be beneficial to have someone who is familiar with the SPARC architecture to take a look at this fix.
2017-02-17	sim: fix out-of-bounds error in syscall_desc	Brandon Potter

2017-02-15	mem, stats: fix typos in CommMonitor and Stats	Pierre-Yves Péneau
	Signed-off-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> Reviewed-by: Tony Gutierrez <anthony.gutierrez@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed at http://reviews.gem5.org/r/3802/
2017-02-15	mem, misc: fix building issue with CommMonitor (unused variables)	Pierre-Yves Péneau
	Signed-off-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr> Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Signed-off-by: Jason Lowe-Power <jason@lowepower.com> Reviewed at http://reviews.gem5.org/r/3801/
2017-02-15	mem: fix assertion in respondEvent	Wendy Elsasser
	Assertion in the respondEvent erroneously fired. The assertion verifies that the controller has not moved to a low-power state prior to receiving read data from the memory. The original assertion triggered if the state was not PWR_IDLE or PWR_ACT. In the case that failed, a periodic refresh event occurred around the read. The REF is stalled until the final read burst is issued and the subsequent PRE closes the bank. While the PRE will temporarily move the state to PWR_IDLE, the state will immediately transition to PWR_REF due to the pending refresh operation. This state does not match the assertion, which is subsequently triggered. Fixed the assertion by explicitly checking that the state is not a low power state: !PWR_SREF && !PWR_PRE_PDN && !PWR_ACT_PDN. Change-Id: I82921a733bbeac2bcb5a487c2f981448d41ed50b Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com>
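The difference between the two checks can be sketched as follows. This is a minimal Python model, not the gem5 C++ code: the state names come from the commit message, while the enum, the set, and the two function names are assumptions made for illustration.

```python
from enum import Enum, auto


class PwrState(Enum):
    # State names as listed in the commit message.
    PWR_IDLE = auto()
    PWR_REF = auto()
    PWR_SREF = auto()
    PWR_PRE_PDN = auto()
    PWR_ACT = auto()
    PWR_ACT_PDN = auto()


# The three low-power states the corrected assertion excludes.
LOW_POWER_STATES = {PwrState.PWR_SREF, PwrState.PWR_PRE_PDN, PwrState.PWR_ACT_PDN}


def old_check(state):
    # Hypothetical model of the original assertion: only IDLE or ACT pass.
    return state in (PwrState.PWR_IDLE, PwrState.PWR_ACT)


def new_check(state):
    # Hypothetical model of the fixed assertion: anything but low power passes.
    return state not in LOW_POWER_STATES
```

With this model, a pending refresh (`PWR_REF`) fails the old check but passes the new one, which matches the failing case described above.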
2017-02-14	arm, kvm: remove KvmGic	Curtis Dunham
	KvmGic functionality has been subsumed within the new MuxingKvmGic model, which has Pl390 fallback when not using KVM for fast emulation. This simplifies configuration and will enable checkpointing between KVM emulation and full-system simulation. Change-Id: Ie61251720064c512843015c075e4ac419a4081e8 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14	arm, kvm: Automatically use the MuxingKvmGic	Andreas Sandberg
	Automatically use the MuxingKvmGic in the VExpress_GEM5_V1 platform. This removes the need to patch the host kernel or the platform configuration when using KVM on ARM. Change-Id: Ib1ed9b3b849b80c449ef1b62b83748f3f54ada26 Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2017-02-14	arm, kvm: implement MuxingKvmGic	Curtis Dunham
	This device allows us to, when KVM support is detected and compiled in, instantiate the same Gic device whether the actual simulation is with KVM cores or simulated cores. Checkpointing is not yet supported. Change-Id: I67e4e0b6fb7ab5058e52c933f4f3d8e7ab24981e Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14	sim, kvm: make KvmVM a System parameter	Curtis Dunham
	A KVM VM is typically a child of the System object already, but for solving future issues with configuration graph resolution, the most logical way to keep track of this object is for it to be an actual parameter of the System object. Change-Id: I965ded22203ff8667db9ca02de0042ff1c772220 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14	sim,kvm,arm: fix typos	Curtis Dunham
	Change-Id: Ifc65d42eebfd109c1c622c82c3c3b3e523819e85 Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
2017-02-14	mem: Update DRAM configuration names	Wendy Elsasser
	Names of DRAM configurations were updated to reflect both the channel and device data width. Previous naming format was: <DEVICE_TYPE>_<DATA_RATE>_<CHANNEL_WIDTH>. The following nomenclature is now used: <DEVICE_TYPE>_<DATA_RATE>_<n>x<w>, where n = the number of devices per rank on the channel and w = device width. Total channel width can be calculated by n*w. Example: A 64-bit DDR4, 2400 channel consisting of 4-bit devices: n = 16, w = 4. The resulting configuration name is: DDR4_2400_16x4. Updated scripts to match new naming convention. Added unique configurations for DDR4 for: 1) 16x4 2) 8x8 3) 4x16. Change-Id: Ibd7f763b7248835c624309143cb9fc29d56a69d1 Reviewed-by: Radhika Jagtap <radhika.jagtap@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
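The naming rule and the channel-width arithmetic can be checked with a short sketch. Both helper functions are hypothetical and exist only to illustrate the convention; they are not part of the gem5 scripts.

```python
def dram_config_name(device_type, data_rate, devices_per_rank, device_width):
    # Builds <DEVICE_TYPE>_<DATA_RATE>_<n>x<w> as described in the commit message.
    return f"{device_type}_{data_rate}_{devices_per_rank}x{device_width}"


def channel_width(devices_per_rank, device_width):
    # Total channel width in bits is n * w.
    return devices_per_rank * device_width
```

For the example in the commit message, `dram_config_name("DDR4", 2400, 16, 4)` gives `DDR4_2400_16x4`, and all three DDR4 variants (16x4, 8x8, 4x16) describe a 64-bit channel.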
2017-02-12	ruby: fix round robin arbiter in garnet2.0	Tushar Krishna
	The rr arbiter pointer in garnet was getting updated on every request, even if there is no grant. This was leading to a huge variance in wait time at a router at high injection rates. This patch corrects it to update upon a grant.
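A minimal sketch of the corrected behaviour, assuming a simple list of per-input request flags. The class, its method, and the `can_grant` parameter (standing in for whatever prevents a grant in the router) are hypothetical and do not mirror the garnet2.0 code.

```python
class RoundRobinArbiter:
    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.pointer = 0  # input that has priority on the next arbitration

    def arbitrate(self, requests, can_grant=True):
        """Return the granted input index, or None if nothing is granted."""
        if not can_grant:
            # No grant this cycle: the pointer must stay where it is.
            return None
        for offset in range(self.num_inputs):
            candidate = (self.pointer + offset) % self.num_inputs
            if requests[candidate]:
                # Advance the pointer only because a grant was made.
                self.pointer = (candidate + 1) % self.num_inputs
                return candidate
        return None
```

Because the pointer moves only on a grant, an input that keeps requesting is not skipped during cycles in which nothing could be granted.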
2017-02-11	mem: fix printing of 1st cache tags line	Bjoern A. Zeeb
	Rather than having the 1st line on the Log line and every other line on its own, add a new line to have a common format for all of them. Makes parsing a lot easier. Reviewed at http://reviews.gem5.org/r/3808/ Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-10	x86: Fix implicit stack addressing in 64-bit mode	Jason Lowe-Power
	When in 64-bit mode, if the stack is accessed implicitly by an instruction the alternate address prefix should be ignored if present. This patch adds an extra flag to the ldstop which signifies when the address override should be ignored. Then, for all of the affected instructions, this patch adds two options to the ld and st opcode to use the current stack addressing mode for all addresses and to ignore the AddressSizeFlagBit. Finally, this patch updates the x86 TLB to not truncate the address if it is in 64-bit mode and the IgnoreAddrSizeFlagBit is set. This fixes a problem when calling __libc_start_main with a binary that is linked with a recent version of ld. This version of ld uses the address override prefix (0x67) on the call instruction instead of a nop. Note: This has not been tested in compatibility mode and only the call instruction with the address override prefix has been tested. See [1] page 9 (pdf page 45) For instructions that are affected see [1] page 519 (pdf page 555). [1] http://support.amd.com/TechDocs/24594.pdf Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-10	misc: Update #!env calls for python to explicit version	Jason Lowe-Power
	In some newer Linux distributions, env python defaults to Python 3. This patch explicitly uses "python2" instead of just "python" for all scripts that use #! Reported-by: Sanchayan Maity <maitysanchayan@gmail.com> Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-10	misc: Add Python.h header to pyevents.hh	Jason Lowe-Power
	Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-09	misc: add a MasterId to the ExternalPort	Christian Menard
	The Request constructor requires a MasterID. However, an external transactor has no chance of getting a MasterID as it does not have a pointer to the System. This patch adds a MasterID to ExternalMaster to allow external modules to easily generate new Requests. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-09	misc: Fix order of object construction in the CxxConfigManager	Christian Menard
	The CxxConfigManager should create objects by traversing the object tree starting from the root object. However, currently objects are created in alphabetical order, which only works if the root object alphabetically comes before any system object (e.g. 'root' < 'system'). Otherwise (e.g. 'a_system' < 'root'), object construction may fail. The reason for this behaviour is that the call to findObject() in the sorting code also constructs the object if it does not yet exist. Then findTraversalOrder() calls findObject("root") and subsequently calls findObject() on all the children, and so on. However, the call to findTraversalOrder() is redundant, since all objects are already created in alphabetical order. This patch simply removes the alphabetical ordering, leading to the objects being created starting from 'root'. Reviewed at http://reviews.gem5.org/r/3778/ Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
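The ordering problem can be reproduced with a toy object tree. This is a sketch only: the `children` dictionary and the `traversal_order` helper are hypothetical and are not the CxxConfigManager API.

```python
def traversal_order(children, root="root"):
    # Depth-first walk that always visits a parent before its children.
    order = []
    stack = [root]
    while stack:
        name = stack.pop()
        order.append(name)
        # Reverse so children are visited in their listed order.
        stack.extend(reversed(children.get(name, [])))
    return order


# Hypothetical configuration where a system object sorts before 'root'.
children = {
    "root": ["a_system"],
    "a_system": ["a_system.cpu", "a_system.mem"],
}

tree_order = traversal_order(children)
alphabetical_order = sorted(tree_order)
```

Here `tree_order` starts with `root`, while `alphabetical_order` starts with `a_system`, which is the case the commit message says may fail.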
2017-02-09	sim: fix build breakage in process.cc after brandon@11801	Bjoern A. Zeeb
	Seeing build breakage after brandon@11801: [ CXX] X86/sim/process.cc -> .o build/X86/sim/process.cc:137:64: error: field '_pid' is uninitialized when used here [-Werror,-Wuninitialized] static_cast<PageTableBase >(new ArchPageTable(name(), _pid, system)) : ^ build/X86/sim/process.cc:138:64: error: field '_pid' is uninitialized when used here [-Werror,-Wuninitialized] static_cast<PageTableBase >(new FuncPageTable(name(), _pid))), ^ 2 errors generated. Testing Done: Compiles now on FreeBSD 10 with clang. Reviewed at http://reviews.gem5.org/r/3804/ Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-09	sim: Patch to fix the statfs build	Bjoern A. Zeeb
	See developers mailing list. Trying to unbreak statfs. Testing Done: Builds on FreeBSD now. Reviewed at http://reviews.gem5.org/r/3803/ Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-09	dev: net/i8254xGBe add two more wakeup registers to ignore	Bjoern A. Zeeb
	There are drivers that write to WUFC unconditionally. In order to not panic gem5 in these cases, ignore writes to WUFC and WUS as we do for WUC. Similarly return 0 (default reset value) on reads. Testing Done: Booted in FS with such a driver revision which would previously panic and now boots fine. Reviewed at http://reviews.gem5.org/r/3791/ Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-09	arm: AArch64 report cache size correctly when reading CTR_EL0	Bjoern A. Zeeb
	Trying to read MISCREG_CTR_EL0 on AArch64 returned 0 as it was not implemented. With that an operating system relying on the cache line sizes reported in order to manage the caches would (a) panic given the returned value 0 is not valid (high bit is RES1) or (b) worst case would assume a cache line size of 4 doing a tremendous amount of extra instruction work (including fetching). Return the same values as for ARMv7 as the fields seem to be the same, or RES0/1 seem to be reported accordingly for AArch64. In collaboration with: Andrew Turner Testing Done: Checked on FreeBSD boots with extra printfs; also observed a reduction of a factor of about 10 in instruction fetches for a simple micro-test. Reviewed at http://reviews.gem5.org/r/3667/ Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-02-07	style: Force Python.h to be included before main header	Andreas Sandberg
	Python's header files set various compiler macros (e.g., _XOPEN_SOURCE) unconditionally. This triggers preprocessor warnings that end up being treated as errors. The Python integration manual [1] strongly recommends that Python.h is included before any system header. The style guide used to mandate that Python.h is included first in any file that needs it. This requirement was changed to always include a source file's main header first, which ended up triggering these errors. This change updates the style checker to always include Python.h before the main header file. [1] https://docs.python.org/2/extending/extending.html Change-Id: Id6a4f7fc64a336a8fd26691a0ca682abeb1d1579 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Pierre-Yves Péneau <pierre-yves.peneau@lirmm.fr>
2017-01-27	proto: Fix warnings for protoc v3	Nikos Nikoleris
	protoc v3 introduces a new syntax for proto files and warns when the syntax is not explicitly stated. protoc relies on the fact that undefined preprocessor symbols are expanded to 0 but since we use -Wundef they end up generating warnings. Change-Id: If07abeb54e932469c8f2c4d38634a97fdae40f77 Reviewed-by: Andreas Hansson <andreas.hansson@arm.com> Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com> Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-27	riscv: Fix crash when syscall argument reg index is too high	Alec Roelke
	By default, doSyscall gets the values of six registers to be used for system call arguments. RISC-V, by convention, only has four. Because RISC-V's implementation of these indices is as arrays of integers rather than as base indices plus offsets, trying to get the fifth argument register's value will cause a crash. This patch fixes that by returning 0 for any index higher than 3. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
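The guard described above can be sketched as a bounds check on an index array. The register numbers and the function name below are assumptions for illustration; only the behaviour (four argument registers, 0 for any index above 3) comes from the commit message.

```python
# Hypothetical mapping from argument position to integer register index.
SYSCALL_ARG_REGS = [10, 11, 12, 13]


def get_syscall_arg(int_regs, i):
    # Indices past the four argument registers return 0 instead of
    # reading outside the array.
    if i >= len(SYSCALL_ARG_REGS):
        return 0
    return int_regs[SYSCALL_ARG_REGS[i]]
```

A caller that always asks for six arguments then receives 0 for the fifth and sixth rather than crashing.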
2017-01-27	mem: Refactor CommMonitor stats, add basic atomic mode stats	Rahul Thakur
	Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-27	mem: Add memory footprint probe	Rahul Thakur
	Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2017-01-27	python: Move native wrappers to the _m5 namespace	Andreas Sandberg
	Swig wrappers for native objects currently share the _m5.internal name space with Python code. This is undesirable if we ever want to switch from Swig to some other framework for native binding (e.g., PyBind11 or Boost::Python). This changeset moves all of such wrappers to the _m5 namespace, which is now reserved for native code. Change-Id: I2d2bc12dbc05b57b7c5a75f072e08124413d77f3 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2016-11-09	syscall_emul: [patch 4/22] remove redundant M5_pid field from process	Brandon Potter

2016-11-09	style: [patch 3/22] reduce include dependencies in some headers	Brandon Potter
	Used cppclean to help identify useless includes and removed them. This involved erroneously included headers, but also cases where forward declarations could have been used rather than a full include.
2017-01-20	syscall_emul: #ifdef new system calls to allow builds on OSX and BSD	Brandon Potter

2017-01-19	ruby: guard usage of GPUCoalescer code in Profiler	Tony Gutierrez
	The GPUCoalescer code is used in the ruby profiler regardless of whether or not the coalescer code has been compiled, which can lead to link/run time errors. Here we add #ifdefs to guard the usage of GPUCoalescer code. Eventually we should refactor this code to use probe points.
2017-01-19	ruby: Check MessageBuffer space in garnet NetworkInterface	Matthew Poremba
	Garnet's NetworkInterface does not consider the size of MessageBuffers when ejecting a Message from the network. Add a size check for the MessageBuffer and only enqueue if space is available. If space is not available, the message is placed in a queue and the credit is held. A callback from the MessageBuffer is implemented to wake the NetworkInterface. If there are messages in the stalled queue, they are processed first, in a FIFO manner, and if successfully ejected, the credit is finally sent back upstream. The maximum size of the stall queue is equal to the number of valid VNETs with MessageBuffers attached.
2017-01-19	ruby: Add occupancy stats to MessageBuffers	Matthew Poremba
	This patch is an updated version of /r/3297. "The most important statistic for measuring memory hierarchy performance is throughput, which is affected by independent variables, buffer sizing and communication latency. It is difficult/impossible to debug performance issues through series buffers without knowing which are the bottlenecks. For finite buffers, this patch adds statistics for the average number of messages in the buffer, the occupancy of the buffer slots, and number of message stalls."
2017-01-19	ruby: Check all VNETs for injection in garnet NetworkInterface	Matthew Poremba
	The NetworkInterface wakeup currently iterates over all VNETs and breaks the loop if a VNET is unable to allocate a VC. This can cause a deadlock if a lower numbered VNET is unable to allocate a VC while a higher numbered VNET has idle VCs. This seems like a bug as Garnet 1.0 uses a while loop over an if-statement, suggesting the break was intended for this while loop. This patch removes the break statement, which allows up to one message to be dequeued from a VNET and injected into the network.
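The effect of removing the break can be seen in a small model of the injection loop. The function and its parameters are hypothetical; `stop_on_failure=True` models the old loop and `stop_on_failure=False` models the patched one.

```python
def inject(vnet_queues, free_vcs, stop_on_failure):
    """Try to inject at most one message per VNET; return what was injected."""
    injected = []
    for vnet, queue in enumerate(vnet_queues):
        if not queue:
            continue
        if free_vcs[vnet] == 0:
            if stop_on_failure:
                # Old behaviour: give up on all higher-numbered VNETs.
                break
            # Patched behaviour: skip this VNET and keep checking the rest.
            continue
        free_vcs[vnet] -= 1
        injected.append((vnet, queue.pop(0)))
    return injected
```

With VNET 0 out of VCs and VNET 1 idle, the old loop injects nothing, while the patched loop still injects the message waiting on VNET 1.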
2016-11-09	syscall_emul: [patch 2/22] move SyscallDesc into its own .hh and .cc	Brandon Potter
	The class was crammed into syscall_emul.hh which has tons of forward declarations and template definitions. To clean it up a bit, moved the class into separate files and commented the class with doxygen style comments. Also, provided some encapsulation by adding some accessors and a mutator. The syscallreturn.hh file was renamed syscall_return.hh to make it consistent with other similarly named files in the src/sim directory. The DPRINTF_SYSCALL macro was moved into its own header file, along with the include for the Base and Verbose flags. --HG-- rename : src/sim/syscallreturn.hh => src/sim/syscall_return.hh
2016-11-09	style: [patch 1/22] use /r/3648/ to reorganize includes	Brandon Potter

2017-01-03	sim: Remove declaration of unused CountedDrainEvent	Andreas Sandberg
	The CountedDrainEvent event was used to keep track of objects that required additional simulation to drain. It was removed as a part of the great drain rewrite, but the declaration remained. Change-Id: I767a3213669040d3f27e2afafa2e4a5bb997e325 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com>
2017-01-03	python: Don't use Swig to cast stats	Andreas Sandberg
	Call the stat visitor from the stat itself rather than casting stats in Python. This reduces the number of ways visitors are called. Change-Id: Ic4d0b7b32e3ab9897b9a34cd22d353f4da62d738 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Sascha Bischoff <sascha.bischoff@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com> Reviewed-by: Joe Gross <joseph.gross@amd.com>
2017-01-03	sim: Remove redundant export_method_cxx_predecls	Andreas Sandberg
	The headers declared in export_method_cxx_predecls are redundant since a SimObject's main header is automatically included. Change-Id: Ied9e84630b36960e54efe91d16f8c66fba7e0da0 Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Reviewed-by: Curtis Dunham <curtis.dunham@arm.com> Reviewed-by: Joe Gross <joseph.gross@amd.com> Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-23	sim: Fix SE mode checkpoint restore file handling	Joel Hestness
	When restoring from a checkpoint, the simulation used to use file handles from the checkpoint. This disallows multiple separate restore simulations from using separate input and output files and directories, and plays havoc when the checkpointed file locations may have changed. Add handling to allow the command line specified files to be used as input/output for the restored simulation (Note: this is similar to the FS mode functionality for output and error).
2016-12-21	cpu: implement an L-TAGE branch predictor	Arthur Perais
	This patch implements an L-TAGE predictor, based on André Seznec's code available from CBP-2 (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21	cpu: disallow speculative update of branch predictor tables (o3)	Arthur Perais
	The Minor and o3 cpu models share the branch prediction code. Minor relies on the BPredUnit::squash() function to update the branch predictor tables on a branch misprediction. This is fine because Minor executes in-order, so the update is on the correct path. However, this causes the branch predictor to be updated on out-of-order branch mispredictions when using the o3 model, which should not be the case. This patch guards against speculative update of the branch prediction tables. On a branch misprediction, BPredUnit::squash() calls BPredUnit::update(..., squashed = true). The underlying branch predictor tests against the value of squashed. If it is true, it restores any speculatively updated internal state it might have (e.g., global/local branch history), then returns. If false, it updates its prediction tables. Previously, existing predictors did not test against the "squashed" parameter. To accommodate this change, the Minor model must now call BPredUnit::squash() then BPredUnit::update(..., squashed = false) on branch mispredictions. Before, calling BPredUnit::squash() performed the prediction tables update. The effect is a slight MPKI improvement when using the o3 model. A further patch should perform the same modifications for the indirect target predictor and BTB (less critical). Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
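The role of the squashed flag can be illustrated with a toy predictor. This is a simplified sketch, not the gem5 BPredUnit interface: the class, its 2-bit counters, the 8-bit history, and the choice to restore the saved history unchanged are all assumptions.

```python
class CounterPredictor:
    def __init__(self, entries=16):
        self.counters = [2] * entries  # 2-bit counters, start weakly taken
        self.history = 0               # global history, updated speculatively

    def predict(self, pc):
        taken = self.counters[pc % len(self.counters)] >= 2
        saved_history = self.history
        # Speculative history update at prediction time.
        self.history = ((self.history << 1) | int(taken)) & 0xFF
        return taken, saved_history

    def update(self, pc, taken, saved_history, squashed):
        if squashed:
            # Misprediction path: restore speculative state, do not train.
            self.history = saved_history
            return
        # Non-speculative path: train the prediction table.
        index = pc % len(self.counters)
        if taken:
            self.counters[index] = min(3, self.counters[index] + 1)
        else:
            self.counters[index] = max(0, self.counters[index] - 1)
```

In this model, calling `update(..., squashed=True)` only rolls the history back, and the counter table changes only on a later call with `squashed=False`.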
2016-12-21	cpu: correct comments in tournament branch predictor	Arthur Perais
	The tournament predictor is presented as doing speculative update of the global history and non-speculative update of the local history used to generate the branch prediction. However, the code does speculative update of both histories. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21	cpu: Resolve targets of predicted 'taken' decode for O3	Arthur Perais
	The target of taken conditional direct branches does not need to be resolved in IEW: the target can be computed at decode, usually using the decoded instruction word and the PC. The higher-than-necessary penalty is taken only on conditional branches that are predicted taken but miss in the BTB. Thus, this is mostly inconsequential on IPC if the BTB is big/associative enough (fewer capacity/conflict misses). Nonetheless, what gem5 simulates is not representative of how conditional branch targets can be handled. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-21	cpu: Clarify meaning of cachePorts variable in lsq_unit.hh of O3	Arthur Perais
	cachePorts currently constrains the number of store packets written to the D-Cache each cycle, but loads currently affect this variable. This leads to unexpected congestion (e.g., setting cachePorts to a realistic 1 will in fact allow a store to WB only if no loads have accessed the D-Cache this cycle). In the absence of arbitration, this patch decouples how many loads can be done per cycle from how many stores can be done per cycle. Signed-off-by: Jason Lowe-Power <jason@lowepower.com>
2016-12-20	ruby: Make MessageBuffers actually finite sized	Joel Hestness
	When Ruby controllers stall messages in MessageBuffers, the buffer moves those messages off the priority heap and into a per-address stall map. When buffers are finite-sized, the test areNSlotsAvailable() only checks the size of the priority heap, but ignores the stall map, so the map is allowed to grow unbounded if the controller stalls numerous messages. This patch fixes the problem by tracking the stall map size and testing the total number of messages in the buffer appropriately.
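A minimal sketch of the corrected capacity test, assuming a plain list in place of the priority heap. The class and method names are hypothetical Python stand-ins; only the idea of counting stalled messages toward the buffer size comes from the commit message.

```python
class FiniteMessageBuffer:
    def __init__(self, max_size):
        self.max_size = max_size
        self.ready = []          # stands in for the priority heap
        self.stall_map = {}      # address -> list of stalled messages
        self.stall_map_size = 0  # tracked total of stalled messages

    def are_n_slots_available(self, n):
        # Count both ready and stalled messages against the capacity.
        return len(self.ready) + self.stall_map_size + n <= self.max_size

    def enqueue(self, addr, msg):
        assert self.are_n_slots_available(1)
        self.ready.append((addr, msg))

    def stall_messages(self, addr):
        # Move every ready message for this address into the stall map.
        moved = [m for m in self.ready if m[0] == addr]
        self.ready = [m for m in self.ready if m[0] != addr]
        self.stall_map.setdefault(addr, []).extend(moved)
        self.stall_map_size += len(moved)
```

After two messages are enqueued into a two-slot buffer and then stalled, the ready list is empty but `are_n_slots_available(1)` still reports no space, so the stall map cannot grow without bound.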