Age | Commit message (Collapse) | Author |
|
This patch adds Link-Time Optimization when building the fast target
using gcc >= 4.6, and adds a scons flag to disable it (-no-lto). No
check is performed to guarantee that the linker supports LTO and use
of the linker plugin, so the user has to ensure that binutils GNU ld
>= 2.21 or the gold linker is available. Typically, if gcc >= 4.6 is
available, the latter should not be a problem. Currently the LTO
option is only useful for gcc >= 4.6, due to the limited support on
clang and earlier versions of gcc. The intention is to also add
support for clang once the LTO integration matures.
The same number of jobs is used for the parallel phase of LTO as the
jobs specified on the scons command line, using the -flto=n flag that
was introduced with gcc 4.6. The gold linker also supports concurrent
and incremental linking, but this is not used at this point.
The compilation and linking time is increased by almost 50% on
average, although ARM seems to be particularly demanding with an
increase of almost 100%. Also beware when using this as gcc uses a
tremendous amount of memory and temp space in the process. You have
been warned.
After some careful consideration, and plenty discussions, the flag is
only added to the fast target, and the warning that was issued in an
earlier version of this patch is now removed. Similarly, the flag used
to enable LTO, now the default is to use it, and the flag has been
modified to disable LTO. The rationale behind this decision is that
opt is used for development, whereas fast is only used for long runs,
e.g. regressions or more elaborate experiments where the additional
compile and link time is amortized by a much larger run time.
When it comes to the return on investment, the regression seems to be
roughly 15% faster with LTO. For a bit more detail, I ran twolf on
ARM.fast, with three repeated runs, and they all finish within 42
minutes (+- 25 seconds) without LTO and 31 minutes (+- 25 seconds)
with LTO, i.e. LTO gives an impressive >25% speed-up for this case.
Without LTO (ARM.fast twolf)
real 42m37.632s
user 42m34.448s
sys 0m0.390s
real 41m51.793s
user 41m50.384s
sys 0m0.131s
real 41m45.491s
user 41m39.791s
sys 0m0.139s
With LTO (ARM.fast twolf)
real 30m33.588s
user 30m5.701s
sys 0m0.141s
real 31m27.791s
user 31m24.674s
sys 0m0.111s
real 31m25.500s
user 31m16.731s
sys 0m0.106s
|
|
This patch adds a new target called 'perf' that facilitates profiling
using google perftools rather than gprof. The perftools CPU profiler
offers plenty useful information in addition to gprof, and the latter
is kept mostly to offer profiling also on non-Linux hosts.
|
|
This patch restructures the ccflags such that the common parts are
defined in a single location, also capturing all the target types in a
single place.
The patch also adds a corresponding ldflags in preparation for
google-perf profiling support and the addition of Link-Time
Optimization.
|
|
This patch disables a warning for unused values which causes problems
when compiling the swig-generated sources using recent llvm-based
compilers like llvm-gcc and clang.
|
|
them.
|
|
|
|
This patch addresses a number of minor issues that cause problems when
compiling with clang >= 3.0 and gcc >= 4.6. Most importantly, it
avoids using the deprecated ext/hash_map and instead uses
unordered_map (and similarly so for the hash_set). To make use of the
new STL containers, g++ and clang has to be invoked with "-std=c++0x",
and this is now added for all gcc versions >= 4.6, and for clang >=
3.0. For gcc >= 4.3 and <= 4.5 and clang <= 3.0 we use the tr1
unordered_map to avoid the deprecation warning.
The addition of c++0x in turn causes a few problems, as the
compiler is more stringent and adds a number of new warnings. Below,
the most important issues are enumerated:
1) the use of namespaces is more strict, e.g. for isnan, and all
headers opening the entire namespace std are now fixed.
2) another other issue caused by the more stringent compiler is the
narrowing of the embedded python, which used to be a char array,
and is now unsigned char since there were values larger than 128.
3) a particularly odd issue that arose with the new c++0x behaviour is
found in range.hh, where the operator< causes gcc to complain about
the template type parsing (the "<" is interpreted as the beginning
of a template argument), and the problem seems to be related to the
begin/end members introduced for the range-type iteration, which is
a new feature in c++11.
As a minor update, this patch also fixes the build flags for the clang
debug target that used to be shared with gcc and incorrectly use
"-ggdb".
|
|
Partial backout of cset 8b223e308b08.
Although it's great that there's currently no need
for Werror=false in the current tree, some of us
have uncommitted code that still needs this option.
|
|
Unit tests shouldn't build in gem5's main function because they have thier
own.
|
|
This patch removes the overriding of "-Werror" in a handful of
cases. The code compiles with gcc 4.6.3 and clang 3.0 without any
warnings, and thus without any errors. There are no functional changes
introduced by this patch. In the future, rather than ypassing
"-Werror", address the warnings.
|
|
1. --implicit-cache behavior is default.
2. makeEnv in src/SConscript is conditionally called.
3. decider set to MD5-timestamp
4. NO_HTML build option changed to SLICC_HTML (defaults to False)
|
|
This patch adds the necessary flags to the SConstruct and SConscript
files for compiling using clang 2.9 and later (on Ubuntu et al and OSX
XCode 4.2), and also cleans up a bunch of compiler warnings found by
clang. Most of the warnings are related to hidden virtual functions,
comparisons with unsigneds >= 0, and if-statements with empty
bodies. A number of mismatches between struct and class are also
fixed. clang 2.8 is not working as it has problems with class names
that occur in multiple namespaces (e.g. Statistics in
kernel_stats.hh).
clang has a bug (http://llvm.org/bugs/show_bug.cgi?id=7247) which
causes confusion between the container std::set and the function
Packet::set, and this is currently addressed by not including the
entire namespace std, but rather selecting e.g. "using std::vector" in
the appropriate places.
|
|
To make gem5 compile and run with swig 2.0.4 a few minor fixes are
necessary, the fail label issues by swig must not be treated as an
error by gcc (tested with gcc 4.2.1), and the vector wrappers must
have SWIGPY_SLICE_ARG defined which happens in pycontainer.swg,
included through std_container.i. By adding the aforementioned include
to the vector wrappers everything seems to work.
|
|
|
|
And by "everything" I mean all the quick regressions.
|
|
- Move the random bits of SWIG code generation out of src/SConscript
file and into methods on the objects being wrapped.
- Cleaned up some variable naming and added some comments to make
the process a little clearer.
- Did a little generated file/module renaming:
- vptype_Foo now Foo_vector
- init_Foo is now Foo_init
This makes it easier to see all the Foo-related files in a
sorted directory listing.
- Made cxx_predecls and swig_predecls normal SimObject classmethods.
- Got rid of swig_objdecls hook, even though this breaks the System
objects get/setMemoryMode method exports. Will be fixing this in
a future changeset.
|
|
|
|
|
|
The default generated binary is now gem5.<type> instead of m5.<type>.
The latter does still work but gem5.<type> will be generated first and
then m5.<type> will be hard linked to it.
|
|
The end of the COPYING file was generated with:
% python ./util/find_copyrights.py configs src system tests util
Update -C command line option to spit out COPYING file
|
|
|
|
|
|
This is similar to guards on mercurial queues and they're used for selecting
which files are compiled into some given object. We already do something
similar, but it's mostly hard coded for the m5 binary and the m5 library
and I'd like to make it more flexible to better support the unittests
|
|
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
|
|
This causes a lot of rebuilds that could have otherwise possibly been
avoided, and, more annoyingly, a lot of unnecessary rerunning of the
regressions. The benefits of having the revision in the output haven't
materialized, so this change removes it.
|
|
Get rid of RELEASE_NOTES since we no longer do releases, update some of the
information in README, and update the date in LICENSE.
|
|
I like the brevity of Ali's recent change, but the ambiguity of
sometimes showing the source and sometimes the target is a little
confusing. This patch makes scons typically list all sources and
all targets for each action, with the common path prefix factored
out for brevity. It's a little more verbose now but also more
informative.
Somehow Ali talked me into adding colors too, which is a whole
'nother story.
|
|
Ran all the source files through 'perl -pi' with this script:
s|\s*(};?\s*)?/\*\s*(end\s*)?namespace\s*(\S+)\s*\*/(\s*})?|} // namespace $3|;
s|\s*};?\s*//\s*(end\s*)?namespace\s*(\S+)\s*|} // namespace $2\n|;
s|\s*};?\s*//\s*(\S+)\s*namespace\s*|} // namespace $1\n|;
Also did a little manual editing on some of the arch/*/isa_traits.hh files
and src/SConscript.
|
|
|
|
|
|
The build_dir parameter name has been deprecated and replaced with
variant_dir. This change switches us over to avoid warning spew in newer
versions of scons.
|
|
This is necessary because versions of swig older than 1.3.39 fail to
do the right thing and try to do relative imports for everything (even
with the package= option to %module). Instead of putting params in
the m5.internal.params package, put params in the m5.internal package
and make all param modules start with param_. Same thing for
m5.internal.enums.
Also, stop importing all generated params into m5.objects. They are
not necessary and now with everything using relative imports we wound
up with pollution of the namespace (where builtin-range got overridden).
--HG--
rename : src/python/m5/internal/enums/__init__.py => src/python/m5/internal/enums.py
rename : src/python/m5/internal/params/__init__.py => src/python/m5/internal/params.py
|
|
kill params.i and create a separate .i for each object (param, enums, etc.)
|
|
Instead of putting all object files into m5/object/__init__.py, interrogate
the importer to find out what should be imported.
Instead of creating a single file that lists all of the embedded python
modules, use static object construction to put those objects onto a list.
Do something similar for embedded swig (C++) code.
|
|
|
|
If the user sets the environment variable M5_OVERRIDE_PY_SOURCE to
True, then imports that would normally find python code compiled into
the executable will instead first check in the absolute location where
the code was found during the build of the executable. This only
works for files in the src (or extras) directories, not automatically
generated files.
This is a developer feature!
|
|
This causes builds to happen in sorted order rather than in
declaration order. This gets annoying when you make a global change
and then you notice that the files that are being compiled are jumping
around the directory hierarchy.
|
|
|
|
|
|
|
|
|
|
Get rid of misc.py and just stick misc things in __init__.py
Move utility functions out of SCons files and into m5.util
Move utility type stuff from m5/__init__.py to m5/util/__init__.py
Remove buildEnv from m5 and allow access only from m5.defines
Rename AddToPath to addToPath while we're moving it to m5.util
Rename read_command to readCommand while we're moving it
Rename compare_versions to compareVersions while we're moving it.
--HG--
rename : src/python/m5/convert.py => src/python/m5/util/convert.py
rename : src/python/m5/smartdict.py => src/python/m5/util/smartdict.py
|
|
Compile gzstream as position independent code
use the PIC version of date for shared libs...oops
|
|
|
|
Start by turning all of the *Source functions into classes
so we can do more calculations and more easily collect the data we need.
Add parameters to the new classes for indicating what sorts of flags the
objects should be compiled with so we can allow certain files to be compiled
without Werror for example.
|
|
|
|
This allows me to clean things up so we are up to date with respect to
deprecated features. There are many features scheduled for permanent failure
in scons 2.0 and 0.98.1 provides the most compatability for that. It
also paves the way for some nice new features that I will add soon
|
|
This means that similar to libm5_fast.so, you need to explicitly build
build/ALPHA_SE/libm5_fast.a if you want it.
|
|
changing.
|
|
Also clean things up so that help strings can more easily be added.
Move the help function into trace.py
|