author | Andreas Sandberg <andreas.sandberg@arm.com> | 2016-04-22 22:26:56 +0100 |
---|---|---|
committer | Andreas Sandberg <andreas.sandberg@arm.com> | 2019-08-29 12:12:25 +0000 |
commit | 51d38a4b79d54a97d438ecdba5ce94d6e9172eef (patch) | |
tree | 442153e96f6261a7fed4e7a1dfdcfa7f787097d7 /src/base | |
parent | ab620aca1b6946bd2978d67103b0734e5dfd475d (diff) | |
stats: Add beta support for HDF5 stat dumps
This changeset adds support for stat dumps in the HDF5 file
format. HDF5 is a binary data format that represents data in a
file-system-like balanced tree. It has native support for
N-dimensional arrays and binary data (e.g., frame buffers).
It has the following benefits over traditional text stat files:
* Efficient storage of time series (multiple stat dumps)
* Fast lookup of stats
* Plenty of existing tooling (e.g., Python libraries and graphical
viewers)
* File format can be used to store frame buffers together with
normal stats.
Drawbacks:
* Large startup cost (single stat dump larger than text equivalent)
* Stat dumps are slower than text
Known limitations:
* Distributions and histograms aren't supported.
HDF5 stat output can be enabled using the 'h5' URL scheme when
overriding the stat file name on gem5's command line. The following
parameters are supported:
* chunking (unsigned): Number of time steps to pre-allocate
(default: 10)
* desc (bool): Output stat descriptions (default: True)
* formulas (bool): Output derived stats (default: True)
Example gem5 command line:
    ./build/ARM/gem5.opt \
        --stats-file="h5://stats.h5?desc=False;formulas=False" \
        configs/example/fs.py
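To make the URL syntax concrete, here is a minimal sketch of how such a stat file URL could be split into a file name and the three parameters listed above. The helper name parse_h5_url and the accepted boolean spellings are assumptions made for illustration; this is not gem5's own parser, which may behave differently.

```python
from urllib.parse import urlsplit

# Defaults taken from the parameter list in the commit message.
DEFAULTS = {"chunking": 10, "desc": True, "formulas": True}


def parse_h5_url(url):
    """Split an 'h5://' stats URL into (file name, options).

    Hypothetical helper for illustration only; not part of gem5.
    """
    parts = urlsplit(url)
    if parts.scheme != "h5":
        raise ValueError("not an h5 URL: %r" % url)

    options = dict(DEFAULTS)
    # The example command line separates parameters with ';'. Split
    # manually (also accepting '&') rather than relying on parse_qs.
    for item in parts.query.replace("&", ";").split(";"):
        if not item:
            continue
        key, _, value = item.partition("=")
        if key not in DEFAULTS:
            raise ValueError("unknown parameter: %r" % key)
        if isinstance(DEFAULTS[key], bool):
            # Assumed spellings for true; anything else counts as false.
            options[key] = value.lower() in ("1", "true", "yes")
        else:
            options[key] = int(value)

    # In 'h5://stats.h5' the file name is parsed as the network location.
    return parts.netloc + parts.path, options


print(parse_h5_url("h5://stats.h5?desc=False;formulas=False"))
```

Parameters that are left out keep the defaults documented above, so a URL with an empty query string requests descriptions, formulas, and a chunk size of 10.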
Example Python stat consumer that computes IPC:
    import h5py
    f = h5py.File('stats.h5', 'r')
    group = f['/system/cpu']
    for i, c in zip(group['committedInsts'], group['numCycles']):
        print(i, c, i / c)
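The IPC calculation divides by the cycle count, which can be zero for a dump taken before the CPU has run. A small sketch of a guarded version follows. The function name ipc_series is hypothetical, and plain lists stand in for the h5py datasets so the snippet runs without an HDF5 file.

```python
def ipc_series(insts, cycles):
    """Return one IPC value per stat dump, or None where cycles == 0."""
    result = []
    for i, c in zip(insts, cycles):
        i, c = float(i), float(c)
        result.append(i / c if c > 0 else None)
    return result


# Stand-in values. With h5py these would be, for example,
# group['committedInsts'][:] and group['numCycles'][:].
print(ipc_series([100.0, 250.0, 0.0], [200.0, 400.0, 0.0]))
```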
Change-Id: I351c6cbff2fb7bef9012f47876ba227ed288975b
Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/c/public/gem5/+/8121
Tested-by: kokoro <noreply+kokoro@google.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Reviewed-by: Daniel Carvalho <odanrc@yahoo.com.br>
Reviewed-by: Ciro Santilli <ciro.santilli@arm.com>
Diffstat (limited to 'src/base')
-rw-r--r-- | src/base/SConscript | 2 |
-rw-r--r-- | src/base/stats/hdf5.cc | 325 |
-rw-r--r-- | src/base/stats/hdf5.hh | 159 |
3 files changed, 486 insertions, 0 deletions
diff --git a/src/base/SConscript b/src/base/SConscript
index 96f7b5b50..2c8f73371 100644
--- a/src/base/SConscript
+++ b/src/base/SConscript
@@ -83,6 +83,8 @@ Source('loader/symtab.cc')
 
 Source('stats/group.cc')
 Source('stats/text.cc')
+if env['USE_HDF5']:
+    Source('stats/hdf5.cc')
 
 GTest('addr_range.test', 'addr_range.test.cc')
 GTest('addr_range_map.test', 'addr_range_map.test.cc')
diff --git a/src/base/stats/hdf5.cc b/src/base/stats/hdf5.cc
new file mode 100644
index 000000000..bb3705c3b
--- /dev/null
+++ b/src/base/stats/hdf5.cc
@@ -0,0 +1,325 @@
+/*
+ * Copyright (c) 2016-2019 Arm Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder. You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Andreas Sandberg
+ */
+
+#include "base/stats/hdf5.hh"
+
+#include "base/logging.hh"
+#include "base/stats/info.hh"
+
+/**
+ * Check if all strings in a container are empty.
+ */
+template<typename T>
+bool emptyStrings(const T &labels)
+{
+    for (const auto &s : labels) {
+        if (!s.empty())
+            return false;
+    }
+    return true;
+}
+
+
+namespace Stats {
+
+Hdf5::Hdf5(const std::string &file, unsigned chunking,
+           bool desc, bool formulas)
+    : fname(file), timeChunk(chunking),
+      enableDescriptions(desc), enableFormula(formulas),
+      dumpCount(0)
+{
+    // Tell the library not to print exceptions by default. There are
+    // cases where we rely on exceptions to determine if we need to
+    // create a node or if we can just open it.
+    H5::Exception::dontPrint();
+}
+
+Hdf5::~Hdf5()
+{
+}
+
+
+void
+Hdf5::begin()
+{
+    h5File = H5::H5File(fname,
+                        // Truncate the file if this is the first dump
+                        dumpCount > 0 ? H5F_ACC_RDWR : H5F_ACC_TRUNC);
+    path.push(h5File.openGroup("/"));
+}
+
+void
+Hdf5::end()
+{
+    assert(valid());
+
+    dumpCount++;
+}
+
+bool
+Hdf5::valid() const
+{
+    return true;
+}
+
+
+void
+Hdf5::beginGroup(const char *name)
+{
+    auto base = path.top();
+
+    // Try to open an existing stat group corresponding to the
+    // name. Create it if it doesn't exist.
+    H5::Group group;
+    try {
+        group = base.openGroup(name);
+    } catch (H5::FileIException e) {
+        group = base.createGroup(name);
+    } catch (H5::GroupIException e) {
+        group = base.createGroup(name);
+    }
+
+    path.push(group);
+}
+
+void
+Hdf5::endGroup()
+{
+    assert(!path.empty());
+    path.pop();
+}
+
+void
+Hdf5::visit(const ScalarInfo &info)
+{
+    // Since this stat is a scalar, we need 1-dimensional value in the
+    // stat file. The Hdf5::appendStat helper will populate the size
+    // of the first dimension (time).
+    hsize_t fdims[1] = { 0, };
+    double data[1] = { info.result(), };
+
+    appendStat(info, 1, fdims, data);
+}
+
+void
+Hdf5::visit(const VectorInfo &info)
+{
+    appendVectorInfo(info);
+}
+
+void
+Hdf5::visit(const DistInfo &info)
+{
+    warn_once("HDF5 stat files don't support distributions.\n");
+}
+
+void
+Hdf5::visit(const VectorDistInfo &info)
+{
+    warn_once("HDF5 stat files don't support vector distributions.\n");
+}
+
+void
+Hdf5::visit(const Vector2dInfo &info)
+{
+    // Request a 3-dimensional stat, the first dimension will be
+    // populated by the Hdf5::appendStat() helper. The remaining two
+    // dimensions correspond to the stat instance.
+    hsize_t fdims[3] = { 0, info.x, info.y };
+    H5::DataSet data_set = appendStat(info, 3, fdims, info.cvec.data());
+
+    if (dumpCount == 0) {
+        if (!info.subnames.empty() && !emptyStrings(info.subnames))
+            addMetaData(data_set, "subnames", info.subnames);
+
+        if (!info.y_subnames.empty() && !emptyStrings(info.y_subnames))
+            addMetaData(data_set, "y_subnames", info.y_subnames);
+
+        if (!info.subdescs.empty() && !emptyStrings(info.subdescs))
+            addMetaData(data_set, "subdescs", info.subdescs);
+    }
+}
+
+void
+Hdf5::visit(const FormulaInfo &info)
+{
+    if (!enableFormula)
+        return;
+
+    H5::DataSet data_set = appendVectorInfo(info);
+
+    if (dumpCount == 0)
+        addMetaData(data_set, "equation", info.str());
+}
+
+void
+Hdf5::visit(const SparseHistInfo &info)
+{
+    warn_once("HDF5 stat files don't support sparse histograms.\n");
+}
+
+H5::DataSet
+Hdf5::appendVectorInfo(const VectorInfo &info)
+{
+    const VResult &vr(info.result());
+    // Request a 2-dimensional stat, the first dimension will be
+    // populated by the Hdf5::appendStat() helper. The remaining
+    // dimension correspond to the stat instance.
+    hsize_t fdims[2] = { 0, vr.size() };
+    H5::DataSet data_set = appendStat(info, 2, fdims, vr.data());
+
+    if (dumpCount == 0) {
+        if (!info.subnames.empty() && !emptyStrings(info.subnames))
+            addMetaData(data_set, "subnames", info.subnames);
+
+        if (!info.subdescs.empty() && !emptyStrings(info.subdescs))
+            addMetaData(data_set, "subdescs", info.subdescs);
+    }
+
+    return data_set;
+}
+
+H5::DataSet
+Hdf5::appendStat(const Info &info, int rank, hsize_t *dims, const double *data)
+{
+    H5::Group group = path.top();
+    H5::DataSet data_set;
+    H5::DataSpace fspace;
+
+    dims[0] = dumpCount + 1;
+
+    if (dumpCount > 0) {
+        // Get the existing stat if we have already dumped this stat
+        // before.
+        data_set = group.openDataSet(info.name);
+        data_set.extend(dims);
+        fspace = data_set.getSpace();
+    } else {
+        // We don't have the stat already, create it.
+
+        H5::DSetCreatPropList props;
+
+        // Setup max dimensions based on the requested file dimensions
+        std::vector<hsize_t> max_dims(rank);
+        std::copy(dims, dims + rank, max_dims.begin());
+        max_dims[0] = H5S_UNLIMITED;
+
+        // Setup chunking
+        std::vector<hsize_t> chunk_dims(rank);
+        std::copy(dims, dims + rank, chunk_dims.begin());
+        chunk_dims[0] = timeChunk;
+        props.setChunk(rank, chunk_dims.data());
+
+        // Enable compression
+        props.setDeflate(1);
+
+        fspace = H5::DataSpace(rank, dims, max_dims.data());
+        data_set = group.createDataSet(info.name, H5::PredType::NATIVE_DOUBLE,
+                                       fspace, props);
+
+        if (enableDescriptions && !info.desc.empty()) {
+            addMetaData(data_set, "description", info.desc);
+        }
+    }
+
+    // The first dimension is time which isn't included in data.
+    dims[0] = 1;
+    H5::DataSpace mspace(rank, dims);
+    std::vector<hsize_t> foffset(rank, 0);
+    foffset[0] = dumpCount;
+
+    fspace.selectHyperslab(H5S_SELECT_SET, dims, foffset.data());
+    data_set.write(data, H5::PredType::NATIVE_DOUBLE, mspace, fspace);
+
+    return data_set;
+}
+
+void
+Hdf5::addMetaData(H5::DataSet &loc, const char *name,
+                  const std::vector<const char *> &values)
+{
+    H5::StrType type(H5::PredType::C_S1, H5T_VARIABLE);
+    hsize_t dims[1] = { values.size(), };
+    H5::DataSpace space(1, dims);
+    H5::Attribute attribute = loc.createAttribute(name, type, space);
+    attribute.write(type, values.data());
+}
+
+void
+Hdf5::addMetaData(H5::DataSet &loc, const char *name,
+                  const std::vector<std::string> &values)
+{
+    std::vector<const char *> cstrs(values.size());
+    for (int i = 0; i < values.size(); ++i)
+        cstrs[i] = values[i].c_str();
+
+    addMetaData(loc, name, cstrs);
+}
+
+void
+Hdf5::addMetaData(H5::DataSet &loc, const char *name,
+                  const std::string &value)
+{
+    H5::StrType type(H5::PredType::C_S1, value.length() + 1);
+    hsize_t dims[1] = { 1, };
+    H5::DataSpace space(1, dims);
+    H5::Attribute attribute = loc.createAttribute(name, type, space);
+    attribute.write(type, value.c_str());
+}
+
+void
+Hdf5::addMetaData(H5::DataSet &loc, const char *name, double value)
+{
+    hsize_t dims[1] = { 1, };
+    H5::DataSpace space(1, dims);
+    H5::Attribute attribute = loc.createAttribute(
+        name, H5::PredType::NATIVE_DOUBLE, space);
+    attribute.write(H5::PredType::NATIVE_DOUBLE, &value);
+}
+
+
+std::unique_ptr<Output>
+initHDF5(const std::string &filename, unsigned chunking,
+         bool desc, bool formulas)
+{
+    return std::unique_ptr<Output>(
+        new Hdf5(filename, chunking, desc, formulas));
+}
+
+}; // namespace Stats
diff --git a/src/base/stats/hdf5.hh b/src/base/stats/hdf5.hh
new file mode 100644
index 000000000..8c965940d
--- /dev/null
+++ b/src/base/stats/hdf5.hh
@@ -0,0 +1,159 @@
+/*
+ * Copyright (c) 2016-2019 Arm Limited
+ * All rights reserved
+ *
+ * The license below extends only to copyright in the software and shall
+ * not be construed as granting a license to any other intellectual
+ * property including but not limited to intellectual property relating
+ * to a hardware implementation of the functionality of the software
+ * licensed hereunder. You may use the software subject to the license
+ * terms below provided that you ensure that this notice is replicated
+ * unmodified and in its entirety in all distributions of the software,
+ * modified or unmodified, in source code or in binary form.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are
+ * met: redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer;
+ * redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution;
+ * neither the name of the copyright holders nor the names of its
+ * contributors may be used to endorse or promote products derived from
+ * this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Authors: Andreas Sandberg
+ */
+
+#ifndef __BASE_STATS_HDF5_HH__
+#define __BASE_STATS_HDF5_HH__
+
+#include <H5Cpp.h>
+
+#include <memory>
+#include <stack>
+#include <string>
+#include <vector>
+
+#include "base/output.hh"
+#include "base/stats/output.hh"
+#include "base/stats/types.hh"
+
+namespace Stats {
+
+class Hdf5 : public Output
+{
+  public:
+    Hdf5(const std::string &file, unsigned chunking, bool desc, bool formulas);
+
+    ~Hdf5();
+
+    Hdf5() = delete;
+    Hdf5(const Hdf5 &other) = delete;
+
+  public: // Output interface
+    void begin() override;
+    void end() override;
+    bool valid() const override;
+
+    void beginGroup(const char *name) override;
+    void endGroup() override;
+
+    void visit(const ScalarInfo &info) override;
+    void visit(const VectorInfo &info) override;
+    void visit(const DistInfo &info) override;
+    void visit(const VectorDistInfo &info) override;
+    void visit(const Vector2dInfo &info) override;
+    void visit(const FormulaInfo &info) override;
+    void visit(const SparseHistInfo &info) override;
+
+  protected:
+    /**
+     * Helper function to append vector stats and set their metadata.
+     */
+    H5::DataSet appendVectorInfo(const VectorInfo &info);
+
+    /**
+     * Helper function to append an n-dimensional double stat to the
+     * file.
+     *
+     * This helper function assumes that all stats include a time
+     * component. I.e., a Stat::Scalar is a 1-dimensional stat.
+     *
+     * @param info Stat info structure.
+     * @param rank Stat dimensionality (including time).
+     * @param dims Size of each of the dimensions.
+     */
+    H5::DataSet appendStat(const Info &info, int rank, hsize_t *dims,
+                           const double *data);
+
+    /**
+     * Helper function to add a string vector attribute to a stat.
+     *
+     * @param loc Parent location in the file.
+     * @param name Attribute name.
+     * @param values Attribute value.
+     */
+    void addMetaData(H5::DataSet &loc, const char *name,
+                     const std::vector<const char *> &values);
+
+    /**
+     * Helper function to add a string vector attribute to a stat.
+     *
+     * @param loc Parent location in the file.
+     * @param name Attribute name.
+     * @param values Attribute value.
+     */
+    void addMetaData(H5::DataSet &loc, const char *name,
+                     const std::vector<std::string> &values);
+
+    /**
+     * Helper function to add a string attribute to a stat.
+     *
+     * @param loc Parent location in the file.
+     * @param name Attribute name.
+     * @param value Attribute value.
+     */
+    void addMetaData(H5::DataSet &loc, const char *name,
+                     const std::string &value);
+
+    /**
+     * Helper function to add a double attribute to a stat.
+     *
+     * @param loc Parent location in the file.
+     * @param name Attribute name.
+     * @param value Attribute value.
+     */
+    void addMetaData(H5::DataSet &loc, const char *name, double value);
+
+  protected:
+    const std::string fname;
+    const hsize_t timeChunk;
+    const bool enableDescriptions;
+    const bool enableFormula;
+
+    std::stack<H5::Group> path;
+
+    unsigned dumpCount;
+    H5::H5File h5File;
+};
+
+std::unique_ptr<Output> initHDF5(
+    const std::string &filename,unsigned chunking = 10,
+    bool desc = true, bool formulas = true);
+
+} // namespace Stats
+
+#endif // __BASE_STATS_HDF5_HH__
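The bookkeeping in Hdf5::appendStat (grow the time dimension to dumpCount + 1, then write one slab at offset dumpCount) can be modelled without the HDF5 library. The sketch below is a toy in-memory stand-in: the class TimeSeriesStat and its methods are hypothetical names, and it leaves out chunking, compression, and attributes.

```python
class TimeSeriesStat:
    """Toy stand-in for one HDF5 dataset whose first axis is time."""

    def __init__(self, shape):
        self.shape = tuple(shape)  # stat dimensions, excluding time
        self.rows = []             # one flattened entry per stat dump

    def append(self, dump_count, data):
        """Store the values for dump number dump_count; return the new dims."""
        data = list(data)
        expected = 1
        for n in self.shape:
            expected *= n
        if len(data) != expected:
            raise ValueError("data does not match the stat shape")

        # Mirrors dims[0] = dumpCount + 1: the time axis grows by one row.
        new_len = dump_count + 1
        if new_len != len(self.rows) + 1:
            raise ValueError("dumps must be appended in order")

        # Mirrors foffset[0] = dumpCount: the slab lands in the last row.
        self.rows.append(data)
        return (new_len,) + self.shape


scalar = TimeSeriesStat(())           # rank 1 in the file: time only
print(scalar.append(0, [1.0]))
print(scalar.append(1, [2.5]))

vec2d = TimeSeriesStat((2, 3))        # rank 3 in the file: time, x, y
print(vec2d.append(0, range(6)))
```

A scalar stat therefore ends up as a 1-dimensional series and a 2D vector stat as a 3-dimensional array, matching the ranks passed to appendStat in the visit functions.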