Age | Commit message (Collapse) | Author |
|
Previously the directory covered a flat address range that always
started from address 0. This change adds a vector of address ranges
with interleaving and hashing that each directory keeps track of and
the necessary flexibility to support systems with non continuous
memory ranges.
Change-Id: I6ea1c629bdf4c5137b7d9c89dbaf6c826adfd977
Reviewed-by: Andreas Sandberg <andreas.sandberg@arm.com>
Reviewed-on: https://gem5-review.googlesource.com/2903
Reviewed-by: Bradford Beckmann <brad.beckmann@amd.com>
Reviewed-by: Jason Lowe-Power <jason@lowepower.com>
Maintainer: Jason Lowe-Power <jason@lowepower.com>
|
|
This patch is imported from reviewboard patch 2551 by Nilay.
This patch moves from a dynamically defined MachineType to a statically
defined one. The need for this patch was felt since a dynamically defined
type prevents us from having types for which no machine definition may
exist.
The following changes have been made:
i. each machine definition now uses a type from the MachineType enumeration
instead of any random identifier. This required changing the grammar and the
*.sm files.
ii. MachineType enumeration defined statically in RubySlicc_Exports.sm.
* * *
normal protocol fixes for nilay's parser machine type fix
|
|
Changeset 4872dbdea907 replaced Address by Addr, but did not make changes to
print statements. So the addresses which were being printed in hex earlier
along with their line address, were now being printed in decimals. This patch
adds a function printAddress(Addr) that can be used to print the address in hex
along with the lines address. This function has been put to use in some of the
places. At other places, change has been made to print just the address in
hex.
|
|
This patch changes MessageBuffer and TimerTable, two structures used for
buffering messages by components in ruby. These structures would no longer
maintain pointers to clock objects. Functions in these structures have been
changed to take as input current time in Tick. Similarly, these structures
will not operate on Cycle valued latencies for different operations. The
corresponding functions would need to be provided with these latencies by
components invoking the relevant functions. These latencies should also be
in Ticks.
I felt the need for these changes while trying to speed up ruby. The ultimate
aim is to eliminate Consumer class and replace it with an EventManager object in
the MessageBuffer and TimerTable classes. This object would be used for
scheduling events. The event itself would contain information on the object and
function to be invoked.
In hindsight, it seems I should have done this while I was moving away from use
of a single global clock in the memory system. That change led to introduction
of clock objects that replaced the global clock object. It never crossed my
mind that having clock object pointers is not a good design. And now I really
don't like the fact that we have separate consumer, receiver and sender
pointers in message buffers.
|
|
Currently the sequencer calls the function setMRU that updates the replacement
policy structures with the first level caches. While functionally this is
correct, the problem is that this requires calling findTagInSet() which is an
expensive function. This patch removes the calls to setMRU from the sequencer.
All controllers should now update the replacement policy on their own.
The set and the way index for a given cache entry can be found within the
AbstractCacheEntry structure. Use these indicies to update the replacement
policy structures.
|
|
MessageBuffer is a SimObject now. There were protocols that still declared
some of the message buffers are variables of the controller, but not as input
parameters. Special handling was required for these variables in the SLICC
compiler. This patch changes this. Now all message buffers are declared as
input parameters.
|
|
|
|
Currently the sequencer calls the function setMRU that updates the replacement
policy structures with the first level caches. While functionally this is
correct, the problem is that this requires calling findTagInSet() which is an
expensive function. This patch removes the calls to setMRU from the sequencer.
All controllers should now update the replacement policy on their own.
The set and the way index for a given cache entry can be found within the
AbstractCacheEntry structure. Use these indicies to update the replacement
policy structures.
|
|
This is in preparation for adding a second arugment to the lookup
function for the CacheMemory class. The change to *.sm files was made using
the following sed command:
sed -i 's/\[\([0-9A-Za-z._()]*\)\]/.lookup(\1)/' src/mem/protocol/*.sm
|
|
This patch eliminates the type Address defined by the ruby memory system.
This memory system would now use the type Addr that is in use by the
rest of the system.
|
|
Avoid clash between type Addr and variable name Addr.
|
|
|
|
This patch removes the data block present in the directory entry structure
of each protocol in gem5's mainline. Firstly, this is required for moving
towards common set of memory controllers for classic and ruby memory systems.
Secondly, the data block was being misused in several places. It was being
used for having free access to the physical memory instead of calling on the
memory controller.
From now on, the directory controller will not have a direct visibility into
the physical memory. The Memory Vector object now resides in the
Memory Controller class. This also means that some significant changes are
being made to the functional accesses in ruby.
|
|
This patch is the final patch in a series of patches. The aim of the series
is to make ruby more configurable than it was. More specifically, the
connections between controllers are not at all possible (unless one is ready
to make significant changes to the coherence protocol). Moreover the buffers
themselves are magically connected to the network inside the slicc code.
These connections are not part of the configuration file.
This patch makes changes so that these connections will now be made in the
python configuration files associated with the protocols. This requires
each state machine to expose the message buffers it uses for input and output.
So, the patch makes these buffers configurable members of the machines.
The patch drops the slicc code that usd to connect these buffers to the
network. Now these buffers are exposed to the python configuration system
as Master and Slave ports. In the configuration files, any master port
can be connected any slave port. The file pyobject.cc has been modified to
take care of allocating the actual message buffer. This is inline with how
other port connections work.
|
|
There are two changes this patch makes to the way configurable members of a
state machine are specified in SLICC. The first change is that the data
member declarations will need to be separated by a semi-colon instead of a
comma. Secondly, the default value to be assigned would now use SLICC's
assignment operator i.e. ':='.
|
|
This changeset does away with prefixing of member variables of state machines
with the identity of the machine itself.
|
|
Using '== true' in a boolean expression is totally redundant,
and using '== false' is pretty verbose (and arguably less
readable in most cases) compared to '!'.
It's somewhat of a pet peeve, perhaps, but I had some time
waiting for some tests to run and decided to clean these up.
Unfortunately, SLICC appears not to have the '!' operator,
so I had to leave the '== false' tests in the SLICC code.
|
|
As of now, the enqueue statement can take in any number of 'pairs' as
argument. But we only use the pair in which latency is the key. This
latency is allowed to be either a fixed integer or a member variable of
controller in which the expression appears. This patch drops the use of pairs
in an enqueue statement. Instead, an expression is allowed which will be
interpreted to be the latency of the enqueue. This expression can anything
allowed by slicc including a constant integer or a member variable.
|
|
A cluster over here means a set of controllers that can be accessed only by a
certain set of cores. For example, consider a two level hierarchy. Assume
there are 4 L1 controllers (private) and 2 L2 controllers. We can have two
different hierarchies here:
a. the address space is partitioned between the two L2 controllers. Each L1
controller accesses both the L2 controllers. In this case, each L1 controller
is a cluster initself.
b. both the L2 controllers can cache any address. An L1 controller has access
to only one of the L2 controllers. In this case, each L2 controller
along with the L1 controllers that access it, form a cluster.
This patch allows for each controller to have a cluster ID, which is 0 by
default. By setting the cluster ID properly, one can instantiate hierarchies
with clusters. Note that the coherence protocol might have to be changed as
well.
|
|
The patch started of with removing the global variables from the profiler for
profiling the miss latency of requests made to the cache. The corrresponding
histograms have been moved to the Sequencer. These are combined together when
the histograms are printed. Separate histograms are now maintained for
tracking latency of all requests together, of hits only and of misses only.
A particular set of histograms used to use the type GenericMachineType defined
in one of the protocol files. This patch removes this type. Now, everything
that relied on this type would use MachineType instead. To do this, SLICC has
been changed so that multiple machine types can be declared by a controller
in its preamble.
|
|
|
|
Change all occurrances of Address as a variable name to instead use Addr.
Address is an allowed name in slicc even when Address is also being used as a
type, leading to declarations of "Address Address". While this works, it
prevents adding another field of type Address because the compiler then thinks
Address is a variable name, not type.
Committed by: Nilay Vaish <nilay@cs.wisc.edu>
|
|
Updates copyright years, removes space at the end of lines, shortens
variable names.
|
|
This patch changes the way cache statistics are collected in ruby.
As of now, there is separate entity called CacheProfiler which holds
statistical variables for caches. The CacheMemory class defines different
functions for accessing the CacheProfiler. These functions are then invoked
in the .sm files. I find this approach opaque and prone to error. Secondly,
we probably should not be paying the cost of a function call for recording
statistics.
Instead, this patch allows for accessing statistical variables in the
.sm files. The collection would become transparent. Secondly, it would happen
in place, so no function calls. The patch also removes the CacheProfiler class.
--HG--
rename : src/mem/slicc/ast/InfixOperatorExprAST.py => src/mem/slicc/ast/OperatorExprAST.py
|
|
This patch is as of now the final patch in the series of patches that replace
Time with Cycles.This patch further replaces Time with Cycles in Sequencer,
Profiler, different protocols and related entities.
Though Time has not been completely removed, the places where it is in use
seem benign as of now.
|
|
The patch started of with replacing Time with Cycles in the Consumer class.
But to get ruby to compile, the rest of the changes had to be carried out.
Subsequent patches will further this process, till we completely replace
Time with Cycles.
|
|
This patch replaces get_time() in *.sm files with curCycle() which
is now possible since controllers are clocked objects.
|
|
The patch adds support to slicc for recognizing arguments that should be
passed to the constructor of a class. I did not like the fact that an explicit
check was being carried on the type 'TBETable' to figure out the arguments to
be passed to the constructor.
The patch also moves some of the member variables that are declared for all
the controllers to the base class AbstractController.
|
|
This patch adds support to different entities in the ruby memory system
for more reliable functional read/write accesses. Only the simple network
has been augmented as of now. Later on Garnet will also support functional
accesses.
The patch adds functional access code to all the different types of messages
that protocols can send around. These messages are functionally accessed
by going through the buffers maintained by the network entities.
The patch also rectifies some of the bugs found in coherence protocols while
testing the patch.
With this patch applied, functional writes always succeed. But functional
reads can still fail.
|
|
I don't like using the word hack. Hence, the patch.
|
|
This patch implements the functionality for forwarding invalidations and
replacements from the L1 cache of the Ruby memory system to the O3 CPU. The
implementation adds a list of ports to RubyPort. Whenever a replacement or an
invalidation is performed, the L1 cache forwards this to all the ports, which
is the LSQ in case of the O3 CPU.
|
|
This patch rpovides functional access support in Ruby. Currently only
the M5Port of RubyPort supports functional accesses. The support for
functional through the PioPort will be added as a separate patch.
|
|
The access permissions for the directory entries are not being set correctly.
This is because pointers are not used for handling directory entries.
function. get and set functions for access permissions have been added to the
Controller state machine. The changePermission() function provided by the
AbstractEntry and AbstractCacheEntry classes has been exposed to SLICC
code once again. The set_permission() functionality has been removed.
NOTE: Each protocol will have to define these get and set functions in order
to compile successfully.
|
|
Identifying response vnets versus other vnets will allow garnet to
determine which vnets will carry data packets, and which will carry
ctrl packets, and use appropriate buffer sizes (since data packets are larger
than ctrl packets). This in turn allows the orion power model to accurately
estimate buffer power.
|
|
The goal of the patch is to do away with the CacheMsg class currently in use
in coherence protocols. In place of CacheMsg, the RubyRequest class will used.
This class is already present in slicc_interface/RubyRequest.hh. In fact,
objects of class CacheMsg are generated by copying values from a RubyRequest
object.
|
|
This patch converts CacheRequestType to RubyRequestType so that both the
protocol dependent and independent code makes use of the same request type.
|
|
This patch converts AccessModeType to RubyAccessMode so that both the
protocol dependent and independent code uses the same access mode.
|
|
In SLICC, in order to define a type a data type for which it should not
generate any code, the keyword external_type is used. For those data types for
which code should be generated, the keyword structure is used. This patch
eliminates the use of keyword external_type for defining structures. structure
key word can now have an optional attribute external, which would be used for
figuring out whether or not to generate the code for this structure. Also, now
structures can have functions as well data members in them.
|
|
In order to add stall and wait facility for protocols, a keyword
wake_up_dependents was introduced. This patch removes the keyword,
instead this functionality is now implemented as function call.
|
|
In order to add stall and wait facility for protocols, a keyword
wake_up_all_dependents was introduced. This patch removes the keyword,
instead this functionality is now implemented as function call.
|
|
This patch integrates permissions with cache and memory states, and then
automates the setting of permissions within the generated code. No longer
does one need to manually set the permissions within the setState funciton.
This patch will faciliate easier functional access support by always correctly
setting permissions for both cache and memory states.
--HG--
rename : src/mem/slicc/ast/EnumDeclAST.py => src/mem/slicc/ast/StateDeclAST.py
rename : src/mem/slicc/ast/TypeFieldEnumAST.py => src/mem/slicc/ast/TypeFieldStateAST.py
|
|
The patch changes the order in which L1 dcache and icache are looked up when
a request comes in. Earlier, if a request came in for instruction fetch, the
dcache was looked up before the icache, to correctly handle self-modifying
code. But, in the common case, dcache is going to report a miss and the
subsequent icache lookup is going to report a hit. Given the invariant -
caches under the same controller keep track of disjoint sets of cache blocks,
we can move the icache lookup before the dcache lookup. In case of a hit in
the icache, using our invariant, we know that the dcache would have reported
a miss. In case of a miss in the icache, we know that icache would have
missed even if the dcache was looked up before looking up the icache.
Effectively, we are doing the same thing as before, though in the common case,
we expect reduction in the number of lookups. This was empirically confirmed
for MOESI hammer. The ratio lookups to access requests is now about 1.1 to 1.
|
|
By stalling and waiting the mandatory queue instead of recycling it, one can
ensure that no incoming messages are starved when the mandatory queue puts
signficant of pressure on the L1 cache controller (i.e. the ruby memtester).
--HG--
rename : src/mem/slicc/ast/WakeUpDependentsStatementAST.py => src/mem/slicc/ast/WakeUpAllDependentsStatementAST.py
|
|
The purpose of this patch is to change the way CacheMemory interfaces with
coherence protocols. Currently, whenever a cache controller (defined in the
protocol under consideration) needs to carry out any operation on a cache
block, it looks up the tag hash map and figures out whether or not the block
exists in the cache. In case it does exist, the operation is carried out
(which requires another lookup). As observed through profiling of different
protocols, multiple such lookups take place for a given cache block. It was
noted that the tag lookup takes anything from 10% to 20% of the simulation
time. In order to reduce this time, this patch is being posted.
I have to acknowledge that the many of the thoughts that went in to this
patch belong to Brad.
Changes to CacheMemory, TBETable and AbstractCacheEntry classes:
1. The lookup function belonging to CacheMemory class now returns a pointer
to a cache block entry, instead of a reference. The pointer is NULL in case
the block being looked up is not present in the cache. Similar change has
been carried out in the lookup function of the TBETable class.
2. Function for setting and getting access permission of a cache block have
been moved from CacheMemory class to AbstractCacheEntry class.
3. The allocate function in CacheMemory class now returns pointer to the
allocated cache entry.
Changes to SLICC:
1. Each action now has implicit variables - cache_entry and tbe. cache_entry,
if != NULL, must point to the cache entry for the address on which the action
is being carried out. Similarly, tbe should also point to the transaction
buffer entry of the address on which the action is being carried out.
2. If a cache entry or a transaction buffer entry is passed on as an
argument to a function, it is presumed that a pointer is being passed on.
3. The cache entry and the tbe pointers received __implicitly__ by the
actions, are passed __explicitly__ to the trigger function.
4. While performing an action, set/unset_cache_entry, set/unset_tbe are to
be used for setting / unsetting cache entry and tbe pointers respectively.
5. is_valid() and is_invalid() has been made available for testing whether
a given pointer 'is not NULL' and 'is NULL' respectively.
6. Local variables are now available, but they are assumed to be pointers
always.
7. It is now possible for an object of the derieved class to make calls to
a function defined in the interface.
8. An OOD token has been introduced in SLICC. It is same as the NULL token
used in C/C++. If you are wondering, OOD stands for Out Of Domain.
9. static_cast can now taken an optional parameter that asks for casting the
given variable to a pointer of the given type.
10. Functions can be annotated with 'return_by_pointer=yes' to return a
pointer.
11. StateMachine has two new variables, EntryType and TBEType. EntryType is
set to the type which inherits from 'AbstractCacheEntry'. There can only be
one such type in the machine. TBEType is set to the type for which 'TBE' is
used as the name.
All the protocols have been modified to conform with the new interface.
|
|
This patch developed by Nilay Vaish converts all the old GEMS-style ruby
debug calls to the appropriate M5 debug calls.
|
|
This patch allows one to disable migratory sharing for those cache blocks that
are accessed by atomic requests. While the implementations are different
between the token and hammer protocols, the motivation is the same. For
Alpha, LLSC semantics expect that normal loads do not unlock cache blocks that
have been locked by LL accesses. Therefore, locked blocks should not transfer
write permissions when responding to these load requests. Instead, only they
only transfer read permissions so that the subsequent SC access can possibly
succeed.
|
|
This patch fixes several bugs related to previous inconsistent assumptions on
how many tokens the Owner had. Mike Marty should have fixes these bugs years
ago. :)
|
|
The previous slower ruby latencies created a mismatch between the faster M5
cpu models and the much slower ruby memory system. Specifically smp
interrupts were much slower and infrequent, as well as cpus moving in and out
of spin locks. The result was many cpus were idle for large periods of time.
These changes fix the latency mismatch.
|
|
Fixed L2 cache miss profiling for the MOESI_CMP_token protocol
|
|
|