Age | Commit message (Collapse) | Author |
|
This patch fixes a long-standing isue with the port flow
control. Before this patch the retry mechanism was shared between all
different packet classes. As a result, a snoop response could get
stuck behind a request waiting for a retry, even if the send/recv
functions were split. This caused message-dependent deadlocks in
stress-test scenarios.
The patch splits the retry into one per packet (message) class. Thus,
sendTimingReq has a corresponding recvReqRetry, sendTimingResp has
recvRespRetry etc. Most of the changes to the code involve simply
clarifying what type of request a specific object was accepting.
The biggest change in functionality is in the cache downstream packet
queue, facing the memory. This queue was shared by requests and snoop
responses, and it is now split into two queues, each with their own
flow control, but the same physical MasterPort. These changes fixes
the previously seen deadlocks.
|
|
This patch changes the SimpleTimingPort and RubyPort to panic on
inhibited requests as this should never happen in either of the
cases. The SimpleTimingPort is only used for the I/O devices PIO port
and the DMA devices config port and should thus never see an inhibited
request. Similarly, the SimpleTimingPort is also used for the
MessagePort in x86, and there should also not be any cases where the
port sees an inhibited request.
|
|
This patch extends the queued port interfaces with methods for
scheduling the transmission of a timing request/response. The methods
are named similar to the corresponding sendTiming(Snoop)Req/Resp,
replacing the "send" with "sched". As the queues are currently
unbounded, the methods always succeed and hence do not return a value.
This functionality was previously provided in the subclasses by
calling PacketQueue::schedSendTiming with the appropriate
parameters. With this change, there is no need to introduce these
extra methods in the subclasses, and the use of the queued interface
is more uniform and explicit.
|
|
This patch makes the queue implementation in the SimpleTimingPort
private to avoid confusion with the protected member queue in the
QueuedSlavePort. The SimpleTimingPort provides the queue_impl to the
QueuedSlavePort and it can be accessed via the reference in the base
class. The use of the member name queue is thus no longer overloaded.
|
|
This patch is a temporary fix until Andreas' four-phase patches
get reviewed and committed. Removing FastAlloc seems to have exposed
an issue which previously was reasonable rare in which packets are freed
before the sending cache is done with them. This change puts incoming packets
no a pendingDelete queue which are deleted at the start of the next call and
thus breaks the dependency between when the caller returns true and when the
packet is actually used by the sending cache.
Running valgrind on a multi-core linux boot and the memtester results in no
valgrind warnings.
|
|
This patch moves send/recvTiming and send/recvTimingSnoop from the
Port base class to the MasterPort and SlavePort, and also splits them
into separate member functions for requests and responses:
send/recvTimingReq, send/recvTimingResp, and send/recvTimingSnoopReq,
send/recvTimingSnoopResp. A master port sends requests and receives
responses, and also receives snoop requests and sends snoop
responses. A slave port has the reciprocal behaviour as it receives
requests and sends responses, and sends snoop requests and receives
snoop responses.
For all MemObjects that have only master ports or slave ports (but not
both), e.g. a CPU, or a PIO device, this patch merely adds more
clarity to what kind of access is taking place. For example, a CPU
port used to call sendTiming, and will now call
sendTimingReq. Similarly, a response previously came back through
recvTiming, which is now recvTimingResp. For the modules that have
both master and slave ports, e.g. the bus, the behaviour was
previously relying on branches based on pkt->isRequest(), and this is
now replaced with a direct call to the apprioriate member function
depending on the type of access. Please note that send/recvRetry is
still shared by all the timing accessors and remains in the Port base
class for now (to maintain the current bus functionality and avoid
changing the statistics of all regressions).
The packet queue is split into a MasterPort and SlavePort version to
facilitate the use of the new timing accessors. All uses of the
PacketQueue are updated accordingly.
With this patch, the type of packet (request or response) is now well
defined for each type of access, and asserts on pkt->isRequest() and
pkt->isResponse() are now moved to the appropriate send member
functions. It is also worth noting that sendTimingSnoopReq no longer
returns a boolean, as the semantics do not alow snoop requests to be
rejected or stalled. All these assumptions are now excplicitly part of
the port interface itself.
|
|
This patch introduces the notion of a master and slave port in the C++
code, thus bringing the previous classification from the Python
classes into the corresponding simulation objects and memory objects.
The patch enables us to classify behaviours into the two bins and add
assumptions and enfore compliance, also simplifying the two
interfaces. As a starting point, isSnooping is confined to a master
port, and getAddrRanges to slave ports. More of these specilisations
are to come in later patches.
The getPort function is not getMasterPort and getSlavePort, and
returns a port reference rather than a pointer as NULL would never be
a valid return value. The default implementation of these two
functions is placed in MemObject, and calls fatal.
The one drawback with this specific patch is that it requires some
code duplication, e.g. QueuedPort becomes QueuedMasterPort and
QueuedSlavePort, and BusPort becomes BusMasterPort and BusSlavePort
(avoiding multiple inheritance). With the later introduction of the
port interfaces, moving the functionality outside the port itself, a
lot of the duplicated code will disappear again.
|
|
This patch decouples the queueing and the port interactions to
simplify the introduction of the master and slave ports. By separating
the queueing functionality from the port itself, it becomes much
easier to distinguish between master and slave ports, and still retain
the queueing ability for both (without code duplication).
As part of the split into a PacketQueue and a port, there is now also
a hierarchy of two port classes, QueuedPort and SimpleTimingPort. The
QueuedPort is useful for ports that want to leave the packet
transmission of outgoing packets to the queue and is used by both
master and slave ports. The SimpleTimingPort inherits from the
QueuedPort and adds the implemention of recvTiming and recvFunctional
through recvAtomic.
The PioPort and MessagePort are cleaned up as part of the changes.
--HG--
rename : src/mem/tport.cc => src/mem/packet_queue.cc
rename : src/mem/tport.hh => src/mem/packet_queue.hh
|
|
This patch splits the two cache ports into a master (memory-side) and
slave (cpu-side) subclass of port with slightly different
functionality. For example, it is only the CPU-side port that blocks
incoming requests, and only the memory-side port that schedules send
events outside of what the transmit list dictates.
This patch simplifies the two classes by relying further on
SimpleTimingPort and also generalises the latter to better accommodate
the changes (introducing trySendTiming and scheduleSend). The
memory-side cache port overrides sendDeferredPacket to be able to not
only send responses from the transmit list, but also send requests
based on the MSHRs.
A follow on patch further simplifies the SimpleTimingPort and the
cache ports.
|
|
This patch removes the inheritance of EventManager from the ports and
moves all responsibility for event queues to the owner. Eventually the
event manager should be the interface block, which could either be the
structural owner or a subblock like a LSQ in the O3 CPU for example.
|
|
At the same time, rename the trace flags to debug flags since they
have broader usage than simply tracing. This means that
--trace-flags is now --debug-flags and --trace-help is now --debug-help
|
|
This step makes it easy to replace the accessor functions
(which still access a global variable) with ones that access
per-thread curTick values.
|
|
not scheduled, as well as the transmit list being empty.
|
|
|
|
Move the constructor into the .cc file and get rid of the typedef for
SendEvent.
|
|
|
|
memory range.
|
|
For now, there is still a single global event queue, but this is
necessary for making the steps towards a parallelized m5.
|
|
It runs out that if a MemObject turns around and does a send in its
receive callback, and there are other sends already scheduled, then
it could observe a state where it's not at the head of the list but
the bus's sendEvent is not scheduled (because we're still in the
middle of processing the prior sendEvent).
|
|
when the Packet is deleted, since the requester
can't possibly do it.
--HG--
extra : convert_revision : 8571b144ecb3c70efc06d09faa8b3161fb58352d
|
|
--HG--
extra : convert_revision : 31aad7170b35556a4c984f4ebc013137d55d85eb
|
|
Make sure not to keep processing functional accesses
after they've been responded to.
Also use checkFunctional() return value instead of checking
packet command field where possible, mostly just for consistency.
--HG--
extra : convert_revision : 29fc76bc18731bd93a4ed05a281297827028ef75
|
|
--HG--
extra : convert_revision : 76377ca2e4d7ea70d1d54d325a63ce710e260b93
|
|
Handled by Packet::checkFunctional() now.
--HG--
extra : convert_revision : 63642254e2789c80a369ac269f317ec054ffe3c0
|
|
now encoded in cmd field.
--HG--
extra : convert_revision : d67819b7e3ee4b9a5bf08541104de0a89485e90b
|
|
--HG--
extra : convert_revision : 703da6128832eb0d5cfed7724e5105f4b3fe4f90
|
|
sure we don't re-request bus prematurely. Use callback to
avoid calling sendRetry() recursively within recvTiming.
--HG--
extra : convert_revision : a907a2781b4b00aa8eb1ea7147afc81d6b424140
|
|
timing mode still broken.
configs/example/memtest.py:
Revamp options.
src/cpu/memtest/memtest.cc:
No need for memory initialization.
No need to make atomic response... memory system should do that now.
src/cpu/memtest/memtest.hh:
MemTest really doesn't want to snoop.
src/mem/bridge.cc:
checkFunctional() cleanup.
src/mem/bus.cc:
src/mem/bus.hh:
src/mem/cache/base_cache.cc:
src/mem/cache/base_cache.hh:
src/mem/cache/cache.cc:
src/mem/cache/cache.hh:
src/mem/cache/cache_blk.hh:
src/mem/cache/cache_builder.cc:
src/mem/cache/cache_impl.hh:
src/mem/cache/coherence/coherence_protocol.cc:
src/mem/cache/coherence/coherence_protocol.hh:
src/mem/cache/coherence/simple_coherence.hh:
src/mem/cache/miss/SConscript:
src/mem/cache/miss/mshr.cc:
src/mem/cache/miss/mshr.hh:
src/mem/cache/miss/mshr_queue.cc:
src/mem/cache/miss/mshr_queue.hh:
src/mem/cache/prefetch/base_prefetcher.cc:
src/mem/cache/tags/fa_lru.cc:
src/mem/cache/tags/fa_lru.hh:
src/mem/cache/tags/iic.cc:
src/mem/cache/tags/iic.hh:
src/mem/cache/tags/lru.cc:
src/mem/cache/tags/lru.hh:
src/mem/cache/tags/split.cc:
src/mem/cache/tags/split.hh:
src/mem/cache/tags/split_lifo.cc:
src/mem/cache/tags/split_lifo.hh:
src/mem/cache/tags/split_lru.cc:
src/mem/cache/tags/split_lru.hh:
src/mem/packet.cc:
src/mem/packet.hh:
src/mem/physical.cc:
src/mem/physical.hh:
src/mem/tport.cc:
More major reorg. Seems to work for atomic mode now,
timing mode still broken.
--HG--
extra : convert_revision : 7e70dfc4a752393b911880ff028271433855ae87
|
|
Oops... forgot to update call site after changing
function argument semantics.
src/mem/tport.cc:
Oops... forgot to update call site after changing
function argument semantics.
--HG--
extra : convert_revision : 9234b991dc678f062d268ace73c71b3d13dd17dc
|
|
Make it a better base class for cache ports.
--HG--
extra : convert_revision : 37d6de11545a68c1a7d11ce33fe5971c51434ee4
|
|
- factor out checkFunctional() code so it can be
called from derived classes
- use EventWrapper for sendEvent, move event handling
code from event to port where it belongs
- make sendEvent a pointer so derived classes can
override it
- replace std::pair with new class for readability
--HG--
extra : convert_revision : 5709de2daacfb751a440144ecaab5f9fc02e6b7a
|
|
add seperate response buffers and request queue sizes in bus bridge
add delay to respond to a nack in the bus bridge
src/dev/i8254xGBe.cc:
src/dev/ide_ctrl.cc:
src/dev/ns_gige.cc:
src/dev/pcidev.hh:
src/dev/sinic.cc:
add backoff delay parameters
src/dev/io_device.cc:
src/dev/io_device.hh:
add a backoff algorithm when nacks are received.
src/mem/bridge.cc:
src/mem/bridge.hh:
add seperate response buffers and request queue sizes
add a new parameters to specify how long before a nack in ready to go after a packet that needs to be nacked is received
src/mem/cache/cache_impl.hh:
assert on the
src/mem/tport.cc:
add a friendly assert to make sure the packet was inserted into the list
--HG--
extra : convert_revision : 3595ad932015a4ce2bb72772da7850ad91bd09b1
|
|
Created MemCmd class to wrap enum and provide handy methods to
check attributes, convert to string/int, etc.
--HG--
extra : convert_revision : 57f147ad893443e3a2040c6d5b4cdb1a8033930b
|
|
If you get inserted in the front, reschedule the event
--HG--
extra : convert_revision : eccbacf5ec85600e5b68eb554fee2c0e2b65e965
|
|
write intersections.
src/mem/packet.cc:
Make sure to copy the whole data (we were one byte short)
src/mem/tport.cc:
Fix for the proper semantics of fixPacket
--HG--
extra : convert_revision : 215e05db9099d427afd4994f5b29079354c847d8
|
|
--HG--
extra : convert_revision : 5422025f74ba7013f98d1d1dcbd1070f580aae61
|
|
scan all packets on a functional access.
--HG--
extra : convert_revision : c735a6408443b5cc90d1c1841c7aeb61e02ec6ae
|
|
--HG--
extra : convert_revision : 5ddb6ae5d5412f062c07c16a27b79483430b5f22
|
|
into zazzer.eecs.umich.edu:/z/rdreslin/m5bk/newmemcleanest
src/mem/tport.cc:
Merge PacketPtr changes
--HG--
extra : convert_revision : 0329c5803a3df67af3dda89bd9d4753fd1a286d1
|
|
Fix fixPacket assert function.
Stop timing port from forwarding the request if a response was found in its queue on a read.
src/cpu/memtest/memtest.cc:
src/cpu/memtest/memtest.hh:
src/python/m5/objects/MemTest.py:
Add parameter to configure what percentage of mem accesses are functional
src/mem/cache/base_cache.cc:
src/mem/cache/cache_impl.hh:
Use fix Packet function
src/mem/packet.cc:
Fix an assert that was checking the wrong thing
src/mem/tport.cc:
Properly detect if we need to do the access to the functional device
--HG--
extra : convert_revision : 447cc1a9a65ddd2a41e937fb09dc0e7c74e9c75e
|
|
--HG--
extra : convert_revision : d9eb83ab77ffd2d725961f295b1733137e187711
|
|
after sending out a request.
Still need to rework upgrades into this system, but works for now.
src/mem/cache/base_cache.cc:
Re order code to be more readable
src/mem/cache/base_cache.hh:
Be sure to delete the copy on a bus block
src/mem/cache/cache_impl.hh:
Be sure to remove the copy on a writeback success
src/mem/cache/miss/mshr_queue.cc:
Demorgans to make it easier to understand
src/mem/tport.cc:
Delete writebacks
--HG--
extra : convert_revision : 9519fb37b46ead781d340de29bb342a322a6a92e
|
|
fixPacket() should be used anywhere a functional packet and timing packet are found to have the same address.
--HG--
extra : convert_revision : 783ec438271b24ddb0ae742b4efd1ed7d6be93f3
|
|
The response queue is not tying up an MSHR, should we change that or assume infinite storage for responses?
src/mem/cache/base_cache.cc:
src/mem/tport.cc:
Add in functional check of retry queued packets.
--HG--
extra : convert_revision : 0cb40b3a96d37a5e9eec95312d660ec6a9ce526a
|
|
src/mem/bus.cc:
Add debugging statement
src/mem/bus.hh:
Fix implementation of bus for subsequent recvTimings while handling a retry request.
src/mem/tport.cc:
Rework timing port to retry properly
--HG--
extra : convert_revision : fbfb5e8b4a625e49c6cd764da1df46a4f336b1b2
|
|
src/mem/tport.cc:
minor formatting tweak
--HG--
extra : convert_revision : 7391d142815c5876fcc0f991bd053e6a1781c101
|
|
packet was waiting for the bus.
--HG--
extra : convert_revision : 29f7a4f676884330d7b7e93517dea85fc7bbf678
|
|
Fix an issue with memory handling writebacks.
src/mem/cache/base_cache.hh:
src/mem/tport.cc:
Only respond if the pkt needs a response.
src/mem/physical.cc:
Make physical memory respond to writebacks, set satisfied for invalidates/upgrades.
--HG--
extra : convert_revision : 7601987a7923e54a6d1a168def4f8133d8de19fd
|
|
allowing derived classes to be simplified.
--HG--
extra : convert_revision : c980d3aec5e6c044d8f41e96252726fe9a256605
|
|
Make PioPort use it
Make Physical memory use it as well
src/SConscript:
Add timing port to sconscript
src/dev/io_device.cc:
src/dev/io_device.hh:
Move simple timing pio port stuff into a simple timing port class so it can be used by the physical memory
src/mem/physical.cc:
src/mem/physical.hh:
use a simple timing port stuff instead of rolling our own here
--HG--
extra : convert_revision : e5befbd295a572568cfdca533efb5ed1984c59d1
|