Age | Commit message (Collapse) | Author |
|
Since miscellaneous registers bypass wakeup logic, force serialization
to resolve data dependencies through them
* * *
ARM: adding non-speculative/serialize flags for instructions change CPSR
|
|
For loads that PC is the destination, check if the load
was mispredicted again when the value being loaded returns from memory
|
|
THis allows the CPU to handle predicated-false instructions accordingly.
This particular patch makes loads that are predicated-false to be sent
straight to the commit stage directly, not waiting for return of the data
that was never requested since it was predicated-false.
|
|
|
|
ARMv7
|
|
PC is an operand, so we can't have a temp called PC
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
--HG--
rename : src/dev/arm/Versatile.py => src/dev/arm/RealView.py
rename : src/dev/arm/versatile.cc => src/dev/arm/realview.cc
rename : src/dev/arm/versatile.hh => src/dev/arm/realview.hh
|
|
|
|
|
|
from the symbol table.
|
|
a constant.
This allows one two different OS requirements for the same ISA to be handled.
Some OSes are compiled for a virtual address and need to be loaded into physical
memory that starts at address 0, while other bare metal tools generate
images that start at address 0.
|
|
|
|
|
|
This was being done in read(), but if readBytes was called directly it
wouldn't happen. Also, instead of setting the memory blob being read to -1
which would (I believe) require using memset with -1 as a parameter, this now
uses bzero. It's hoped that it's more specialized behavior will make it
slightly faster.
|
|
|
|
previously was only generating 0s.
|
|
bkt size isn't evenly divisible by max-min and it would round down,
it's possible to sample a distribution and have no place to put the sample.
When this case occured the simulator would assert.
|
|
|
|
|
|
This function is always overridden, and doesn't actually have the right
signature.
|
|
Regression tester updates required by the following patches:
brad/moved_python_protocol_files: config: moved python protocol config files
brad/ruby_options_movement: config: reorganized how ruby specifies command-line options
brad/config_token_bcast: ruby: added token broadcast config params to cmd options
brad/topology_name: config: Added the topology description to m5 config.ini
brad/ruby_system_names: config: Improve ruby simobject names
brad/consolidated_protocol_stats: slicc: Consolidated the protocol stats printing
brad/ruby_request_type_ostream_fix: ruby: Added ruby_request_type ostream def to libruby.hh
brad/memtest_dma_extension: memtest: Memtester support for DMA
brad/token_dma_lockdown_fix: MOESI_CMP_token: Fixed dma persistent lockdown bugs
brad/profile_generic_mach_type: ruby: Reincarnated the responding machine profiling
brad/network_msg_consolidated_stats: ruby: Added consolidated network msg stats
brad/bcast_msg_profiling: ruby: Added bcast msg profiling to hammer and token
brad/l2cache_profiling_fix: ruby: Fixed L2 cache miss profiling
brad/llsc_ruby_m5_fix: ruby: fix ruby llsc support to sync sc outcomes
brad/ruby_latency_fixes: ruby: Reduced ruby latencies
brad/hammer_l2_cache_latency: ruby: Updated MOESI_hammer L2 latency behavior
brad/deterministic_resurrection: ruby: Resurrected Ruby's deterministic tests
brad/token_dma_fixes: ruby: MOESI_CMP_token dma fixes
brad/ruby_cmd_options: config: added cmd options to control ruby debug
brad/token_owner_fixes: ruby: fixed token bugs associated with owner token counts
brad/ruby_remove_try_except: ruby: Improved try except blocks in ruby creation
brad/ruby_port_callback_fix: ruby: Fixed RubyPort sendTiming callbacks
brad/interrupt_drain_fix: devices: Fixed periodic interrupts to work with draining
brad/llsc_trace_profile: ruby: Added SC fail indication to trace profiling
brad/no_migrate_atomic: ruby: Disable migratory sharing for token and hammer
brad/ruby_start_time_fix: ruby: Reset ruby stats in RubySystem unserialize
brad/numa_bit_select_fix: ruby: fixed DirectoryMemory's numa_high_bit configuration
brad/hammer_probe_filter: ruby: added probe filter support to hammer
brad/miss_latency_detail_profile: MOESI_hammer: break down miss latency stalled cycles
brad/recycle_latency_fix: ruby: Recycle latency fix for hammer
brad/stall_and_wait: ruby: Stall and wait input messages instead of recycling
brad/rubytest_request_flag_fix: ruby: Fixed minor bug in ruby test for setting the request type
brad/hammer_merge_gets: ruby: Added merge GETS optimization to hammer
brad/regress_updates: regress: Regression tester updates
|
|
Added an optimization that merges multiple pending GETS requests into a
single request to the owner node.
|
|
|
|
This patch allows messages to be stalled in their input buffers and wait
until a corresponding address changes state. In order to make this work,
all in_ports must be ranked in order of dependence and those in_ports that
may unblock an address, must wake up the stalled messages. Alot of this
complexity is handled in slicc and the specification files simply
annotate the in_ports.
--HG--
rename : src/mem/slicc/ast/CheckAllocateStatementAST.py => src/mem/slicc/ast/StallAndWaitStatementAST.py
rename : src/mem/slicc/ast/CheckAllocateStatementAST.py => src/mem/slicc/ast/WakeUpDependentsStatementAST.py
|
|
Patch allows each individual message buffer to have different recycle latencies
and allows the overall recycle latency to be specified at the cmd line. The
patch also adds profiling info to make sure no one processor's requests are
recycled too much.
|
|
This patch tracks the number of cycles a transaction is delayed at different
points of the request-forward-response loop.
|
|
|
|
This fix includes the off-by-one bit selection bug for numa mapping.
|
|
The main purpose for clearing stats in the unserialize process is so
that the profiler can correctly set its start time to the unserialized
value of curTick.
|
|
This patch allows one to disable migratory sharing for those cache blocks that
are accessed by atomic requests. While the implementations are different
between the token and hammer protocols, the motivation is the same. For
Alpha, LLSC semantics expect that normal loads do not unlock cache blocks that
have been locked by LL accesses. Therefore, locked blocks should not transfer
write permissions when responding to these load requests. Instead, only they
only transfer read permissions so that the subsequent SC access can possibly
succeed.
|
|
|
|
Added drain functions to the RTC and 8254 timer so that periodic interrupts
stop when the system is draining. This patch is needed to checkpoint in
timing mode. Otherwise under certain situations, the event queue will never
be completely empty.
|
|
Fixed RubyPort schedSendTiming calls to match ruby frequency.
|
|
Replaced the sys.exit in the try-except blocks with raise so that the python
call stack will be printed
|
|
This patch fixes several bugs related to previous inconsistent assumptions on
how many tokens the Owner had. Mike Marty should have fixes these bugs years
ago. :)
|
|
|
|
This patch fixes various protocol bugs regarding races between dma requests
and persistent requests.
|
|
Added the request series and invalidate deterministic tests as new cpu models
and removed the no longer needed ruby tests
--HG--
rename : configs/example/rubytest.py => configs/example/determ_test.py
rename : src/mem/ruby/tester/DetermGETXGenerator.cc => src/cpu/directedtest/DirectedGenerator.cc
rename : src/mem/ruby/tester/DetermGETXGenerator.hh => src/cpu/directedtest/DirectedGenerator.hh
rename : src/mem/ruby/tester/DetermGETXGenerator.cc => src/cpu/directedtest/InvalidateGenerator.cc
rename : src/mem/ruby/tester/DetermGETXGenerator.hh => src/cpu/directedtest/InvalidateGenerator.hh
rename : src/cpu/rubytest/RubyTester.cc => src/cpu/directedtest/RubyDirectedTester.cc
rename : src/cpu/rubytest/RubyTester.hh => src/cpu/directedtest/RubyDirectedTester.hh
rename : src/mem/ruby/tester/DetermGETXGenerator.cc => src/cpu/directedtest/SeriesRequestGenerator.cc
rename : src/mem/ruby/tester/DetermGETXGenerator.hh => src/cpu/directedtest/SeriesRequestGenerator.hh
|
|
Previously, the MOESI_hammer protocol calculated the same latency for L1 and
L2 hits. This was because the protocol was written using the old ruby
assumption that L1 hits used the sequencer fast path. Since ruby no longer
uses the fast-path, the protocol delays L2 hits by placing them on the
trigger queue.
|
|
The previous slower ruby latencies created a mismatch between the faster M5
cpu models and the much slower ruby memory system. Specifically smp
interrupts were much slower and infrequent, as well as cpus moving in and out
of spin locks. The result was many cpus were idle for large periods of time.
These changes fix the latency mismatch.
|
|
Added support so that ruby can determine the outcome of store conditional
operations and reflect that outcome to M5 physical memory and cpus.
|