Diffstat (limited to 'OvmfPkg/VirtioNetDxe/TechNotes.txt')
-rw-r--r-- OvmfPkg/VirtioNetDxe/TechNotes.txt 355
1 files changed, 0 insertions, 355 deletions
diff --git a/OvmfPkg/VirtioNetDxe/TechNotes.txt b/OvmfPkg/VirtioNetDxe/TechNotes.txt
deleted file mode 100644
index 9c1dfe6a77..0000000000
--- a/OvmfPkg/VirtioNetDxe/TechNotes.txt
+++ /dev/null
@@ -1,355 +0,0 @@
-## @file
-#
-# Technical notes for the virtio-net driver.
-#
-# Copyright (C) 2013, Red Hat, Inc.
-#
-# This program and the accompanying materials are licensed and made available
-# under the terms and conditions of the BSD License which accompanies this
-# distribution. The full text of the license may be found at
-# http://opensource.org/licenses/bsd-license.php
-#
-# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS, WITHOUT
-# WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
-#
-##
-
-Disclaimer
-----------
-
-All statements concerning standards and specifications are informative and not
-normative. They are made in good faith. Corrections are most welcome on the
-edk2-devel mailing list.
-
-The following documents have been perused while writing the driver and this
-document:
-- Unified Extensible Firmware Interface Specification, Version 2.3.1, Errata C;
- June 27, 2012
-- Driver Writer's Guide for UEFI 2.3.1, 03/08/2012, Version 1.01;
-- Virtio PCI Card Specification, v0.9.5 DRAFT, 2012 May 7.
-
-
-Summary
--------
-
-The VirtioNetDxe UEFI_DRIVER implements the Simple Network Protocol for
-virtio-net devices. Higher level protocols are automatically installed on top
-of it by the DXE Core / the ConnectController() boot service. For virtio-net
-devices this enables, for example, DHCP configuration, TCP transfers with edk2
-StdLib applications, and PXE booting in OVMF.
-
-
-UEFI driver structure
----------------------
-
-A driver instance, belonging to a given virtio-net device, can be in one of
-four states at any time. The states stack up as shown below. Each state
-transition is labeled with the primary function that implements it (and with
-that function's important callees, indented to reflect the call hierarchy).
-
- | ^
- | |
- [DriverBinding.c] | | [DriverBinding.c]
- VirtioNetDriverBindingStart | | VirtioNetDriverBindingStop
- VirtioNetSnpPopulate | | VirtioNetSnpEvacuate
- VirtioNetGetFeatures | |
- v |
- +-------------------------+
- | EfiSimpleNetworkStopped |
- +-------------------------+
- | ^
- [SnpStart.c] | | [SnpStop.c]
- VirtioNetStart | | VirtioNetStop
- | |
- v |
- +-------------------------+
- | EfiSimpleNetworkStarted |
- +-------------------------+
- | ^
- [SnpInitialize.c] | | [SnpShutdown.c]
- VirtioNetInitialize | | VirtioNetShutdown
- VirtioNetInitRing {Rx, Tx} | | VirtioNetShutdownRx [SnpSharedHelpers.c]
- VirtioRingInit | | VirtioNetShutdownTx [SnpSharedHelpers.c]
- VirtioNetInitTx | | VirtioRingUninit {Tx, Rx}
- VirtioNetInitRx | |
- v |
- +-----------------------------+
- | EfiSimpleNetworkInitialized |
- +-----------------------------+
-
-The state at the top means "nonexistent" and is hence unnamed on the diagram --
-a driver instance actually doesn't exist at that point. The transition
-functions out of and into that state implement the Driver Binding Protocol.
-
-The lower three states characterize an existent driver instance and are all
-states defined by the Simple Network Protocol. The transition functions between
-them are member functions of the Simple Network Protocol.
-
-Each transition function validates its expected source state and its
-parameters. For example, VirtioNetDriverBindingStop will refuse to disconnect
-from the controller unless the driver instance is in EfiSimpleNetworkStopped.
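The source-state validation described above can be modeled as a small transition table. The following is only a sketch in plain C: SKETCH_STATE, SKETCH_OP, mEdges and SketchApply are hypothetical names introduced for illustration, they are not edk2 definitions, and the real transition functions also validate their parameters and allocate or release resources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the four states; not the edk2 definitions. */
typedef enum {
  StateNonexistent,  /* no driver instance is bound to the device */
  StateStopped,      /* corresponds to EfiSimpleNetworkStopped */
  StateStarted,      /* corresponds to EfiSimpleNetworkStarted */
  StateInitialized   /* corresponds to EfiSimpleNetworkInitialized */
} SKETCH_STATE;

/* One entry per transition function shown on the diagram. */
typedef enum {
  OpBindingStart, OpBindingStop,
  OpStart, OpStop,
  OpInitialize, OpShutdown,
  OpCount
} SKETCH_OP;

typedef struct {
  SKETCH_STATE From;  /* the only source state the transition accepts */
  SKETCH_STATE To;    /* the state the instance is left in on success */
} SKETCH_EDGE;

static const SKETCH_EDGE mEdges[OpCount] = {
  [OpBindingStart] = { StateNonexistent, StateStopped     },
  [OpBindingStop]  = { StateStopped,     StateNonexistent },
  [OpStart]        = { StateStopped,     StateStarted     },
  [OpStop]         = { StateStarted,     StateStopped     },
  [OpInitialize]   = { StateStarted,     StateInitialized },
  [OpShutdown]     = { StateInitialized, StateStarted     },
};

/* Apply a transition only if the instance is in the expected source state;
   otherwise leave the state untouched and report failure. */
bool SketchApply(SKETCH_STATE *State, SKETCH_OP Op)
{
  assert(State != NULL && Op < OpCount);
  if (*State != mEdges[Op].From) {
    return false;
  }
  *State = mEdges[Op].To;
  return true;
}
```

In this model, SketchApply (&State, OpBindingStop) fails while State is StateInitialized, which mirrors the refusal of VirtioNetDriverBindingStop mentioned above.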
-
-
-Driver instance states (Simple Network Protocol)
-------------------------------------------------
-
-In the EfiSimpleNetworkStopped state, the virtio-net device has been reset. No
-resources are allocated for networking / traffic purposes. The MAC address and
-other device attributes have been retrieved from the device (this is necessary
-for completing the VirtioNetDriverBindingStart transition).
-
-The EfiSimpleNetworkStarted state is identical to the EfiSimpleNetworkStopped
-state for virtio-net, both functionally and in terms of resource usage. This
-state is mandated / provided by the Simple Network Protocol for flexibility
-that the virtio-net driver doesn't exploit.
-
-In particular, the EfiSimpleNetworkStarted state is the target of the Shutdown
-SNP member function, and must therefore correspond to a hardware configuration
-where "[it] is safe for another driver to initialize". (Clearly another UEFI
-driver could not do that due to the exclusivity of the driver binding that
-VirtioNetDriverBindingStart() installs, but a later OS driver might qualify.)
-
-The EfiSimpleNetworkInitialized state is the live state of the virtio NIC / the
-driver instance. Virtio and other resources required for network traffic have
-been allocated, and the following SNP member functions are available (in
-addition to VirtioNetShutdown which leaves the state):
-
-- VirtioNetReceive [SnpReceive.c]: poll the virtio NIC for an Rx packet that
- may have arrived asynchronously;
-
-- VirtioNetTransmit [SnpTransmit.c]: queue a Tx packet for asynchronous
- transmission (meant to be used together with VirtioNetGetStatus);
-
-- VirtioNetGetStatus [SnpGetStatus.c]: query link status and status of pending
- Tx packets;
-
-- VirtioNetMcastIpToMac [SnpMcastIpToMac.c]: transform a multicast IPv4/IPv6
- address into a multicast MAC address;
-
-- VirtioNetReceiveFilters [SnpReceiveFilters.c]: emulate unicast / multicast /
- broadcast filter configuration (not their actual effect -- a more liberal
- filter setting than requested is allowed by the UEFI specification).
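For the address transformation mentioned under VirtioNetMcastIpToMac, the conventional Ethernet mappings are: for IPv4, the prefix 01:00:5E followed by the low 23 bits of the group address; for IPv6, the prefix 33:33 followed by the last four bytes of the address. The sketch below shows those mappings in plain C. The function names are hypothetical, and whether the driver's code is organized this way is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IPv4 multicast: 01:00:5E, then the low 23 bits of the group address.
   Ip holds the address in network byte order (Ip[0] is the first octet). */
void SketchMcastIpv4ToMac(const uint8_t Ip[4], uint8_t Mac[6])
{
  assert(Ip != NULL && Mac != NULL);
  Mac[0] = 0x01;
  Mac[1] = 0x00;
  Mac[2] = 0x5E;
  Mac[3] = Ip[1] & 0x7F;  /* the top bit of this octet is always cleared */
  Mac[4] = Ip[2];
  Mac[5] = Ip[3];
}

/* IPv6 multicast: 33:33, then the last four bytes of the 16-byte address. */
void SketchMcastIpv6ToMac(const uint8_t Ip[16], uint8_t Mac[6])
{
  assert(Ip != NULL && Mac != NULL);
  Mac[0] = 0x33;
  Mac[1] = 0x33;
  Mac[2] = Ip[12];
  Mac[3] = Ip[13];
  Mac[4] = Ip[14];
  Mac[5] = Ip[15];
}
```

For instance, the IPv4 group 239.255.255.250 maps to 01:00:5E:7F:FF:FA, and the IPv6 group ff02::1 maps to 33:33:00:00:00:01.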
-
-The following SNP member functions are not supported [SnpUnsupported.c]:
-
-- VirtioNetReset: reinitialize the virtio NIC without shutting it down (a loop
- from/to EfiSimpleNetworkInitialized);
-
-- VirtioNetStationAddress: assign a new MAC address to the virtio NIC;
-
-- VirtioNetStatistics: collect statistics;
-
-- VirtioNetNvData: access non-volatile data on the virtio NIC.
-
-Missing support for these functions is allowed by the UEFI specification and
-doesn't seem to trip up higher level protocols.
-
-
-Events and task priority levels
--------------------------------
-
-The UEFI specification defines a sophisticated mechanism for asynchronous
-events / callbacks (see "6.1 Event, Timer, and Task Priority Services" for
-details). Such callbacks work like software interrupts, and some notion of
-locking / masking is important to implement critical sections (atomic or
-exclusive access to data or a device). This notion is defined as Task Priority
-Levels.
-
-The virtio-net driver for OVMF must concern itself with events for two reasons:
-
-- The Simple Network Protocol provides its clients with a (non-optional) WAIT
- type event called WaitForPacket: it allows them to check or wait for Rx
- packets by polling or blocking on this event. (This functionality overlaps
- with the Receive member function.) The event is available to clients starting
- with EfiSimpleNetworkStopped (inclusive).
-
- The virtio-net driver is informed about such client polling or blockage by
- receiving an asynchronous callback (a software interrupt). In the callback
- function the driver must interrogate the driver instance state, and if it is
- EfiSimpleNetworkInitialized, access the Rx queue and see if any packets are
- available for consumption. If so, it must signal the WaitForPacket WAIT type
- event, waking the client.
-
- For simplicity and safety, all parts of the virtio-net driver that access any
- bit of the driver instance (data or device) run at the TPL_CALLBACK level.
- This is the highest level allowed for an SNP implementation, and all code
- protected in this manner satisfies even stricter non-blocking requirements
- than what's documented for TPL_CALLBACK.
-
- The task priority level of the WaitForPacket callback is also set by the
- driver; the choice is TPL_CALLBACK again. In effect, this serializes the
- WaitForPacket callback (VirtioNetIsPacketAvailable [Events.c]) with "normal"
- parts of the driver.
-
-- According to the Driver Writer's Guide, a network driver should install a
- callback function for the global EXIT_BOOT_SERVICES event (a special NOTIFY
- type event). When the ExitBootServices() boot service has cleaned up internal
- firmware state and is about to pass control to the OS, EXIT_BOOT_SERVICES is
- emitted, and every network driver must abort its in-flight DMA transfers,
- lest they corrupt OS memory.
-
- This callback (VirtioNetExitBoot) is synchronized with the rest of the driver
- code just the same as explained for WaitForPacket. In
- EfiSimpleNetworkInitialized state it resets the virtio NIC, halting all data
- transfer. After the callback returns, no further driver code is expected to
- be scheduled.
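The decision logic of the two callbacks described above can be sketched as follows. This is a simplified model in plain C, not the driver's code: SKETCH_DEV and its fields are hypothetical, the boolean flags stand in for signaling the WAIT type event and for resetting the device, and no TPL handling appears because both callbacks are assumed to be invoked at TPL_CALLBACK already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SnpStopped, SnpStarted, SnpInitialized } SKETCH_SNP_STATE;

/* Hypothetical, heavily reduced driver instance. */
typedef struct {
  SKETCH_SNP_STATE State;
  uint16_t         RxUsedIdx;      /* free-running index advanced by the host */
  uint16_t         RxLastUsed;     /* next Used Ring slot the guest consumes */
  bool             PacketSignaled; /* stands in for signaling WaitForPacket */
  bool             DeviceReset;    /* stands in for resetting the virtio NIC */
} SKETCH_DEV;

/* WaitForPacket callback: only look at the Rx queue in the Initialized
   state, and wake the client if the host has answered any Rx request. */
void SketchIsPacketAvailable(SKETCH_DEV *Dev)
{
  assert(Dev != NULL);
  if (Dev->State != SnpInitialized) {
    return;
  }
  if (Dev->RxUsedIdx != Dev->RxLastUsed) {
    Dev->PacketSignaled = true;
  }
}

/* EXIT_BOOT_SERVICES callback: in the Initialized state, reset the device
   so that the host stops all data transfer. */
void SketchExitBoot(SKETCH_DEV *Dev)
{
  assert(Dev != NULL);
  if (Dev->State == SnpInitialized) {
    Dev->DeviceReset = true;
  }
}
```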
-
-
-Virtio internals -- Rx
-----------------------
-
-Requests (Rx and Tx alike) are always submitted by the guest and processed by
-the host. For Tx, processing means transmission. For Rx, processing means
-filling in the request with an incoming packet. Submitted requests exist on the
-"Available Ring", and answered (processed) requests show up on the "Used Ring".
-
-Packet data includes the media (Ethernet) header: destination MAC, source MAC,
-and Ethertype (14 bytes total).
-
-The following structures implement packet reception. Most of them are defined
-in the Virtio specification; the only driver-specific trait here is the static
-pre-configuration of the two-part descriptor chains in VirtioNetInitRx. The
-diagram is simplified.
-
-                 Available Index      Available Index
-                 last processed         incremented
-                   by the host          by the guest
-                           v   ------->    v
-Available  +-------+-------+-------+-------+-------+
-Ring       |DescIdx|DescIdx|DescIdx|DescIdx|DescIdx|
-           +-------+-------+-------+-------+-------+
-                              =D6     =D2
-
-       D2         D3          D4         D5          D6         D7
-Descr. +----------+----------++----------+----------++----------+----------+
-Table  |Adr:Len:Nx|Adr:Len:Nx||Adr:Len:Nx|Adr:Len:Nx||Adr:Len:Nx|Adr:Len:Nx|
-       +----------+----------++----------+----------++----------+----------+
-        =A2    =D3 =A3         =A4    =D5 =A5         =A6    =D7 =A7
-
-
-            A2        A3    A4        A5    A6        A7
-Receive     +---------------+---------------+---------------+
-Destination |vnet hdr:packet|vnet hdr:packet|vnet hdr:packet|
-Area        +---------------+---------------+---------------+
-
-              Used Index                    Used Index incremented
-      last processed by the guest                by the host
-                                v ------->  v
-Used    +-----------+-----------+-----------+-----------+-----------+
-Ring    |DescIdx:Len|DescIdx:Len|DescIdx:Len|DescIdx:Len|DescIdx:Len|
-        +-----------+-----------+-----------+-----------+-----------+
-                                   =D4
-
-In VirtioNetInitRx, the guest allocates the fixed size Receive Destination
-Area, which accommodates all packets delivered asynchronously by the host. To
-each packet, a slice of this area is dedicated; each slice is further
-subdivided into virtio-net request header and network packet data. The
-(guest-physical) addresses of these sub-slices are denoted with A2, A3, A4 and
-so on. Importantly, an even-subscript "A" always belongs to a virtio-net
-request header, while an odd-subscript "A" always belongs to a packet
-sub-slice.
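The even / odd subscript convention amounts to simple address arithmetic, sketched below in plain C. Two things here are assumptions rather than statements from this document: the 10-byte size of the virtio-net request header, and that slice N starts N slice sizes past the base of the area. The function and macro names are hypothetical; the 1514-byte packet sub-slice is the size used by the descriptor chains described in this section.

```c
#include <assert.h>
#include <stdint.h>

#define VNET_HDR_SIZE 10u   /* assumption: size of the virtio-net request header */
#define PKT_SIZE      1514u /* packet data sub-slice */
#define SLICE_SIZE    (VNET_HDR_SIZE + PKT_SIZE)

/* Return the guest-physical address A(DescIdx) inside a Receive Destination
   Area that starts at AreaBase and holds ChainCount slices. Even indices
   select the header sub-slice of packet DescIdx/2, odd indices select the
   packet data sub-slice of the same packet. */
uint64_t SketchRxSubSliceAddr(uint64_t AreaBase, uint16_t ChainCount,
                              uint16_t DescIdx)
{
  assert(DescIdx < 2u * ChainCount);
  uint64_t Slice = AreaBase + (uint64_t)(DescIdx / 2) * SLICE_SIZE;
  return (DescIdx % 2 == 0) ? Slice : Slice + VNET_HDR_SIZE;
}
```

With an area based at 4096, A2 evaluates to 4096 + 1524 = 5620 and A3 to 5630 under these assumptions.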
-
-Furthermore, the guest lays out a static pattern in the Descriptor Table. For
-each packet that can be in-flight or already arrived from the host,
-VirtioNetInitRx sets up a separate, two-part descriptor chain. For packet N,
-the Nth descriptor chain is set up as follows:
-
-- the first (=head) descriptor, with even index, points to the fixed-size
- sub-slice receiving the virtio-net request header,
-
-- the second descriptor (with odd index) points to the fixed (1514 byte) size
- sub-slice receiving the packet data,
-
-- a link from the first (head) descriptor in the chain is established to the
- second (tail) descriptor in the chain.
-
-Finally, the guest populates the Available Ring with the indices of the head
-descriptors. All descriptor indices on both the Available Ring and the Used
-Ring are even.
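The static pattern described above could be laid out roughly as in the following sketch. It is a plain C model and not the code of VirtioNetInitRx: the descriptor layout and the two flag values follow the Virtio specification, while SKETCH_DESC, SketchInitRxChains, the 10-byte header size and the flat array standing in for the Available Ring are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VRING_DESC_F_NEXT  1u  /* the chain continues at the Next index */
#define VRING_DESC_F_WRITE 2u  /* the host writes to the buffer */

#define VNET_HDR_SIZE 10u      /* assumption: virtio-net request header size */
#define PKT_SIZE      1514u    /* packet data sub-slice */

typedef struct {
  uint64_t Addr;   /* guest-physical buffer address */
  uint32_t Len;    /* buffer length in bytes */
  uint16_t Flags;
  uint16_t Next;   /* index of the next descriptor in the chain */
} SKETCH_DESC;

/* Set up ChainCount two-part chains and publish every head descriptor.
   Table needs 2 * ChainCount entries, Avail needs ChainCount entries. */
void SketchInitRxChains(SKETCH_DESC *Table, uint16_t *Avail,
                        uint16_t ChainCount, uint64_t AreaBase)
{
  assert(Table != NULL && Avail != NULL);
  for (uint16_t N = 0; N < ChainCount; N++) {
    uint16_t Head  = (uint16_t)(2 * N);      /* even index */
    uint16_t Tail  = (uint16_t)(Head + 1);   /* odd index */
    uint64_t Slice = AreaBase + (uint64_t)N * (VNET_HDR_SIZE + PKT_SIZE);

    Table[Head].Addr  = Slice;               /* header sub-slice */
    Table[Head].Len   = VNET_HDR_SIZE;
    Table[Head].Flags = VRING_DESC_F_WRITE | VRING_DESC_F_NEXT;
    Table[Head].Next  = Tail;                /* link head to tail */

    Table[Tail].Addr  = Slice + VNET_HDR_SIZE; /* packet data sub-slice */
    Table[Tail].Len   = PKT_SIZE;
    Table[Tail].Flags = VRING_DESC_F_WRITE;
    Table[Tail].Next  = 0;                   /* unused, end of chain */

    Avail[N] = Head;                         /* only head indices are published */
  }
}
```

Both descriptors carry the write flag because, for Rx, the host fills in the header as well as the packet data.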
-
-Packet reception occurs as follows:
-
-- The host consumes a descriptor index off the Available Ring. This index is
- even (=2*N), and identifies the head descriptor of the chain belonging to
- packet N.
-
-- The host reads the descriptor D(2*N) and, following the Next link there,
- D(2*N+1). It stores the virtio-net request header at A(2*N), and the packet
- data at A(2*N+1).
-
-- The host places the index of the head descriptor, 2*N, onto the Used Ring,
- and sets the Len field in the same Used Ring Element to the total number of
- bytes transferred for the entire descriptor chain. This enables the guest to
- identify the length of Rx packets.
-
-- VirtioNetReceive polls the Used Ring. If a new Used Ring Element shows up, it
- copies the data out to the caller, and recycles the index of the head
- descriptor (ie. 2*N) to the Available Ring.
-
-- Because the host can theoretically process (answer) Rx requests in any
- order, the order of head descriptor indices on each of the Available Ring and
- the Used Ring is virtually random. (The exception is right after the initial
- population in VirtioNetInitRx, when the Available Ring is full, with indices
- in increasing order, and the Used Ring is empty.)
-
-- If the Available Ring is empty, the host is forced to drop packets. If the
- Used Ring is empty, VirtioNetReceive returns EFI_NOT_READY (no packet
- available).
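The guest side of the reception steps above might look like the following sketch: poll the Used Ring, derive the packet length from the Len field, copy the data out, and recycle the head index. It is a plain C model under stated assumptions: a 10-byte virtio-net request header, one slice per ring slot, and hypothetical names throughout (SKETCH_RX_QUEUE, SketchReceive, the status values). The handling of a too-small caller buffer and of implausible Used Ring Elements is a choice of this sketch, not a description of VirtioNetReceive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VNET_HDR_SIZE 10u   /* assumption: virtio-net request header size */
#define PKT_SIZE      1514u
#define SLICE_SIZE    (VNET_HDR_SIZE + PKT_SIZE)
#define RING_SLOTS    4u    /* one descriptor chain per slot in this model */

typedef struct { uint32_t Id; uint32_t Len; } SKETCH_USED_ELEM;

typedef struct {
  uint16_t         Avail[RING_SLOTS];
  uint16_t         AvailIdx;  /* incremented by the guest */
  SKETCH_USED_ELEM Used[RING_SLOTS];
  uint16_t         UsedIdx;   /* incremented by the host */
  uint16_t         LastUsed;  /* last processed by the guest */
  const uint8_t   *Area;      /* Receive Destination Area */
} SKETCH_RX_QUEUE;

typedef enum { RxSuccess, RxNotReady, RxBufferTooSmall, RxDeviceError } SKETCH_RX_STATUS;

SKETCH_RX_STATUS SketchReceive(SKETCH_RX_QUEUE *Q, uint8_t *Buf, size_t *BufSize)
{
  assert(Q != NULL && Buf != NULL && BufSize != NULL);
  if (Q->UsedIdx == Q->LastUsed) {
    return RxNotReady;                       /* Used Ring is empty */
  }
  const SKETCH_USED_ELEM *Elem = &Q->Used[Q->LastUsed % RING_SLOTS];
  if (Elem->Id % 2 != 0 || Elem->Id >= 2u * RING_SLOTS ||
      Elem->Len < VNET_HDR_SIZE || Elem->Len > SLICE_SIZE) {
    return RxDeviceError;                    /* not a plausible answer */
  }
  /* Len covers the whole chain, so the header must be subtracted. */
  size_t PktLen = Elem->Len - VNET_HDR_SIZE;
  if (PktLen > *BufSize) {
    *BufSize = PktLen;                       /* packet stays pending */
    return RxBufferTooSmall;
  }
  const uint8_t *Slice = Q->Area + (size_t)(Elem->Id / 2) * SLICE_SIZE;
  memcpy(Buf, Slice + VNET_HDR_SIZE, PktLen);
  *BufSize = PktLen;
  /* Recycle the head descriptor index to the Available Ring. */
  Q->Avail[Q->AvailIdx % RING_SLOTS] = (uint16_t)Elem->Id;
  Q->AvailIdx++;
  Q->LastUsed++;
  return RxSuccess;
}
```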
-
-
-Virtio internals -- Tx
-----------------------
-
-The transmission structure erected by VirtioNetInitTx is similar; it differs
-in the following:
-
-- There is no Receive Destination Area.
-
-- Each head descriptor, D(2*N), points to a read-only virtio-net request header
- that is shared by all of the head descriptors. This virtio-net request header
- is never modified by the host.
-
-- Each tail descriptor is re-pointed to the caller-supplied packet buffer
- whenever VirtioNetTransmit places the corresponding head descriptor on the
- Available Ring. The caller must keep the buffer unmodified until
- VirtioNetGetStatus reports it as transmitted.
-
-Steps of packet transmission:
-
-- Client code calls VirtioNetTransmit. VirtioNetTransmit tracks free descriptor
- chains by keeping the indices of their head descriptors in a stack that is
- private to the driver instance. All elements of the stack are even.
-
-- If the stack is empty (that is, each descriptor chain, in isolation, is
- either pending transmission, or has been processed by the host but not
- yet recycled by a VirtioNetGetStatus call), then VirtioNetTransmit returns
- EFI_NOT_READY.
-
-- Otherwise the index of a free chain's head descriptor is popped from the
- stack. The linked tail descriptor is re-pointed as discussed above. The head
- descriptor's index is pushed on the Available Ring.
-
-- The host moves the head descriptor index from the Available Ring to the Used
- Ring when it transmits the packet.
-
-- Client code calls VirtioNetGetStatus. In case the Used Ring is empty, the
- function reports no Tx completion. Otherwise, a head descriptor's index is
- consumed from the Used Ring and recycled to the private stack. The client
- code's original packet buffer address is fetched from the tail descriptor
- (where it has been stored at VirtioNetTransmit time) and returned to the
- caller.
-
-- The Len field of the Used Ring Element is not checked. The host is assumed to
- have transmitted the entire packet, since VirtioNetTransmit has limited its
- size to at most 1514 bytes. The Virtio specification suggests this packet
- size is always accepted (and a lower MTU could be encountered on any later
- hop as well). Additionally, there's no good way to report a short transmit
- via VirtioNetGetStatus; judging by the specification, EFI_DEVICE_ERROR seems
- too serious, and higher level protocols could interpret it as a fatal
- condition.
-
-- The host can theoretically reorder head descriptor indices when moving them
- from the Available Ring to the Used Ring (out of order transmission). Because
- of this (and the choice of a stack over a list for free descriptor chain
- tracking) the order of head descriptor indices on either Ring is
- unpredictable.
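The free-chain stack and the buffer address round trip described in the steps above can be modeled as in the sketch below. It is plain C with hypothetical names (SKETCH_TX_QUEUE, SketchTxInit, SketchTransmit, SketchGetStatus); the TailAddr array stands in for the Addr field of the tail descriptors, the Used Ring is reduced to head indices only since Len is not checked, and the host's part (moving indices from the Available Ring to the Used Ring) is left to the caller of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TX_CHAINS 4u     /* number of two-part descriptor chains in this model */
#define MAX_PKT   1514u  /* largest packet the sketch accepts */

typedef enum { TxSuccess, TxNotReady, TxBadSize } SKETCH_TX_STATUS;

typedef struct {
  uint16_t    FreeStack[TX_CHAINS];    /* head indices of free chains */
  uint16_t    FreeCount;
  const void *TailAddr[2 * TX_CHAINS]; /* stands in for the tail Addr fields */
  uint16_t    Avail[TX_CHAINS];
  uint16_t    AvailIdx;                /* incremented by the guest */
  uint16_t    Used[TX_CHAINS];         /* head indices answered by the host */
  uint16_t    UsedIdx;                 /* incremented by the host */
  uint16_t    LastUsed;                /* last processed by the guest */
} SKETCH_TX_QUEUE;

/* Initially every chain is free; all stack elements are even. */
void SketchTxInit(SKETCH_TX_QUEUE *Q)
{
  assert(Q != NULL);
  memset(Q, 0, sizeof *Q);
  for (uint16_t N = 0; N < TX_CHAINS; N++) {
    Q->FreeStack[Q->FreeCount++] = (uint16_t)(2 * N);
  }
}

SKETCH_TX_STATUS SketchTransmit(SKETCH_TX_QUEUE *Q, const void *Buf, size_t Len)
{
  assert(Q != NULL && Buf != NULL);
  if (Len > MAX_PKT) {
    return TxBadSize;
  }
  if (Q->FreeCount == 0) {
    return TxNotReady;                 /* every chain is pending or unrecycled */
  }
  uint16_t Head = Q->FreeStack[--Q->FreeCount];
  Q->TailAddr[Head + 1] = Buf;         /* re-point the tail descriptor */
  Q->Avail[Q->AvailIdx++ % TX_CHAINS] = Head;
  return TxSuccess;
}

/* Returns the buffer of one completed transmission, or NULL if there is none. */
const void *SketchGetStatus(SKETCH_TX_QUEUE *Q)
{
  assert(Q != NULL);
  if (Q->UsedIdx == Q->LastUsed) {
    return NULL;                       /* Used Ring is empty */
  }
  uint16_t Head = Q->Used[Q->LastUsed++ % TX_CHAINS];
  Q->FreeStack[Q->FreeCount++] = Head; /* recycle the chain */
  return Q->TailAddr[Head + 1];        /* address stored at transmit time */
}
```

Because the stack is last-in, first-out, the chain recycled most recently is the next one handed out, which is one reason the order of head indices on the rings does not follow any fixed pattern.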