Packet Reception

Receiving data from the network is trickier than transmitting it because an sk_buff must be allocated and handed off to the upper layers from within an interrupt handler. The usual way to receive a packet is through an interrupt, unless the interface is a purely software one like snull or the loopback interface. Although it is possible to write polling drivers, and a few exist in the official kernel, interrupt-driven operation is much better, both in terms of data throughput and computational demands. Because most network interfaces are interrupt driven, we won’t talk about the polling implementation, which just exploits kernel timers.

The implementation of snull separates the “hardware” details from the device-independent housekeeping. The function snull_rx is thus called after the hardware has received the packet and it is already in the computer’s memory. snull_rx receives a pointer to the data and the length of the packet; its sole responsibility is to send the packet and some additional information to the upper layers of networking code. This code is independent of the way the data pointer and length are obtained.

void snull_rx(struct net_device *dev, int len, unsigned char *buf)
{
    struct sk_buff *skb;
    struct snull_priv *priv = (struct snull_priv *) dev->priv;

    /*
     * The packet has been retrieved from the transmission
     * medium. Build an skb around it, so upper layers can handle it
     */
    skb = dev_alloc_skb(len+2);
    if (!skb) {
        printk("snull rx: low on mem - packet dropped\n");
        priv->stats.rx_dropped++;
        return;
    }
    memcpy(skb_put(skb, len), buf, len);

    /* Write metadata, and then pass to the receive level */
    skb->dev = dev;
    skb->protocol = eth_type_trans(skb, dev);
    skb->ip_summed = CHECKSUM_UNNECESSARY; /* don't check it */
    priv->stats.rx_packets++;
    priv->stats.rx_bytes += len;
    netif_rx(skb);
    return;
}

The function is sufficiently general to act as a template for any network driver, but some explanation is necessary before you can reuse this code fragment with confidence.

The first step is to allocate a buffer to hold the packet. Note that the buffer allocation function (dev_alloc_skb) needs to know the data length. The information is used by the function to allocate space for the buffer. dev_alloc_skb calls kmalloc with atomic priority; it can thus be used safely at interrupt time. The kernel offers other interfaces to socket-buffer allocation, but they are not worth introducing here; socket buffers are explained in detail in Section 14.9, later in this chapter.

Once there is a valid skb pointer, the packet data is copied into the buffer by calling memcpy; the skb_put function updates the end-of-data pointer in the buffer and returns a pointer to the newly created space.

If you are writing a high-performance driver for an interface that can do full bus-mastering I/O, there is a possible optimization that is worth considering here. Some drivers allocate socket buffers for incoming packets prior to their reception, then instruct the interface to place the packet data directly into the socket buffer’s space. The networking layer cooperates with this strategy by allocating all socket buffers in DMA-capable space. Doing things this way avoids the need for a separate copy operation to fill the socket buffer, but requires being careful with buffer sizes because you won’t know in advance how big the incoming packet is. The implementation of a change_mtu method is also important in this situation, since it allows the driver to respond to a change in the maximum packet size.

The network layer needs to have some information spelled out before it will be able to make sense of the packet. To this end, the dev and protocol fields must be assigned before the buffer is passed upstairs. Then we need to specify how checksumming is to be performed or has been performed on the packet (snull does not need to perform any checksums). The possible policies for skb->ip_summed are as follows:

CHECKSUM_HW

The device has already performed checksums in hardware. An example of a hardware checksum is the SPARC HME interface.

CHECKSUM_NONE

Checksums are still to be verified, and the task must be accomplished by system software. This is the default in newly allocated buffers.

CHECKSUM_UNNECESSARY

Don’t do any checksums. This is the policy in snull and in the loopback interface.

Finally, the driver updates its statistics counter to record that a packet has been received. The statistics structure is made up of several fields; the most important are rx_packets, rx_bytes, tx_packets, and tx_bytes, which contain the number of packets received and transmitted and the total number of octets transferred. All the fields are thoroughly described in Section 14.12 later in this chpater.

The last step in packet reception is performed by netif_rx, which hands off the socket buffer to the upper layers.

Get Linux Device Drivers, Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.