* Networking

Networking is even more layered than file systems. Recall the 7 OSI layers; in Linux there are even more layers.

Below the system calls, networking goes through two APIs:

1. VFS
2. Socket API (logically at the same level as the VFS)

In Unix, sockets appear as "file descriptors" (struct file in the kernel), and you can operate on them with the same syscalls as you can with files:

- open("/dev/tcp") (on some STREAMS-based Unixes; Linux itself has no /dev/tcp device and creates sockets with socket(2))
- read(2) and write(2) on a socket fd
- close(2)
- select(2)

Other networking calls that don't have a file-system equivalent go through the Socket API, which looks very much like a VFS: objects with refcounts, locks, methods to operate on the object, etc. Example syscalls include: socket, bind, listen, accept, connect, etc.

* SK Buff services

SKBuff: socket buffer (struct sk_buff in Linux, usually abbreviated "skb"; BSD's counterpart is the "mbuf", and Solaris STREAMS uses the "mblk"), a data structure for holding networking data, like packets.

Most of the network layers can 'talk' to this skb "service". The skb services can hold all these objects with appropriate locks, allocate and free them, manage them, etc.

Alternative: each layer allocates its own data and then passes it to the layers above/below it.

Layered abstractions are convenient to design, but hard to implement efficiently. Layering makes it easier to implement changes at each layer, or to create new implementations at the same layer (e.g., a new file system just has to follow the VFS API; a new network service "just" has to follow the Socket API layer). Alas, too much layering means too much data copying and de/allocation between layers, slowing performance considerably.

The solution that Linux picked is a "hybrid": functions are layered, but data is shared through a strictly controlled API. You don't pass a random "void*" or any object you want, but specific objects that all layers can understand, and you use a specific API for managing those objects. The layers pass a reference to a common object instead of copying the actual object.
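The hybrid approach can be sketched in user-space C. This is a toy under stated assumptions, not kernel code: struct toy_skb, the toy_* helpers, the two "layer" functions, and the made-up "TCP!" and "IP!" headers are all hypothetical names for illustration, loosely modeled on the kernel's skb_reserve(), skb_put(), and skb_push(). Each layer receives only a pointer to the one shared buffer and prepends its header through the small API; the payload is never copied between layers. The spare room at the front of the buffer that makes this cheap is described in the struct section that follows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a shared network buffer; NOT the kernel's struct sk_buff. */
struct toy_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the valid bytes */
    unsigned char *tail;  /* one past the last valid byte */
    unsigned char *end;   /* one past the end of the allocated buffer */
};

struct toy_skb *toy_alloc(size_t size)
{
    struct toy_skb *skb = malloc(sizeof(*skb));
    if (skb == NULL)
        return NULL;
    skb->head = malloc(size);
    if (skb->head == NULL) {
        free(skb);
        return NULL;
    }
    skb->data = skb->tail = skb->head;  /* empty: no valid bytes yet */
    skb->end = skb->head + size;
    return skb;
}

void toy_free(struct toy_skb *skb)
{
    free(skb->head);
    free(skb);
}

/* Leave len bytes of headroom; only meaningful on an empty buffer. */
void toy_reserve(struct toy_skb *skb, size_t len)
{
    assert(skb->data == skb->tail);
    assert((size_t)(skb->end - skb->tail) >= len);
    skb->data += len;
    skb->tail += len;
}

/* Grow the valid region by len bytes at the tail; returns where to write. */
unsigned char *toy_put(struct toy_skb *skb, size_t len)
{
    unsigned char *old_tail = skb->tail;
    assert((size_t)(skb->end - skb->tail) >= len);
    skb->tail += len;
    return old_tail;
}

/* Grow the valid region by len bytes at the front, using the headroom. */
unsigned char *toy_push(struct toy_skb *skb, size_t len)
{
    assert((size_t)(skb->data - skb->head) >= len);
    skb->data -= len;
    return skb->data;
}

/* Hypothetical "transport" layer: prepends a made-up 4-byte header. */
void transport_send(struct toy_skb *skb)
{
    memcpy(toy_push(skb, 4), "TCP!", 4);
}

/* Hypothetical "network" layer: prepends a made-up 3-byte header. */
void network_send(struct toy_skb *skb)
{
    memcpy(toy_push(skb, 3), "IP!", 3);
}

/* Builds one packet through both layers; returns the number of valid bytes. */
size_t demo_send(void)
{
    struct toy_skb *skb = toy_alloc(64);
    size_t len;

    if (skb == NULL)
        return 0;
    toy_reserve(skb, 16);                 /* room for headers added later */
    memcpy(toy_put(skb, 5), "hello", 5);  /* payload written exactly once */
    transport_send(skb);                  /* same object, passed by pointer */
    network_send(skb);
    len = (size_t)(skb->tail - skb->data);
    assert(len == 12 && memcmp(skb->data, "IP!TCP!hello", 12) == 0);
    toy_free(skb);
    return len;
}
```

Calling demo_send() from a main() builds the 12-byte packet "IP!TCP!hello" in a single 64-byte allocation. The design point is that neither layer allocates or copies; each only moves a pointer and writes its own header.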
* struct sk_buff

An skb is a buffer of some contiguous number of bytes. It has a pointer to the "head" and "tail" and some room. The head+tail identify where the "valid" data inside that buffer reside. (In the actual struct sk_buff there are four pointers: head and end bound the whole buffer, while data and tail bound the valid bytes. These notes use "head" and "tail" loosely for the two ends of the valid data.)

A used skb often has its data in the middle of the buffer, with head+tail pointing to the start/end of the valid data; it has unused room both at the head and at the tail of the skb.

In networking, as payload data goes up and down the layers (e.g., HTTP, TCP/UDP, IP, Ethernet), data gets dis/assembled, with protocol headers/trailers added/removed. By leaving room both before and after the data payload, we can add/remove these headers/trailers very quickly, and adjust the head/tail skb pointers without having to allocate a new/bigger skb.

SKB services is effectively a custom memory allocator designed specifically for networking! It includes functions such as:

1. skb alloc (of any size) and skb free
2. duplicate an skb
3. split an skb in two (fragment packets on the way down)
4. concatenate skbs together (packet assembly on the way up)

* Network hardware

NIC: Network Interface Card

So far we've concerned ourselves with two layers:

1. User processes
2. Kernel

Now we're going to add another layer, the hardware:

1. User processes
2. Kernel
3. Hardware (e.g., NIC)

The reason is that every piece of hardware operates largely independently and has to communicate with the kernel (and main CPU) in some way.

What's inside a NIC:

- processor (a small CPU to run the NIC)
- software: yes, called "firmware", a small program to perform the network actions
- memory: yes, RAM (usually small amounts)
- inputs and outputs: yes, it can talk to the main CPU over the PCI bus, and to the network itself (e.g., Ethernet using an RJ45 cable, Wi-Fi)
- may even have some persistent flash/EEPROM to hold configurations and to be able to receive new firmware

A NIC is a "mini computer", with its own "embedded OS".
Thus, a modern OS doesn't really "manage all the hardware" (as per traditional definitions of an OS); it more accurately "coordinates with all the hardware".

* Receiving packets from the network

Each NIC has its own hardware address (Ethernet or MAC addr) that uniquely identifies THIS host interface.

A NIC first has to "listen" to packets on the wire, and NICs do so continuously. But the NIC takes action IFF the bits it sees on the wire are prefixed with this NIC's MAC address (broadcast and multicast frames aside). That's why the destination MAC addr shows up at the head of the packet.

Example: an Ethernet MAC addr is 48 bits.

If the NIC sees a packet whose MAC address does NOT match its own MAC addr, it just ignores the packet.

If the NIC sees a packet whose MAC addr matches, then it has to receive these bits and place them into its own memory! That means the NIC has to ensure it has enough room for the new packet.
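The receive decision above can be sketched in plain C. This is a simplified model under stated assumptions, not real firmware or driver code: struct toy_nic, nic_receive(), demo_receive(), the 16-byte memory size, and the example addresses are all hypothetical. It treats the first 6 bytes of a frame as the destination MAC, and it leaves out the broadcast, multicast, and promiscuous-mode cases that real NICs also handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6  /* an Ethernet MAC address is 48 bits */

/* Hypothetical NIC model: its own address plus a small on-board memory. */
struct toy_nic {
    uint8_t mac[MAC_LEN];
    uint8_t mem[16];
    size_t used;  /* bytes of mem already holding received frames */
};

/*
 * Returns 1 if the frame was stored in NIC memory,
 *         0 if it was ignored (not addressed to this NIC),
 *        -1 if it was for us but dropped because there was no room.
 */
int nic_receive(struct toy_nic *nic, const uint8_t *frame, size_t len)
{
    /* Assumption: the destination MAC is the first 6 bytes of the frame. */
    if (len < MAC_LEN || memcmp(frame, nic->mac, MAC_LEN) != 0)
        return 0;
    if (len > sizeof(nic->mem) - nic->used)
        return -1;
    memcpy(nic->mem + nic->used, frame, len);
    nic->used += len;
    return 1;
}

/* Feeds three frames to the toy NIC; returns the bytes it ended up storing. */
size_t demo_receive(void)
{
    struct toy_nic nic = { .mac = {0x02, 0, 0, 0, 0, 0x01}, .used = 0 };
    const uint8_t ours[10]   = {0x02, 0, 0, 0, 0, 0x01, 'd', 'a', 't', 'a'};
    const uint8_t theirs[10] = {0x02, 0, 0, 0, 0, 0x02, 'd', 'a', 't', 'a'};

    assert(nic_receive(&nic, theirs, sizeof(theirs)) == 0);  /* ignored */
    assert(nic_receive(&nic, ours, sizeof(ours)) == 1);      /* stored */
    assert(nic_receive(&nic, ours, sizeof(ours)) == -1);     /* no room left */
    return nic.used;
}
```

The third call shows why the NIC has to keep track of free memory: a frame addressed to it is still lost if the 16-byte toy memory cannot hold another 10 bytes.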