Chapter 1 - Introduction

A collection of related protocols is called a protocol suite.

The TCP/IP stack is a protocol suite that originated from the ARPANET Reference Model (ARM) [RFC0871].

1.1 Architectural Principles

TCP/IP lays the foundation of today's global Internet, a wide area network (WAN).

The World Wide Web (WWW) is an application that uses the Internet for communication.

Goals in creating the Internet.
Primary Goal:

"Develop an effective technique for multiplexed utilization of existing interconnected networks." ------ Clark

1.1.1 Packets, Connections, and Datagrams

Up to the 1960s, the concept of a network was based on the telephone network.

The concept of packet switching was developed in the 1960s.

Chunks (packets) of bytes can be carried through the network independently.

Chunks from different sources can be mixed together and pulled apart later; this is called multiplexing.

A packet switch stores received packets in buffer memory (a queue) and typically processes them in first-in, first-out (FIFO) order.
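The buffering behavior described above can be sketched as a toy model. The class name, the capacity parameter, and the drop-when-full policy are assumptions made for illustration, not details from the text.

```python
from collections import deque


class FifoPacketSwitch:
    """Toy packet switch: one shared FIFO queue for all input sources."""

    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of buffered packets (assumed)
        self.queue = deque()
        self.dropped = 0

    def receive(self, source, payload):
        """Buffer an arriving packet, or drop it if the buffer is full."""
        if len(self.queue) >= self.capacity:
            self.dropped += 1
            return False
        self.queue.append((source, payload))
        return True

    def forward(self):
        """Send the oldest buffered packet, or return None if the queue is empty."""
        return self.queue.popleft() if self.queue else None


switch = FifoPacketSwitch(capacity=3)
# Packets from two sources arrive interleaved and share one queue (multiplexing).
for source, payload in [("A", b"a1"), ("B", b"b1"), ("A", b"a2"), ("B", b"b2")]:
    switch.receive(source, payload)

sent = []
while (packet := switch.forward()) is not None:
    sent.append(packet)
print(sent, "dropped:", switch.dropped)
```

With a capacity of 3, the fourth arrival is dropped and the other three leave in arrival order, regardless of which source sent them.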

Forms of multiplexing include statistical multiplexing, time-division multiplexing (TDM), and static multiplexing.

Virtual circuits (VCs) can be implemented atop connection-oriented packet networks such as X.25. X.25 was largely replaced in the 1990s by Frame Relay and ultimately by digital subscriber line (DSL) technology.

The datagram was developed in the late 1960s. All the identifying information about the source and destination is carried in the packet itself rather than in the packet switches, which allows a connectionless network to be built and eliminates the need for a complicated signaling protocol.

Message boundaries, or record markers, are another related concept.

1.1.2 The End-to-End Argument and Fate Sharing

The end-to-end argument states that correctness and completeness can be achieved only by involving the application or ultimate user of the communication system. It argues that important functions such as error control, encryption, and delivery acknowledgment should usually not be implemented at the low levels or layers of large systems. The result is a dumb network with smart end hosts.

TCP/IP follows the end-to-end argument: the methods for ensuring that data is not lost and for controlling the rate at which a sender sends are implemented in the end hosts, where the applications reside.

Fate sharing suggests that all the state necessary to maintain an active communication association be kept at the same location as the communicating endpoints.

The question nowadays is which functions should reside in the network and which should not.

1.1.3 Error Control and Flow Control

According to the end-to-end argument and fate sharing, error control should be implemented close to or within applications.

In circuit-switched or VC-switched networks such as X.25, retransmission tends to be done inside the network.

Best-effort delivery was adopted by Frame Relay and the Internet Protocol: an errant datagram is merely discarded.

With best-effort delivery, a fast sender that produces information at a rate exceeding the receiver's ability to consume it must be slowed down by a flow control mechanism. Flow control operates outside the network, at higher levels of the communication system.
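One way to picture flow control is a window-based scheme, in which the receiver tells the sender how much buffer space is free and the sender never sends more than that. The following is a minimal simulation under that assumption; the function name, the step-based timing, and the parameters are hypothetical and do not model any particular protocol.

```python
def simulate_window_flow_control(data: bytes, buffer_size: int, consume_per_step: int):
    """Return the chunk sizes the sender was allowed to send, step by step."""
    if buffer_size <= 0 or consume_per_step <= 0:
        raise ValueError("buffer_size and consume_per_step must be positive")
    receiver_buffer = bytearray()
    sent_sizes = []
    offset = 0
    while offset < len(data) or receiver_buffer:
        # The receiver advertises its free buffer space (the window).
        window = buffer_size - len(receiver_buffer)
        # The sender obeys the window, however much data it has ready.
        chunk = data[offset:offset + window]
        if chunk:
            receiver_buffer += chunk
            offset += len(chunk)
            sent_sizes.append(len(chunk))
        # The receiving application consumes some buffered data.
        del receiver_buffer[:consume_per_step]
    return sent_sizes


sizes = simulate_window_flow_control(bytes(10), buffer_size=4, consume_per_step=2)
print(sizes)  # [4, 2, 2, 2]
```

After the first burst fills the buffer, the sender is held to the rate at which the receiver consumes data.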

1.2 Design and Implementation

The "THE" multiprogramming system advocated the use of a hierarchical structure to help verify the logical soundness and correctness of a large software implementation. This idea led to network protocol suites being designed and implemented in layers, an approach called layering.

1.2.1 Layering

With layering, each layer is responsible for a specific functionality in internet communication.

The most frequently cited concept of protocol layering is based on a standard called the Open Systems Interconnection (OSI) model, defined by the International Organization for Standardization (ISO).

OSI 7 layer model.

1.2.2 Multiplexing, Demultiplexing, and Encapsulation in Layered Implementations

A layered architecture allows protocol multiplexing, which makes it possible for multiple different protocols to coexist and for multiple instantiations of the same protocol object (connections) to be used simultaneously.

At each layer, a packet carries a protocol identifier field whose value indicates which protocol it is using.

When a message or packet, called a protocol data unit (PDU), at one layer is carried by a lower layer, it is said to be encapsulated by the next layer down.

Each layer has its own concept of a message object (a PDU) corresponding to the particular layer responsible for creating it. A layer 4 (transport) protocol produces a packet called a layer 4 PDU or transport PDU (TPDU).

Each layer promises not to look into the PDU provided by the layer above. This is the essence of encapsulation: PDUs are opaque to the layers below.

In encapsulation, each layer prepends its own header, and sometimes appends a trailer, to the PDU received from the layer above.
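A minimal sketch of encapsulation follows. The bracketed header and trailer strings are placeholders invented for readability; real headers are binary structures carrying addresses, lengths, and checksums.

```python
def encapsulate(pdu: bytes, header: bytes, trailer: bytes = b"") -> bytes:
    """Wrap the PDU from the layer above without inspecting it."""
    return header + pdu + trailer


def decapsulate(pdu: bytes, header_len: int, trailer_len: int = 0) -> bytes:
    """Strip this layer's header and trailer, returning the upper-layer PDU."""
    return pdu[header_len:len(pdu) - trailer_len]


app_data = b"hello"
segment = encapsulate(app_data, header=b"[TCP]")                   # transport layer
datagram = encapsulate(segment, header=b"[IP]")                    # network layer
frame = encapsulate(datagram, header=b"[ETH]", trailer=b"[FCS]")   # link layer
print(frame)  # b'[ETH][IP][TCP]hello[FCS]'

# The receiving side peels the layers off in the reverse order.
received = decapsulate(frame, header_len=5, trailer_len=5)
print(received == datagram)  # True
```

Each call treats its input as opaque bytes, which mirrors the promise that a layer does not look inside the PDU handed down from above.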

In TCP/IP networks, the identifiers used for demultiplexing are commonly hardware addresses, IP addresses, and port numbers; state information may also be used.

A simple network connection.
The two hosts on either side are called end systems. The router in the middle is called an intermediate system for a particular protocol suite. Layers above the network layer use end-to-end protocols. The network layer provides a hop-by-hop protocol that is used by the end systems and by every intermediate system. The switch (bridge) is not ordinarily considered an intermediate system: it is not addressed using the internetworking protocol's addressing format, and it operates in a fashion that is largely transparent to the network-layer protocol.

Any system with multiple interfaces is called multihomed.

1.3 The Architecture and Protocols of the TCP/IP Suite

1.3.1 The ARPANET Reference Model

ARPANET Reference Model.

The PDU that IP sends to a link-layer protocol is called an IP datagram. It may be as large as 64KB (and up to 4GB with IPv6). An IP datagram is also referred to as a packet.

Link-layer PDUs are called frames. An IP datagram that is too large to fit in a frame can be split into smaller pieces by the fragmentation function.

The process of determining and sending the datagram to its next hop is called forwarding.

There are three types of IP addresses and they affect how forwarding is performed: unicast, broadcast, multicast.
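As a rough illustration, the three types can be distinguished for IPv4 with Python's standard `ipaddress` module. The function name is hypothetical, and the sketch recognizes only the limited broadcast address 255.255.255.255; a subnet-directed broadcast address cannot be identified without knowing the network prefix, so it would be reported as unicast here.

```python
import ipaddress


def classify_ipv4(address: str) -> str:
    """Classify an IPv4 address as broadcast, multicast, or unicast (simplified)."""
    addr = ipaddress.IPv4Address(address)
    if addr == ipaddress.IPv4Address("255.255.255.255"):
        return "broadcast"   # limited (local network) broadcast only
    if addr.is_multicast:    # addresses in 224.0.0.0/4
        return "multicast"
    return "unicast"         # everything else, including reserved ranges


for example in ("10.0.0.1", "224.0.0.1", "255.255.255.255"):
    print(example, classify_ipv4(example))
```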

The Internet Control Message Protocol (ICMP) is an adjunct to IP and is sometimes labeled a layer 3.5 protocol. It is used by the IP layer to exchange error messages and other vital information with the IP layer in another host or router.

ICMPv6 is more complex and includes functions such as address autoconfiguration and Neighbor Discovery.

ICMP is primarily used by IP, but it can also be used by applications such as ping and traceroute.

The Internet Group Management Protocol (IGMP) is another adjunct protocol to IPv4. It is used with multicast addressing and delivery to manage which hosts are members of a multicast group.

The Transmission Control Protocol (TCP) in layer 4 deals with problems such as packet loss, duplication, and reordering that are not repaired by the IP layer. It operates in a connection-oriented (VC) fashion and does not preserve message boundaries.
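Because a byte stream does not preserve message boundaries, an application that needs them has to add its own record markers. One common approach, shown here as a sketch rather than as anything prescribed by TCP, is to prefix each message with its length. The 4-byte prefix and the function names are assumptions made for illustration.

```python
import struct

LENGTH_PREFIX = struct.Struct("!I")  # 4-byte unsigned length, network byte order


def frame_message(message: bytes) -> bytes:
    """Prepend the message length so the receiver can find the boundary."""
    return LENGTH_PREFIX.pack(len(message)) + message


def split_messages(buffer: bytes):
    """Return (complete_messages, leftover_bytes) from received stream data."""
    messages = []
    offset = 0
    while len(buffer) - offset >= LENGTH_PREFIX.size:
        (length,) = LENGTH_PREFIX.unpack_from(buffer, offset)
        start = offset + LENGTH_PREFIX.size
        if len(buffer) - start < length:
            break  # the message body has not fully arrived yet
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]


stream = frame_message(b"hello") + frame_message(b"world!")
# Simulate the stream arriving in two arbitrary pieces.
first, leftover = split_messages(stream[:7])
second, leftover = split_messages(leftover + stream[7:])
print(first, second, leftover)  # [] [b'hello', b'world!'] b''
```

The receiver recovers the same two messages no matter where the stream happens to be split.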

The User Datagram Protocol (UDP) allows datagrams to preserve message boundaries but imposes no rate control or error control.

TCP concerns itself with things such as dividing the data into appropriately sized chunks for the network layer below, acknowledging received packets, and setting timeouts to make certain the other end acknowledges packets that are sent.

The PDU that TCP sends to IP is called a TCP segment.

1.3.2 Multiplexing, Demultiplexing, and Encapsulation in TCP/IP

The following diagram shows how demultiplexing works.

An arriving Ethernet frame contains a 48-bit destination address (the Media Access Control, or MAC, address) and a 16-bit field called the Ethernet Type.

A value of 0x0800 indicates an IPv4 datagram.

A value of 0x0806 indicates an ARP message.

A value of 0x86DD indicates an IPv6 datagram.

Then the Ethernet header and trailer are removed and the payload is handed to IP for processing. IP checks the destination IP address. If the address matches one of its own and the datagram header contains no errors (IPv4 does not check the payload), the 8-bit IPv4 Protocol field (called Next Header in IPv6) is checked to determine which protocol to invoke next. Common values are 1 (ICMP), 2 (IGMP), 4 (IPv4), 6 (TCP), and 17 (UDP). Interestingly, the value 4 means that the payload of the IP datagram is itself an IP datagram. This seems to violate the original concepts of layering and encapsulation, but it is the basis for a powerful technique known as tunneling.
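The two demultiplexing steps can be sketched as follows. The function name and the hand-built test frame are hypothetical, and the sketch skips everything a real stack would also do, such as checking the destination address, validating the header checksum, and handling frames that are too short.

```python
import struct

ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}
IP_PROTOCOLS = {1: "ICMP", 2: "IGMP", 4: "IPv4", 6: "TCP", 17: "UDP"}


def demultiplex(frame: bytes):
    """Return (network-layer protocol, IPv4 payload protocol or None)."""
    # Ethernet header: 6-byte destination, 6-byte source, 2-byte type.
    _dst, _src, ethertype = struct.unpack_from("!6s6sH", frame, 0)
    network = ETHERTYPES.get(ethertype, "unknown")
    if network != "IPv4":
        return network, None
    ip_header = frame[14:]   # the IPv4 header follows the 14-byte Ethernet header
    protocol = ip_header[9]  # the Protocol field is the 10th byte of the IPv4 header
    return network, IP_PROTOCOLS.get(protocol, "unknown")


# Build a minimal frame by hand: zeroed addresses, type 0x0800, Protocol = 6.
ipv4_header = bytearray(20)
ipv4_header[0] = 0x45  # version 4, header length of five 32-bit words
ipv4_header[9] = 6     # TCP
frame = bytes(6) + bytes(6) + struct.pack("!H", 0x0800) + bytes(ipv4_header)
print(demultiplex(frame))  # ('IPv4', 'TCP')
```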

1.3.3 Port Numbers

Port numbers are 16-bit nonnegative integers.

Port numbers are assigned by the Internet Assigned Numbers Authority (IANA).

There are three sets of port numbers: the well-known port numbers (0-1023), the registered port numbers (1024-49151), and the dynamic/private port numbers (49152-65535).
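The three ranges translate directly into a small lookup. The function name is hypothetical.

```python
def port_category(port: int) -> str:
    """Map a port number to the range it falls in."""
    if not 0 <= port <= 65535:
        raise ValueError("port numbers are 16-bit values in the range 0-65535")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


for port in (80, 8080, 50000):
    print(port, port_category(port))
```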

1.3.4 Names, Addresses, and the DNS

The Domain Name System (DNS) translates human-readable domain names to IP addresses, and vice versa.

1.4 Internets, Intranets, and Extranets

The lowercase term internet refers to multiple networks connected together using a common protocol suite. The uppercase term Internet refers to the collection of hosts around the world that can communicate with each other using TCP/IP. The Internet is an internet, but the reverse is not true.

"The value of a computer network is proportional to the square of the number of connected endpoints (users or devices)." ------ Metcalfe's Law

The easiest way to build an internet is to connect two or more networks with a router. A router can provide connections to many different types of physical networks: Ethernet, Wi-Fi, point-to-point links, DSL, cable Internet service, and so on.

Routers were called gateways in the past. Today the term gateway is used for an application-layer gateway (ALG), a process that connects two different protocol suites.

An intranet is a private internetwork, usually run by a business. Users often connect to an intranet through a virtual private network (VPN), which limits access to authorized users.

An extranet opens part of an intranet to partners or other associates outside the organization.

1.5 Designing Applications

Network applications are designed with common design patterns, namely client/server and peer-to-peer.

1.5.1 Client/Server

Most network applications are designed so that one side is the client and the other is the server. Servers can be categorized into two classes: iterative and concurrent.

Iterative Server:
I1. Wait for a client request to arrive.
I2. Process the client request.
I3. Send the response back to the client that sent the request.
I4. Go back to step I1.
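Steps I1 to I4 can be sketched with Berkeley sockets on the loopback interface. The uppercase "processing", the helper names, and the `max_requests` limit (added so the example terminates, whereas a real server loops forever) are assumptions made for illustration.

```python
import socket
import threading


def iterative_server(listener: socket.socket, max_requests: int) -> None:
    """Serve one client at a time; the next client waits until this one is done."""
    for _ in range(max_requests):
        conn, _addr = listener.accept()   # I1: wait for a client request
        with conn:
            data = conn.recv(1024)
            response = data.upper()       # I2: process the request
            conn.sendall(response)        # I3: send the response back
        # I4: the loop returns to I1


def send_request(port: int, data: bytes) -> bytes:
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(data)
        return client.recv(1024)


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen()
port = listener.getsockname()[1]

server = threading.Thread(target=iterative_server, args=(listener, 2))
server.start()
replies = [send_request(port, b"ping"), send_request(port, b"pong")]
server.join()
listener.close()
print(replies)  # [b'PING', b'PONG']
```

The drawback of this design is visible in the loop: while one request is being processed, no other client is served.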

Concurrent Server:
C1. Wait for a client request to arrive.
C2. Start a new process, task, or thread to handle the client's request. While the new process is handling the request, the main process proceeds to C3.
C3. Go back to C1.
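Steps C1 to C3 can be sketched with one thread per client. As before, the helper names, the byte-reversing "processing", and the `max_clients` limit are hypothetical; the example uses threads, though a process or task per client follows the same pattern.

```python
import socket
import threading


def handle_client(conn: socket.socket) -> None:
    """Runs in its own thread, one per client."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(data[::-1])  # toy processing: reverse the bytes


def concurrent_server(listener: socket.socket, max_clients: int) -> None:
    workers = []
    for _ in range(max_clients):
        conn, _addr = listener.accept()  # C1: wait for a client request
        worker = threading.Thread(target=handle_client, args=(conn,))
        worker.start()                   # C2: hand the client to a new thread
        workers.append(worker)           # C3: go back to C1 immediately
    for worker in workers:
        worker.join()


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen()
port = listener.getsockname()[1]
server = threading.Thread(target=concurrent_server, args=(listener, 2))
server.start()

slow = socket.create_connection(("127.0.0.1", port))  # connects first, sends last
fast = socket.create_connection(("127.0.0.1", port))
fast.sendall(b"abc")
fast_reply = fast.recv(1024)  # answered while the slow client is still idle
slow.sendall(b"xyz")
slow_reply = slow.recv(1024)
fast.close()
slow.close()
server.join()
listener.close()
print(fast_reply, slow_reply)  # b'cba' b'zyx'
```

The second client is answered even though the first connected earlier and has not yet sent anything, which an iterative server could not do.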

1.5.2 Peer-to-Peer

Each application in a peer-to-peer (P2P) network acts as both a client and a server. A concurrent P2P application receives an incoming request, determines whether it is able to respond, and if not forwards the request to some other peer. Together, the P2P applications form a network among applications, called an overlay network.

1.5.3 Application Programming Interfaces (APIs)

Applications need to perform certain types of operations, like connect, read data and write data. This is usually done by using a networking application programming interface (API).

The most popular API is called sockets or Berkeley sockets.

Extensions of the sockets API for IPv6 are described in RFC3493, RFC3542, RFC3678, RFC4584, and RFC5014.

1.6 Standardization Process

The Internet Engineering Task Force (IETF) meets three times a year to develop, discuss, and agree on standards for the Internet's core protocols, including but not limited to IPv4, IPv6, TCP, UDP, and DNS.

The IETF is a forum that elects two leadership groups: the Internet Architecture Board (IAB) and the Internet Engineering Steering Group (IESG). The IAB is chartered to provide architectural guidance to activities in the IETF and to perform a number of other tasks, such as appointing liaisons to other standards-defining organizations (SDOs). The IESG is responsible for approving new standards and for modifying existing ones. The Internet Research Task Force (IRTF) explores protocols, architectures, and procedures that are not deemed mature enough for standardization.

1.6.1 Request for Comments (RFC)

Every official standard in the Internet community is published as a Request for Comments, or RFC.

Not all RFCs are standards; only standards-track RFCs are considered official standards. Other categories include best current practice (BCP), informational, experimental, and historic.

Important RFCs: RFC5000 defines the set of all other RFCs that are considered official standards as of mid-2008. The Host Requirements RFCs, RFC1122 and RFC1123, define requirements for protocol implementations in Internet IPv4 hosts. The Router Requirements RFC1812 does the same for routers. The Node Requirements RFC4294 does both for IPv6 systems.

1.6.2 Other Standards

The Institute of Electrical and Electronics Engineers (IEEE) is concerned with standards below layer 3 (Ethernet and Wi-Fi). The World Wide Web Consortium (W3C) is concerned with application-layer protocols (Web technologies, e.g., HTML-based syntax). The International Telecommunication Union (ITU, ITU-T) standardizes protocols used within the telephone and cellular networks.

1.7 Implementations and Software Distributions

The historic standard TCP/IP implementations were from the Computer Systems Research Group (CSRG) at the University of California, Berkeley. They were distributed with the 4.x BSD (Berkeley Software Distribution) system and with the BSD Networking Releases until the mid-1990s.

1.8 Attacks Involving the Internet Architecture

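IP datagrams are delivered based on their destination address, and the source address is not verified. A sender can therefore put any address it chooses in the source address field, which is called spoofing.

Spoofing can be combined with a variety of other attacks seen on the Internet. Denial-of-service (DoS) and distributed DoS (DDoS) attacks are among the most common.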

Only host-to-host encryption (IP layer or above) can protect the information across the multiple network segments an IP datagram is likely to traverse on its way to its final destination.