02 Dec 2020 - tsp
Last update 03 Dec 2020
TL;DR: The Internet is an interconnection of independent networks and machines that use a really simple ruleset to route traffic between them in a failure resistant way, without any guarantee of reliability or trustworthiness.
Since I know many people who have no idea how the Internet really works or what it is, but who often talk about how one should regulate it, I decided to write this hopefully short introduction to some basic networking theory. Please don’t be frightened when I write about Ethernet, basic IP networks and subnetting at first - it’s required to discuss the concept of the Internet later on.
First let’s take a look at the local network layer. There is a variety of technologies available but I’ll focus on the one most commonly seen in private and consumer areas. Ethernet is a rather old technology that dates back to 1973. The basic technology still works the same as back then with some minor changes.
First Ethernet provides an electrical standard for networking - usually seen over Twisted Pair cables today (what’s commonly seen as network connectors) but in fact the 802.3 specification supports a huge number of technologies. Coaxial cables have been used back in the 80’s, fiber optic cables are used for long distance connections, there are parallel short cables (CX4) used to interconnect switches in datacenters over less than a meter, there are backplane specifications (KR) and other electrical carrier specifications.
Since electrical transfer is only half of the story, Ethernet also provides a link layer protocol. Each and every Ethernet device worldwide has a unique 48 bit device address, also called the media access control (MAC) address. This address is assigned by the manufacturer of the network equipment from a namespace that has been assigned to the manufacturer. It should usually not be changed, and there are many applications that require MAC addresses to be unique worldwide. At the very least they have to be unique on the local network segment for Ethernet to function correctly. Note that a MAC address is easy to spoof, so it’s not usable for anything security related.
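What’s the MAC address used for? Data is transmitted between network equipment in the form of packets. The Ethernet frame specifies the layout of a packet transmitted between Ethernet capable equipment. Leaving aside the preamble, an Ethernet frame consists of the destination MAC address, the source MAC address, a type (or length) field, the payload and a trailing checksum (the frame check sequence).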
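To make the frame layout a bit more tangible, here is a minimal Python sketch that splits the header of an untagged Ethernet II frame into its fields. It assumes the frame is seen the way software usually sees it, i.e. without preamble and without the trailing checksum; the function name and the example frame are made up for illustration.

```python
import struct


def parse_ethernet_header(frame: bytes) -> dict:
    """Split the first 14 bytes of an untagged Ethernet II frame into fields."""
    if len(frame) < 14:
        raise ValueError("frame too short for an Ethernet header")
    # 6 bytes destination MAC, 6 bytes source MAC, 2 bytes EtherType (big endian)
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "destination": dst.hex(":"),
        "source": src.hex(":"),
        "ethertype": ethertype,
        "payload": frame[14:],
    }


# Hypothetical frame: broadcast destination, made-up source address,
# EtherType 0x0800 (IPv4) and a dummy payload.
example = bytes.fromhex("ffffffffffff" "020000000001" "0800") + b"payload"
header = parse_ethernet_header(example)
print(header)
```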
A network might consist of an arbitrary number of devices and might either be switched (usually it’s today) or connected via a hub (done in the early days).
Using a hub, the electrical signals from one transmitter are simply passed on to all receivers on the same network. This always happens - even if other devices are transmitting at the same time. Collisions are handled with a scheme called CSMA/CD - carrier sense multiple access / collision detection. This basically means that a station that wishes to transmit listens until the medium is idle, starts transmitting while watching for a collision and, in case of a collision, aborts, waits for a random time and tries again.
Note that this behavior is the reason why one should consider a network that’s loaded up to 80% of its capacity outside of peak periods to be under-dimensioned.
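The random waiting time after a collision can be sketched as follows. This is only an illustration of the truncated binary exponential backoff used by classic Ethernet; the function names and the scripted list of collision outcomes are assumptions made for the example, and real hardware does all of this in the network interface.

```python
import random

MAX_ATTEMPTS = 16   # classic Ethernet gives up after 16 attempts
BACKOFF_LIMIT = 10  # the exponent stops growing after 10 collisions


def backoff_slots(collisions: int, rng: random.Random) -> int:
    """Pick a random number of slot times to wait after the n-th collision."""
    k = min(collisions, BACKOFF_LIMIT)
    return rng.randrange(2 ** k)  # uniformly from 0 .. 2^k - 1


def transmit(collision_outcomes, rng: random.Random):
    """Replay a scripted list of collision results (True = collision).

    Returns the number of attempts needed and the total slots waited.
    """
    waited = 0
    for attempt, collided in enumerate(collision_outcomes, start=1):
        if attempt > MAX_ATTEMPTS:
            break
        if not collided:
            return attempt, waited
        waited += backoff_slots(attempt, rng)
    raise RuntimeError("giving up: too many collisions")


rng = random.Random(0)
attempts, waited = transmit([True, True, False], rng)
print(attempts, waited)
```

The growing waiting window is the reason throughput degrades quickly once many stations compete for a shared segment.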
All devices listen to all incoming traffic. Whenever they receive a packet with their MAC address or a broadcast address listed as the target / destination address they accept the packet. In any other case - i.e. a non matching address - they silently discard the packet directly on the network processor, except when the device has deliberately been put into promiscuous mode. In this case the software layer receives all packets that have been detected on the Ethernet layer. This used to be a nice method to listen to all traffic on non-switched Ethernet segments - on modern networks it’s not sufficient to listen to all traffic of other nodes though, one would have to perform some ARP cache poisoning to do this (more on that later).
Since with a hub all packets have to be received by every station and the whole network segment can only be used by a single device at a time, switching has been introduced. A switch initially works exactly like a hub - all incoming packets are transmitted on all ports. But for each incoming packet the switch learns the source address and remembers on which port it has been seen. In case a packet arrives whose destination address matches a previously seen source address the packet is only transmitted to the previously learned port. Table entries are replaced when newer information arrives (i.e. a device has changed its physical switch port) and removed after a given timeout. A special class of packets - so called broadcast and multicast packets - is transmitted to all ports and received by all nodes anyway in traditionally switched networks. Some more advanced networks only do this for broadcast packets and perform a process called IGMP snooping to forward multicast packets only to ports that have subscribers for the given multicast group.
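The learning behavior can be sketched in a few lines of Python. This is a simplified model under some assumptions: the class and method names are invented, entry timeouts and multicast handling are left out, and unknown destinations are flooded to every port except the one the packet arrived on.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy model of a switch that learns which MAC address lives on which port."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # MAC address -> port it was last seen on

    def handle(self, in_port, src, dst):
        """Return the list of ports the packet is sent out on."""
        # Learn (or refresh) the location of the sender.
        self.table[src] = in_port
        if dst != BROADCAST and dst in self.table:
            return [self.table[dst]]
        # Unknown destination or broadcast: behave like a hub.
        return [p for p in self.ports if p != in_port]


switch = LearningSwitch([1, 2, 3])
first = switch.handle(1, "02:00:00:00:00:0a", "02:00:00:00:00:0b")   # flooded
reply = switch.handle(2, "02:00:00:00:00:0b", "02:00:00:00:00:0a")   # learned
second = switch.handle(1, "02:00:00:00:00:0a", "02:00:00:00:00:0b")  # learned
print(first, reply, second)
```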
Of course switching also has a drawback - if one forms switching loops inside a network by connecting switches with multiple paths (either indirectly or forming a ring topology) packets might be passed around infinitely and cause network overload. Of course building a ring topology or a fully meshed interconnect might be interesting for fault tolerance so modern managed switches usually employ a protocol called spanning tree (STP) that detects loops and disables ports as long as multiple paths exist.
Note that this switching does not provide any security since it’s easy to poison the switches’ caches to redirect traffic to different ports, or to flood them so they start broadcasting packets again. On the other hand switching using crossbar switches allows better network utilization and reduces the load on single network devices. Depending on the switch it may even allow devices of different speeds to be attached to the same switch without limiting the whole network segment to the slower speed.
Note that the policy in case of network congestion (i.e. two devices try to transmit to the same destination port) is to silently drop one of the packets. Therefore higher protocol layers like TCP will have to detect such situations.
As one can see this scheme works perfectly well for a single network segment that’s small enough to employ CSMA/CD and that is also capable of broadcasting a single packet onto all ports as long as routes are not known. But it’s not sufficient for larger networks, multiple segments or the Internet. Also all devices on the same network segment are contained in the same broadcast domain, so the number of devices is usually also limited by broadcast traffic.
To counter these problems the Internet Protocol (IP) has been designed. It can be used on top of any other networking protocol like Ethernet or even serial protocols (serial line IP - SLIP - for example). It’s also often tunneled in SONET/SDH ATM frames, generic routing encapsulation packets or VPN tunnels on top of other IP networks.
The basic idea is similar to Ethernet. Each packet that should be transmitted contains a source IP and a target IP address. The network itself is divided into subnets that are formed by applying subnet masks to addresses. One can imagine a subnet being simply all addresses that share a given number of high order bits of their IP addresses called a prefix. In fact subnets are usually encoded using an IP address together with a prefix length.
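The idea of "sharing a number of high order bits" can be written down directly. The following Python sketch checks whether an IPv4 address belongs to a subnet given as address plus prefix length; the function name is made up and the addresses are only examples.

```python
import ipaddress


def in_subnet(address: str, network: str, prefix_len: int) -> bool:
    """True if address and network agree in their first prefix_len bits."""
    addr = int(ipaddress.IPv4Address(address))
    net = int(ipaddress.IPv4Address(network))
    # The subnet mask has prefix_len one bits followed by zero bits.
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return (addr & mask) == (net & mask)


print(in_subnet("192.168.1.16", "192.168.0.0", 16))  # same first 16 bits
print(in_subnet("192.169.0.1", "192.168.0.0", 16))   # differs within the prefix
```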
In the following section the format of IP addresses as well as some special subnets are listed. Don’t be afraid if you don’t totally understand the meaning of subnets and prefixes until now - more on that will be written in the next section on routing.
IPv4 (32 bit) addresses are usually written as a sequence of 4 decimal numbers separated by dots (for example 127.0.0.1). One is usually tempted to read this number as decimal but only the binary representation is relevant. One can of course arbitrarily select a prefix length but usually one’s limited by a given subnet that one can use for a specific application.
There is a number of networks that can be used for private networks that are never routed on any public network:
|IP Address||Prefix length||Usage|
|10.0.0.0/8||8 bits||Private class A network that consists of $2^{24}$ addresses|
|172.16.0.0/12||12 bits||16 class B networks that consist in total of $2^{20}$ addresses|
|192.168.0.0/16||16 bits||256 class C networks that consist in total of $2^{16}$ addresses|
|100.64.0.0/10||10 bits||Range assigned specially for carrier grade NAT as private network that does not collide with the earlier mentioned networks. Should not be used in private home networks|
|169.254.0.0/16||16 bits||A single class B network that should never be routed - not even in private networks. Used as link local addresses|
There are other networks that are reserved for special purposes:
|IP Address||Prefix length||Usage|
|0.0.0.0/8||8 bits||Current network (only as source). Is used for some broadcasts|
|127.0.0.0/8||8 bits||Used for loopback addresses to the host itself - and sometimes also to local virtual machines and containers. Never leaves the local system|
|192.88.99.0/24||24 bits||Reserved, previously used for IPv6 to IPv4 relays|
|198.18.0.0/15||15 bits||Used for benchmarking and testing inside local networks across subnet boundaries. Sometimes seen at exchange points|
|224.0.0.0/4||4 bits||Prefix used for all multicast groups|
|255.255.255.255/32||32 bits||Limited broadcast|
IPv6 addresses are written in hexadecimal notation since they are 128 bits long. Each 16 bit group is separated by a colon (:). One can replace a single series of zeroes by a double colon - which is sometimes done to separate an assigned prefix and a static IP address consisting largely of zeros - or for the loopback address that consists of 127 zeros and a single one.
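Python’s standard ipaddress module can be used to see the double colon abbreviation in action. The second address below is an arbitrary example taken from the 2001:db8::/32 documentation prefix.

```python
import ipaddress

# The loopback address: 127 zero bits followed by a single one bit.
loopback = ipaddress.IPv6Address("0:0:0:0:0:0:0:1")
print(loopback.compressed)  # short form using the double colon
print(loopback.exploded)    # all eight 16 bit groups written out

# A prefix followed by a host part that consists largely of zeros.
host = ipaddress.IPv6Address("2001:db8:0:0:0:0:0:1")
print(host.compressed)
```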
Again there is a group of special address ranges:
|IP Address||Prefix length||Usage|
|::/0||0 bits||Default route (only used symbolically)|
|::/128||128 bits||Unspecified invalid address, used only in software locally|
|::ffff:0:0/96||96 bits||Space to map IPv4 addresses for easy transport of IPv4 over IPv6 (lower 32 bits are set to the IPv4 address)|
|::ffff:0:0:0/96||96 bits||IPv4 translated addresses. Another transition mechanism not as easy as the previous one|
|64:ff9b::/96||96 bits||Internet global IPv4 to IPv6 translation mechanism|
|100::/64||64 bits||Discard packets prefix|
|2001::/32||32 bits||Teredo tunnel solutions - allows IPv6 access over IPv4 networks without any tunnel broker|
|2001:20::/28||28 bits||Overlay Routable Cryptographic Hash Identifiers (ORCHID)|
|2002::/16||16 bits||Older 6to4 translation mechanism|
|fc00::/7||7 bits||Unique local address (ULA) - all site local networks reside under this subnet|
|fe80::/10||10 bits||Link local addresses|
|ff00::/8||8 bits||Multicast group prefix|
So what does this whole subnet process mean? IP networks are segmented into different groups of hosts and networks called subnets. Nearly every network can be divided into further networks that share a common prefix. For example one can take the private network 10.0.0.0/8 and choose to split it into 256 other networks ranging from 10.0.0.0/16, 10.1.0.0/16, 10.2.0.0/16 up to 10.255.0.0/16. The same can be done with IPv6 networks - but for technical reasons subnets are not allowed to have a prefix longer than 64 bits there.
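The split of 10.0.0.0/8 into 256 smaller networks can be reproduced with Python’s standard ipaddress module. This is just a quick check of the arithmetic, not something one needs for configuring a network.

```python
import ipaddress

parent = ipaddress.ip_network("10.0.0.0/8")

# Extending the prefix from 8 to 16 bits yields 2^(16-8) = 256 subnets.
children = list(parent.subnets(new_prefix=16))

print(len(children))
print(children[0], children[1], children[2], "...", children[-1])
print(children[2].num_addresses)  # addresses inside a single /16
```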
Traffic inside each subnet is transferred via the previously presented switching process. Packets broadcasted or multicasted (without having switches that perform IGMP snooping) are transmitted to all members of the networks - they form a common broadcast domain. Additionally limitations on the network size apply.
Different subnets are connected by components called routers. How do routers decide how to route a packet? They employ a routing table. A routing table basically consists of a list of prefixes, prefix lengths and target ports (in practice there is some additional information like link metrics / costs, etc. but this is not required for basic understanding and system configuration). Whenever a packet arrives at a router and it has to take a decision, it logically ANDs the binary representation of the destination address with the subnet mask of each entry (i.e. sets all bits not corresponding to the prefix to zero) and compares the result with the entry’s prefix. In case one prefix matches it selects the port recorded with that prefix; in case multiple prefixes match routers usually employ a longest prefix rule that chooses the entry with the longest matching prefix to decide which port to forward to.
Let’s take a look at an (IPv4) example. Again it works the same for IPv6. Let’s say the routing table consists of:
|Subnet||Binary notation (Subnet)||Binary notation (mask)||Port|
So if now a packet arrives with the destination address 192.168.1.16 the router would logically AND the address with the mask of each entry:
As one can see three of the known routes would match the packet. The router then chooses the longest prefix and thus selects the first entry - choosing eth0 as target interface to forward the packet to. If no match had been found the router would transmit an ICMP destination unreachable message (often reported as no route to host) back to the sender’s address to indicate that no route to the selected target is known.
If one looks at the last entry in the routing table above one notices the 0.0.0.0/0 entry, which at first glance doesn’t make sense since it always provides a match. This is called a default route and is usually not found on large routers on the Internet but only on smaller routers in private and corporate networks. The idea is to forward packets for which no other route is known into the public Internet. This way not every local router has to know a route to every other network worldwide. This is totally different for the routers found on the Internet: all routers on the Internet know via which interface they reach every other publicly announced network.
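The longest prefix rule, including the default route, can be sketched in Python. Note that the routing table in this sketch is a made-up stand-in and not the table from the example; it was merely chosen so that three entries match 192.168.1.16 and the most specific one points to eth0. The function name and the other port names are assumptions as well.

```python
import ipaddress

# Hypothetical routing table: (prefix, outgoing port)
ROUTES = [
    ("192.168.1.0/24", "eth0"),
    ("192.168.0.0/16", "eth1"),
    ("10.0.0.0/8", "eth2"),
    ("0.0.0.0/0", "eth3"),  # default route, matches everything
]


def lookup(destination: str, routes=ROUTES):
    """Return the port of the longest matching prefix, or None without a match."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, port in routes:
        network = ipaddress.ip_network(prefix)
        # "dest in network" performs the AND-and-compare step described above.
        if dest in network:
            matches.append((network.prefixlen, port))
    if not matches:
        return None
    return max(matches)[1]  # the longest prefix wins


print(lookup("192.168.1.16"))             # most specific entry
print(lookup("8.8.8.8"))                  # only the default route matches
print(lookup("8.8.8.8", ROUTES[:3]))      # no default route: no match
```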
How is this routing information configured? One can imagine that for smaller networks this can be done manually - which is in fact what’s done for example when manually assigning IP addresses to interfaces.
If one assigns a static address to an interface one also assigns a subnet mask together with the address. As soon as this happens the system adds a routing entry for the given interface to its local routing table. For many small scale site local networks this is enough - sometimes adding a default route is sufficient. For dynamically configured local networks the same thing happens - the devices get an address assigned by DHCP (IPv4) or SLAAC (IPv6) together with a subnet mask or prefix length as well as a default route and calculate their routing tables from this information.
When using DHCP a system starts by transmitting a DHCP discover message onto the local network using the broadcast address, asking for an address assignment by a service called the DHCP server. The DHCP server(s) see this request, select an IP address for the device - sometimes this is done dynamically, sometimes it’s assigned statically based on the MAC address or the physical location of a device - and transmit the information back. This requires a stateful DHCP server that keeps track of assigned IP addresses though. This has been changed with IPv6, where IP address assignment is usually done via stateless address autoconfiguration (SLAAC), which works by having routers announce the prefixes they’re authoritative for onto the network via multicast (ICMPv6 router advertisements). Every device seeing such an advertisement can take the announced prefix, derive a 64 bit local part from its MAC address and attach it to the prefix - and thus has an address assigned automatically. One can also assign additional configuration like a default route and DNS server configuration using the same mechanism, or use DHCPv6 to provide additional configuration information.
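Deriving the 64 bit local part from a MAC address can be sketched as follows. This assumes the classic modified EUI-64 scheme (flip the universal/local bit of the first octet and insert ff:fe in the middle); many current systems use randomized or otherwise stable identifiers instead. The function name, the prefix and the MAC address are made up for the example.

```python
import ipaddress


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a modified EUI-64 interface identifier."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("a MAC address has exactly 6 octets")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the upper and lower three octets.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    value = int(network.network_address) | int.from_bytes(interface_id, "big")
    return ipaddress.IPv6Address(value)


address = slaac_address("2001:db8:1:2::/64", "00:11:22:33:44:55")
print(address)
```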
But how does this work for larger networks containing multiple subnets or even the Internet? There are a number of routing protocols by which routers can exchange their routing information. Two protocols one encounters today are Optimized Link State Routing (OLSR), which is used on ad hoc networks like wireless meshes or in local networks, and the Border Gateway Protocol (BGP), which is in fact the protocol that literally builds the Internet.
The main difference is how both protocols implement discovery of neighbors and how much information they keep. For BGP each router that is allowed to route a given network is configured by a network operator to announce this responsibility. Announcing means the router simply tells all of its neighbors that it’s responsible for the given subnet via one hop. Routers then additionally pass on all of their routing information, increasing the hop count by one (or adding some kind of metric information for a given link to reflect link cost, link quality, etc.). Thus they slowly learn which networks their neighbors can reach, which networks the neighbors of their neighbors can reach, etc. and they learn which of their ports provides the shortest route. Whenever they see an announcement for a network they don’t already know, they learn that the given port leads to that network with the seen link cost / hop count. Whenever they see an announcement for an already known network they check whether the new announcement has a lower link cost / lower hop count; in this case they simply update their local routing table.
That way routers always know via which port they can reach a network the shortest way. It also allows them to transmit packets into third party networks that forward traffic to other attached networks, so one can reach any network attached to the Internet, since networks are usually configured to provide transit for all data packets passing through them. This is one of the major building blocks of the Internet: any traffic received by a network is forwarded as long as a route is known, without charging the transmitter or receiver, without looking at the payload or service and without any content discrimination. This is what’s really known as net neutrality, and it is the second building block of the Internet itself.
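The learning rule described here (remember a new network, replace a known route only if the new one is cheaper) can be sketched in Python. This is a deliberately simplified hop count model of that rule; real BGP additionally carries the path of networks an announcement has passed through and applies operator policy. All names and the example prefix are assumptions.

```python
def learn(table, prefix, via_port, announced_cost, link_cost=1):
    """Process one announcement; return True if the routing table changed.

    table maps a prefix to (port, cost). The cost of the link the
    announcement arrived on is added to the cost the neighbor announced.
    """
    cost = announced_cost + link_cost
    known = table.get(prefix)
    if known is None or cost < known[1]:
        table[prefix] = (via_port, cost)
        return True
    return False


table = {}
changed_first = learn(table, "203.0.113.0/24", "port1", 3)   # unknown network
changed_worse = learn(table, "203.0.113.0/24", "port2", 5)   # longer route
changed_better = learn(table, "203.0.113.0/24", "port3", 1)  # shorter route
print(table)
```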
This behavior is also the reason why the Internet is fault resistant. Whenever a network fails, routers learn different routes to other networks as long as any direct or indirect link exists. This usually takes anywhere from seconds to minutes - sometimes there is route flapping or there are short routing loops for some minutes, but by following these simple rules the network converges to an optimal state after a really short timespan.
There is an additional advantage to this process - one can announce a subnet at different physical locations. This might not seem obvious or useful at first but this technique - called anycast - allows service operators to host the same service at different network topological and physical locations. Any packet sent towards one of their networks always reaches the topologically closest router that announces their network. That way one usually reaches a geographically close system - but not necessarily the same system all the time. This is for example the reason why people can use 8.8.8.8 as Google’s DNS server address worldwide without reaching the same routers or the same system worldwide. And it provides another method of achieving redundancy.
Another advantage is that one can arbitrarily use one’s IP subnets. IP addresses never have any geographic association. One can announce any subnet of one’s addresses anywhere worldwide - even at multiple locations.
One drawback is of course that one can - and state actors have done this more often than one would think - announce networks that one’s not responsible for to divert traffic to one’s own network. This does not go unnoticed and usually leads to one’s peers disconnecting from one’s network if done on purpose. Usually there are no laws governing this, but it’s an unwritten rule on the Internet that no one peers with someone who announces invalid routes (be it for malicious or for legal reasons - it doesn’t matter) because it’s in everyone’s own interest to keep routing tables functional and correct. Large network operators also usually monitor the network for announcements not matching their own. There have been ideas of adding some kind of signatures to BGP but this has not been deployed universally up until now and would require some additional central authority.
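Now that’s basically what the Internet is: an interconnection of independent networks that forward each other’s traffic without discrimination (net neutrality) and free of charge (cost neutrality).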
Keep in mind: The Internet would not work without network neutrality (in this case each network would require a direct connection to each other network) and without cost neutrality (each packet would have indeterminate price). This is not some political standpoint, it’s just a basic building block of the Internet.
Since no one wants to remember IP addresses, an additional hierarchical naming scheme as well as a resolution protocol has been defined: the Domain Name System (DNS). The DNS forms a hierarchical database - the root name servers know only the entries of the . root zone, which are called top level domains (TLDs). These entries contain references to another set of name servers operated by the registries of those domains. These know the next level of domain names. For example the servers for .at. operated by nic.at know which of my own DNS servers are responsible for tspi.at. and know some additional signature information for DNSSEC - but they don’t know what is inside the tspi.at. zone. Traditionally webpages are found under the www host, which is then resolved by my own domain name servers when asked who www.tspi.at. is. This is a recursive process. Please also keep in mind that it’s bad practice to offer web services directly under the domain apex (in this case tspi.at.) for a variety of reasons. Just don’t.
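Whenever a recursive resolver tries to resolve www.tspi.at. it works its way down from the root:
It asks the root servers who at. is. The root servers then respond with one of the known DNS servers responsible for the Austrian zone.
It asks these servers who tspi.at. is. They then return one of my own DNS server addresses back to the resolver.
Finally it asks my own DNS servers who www.tspi.at. is.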
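The walk from the root down to the final answer can be modeled with a small Python sketch. It is a toy model only: the delegation data, the server names and the address (taken from the 192.0.2.0/24 documentation range) are all invented, and a real resolver sends network queries and follows referrals instead of looking things up in a dictionary.

```python
# Hypothetical delegation data: zone -> {name: server or address}
ZONES = {
    ".": {"at.": "ns.registry.example."},
    "at.": {"tspi.at.": "ns.owner.example."},
    "tspi.at.": {"www.tspi.at.": "192.0.2.10"},
}


def resolve(name: str):
    """Walk from the root zone down to the zone that knows the full name.

    Returns the final answer and the list of (zone asked, question, answer).
    """
    labels = name.rstrip(".").split(".")
    zone = "."
    trail = []
    answer = None
    for depth in range(1, len(labels) + 1):
        # Ask for one more label each round: at., tspi.at., www.tspi.at.
        question = ".".join(labels[-depth:]) + "."
        answer = ZONES[zone][question]
        trail.append((zone, question, answer))
        zone = question
    return answer, trail


answer, trail = resolve("www.tspi.at.")
for step in trail:
    print(step)
```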
This process is called recursive resolving; it is rather resource intensive and places some load onto the root servers. Thus a second method, called forwarding, is usually employed by clients and small gateways. In this case the client directly asks its forwarder who www.tspi.at. is, and the forwarder then either performs the recursion on behalf of the client or forwards the question to another DNS server. This has the advantage that forwarding DNS servers are capable of caching queried information. For example nameservers know for more than 24 hours who’s responsible for the .at. zone. One has to keep these (multiple) caching layers in mind when updating DNS zones - this is a rather slow process that might take up to days depending on configured TTLs and DNS server behavior.
Since the database is hierarchical it’s managed by different hierarchical entities. One usually has to pay for usage of domain names.
The DNS can contain additional information besides just resolving to IP addresses. For example one can refer to mail servers, keep information about DNSSEC signatures, keep information about used public keys for OpenPGP, publish information about servers being responsible for specific services (heavily used for XMPP for example), add telephone numbers - there is also a zone that resolves telephone numbers to SIP accounts to aid transition to voice over IP networks - etc.
So now one knows what the Internet is - but why haven’t webpages, E-Mail, etc. been mentioned? Because these are technologies built on top of the Internet. Basically the world wide web is a simple concept. It consists of a markup language called the hypertext markup language (HTML) as well as an address scheme called uniform resource locator (URL). Information is exchanged via a protocol called hypertext transfer protocol (HTTP).
The idea again is pretty simple. Anyone who wants to publish content runs their own servers that are reachable via the Internet. One then publishes some documents written in HTML on one’s server, and everyone requesting a document gets it sent by the server, unless one wants to use some kind of paywall or authentication. On the local system resources are specified by a path-like string that often maps to a filesystem structure (for example /directoryA/B/filename) - but this is simply an identifier. Whenever a system connects to the webserver and asks for /directoryA/B/filename the webserver transmits its answer to the client.
Note that this means that webpages are not simply out there for anyone to reach but are kept in a public area and handed to requesters on purpose and explicitly by the hosting servers - usually the content owners. This works in contrast to broadcast systems like television, where the provider simply sends out a stream of data to all customers and they decide what they want to receive or decode. This is an argument whenever politicians demand payment of TV flat-rates for Internet connections: it simply makes no sense.
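The request and response exchange can be tried out locally with Python’s standard library. The sketch below starts a tiny webserver on the loopback interface that hands out one document under the example path and then requests it. The document content and the helper names are made up, and a real webserver of course does far more than this.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# The "published" documents: path-like identifier -> content
DOCUMENTS = {"/directoryA/B/filename": b"<html><body>Hello</body></html>"}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = DOCUMENTS.get(self.path)
        if body is None:
            self.send_error(404)
            return
        # The server explicitly hands the document to the requester.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_once(path: str):
    """Start the server on a free local port, request one path, stop again."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        client = http.client.HTTPConnection("127.0.0.1", server.server_port)
        client.request("GET", path)
        response = client.getresponse()
        result = (response.status, response.read())
        client.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, body = fetch_once("/directoryA/B/filename")
print(status, body)
```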
The web part of the WWW is built by hyperlinks. Each HTML document can contain links - everyone knows them from reading webpages. These are simply references to some other resource on any other server worldwide.
That’s basically it.