Why BGP?

2025-03-04

The internet got its name from the fact that it is a “network of networks”. Various businesses, universities, and government entities developed networks but had to coordinate in order to communicate across these. The challenges here are more than just technical; the reality is that different organizations have different incentives.

In the jargon of computer networks, such entities are called autonomous systems. They may control routing within their domains, but how should routing happen across domains?

Some of these autonomous systems are “stubs”, i.e. organizations that just want to provide internet access to their own networks. Others, such as an ISP or network carrier, are in the business of selling connectivity or transit services. Autonomous systems can enter into various relationships with each other: a service provider might sell access to a stub, and two service providers might agree to transfer data between themselves (known as “peering”).

As noted, each AS has their own incentives and priorities. A transit AS may want to forward traffic for which they get paid but not traffic for which they don’t. A government may want to ensure that their data is never routed through a certain country. For the internet to even exist, such incentives have to be aligned. From this perspective, inter-AS (also called inter-domain) routing isn’t just about minimizing the number of hops or the network latency. Rather, it’s about providing autonomous systems with a sufficiently rich set of policy levers that they can conduct business.

Within the world of routing protocols, a distinction is often made between “distance-vector” and “link-state” protocols. A distance-vector protocol computes routes by distributing the computation. Routers publish the destinations they can reach along with the cost to reach them (hence the name: mathematically this looks like a vector of distances). When one router receives distance vectors from another, it re-calculates its own costs based on this updated information. This is a distributed form of the Bellman-Ford algorithm, and one of the oldest approaches to routing. One of the earliest implementations is the Routing Information Protocol (RIP).

The distance-vector idea is conceptually simple, but a naive implementation suffers from several performance issues. One notable problem is that convergence can be slow in the face of topology changes. Underlying this is the fact that stale information can circulate around network loops, and routers don’t have any way of knowing that they are acting on this stale information.

Over time a new paradigm emerged, the so-called link-state algorithms. Rather than distribute the path computation, link-state protocols distribute knowledge of the underlying network topology. Routers do this by advertising the links they have available (hence the name). They also maintain a database of the advertisements they have seen. Shortest paths through the network can be computed on a router directly (e.g. using Dijkstra’s algorithm). The internet standard here is the Open Shortest Path First protocol.

Link-state protocols typically have better convergence properties than distance-vector protocols, but they are not well suited for routing across autonomous systems. This goes back to issues of policy and incentive. Suppose an autonomous system is only willing to route traffic when they get paid to. This kind of constraint cannot be enforced in the network topology or from clever ways of assigning link costs. Nonetheless, evaluating whether a proposed route adheres to this constraint is straightforward.

This is where the internet’s Border Gateway Protocol (BGP) comes in. BGP takes the distance-vector paradigm but turns it on its head. What distance-vector gets right, from the perspective of inter-domain routing, is that it is centered on sharing reachability information rather than link information. But the emphasis on distances or costs as the bottom line is too narrow-minded. Instead, routers running BGP advertise paths and metadata about these paths. Administrators have control over what paths their AS advertises. They also control how their AS uses path information shared by domains.

BGP was first proposed in 1989 in RFC 1105. The initial RFC is surprisingly short. I was amused to learn that the original idea was developed in an IETF cafeteria as the authors scribbled their thoughts on a pair of napkins, leading BGP to be known as the “two-knapkin protocol”. BGP was not created in a vacuum. The problem of inter-domain routing had already been considered in RFC 827 but was showing its limits as the internet grew. So BGP was an attempt at a next-generation inter-domain routing protocol.

The latest version, BGP-4, takes 100 pages to specify in the core standard. Today, inter-domain routing runs on BGP-4. What the protocol has lost in elegance or brevity, it’s made up for by successfully scaling the internet.

#networking

Reply to this post by email ↪