In today's material I will talk about overlay VPNs. Examples of such networks include NetBird, ZeroTier and the increasingly popular Tailscale, which we will use as an example. I'm not saying it's the best, but it's the one we use the most.
We will talk about how overlay VPNs differ from classic VPNs such as WireGuard, IPsec or OpenVPN. We will also try to answer the question of which type of VPN will be better for you, and when.
We can see an example of a standard infrastructure in the figure. Its characteristic feature is one central point, which is the VPN server. In real deployments, this element is often duplicated for redundancy or even multiplied for performance.
Regardless, they all share one common element: a central point that every element of the system talks to. Examples of open solutions are WireGuard, IPsec or OpenVPN. The undoubted advantage of such a solution is simplicity. We have a firewall with a VPN, we configure it, expose the entire firewall or just one of its ports to the Internet, and it works.
There are many such solutions, and they are well known and mature. Unfortunately, they have several drawbacks. A major downside of a standard VPN server such as OpenVPN is that it must have a public IP address, preferably a static one. This is not a big obstacle, but ISPs sometimes charge additional fees for it.
Another disadvantage is that any traffic sent by any element of the network has to pass through the central node. If there are not many users and the amount of traffic is small, there is no problem. However, suppose one of the users, as in the figure, starts downloading some large files from one of the offices. In that case, all of that data will pass through the VPN server, consuming the firewall's resources twice: once on the way in and once on the way out. Another example would be the need to regularly synchronize larger amounts of data between offices, such as video footage or surveillance recordings. Do you see what I'm getting at? It's pretty easy to get into a situation where you start to run out of processing power on the firewall running the VPN, or out of Internet bandwidth at the central site. Saturating the firewall with one or more large streams can negatively affect the stability of the entire VPN.
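To put rough numbers on the "twice" from a moment ago, here is a minimal Python sketch. The function name and the traffic figures are made up for illustration, and it assumes every flow between two clients enters the central VPN server once and leaves it once.

```python
def hub_processing_mbps(flows_mbps):
    """Traffic the central VPN server has to handle, in Mbps.

    Assumption: every flow is received (and decrypted) once and then
    sent (and re-encrypted) once, so each flow is counted twice.
    """
    inbound = sum(flows_mbps)   # traffic arriving at the hub
    outbound = sum(flows_mbps)  # the same traffic leaving the hub
    return inbound + outbound


# Hypothetical flows: one large download plus two small sessions.
flows = [400, 50, 50]
print(hub_processing_mbps(flows))
```

So 500 Mbps of user traffic turns into 1000 Mbps of work for the firewall in this simplified model, which is why a single large transfer can be felt by everyone on the VPN.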
The inevitable result of such a topology is increased packet transit time. The total is the time from the client to the VPN server plus the time from the VPN server to the other client. In most cases, with today's broadband networks, this should not be a problem, but some applications are sensitive to increased response time. Overlay VPN-based networks have a slightly different logical infrastructure compared to standard VPNs.
ZeroTier, Nebula or Tailscale are examples of such solutions. Let's start with the basic difference that always strikes the eye first, which is the additional element in the form of the Coordination Manager. It's a kind of logical focal point, except that it doesn't carry any user data. Its primary task is to coordinate how that data is transmitted. In addition, it manages permissions.
Who can connect to a given network and who can't? The Coordination Manager can also manage traffic filtering. Who can open communication to whom and on what port? The way it works is that any client joining the network connects to our Coordination Manager somewhere on the Internet. After proper authentication, it gets information about all the devices on its network that it is allowed to access. In an overlay VPN like Tailscale, an additional virtual interface is created, and in the system it shows up as a new network card. From then on, the computer can communicate directly with devices on the VPN through the new interface as if they were on the local network.
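The "who can talk to whom and on what port" idea can be sketched in a few lines of Python. This is only an illustration with a default-deny rule list; the `Rule` type, the device names and the rules themselves are hypothetical and are not the policy format of Tailscale or any other product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    src: str   # device allowed to open the connection
    dst: str   # device being connected to
    port: int  # destination port


def is_allowed(rules, src, dst, port):
    """Default deny: traffic passes only if some rule matches exactly."""
    return any(r.src == src and r.dst == dst and r.port == port for r in rules)


# Hypothetical policy handed out by the coordination point.
RULES = [
    Rule("laptop-anna", "nas-office", 445),
    Rule("laptop-anna", "web-internal", 443),
]

print(is_allowed(RULES, "laptop-anna", "nas-office", 445))  # matching rule exists
print(is_allowed(RULES, "nas-office", "laptop-anna", 22))   # no rule, so denied
```

Note that the rules are directional: allowing the laptop to reach the NAS does not allow the NAS to open a connection back to the laptop.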
In order to start transmitting data between computers on the network, you must first connect them to each other. And this is where the magic of the overlay VPN begins to happen. Each computer is connected to the Coordination Manager. With its help, and with some very sophisticated manipulation of network packets, each computer creates a direct encrypted tunnel with every other computer in its VPN with which it wants to exchange data.
Regardless of how many NATs, and of what kind, the computers are buried behind, the protocol will find the shortest and most likely optimal path between two specific computers and set up an encrypted point-to-point tunnel. The result is a set of point-to-point tunnels that forms an any-to-any network, known as a full mesh. On top of this physical infrastructure of multiple tunnels, a logical network is created in which the computers see each other as members of a single virtual network. The undoubted advantage of overlay VPNs such as Tailscale or NetBird is that they establish a direct connection between the computers exchanging traffic, so the traffic takes the shortest route and the transmission is as fast as possible because it skips additional intermediate points.
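To get a feel for what a full mesh means in numbers, here is a small Python sketch. The device names and latency figures are invented for the example; it counts the tunnels needed if every device talks to every other one (the upper bound, n·(n−1)/2) and compares a direct path with a path relayed through a central server.

```python
from itertools import combinations


def mesh_tunnels(nodes):
    """All unordered device pairs, i.e. one tunnel per pair in a full mesh."""
    return list(combinations(sorted(nodes), 2))


def relayed_latency_ms(client_to_hub_ms, hub_to_peer_ms):
    """Classic VPN: the two legs through the central server add up."""
    return client_to_hub_ms + hub_to_peer_ms


# Hypothetical devices in one overlay network.
nodes = ["laptop", "nas", "office-fw", "phone"]
tunnels = mesh_tunnels(nodes)
print(len(tunnels))  # 4 devices give 4 * 3 / 2 pairs
for a, b in tunnels:
    print(f"{a} <-> {b}")

# Hypothetical latencies in milliseconds.
direct_ms = 12
via_hub_ms = relayed_latency_ms(20, 25)
print(direct_ms, via_hub_ms)
```

In practice a tunnel is only set up between devices that actually want to exchange data, so the real number is usually lower than this upper bound.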
The next advantage is the elimination of the bottleneck that is the central VPN server. We don't send traffic through it, so we don't overload its processors and we don't use up its Internet bandwidth. Our focal point, the Coordination Manager, does not carry any traffic, so it has much lower hardware requirements than a central VPN server. It still needs to be exposed to the Internet, but it doesn't need to be at any of our locations. The configuration of all overlay VPN clients is completely independent of where they connect from. There is no need for a public IP, no firewall rules, no port forwarding. Nothing, it just works.
An interesting option in Tailscale is the exit node. Literally by checking one box, we allow other network members to use our computer or our firewall as an exit point to the world. Very convenient and simple.
The advantage, and at the same time the disadvantage, of an overlay VPN is again the Coordination Manager. We have the option to use the provider's hosted service or to use a self-hosted version. In the case of NetBird and ZeroTier this can be done, but Tailscale does not provide a self-hosted version, although you can use its equivalent, Headscale, which exists and works quite well. Why do I mention this? Because if we host the coordination server ourselves, then basically everything is fine, since we keep everything in our own hands.
However, I will admit that using a provider is a very tempting option here, especially since all the solutions in question offer free plans. In NetBird and Tailscale, even up to 100 devices. That's all well and good, but why am I bringing this up among the disadvantages? Let's assume that someone takes control of our overlay VPN provider. I really wouldn't treat this as science fiction. We have numerous examples of attackers taking control of security providers' servers and, as a result, taking control of customers' devices. In practice, at that point the managing provider, or rather whoever took control of it, decides which computers appear on our network, a network that is treated as trusted and secure by default. It also controls the firewall rule settings on that network. According to the maxim that control is a higher form of trust, this should lead us to the only valid conclusion: in practice we should treat such networks as a DMZ at best, and absolutely not as a safe zone.
The obligation to encrypt and filter all traffic remains in effect. The principle of limited trust applies. As of now, September 2024, few devices or systems natively support overlay VPNs. There are client versions available for Windows, macOS and Linux. By contrast, you can find built-in OpenVPN support even in the simplest network devices: routers, network drives such as Synology or TrueNAS, or the smallest IoT devices. Often these can be configured and simply work. The case is different for overlay VPN clients. Here you have to make do with routers that support an overlay VPN, with all the advantages and disadvantages of this (for example, pfSense supports Tailscale), or create additional instances, such as containers, that act as traffic relays.
An interesting solution is the Tailscale add-on for pfSense. It is very simple and intuitive, and the configuration is limited to installing it and entering the key. That's all. But be warned, there are drawbacks. If you have any experience with network management, this part will be severely unintuitive. If you create a site-to-site tunnel with Tailscale and pfSense, you will find that traffic to the other location is NATed through our firewall. The result, in the other location, is that the traffic looks as if it is coming from our firewall. In practice this is a big problem if you want to stick to good practices and filter traffic, because you don't know whether it's coming from your firewall or from the phone of someone in one of your offices.
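Why the NAT makes filtering hard can be shown with a toy model in Python. This is not how pfSense or Tailscale implement it; the `Packet` type and all the addresses are hypothetical, and the function only mimics the visible effect of source NAT: the original sender's address is replaced with the firewall's address.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str  # source IP address
    dst: str  # destination IP address


def source_nat(packet, firewall_ip):
    """Rewrite the source address so the packet appears to come from the firewall."""
    return replace(packet, src=firewall_ip)


FIREWALL_VPN_IP = "100.64.0.1"  # hypothetical address of our firewall in the VPN

# Two different senders in our office, same server in the other location.
from_phone = Packet(src="192.168.10.25", dst="10.20.0.5")
from_admin_pc = Packet(src="192.168.10.7", dst="10.20.0.5")

seen_remotely = [source_nat(p, FIREWALL_VPN_IP) for p in (from_phone, from_admin_pc)]
for p in seen_remotely:
    print(p.src, "->", p.dst)
```

Both packets arrive with the same source address, so a rule in the other location that wants to allow the admin PC but block the phone has nothing left to match on.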
Remember the exit node option I mentioned? Well, that's exactly what you need to keep in mind here too. If we allow our firewall to be used as an exit node, we allow others to go out into the world appearing as our firewall, without any special control. This is not always a good thing.