26 Sep 2021 - tsp
Last update 26 Sep 2021
Before digging into Jabber/XMPP and what exactly it is, let’s first recall the problem that one usually wants to solve: chat and instant messaging. The two terms are usually mixed up - and to be honest chat is a subset of instant messaging.
Chat is basically a service that allows instantaneous exchange of text messages between participants as long as they’re online. Chat tools are usually rather lightweight, like the old Unix talk utility from the early 1980s, a direct successor of the chat facilities on systems such as PLATO. It allowed sending a text message directly into the terminal of a user on another networked machine. If that user was currently logged in, they could read the message immediately and use talk to send a text message back. The Netiquette guidelines specific to talk are still valid today - also for instant messaging applications.
Chat applications usually come in two flavors: chat between two online users or chat within a group. For the latter a dedicated chat protocol provides the best solution (way better than any other solution such as web interfaces).
As internet connections became more and more widely available, companies like AOL and Mirabilis took the concept and built applications like the AOL Instant Messenger and the well known ICQ in the second half of the 1990s. These applications provided instant messaging services: as long as the presence status of a user was any of the online states they received messages instantaneously; if their machine was offline, a centralized server cached the messages and delivered them later on. In contrast to traditional chat applications, instant messaging thus also provided presence status information as well as server side storage of messages in case the recipient is offline.
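The store-and-forward behaviour described above can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the class `ToyMessagingServer` and its method names are hypothetical and do not correspond to the protocol of ICQ, AIM or any other real service.

```python
from collections import defaultdict, deque


class ToyMessagingServer:
    """Toy model of a central server with presence and offline storage.

    Hypothetical illustration only, not the protocol of any real service.
    """

    def __init__(self):
        self.online = set()                      # users currently online
        self.inboxes = defaultdict(list)         # messages that reached a client
        self.offline_queue = defaultdict(deque)  # messages cached by the server

    def set_presence(self, user, is_online):
        if is_online:
            self.online.add(user)
            # Deliver everything that was cached while the user was away,
            # oldest message first.
            while self.offline_queue[user]:
                self.inboxes[user].append(self.offline_queue[user].popleft())
        else:
            self.online.discard(user)

    def send(self, sender, recipient, text):
        message = (sender, text)
        if recipient in self.online:
            self.inboxes[recipient].append(message)
            return "delivered"
        # Recipient is offline: cache the message for later delivery.
        self.offline_queue[recipient].append(message)
        return "stored"
```

A message sent to an offline user is reported as "stored" and only shows up in the inbox once that user's presence changes to online.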
The main difference between today’s instant messaging services is their service topology. There are three main messaging topologies: centralized, federated and totally decentralized.
The services mostly in use by common people are - unfortunately - centralized services such as WhatsApp, Telegram, Facebook’s Messenger, ICQ, Signal, etc. These services have a single provider that runs the server infrastructure, and in most cases there is even only a single provider for the client applications and web interfaces. A notable exception was the ICQ network, for which there were numerous alternative clients; these usually supported many different messaging protocols, so one only had to use a single client for multiple networks.
The main drawbacks of a centralized service are:
Advantages of a centralized approach:
In a federated approach - which works exactly as E-Mail does - anyone can operate their own server. Servers exchange information whenever necessary and users only communicate with their own server. If a user wants to talk to a user on a different server, they transmit the message to their own server, which then forwards it to the destination server. This is the approach that’s also taken by Jabber/XMPP.
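To make the forwarding step more tangible, here is a minimal sketch of federated delivery. It is an assumption-laden toy: `ToyServer`, the `network` dictionary (which stands in for DNS lookups and server-to-server connections) and the example domains are all made up for illustration and are not how a real XMPP server is implemented.

```python
class ToyServer:
    """Toy federated server; clients only ever talk to their own server."""

    def __init__(self, domain, network):
        self.domain = domain
        self.network = network  # hypothetical stand-in for DNS + server links
        self.mailboxes = {}     # localpart -> list of (sender, text)
        network[domain] = self

    def register(self, localpart):
        self.mailboxes[localpart] = []

    def submit(self, sender, recipient, text):
        """Called by a local client to send a message to user@domain."""
        localpart, _, domain = recipient.partition("@")
        if domain == self.domain:
            # Recipient lives on this server, deliver locally.
            return self._deliver(sender, localpart, text)
        remote = self.network.get(domain)
        if remote is None:
            return "unknown domain"
        # Server-to-server hop: forward to the server responsible for domain.
        return remote._deliver(sender, localpart, text)

    def _deliver(self, sender, localpart, text):
        if localpart not in self.mailboxes:
            return "unknown user"
        self.mailboxes[localpart].append((sender, text))
        return "delivered"
```

The sending client never contacts the remote server itself; only the two servers talk to each other, just as with E-Mail.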
The main advantage is:
The main disadvantage:
The best approach in theory is a totally decentralized one. In this case there wouldn’t be any servers run by anyone. Every client would join a peer-to-peer overlay network (for example Pastry or Kademlia - networks that are also used by peer-to-peer file sharing solutions) and forward as well as store messages for other nodes in a statistical fashion. As of today there is no established messaging network working on these principles since it’s really hard to develop a distributed, fault tolerant and manipulation resilient system. The main gain would be that no one needs to operate a server, so no one could be forced to take the service down or to perform some kind of manipulation. Instead everyone would share their own Internet connection to keep the system up: as long as a proportion of users stays online and reachable from the outside world all the time, the network would continue to operate. Combined with metadata anonymization services such as Tor, such a network would provide a nearly uncontrollable, stable and resilient messaging network. The main disadvantage is of course that such a network requires enough nodes that are externally reachable - in a world that is built more and more (instead of less and less) around network address translation that poses somewhat of a problem.
Jabber/XMPP is one of the older instant messaging protocols built around the federated approach. It dates back to the late 1990s - but is nonetheless a modern messaging protocol. It’s built around the concept of XML data streams, so all messages are human readable. It has also been deployed by a myriad of different messaging services (even WhatsApp seems to be built around a variant using some proprietary stream compression), and many services used to be capable of federating via XMPP, such as Google’s Talk/Hangouts network and even Facebook’s Messenger - unfortunately they have since converted their networks to closed ones.
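To give an impression of what "human readable XML" means here, the following sketch builds a simple chat message element with Python's standard library. It is simplified on purpose: the addresses are made-up examples, the helper `build_message` is hypothetical, and the surrounding stream as well as the XML namespace that a real connection declares are left out.

```python
import xml.etree.ElementTree as ET


def build_message(sender, recipient, text):
    """Build a simplified XMPP-style chat message as an XML string."""
    message = ET.Element(
        "message", {"from": sender, "to": recipient, "type": "chat"}
    )
    body = ET.SubElement(message, "body")
    body.text = text  # ElementTree escapes special characters on output
    return ET.tostring(message, encoding="unicode")


# Example addresses are placeholders, not real accounts.
stanza = build_message("alice@a.example", "bob@b.example", "Hello Bob")
print(stanza)
```

The printed result is a single `message` element with the text inside a `body` child, which is roughly what travels over the wire for a plain chat message.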
Since XMPP is federated it’s built around servers. Users are identified by addresses that look somewhat like E-Mail addresses (i.e. user@domain). The domain part identifies the server that the user’s account is located on - for example the account user@domain would be located on a server found via a DNS lookup for domain. XMPP supports a variety of ways of locating the real hostname and IP address of the specific server - DNS SRV records being the most common one, which is totally transparent for the user.
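As a rough sketch of this addressing scheme, the following Python snippet splits an address into its parts and derives the DNS names under which SRV records for the domain are usually published. The helper names `parse_jid` and `srv_query_names` are my own, `example.org` is a placeholder, validation is reduced to a minimum, and the actual DNS query is not performed.

```python
from typing import NamedTuple, Optional


class Jid(NamedTuple):
    localpart: Optional[str]  # the "user" in user@domain
    domain: str
    resource: Optional[str]   # optional client identifier after "/"


def parse_jid(jid: str) -> Jid:
    """Split user@domain/resource into its parts (minimal validation only)."""
    bare, _, resource = jid.partition("/")
    if "@" in bare:
        localpart, _, domain = bare.partition("@")
    else:
        localpart, domain = None, bare
    if not domain:
        raise ValueError("address has no domain part")
    return Jid(localpart or None, domain, resource or None)


def srv_query_names(domain: str) -> dict:
    """Names to query for SRV records: clients and other servers differ."""
    return {
        "client": f"_xmpp-client._tcp.{domain}",
        "server": f"_xmpp-server._tcp.{domain}",
    }


jid = parse_jid("user@example.org/laptop")
print(jid.domain, srv_query_names(jid.domain)["client"])
```

A client would resolve the "client" name to find host and port of its server, while servers use the "server" name when forwarding messages to another domain.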
Besides simple text message exchange XMPP offers:
These extensions are of course optional - all useful clients support at least presence notification, offline messages, server side rosters (i.e. storing contact lists on the server), multi user chats, transport encryption and usually file transfers if not running on mobile clients.
The message payload is usually only text/plain without formatting - some clients do support formatting by simply transmitting HTML snippets inside the messages. This usually works pretty well, but as soon as one uses cryptography layers such as OTR or OMEMO one should refrain from using formatted text, since a lot of heuristics are then involved in detecting whether a message is formatted or not.
XMPP itself only offers transport encryption. Transport encryption means that messages are encrypted on their route between the client and the server - but the server has full access to the messages. This is in contrast to end-to-end encryption, in which one doesn’t have to trust the server either. Luckily there are a number of encryption mechanisms available on top of XMPP - but they usually have some minor drawbacks such as a lack of multi-client support (i.e. not being able to run multiple clients at the same time on the same account - for example on the desktop and on a mobile device).
Off-the-Record messaging (OTR) is the most commonly used cryptography layer on top of XMPP. It is - of course - totally independent of the instant messaging system in use and could also be used over any other network.
It basically offers:
As already mentioned session management for OTR has to be done manually. Since OTR requires both sides of a private messaging session to participate in challenge response mechanisms, this only works while both sides are actively online or are at least storing state. This is also the largest problem with OTR in day to day use. People usually forget to end the session before they exit their messenger clients or shut down their machines; any further message is then sent into the void since no one knows the encryption keys any more. Clients usually also silently drop messages without correct authentication, since notifying the user would open up a path for denial of service attacks. So one really has to follow a strict procedure:
There is one major drawback: OTR does not support group chats.
The OpenPGP encryption system is well known from E-Mail - and is in fact currently the only useful and secure cryptography system for E-Mail that’s in place and in use, since S/MIME has been totally cracked. On the other hand OpenPGP has not been designed for chat systems. There is an XMPP extension protocol that allows one to use OpenPGP over XMPP - but to my knowledge there is no client out there that really implements it.
OMEMO has been designed as a successor of OTR. It’s based on the same double ratchet system that’s also used in Signal and some other messengers. It’s pretty well designed, though there have been a number of possible cryptographic attacks on the protocol. It’s not as actively developed as OTR and not as widely used with XMPP, though it would offer some more advanced features:
Unfortunately the support in some clients is rather buggy or even more cumbersome than OTR, so currently it’s usually not a simple choice to make.
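The double ratchet that OMEMO is based on combines a Diffie-Hellman ratchet with a symmetric key chain. The following sketch shows only the symmetric half, i.e. how a fresh key per message can be derived from a shared chain key. It is a simplified illustration and not OMEMO’s actual implementation: the function names are hypothetical, the constants 0x01 and 0x02 are an assumption modelled on common descriptions of the double ratchet, and the Diffie-Hellman step as well as the actual message encryption are omitted.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes):
    """Advance the chain by one message.

    Returns (next_chain_key, message_key). Both are derived one-way via
    HMAC-SHA256, so an old chain key cannot be recomputed from a newer one.
    """
    # Assumed constants: 0x01 for the message key, 0x02 for the next chain key
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


def derive_message_keys(initial_chain_key: bytes, count: int):
    """Derive the first `count` message keys of a chain."""
    keys = []
    chain_key = initial_chain_key
    for _ in range(count):
        chain_key, message_key = ratchet_step(chain_key)
        keys.append(message_key)
    return keys


# Both sides start from the same shared secret (a demo value here) and
# therefore derive the same sequence of per-message keys.
shared_secret = hashlib.sha256(b"demo shared secret").digest()
sender_keys = derive_message_keys(shared_secret, 3)
receiver_keys = derive_message_keys(shared_secret, 3)
print(sender_keys == receiver_keys)
```

Because every message uses its own key and old chain keys are discarded, a key that leaks later does not reveal earlier messages of the same chain.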
Pidgin is the messaging client that I’m personally using most. It offers multi protocol support - though I’m only using XMPP as of today. It’s robust, offers voice and video on all supported platforms except Microsoft Windows, and runs on a huge number of platforms including Windows, Linux, the BSDs, macOS, etc.
OTR is implemented via an external plugin that has to be installed separately.
On FreeBSD Pidgin is available in the
net-im/pidgin package, the OTR
plugin can be found in
Jitsi has been developed as an alternative to existing voice and video solutions, using XMPP and the Jingle extension. It was one of the first clients supporting encrypted video chat, using ZRTP for the key agreement on the video and voice streams. It’s supported on many desktop and mobile platforms. Unfortunately development has since focused more on the WebRTC based conferencing solution Jitsi Meet - a nice alternative for group video conferences - and the client is somewhat unstable.
The profanity client will not be of interest for most people. It’s a command line client that is useful on systems without a graphical user interface. It works rock solid but doesn’t have support for off the record (OTR) messaging.
On Android, Xabber is the client that I’m using. It has built-in support for OTR, multi account support and just works in a stable fashion.
Another Android client is Conversations. This client supports voice and video calls using Jingle but doesn’t support OTR any more.