As Elon Musk drives Twitter users to Mastodon, its underlying architecture can overload content-provider networks
By now you’ve probably heard about Mastodon, the open-source microblogging platform that’s been gaining popularity since Elon Musk took over Twitter.
A major feature of the platform is its decentralized, distributed architecture, which provides resilience, but a downside is that it can cause congestion and increase latency for the unprepared.
Here’s how Mastodon works. Its servers (instances) operate semi-independently of each other, and users register with servers geared toward communities that interest them. But users can follow and interact with others from across the Fediverse, which includes users hosted on other Mastodon instances as well as on other services that use the open ActivityPub protocol from the World Wide Web Consortium.
Active users of Mastodon nearly doubled between Oct. 27 and Nov. 6, according to the company’s CEO Eugen Rochko, causing some growing pains. The distributed nature of Mastodon and ActivityPub has strengths in terms of keeping the service community-driven at both the instance and Fediverse levels, but some users are starting to notice warts here and there that seem related to that architecture.
Decentralization: Robust, not necessarily efficient
One constant with distributed systems is that each instance has to share some subset of its data. In the case of Mastodon, much of this revolves around followers. If user A on one Mastodon instance follows user B on a different instance, the second instance needs to know which instance to notify when user B posts. Because the first instance is notified about a new post by user B, user A and other users on that instance can efficiently view that post in their federated feed or even receive a notification, even though the post originated on another instance.
This federation ultimately means that each new post can trigger synchronization among multiple Mastodon instances, depending on who follows the user. As new Mastodon instances are stood up and the complexity of user networks increases, the resulting traffic from user posts will continue to climb.
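To make the fan-out concrete, here is a minimal Python sketch of how a post's delivery list might be derived from a follower list. It is an illustration only, not Mastodon's actual implementation: the function name delivery_targets, the user@host handle format, and the example hostnames are all assumptions made for this example.

```python
from collections import defaultdict


def delivery_targets(author: str, followers: list[str]) -> dict[str, list[str]]:
    """Group an author's remote followers by the instance that hosts them.

    Handles are assumed to look like "user@host". Followers on the
    author's own instance need no cross-instance notification.
    """
    author_host = author.rsplit("@", 1)[1]
    targets: dict[str, list[str]] = defaultdict(list)
    for handle in followers:
        user, host = handle.rsplit("@", 1)
        if host != author_host:
            targets[host].append(user)
    return dict(targets)


# Hypothetical follower list for user B on home.example.
followers = ["a@one.example", "c@one.example", "d@two.example", "e@home.example"]
targets = delivery_targets("b@home.example", followers)
print(f"One post triggers notifications to {len(targets)} remote instances")
```

The number of remote instances to notify tracks the number of distinct hosts among the followers, not the raw follower count, which is why traffic grows as followers spread across more instances.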
A similar effort is undertaken when a user migrates their account from one instance to another. The instance hosting the user must notify instances following the user of the move and must provide a list of followers to the receiving instance. This process also involves the Mastodon instances renegotiating the authentication link between user and follower. Because each Mastodon instance is scaled differently, in terms of both server configuration (hardware and software) and user count, a migration can take days or even weeks. While users are in limbo, their service capability is degraded.
These potential network-traffic concerns really only impact those hosting a Mastodon instance, which is admittedly a small subset of IT pros. But that doesn’t mean that corporate admins don’t have a stake in the game. Industry legend Jamie Zawinski, one of the early developers of the Netscape browser, noted this week that his blog has repeatedly been knocked offline immediately after he posted to his Mastodon profile.
Following some investigation, Zawinski attributes this behavior to a rapid ramp-up in traffic from multiple Mastodon instances all attempting to hit the blog post simultaneously. Other users have noted similar issues, specifically that each Mastodon instance hits the URL in order to retrieve a preview image and the page title to display as part of the post.
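A rough back-of-the-envelope model shows why this adds up quickly. The Python sketch below is a simplification with made-up numbers: the instance count, the two fetches per instance (one for the page, one for the preview image), and the 60-second window are all assumptions chosen for illustration, not measurements from Zawinski’s site.

```python
def preview_burst(instances: int,
                  fetches_per_instance: int = 2,
                  window_seconds: float = 60.0) -> tuple[int, float]:
    """Estimate the system-generated load from link-preview fetches.

    Returns the total number of requests and the average request rate
    (requests per second), assuming every instance that receives the
    post fetches the link within the same time window.
    """
    total = instances * fetches_per_instance
    return total, total / window_seconds


# Hypothetical case: a post federates to 3,000 instances.
total, rate = preview_burst(3000)
print(f"{total} requests, about {rate:.0f} per second before any human clicks")
```

The point of the sketch is that this load is driven by how many instances receive the post, so it arrives whether or not any readers actually follow the link.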
Protecting your content
Content providers are the obvious niche that should be the most concerned about these findings. If you manage a site or service that drives shares and social-media interaction, your infrastructure could be affected by Mastodon’s underlying architecture. Going viral is great, but if your system can’t handle the impact of the additional user traffic plus system-generated hits, the additional attention may not be of the revenue-generating variety.
Best practices are the ultimate answer for making sure your service remains stable, and using monitoring tools to track performance and utilization is an important first step. Without the ability to identify the source of traffic causing spikes in bandwidth utilization, it’s hard to react appropriately. Similarly, employing a content distribution network or a caching capability will help mitigate the impact of network-traffic spikes. Planning for automated elasticity in the form of a cloud-based application platform or a containerized app infrastructure that can scale up or down dynamically may also be required to fully handle Mastodon’s increasing scale.
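As a starting point for identifying the source of a spike, the Python sketch below tallies web-server log entries by User-Agent. It rests on two assumptions that should be checked against your own logs: that the server writes the common "combined" access-log format, and that Mastodon servers identify themselves with a User-Agent string containing "Mastodon/". The sample log lines, addresses, and hostnames are invented for the example.

```python
import re
from collections import Counter

# In the combined log format, the final quoted field is the User-Agent.
LOG_LINE = re.compile(
    r'"(?P<request>[^"]*)" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def classify(agent: str) -> str:
    """Label a request by its User-Agent (assumed marker: 'Mastodon/')."""
    return "mastodon" if "mastodon/" in agent.lower() else "other"


def count_sources(lines) -> Counter:
    """Count requests per traffic source, skipping unparseable lines."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if match:
            counts[classify(match.group("agent"))] += 1
    return counts


# Invented sample entries for illustration.
sample = [
    '203.0.113.5 - - [10/Nov/2022:10:00:01 +0000] "GET /post HTTP/1.1" 200 5120 "-" "http.rb/5.1.0 (Mastodon/4.0.2; +https://one.example/)"',
    '203.0.113.9 - - [10/Nov/2022:10:00:01 +0000] "GET /post HTTP/1.1" 200 5120 "-" "http.rb/5.1.0 (Mastodon/4.0.2; +https://two.example/)"',
    '198.51.100.7 - - [10/Nov/2022:10:00:02 +0000] "GET /post HTTP/1.1" 200 5120 "https://example.com/" "Mozilla/5.0"',
]
print(count_sources(sample))
```

If system-generated fetches turn out to dominate a spike, that is a signal that caching the affected pages and preview images is likely to help more than adding capacity for human readers.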