Threat modeling Meta, the fediverse, and privacy (DRAFT)

Last major update July 7. See the update log at the bottom for updates. Thanks everybody for feedback on earlier versions!

One very important piece of feedback I haven’t yet incorporated is that it needs a summary … very true. In the current version, a combination of the Today’s fediverse is prototyping at scale and Charting a path forward sections is the best approximation of a summary – assuming you already know how little privacy there is in most of the fediverse. If that’s news to you, the first section (There’s very little privacy on the fediverse today. But it doesn’t have to be that way!) is the place to start.

DRAFT! Work in Progress!
Feedback welcome!

Note to people using assistive technologies: the images aren’t final yet, and all the information in them is in the text of the article as well, so I’m holding off on adding detailed alt-text until they’re finalized. But I will!

Contents:

There’s very little privacy on the fediverse today. But it doesn’t have to be that way!
Today’s fediverse is prototyping at scale
Threat modeling 101
They can’t scrape it if they can’t fetch it
Different kinds of mitigations
Attack surface reduction and privacy by default
Scraping’s far from the only attack to consider
Win/win “monetization” partnerships, threat or menace?
A quick note to instance admins
Charting a path forward
Recommendations – If you’re short on time and just want to know how you can help improve privacy and safety in the fediverse, there are suggestions here for developers, instance admins, journalists, hosting companies, funders, businesses, civil society organizations and anybody who’s active in the fediverse today.

There’s very little privacy on the fediverse today. But it doesn’t have to be that way!

Meta’s potential arrival may well catalyze a lot of positive changes in the fediverse. And changes are certainly needed!

– from In chaos there is opportunity!

As the discussions of Meta’s new fediverse-compatible Twitter competitor Threads highlight, privacy and other aspects of safety are some of those areas where positive changes are certainly needed in the interconnected web of decentralized social networks known as the “

Let’s briefly discuss the other three threats we touched on earlier.

At least on Mastodon, there’s no way to disable RSS feeds. Hometown and GoToSocial, by contrast, take a “privacy by default” approach of turning them off unless people choose to enable them. Other fediverse software should follow their lead. An attack surface reduction approach of reducing the amount of data that’s available via RSS feeds is a complementary approach. For example, unlisted posts are currently in the RSS feed; removing them (or introducing a setting that defaults to leaving them out) is an easy improvement. And there may well be other opportunities.

Recommendations:

Provide control for individuals and sites to determine whether RSS feeds are available, and investigate options to reduce information in RSS feeds
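To make the RSS suggestion a bit more concrete, here’s a minimal sketch of what “off by default, and no unlisted posts unless somebody opts in” could look like. This is not how Mastodon, Hometown, or GoToSocial actually implement their feeds; the Status class, the feed_items function, and both settings are names I made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Status:
    text: str
    visibility: str  # "public", "unlisted", "private", or "direct"


def feed_items(statuses, feed_enabled=False, include_unlisted=False):
    """Return the statuses that may appear in an account's RSS feed.

    Both settings are hypothetical and default to the most private choice:
    no feed at all, and no unlisted posts even when the feed is enabled.
    """
    if not feed_enabled:
        return []
    allowed = {"public"}
    if include_unlisted:
        allowed.add("unlisted")
    return [s for s in statuses if s.visibility in allowed]


posts = [
    Status("hello world", "public"),
    Status("quiet update", "unlisted"),
    Status("followers only", "private"),
]
print([s.text for s in feed_items(posts, feed_enabled=True)])
```

The point of the sketch is the defaults: somebody who never touches the settings exposes nothing via RSS, and each step toward more exposure is an explicit choice.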

For API access, the existing “Authorized Fetch” (also sometimes called “Secure Fetch”) mechanism prevents API access from blocked instances; What does AUTHORIZED_FETCH actually do? is a good overview. Fediverse instances that prioritize safety already have it turned on because it reduces dogpiling and other kinds of harassment; others should turn it on ASAP. And in another good example of “privacy by default”, software platforms and hosting companies should make it the default going forward.

Combining Authorized Fetch with shared up-to-date blocklists that include all of Meta’s instances could prevent Meta-run instances from directly accessing data via an API. But Meta’s decentralized approach could well mean that their instances are hosted under lots of different domains,⁵ so it’s not clear what infrastructure will be needed to do it effectively – or if it’s even feasible. Most fediverse software can be run in “allow-list” federation mode; on Mastodon, for example, it’s the LIMITED_FEDERATION_MODE setting.

However, relatively few instances take this privacy-friendly approach today; indeed, Mastodon’s documentation misleadingly describes it as “contrary to Mastodon’s mission of decentralization.” To date, relatively little work has been done developing processes and tooling for making allow-list approaches work well, or looking at other alternatives besides pure blocklist or allow-list approaches. So this is an area where investigation is needed … and there are a lot of interesting complementary possibilities here. For example:

Peertube’s manual approval of federation requests (similar to the “approval first” mode described by vitunvuohi here) and ability to follow a list of instances to approve
the “letters of introduction” Erin Shepherd describes in A better moderation system is possible for the social web, applied to federation instead of individual following
Darius Kazemi’s “threaderation” (which is a bit more restrictive than the current “silencing” approach)
ophiocephalic’s proposal of “fedifams”, a family or alliance of instances where admins could deliberate together on blocklist/allow-list decisions, and a broader moderation council with a representative from multiple fedifams
Kat Marchán’s suggestion of “caracoles”, concentric federations of instances that have all agreed to federate with each other, with smaller caracoles able to vote to federate with entire other caracoles

As mentioned above, fedifams and caracoles could also be potential post visibility boundaries, providing more alternatives to fully public posts.
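Here’s a rough sketch of why the blocklist vs. allow-list distinction matters so much if Meta’s instances wind up on lots of different domains. It’s an illustration under my own assumptions, not the logic any fediverse software actually uses: the function names, the mode strings, and the .example domains are all hypothetical.

```python
def domain_matches(domain, patterns):
    """True if domain equals one of the patterns or is a subdomain of one."""
    domain = domain.lower().rstrip(".")
    return any(domain == p or domain.endswith("." + p) for p in patterns)


def may_federate(domain, mode, blocklist=frozenset(), allowlist=frozenset()):
    """Decide whether to federate with a domain under a given policy.

    mode is "blocklist" (federate unless listed) or "allowlist"
    (federate only if listed). Both names are made up for this sketch.
    """
    if mode == "allowlist":
        return domain_matches(domain, allowlist)
    if mode == "blocklist":
        return not domain_matches(domain, blocklist)
    raise ValueError(f"unknown mode: {mode}")


blocked = {"meta.example"}
trusted = {"friendly.example"}

# A known Meta domain (or a subdomain of it) is caught by the blocklist...
print(may_federate("fans.meta.example", "blocklist", blocklist=blocked))
# ...but a Meta-run instance on a domain nobody has listed yet is not.
print(may_federate("celebrity-community.example", "blocklist", blocklist=blocked))
# In allow-list mode, the unknown domain is refused until somebody approves it.
print(may_federate("celebrity-community.example", "allowlist", allowlist=trusted))
```

A blocklist only protects against domains somebody has already identified, while an allow-list fails closed. That’s the privacy advantage, and also where the extra work for admins comes from.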

Apps can also get data via APIs. Suppose Threads lets people log in to their accounts on other instances as well. If I’ve blocked Meta, but somebody I’m friends with in the fediverse uses Meta’s app to log into their account and view my status, could the data get to Meta? Without doing a full analysis, my guess is probably yes. If so, what are the countermeasures – for example, allow-lists for apps?

Recommendations:

Enable “Authorized Fetch” on current sites, and make it the default setting in software releases and at hosting sites.
Investigate shifting to allow-list federation, and look at alternative approaches like “approval first” federation.
Investigate tooling for and feasibility of a blocklist of Meta’s instances if there are a huge number of them hosted on different domains.
Look at potential app-based dataflows and potential counter-measures.

Win/win “monetization” partnerships, threat or menace?

The indirect approach, where Meta enlists instances that federate with them to harvest (and “monetize”) data from people and instances who don’t, opens up new cans of worms. If Meta really does pursue a decentralized strategy (still a big open question) this is new ground in a lot of ways, and a lot will depend on the implementation, so right now this section has a lot more questions than answers. They’re very important questions, though, because the answers to these will have a big impact on how people in the different regions of the fediverse will be able to interact.

If I’m on an instance that blocks a Meta instance, you’re on an instance that federates with them, and somebody on a Meta instance is following you, what happens when you boost, quote boost, favorite, or reply to my status? As Mastodon Migration’s post (based on input from Calckey maintainer Kainoa as well as infosec.exchange admin and security expert Jerry) shows, it’s complex, and the answer is different for different software (and depends on whether instances are running Authorized Fetch). So this is a place where detailed analysis is needed.

And suppose instances that federate with Meta decide to take advantage of Meta’s services to recommend content (and/or target and serve ads)? Even if I’ve blocked Meta, if you’re following me from an instance federated with Meta, the software might send my data to them to help recommend better content for you (and/or better target you) … and once they’ve got it for one purpose, who knows what they’ll use it for. Is there a way to prevent that from happening?

Meta’s implementation, and the legal agreements they put in place for instances that federate with them, are both wildcards at this point. It’ll take a while to analyze the implementation even in the very unlikely event they open-source everything and provide good architecture and design documents, so the best course of action for now is to take a “privacy by default” approach here of transitive defederation: defederating from Meta and from any instances that federate with Meta directly or indirectly.

That said, this is a very blunt hammer. Are other approaches possible? For example, suppose there was a way for an instance to say “I’ll only federate with a site if their privacy statement legally commits them to not sharing any data they receive from my instance with Meta, or with any instance that will share it with Meta” – and the software provided the functionality that could make that commitment real? Is there a role for “bridge” instances, not federating with Meta and somehow allowing people from both “free fediverse” and Meta-friendly instances to communicate while limiting the data that could flow back to Meta?

Recommendation: initially adopt the “privacy by default” approach of transitive defederation to protect against indirect data flows via Meta-federating instances while analyzing implementation and policies and investigating more flexible alternate approaches
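For anybody wondering what “directly or indirectly” means in practice, here’s a minimal sketch that treats federation as a graph and walks outward from Meta. It assumes you somehow have a map of which instances federate with which (no such authoritative map exists today, as far as I know), and the function name, the max_hops parameter, and the .example domains are all hypothetical.

```python
from collections import deque


def transitive_blocklist(peers, roots, max_hops=None):
    """Return the roots plus every instance that federates with them,
    directly or indirectly.

    peers maps an instance to the set of instances it federates with.
    max_hops=1 keeps only direct federators; None follows links without limit.
    """
    # Treat federation as symmetric: build an undirected adjacency map.
    graph = {}
    for a, others in peers.items():
        for b in others:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)

    blocked = set(roots)
    queue = deque((root, 0) for root in roots)
    while queue:
        node, hops = queue.popleft()
        if max_hops is not None and hops >= max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in blocked:
                blocked.add(neighbor)
                queue.append((neighbor, hops + 1))
    return blocked


federation = {
    "meta.example": {"a.example"},
    "a.example": {"b.example"},
    "c.example": {"d.example"},
}
print(sorted(transitive_blocklist(federation, {"meta.example"})))
print(sorted(transitive_blocklist(federation, {"meta.example"}, max_hops=1)))
```

With no hop limit, the result is everything connected to Meta through any chain of federation, which shows how quickly the list can grow. Limiting to one hop (Meta plus instances that federate with Meta directly) is a narrower variant that’s easier to maintain but leaves more indirect paths open.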

A quick note to instance admins

Now’s a crucial time for instance admins to think about their goals and responsibilities. Are you committed to providing a private and safe experience for people in your community? If so, then show your commitment by shifting your thinking to “privacy by default”, and making sure your configuration settings (and software choices) reflect that. Of course, no principle is absolute, so there may well be tradeoffs; and these changes can’t necessarily be made overnight. Still, if this is the direction you want to go, taking some initial meaningful steps and announcing a plan and timeframe is a good way to demonstrate that you really mean it.

If that’s not the direction you want to go, well, that’s also a good thing to let people in your community know so they can make a good decision about whether it’s the right instance for them.

Threat modeling’s especially important for admins who are considering federating with Meta. For one thing, one of the arguments in favor of federation at the instance level is that people on the instance have the “agency” of defederating themselves. But will that defederation actually protect them, or still leave them at risk? It’s a hard question to answer until we see the actual implementation, but based on what we know today assuming that’ll be the case seems overly optimistic.

Not only that, if you can’t prevent some of the indirect attacks discussed in Win/win “monetization” partnerships, threat or menace?, then instances that don’t want anything to do with Meta will have to defederate from you as well – meaning that people on your instance can’t talk to their friends elsewhere in the Fediverse.

It’s also important to model some of the other threats that are beyond the scope of this article. For example, a lot of LGBTQ+ people are concerned that instances federating with Meta will put them at risk. It’s not enough to dismiss this threat by saying “we have tools to prevent this.” If you want to ensure people on your instance are safe – or avoid endangering people on other instances – more detailed analysis is needed.

And finally, if you’re an instance admin who’s advocating working with Meta as long as it doesn’t put people on your instance at risk, and you’re not currently taking steps to provide better privacy and safety to people on your instance, ask yourself why people should trust your evaluation of what does or doesn’t put people at risk.

Charting a path forward

One of the interesting things about the recommendations here is how many of them are straightforward: changing defaults, using and improving existing features, mainline Mastodon and instance admins adopting features like local-only posts that have long been implemented in forks (and other fediverse software). Most of what’s listed here helps protect against other bad actors besides Meta, and much of it helps with safety as well, so these improvements have been needed for quite a while. They just haven’t been prioritized.

It’s possible that Meta’s arrival will lead to the platforms that haven’t been taking privacy and other aspects of safety seriously to date changing their attitudes, and changing their priorities. Then again, Meta’s arrival will also require a lot of other work – scaling, support for whatever win/win monetization Meta’s offering, working on the next version of ActivityPub (which will surely need significant improvements to deal with anything on the scale of Threads), and ensuring that instances that want to federate can meet whatever other standards Meta requires. So it’s also possible that privacy will continue to take a back seat in mainline Mastodon and other platforms.⁶ If so, it’s a good time for forks that prioritize privacy and safety. And while I know less about other fediverse software, the same dynamics may play out there as well.

Recommendation: Create forks that prioritize privacy and safety if mainline Mastodon (and potentially other platforms) continues not to

Of course, even though many of these recommendations are straightforward, there are a lot of them – some of which (like moving to “privacy by default” and getting away from a federate-by-default approach) are a chunk o’ work. Not only that, several of the recommendations are to “investigate” potential approaches; and, not to sound like a broken record, this high-level discussion only scratches the surface and only for this one threat, so what’s really needed is a systematic threat modeling effort. And new implementations, designed with privacy and safety in mind, need resources as well.

All of that’s going to require funding.

As Afsaneh Rigot discusses in Design From the Margins, centering the marginalized people directly impacted by design decisions leads to products that are better for everybody. That’s especially important for threat modeling. So equity and diversity need to be a key consideration in terms of who gets funded to work on these projects – something that hasn’t historically happened on Mastodon.

The good news is that there are plenty of potential funding sources who see Meta as a threat, and some of them have decent budgets. Companies like Fastly and WordPress who are looking at the fediverse have a lot to lose if it winds up dominated by Meta and surveillance capitalism business models. Responsible companies and governments who are considering partnering with Meta have an interest in making sure that they don’t get co-opted into (potentially-illegal) data harvesting – or disinformation broadcasting. Many civil society organizations also see Meta as a huge threat. So do individuals, so crowdfunding’s an option as well.

Recommendation: Pursue funding from anti-surveillance-capitalism companies, civil society groups, and crowdfunding to allow more detailed analysis, design, and rapid implementation – and ensure that the funding is directed in a way that increases equity and diversity.

Who knows, maybe the fediverse as a whole isn’t ready for this yet, and it’ll continue to stay at the prototyping stage for a while more. Even if that happens, the kind of informal threat modeling I’ve done here can still be useful to instances that want to insulate themselves from Meta’s threats to the extent that they can, platforms that want to prioritize making improvements with their existing resources, and new implementations that want to do better.

With luck, though, Meta’s arrival will be the kick in the pants the fediverse needs to shift modes and start taking privacy and other aspects of safety (and equity and accessibility and usability …) seriously – and approach it in a way that also improves equity. The opportunity’s certainly there!

Appendix: Short-term recommendations for improving privacy of existing fediverse software

This section pulls together the recommendations scattered throughout the article, combining a few similar ones in the process and organizing them into categories.

July 7: currently in the midst of an update, so some minor differences between this and recommendations in the article still need to be synced up.

Anybody who’s active in the fediverse today

Ask your instance admin to provide better privacy and safety – you can point them to the Instance Admins recommendations below. If they say no, and don’t have good reasons, consider voting with your (virtual) feet and moving to another instance.
Let the developers of the software know that you care about privacy and safety and ask them to prioritize improving it.
If you’re running a single-person instance, see the recommendations for instance admins – including turning on Authorized Fetch (you may need to ask your hosting provider to do it for you).

Journalists

Reject the incorrect and misleading talking point that there are no privacy harms in instances federating because Meta can (supposedly) already access all the data on the fediverse. Even if it were true (which it’s not), as Esther Payne discusses in Consent and the fediverse, it wouldn’t be a good argument; and it takes today’s low bar as a given.
Highlight that until we know the details of Meta’s federation plans – including technical solutions and legal agreements – and how current fediverse software will evolve, instances planning on blocking Meta may well also have to do “transitive blocking” of all instances that federate with Meta in order to keep their community’s data from being gathered without consent.

Instance admins

“Privacy by default”: Set defaults so that people start with stronger privacy protections, and can then choose to give them up.
Initially adopt the “privacy by default” approach of transitive defederation to protect against indirect data flows via Meta-federating instances while analyzing implementation and policies and investigating more flexible alternate approaches.
Enable “Authorized Fetch” on your current instance, and make it the default setting in software releases and at hosting sites.
Consider shifting to allow-list federation or (if your platform supports it) other approaches such as “approval first” federation. If your platform doesn’t support it, ask its developers to add it – even if it’s not right for you, it can help other instances protect themselves.

Hosting sites

“Privacy by default”: Set defaults so that people start with stronger privacy protections, and can then choose to give them up. For example, make “Authorized Fetch” the default setting, and ensure documentation explains it well.

Developers and fediverse software projects

Note: these should be reordered!

Press the project leaders to prioritize privacy and safety in their planning. If they don’t, consider voting with your feet and creating a fork that prioritizes privacy and safety (as well as equity, usability, onboarding and all the other issues that need to be prioritized).
“Privacy by default”: Set defaults so that people start with stronger privacy protections, and can then choose to give them up. For example, make “Authorized Fetch” the default setting, and ensure documentation explains it well.
Provide a usable way for instances and individual users to protect all profiles, timelines, statuses, images, etc from anonymous access.
Reduce the number of public pages and the amount of data in them, including supporting private profiles; increasing use of non-public visibility like local-only posts; investigating new visibility levels like “within the same fedifam or caracol”, “viewable only to people logged into instances that don’t federate with Meta”, differentiating between what’s shown at different levels of visibility, and (potentially) requiring login to view unlisted statuses
Provide control for individuals and sites to determine whether RSS feeds are available, and implement an option to remove unlisted statuses from RSS feeds
Educate people on (and potentially package) existing processes and tools for IP blocking, integrate with suspend/limit federation-level processes and infrastructure, and potentially develop new tools and look for ways to address downsides.
Develop processes and tooling for blocking Meta instances and all instances that federate with them. For example, extend nodeinfo to have a field with an instance’s policy towards Meta (transitively block, block but federate with instances that federate with Meta domains, federate with Meta domains). Investigate solutions for blocklists of Meta and federating-with-Meta domains (taking into account that there are likely to be a huge number of them hosted on different domains), and for allow-lists for instances that don’t share data with Meta.
Investigate shifting to allow-list federation, and look at alternative approaches like “approval first” federation.
Look at potential app-based dataflows and potential counter-measures, possibly including allow- and block-lists for apps as well
Pursue funding from anti-surveillance-capitalism companies, civil society groups, governments, and crowdfunding to allow more detailed analysis, design, and rapid implementation of privacy and safety features – and ensure that the funding is directed in a way that increases equity and diversity.
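As a rough illustration of the nodeinfo suggestion in the list above, here’s a sketch of how software might read an instance’s declared policy towards Meta. No such field exists today: the metaFederationPolicy key and its three values are names I invented to mirror the three policies listed above, and I’m assuming the field would live in nodeinfo’s free-form metadata object.

```python
import json

# Hypothetical values mirroring the three policies listed above.
POLICIES = {"transitive_block", "block", "federate"}


def meta_policy(nodeinfo_json):
    """Return the instance's declared policy towards Meta.

    Returns None if the (hypothetical) field is absent or has a value
    this sketch does not recognize.
    """
    doc = json.loads(nodeinfo_json)
    metadata = doc.get("metadata")
    if not isinstance(metadata, dict):
        return None
    value = metadata.get("metaFederationPolicy")
    return value if value in POLICIES else None


def ok_for_transitive_blocker(nodeinfo_json):
    """An instance doing transitive blocking only accepts peers that
    declare the same policy; a missing declaration fails closed."""
    return meta_policy(nodeinfo_json) == "transitive_block"


sample = '{"version": "2.0", "metadata": {"metaFederationPolicy": "block"}}'
print(meta_policy(sample))
print(ok_for_transitive_blocker(sample))
```

Of course, a self-declared field is only as trustworthy as the instance publishing it, so something like this would complement blocklists and allow-lists rather than replace them.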

Business, government agencies, civil society organizations, and funders:

Fund (and contribute staff time to) forks and new projects that prioritize privacy and safety (as well as equity, usability, onboarding and all the other issues that need to be prioritized).
Staff and fund safety, security, and privacy work including threat modeling – and make sure the funding is distributed in ways that increase equity and diversity.

Notes

¹ Christine Lemmer-Webber (who co-authored the spec) says that from a security and social threat perspective, “the way ActivityPub is currently rolled out is under-prepared to protect its users.” In ActivityPub: The “Worse Is Better” Approach to Federated Social Networking, Ariadne Conill describes ActivityPub’s approach as prioritizing other concerns over safety, and the same’s just as true for privacy. ActivityPub’s weaknesses make it especially vulnerable to an “embrace and extend” attack where Meta introduces proprietary solutions that are genuine improvements over the standard.

² Dan Goodin’s How secure a Twitter replacement is Mastodon? Let us count the ways is a good overview of some of the issues and the low current bar, like last fall’s bug that let people steal passwords by injecting HTML and the configuration error that left private photographs from hundreds of instances open to the web. As I was writing this post, the admins of anarchist instance Kolektiva posted an alert that an unencrypted copy of their database had been seized by the FBI.

^2.1 An “instance”, also called a server, is a site running Mastodon, Misskey, Pixelfed, or any other fediverse software.

^2.2 Instance admins typically are responsible for configuring the software and setting the site’s policies. Many also install, maintain, and update the software, although hosting companies such as masto.host offer those services, so are attractive options for smaller instances or instances whose admins don’t have Linux sysadmin skills (or like me have basic skills but hate sysadmining).

^2.3 In the open-source world, a “fork” is a variant of a code base run as a separate project. Glitch-soc and Hometown, both of which support local-only posts, are two popular Mastodon forks. Many Mastodon innovations were first developed in forks and then adopted by the mainline; “exclusive lists”, coming in the next version, was originally developed by Hometown. Sometimes, though, the mainline decides not to adopt functionality from forks.

³ They’re also exploring other revenue models including subscriptions (like Instagram Plus) and an app store in the EU (with the Verge rather hilariously reporting that “at least initially” Meta isn’t planning on taking a cut of in-app revenue, as if they won’t change that at the drop of a hat as soon as they decide it makes sense). These are still fairly small contributors to their revenue stream, however, and in any case are also likely to leverage Meta’s competence of collecting data without consent.

^3.8 Thanks to Vyr Crossont for pointing out Fediseer and Fedibuzz.

⁴ The setting is called DISALLOW_UNAUTHENTICATED_API_ACCESS. What does AUTHORIZED_FETCH actually do? includes a good short discussion. However, the article warns that this setting blocks anonymous access to everything, including the instance’s “About” and registration pages. If so, turning it on makes an instance useless for people who don’t already have accounts (and could potentially violate laws requiring an Impressum or publicly-posted privacy policy). Bummer. I’m not completely sure this is always true, however; I’ve seen some sites that appear to block not-logged-in access to statuses and profiles, while still allowing access to the site’s about page. I’m not sure if it’s this setting or another.

⁵ Especially if they offer individuals the ability to have their account on their own domain name as part of a $100/year “threads plus” package, or bundle it and use it to increase adoption of Insta Plus, or whatever. Even if they don’t go that far and only offer it to a list of partners – celebrities, sports teams, politicians, media outlets, tech pundits, etc – who want to host their communities, there will still be lots of domains. Darnell Clayton’s Facebook Fears The Fediverse. Here’s How Instagram Will Try To Conquer It (EEE!!!!) has some interesting thinking on these kinds of scenarios.

⁶ As The Gibson (mayor of hackers.town) pointed out to me, platforms that aren’t taking security and privacy considerations seriously need to be treated as a possible supply-chain attack at this point.

Update Log

August 24: add TODO with links on scraping.

August 10: add link to fedifams and caracoles in allow-list discussion; add a brief mention of “token donation” and fediseer or fedibuzz; clarify RSS discussion.

July 20: add paragraph about new threat vectors from trust and safety services, link to “threaderation” suggestion.

July 7: added the top paragraph in response to excellent feedback that a summary would be very helpful! Reworked recommendations.

July 6: many small changes in response to excellent feedback. More specific thanks and acknowledgments coming soon!

July 5: first draft published

July 2-4: early partial drafts sent to “friendlies” for feedback. Because, y’know what else do people have to do on a holiday weekend? So I greatly appreciate everybody who took the time to read it and respond, thank you thank you thank you!