{"id":4142,"date":"2022-12-04T17:08:00","date_gmt":"2022-12-04T17:08:00","guid":{"rendered":"https:\/\/2024.thenexus.today\/index.php\/2022\/12\/04\/mastodon-privacy-remember-that-public-and-unlisted-posts-can-be-indexed-by-search-engines\/"},"modified":"2024-01-20T05:24:57","modified_gmt":"2024-01-20T05:24:57","slug":"mastodon-privacy-remember-that-public-and-unlisted-posts-can-be-indexed-by-search-engines","status":"publish","type":"post","link":"https:\/\/2024.thenexus.today\/index.php\/2022\/12\/04\/mastodon-privacy-remember-that-public-and-unlisted-posts-can-be-indexed-by-search-engines\/","title":{"rendered":"Mastodon privacy: you can&#8217;t really opt out of search engine indexing"},"content":{"rendered":"<p>There are a lot of reasons people might not want their posts on a social network to be indexed by search engines. \u00a0One of the most important is personal safety. Harassers often use search engines to find out information about the people they&#8217;re targeting \u2013 or to find new people to target. \u00a0So Mastodon gives you the option (on the Preferences\/Other settings page) to opt out of search engine indexing.<\/p>\n<p>Unfortunately, as we&#8217;ll discuss below, selecting this option doesn&#8217;t fully opt you out of search engine indexing.<\/p>\n<p>If you&#8217;re familiar with Mastodon&#8217;s cavalier approach to privacy and security, this probably doesn&#8217;t come as a surprise to you. \u00a0As hachyderm.io admin Kris Nova says in <a href=\"https:\/\/medium.com\/@kris-nova\/operating-mastodon-privacy-and-content-399eef251e65\">Operating Mastodon, Privacy, and Content<\/a>:<\/p>\n<blockquote><p>My immediate advice is to treat everything on Mastodon as if it is <strong>public data!<\/strong><\/p><\/blockquote>\n<p>However, a lot of people think Mastodon&#8217;s more private and secure than it really is. 
For one thing, Mastodon has long positioned itself as reducing harassment by <a href=\"http:\/\/blog.joinmastodon.org\/2017\/03\/learning-from-twitters-mistakes\/\">learning from Twitter\u2019s mistakes<\/a>. \u00a0And making it harder for harassers to search for past messages is often used as an example of this. \u00a0As EFF&#8217;s Bill Buddington says in <a href=\"https:\/\/www.eff.org\/deeplinks\/2022\/11\/mastodon-private-and-secure-lets-take-look\">Is Mastodon Private and Secure?<\/a>:<\/p>\n<blockquote><p>&#8220;This cuts down on harassment, because abusive accounts will have a harder time discovering posts and accounts using key words typically used by the population they\u2019re targeting (a technique frequently used by trolls and harassers)&#8221;<\/p><\/blockquote>\n<p>So an opt-out that doesn&#8217;t fully work gives people a false sense of security \u2013 and leaves them exposed if they&#8217;re at risk of harassment.<\/p>\n<h2 id=\"why-doesnt-opting-out-really-opt-you-out\">Why doesn&#8217;t opting out really opt you out?<\/h2>\n<p>Understanding why this option doesn&#8217;t work as expected requires a bit of digging into how the &#8220;noindex&#8221; rule works in search engines and how Mastodon treats the opt-out setting.<\/p>\n<p>As Google&#8217;s <a href=\"https:\/\/developers.google.com\/search\/docs\/crawling-indexing\/block-indexing\">Block Search Indexing with \u2018noindex&#8217;<\/a> describes, when an HTML page has a &lt;meta name=&quot;robots&quot; content=&quot;noindex&quot;&gt; tag, Google and other search engines that support the noindex rule won&#8217;t index the page. \u00a0Of course, badly-behaved search engines <em>can<\/em> ignore the tag, so this won&#8217;t stop people who write their own crawlers. 
\u00a0Still, writing a crawler and storing all your own data is a pretty significant investment, so this is a useful level of protection.<\/p>\n<p>Mastodon version 4.0 puts a noindex tag <a href=\"https:\/\/github.com\/mastodon\/mastodon\/pull\/19319\">on most pages<\/a>, but whether or not it&#8217;s on your profile depends on the &#8220;opt-out of search engine indexing&#8221; option. \u00a0The option is also used to determine whether there&#8217;s a noindex tag on web pages for your posts. \u00a0My indieweb.social account has that option turned on, so if you look at the HTML for <a href=\"https:\/\/indieweb.social\/@jdp23\/109454142903621001\">this post<\/a>, you&#8217;ll see the noindex rule. \u00a0So far so good.<\/p>\n<p>But if I do a search for &#8220;<a href=\"https:\/\/www.google.com\/search?as_q=%40jdp23%40indieweb.social&amp;as_eq=twitter\">@jdp23@indieweb.social<\/a>&#8221;, I&#8217;ll find pages with my posts in them. \u00a0In fact, I can even do a search for specific text in one of my posts \u2013 here&#8217;s one for &#8220;<a href=\"https:\/\/www.google.com\/search?q=jdp23+gh*st\">jdp23 gh*st<\/a>&#8221; that brings up a thread Anil Dash started that I replied to. \u00a0In this particular case I don&#8217;t care, but imagine a situation where instead of gh*st I had used a term that attracts harassers.<\/p>\n<p>That&#8217;d be very bad.<\/p>\n<p>And as Darius Kazemi <a href=\"https:\/\/indieweb.social\/@darius@friend.camp\/109454083964785804\">verified<\/a>, this applies to unlisted posts as well as public posts.<\/p>\n<p>From a software perspective, the bug here is that Mastodon is only checking the &#8220;opt out of search engines&#8221; setting for the author of the original post in a thread, not for the people who reply to it. \u00a0Anil, like many others, doesn&#8217;t mind if his posts are indexed by search engines. 
\u00a0When I reply to him, my reply appears on the page for his thread, so it gets indexed by search engines as well \u2013 even though I&#8217;ve opted out.<\/p>\n<p>I filed a <a href=\"https:\/\/github.com\/mastodon\/mastodon\/issues\/22047\">bug report<\/a> on this and it&#8217;ll be interesting to see what the response is.<\/p>\n<h3 id=\"but-wait-theres-more\">But wait, there&#8217;s more<\/h3>\n<p>This isn&#8217;t the only way your public and unlisted Mastodon posts can wind up in a search engine even if you&#8217;ve opted out. \u00a0If somebody from another instance is following you, there&#8217;s no guarantee that the software they&#8217;re running will pay attention to the &#8220;opt-out from search engines&#8221; setting. \u00a0As long as the other instances are running Mastodon software, this isn&#8217;t an issue unless admins have intentionally disabled this functionality. \u00a0However, other software that&#8217;s compatible with Mastodon may not know about this setting.<\/p>\n<p>In fact, because of the way Mastodon implements federation, even posts that have been deleted can still have copies on other instances that search engines can find. \u00a0Yikes!<\/p>\n<p>It&#8217;s worth noting that &#8220;<a href=\"https:\/\/github.com\/hometown-fork\/hometown\/wiki\/Local-only-posting\">local-only posts<\/a>,&#8221; supported by Mastodon forks (variants) like Glitch and Hometown, provide significant protection here. \u00a0Local-only posts aren&#8217;t included in externally accessible pages, so search engines never see them. As Hometown maintainer Kazemi points out, if you&#8217;re on an instance where you trust the admins and the other members, local-only posts give you the ability to ensure that your stuff only goes to actors you personally trust. 
Unfortunately, Mastodon&#8217;s BDFL (benevolent dictator for life) has rejected this valuable anti-harassment technology from the main line of code, so most instances don&#8217;t have this functionality.<\/p>\n<p>Of course, Mastodon&#8217;s not the only social network site where you don&#8217;t have any privacy. \u00a0Twitter allows you to delete your tweets and direct messages, but doesn&#8217;t actually commit to deleting them from their internal databases or backups. \u00a0And since there&#8217;s currently more organized harassment on Twitter than Mastodon, and their investors (including Larry Ellison of Oracle, Prince Alwaleed bin Talal bin Abdulaziz of Saudi Arabia, and the Qatar Investment Authority) get special rights to your personal data, the risks are likely higher there.<\/p>\n<p>Still, don&#8217;t kid yourself: Mastodon&#8217;s security and privacy story is not good. Lenin Alevski recently found a system misconfiguration vulnerability making <a href=\"https:\/\/www.alevsk.com\/2022\/11\/system-misconfiguration-is-the-number-one-vulnerability-at-least-for-mastodon\/\">content and videos from supposedly-private direct messages open to the world<\/a>; Alevski reports that, in addition to infosec.exchange&#8217;s 33,000 users, this affected several other high-profile sites. \u00a0The lack of end-to-end encryption means that admins can read supposedly-private direct messages \u2013 and if you&#8217;re DM&#8217;ing with somebody on another instance, their admins can read them as well.<\/p>\n<p>The <strong>What to do?<\/strong> section of Dan Goodin&#8217;s <a href=\"https:\/\/arstechnica.com\/information-technology\/2022\/11\/how-secure-a-twitter-replacement-is-mastodon-let-us-count-the-ways\/\">How secure a Twitter replacement is Mastodon? Let us count the ways<\/a> has a useful list of some of the things you can do to cut down the risks, but they only go so far. 
\u00a0At the end of the day, I agree with Kevin Beaumont, a security professional and admin for the cyberplace.social instance, who Goodin quotes as saying:<\/p>\n<blockquote><p>\u201cMy take is the same as Twitter. Don\u2019t write anything on social media you wouldn\u2019t write in public.&#8221;<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>There are a lot of reasons people might not want their posts on a social network to be indexed by search engines. \u00a0One of the most important is personal safety. Harassers often use search engines to find out information about the people they&#8217;re targeting \u2013 or to find new people to target. \u00a0So Mastodon gives [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[470,436],"class_list":["post-4142","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-fediverse","tag-mastodon"],"_links":{"self":[{"href":"https:\/\/2024.thenexus.today\/index.php\/wp-json\/wp\/v2\/posts\/4142","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/2024.thenexus.today\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/2024.thenexus.today\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/2024.thenexus.today\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/2024.thenexus.today\/index.php\/wp-json\/wp\/v2\/comments?post=4142"}],"version-history":[{"count":1,"href":"https:\/\/2024.thenexus.today\/index.php\/wp-json\/wp\/v2\/posts\/4142\/revisions"}],"predecessor-version":[{"id":4346,"href":"https:\/\/2024.thenexus.today\/index.php\/wp-json\/wp\/v2\/posts\/4142\/revisions\/4346"}],"wp:attachment":[{"href":"https:\/\/2024.thenexus.today\/index.php\/wp-json\/wp\/v2\/media?parent=4142"}],"wp:term":[{"taxonomy":"catego
ry","embeddable":true,"href":"https:\/\/2024.thenexus.today\/index.php\/wp-json\/wp\/v2\/categories?post=4142"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/2024.thenexus.today\/index.php\/wp-json\/wp\/v2\/tags?post=4142"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}