Links are one of the core features of the web. We use them to browse websites and to refer to content hosted somewhere else. The problem with links is that they might stop working at any time. Websites change and die, content is moved, modified and deleted, services introduce paywalls and login pages, laws make sites inaccessible. This is usually referred to as “link rot”.
Just a few hours ago Twitter decided to put all tweets[a] behind a login wall. This change might not be permanent if we are to believe a tweet from the owner (and of course you need an account to read it), but just like that, millions of links shared over the years, bookmarks, and open tabs no longer work as expected. And some of those links are important.
Popular services have been doing similar things for years. For example, Facebook and Instagram redirect users they don’t like (based on IP, browser, etc.) to a login page regardless of the content they’re trying to access (it could be a meme or an announcement from your government). Reddit started hiding content from mobile users, requiring them to log in or install its app (even though the content is right there behind the popup). Imgur, a very popular image hosting service, now has a problem with hosting images: it started deleting content and redirects users accessing images to pages with ads and trackers. Google tried very hard to create a Facebook competitor with Google+, but the service closed and a lot of content was lost.
This affects embedded content too. On top of the privacy problems of adding external content to pages, if the content isn’t really on the page, it might disappear, change or be put behind some wall. For some content this is not a big deal, but sometimes it is. For example, some news websites embed public posts from politicians. What if the post is removed? Did the person ever say what the site claims they did? Or what if the post is updated to say something else?
How to mitigate this?
There’s nothing we can do to stop links we don’t control from breaking, but we can duplicate the content so it exists in more than one place. We can:
- Take screenshots of the page/content.
- Archive the content on services like the Wayback Machine and Archive.today.
- Provide different sources.
- Keep our own copy of the content.
And so on.
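As a rough illustration of the last two kinds of options (checking an archive and keeping our own copy), here is a minimal Python sketch. It assumes the Wayback Machine's public availability endpoint at `archive.org/wayback/available`; the function names, the `archive` folder and the file naming scheme are made up for this example. It only saves the raw response body, so pages that depend on JavaScript, images or stylesheets would need a more complete tool.

```python
import hashlib
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# Assumption: the Wayback Machine availability endpoint.
WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"


def availability_url(url: str) -> str:
    """Build the availability query for a page (hypothetical helper)."""
    return WAYBACK_AVAILABILITY + "?" + urllib.parse.urlencode({"url": url})


def closest_snapshot(response: dict) -> Optional[str]:
    """Return the closest archived snapshot URL, or None if there is none."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(url: str) -> Optional[str]:
    """Ask the archive whether it already has a copy (needs network access)."""
    with urllib.request.urlopen(availability_url(url), timeout=30) as resp:
        return closest_snapshot(json.load(resp))


def local_copy_path(url: str, fetched_at: datetime,
                    root: Path = Path("archive")) -> Path:
    """Name the file after a hash of the URL plus the fetch time in UTC."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    stamp = fetched_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return root / f"{digest}-{stamp}.html"


def save_local_copy(url: str, root: Path = Path("archive")) -> Path:
    """Download the page and keep the raw bytes on disk (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        body = resp.read()
    path = local_copy_path(url, datetime.now(timezone.utc), root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(body)
    return path
```

Calling `find_snapshot(link)` before sharing a link tells us whether an archived copy already exists, and `save_local_copy(link)` stores a timestamped copy, so several versions of the same page can sit side by side if the content changes later.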