The Open Source Tool That Has Preserved 150,000 Pieces of Online Evidence

Madisony
Last updated: August 13, 2025 11:56 am

Bellingcat’s Auto Archiver is a tool aimed at preserving online digital content before it can be modified, deleted or taken down. Publicly released in 2022, it has preserved over 150,000 web pages and social media posts to date. Bellingcat’s journalists have used the Auto Archiver to preserve information on dozens of fast-moving events such as the Jan. 6 riots (when we first used the tool internally), as well as to gather digital evidence for our Justice and Accountability project and to monitor Civilian Harm in Ukraine.

The Auto Archiver has also been adopted by both large newsrooms and NGOs. It has been used by individual researchers, journalists, activists, archivists, academics and developers as well. With interest in the tool strong, we have worked hard to add to and improve it over time. But we have used the past few months to take a step back and build a new and more robust ecosystem to further help individual organisations and researchers use and benefit from it.

Our aim has been to make it more reliable and even easier to use for more people. Today, we’re happy to announce an updated version of the Auto Archiver which includes many new features like:

  • Detailed documentation for all features and configurations
  • A user-friendly interface designed for teams using a shared instance
  • A new modular structure that improves the startup speed and reliability of the tool
  • New features like chain of custody, perceptual hashing for deduplication, and methods to avoid anti-bot measures and captchas on websites
  • A user-friendly tool to configure the Auto Archiver, without the need to edit configuration text files

Screenshot of the new documentation site for the Auto Archiver

For an in-depth look at the changes made in this stable version of the Auto Archiver, see the What Changed, What Stays section further down in this article.
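To make the "perceptual hashing for deduplication" feature in the list above more concrete, here is a minimal sketch of one well-known perceptual hash, the average hash. It is an illustration of the general idea only: the article does not say which algorithm the Auto Archiver uses, and the function names and the tiny 4x4 "images" below are assumptions made for the example.

```python
def average_hash(pixels):
    """Hash a small grayscale image given as rows of 0-255 integers.

    Each pixel becomes "1" if it is brighter than the image's mean, else "0".
    Real implementations first shrink the image to a fixed size (often 8x8).
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length bit strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_near_duplicate(pixels_a, pixels_b, max_distance=2):
    """Treat two images as duplicates if their hashes differ in few bits."""
    distance = hamming_distance(average_hash(pixels_a), average_hash(pixels_b))
    return distance <= max_distance


# Hypothetical sample data: a checkerboard, a brightened copy, and a
# structurally different image.
original = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
brighter = [[min(255, value + 20) for value in row] for row in original]
different = [[200, 200, 10, 10] for _ in range(4)]
```

Unlike a cryptographic hash, which changes completely when a single byte changes, a perceptual hash stays close for visually similar images, which is what makes it useful for spotting re-uploads of the same picture.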

Automated Archiving and Collaboration – When to Use This Tool?

The latest version of the Auto Archiver has an easy-to-use web interface and a simplified installation process that makes it more straightforward to set up than before. However, some technical skills are still required for this initial process, and there are other tools available that could meet many of your archiving needs.

If all you need is to archive a few unauthenticated URLs, we recommend using the Wayback Machine or Archive.today. Alternatively, WebRecorder’s browser extension ArchiveWebPage can create a replayable archive of a website you visit, even for content behind login walls. For batch processing, the Wayback Machine has a bulk upload service that accepts Google Sheets. If you separately need to record all your browser interactions and store content along the way, there are paid options like Hunchly. Finally, if all you are interested in are videos and you are comfortable with the command line, yt-dlp will probably be enough to download those, even in bulk.
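As a rough illustration of the bulk yt-dlp route, the sketch below writes a list of URLs to a batch file and builds a yt-dlp command around it. The helper names, the `videos` folder and the file names are assumptions for the example; the `--batch-file`, `--download-archive` and `--output` options are standard yt-dlp flags, but check `yt-dlp --help` for your installed version.

```python
import shutil
import subprocess
from pathlib import Path


def dedupe_urls(urls):
    """Strip whitespace, drop blanks and repeats, and keep the original order."""
    cleaned = (url.strip() for url in urls)
    return list(dict.fromkeys(url for url in cleaned if url))


def build_ytdlp_command(batch_file, output_dir):
    """Build the yt-dlp command line without running it."""
    output_dir = Path(output_dir)
    return [
        "yt-dlp",
        "--batch-file", str(batch_file),  # one URL per line
        # Records finished downloads so a re-run skips them.
        "--download-archive", str(output_dir / "downloaded.txt"),
        # Name each file after the video id reported by the platform.
        "--output", str(output_dir / "%(id)s.%(ext)s"),
    ]


def download_all(urls, output_dir="videos"):
    """Write the batch file and run yt-dlp; returns the process exit code."""
    if shutil.which("yt-dlp") is None:
        raise RuntimeError("yt-dlp is not installed or not on PATH")
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    batch_file = output_dir / "urls.txt"
    batch_file.write_text("\n".join(dedupe_urls(urls)) + "\n", encoding="utf-8")
    return subprocess.run(build_ytdlp_command(batch_file, output_dir)).returncode
```

Calling `download_all([...])` with your own list of video URLs runs the whole batch. Because of the download archive file, running it again after adding URLs should only fetch the new ones.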

But if you’re hoping to automate your archiving, or archive a large number of URLs in a collaborative setting, then this is where the Auto Archiver really shines. Its modular framework allows you or your team to customise archiving based on your needs, and provides a way to generate metadata that ensures others can trust that your archived content has not been tampered with.

Learn more about what sites the Auto Archiver can archive here.
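One common way to produce this kind of tamper-evidence metadata is to record a cryptographic hash of every archived file and re-check it later. The sketch below shows that general technique with SHA-256 from Python's standard library. It is not the Auto Archiver's own implementation, and the function names and manifest layout are assumptions for the example.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large media files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its SHA-256 hash."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): sha256_of_file(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def find_tampered(folder, manifest):
    """List files from the manifest that are now missing or have changed."""
    current = build_manifest(folder)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)
```

A manifest built at archiving time and stored somewhere the archive's owner cannot silently edit (or timestamped by a third party) lets anyone later confirm that the files still match what was originally captured.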

The Future of Web Archiving

Archiving the web is hard, especially when logins, captchas, and other bot prevention techniques are in place. We’ll do our best to keep improving our Auto Archiver, but we note that it should be just one of many tools in your researcher’s toolkit. You can find a variety of other useful tools in the Bellingcat Open Source Investigation Toolkit.

Still, if you want to support us on this journey of archiving critical information, you can:

  • Download and use this tool
  • Donate directly to Bellingcat
  • Test, give feedback, and develop new features in our GitHub

For newsrooms:
If you work in a newsroom or research team and want to access a demo or get help deploying the Auto Archiver internally, you can reach us at contact-tech@bellingcat.com with the subject “Auto Archiver at [my team/organisation]” and tell us more about your organisation and archiving needs. Building a greater adoption base is the best way to ensure the future of this tool and its versatility.

What Changed, What Stays

Now that we have given a broad overview of the tool and its changes, what follows is a deeper look at how different parts of it work and interact. This will likely be of greater benefit to more technical users, and we again stress that successful users of the tool will likely need some technical knowledge to set it up for the first time.

But help is available with our live Auto Archiver Documentation. This is where you’ll always find the latest information on how to install, configure or debug the tool. Even if some aspects mentioned in this article change in the coming years, the documentation will be your go-to place for up-to-date instructions.

If you have questions or problems, please open an issue on GitHub. That is where others will also go for help, and it constitutes our shared knowledge space.

A New Architecture

Many open source researchers, including at Bellingcat, favour using the Auto Archiver with the Google Sheets integration, which allows users to work collaboratively by adding links to a spreadsheet and letting the Auto Archiver run in the background. However, we have now made it simpler to integrate the Auto Archiver into other systems. One such example is ATLOS, a collaborative investigations platform that integrated the Auto Archiver and which has been used by Bellingcat and the Centre for Information Resilience.

Integration is possible via the new modular architecture of the Auto Archiver and can be seen in the two new projects that we recently made public under open source code licenses: the Auto Archiver API and the Auto Archiver Web Interface.

A screen capture of the new Auto Archiver Web Interface showing the Google Spreadsheets management page, where users can enable the Auto Archiver to run periodically on new or existing spreadsheets.

Modules are the building blocks of the archiving pipeline and tell the tool how to run. They detail where to find the URLs, which archiving methods to use, what additional processing to carry out on archived content, and where and how to store it. Each module falls into a specific category:

  1. Feeder modules specify where to read the URLs from. There is one for Google Sheets, for example.
  2. Extractor modules download media and other metadata from a URL: our most versatile one is the Generic Extractor, which uses yt-dlp to download videos. However, extractors can be tailor-made for specific platforms, like the Telethon Extractor, which requires a Telegram account to download all media and metadata from the messages in public or private chats an account has joined.
  3. Enricher modules increase the value of the archived content with additional information or checks, such as hashing or timestamping the content for future consistency or chain of custody validations.
  4. Formatter modules collect and display the result of the process in a single formatted output. We use the HTML Formatter, as shown in this Bluesky post example.
  5. Storage modules tell the tool where to put the files it downloaded or generated. The simplest is to store them locally. But to ensure better preservation, the best practice is to use cloud storage like S3 or Google Drive.
  6. Database modules simply indicate where to save a record of this archive, such as whether archival was successful and which methods were used. You can use a CSV file and Google Sheets, for example.
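The six categories above can be pictured as stages of a single pipeline. The toy sketch below is only meant to show how the stages hand data to one another: every name in it is hypothetical, the extractor fabricates content instead of downloading anything, and none of it is the Auto Archiver's real API (see the project documentation for that).

```python
import hashlib
from datetime import datetime, timezone


def list_feeder(urls):
    """Feeder: supplies the URLs to archive (here, from a plain list)."""
    yield from urls


def fake_extractor(url):
    """Extractor: a stand-in that invents content instead of downloading."""
    return {"url": url, "content": f"content of {url}".encode("utf-8")}


def hash_enricher(item):
    """Enricher: adds a SHA-256 hash of the extracted content."""
    item["sha256"] = hashlib.sha256(item["content"]).hexdigest()
    return item


def timestamp_enricher(item):
    """Enricher: records when the item was archived, in UTC."""
    item["archived_at"] = datetime.now(timezone.utc).isoformat()
    return item


def text_formatter(item):
    """Formatter: turns the result into one human-readable output."""
    return (f"{item['url']} sha256={item.get('sha256')} "
            f"archived_at={item.get('archived_at')}")


class MemoryStorage:
    """Storage: keeps outputs in a dict instead of on disk or in the cloud."""
    def __init__(self):
        self.files = {}

    def save(self, key, data):
        self.files[key] = data
        return key


class MemoryDatabase:
    """Database: keeps one record per URL with its outcome."""
    def __init__(self):
        self.records = []

    def record(self, url, status, stored_as):
        self.records.append({"url": url, "status": status, "stored_as": stored_as})


def run_pipeline(feeder, extractor, enrichers, formatter, storage, database):
    """Run every URL through extract, enrich, format, store and record."""
    for url in feeder:
        try:
            item = extractor(url)
            for enrich in enrichers:
                item = enrich(item)
            key = storage.save(url, formatter(item))
            database.record(url, "success", key)
        except Exception as error:  # record the failure and keep going
            database.record(url, f"failed: {error}", None)
```

Because each stage only depends on the shape of the data it receives, swapping `MemoryStorage` for a cloud-backed class, or adding another enricher to the list, would not require touching the other stages. That independence is the practical benefit of a modular design.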

The modules documentation can be found here, and it is there to help you understand how each module works and is configured. Configuring which modules to use is done via a YAML file. If you are not comfortable with YAML, we have you covered with a new interface called the configuration editor, where you can visually create or edit your module configuration. In fact, the first time you run the Auto Archiver, a minimal working YAML configuration file is generated, which you can use right away to read URLs from the command line and store archived content locally.

Some platforms rate-limit or outright block IPs based on inauthentic behaviour. One of the strategies we employ to circumvent that is sending traffic through a proxy, which you can configure in specific modules like the Generic Extractor. We have been using Oxylabs’ Residential Proxies as part of their Project 4beta successfully for over a year, but know that there are several good providers out there.
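For readers unfamiliar with what "sending traffic through a proxy" looks like in code, here is a minimal sketch using only Python's standard library. It does not show the Auto Archiver's own proxy setting (consult the module documentation for that); the host, port and credentials are placeholders, and the helper names are assumptions for the example.

```python
import urllib.request
from urllib.parse import quote


def build_proxy_url(host, port, username=None, password=None, scheme="http"):
    """Assemble a proxy URL, percent-encoding any credentials."""
    if username is None:
        return f"{scheme}://{host}:{port}"
    credentials = quote(username, safe="")
    if password is not None:
        credentials += ":" + quote(password, safe="")
    return f"{scheme}://{credentials}@{host}:{port}"


def build_proxied_opener(proxy_url):
    """Return an opener that routes HTTP and HTTPS requests via the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Placeholder values: replace with the details from your proxy provider.
proxy_url = build_proxy_url("proxy.example.com", 8080, "user", "p@ss")
opener = build_proxied_opener(proxy_url)
# opener.open("https://example.com") would now go through the proxy.
```

Percent-encoding the credentials matters because characters such as `@` or `:` in a password would otherwise be misread as part of the URL structure.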

If you are a developer, you can design new modules as needed using Python code, and we welcome it if you want to contribute these back to our code. Imagine a Feeder that is constantly scraping URLs from a Bluesky account, or an Enricher that uses an AI model to detect and blur graphic content. All of that is possible and easy to build in this new architecture.
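As a sketch of the "constantly scraping" Feeder idea, the core logic is remembering which URLs have already been handed to the pipeline so that each poll only yields new ones. The function below shows just that logic. It is hypothetical and not based on the Auto Archiver's module interface, and `fetch_latest_urls` stands in for whatever code would actually query an account's posts.

```python
def new_urls_feeder(fetch_latest_urls, seen=None):
    """Yield only the URLs that have not been yielded on earlier polls.

    fetch_latest_urls: a zero-argument callable returning an iterable of URLs
        (hypothetical; it would wrap the platform-specific scraping code).
    seen: a set shared between polls to remember what was already yielded.
    """
    if seen is None:
        seen = set()
    for url in fetch_latest_urls():
        if url not in seen:
            seen.add(url)
            yield url
```

A scheduler would call this repeatedly with the same `seen` set (or a persisted version of it), so posts picked up on one poll are not archived again on the next.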

We hope you’ll enjoy the updated tool.

Please give us any feedback or suggestions for improvements by contacting us via contact-tech@bellingcat.com.


Bellingcat is a non-profit and the ability to carry out our work is dependent on the kind support of individual donors. If you would like to support our work, you can do so here. You can also subscribe to our Patreon channel here. Subscribe to our Newsletter and follow us on Bluesky here and Instagram here.


