Mariana Bay

Technology Data-hoarding Thread (+guide)

Welcome to the data-hoarding thread (and guide), where I will explain what data-hoarding is, why you should do it, and how.

(This guide will be edited over time, so consider it unfinished. The thread is also for general discussion of data-hoarding; the guide is just a bonus.)

You might wonder:

What is data-hoarding? :newspaperpepe:

If you type "data-hoarding definition" on Goolag™, you'll be greeted with a somewhat negative definition, one that goes as far as describing data-hoarding as a "mental disorder" or "compulsive behavior".
According to Wikipedia (bad definition):
Digital hoarding (also known as e-hoarding, e-clutter, data hoarding, digital pack-rattery or cyber hoarding) is defined by researchers as an emerging sub-type of hoarding disorder characterized by individuals collecting excessive digital material which leads to those individuals experiencing stress and disorganization. :soyjakspeech:

In reality, data-hoarding is, in my own definition, the process of downloading and storing every single tidbit of internet media you like or come across. It is more of a value/principle and a hobby. :mooman:

I didn't get into data-hoarding because I had some form of "compulsive disorder", but who knows. In simpler words, if you like something on a platform like TheyTube or on a website (series, movies, anime, imageboard threads, blogs, texts and such), you download it and store it on a hard drive or something similar. USB keys are rarely used: few of them have terabytes of capacity, those that do are much more expensive, and they are easier to lose in a corner of your room and tend to malfunction more often.

But then, this raises the following question:

Why should you data-hoard? :pepechill:
"Why would I want to download EVERYTHING (or not) I like on the Internet?" :pepecringe: you might ask yourself. And that is a valid question.

Well, for several reasons:


1. Because everything on the Internet gets deleted at some point.

That's right, contrary to the (false) saying that "nothing disappears from the Internet", everything does disappear at some point. It may be for the sake of storage, or because the person(s) who uploaded something suddenly want it deleted (which happens much, much more often than you think), for example because of a controversy around it.

Or because the service that hosted it went bankrupt or faced legal issues (many file hosts have shut down recently, and so have torrent websites like rarbg), or because its owners grew tired of it, or for obscure IRL reasons... there are millions of reasons really.

Or sometimes because things (such as software) could only be launched or obtained on a very old OS/machine, and today are nearly unfindable or do not work even on emulators (such is the case for the iMac TTS, a text-to-speech software famous for its use in the anime Serial Experiments Lain [which I'm not necessarily a fan of]).


Literally hundreds of petabytes (if not more) have been lost over time since the beginning of the Internet. And to show that you never know when things might get deleted: I at one point data-hoarded a TheyTube channel, and merely a week later the owner decided to delete all of the videos with no prior warning. I now only have 50% of the videos on my hard drive. It sucks, although I am probably the only person who still has those videos.

You never know when the things you like can disappear.

Another example is the way imageboards function with threads. They keep a limited number of threads active, only holding on to the ones that get bumped or have the most interaction, and deleting threads that have been inactive for a long time to save disk space. As a result, an insane number of threads have been deleted to this day. Fortunately, some are available on archive websites, in torrent collections of threads, or, rarely, as screenshots reposted somewhat randomly by other people.
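To make the thread-pruning idea concrete, here is a minimal sketch in Python. It is a toy model, not the code of any real imageboard: the `Board` class, its `max_threads` limit and the thread ids are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Board:
    """Toy model of an imageboard with a fixed number of thread slots."""
    max_threads: int
    # Thread ids ordered from most recently bumped to least recently bumped.
    threads: list = field(default_factory=list)
    # Ids of threads that were pruned (gone unless someone archived them).
    deleted: list = field(default_factory=list)

    def bump(self, thread_id):
        """Create a thread or move an existing one to the top of the board."""
        if thread_id in self.threads:
            self.threads.remove(thread_id)
        self.threads.insert(0, thread_id)
        # Prune from the bottom once the board is over its limit.
        while len(self.threads) > self.max_threads:
            self.deleted.append(self.threads.pop())


board = Board(max_threads=3)
for tid in ["a", "b", "c"]:
    board.bump(tid)
board.bump("a")  # a reply bumps thread "a" back to the top
board.bump("d")  # a new thread pushes the least active one off the board
print(board.threads)  # ['d', 'a', 'c']
print(board.deleted)  # ['b']
```

Thread "b" got no replies, so it is the one that falls off. Unless someone saved it first, it is gone.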


As a sidenote, Goolag is going to start deleting accounts that have been inactive for more than 2 years in December 2023, which puts abandoned TheyTube channels at risk (goodbye petabytes of Internet memes and Y2K history). So that's another reason to start data-hoarding.

Data-hoarding is also often the only way lost media ever gets found (another interesting topic, which I will not cover in this guide, however).

2. Because of censorship.

As most of you already know, platforms like TheyTube have changed their policies over time, becoming more and more restrictive and pushing their own agenda. As a result, many channels (on TheyTube and elsewhere) were banned, videos were censored (and, more recently, age-restricted) and some were deleted for copyright infringement, with no hope of ever recovering them unless they were archived, which is rarely the case for channels with fewer than 5k or 10k subscribers. On top of that, TheyTubers generally do not keep all of their videos on their own hard drives. To give you an example, I have some videos I recorded with OBS on my hard drive, and something as simple as 12 minutes of HD gameplay of trolling in L4D2 takes nearly 3 gigabytes (that's just a single case, and the result varies in general). Imagine the TheyTube channels that have hundreds of such videos, with every video available in 720p by default and some in 2K or even 4K.
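To put the recording example into numbers, here is a small back-of-the-envelope calculation in Python. The 3 GB and 12 minutes come from the example above; the 300-video channel is a made-up figure that is only there to show the scale.

```python
def recording_stats(size_gb, minutes):
    """Return (GB per minute, average bitrate in Mbit/s) for a recording."""
    gb_per_minute = size_gb / minutes
    # 1 GB = 8000 megabits (decimal units), 1 minute = 60 seconds.
    mbit_per_second = size_gb * 8000 / (minutes * 60)
    return gb_per_minute, mbit_per_second


def channel_size_gb(video_count, size_gb_per_video):
    """Total size of a channel made of equally sized recordings."""
    return video_count * size_gb_per_video


per_minute, bitrate = recording_stats(size_gb=3, minutes=12)
print(f"{per_minute:.2f} GB per minute, about {bitrate:.0f} Mbit/s")
# 300 videos is a hypothetical number, not a real channel.
print(f"{channel_size_gb(300, 3)} GB for 300 such videos")
```

That works out to a quarter of a gigabyte per minute, so a few hundred raw recordings of that kind would already fill most of a 1 TB drive.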

And anything that is remotely politically incorrect or contains swear words or "nudity" (more of an excuse since videos banned for that reason do not generally have explicit nudity despite e-whores never having any problems with their videos) gets banned too. :pepecringe:

3. Because you can watch/access the things you data-hoard whenever you want.

With data-hoarding, you can access everything you downloaded whenever you want, without using any bandwidth and without depending on some corporation's platform, website or ToS. (Generally) no one can delete what you have on your hard drive, no matter what it is. You don't need to send information to anyone, not even your ISP, when you look at the things you data-hoard (with the exception that your OS technically could know through data collection, or possibly glowies, since most CPUs and motherboards have backdoors inside of them :glowie:).

4. Because you can transfer the things you data-hoard to whatever device you want.

If you downloaded some .mp4 or .mkv files (videos), you can transfer them from one disk to another, to your phone or your tablet, and share them with your friends if you want, since the things you have on your disk drive cannot be deleted (the only exception being that you can still lose them if your hard drive gets damaged, which I'll talk about later on).

5. Because it stands for web preservation and archiving.

If you're more of a moral/value person, it also stands for web preservation and archiving, allowing media, history and such to be kept. Most of the internet's history/drama/etc. is written up or shown by data-hoarders, who are (to various degrees depending on the person) archivists.

??. And hundreds of other reasons.


There are many other reasons to data-hoard, some that do not come to my mind right now and others that I simply don't know of. The reasons I gave are the ones that generally apply to me, so other people may well have different ones.

Convinced? If so, on to the next and last section of the guide.


How to data-hoard? :hmmpepe:

"Data-hoarding is cool and all but you still didn't tell us how to do it. :soy:" Yup. And that's what I'm going to do. Keep in mind that I am nowhere near a professional data-hoarder: I only started data-hoarding about 1-2 years ago, which is VERY late by data-hoarding standards. What I'll give you is the way *I* data-hoard, and there may be many more programs or websites that I do not use or talk about. This guide is an introduction to data-hoarding, a beginner's guide really (which will get updated with time).

Data-hoarding is primarily done with:
- Software, be it web crawlers, torrent clients or download managers
- Archive websites, sometimes torrent/file-sharing websites, and a bunch of others.


I will now give the websites and software that I use to data-hoard and briefly explain how they work, with a few screenshots and small tutorials to get you started.


DISCLAIMER: I data-hoard on Windows 10, so there may be small differences in performance, in the installation process or even in the software available if you use Linux.

Jdownloader, or the data-hoarder's best friend:

Jdownloader is the software I use the most often. It is free, open-source, and lets you download nearly everything you wish. It is written in Java (ew, I know). It can download TheyTube links and files from file-hosting websites, and it supports accounts (for example for premium file hosts or private trackers). You can tinker with the settings to choose where the downloads go, how the folders are created and what quality to use for TheyTube videos (by default it takes 720p, if I remember correctly). It even downloads descriptions as .txt files, as well as subtitles as .srt files (generally the automatically generated ones).

All in all, a very good data-hoarding software. It also has a clipboard function (activated by default, and you can turn it off) that grabs every link you copy and automatically adds it to the links to download. It's perfect for downloading multiple TheyTube videos: you just copy the link and it grabs it. That makes it handy for your liked-videos playlist (which is private by default), where you can simply right click and copy over and over. It is still quite a long process, but much shorter than using third-party websites to "convert" the videos. I never tried putting my account login in the software, though.

Link for Jdownloader: https://jdownloader.org/jdownloader2

Choose your OS; it will redirect you to MEGA, and from there you just download and run the installer.
Don't mind the very old layout and UI, it is legit and good software.



How to use it:

Jdownloader is very easy to use and revolves around only three parts:
- Downloads section
- LinkGrabber
- Settings

Here is what the downloads section looks like (with copious amounts of sometimes unnecessary captions):


View attachment 881

View attachment 882

(Linkgrabber part above).


Then, the settings. You don't really need to tinker with them if you simply want to download links and whatnot, but I modified one setting so that the file name includes the date on which the original file was uploaded to the internet, which is useful for dating TheyTube videos. IF you want to do that, go to the Settings tab, Plugins, choose the TheyTube.com plugin, scroll down until you arrive at "filename & packagername", scroll some more, and in the filename field for video files make sure the following is written:
*3D* *360* *VIDEO_NAME* (*H*p_*FPS*fps_*VIDEO_CODEC*-*AUDIO_BITRATE*kbit_*AUDIO_CODEC*)*DATE_UPLOAD* *DATE*.*EXT*
and you should be good. But again, that's not necessary, and I've had a few problems with it, so make a backup of the original line in case you run into trouble.

Now that Jdownloader is out of the way, let's talk about the second thing I use the most. Strictly speaking, it isn't made for data-hoarding.

Qbittorrent, or the free sailor's ship:

Qbittorrent, which most of you probably already know, is a torrent client. Once again it is free and open-source, without the sketchiness of the garbage-tier ex-crypto-miner malware that is utorrent. :swag:

But why would we want to use a torrent software for data-hoarding? :hmmpepe:

Well, because there are many torrents out there that are collections of all kinds of stuff, for example 20 GB of 4chan threads from 2009-2012. If you search well, you can download multiple collections of archived stuff, TheyTube videos, etc.

Or you can simply download series, anime and such, which TECHNICALLY still counts as data-hoarding.

Link for Qbittorrent: https://www.qbittorrent.org/download

Qbittorrent is even easier to use than Jdownloader, so I won't be providing any guide here. Just a tip, however: once you have installed Qbittorrent, I recommend that you go to Tools (or just click the cogwheel, which is easier) -> Settings (labelled Options or Preferences depending on the version) -> BitTorrent and tick "Enable anonymous mode".

Then, on to the third thing I use the most to data-hoard.


Archive.org, the heaven of all data-hoarders:

Archive.org (<- link) is a website devoted to web preservation that is unfortunately facing legal problems (lawsuits) over 'copyright issues'. Hopefully it will stick around, but I still recommend that you download as much stuff as you can from there (you can even pair it with Jdownloader). Always start from the principle that if something can be deleted, it will be at some point.

You can find an INSANE amount of things there, be it books or TheyTube videos. It even includes the Wayback Machine, which lets you browse snapshots of websites (if users were kind enough to take snapshots of them), some of which date back to the early 2000s.

The only problem with archive.org is how difficult it is to navigate and to find what you are looking for: the search bar isn't that great, and you'll definitely have to be patient (or lucky) to some degree.

As an alternative, you can type the things you want to find into Goolag in quotation marks, for example "lost media" "playlist" site:archive.org. This only shows pages that contain the exact words in the quotation marks, and limits the results to the website archive.org. A much quicker alternative indeed.
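If you run this kind of search often, the query can be put together with a few lines of Python. This is only a sketch: the function names are made up, and the `https://www.google.com/search` base URL is an assumption about which search engine you use (swap it for whatever you prefer).

```python
from urllib.parse import urlencode


def site_search_query(phrases, site):
    """Build an exact-phrase query restricted to one website."""
    quoted = " ".join(f'"{p}"' for p in phrases)
    return f"{quoted} site:{site}"


def search_url(query, base="https://www.google.com/search"):
    """Turn the query into a URL you can paste into a browser."""
    return f"{base}?{urlencode({'q': query})}"


query = site_search_query(["lost media", "playlist"], "archive.org")
print(query)  # "lost media" "playlist" site:archive.org
print(search_url(query))
```

The first line printed is exactly the query from the example above; the second is the same query encoded so it can be used as a link.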


HTTrack, the website downloading software:

Did you ever tell yourself "man, i'd really like to have this whole website available at any time on my PC :apusad:" and think it wasn't possible? Sure, you could use archive.org, but it is clunky and slow to use, doesn't let you access the website offline without third parties, and comes with no guarantee that archive.org will stay around forever.

I present to you WinHTTrack, an open source offline browser (and web-crawler/downloader) for websites. It is quite old, and its last update goes back to 2017, but it still works perfectly.
:mooman:


Link for HTTrack: https://www.httrack.com/page/2/en/index.html

Both the website and the software have a very old-looking UI, but it still works just fine and is legit. Do note that you need to make a separate folder for each website you wish to download. Since this software isn't too hard to use either, I'm not going to post a guide for the time being, it's very accessible anyway. Also, if you can't decide between the actions "download web site(s)" and "download web site(s) + questions", just choose the first option.
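HTTrack also ships a command-line program called `httrack`, so the "one folder per website" rule can be scripted. The Python sketch below assumes that `httrack` is installed and on your PATH; the function names and the `mirrors` folder are made up for illustration, and the only option used is `-O`, which sets the output folder.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlparse


def httrack_command(url, mirror_root):
    """Build an httrack call that mirrors `url` into its own folder."""
    host = urlparse(url).netloc or "site"
    target = Path(mirror_root) / host  # one folder per website
    return ["httrack", url, "-O", str(target)]


def mirror(url, mirror_root):
    """Run the mirror; needs the `httrack` executable on your PATH."""
    cmd = httrack_command(url, mirror_root)
    Path(cmd[-1]).mkdir(parents=True, exist_ok=True)
    return subprocess.run(cmd, check=True)


# Only prints the command; call mirror(...) to actually download.
print(httrack_command("https://example.com/", "mirrors"))
```

Calling `mirror("https://example.com/", "mirrors")` would then download the site into `mirrors/example.com`. The GUI does the same job, this is just handy if you want to queue up several sites.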

Tips for data-hoarding:

- ALWAYS back up the stuff you download to other disk drives. You never know when a faulty disk drive will decide to die for some random reason. :trollface:
- IF you decide to use servers instead of local disk drives (which is also possible in data-hoarding), you need to be prepared to handle security and set a solid password, otherwise there is a small probability that you will lose your files, or possibly more. You'd be amazed at how many servers have been public for a decade and contain a lot of stuff everyone can download (even if it was not supposed to be so) because of security issues. In my opinion, disk drive storage without internet is the most secure and efficient way, but whatever floats your boat of course.
- Consider buying additional disk drives (externals are a good way to do so) if you aim to download a lot of stuff.
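
For the first tip, a backup is only worth something if the copy is actually readable. Here is a minimal Python sketch that copies a folder to another drive and compares SHA-256 checksums afterwards. It is one possible way to do it, not a full backup tool, and the function names and paths are made up for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large videos do not fill up the RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_folder(source, destination):
    """Copy every file from source to destination, then verify each copy.

    Returns the list of relative paths whose copy does not match the original.
    """
    source, destination = Path(source), Path(destination)
    mismatches = []
    for original in sorted(source.rglob("*")):
        if not original.is_file():
            continue
        relative = original.relative_to(source)
        copy = destination / relative
        copy.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(original, copy)  # copy2 also keeps timestamps
        if sha256_of(original) != sha256_of(copy):
            mismatches.append(str(relative))
    return mismatches
```

You would call it as `backup_folder("D:/hoard", "E:/hoard-backup")` (both paths are hypothetical). An empty list means every copy matched its original.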

And that's basically it for the guide!

Don't hesitate to ask questions or even just generally talk about data-hoarding, that's why I made this thread after all, it's not just about the guide. :mooman:
 

Khastle

Herald of the Mariana
Janny
Marianan ID
4
Joined
Apr 2, 2023
Threads
191
Messages
1,924
Reaction score
1,037
Awards
80
Location
Dwayne's Basement, Mariana Bay
LV
4
 
Offline
Best thread, must be pinned!
It's in the Hobby Talk article category so it's easy to find thankfully. Actually been meaning to add this article to the blogfront too, just haven't got around to it and the original poster (who just disappeared and we really have no idea why) used really unique formatting that will need extra attention when moving over for archival purposes.
 

Khastle

Months on from this brilliant article being posted by that unknown anon, I've actually got quite into data hoarding myself. Got a 2tb lacie filled up already and am considering getting a 5tb one to further expand my film and rom backup collection. With videogame companies (looking at you, Nintendo) coming down hard on sites that dare upload roms for games they no longer make money from, it's needed more than ever, and it's awesome to be able to access your favourite content on the fly without worrying about losing access to it (unless you suffer data corruption, which is why you should have backups and use a decent HDD like a lacie).

I will say use a HDD for any backups you do make. SSDs are cool and fast asf, but they make awful cold storage, as there's a rare chance of things going wrong if they are not powered on for an extended amount of time. A HDD is a disk that can be left alone as is, albeit a more fragile one, which is why using a decently protected HDD is essential. Fun hobby, worth taking up.
 

Bibr

Well-known member
V.I.P.
Marianan ID
92
Joined
Aug 27, 2023
Threads
21
Messages
426
Reaction score
565
Awards
43
Location
Araara Island
Website
birb-site.neocities.org
LV
2
 
Offline
If you're just going to save data to it and then put it away, why not go for some refurbished server drives?
Example offer. I'm sure you can find many more if you look around ebay
 

playFACE

Well-known member
V.I.P.
Marianan ID
175
Joined
Jan 8, 2024
Threads
0
Messages
461
Reaction score
553
Awards
44
LV
2
 
Offline
look on ebay and other sites for used/recertified hard drives. it's what i did to build my server. bought 4 20tb exos drives for half the price, around 240 each. i originally planned on buying WD 20tb drives but they cost 500+ USD brand new so the exos drives became more cost effective.
 

Khastle

If you're just going to save data to it and then put it away, why not go for some refurbished server drives ?
Example offer. I'm sure you can find many more if you look around ebay
Cool but nah I like to use them to watch movies on the telly tbf so I still want a fairly portable one.
look on ebay and other sites for used/recertified hard drives. its what i did to build my server. bought 4 20tb exos drives for half around 240 each. i originally planned on buying WD 20tb drives but they cost 500+ USD brand new so the exos drives became more cost effective.
Tempting, I'll go with the 5tb one first and go from there, still waiting on the seller of the used drive I'm looking at to do a health test on the drive before I purchase.
 

Khastle

New 5tb drive came, works well as promised and am backing up my data rn from my other drive. Seller cheaped out on the lead tho: it's usable, but its condition makes me want to get a new one soon for safety reasons and to prevent potential data corruption. Can't complain too much tho considering the price I got the drive for and its barely used condition (only had 87 hours logged on it when I asked the seller for the hard drive condition before buying).
 

playFACE

probably should do some SMART TESTs to see if there is any bitrot or anything going on with the drive tbh. better to be safe than to suddenly find out your "brand new" hard drive dying out within 3 months or so
 

Bibr

This and block by block test would be nice

I would not trust those recorded runtime metrics. It's easy to flash a controller so it thinks it barely had any use
 

Khastle

probably should do some SMART TESTs to see if there is any bitrot or anything going on with the drive tbh. better to be safe than to suddenly find out your "brand new" hard drive dying out within 3 months or so
Just as I started moving files over lol, what program would you recommend for doing it?
This and block by block test would be nice

I would not trust those recorded runtime metrics. It's easy to flash a controller so it thinks it barely had any use
Cool will do look into that.
 

playFACE

Just as I started moving files over lol, what program would you recommend for doing it?

Cool will do look into that.
windows has some built-in SMART test functionality. if not then give CrystalDiskInfo a whirl. i recommend the latter more imo.
 

playFACE

This is what I got back when I got the fella to test it for me the other day
View attachment 5893
seems to be fine for now but i would monitor that drive as time goes just to be safe. i know i had a drive once that was fine in the moment and 2 weeks later still died on me. also for God's sake does no one remove the damn search, cortana and modify their taskbars? i know im a tech guy but jesus
 

Bibr

This is what I got back when I got the fella to test it for me the other day
View attachment 5893
mmmm yes we sure do understand hex

Function -> advanced functions -> raw values -> 10[DEC]
now you gonna see exactly what your drive is telling you.
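
The hex-to-decimal switch is just a base-16 conversion, so a raw value that was already screenshotted in hex can also be converted by hand. A tiny Python sketch; the function name and the sample value are made up, the sample being chosen to match the 87 hours mentioned earlier in the thread:

```python
def smart_raw_to_decimal(raw_hex):
    """Convert a hexadecimal SMART raw value (as a string) to an integer."""
    return int(raw_hex, 16)


# "000000000057" is a made-up raw value for a Power-On Hours attribute.
print(smart_raw_to_decimal("000000000057"))  # 87
```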

For checking drive per sector, try this
select this drive in the "standard" tab, go to "tests" and run it. It's gonna take a long while. you can still use the pc while this test is going. Just make sure to not copy or play any media from this drive as it might screw up the results
 