I'm working on creating an archive. One of my goals right now is saving every Wikipedia page related to tobacco. That means cigarette cards / baseball cards, kiseru pipes, cigars, etc. Everything. I've been copying the links into a Google Doc for a couple days, and I've been thinking the whole time that it would be nice if it were automated. How feasible would it be to create a program that crawls all of the pages? I eventually want to download them and put them in cold storage. I'm much more tech literate than most, but I am not a computer programmer, so I doubt that I could do it myself, but how easy would it be for me to find someone who could? Of course, I'd be willing to pay a fee.
Just download the entire Wikipedia. Less headache and you get more encyclopedia.
You know you can just download Wikipedia right?
Like it's a feature they offer
lmao you can literally just download the whole wiki off their own site
https://www.kiwix.org/en/
https://library.kiwix.org/
just use wget
One thing I never got about kiwix. Do they do delta updates? Or do you have to download 60GB of data every time they have a new archive release?
Yes you technically have to redownload every time to see updated articles. I just don't give enough of a shit to get them
thank you based anon for making smoking more enjoyable
i hope you read them too
Most of what I've been doing has not been fun reading. It's more like the type of reading you do for a school project. If I wasn't copying links constantly, I'd have a good time going down rabbit holes and taking in information. Thanks for the kind thought.
Yes, I'm aware of this. The problem with downloading all of Wikipedia is that people might have trouble finding stuff related to tobacco in the heap. I guess I might as well do it anyway. With it all downloaded, I guess I could leave that sorting work to someone else in the future.
you can download individual pages too
what is your goal?
He's probably having a manic episode where he thinks the history of tobacco use will be scrubbed from the internet.
boomers are schizophrenic like that
besides, tobacco history is a meme and wikipedia barely covers anything
No, I am not manic. I am a translator of ancient languages and an amateur archivist. Tobacco is an interest of mine. I have already done some archival work, and based on my knowledge of ancient history and experience with archiving, I'm aware of how much is lost to time, even in very brief spans of time. My reason for wanting to save a bunch of tobacco-related Wikipedia articles is that Wikipedia is a very general resource that cites sources that are more rigorous and in-depth.
In the United States, there is an anti-tobacco push happening. YouTube has banned and threatened to ban tobacco-related channels. The biggest snus reviewer got banned because he mentioned where you can buy snus. Other channels got strikes for the same thing. A nasal snuff store in the UK that has shipped to Americans for years was told by FedEx, without warning, that their packages will no longer be shipped. They returned packages to this company that had been in transit for weeks. The Biden administration is criminalizing Juuls and is pushing to limit the amount of nicotine in cigarettes. This all has happened in the last month. Some of this stuff, like content on YouTube, could be gone tomorrow, which is why I'd like to save it.
nicotine addiction will be eradicated and there's nothing you can do about it
doh-ho-ho-no no no I don't think so
there are a bunch of people now who start douche flutin without even having smoked first
Prohibition actually increased Alcohol consumption shortly before it was overturned [1]
I thought we learned from history.
We were even starting to legalize marijuana.
Evidently, Democrats are brain dead and need to give even more votes to Republicans as they crash the Economy.
[1] https://www.cato.org/policy-analysis/alcohol-prohibition-was-failure
stop being so schizo grandpa
there is no content on youtube worth saving
postal companies have never shipped tobacco products internationally because of excise reasons, they just didnt know or didnt care about snus until now, likely because some alphabet agency complained
>Beginning on June 29, 2010, the Postal Service will no longer accept or transport any package that it knows, or reasonably believes, to contain nonmailable smokeless tobacco or cigarettes, unless covered by one of the defined exceptions.
do you realise you can just grow tobacco yourself and it will be infinitely better than the factory farm produced stuff in cigarettes and snus
both are a poor people thing anyway
>you can just grow tobacco yourself and it will be infinitely better than the factory farm produced stuff in cigarettes
Not OP but that's simply not true. Blending and curing tobacco takes a lot of expertise, chances are if you make your own it'll taste like shit.
I smoked natural tobacco that was grown and dried by morons in Papua New Guinea and I smoked it as a cigarette rolled in newspaper
Smelled and tasted way better than commercial ciggies
basically this
people pretend its extremely difficult but its actually extremely easy, its just a meme to keep up the "premium" image of the product
tobacco is an easy plant, many of the pests dont even exist in temperate climates
drying is just making sure it doesnt mold and releases enough ammonia to be smokable
blending is just smelling it and putting the small leaves inside and the big leaves on the outside
ez pz, good luck
Not a grandpa. There's plenty of stuff on YouTube worth saving. As was said above, processing tobacco is not as easy as it sounds. I'm currently venturing into making snuff from raw leaf for 15 people, so I'm in a position where I can comment. Growing your own tobacco is also difficult, but I might grow some next year. I wouldn't say cigarettes and snus are for poor people. Snus certainly isn't, but your thinking isn't very productive and is quite presumptuous. Today, I took some nasal snuff and smoked a cigar (pic rel).
>cigarettes are for poor people
where i live the cost of a cigarette, A cigarette, shot past $1 a few years ago, i don't even know what it is now because i can't afford them anymore
First, I am getting all the links. I put an "x" next to a page that has been exhausted for all relevant links about tobacco content. For instance, the article about pipes might link to various pipe manufacturers. After getting all of the manufacturers in the Google Doc, then I'll put an x next to the URL for the Wikipedia pipes page. After that, I'd go check all of the pipe manufacturers for relevant article links. I will then download all the pages I have in the Google Doc. That's my goal for this section of the project. My greater goal is to preserve any valuable information about tobacco and things related to tobacco, like pipes for instance.
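The manual routine described above (collect the links on a page, mark the page with an "x", then repeat for each collected page) is a breadth-first crawl. Below is a rough Python sketch of that loop, not a finished tool. The names `crawl_links`, `get_links`, `is_relevant`, and the toy link table are all made up for illustration; a real version would need `get_links` to fetch the outgoing links of an actual Wikipedia page, and deciding what counts as tobacco-related is still a judgment call.

```python
from collections import deque
from typing import Callable, Iterable


def crawl_links(
    start: str,
    get_links: Callable[[str], Iterable[str]],
    is_relevant: Callable[[str], bool],
    max_pages: int = 10_000,
) -> list[str]:
    """Breadth-first walk over page links, keeping only relevant titles."""
    done: set[str] = set()   # pages already marked "x" (exhausted for links)
    queue = deque([start])   # pages still waiting to be checked
    queued = {start}         # every title ever put on the list, to avoid repeats
    order: list[str] = []
    while queue and len(done) < max_pages:
        title = queue.popleft()
        done.add(title)
        order.append(title)
        for linked in get_links(title):
            if linked not in queued and is_relevant(linked):
                queued.add(linked)
                queue.append(linked)
    return order


# Toy link graph standing in for Wikipedia (hypothetical data).
TOY_LINKS = {
    "Tobacco pipe": ["Kiseru", "Dunhill", "France"],
    "Kiseru": ["Tobacco pipe", "Japan"],
    "Dunhill": ["Tobacco pipe", "Cigar"],
    "Cigar": [],
}
TOBACCO_TITLES = {"Tobacco pipe", "Kiseru", "Dunhill", "Cigar"}

found = crawl_links(
    "Tobacco pipe",
    get_links=lambda t: TOY_LINKS.get(t, []),
    is_relevant=lambda t: t in TOBACCO_TITLES,
)
print(found)
```

The `queued` set plays the role of the Google Doc: a title is only ever added once, so the crawl cannot loop between pages that link to each other.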
>no i'm not compulsively saving every scrap of information on tobacco use in case it gets scrubbed, even though I literally said I'm gonna do that
Sounds comfy.
this is the most moronic way to do it possible
Yes, I know. That's why I am asking for help.
M-DISC is what I am primarily relying on for longevity. They'd mirror what's on hard drives that I will refresh regularly. I'm beginning to follow 3-2-1.
use a counting machine! a wikipedia page is like 50mb
Counting machine?
Yeah, but that will be an issue 1,000 years from now.
a counting machine yes, but you need to go further
1000 years is like, a day to god. no time at all
all optical media deteriorates much quicker than they want you to believe
disk rot
how are you going to cold store
Not sure what's more stupid, OPs obsession with tobacco and how "it's getting scrubbed" or the way he's going about it
one million
wikipedia s
a day! one!
million! aa!!
and then, you tabulate
(one million
wikipedia pages a day)
STOP CRAWLING WIKIPEDIA YOU MORON
USE THE DUMPS
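For anyone wondering what "use the dumps" could look like in practice, here is a minimal Python sketch that streams through a MediaWiki XML export and keeps only pages mentioning chosen keywords. It is an illustration under assumptions: the function names, the keyword list, and the tiny inline sample are invented, and a plain substring match will both miss pages and catch unrelated ones, so the result would still need a human pass.

```python
import io
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip the XML namespace, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def matching_pages(stream, keywords):
    """Yield (title, text) for pages whose title or text mentions a keyword."""
    keywords = [k.lower() for k in keywords]
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "page":
            continue
        title, text = "", ""
        for child in elem.iter():
            name = local(child.tag)
            if name == "title":
                title = child.text or ""
            elif name == "text":
                text = child.text or ""
        haystack = (title + "\n" + text).lower()
        if any(k in haystack for k in keywords):
            yield title, text
        elem.clear()  # drop the page's contents to keep memory use down


# Tiny stand-in for a real dump file (hypothetical content).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Kiseru</title><revision><text>A Japanese tobacco pipe.</text></revision></page>
  <page><title>Teapot</title><revision><text>A vessel for tea.</text></revision></page>
  <page><title>Snus</title><revision><text>Moist tobacco product.</text></revision></page>
</mediawiki>"""

titles = [t for t, _ in matching_pages(io.BytesIO(SAMPLE), ["tobacco", "snuff"])]
print(titles)
```

With a real dump, the `io.BytesIO(SAMPLE)` argument would be replaced by something like `bz2.open(path, "rb")` on the downloaded pages-articles file, assuming it is the usual bz2-compressed XML.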
i would much rather sort through all of the tobacco pages and the links there (and chat pages etc.) and delete, than sort through them saving. thats all though. i dont know about any dumps
you could also host some version of them as a torrent, and on some sites, while you use tapes or whatever for cold storage
Since gallery-dl thread is nowhere to be seen, I'll just ask here: is that script or yt-dl capable of downloading whole channels?
youtube dl or at least youtube dlp has that built in
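A rough sketch of how a whole-channel download could be scripted, assuming yt-dlp is installed and on the PATH. The helper name `channel_command`, the output folder, and the channel URL are placeholders; the block only builds and prints the command rather than running it.

```python
import shlex


def channel_command(channel_url: str, out_dir: str = "youtube_archive") -> list[str]:
    """Build a yt-dlp command line that fetches every video of one channel."""
    return [
        "yt-dlp",
        # Record finished video IDs so a re-run skips what is already saved.
        "--download-archive", f"{out_dir}/downloaded.txt",
        # Keep each video's metadata next to the video file.
        "--write-info-json",
        "-o", f"{out_dir}/%(uploader)s/%(upload_date)s - %(title)s [%(id)s].%(ext)s",
        channel_url,
    ]


cmd = channel_command("https://www.youtube.com/@ExampleChannel")  # placeholder URL
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing a channel URL instead of a single video URL is what makes yt-dlp walk the whole channel; the archive file makes repeat runs pick up only new uploads.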
Are there any doomsday wikis which have recipes/blueprints of day to day items?
bump