Commit graph

1142 commits

Author SHA1 Message Date
Tobias Gruetzmacher 68d4dd463a Revert robots.txt handling.
This brings us back to only honouring robots.txt on page downloads, not
on image downloads.

Rationale: Dosage is not a "robot" in the classical sense. It's not
designed to spider large numbers of web sites in search of content
to index; it's only intended to help users keep a personal archive of
comics they are interested in. We try very hard to never download any image
twice. This fixes #24.

(Precedent for this rationale: Google Feedfetcher:
https://support.google.com/webmasters/answer/178852?hl=en#robots)
2015-07-17 20:46:56 +02:00
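
The policy described in commit 68d4dd463a (consult robots.txt for page downloads, skip it for image downloads) could be sketched roughly as follows. This is a minimal illustration built on the standard library's `urllib.robotparser`, not Dosage's actual code; the function name `may_download` and the `is_image` flag are hypothetical.

```python
import urllib.robotparser


def may_download(parser, user_agent, url, is_image):
    """Decide whether a URL may be fetched under the reverted policy.

    Hypothetical helper: images are always allowed, pages are checked
    against the already-parsed robots.txt rules.
    """
    if is_image:
        # Image downloads are not subject to robots.txt under this policy.
        return True
    return parser.can_fetch(user_agent, url)


# Example rules that block every path for every user agent.
rules = urllib.robotparser.RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /"])
```

With these example rules, a page URL is refused while an image URL on the same host is still allowed.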
Tobias Gruetzmacher d88b97573d Stable test order. 2015-07-16 01:32:24 +02:00
Tobias Gruetzmacher ea4472cd7c Test with comic that is still fetchable... 2015-07-16 00:53:10 +02:00
Tobias Gruetzmacher 7d3bd15c2f Remove AbleAndBaker, site is gone. 2015-07-16 00:49:48 +02:00
Tobias Gruetzmacher 472afa24d3 GoComics doesn't allow spiders, disable them...
This removes 757 comics, including quite popular ones like Calvin and
Hobbes, Garfield, FoxTrot, etc. :(
2015-07-16 00:36:10 +02:00
Tobias Gruetzmacher 7c15ea50d8 Also check robots.txt on image downloads.
We DO want to honour robots.txt when it blocks images.
2015-07-15 23:50:57 +02:00
Tobias Gruetzmacher 5affd8af68 More relaxed robots.txt handling.
This is in line with how Perl's LWP::RobotUA and Google handle server
errors when fetching robots.txt: just assume access is allowed.

See https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
2015-07-15 19:11:55 +02:00
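
The relaxed behaviour described in commit 5affd8af68 (treat a failed robots.txt fetch as "access allowed") might look something like the sketch below. It uses only the standard library and is an assumption about the general shape, not the code from that commit; `load_robots` and its `opener` parameter are hypothetical names, with `opener` added so the fetch can be replaced in tests.

```python
import urllib.error
import urllib.request
import urllib.robotparser


def load_robots(robots_url, opener=urllib.request.urlopen):
    """Return a parser for robots_url; allow everything if it cannot be fetched."""
    parser = urllib.robotparser.RobotFileParser(robots_url)
    try:
        with opener(robots_url) as response:
            lines = response.read().decode("utf-8", "replace").splitlines()
    except (urllib.error.URLError, OSError):
        # Server error or network failure: assume access is allowed.
        parser.allow_all = True
        return parser
    parser.parse(lines)
    return parser
```

A caller would then ask `load_robots(url).can_fetch(user_agent, page_url)` before each page download.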
Tobias Gruetzmacher 88e387ad15 Add Sleepless Domain. 2015-07-12 18:31:21 +02:00
Tobias Gruetzmacher 0b6d7425e1 Remove BladeKitten.
It's not available online anymore, only in print or as a PDF download.
2015-07-11 01:29:21 +02:00
Tobias Gruetzmacher 808b624e5f Remove hard dependency on pycountry again.
This basically reverts commit 86b31dc12b.

It now works like this: If the user has pycountry installed, it is used.
If not, Dosage falls back to a small internal list generated from
pycountry by scripts/mklanguages.py.

This means additional work if we ever decide to translate Dosage, since
pycountry already has all the translations for language names...

This fixes #23.
2015-07-11 01:27:39 +02:00
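
The optional-dependency pattern described in commit 808b624e5f can be sketched as below. This is an illustration only: `FALLBACK_LANGUAGES` is a hypothetical three-entry stand-in for the list that scripts/mklanguages.py generates, `language_name` is a hypothetical helper, and the `alpha_2` lookup assumes a recent pycountry release (the versions current in 2015 used different attribute names).

```python
try:
    import pycountry  # optional dependency
except ImportError:
    pycountry = None

# Hypothetical stand-in for the internal list generated from pycountry.
FALLBACK_LANGUAGES = {"en": "English", "de": "German", "fr": "French"}


def language_name(code):
    """Return the English name for a two-letter language code."""
    if pycountry is not None:
        try:
            lang = pycountry.languages.get(alpha_2=code)
        except LookupError:
            # Some pycountry versions raise instead of returning None.
            lang = None
        if lang is not None:
            return lang.name
    # pycountry is missing or does not know the code: use the internal list,
    # and fall back to the raw code as a last resort.
    return FALLBACK_LANGUAGES.get(code, code)
```

The import is attempted once at module load, so the rest of the code only has to check whether `pycountry` is `None`.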
Tobias Gruetzmacher d97a9c63e4 Add Erstwhile. 2015-07-10 01:14:56 +02:00
Damjan Košir 7abca1222b added NerfNow 2015-07-07 22:18:06 +12:00
Damjan Košir 119a3cd13a added text to ScandinaviaAndTheWorld 2015-07-07 19:48:25 +12:00
Damjan Košir 5f243e3868 not a comic 2015-07-05 18:33:14 +12:00
Damjan Košir 5e7ad33fc8 Nnewts disabled 2015-07-05 18:32:33 +12:00
Damjan Košir 45012ff9c3 BladeKitten disabled 2015-07-05 18:31:38 +12:00
Tobias Gruetzmacher d78db39df9 Support pycountry 1.12. 2015-06-24 01:04:28 +02:00
Tobias Gruetzmacher 0c6feec8cd Fix module name EastCoastVsWestCoast. 2015-06-24 00:51:42 +02:00
Damjan Košir 96572e8cba added TheMelvinChronicles 2015-06-12 21:00:11 +12:00
Damjan Košir 6412e6e542 fixed Spinnerette 2015-06-08 20:31:13 +12:00
Damjan Košir 3d8a49d228 realised TheWebcomicFactory is actually 28 comics... added them 2015-06-07 21:33:59 +12:00
Damjan Košir 05bb22b3ef added TheWebcomicFactory 2015-06-06 14:25:32 +12:00
Damjan Košir c98800388e added Sithrah 2015-06-04 19:24:55 +12:00
Damjan Košir 010b4bf669 renaming comicpress to wordpress (as it's not just for the comicpress theme) 2015-06-04 19:12:40 +12:00
Damjan Košir bc91f5f1fb added MistyTheMouse 2015-06-04 19:06:40 +12:00
Damjan Košir e2d01e4924 fixed ScandinaviaAndTheWorld 2015-06-04 18:58:59 +12:00
Damjan Košir 545a67111e fixed Alice 2015-06-01 15:15:34 +12:00
Damjan Košir a08ad2dc80 fixed GoGetARoomie 2015-06-01 15:11:16 +12:00
Damjan Košir ceb19ed2bc fixed Wulffmorgenthaler (now Wumo), added TruthFacts and MeAndDanielle 2015-06-01 12:14:52 +12:00
Damjan Košir 4cd88ecdc0 fixed WormWorldSaga 2015-06-01 11:45:22 +12:00
Damjan Košir ea6cb925a6 fixed LoadingArtist 2015-06-01 11:33:50 +12:00
Damjan Košir e268b09567 fixed EarthsongSaga 2015-06-01 11:19:02 +12:00
Damjan Košir 29c8d2eea0 fixed Meek 2015-05-31 23:41:12 +12:00
Damjan Košir 9be6f613e4 fixed MysteriesOfTheArcana 2015-05-31 23:39:04 +12:00
Damjan Košir 3ea8236224 fixed FowlLanguage 2015-05-31 23:29:34 +12:00
Damjan Košir c1245a85ad moved Footloose, added Cherry, Desigaspring 2015-05-31 23:23:02 +12:00
Damjan Košir 01aeebfbe4 fixed Footloose 2015-05-31 23:16:12 +12:00
Damjan Košir 029fa74067 fixed Bardsworth 2015-05-31 23:03:40 +12:00
Damjan Košir f3036de8fd fixed Pimpette 2015-05-31 22:57:25 +12:00
Damjan Košir df7404fd7c fixed CatsAndCameras 2015-05-31 22:50:17 +12:00
Damjan Košir d4cc8ac857 added buni 2015-05-27 20:36:11 +12:00
Damjan Košir 9beeceffad added BusinessCat and HappyJar 2015-05-27 20:34:51 +12:00
Damjan Košir d970d27b14 removing duplicate 2015-05-27 00:10:46 +12:00
Damjan Košir 33abd95348 fixed TheGentlemansArmchair 2015-05-26 23:48:22 +12:00
Damjan Košir 5e123ae79e fixed DarkWings (now available under the real name Eryl as well), added Ashes, Laiyu, NoMoreSavePoints and EasilyAmused 2015-05-26 23:43:15 +12:00
Damjan Košir 9adb020fc2 fixed DemolitionSquad 2015-05-26 22:59:25 +12:00
Damjan Košir 605c5f8619 fixed PokeyThePenguin 2015-05-26 22:31:43 +12:00
Damjan Košir 766b7ba99d fixed ProperBarn, added 2214 and OTE 2015-05-26 22:16:55 +12:00
Damjan Košir 2c41435ceb fixing HijiNKS ENSUE and added all 4 comics on that page 2015-05-26 22:06:55 +12:00
Damjan Košir 465e7eaf6f fixing CowboyJedi kinda... there is currently no comic on the front page and the author knows it 2015-05-26 21:35:36 +12:00