666 commits

Author SHA1 Message Date
Kevin Anthony
6e8231e78a Added Namer to BloomingFaeries since the web comic author doesn't seem interested in sticking to any kind of file naming convention 2015-09-02 11:01:48 -04:00
Kevin Anthony
1045bb7d4a added comic Blooming Faeries 2015-09-02 10:13:42 -04:00
Damjan Košir
11f0aa3989 created Wordpress Scraper class 2015-08-11 21:31:45 +12:00
Damjan Košir
0a5b792c32 added Fragile (English and Spanish) 2015-08-07 23:37:10 +12:00
Damjan Košir
fd9c480d9c adding bonus panel to SWBC and multiple images flag to ParserScraper 2015-08-03 22:58:44 +12:00
Damjan Košir
f8a163a361 added a CMS ComicControl, moved some existing comics there, added StreetFighter and Metacarpolis 2015-08-03 22:40:06 +12:00
Damjan Košir
648a84e38e added Sharksplode 2015-08-03 22:20:17 +12:00
Damjan Košir
c19806b681 added AoiHouse 2015-07-31 23:33:30 +12:00
Damjan Košir
2201c9877a added KiwiBlitz 2015-07-31 23:09:56 +12:00
Damjan Košir
fe22df5e5b added LetsSpeakEnglish 2015-07-31 23:06:06 +12:00
Damjan Košir
79ec427fc0 added CatVersusHuman 2015-07-30 22:16:34 +12:00
Tobias Gruetzmacher
303432fc68 Also use css expressions for textSearch. 2015-07-18 01:22:40 +02:00
Tobias Gruetzmacher
6a70bf4671 Enable some comics based on current policy. 2015-07-18 01:21:29 +02:00
Tobias Gruetzmacher
6b0046f9b3 Fix small typos. 2015-07-18 00:11:44 +02:00
Tobias Gruetzmacher
68d4dd463a Revert robots.txt handling.
This brings us back to only honouring robots.txt on page downloads, not
on image downloads.

Rationale: Dosage is not a "robot" in the classical sense. It's not
designed to spider huge amounts of web sites in search for some content
to index, it's only intended to help users keep a personal archive of
comics they are interested in. We try very hard to never download any image
twice. This fixes #24.

(Precedent for this rationale: Google Feedfetcher:
https://support.google.com/webmasters/answer/178852?hl=en#robots)
2015-07-17 20:46:56 +02:00
Tobias Gruetzmacher
7d3bd15c2f Remove AbleAndBaker, site is gone. 2015-07-16 00:49:48 +02:00
Tobias Gruetzmacher
472afa24d3 GoComics doesn't allow spiders, disable them...
This removes 757 comics, including quite popular ones like Calvin and
Hobbes, Garfield, FoxTrot, etc. :(
2015-07-16 00:36:10 +02:00
Tobias Gruetzmacher
7c15ea50d8 Also check robots.txt on image downloads.
We DO want to honour robots.txt when it blocks images.
2015-07-15 23:50:57 +02:00
Tobias Gruetzmacher
5affd8af68 More relaxed robots.txt handling.
This is in line with how Perl's LWP::RobotUA and Google handle server
errors when fetching robots.txt: Just assume access is allowed.

See https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
2015-07-15 19:11:55 +02:00
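
The relaxed policy described in the commit above can be illustrated with a minimal sketch. This is not the code from the commit: the helper name `allowed_by_robots`, its parameters, and the choice to treat every status of 400 or higher as "allowed" are assumptions made for illustration. It uses only the standard library's `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(status, body, user_agent, url):
    """Return True if `url` may be fetched, given the robots.txt response.

    Hypothetical helper for illustration; `status` is the HTTP status of
    the robots.txt request and `body` is its text content.
    """
    if status >= 400:
        # Assumption: robots.txt is missing or the server failed, so the
        # relaxed policy applies and access is assumed to be allowed.
        return True
    parser = RobotFileParser()
    # parse() accepts an iterable of lines and marks the rules as loaded.
    parser.parse(body.splitlines())
    return parser.can_fetch(user_agent, url)


rules = "User-agent: *\nDisallow: /private/\n"
print(allowed_by_robots(503, "", "Dosage", "http://example.com/comic/1"))
print(allowed_by_robots(200, rules, "Dosage", "http://example.com/private/x"))
```

Passing the status and body in as arguments keeps the decision separate from the network request, so the policy can be checked without contacting a server.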
Tobias Gruetzmacher
88e387ad15 Add Sleepless Domain. 2015-07-12 18:31:21 +02:00
Tobias Gruetzmacher
0b6d7425e1 Remove BladeKitten.
It's not available online anymore, only in print or as a PDF download.
2015-07-11 01:29:21 +02:00
Tobias Gruetzmacher
808b624e5f Remove hard dependency on pycountry again.
This basically reverts commit 86b31dc12b.

It now works like this: If the user has pycountry installed, it is used.
If not, Dosage falls back to a small internal list generated from
pycountry by scripts/mklanguages.py.

This means additional work if we ever decide to translate Dosage, since
pycountry already has all the translations for language names...

This fixes #23.
2015-07-11 01:27:39 +02:00
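
The optional-dependency pattern described in the commit above might look roughly like the following sketch. It is not the code from the commit: `FALLBACK_LANGUAGES` is a tiny hypothetical stand-in for the internal list generated by scripts/mklanguages.py, `language_name` is an illustrative helper, and the `alpha_2` lookup key assumes a recent pycountry release.

```python
try:
    import pycountry  # optional dependency, used when installed
except ImportError:
    pycountry = None

# Hypothetical stand-in for the generated internal language list.
FALLBACK_LANGUAGES = {"en": "English", "es": "Spanish", "de": "German"}


def language_name(code):
    """Return a display name for a two-letter language code."""
    if pycountry is not None:
        try:
            # Assumption: recent pycountry releases accept `alpha_2` here.
            lang = pycountry.languages.get(alpha_2=code)
        except LookupError:
            # Some releases raise instead of returning None.
            lang = None
        if lang is not None:
            return lang.name
    # No pycountry, or no match: use the internal list, else echo the code.
    return FALLBACK_LANGUAGES.get(code, code)


print(language_name("en"))
```

The fallback dictionary only needs the languages the bundled comics actually use, which keeps it small.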
Tobias Gruetzmacher
d97a9c63e4 Add Erstwhile. 2015-07-10 01:14:56 +02:00
Damjan Košir
7abca1222b added NerfNow 2015-07-07 22:18:06 +12:00
Damjan Košir
119a3cd13a added text to ScandinaviaAndTheWorld 2015-07-07 19:48:25 +12:00
Damjan Košir
5f243e3868 not a comic 2015-07-05 18:33:14 +12:00
Damjan Košir
5e7ad33fc8 Nnewts disabled 2015-07-05 18:32:33 +12:00
Damjan Košir
45012ff9c3 BladeKitten disabled 2015-07-05 18:31:38 +12:00
Tobias Gruetzmacher
0c6feec8cd Fix module name EastCoastVsWestCoast. 2015-06-24 00:51:42 +02:00
Damjan Košir
96572e8cba added TheMelvinChronicles 2015-06-12 21:00:11 +12:00
Damjan Košir
6412e6e542 fixed Spinnerette 2015-06-08 20:31:13 +12:00
Damjan Košir
3d8a49d228 realised TheWebcomicFactory is actually 28 comics... added them 2015-06-07 21:33:59 +12:00
Damjan Košir
05bb22b3ef added TheWebcomicFactory 2015-06-06 14:25:32 +12:00
Damjan Košir
c98800388e added Sithrah 2015-06-04 19:24:55 +12:00
Damjan Košir
010b4bf669 renaming comicpress to wordpress (as it's not just for the comicpress theme) 2015-06-04 19:12:40 +12:00
Damjan Košir
bc91f5f1fb added MistyTheMouse 2015-06-04 19:06:40 +12:00
Damjan Košir
e2d01e4924 fixed ScandinaviaAndTheWorld 2015-06-04 18:58:59 +12:00
Damjan Košir
545a67111e fixed Alice 2015-06-01 15:15:34 +12:00
Damjan Košir
a08ad2dc80 fixed GoGetARoomie 2015-06-01 15:11:16 +12:00
Damjan Košir
ceb19ed2bc fixed Wulffmorgenthaler (now Wumo), added TruthFacts and MeAndDanielle 2015-06-01 12:14:52 +12:00
Damjan Košir
4cd88ecdc0 fixed WormWorldSaga 2015-06-01 11:45:22 +12:00
Damjan Košir
ea6cb925a6 fixed LoadingArtist 2015-06-01 11:33:50 +12:00
Damjan Košir
e268b09567 fixed EarthsongSaga 2015-06-01 11:19:02 +12:00
Damjan Košir
29c8d2eea0 fixed Meek 2015-05-31 23:41:12 +12:00
Damjan Košir
9be6f613e4 fixed MysteriesOfTheArcana 2015-05-31 23:39:04 +12:00
Damjan Košir
3ea8236224 fixed FowlLanguage 2015-05-31 23:29:34 +12:00
Damjan Košir
c1245a85ad moved Footloose, added Cherry, Desigaspring 2015-05-31 23:23:02 +12:00
Damjan Košir
01aeebfbe4 fixed Footloose 2015-05-31 23:16:12 +12:00
Damjan Košir
029fa74067 fixed Bardsworth 2015-05-31 23:03:40 +12:00
Damjan Košir
f3036de8fd fixed Pimpette 2015-05-31 22:57:25 +12:00