Commit graph

852 commits

Author SHA1 Message Date
Tobias Gruetzmacher
a9f0dfdce4 Merge pull request #39 from peterjanes/peterjanes/sherman-fix
Fix Sherman's Lagoon
2016-04-03 22:20:04 +02:00
Tobias Gruetzmacher
926439cd14 Every comic needs a URL. 2016-04-03 22:03:16 +02:00
Tobias Gruetzmacher
2c6decb7f5 Move WebcomicFactory in its own module.
Also, add an updater script for it.
2016-04-03 21:31:56 +02:00
Peter Janes
759bd0c360 Fix Sherman's Lagoon 2016-04-03 14:54:41 -04:00
Tobias Gruetzmacher
bb1f20d867 Remove make_scraper for most WordPress comics.
- Dropped KatzenfutterGeleespritzer, because robots.txt.
- Move all WordPress/ComicPress scrapers into alphabetical files.
- Move _WordPressScraper & _ComicPress scraper into common.py.
- Some smaller PEP8 fixes.
2016-04-02 00:19:53 +02:00
Tobias Gruetzmacher
7f1e136d8b Sort comics alphabetically & PEP8 style fixes. 2016-03-31 23:13:54 +02:00
Tobias Gruetzmacher
d6db1d0b81 Fix a conflict with IPython. 2016-03-20 23:57:07 +01:00
Tobias Gruetzmacher
90dfceaeb1 Remove dead modules (& format). 2016-03-20 20:48:42 +01:00
Tobias Gruetzmacher
f243096d49 Fix GastroPhobia, remove GeneralProtectionFault.
(& formatting)
2016-03-20 20:11:21 +01:00
Tobias Gruetzmacher
cfcfcc2468 Switch plugin loading to pkgutil.
This should work with all PEP-302 loaders that implement iter_modules.
Unfortunately, PyInstaller (which I plan to use for Windows releases)
does not support it, so we can't avoid a special case. Anyway,
this should help with #22.
2016-03-20 15:13:24 +01:00
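A rough sketch of the pkgutil-based loading that the commit above describes, not the project's actual code. The function name `iter_plugin_modules` and the sorting step are assumptions made for illustration, and the PyInstaller special case is left out.

```python
import importlib
import pkgutil


def iter_plugin_modules(package_name):
    """Import and yield each module found directly inside the named package."""
    package = importlib.import_module(package_name)
    prefix = package.__name__ + "."
    # pkgutil.iter_modules asks the importers on the package path which
    # modules they provide; importers without iter_modules yield nothing.
    infos = pkgutil.iter_modules(package.__path__, prefix)
    # Sorting by name is an extra choice here to keep the order deterministic.
    for info in sorted(infos, key=lambda item: item.name):
        yield importlib.import_module(info.name)
```

Called with a package name such as the (assumed) "dosagelib.plugins", this would import every module in that package in alphabetical order.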
Tobias Gruetzmacher
1af022895e Fix NuklearPower (fixes #38).
Also remove make_scraper magic.
2016-03-17 23:19:52 +01:00
Tobias Gruetzmacher
552f29e5fc Update ComicFury comics. (+871, -245)
- Remove make_scraper magic
- Switch to HTML parser
- Update parsing of comic listing.
2016-03-17 00:44:06 +01:00
Tobias Gruetzmacher
6727e9b559 Use vendored urllib3.
As long as requests ships with urllib3, we can't fall back to the
"system" urllib3, since that breaks class-identity checks.
2016-03-16 23:18:19 +01:00
Damjan Košir
615f094ef3 fixing EdmundFinney 2016-03-14 20:32:18 +13:00
Tobias Gruetzmacher
c4fcd985dd Let urllib3 handle all retries. 2016-03-13 21:30:36 +01:00
Tobias Gruetzmacher
78e13962f9 Sort scraper modules (mostly for test stability). 2016-03-13 20:24:21 +01:00
Tobias Gruetzmacher
017d35cb3c Fallback version if pkg_resources not available.
This helps for Windows packaging.
2016-03-03 01:05:36 +01:00
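One way such a fallback could look, as a minimal sketch only. The helper name `get_version` and the fallback string are hypothetical; the commit above does not show which value the project actually falls back to.

```python
def get_version(dist_name, fallback="unknown"):
    """Return the installed version of dist_name, or fallback if unavailable."""
    try:
        # pkg_resources ships with setuptools and may be missing in
        # frozen or otherwise stripped-down environments.
        import pkg_resources
    except ImportError:
        return fallback
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        # The package metadata is absent, e.g. when running from a bundle.
        return fallback
```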
Johannes Schöpp
351fa7154e Modified maximum page size
Fixes #36
2016-03-01 22:19:44 +01:00
Damjan Košir
b0dc510b08 adding LastNerdsOnEarth 2016-01-03 14:16:58 +13:00
Damjan Košir
a1e79cbbf2 fixing Fragile 2016-01-03 14:08:49 +13:00
Tobias Gruetzmacher
81827f83bc Use GitHub releases API for update checks. 2015-11-06 23:07:19 +01:00
Tobias Gruetzmacher
a41574e31a Make version fetching a bit more robust (use pbr). 2015-11-06 22:08:14 +01:00
Tobias Gruetzmacher
64f7e313d5 Remove make_scraper magic from footloosecomic.py. 2015-11-05 00:03:13 +01:00
Tobias Gruetzmacher
7f7a69818b Remove make_scraper magic from creators module. 2015-11-04 23:43:31 +01:00
Tobias Gruetzmacher
94470d564c Fix import for Python 3. 2015-11-03 23:40:45 +01:00
Tobias Gruetzmacher
b819afec39 Switch build to PBR.
This gets us:
- Automatic changelog
- Automatic authors list
- Automatic git version management
2015-11-03 23:27:53 +01:00
Tobias Gruetzmacher
dc22d7b32a Add CatNine comic. 2015-11-02 23:29:56 +01:00
Tobias Gruetzmacher
10d9eac574 Remove support for very old versions of "requests". 2015-11-02 23:24:01 +01:00
MariusK
3e1ea816cc Fixed 'Ruthe' 2015-10-02 13:52:44 +02:00
Helge Stasch
48d8519efd Changed Goblins comic - moved to new scraper and fixed minor issues with some comics (old scraper was unstable for some comics of Goblins) 2015-09-28 23:50:15 +02:00
Helge Stasch
17fbdf2bf7 Added comic "Ahoy Earth" 2015-09-27 00:44:47 +02:00
Tobias Gruetzmacher
d72ceb92d5 BloomingFaeries: Remove imageUrlModifier (not needed). 2015-09-04 00:37:05 +02:00
Tobias Gruetzmacher
abd80a1d35 Merge pull request #28 from KevinAnthony/master
added comic Blooming Faeries
2015-09-03 23:26:37 +02:00
Tobias Gruetzmacher
b737218182 ZenPencils: Allow multiple images per page. 2015-09-03 23:24:28 +02:00
Kevin Anthony
62ec1f1d18 Removed debugging print state 2015-09-02 11:22:24 -04:00
Kevin Anthony
d7180eaf99 removed bad whitespace 2015-09-02 11:04:32 -04:00
Kevin Anthony
6e8231e78a Added Namer to BloomingFaeries since the web comic author doesn't seem interested in sticking to any kind of file naming convention 2015-09-02 11:01:48 -04:00
Kevin Anthony
1045bb7d4a added comic Blooming Faeries 2015-09-02 10:13:42 -04:00
Damjan Košir
11f0aa3989 created Wordpress Scraper class 2015-08-11 21:31:45 +12:00
Damjan Košir
0a5b792c32 added Fragile (English and Spanish) 2015-08-07 23:37:10 +12:00
Damjan Košir
fd9c480d9c adding bonus panel to SWBC and multiple images flag to ParserScraper 2015-08-03 22:58:44 +12:00
Damjan Košir
f8a163a361 added a CMS ComicControl, moved some existing comics there, added StreetFighter and Metacarpolis 2015-08-03 22:40:06 +12:00
Damjan Košir
648a84e38e added Sharksplode 2015-08-03 22:20:17 +12:00
Damjan Košir
c19806b681 added AoiHouse 2015-07-31 23:33:30 +12:00
Damjan Košir
2201c9877a added KiwiBlitz 2015-07-31 23:09:56 +12:00
Damjan Košir
fe22df5e5b added LetsSpeakEnglish 2015-07-31 23:06:06 +12:00
Damjan Košir
79ec427fc0 added CatVersusHuman 2015-07-30 22:16:34 +12:00
Tobias Gruetzmacher
303432fc68 Also use css expressions for textSearch. 2015-07-18 01:22:40 +02:00
Tobias Gruetzmacher
6a70bf4671 Enable some comics based on current policy. 2015-07-18 01:21:29 +02:00
Tobias Gruetzmacher
6b0046f9b3 Fix small typos. 2015-07-18 00:11:44 +02:00
Tobias Gruetzmacher
68d4dd463a Revert robots.txt handling.
This brings us back to only honouring robots.txt on page downloads, not
on image downloads.

Rationale: Dosage is not a "robot" in the classical sense. It's not
designed to spider huge amounts of web sites in search for some content
to index, it's only intended to help users keep a personal archive of
comics they are interested in. We try very hard to never download any image
twice. This fixes #24.

(Precedent for this rationale: Google Feedfetcher:
https://support.google.com/webmasters/answer/178852?hl=en#robots)
2015-07-17 20:46:56 +02:00
Tobias Gruetzmacher
7d3bd15c2f Remove AbleAndBaker, site is gone. 2015-07-16 00:49:48 +02:00
Tobias Gruetzmacher
472afa24d3 GoComics doesn't allow spiders, disable them...
This removes 757 comics, including quite popular ones like Calvin and
Hobbes, Garfield, FoxTrot, etc. :(
2015-07-16 00:36:10 +02:00
Tobias Gruetzmacher
7c15ea50d8 Also check robots.txt on image downloads.
We DO want to honour if images are blocked by robots.txt
2015-07-15 23:50:57 +02:00
Tobias Gruetzmacher
5affd8af68 More relaxed robots.txt handling.
This is in line with how Perl's LWP::RobotUA and Google handle server
errors when fetching robots.txt: Just assume access is allowed.

See https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
2015-07-15 19:11:55 +02:00
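The relaxed policy from the commit above can be approximated with the standard library's robot parser. This is a sketch under assumptions: the helper `robots_parser`, the "Dosage" user-agent string, and the choice to treat every non-2xx status (not only server errors) as "allowed" are illustrative and not taken from the project.

```python
from urllib.robotparser import RobotFileParser


def robots_parser(status, body=""):
    """Build a parser from an already-fetched robots.txt status and body."""
    parser = RobotFileParser()
    if 200 <= status < 300:
        parser.parse(body.splitlines())
    else:
        # Assumption: a failed fetch (including 5xx server errors) is
        # treated as "no robots.txt", so access is allowed.
        parser.allow_all = True
    return parser


# A server error no longer blocks the download.
relaxed = robots_parser(503)
print(relaxed.can_fetch("Dosage", "http://example.com/comic/1"))
```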
Tobias Gruetzmacher
88e387ad15 Add Sleepless Domain. 2015-07-12 18:31:21 +02:00
Tobias Gruetzmacher
0b6d7425e1 Remove BladeKitten.
It's not available online anymore, only in print or as a PDF download.
2015-07-11 01:29:21 +02:00
Tobias Gruetzmacher
808b624e5f Remove hard dependency on pycountry again.
This basically reverts commit 86b31dc12b.

It now works like this: If the user has pycountry installed, it is used.
If not, Dosage falls back to a small internal list generated from
pycountry by scripts/mklanguages.py.

This means additional work if we ever decide to translate Dosage, since
pycountry already has all the translations for language names...

This fixes #23.
2015-07-11 01:27:39 +02:00
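The optional-dependency pattern described in the commit above might be sketched as follows. `FALLBACK_LANGUAGES` and `language_name` are hypothetical stand-ins for the generated internal list and its lookup, and the `alpha_2` keyword assumes a reasonably recent pycountry release.

```python
# Hypothetical stand-in for the small list generated by scripts/mklanguages.py.
FALLBACK_LANGUAGES = {"de": "German", "en": "English", "es": "Spanish"}


def language_name(code):
    """Return the English name for a two-letter language code."""
    try:
        import pycountry
    except ImportError:
        # pycountry is optional: fall back to the bundled list.
        return FALLBACK_LANGUAGES.get(code, code)
    try:
        language = pycountry.languages.get(alpha_2=code)
    except LookupError:
        # Older pycountry releases raise instead of returning None.
        language = None
    if language is None:
        return FALLBACK_LANGUAGES.get(code, code)
    return language.name
```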
Tobias Gruetzmacher
d97a9c63e4 Add Erstwhile. 2015-07-10 01:14:56 +02:00
Damjan Košir
7abca1222b added NerfNow 2015-07-07 22:18:06 +12:00
Damjan Košir
119a3cd13a added text to ScandinaviaAndTheWorld 2015-07-07 19:48:25 +12:00
Damjan Košir
5f243e3868 not a comic 2015-07-05 18:33:14 +12:00
Damjan Košir
5e7ad33fc8 Nnewts disabled 2015-07-05 18:32:33 +12:00
Damjan Košir
45012ff9c3 BladeKitten disabled 2015-07-05 18:31:38 +12:00
Tobias Gruetzmacher
0c6feec8cd Fix module name EastCoastVsWestCoast. 2015-06-24 00:51:42 +02:00
Damjan Košir
96572e8cba added TheMelvinChronicles 2015-06-12 21:00:11 +12:00
Damjan Košir
6412e6e542 fixed Spinnerette 2015-06-08 20:31:13 +12:00
Damjan Košir
3d8a49d228 realised TheWebcomicFactory is actually 28 comics... added them 2015-06-07 21:33:59 +12:00
Damjan Košir
05bb22b3ef added TheWebcomicFactory 2015-06-06 14:25:32 +12:00
Damjan Košir
c98800388e added Sithrah 2015-06-04 19:24:55 +12:00
Damjan Košir
010b4bf669 renaming comicpress to wordpress (as it's not just for the comicpress theme) 2015-06-04 19:12:40 +12:00
Damjan Košir
bc91f5f1fb added MistyTheMouse 2015-06-04 19:06:40 +12:00
Damjan Košir
e2d01e4924 fixed ScandinaviaAndTheWorld 2015-06-04 18:58:59 +12:00
Damjan Košir
545a67111e fixed Alice 2015-06-01 15:15:34 +12:00
Damjan Košir
a08ad2dc80 fixed GoGetARoomie 2015-06-01 15:11:16 +12:00
Damjan Košir
ceb19ed2bc fixed Wulffmorgenthaler (now Wumo), added TruthFacts and MeAndDanielle 2015-06-01 12:14:52 +12:00
Damjan Košir
4cd88ecdc0 fixed WormWorldSaga 2015-06-01 11:45:22 +12:00
Damjan Košir
ea6cb925a6 fixed LoadingArtist 2015-06-01 11:33:50 +12:00
Damjan Košir
e268b09567 fixed EarthsongSaga 2015-06-01 11:19:02 +12:00
Damjan Košir
29c8d2eea0 fixed Meek 2015-05-31 23:41:12 +12:00
Damjan Košir
9be6f613e4 fixed MysteriesOfTheArcana 2015-05-31 23:39:04 +12:00
Damjan Košir
3ea8236224 fixed FowlLanguage 2015-05-31 23:29:34 +12:00
Damjan Košir
c1245a85ad moved Footloose, added Cherry, Desigaspring 2015-05-31 23:23:02 +12:00
Damjan Košir
01aeebfbe4 fixed Footloose 2015-05-31 23:16:12 +12:00
Damjan Košir
029fa74067 fixed Bardsworth 2015-05-31 23:03:40 +12:00
Damjan Košir
f3036de8fd fixed Pimpette 2015-05-31 22:57:25 +12:00
Damjan Košir
df7404fd7c fixed CatsAndCameras 2015-05-31 22:50:17 +12:00
Damjan Košir
d4cc8ac857 added buni 2015-05-27 20:36:11 +12:00
Damjan Košir
9beeceffad added BusinessCat and HappyJar 2015-05-27 20:34:51 +12:00
Damjan Košir
d970d27b14 removing duplicate 2015-05-27 00:10:46 +12:00
Damjan Košir
33abd95348 fixed TheGentlemansArmchair 2015-05-26 23:48:22 +12:00
Damjan Košir
5e123ae79e fixed DarkWings (now available under the real name Eryl as well), added Ashes, Laiyu, NoMoreSavePoints and EasilyAmused 2015-05-26 23:43:15 +12:00
Damjan Košir
9adb020fc2 fixed DemolitionSquad 2015-05-26 22:59:25 +12:00
Damjan Košir
605c5f8619 fixed PokeyThePenguin 2015-05-26 22:31:43 +12:00
Damjan Košir
766b7ba99d fixed ProperBarn, added 2214 and OTE 2015-05-26 22:16:55 +12:00
Damjan Košir
2c41435ceb fixing HijiNKS ENSUE and added all 4 comics on that page 2015-05-26 22:06:55 +12:00
Damjan Košir
465e7eaf6f fixing CowboyJedi kinda... there is currently no comic on the front page and the author knows it 2015-05-26 21:35:36 +12:00
Damjan Košir
529a41397a fixing CorydonCafe 2015-05-26 21:32:25 +12:00
Damjan Košir
c3abb93e99 fixing ChainsawSuit 2015-05-26 19:53:04 +12:00
Damjan Košir
f8690af029 fixing Curvy 2015-05-26 19:47:31 +12:00
Damjan Košir
36c790fa4b fixing CraftedFables 2015-05-26 19:32:12 +12:00
Damjan Košir
7067c51056 fixed CheckerboardNightmare 2015-05-25 22:19:36 +12:00
Damjan Košir
5569439c43 fixed 16 comics 2015-05-25 21:57:06 +12:00
Damjan Košir
3edaa97fb9 fixing KatzenfutterGeleespritzer 2015-05-25 20:06:58 +12:00
Damjan Košir
8a245e1d10 fixing BloodBound 2015-05-21 00:04:07 +12:00
Damjan Košir
dc2349951a moving BroodHollow to comicpress 2015-05-21 00:00:35 +12:00
Damjan Košir
a05ae9c75d fixing PandyLand 2015-05-20 23:56:49 +12:00
Damjan Košir
fd60065591 fixing OnTheEdge 2015-05-20 23:50:18 +12:00
Damjan Košir
80b783c016 fixing CourtingDisaster 2015-05-20 23:16:54 +12:00
Damjan Košir
ff239ff58e Merge branch 'comicpress' 2015-05-20 23:12:03 +12:00
Damjan Košir
77c5dbce9b better prevSearch for comic press 2015-05-20 23:08:02 +12:00
Damjan Košir
bc4e7a03f2 fixed BroodHollow 2015-05-20 23:03:15 +12:00
Damjan Košir
8de620c78b fixed CigarroAndCerveja 2015-05-20 22:58:13 +12:00
Damjan Košir
4529fdee3b adding no downsize option 2015-05-20 22:38:29 +12:00
Damjan Košir
77a9cce00d fixing Hipsters 2015-05-19 19:49:45 +12:00
Damjan Košir
79d775a8d9 adding comicpress scraper 2015-05-16 00:15:32 +12:00
Damjan Košir
962286d391 fixed OctopusPie 2015-05-14 23:06:12 +12:00
Damjan Košir
3bbf2d5c23 fixing neko the kitty 2015-05-14 22:42:04 +12:00
Damjan Košir
f75fc62e84 fixing pebbleversion 2015-05-14 22:33:46 +12:00
Helge Stasch
5a1ef9b791 Fixed problem with LookingForGroup comic 2015-05-07 13:57:10 +02:00
Damjan Košir
9a009018c7 adding strip Moonsticks 2015-05-07 23:00:55 +12:00
Helge Stasch
64a875388f Added Comic MaxOveracts 2015-05-04 14:06:01 +02:00
Marc Winkelmann
69e5b8ad93 Shermans Lagoon and On The Fastrack working again. Also corrected name. 2015-05-02 22:27:08 +02:00
DirkReiners
1438330a94 Fixes and Additions...
Fixed SabrinaOnline
Fixed SMBC
Added StandStillStaySilent (partial, prevsearch not working yet)
2015-04-29 10:37:14 -05:00
DirkReiners
749beff7a3 Added MareInternum (marecomic.com) 2015-04-29 10:36:12 -05:00
DirkReiners
273b429fcd Merge branch 'master' of https://github.com/webcomics/dosage 2015-04-29 09:51:47 -05:00
Damjan Košir
391313972c fixed ManlyGuysDoingManlyThings 2015-04-26 23:47:38 +12:00
Damjan Košir
9837a87a43 fixed omake theater 2015-04-26 23:32:22 +12:00
Damjan Košir
8df9d20556 added doctor cat 2015-04-26 22:32:52 +12:00
Damjan Košir
dc427d6066 fixed the gamercat 2015-04-26 21:52:31 +12:00
Damjan Košir
561005887a unneeded max 2015-04-26 00:23:45 +12:00
Damjan Košir
ac7b0d7e0e adding parallel run option 2015-04-26 00:19:08 +12:00
Damjan Košir
1e94a3c7c5 now the same as official version 2015-04-25 20:52:03 +12:00
Damjan Košir
dae2698102 removing mismerge 2015-04-25 20:40:28 +12:00
Damjan Košir
dc014a7cb4 Merge remote-tracking branch 'upstream/master'
Conflicts:
	dosagelib/plugins/e.py
	dosagelib/plugins/i.py
	dosagelib/plugins/n.py
	dosagelib/plugins/s.py
	dosagelib/plugins/t.py
	dosagelib/plugins/w.py
2015-04-25 20:28:27 +12:00
DirkReiners
b8ef6958b9 Merge branch 'master' of https://github.com/webcomics/dosage 2015-04-24 15:38:36 -05:00
Helge Stasch
4cdd92dcd7 Added comic Magellan 2015-04-23 09:12:24 +02:00
Tobias Gruetzmacher
9f33c31c68 Merge pull request #12 from Freestila/master
Changed comic name, since comic is named FowlLanguage instead of FoulLan...

Conflicts:
	dosagelib/plugins/f.py
2015-04-22 22:24:26 +02:00
Tobias Gruetzmacher
bf9f45b380 Switch to setuptools and cleanup metadata.
py2exe support is gone for now, will be restored later.
2015-04-22 22:22:03 +02:00
Helge Stasch
8218e805b2 Changed comic name, since comic is named FowlLanguage instead of FoulLanguage 2015-04-22 21:25:10 +02:00
Tobias Gruetzmacher
bf9bf5e9b0 Merge pull request #11 from Freestila/master
Added "Ralf the Destroyer"
2015-04-21 23:46:11 +02:00
Tobias Gruetzmacher
86b31dc12b Depend on pycountry directly. 2015-04-21 21:56:54 +02:00
Helge Stasch
d7e9c8eb94 Added "Ralf the Destroyer" 2015-04-21 19:12:40 +02:00
Tobias Gruetzmacher
d5e7690419 Fix size comparison for RSS & HTML output.
This was always broken, but somehow worked with Python 2.7 (WTF?). Now
that we test with Pillow, this code path runs with Python 3 and throws
an error.
2015-04-21 00:01:23 +02:00
Tobias Gruetzmacher
ff21df596b Remove descriptions and genres (closes #9).
Maintaining the descriptions creates quite a bit of overhead (finding
them, copying them, checking if they are still correct) for a minimal
user benefit.

PS: Viewing this diff should be easier in a difftool that shows changes
in a line, for example kdiff3.
2015-04-20 20:29:09 +02:00
Tobias Gruetzmacher
3b33129e58 Fix ViiviJaWagner. 2015-04-18 22:45:13 +02:00
Tobias Gruetzmacher
e8af5adcb8 Update list of supported GoComics comics. 2015-04-18 02:04:31 +02:00
Tobias Gruetzmacher
f0831a1f0f Fix and update ArcaMax (fixes #8). 2015-04-17 21:53:13 +02:00
DirkReiners
99f33151e2 Merge branch 'master' of https://github.com/webcomics/dosage 2015-04-16 18:36:42 -05:00
DirkReiners
8f3a9f660a Fixed ASofterWorld 2015-04-16 18:35:21 -05:00
DirkReiners
49b964cb3c Added PS238 2015-04-16 18:20:14 -05:00
Manabi
65c021ef2b Fixed IAmArg 2015-04-15 14:43:06 -04:00
Manabi
475739ea60 Fixing DogHouseDiaries 2015-04-15 12:56:03 -04:00
Manabi
c0619e8dca Fixing DogHouseDiaries 2015-04-15 12:51:45 -04:00
Manabi
2b98a9023e Added Peanuts Begins & Wizard of Id Classics 2015-04-13 22:26:12 -04:00
Tobias Gruetzmacher
974752951b Fix xkcd (closes #3), remove adult tag (fixes wummel#85). 2015-04-12 20:06:34 +02:00
Tobias Gruetzmacher
5934f03453 Merge branch 'htmlparser' - I think it's ready.
This closes pull request #70.
2015-04-01 22:13:55 +02:00
Tobias Gruetzmacher
614c25e278 Fix coding style. 2015-03-22 17:13:53 +01:00
Tobias Gruetzmacher
e94e2ae432 Merge pull request #95 from serenitas50/master
Added comic Beetlebum (http://blog.beetlebum.de/).
2015-03-22 17:04:36 +01:00
Tobias Gruetzmacher
b5ed4c56b6 Merge pull request #94 from Manabi/master
Added definition for Drive comic

Conflicts:
	dosagelib/plugins/g.py
2015-03-22 16:34:07 +01:00
Tobias Gruetzmacher
b5368b366a Merge Gaia(German), SandraAndWoo(German) into common base.
This also fixes #97 by correcting the imageSearch regex.
2015-02-04 19:41:52 +01:00
Manabi
f85464ccb2 Fixed unclosed ' error
Lines 293/294 should have been one line, this is now fixed.
2015-02-02 04:35:49 -05:00
Manabi
190f53ee4d Fixing name of GunnkriggCourt
Existing name was missing a g.
2015-02-02 04:24:32 -05:00
Serenitas50
94004846cd Added comic Beetlebum (http://blog.beetlebum.de/). 2015-01-31 22:07:35 -02:00
Manabi
a5b0d0c5de Added definition for Drive comic 2015-01-26 04:21:24 -05:00
Dirk Reiners
b710d3fa81 Merge branch 'master' of https://github.com/wummel/dosage 2015-01-16 13:24:48 -06:00
Dirk Reiners
c6f0dd6117 PiledHigherAndDeeper: Fix for new website format 2015-01-16 12:06:17 -06:00
Dirk Reiners
e25270c866 Dilbert: Fix for new website format 2015-01-16 12:05:53 -06:00
Dirk Reiners
3724eba835 Cyanide And Happiness: Fix for new website format 2015-01-16 12:05:36 -06:00
Tobias Gruetzmacher
f8531eca57 Move SinFest back to KeenSpot namespace. 2015-01-16 00:16:28 +01:00
Tobias Gruetzmacher
4733153d01 Merge pull request #87 from rpglover64/master
Update SinFest to work with new website.
2015-01-16 00:15:04 +01:00
Alex Rozenshteyn
a0506b22f0 Update ZenPencils URL. 2014-12-16 13:51:52 -05:00
Alex Rozenshteyn
51996e45ed Update SinFest to work with new website. 2014-12-16 12:01:54 -05:00
Tobias Gruetzmacher
2c1ff889fa Fix scope in HTML output. 2014-12-10 00:57:17 +01:00
Tobias Gruetzmacher
b7bc16650a Merge branch 'carlosefonseca/master' 2014-12-10 00:07:21 +01:00
Tobias Gruetzmacher
5af4f45505 Merge branch 'zac9/patch-2' 2014-12-10 00:03:08 +01:00
Tobias Gruetzmacher
32265c99d7 Merge branch 'zac9/patch-1' 2014-12-10 00:00:51 +01:00
Carlos Fonseca
04cc07a466 Added comic Nimona 2014-12-08 13:28:37 +00:00
mbrandis
25cf4888ae - Adapted ShermansLagoon
- Better version of OnTheFastTrack
2014-11-14 20:37:06 +01:00
mbrandis
c63f927e5c - Modified OnTheFasttrack adapting the new API. 2014-11-14 20:09:42 +01:00
mbrandis
cd48801b0d - Added next and previous day at end of page. 2014-11-14 15:39:42 +01:00
Dirk Reiners
fda654b5e0 Some fixes...
AbstruseGoose: fixed prev
Carciphona: fixed latest
Curtailed: fixed image and prev (moved to WP)
DorkTower: fixed image search
GrrlPower: fixed site name issue
MadamAndEve: archive not updated in a long time, but current strip is.
Works, but needs to be run daily.
PennyArcade: fixed namer
PvPonline: fixed prev
2014-10-24 16:42:32 -05:00
Dirk Reiners
77a5e09c10 Minor fix for using paths to pick comics 2014-10-24 16:39:40 -05:00
Tobias Gruetzmacher
6769e1eb36 Add StrongFemaleProtagonist.
This uses the _ParserScraper and CSS selectors.
2014-10-13 23:39:50 +02:00
Tobias Gruetzmacher
1d52d6a152 Add support for CSS selectors to HTML parser.
Each comic module author can decide if she wants to use CSS or XPath,
not a mix of both. Using CSS needs the cssselect python module and the
module gets disabled if it is unavailable.
2014-10-13 22:43:06 +02:00
Tobias Gruetzmacher
17bc454132 Bugfix: Don't assume RE patterns in base class. 2014-10-13 22:29:47 +02:00
Tobias Gruetzmacher
e92a3fb3a1 New feature: Comic modules can be "disabled".
This is modeled parallel to the "adult" feature, except the user can't
override it via the command line. Each comic module can override the
classmethod getDisabledReasons and give the user a reason why this
module is disabled. The user can see the reason in the comic list (-l or
--singlelist) and the comic module refuses to run, showing the same
message.

This is currently used to disable modules that use the _ParserScraper if
the LXML python module is missing.
2014-10-13 21:43:46 +02:00
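A minimal sketch of how the "disabled" mechanism from the commit above could be structured. Only the classmethod name getDisabledReasons and the _ParserScraper/LXML relationship come from the commit message; the dict return type, the class bodies, and the `check_enabled` helper are assumptions for illustration.

```python
class Scraper:
    """Base class: enabled unless a subclass reports reasons."""

    @classmethod
    def getDisabledReasons(cls):
        # Assumed shape: a dict mapping a short key to a user-facing message.
        return {}


class ParserScraper(Scraper):
    """Scraper that depends on the optional lxml module."""

    @classmethod
    def getDisabledReasons(cls):
        reasons = {}
        try:
            import lxml  # noqa: F401
        except ImportError:
            reasons["lxml"] = "This module needs the lxml Python module."
        return reasons


def check_enabled(scraper_class):
    """Refuse to run a disabled module, showing the same reasons as the list."""
    reasons = scraper_class.getDisabledReasons()
    if reasons:
        raise ValueError("disabled: " + "; ".join(reasons.values()))
```

Unlike the "adult" flag, nothing in this sketch lets the caller override the check, which matches the behaviour the commit describes.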
Tobias Gruetzmacher
d495d95ee0 Refactor: Move repeated check into its own function. 2014-10-13 21:29:54 +02:00
Tobias Gruetzmacher
3235b8b312 Pass unicode strings to lxml.
This reverts commit fcde86e9c0 & some
more. This lets python-requests do all the encoding stuff and leaves
LXML with (hopefully) clean unicode HTML to parse.
2014-10-13 19:39:48 +02:00
zac9
6ca200419a Update s.py 2014-09-28 19:48:26 -07:00
zac9
5b7ab5a711 Update o.py 2014-09-28 19:41:29 -07:00
zac9
491b5457b2 Added comic ShotgunShuffle 2014-09-28 06:29:02 -07:00
Bastian Kleineidam
731291979d Fixed RedMeat. 2014-09-22 22:14:31 +02:00
Bastian Kleineidam
e43694c156 Don't crash on multiple HTML output runs per day. 2014-09-22 22:00:16 +02:00
Bastian Kleineidam
e87f5993b8 Merge branch 'master' into htmlparser 2014-08-07 18:10:15 +02:00
Tobias Gruetzmacher
08175d28c9 Fix Ruthe (see #73). 2014-07-31 21:27:49 +02:00
Tobias Gruetzmacher
ca2d722d39 Fix DieFruehreifen (closes #73). 2014-07-31 21:18:15 +02:00
Tobias Gruetzmacher
6c7fb176b1 Add Blade Kitten as an example for the new parser. 2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
f9f0b75d7c Create new HTML parser based scraper class. 2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
fcde86e9c0 Change getPageContent to (optionally) return raw text.
This allows LXML to do its own "magic" encoding detection
2014-07-26 11:28:43 +02:00