Commit graph

768 commits

Author SHA1 Message Date
Tobias Gruetzmacher b9d9564085 Fix Dilbert (fixes #44). 2016-05-16 01:21:23 +02:00
Tobias Gruetzmacher e9b3c487c0 Remove some dead comics. 2016-05-16 01:10:20 +02:00
Tobias Gruetzmacher bd60155d9f Some more ComicFury comics gone... 2016-05-16 00:53:22 +02:00
Tobias Gruetzmacher 849e60e795 Remove make_scraper magic from webcomiceu. 2016-05-07 03:20:01 +02:00
Tobias Gruetzmacher 975d2376bf Another round of comic module fixes. 2016-05-07 01:50:10 +02:00
Tobias Gruetzmacher efe1308db2 Replace home-grown Python2/3 compat. with six. 2016-05-05 23:33:48 +02:00
Tobias Gruetzmacher 77ed0218e0 Fix some comic modules. 2016-05-05 20:55:14 +02:00
Tobias Gruetzmacher bb2ac39639 Fix some URLs. 2016-05-05 10:12:03 +02:00
Tobias Gruetzmacher d05316e3ac Seems ComicFury is deleting comics regularly...
Well, there's nothing we can do: Remove them.
2016-05-04 08:26:53 +02:00
Tobias Gruetzmacher 0c1aa9e8bd Move libxml < 2.9.3 workaround to base class. 2016-05-02 23:22:06 +02:00
Tobias Gruetzmacher b93a8fde65 Move PensAndTales comics and fix them. 2016-05-02 22:32:14 +02:00
Tobias Gruetzmacher 4006ced43d Move all HijinksEnsue comics into alphabetic files. 2016-05-02 01:25:34 +02:00
Tobias Gruetzmacher d5f91ecfd2 Fix some modules in m.py. 2016-04-30 01:59:28 +02:00
Tobias Gruetzmacher 1d52d33311 Remove missing SmackJeeves comics. 2016-04-30 00:56:20 +02:00
Tobias Gruetzmacher d796f3476c Fix some modules in d.py. 2016-04-30 00:44:18 +02:00
Tobias Gruetzmacher cc16fea880 Fix some modules in c.py 2016-04-29 00:35:02 +02:00
Tobias Gruetzmacher 1d94439715 Fix some more comic modules. 2016-04-27 00:31:27 +02:00
Tobias Gruetzmacher 8b1ac4eb35 Fix "tagsoup" on SmackJeeves
Unfortunately, browsers render < outside of HTML tags differently than
libXML did until recently (libXML 2.9.3), so we need to preprocess pages
before parsing them...

(This was fixed in libXML commit 140c25)
2016-04-26 08:05:38 +02:00
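The preprocessing step described in this commit message could look roughly like the sketch below. It is an illustration only, not the project's actual code: the helper name `escape_stray_lt` and the regular expression are assumptions, and a `<` is treated as stray whenever it is not followed by a letter, `/`, `!` or `?`.

```python
import re

# Assumption: a "<" that does not open a tag, closing tag, comment/doctype
# or processing instruction was meant as literal text.
_STRAY_LT = re.compile(r"<(?![A-Za-z/!?])")


def escape_stray_lt(html):
    """Replace stray "<" with "&lt;" before parsing (hypothetical helper)."""
    return _STRAY_LT.sub("&lt;", html)


# The tags survive untouched; only the two literal "<" characters are escaped.
print(escape_stray_lt("<p>1 < 2 and <b>bold</b> <3</p>"))
# prints: <p>1 &lt; 2 and <b>bold</b> &lt;3</p>
```

With a libXML that already contains the upstream fix, a step like this should be unnecessary.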
Tobias Gruetzmacher 035d6e94e4 Allow output level for warnings and errors. 2016-04-26 07:53:53 +02:00
Tobias Gruetzmacher 8ddf553eb4 Fix some more SmackJeeves modules. 2016-04-22 01:04:47 +02:00
Tobias Gruetzmacher fd85c8583a Unify similar code in fetchUrl and fetchText 2016-04-22 00:42:46 +02:00
Tobias Gruetzmacher 6574997e01 Refactor: All the other class methods.
Turns out, it would have been better if all methods had been instance
methods and not class methods. This finished a big chunk of the rework
needed for #42.
2016-04-21 23:52:31 +02:00
Tobias Gruetzmacher 0d436b8ca9 Refactor: url modifiers to normal methods.
As before, to implement #42 these might want to access information from
the instance, so they should be normal methods.
2016-04-21 21:39:25 +02:00
Tobias Gruetzmacher c3f32dfef7 Refactor: Make namer a method.
When #42 is realized, the naming of files might differ between comic
modules, so the namer's logical location is the instance, not the class.
2016-04-21 08:20:49 +02:00
Tobias Gruetzmacher 5bd2a49f48 Add debug output on matched XPath/CSS expression. 2016-04-20 23:51:54 +02:00
Tobias Gruetzmacher fe51a449df Update SmackJeeves
- Now uses _ParserScraper, which makes the pattern quite a bit more
  generic and IMHO more readable
- remove make_scraper magic
- No new comics, only fixed existing ones and removed some dead ones.
2016-04-20 23:36:45 +02:00
Tobias Gruetzmacher 190cd3b063 Convert language & getDisabledReasons to methods.
Both are more properties of a webcomic (this is part of the design
changes for #42)
2016-04-19 23:53:46 +02:00
Tobias Gruetzmacher df46907f39 Register EXSLT extensions by default.
This allows comic module authors to use the full power of regular
expressions in XPath expressions, see http://exslt.org/regexp/regexp.html
for usage. Please be aware that these use the prefix re: instead of
regexp: here.
2016-04-19 23:48:14 +02:00
Tobias Gruetzmacher 4204f5f1e4 Send "If-Modified-Since" header for images. 2016-04-19 00:36:50 +02:00
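As a rough illustration of a conditional image request, the sketch below builds an "If-Modified-Since" header from the modification time of an already downloaded file, using only the standard library. The helper name `conditional_headers` and the choice of the file's mtime as the reference date are assumptions, not taken from the project's code.

```python
import os
from email.utils import formatdate


def conditional_headers(path):
    """Build headers for re-fetching an image saved at `path` (hypothetical helper)."""
    headers = {}
    if os.path.exists(path):
        # HTTP dates are always expressed in GMT.
        mtime = os.path.getmtime(path)
        headers["If-Modified-Since"] = formatdate(mtime, usegmt=True)
    return headers
```

A server that honors the header answers with 304 Not Modified and no body when the image is unchanged, so the download can be skipped.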
Tobias Gruetzmacher 13a3409854 Remove some comics that are gone or block us. 2016-04-17 19:42:43 +02:00
Tobias Gruetzmacher 1fbc844077 Update GoComics. 2016-04-17 18:40:09 +02:00
Tobias Gruetzmacher 73e958670d Update ComicFury (again). 2016-04-17 16:19:44 +02:00
Tobias Gruetzmacher b0481a01f7 Update languages. 2016-04-16 13:14:12 +02:00
Tobias Gruetzmacher 3329027e4b Update ComicFury. 2016-04-16 13:13:47 +02:00
Tobias Gruetzmacher ee99c087d7 Remove prevUrlMatchesStripUrl.
It was only used for one test.
2016-04-16 01:14:26 +02:00
Tobias Gruetzmacher 92a688457a Remove useless indirection. 2016-04-15 23:42:24 +02:00
Tobias Gruetzmacher 52515b5fc5 Update GoComics. 2016-04-15 00:26:14 +02:00
Tobias Gruetzmacher 031a523846 Fix SnafuComics. 2016-04-14 23:52:35 +02:00
Tobias Gruetzmacher 7626b1e100 Webcomics Nation is gone. 2016-04-14 22:46:52 +02:00
Tobias Gruetzmacher 497653c448 Remove make_scraper magic from Arcamax. 2016-04-14 00:17:59 +02:00
Tobias Gruetzmacher db87ed95e7 Use new features to make modules simpler. 2016-04-13 23:28:43 +02:00
Tobias Gruetzmacher b266e28ae1 Remove debugging prints 😭 2016-04-13 22:59:06 +02:00
Tobias Gruetzmacher ff3b824311 Fix variable shadowing... 2016-04-13 22:43:34 +02:00
Tobias Gruetzmacher 060281e5ff Use concrete scraper objects everywhere.
This is a first step for #42. Since most access to the scraper classes
is through instances, modules can now dynamically override url and name
(name is now a property).
2016-04-13 22:17:30 +02:00
Tobias Gruetzmacher 0468f2f31a Refactor: Convert starter to simple method. 2016-04-13 20:01:51 +02:00
Tobias Gruetzmacher 16004e43e4 Use default bounceStarter for site modules. 2016-04-13 01:24:13 +02:00
Tobias Gruetzmacher 9028724a74 Clean up update helper scripts. 2016-04-13 00:52:16 +02:00
Tobias Gruetzmacher 42e43fa4e6 Read starter parameters from class.
This allows specifying starters in a more declarative and dynamic way.
2016-04-12 23:11:39 +02:00
Tobias Gruetzmacher b865a171f9 Remove some broken comics. 2016-04-12 08:21:06 +02:00
Tobias Gruetzmacher 4e2e4ac529 Prevent scraper from moving to a different comic. 2016-04-12 08:10:47 +02:00