784 commits

Author SHA1 Message Date
André-Patrick Bubel 192751073c Add KillSixBillionDemons comic 2016-05-31 07:28:32 +00:00
Tobias Gruetzmacher 807bee6342 Migrate GoComics to single-class module. 2016-05-23 00:01:10 +02:00
Tobias Gruetzmacher 2c8e57bdea Migrate Creators to single-class module. 2016-05-22 23:56:59 +02:00
Tobias Gruetzmacher f5dff27b0a Migrate SmackJeeves to single-class module. 2016-05-22 23:54:21 +02:00
Tobias Gruetzmacher 1ea20e1743 Migrate WebcomicFactory to single-class module. 2016-05-22 23:40:58 +02:00
Tobias Gruetzmacher c62a7283a2 Migrate ComicFury to single-class module. 2016-05-22 23:31:53 +02:00
Tobias Gruetzmacher 1834bf179f Migrate Arcamax to single-class module. 2016-05-22 23:17:24 +02:00
Tobias Gruetzmacher f29472c143 Make auto-update script more flexible. 2016-05-22 23:06:05 +02:00
Tobias Gruetzmacher e4650d5941 Remove make_scraper from Nitrocosm. 2016-05-21 14:35:53 +02:00
Tobias Gruetzmacher b6eb8ab8ef Remove make_scraper from SandraAndWoo 2016-05-21 14:12:11 +02:00
Tobias Gruetzmacher 4630ea047c Implement Oglaf's strange navigation (fixes #33)
(also should fix wummel#91)
2016-05-21 02:38:07 +02:00
Tobias Gruetzmacher 51008a975b Refactor: Introduce generator methods for scrapers
This allows one comic module class to generate multiple scrapers. This
change is to support a more dynamic module system as described in #42.
2016-05-21 01:29:36 +02:00
Tobias Gruetzmacher 89cfd9d310 Add comics from catomix.com. 2016-05-16 23:55:41 +02:00
Tobias Gruetzmacher a6cf4e7040 Fix some more comic modules. 2016-05-16 23:16:29 +02:00
Tobias Gruetzmacher be1a63da0c Update GoComics comic list. 2016-05-16 18:26:45 +02:00
Tobias Gruetzmacher 6d3f74142c Move command line tool into package.
This way we can use the default Python console_scripts install process.
2016-05-16 14:57:47 +02:00
Tobias Gruetzmacher b9d9564085 Fix Dilbert (fixes #44). 2016-05-16 01:21:23 +02:00
Tobias Gruetzmacher e9b3c487c0 Remove some dead comics. 2016-05-16 01:10:20 +02:00
Tobias Gruetzmacher bd60155d9f Some more ComicFury comics gone... 2016-05-16 00:53:22 +02:00
Tobias Gruetzmacher 849e60e795 Remove make_scraper magic from webcomiceu. 2016-05-07 03:20:01 +02:00
Tobias Gruetzmacher 975d2376bf Another round of comic module fixes. 2016-05-07 01:50:10 +02:00
Tobias Gruetzmacher efe1308db2 Replace home-grown Python2/3 compat. with six. 2016-05-05 23:33:48 +02:00
Tobias Gruetzmacher 77ed0218e0 Fix some comic modules. 2016-05-05 20:55:14 +02:00
Tobias Gruetzmacher bb2ac39639 Fix some URLs. 2016-05-05 10:12:03 +02:00
Tobias Gruetzmacher d05316e3ac Seems ComicFury is deleting comics regularly...
Well, there's nothing we can do: Remove them.
2016-05-04 08:26:53 +02:00
Tobias Gruetzmacher 0c1aa9e8bd Move libxml < 2.9.3 workaround to base class. 2016-05-02 23:22:06 +02:00
Tobias Gruetzmacher b93a8fde65 Move PensAndTales comics and fix them. 2016-05-02 22:32:14 +02:00
Tobias Gruetzmacher 4006ced43d Move all HijinksEnsue comics into alphabetic files. 2016-05-02 01:25:34 +02:00
Tobias Gruetzmacher d5f91ecfd2 Fix some modules in m.py. 2016-04-30 01:59:28 +02:00
Tobias Gruetzmacher 1d52d33311 Remove missing SmackJeeves comics. 2016-04-30 00:56:20 +02:00
Tobias Gruetzmacher d796f3476c Fix some modules in d.py. 2016-04-30 00:44:18 +02:00
Tobias Gruetzmacher cc16fea880 Fix some modules in c.py 2016-04-29 00:35:02 +02:00
Tobias Gruetzmacher 1d94439715 Fix some more comic modules. 2016-04-27 00:31:27 +02:00
Tobias Gruetzmacher 8b1ac4eb35 Fix "tagsoup" on SmackJeeves
Unfortunately, browsers render < outside of HTML tags differently than
libXML did until recently (libXML 2.9.3), so we need to preprocess pages
before parsing them...

(This was fixed in libXML commit 140c25)
2016-04-26 08:05:38 +02:00
Tobias Gruetzmacher 035d6e94e4 Allow output level for warnings and errors. 2016-04-26 07:53:53 +02:00
Tobias Gruetzmacher 8ddf553eb4 Fix some more SmackJeeves modules. 2016-04-22 01:04:47 +02:00
Tobias Gruetzmacher fd85c8583a Unify similar code in fetchUrl and fetchText 2016-04-22 00:42:46 +02:00
Tobias Gruetzmacher 6574997e01 Refactor: All the other class methods.
Turns out, it would have been better if all methods had been instance
methods and not class methods. This finished a big chunk of the rework
needed for #42.
2016-04-21 23:52:31 +02:00
Tobias Gruetzmacher 0d436b8ca9 Refactor: url modifiers to normal methods.
As before, to implement #42 these might want to access information from
the instance, so they should be normal methods.
2016-04-21 21:39:25 +02:00
Tobias Gruetzmacher c3f32dfef7 Refactor: Make namer a method.
When #42 is realized, the naming of files might differ between comic
modules, so the namer's logical location is the instance, not the class.
2016-04-21 08:20:49 +02:00
Tobias Gruetzmacher 5bd2a49f48 Add debug output on matched XPath/CSS expression. 2016-04-20 23:51:54 +02:00
Tobias Gruetzmacher fe51a449df Update SmackJeeves
- Now uses _ParserScraper, which makes the pattern quite a bit more
  generic and IMHO more readable
- remove make_scraper magic
- No new comics, only fixed existing ones and removed some dead ones.
2016-04-20 23:36:45 +02:00
Tobias Gruetzmacher 190cd3b063 Convert language & getDisabledReasons to methods.
Both are more properties of a webcomic (this is part of the design
changes for #42)
2016-04-19 23:53:46 +02:00
Tobias Gruetzmacher df46907f39 Register EXSLT extensions by default.
This allows comic module authors to use the full power of regular
expressions in XPath expressions, see http://exslt.org/regexp/regexp.html
for usage. Please be aware that these use the prefix re: instead of
regexp: here.
2016-04-19 23:48:14 +02:00
Tobias Gruetzmacher 4204f5f1e4 Send "If-Modified-Since" header for images. 2016-04-19 00:36:50 +02:00
Tobias Gruetzmacher 13a3409854 Remove some comics that are gone or block us. 2016-04-17 19:42:43 +02:00
Tobias Gruetzmacher 1fbc844077 Update GoComics. 2016-04-17 18:40:09 +02:00
Tobias Gruetzmacher 73e958670d Update ComicFury (again). 2016-04-17 16:19:44 +02:00
Tobias Gruetzmacher b0481a01f7 Update languages. 2016-04-16 13:14:12 +02:00
Tobias Gruetzmacher 3329027e4b Update ComicFury. 2016-04-16 13:13:47 +02:00
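
Note on commit 51008a975b ("Introduce generator methods for scrapers"): the message says one comic module class can now generate multiple scrapers. The log does not show the implementation, so the following is only a rough sketch of that idea. All names here (Scraper, getmodules, ExampleHost, SingleComic, all_scrapers, the example.com URLs) are hypothetical and chosen for illustration, not taken from the project's code.

```python
class Scraper:
    """Hypothetical base class: one instance represents one comic."""

    def __init__(self, name):
        self.name = name

    @classmethod
    def getmodules(cls):
        # Default generator method (assumed name): a module class
        # produces exactly one scraper, named after the class itself.
        return [cls(cls.__name__)]


class SingleComic(Scraper):
    """A plain module: inherits the one-scraper default."""


class ExampleHost(Scraper):
    """Hypothetical hosting site serving many comics with one layout."""

    def __init__(self, name, sub):
        super().__init__("ExampleHost/" + name)
        # Per-comic state lives on the instance, not on the class.
        self.url = "http://%s.example.com/" % sub

    @classmethod
    def getmodules(cls):
        # One class, several scraper instances.
        return [cls("Alpha", "alpha"), cls("Beta", "beta")]


def all_scrapers(classes):
    """Collect the scrapers generated by every module class."""
    return [scraper for cls in classes for scraper in cls.getmodules()]


scrapers = all_scrapers([SingleComic, ExampleHost])
names = [s.name for s in scrapers]
```

In this sketch, a hosting-site module no longer needs a factory function that creates one class per comic; it lists its comics in a single generator method. This also suggests why the earlier refactoring commits moved per-comic data from class attributes and class methods onto instances.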