Commit graph

89 commits

Author SHA1 Message Date
Tobias Gruetzmacher 23125c74d4 Unify XPath NS config over modules 2024-03-17 21:44:46 +01:00
Tobias Gruetzmacher b4bcb65249 Update GoComics modules 2024-02-14 23:39:08 +01:00
Tobias Gruetzmacher 9dbde1bdba Update flake8 & plugins (#251)
Additionally, this adds some hackery to let flake8 read its config from
pyproject.toml.
2022-12-11 20:15:09 +01:00
Tobias Gruetzmacher f00696813c Remove old location for Widdershins 2022-06-06 16:48:39 +02:00
Tobias Gruetzmacher 8e1e398a8d Deprecate underscore-prefixed parent classes
This is trying to strike a balance between updating as many existing
classes as possible and not making the diff too big...
2022-06-06 12:08:32 +02:00
Tobias Gruetzmacher 0d8871b253 Update GoComics module 2022-06-05 20:23:56 +02:00
Tobias Gruetzmacher 2ecfcaec17 Update GoComics 2021-02-01 00:19:22 +01:00
Tobias Gruetzmacher e64635e86b Stricter style checking & related style fixes 2020-10-11 20:15:27 +02:00
Tobias Gruetzmacher 8d7fd8b884 Update GoComics modules
The usual: GoComics removed some comics, added some and renamed some...
2020-09-28 01:15:07 +02:00
Tobias Gruetzmacher 7a176b29f2 Replace xpath_class with custom xpath function 2020-07-31 22:56:30 +02:00
DavidAccola 131deeaa34 Add False Knees and Yes, I'm Hot in This (#161) 2020-04-20 01:00:38 +02:00
Tobias Gruetzmacher 27d28b8eef Update file headers
The default encoding for source files is UTF-8 since Python 3, so we can
drop all encoding headers. While we are at it, just replace them with
SPDX headers.
2020-04-18 13:45:44 +02:00
Tobias Gruetzmacher c4b7d5b930 Fix index feature for GoComics (fixes #155) 2020-03-26 00:43:43 +01:00
Tobias Gruetzmacher 44791439a5 Drop Python 2 support: Obsolete future statements 2020-02-04 01:06:19 +01:00
Tobias Gruetzmacher 58611fe600 Remove missing GoComics submodules 2019-12-29 02:17:52 +01:00
Tobias Gruetzmacher 66f154f074 Add throttling for GoComics (fixes #90)
Since this was the goal of the whole throttling implementation ;)
2019-12-04 00:28:27 +01:00
Tobias Gruetzmacher 49ec3cc3fa Fix (and simplify) GoComics expressions (fixes #117) 2018-07-14 11:00:27 +02:00
Peter Janes 2a2ff2d545 GoComics no longer has nav on the comic's home page. 2018-04-06 14:09:13 -04:00
Tobias Gruetzmacher 1fe98d2f7f Use a different div class for GoComics (fixes #102). 2018-03-23 00:29:40 +01:00
Tobias Gruetzmacher a99098d5ad Update GoComics module. 2017-05-21 23:10:32 +02:00
Tobias Gruetzmacher ebbb27d05d Move xpath_class to helpers module. 2017-02-13 22:41:17 +01:00
Tobias Gruetzmacher b57945efd1 Update GoComic modules. 2017-02-12 12:21:01 +01:00
Tobias Gruetzmacher a183e812ae Update GoComics module for new site layout.
(fixes #77)
2017-01-11 02:21:05 +01:00
Tobias Gruetzmacher 46b7a374f6 Small GoComics update. 2016-11-01 02:51:00 +01:00
Tobias Gruetzmacher 9a6a310b76 Fixup copyright years. 2016-10-29 00:21:41 +02:00
Tobias Gruetzmacher f342a93aa1 Update GoComics module. 2016-10-01 03:39:36 +02:00
Tobias Gruetzmacher 807bee6342 Migrate GoComics to single-class module. 2016-05-23 00:01:10 +02:00
Tobias Gruetzmacher f29472c143 Make auto-update script more flexible. 2016-05-22 23:06:05 +02:00
Tobias Gruetzmacher 51008a975b Refactor: Introduce generator methods for scrapers
This allows one comic module class to generate multiple scrapers. This
change is to support a more dynamic module system as described in #42.
2016-05-21 01:29:36 +02:00
Tobias Gruetzmacher be1a63da0c Update GoComics comic list. 2016-05-16 18:26:45 +02:00
Tobias Gruetzmacher c3f32dfef7 Refactor: Make namer a method.
When #42 is realized, the naming of files might differ between comic
modules, so the namer's logical location is the instance, not the class.
2016-04-21 08:20:49 +02:00
Tobias Gruetzmacher 1fbc844077 Update GoComics. 2016-04-17 18:40:09 +02:00
Tobias Gruetzmacher 52515b5fc5 Update GoComics. 2016-04-15 00:26:14 +02:00
Tobias Gruetzmacher db87ed95e7 Use new features to make modules simpler. 2016-04-13 23:28:43 +02:00
Tobias Gruetzmacher 060281e5ff Use concrete scraper objects everywhere.
This is a first step for #42. Since most access to the scraper classes
is through instances, modules can now dynamically override url and name
(name is now a property).
2016-04-13 22:17:30 +02:00
Tobias Gruetzmacher 0468f2f31a Refactor: Convert starter to simple method. 2016-04-13 20:01:51 +02:00
Tobias Gruetzmacher 4e2e4ac529 Prevent scraper from moving to a different comic. 2016-04-12 08:10:47 +02:00
Tobias Gruetzmacher 443ab119e9 Refresh GoComics list from online directory. 2016-04-12 00:36:33 +02:00
Tobias Gruetzmacher 0e385a3697 Update GoComics (no change in supported comics)
- remove make_scraper magic
- switch to _ParserScraper
2016-04-11 22:42:01 +02:00
Tobias Gruetzmacher 68d4dd463a Revert robots.txt handling.
This brings us back to only honouring robots.txt on page downloads, not
on image downloads.

Rationale: Dosage is not a "robot" in the classical sense. It's not
designed to spider huge amounts of web sites in search of some content
to index, it's only intended to help users keep a personal archive of
comics they are interested in. We try very hard to never download any image
twice. This fixes #24.

(Precedent for this rationale: Google Feedfetcher:
https://support.google.com/webmasters/answer/178852?hl=en#robots)
2015-07-17 20:46:56 +02:00
Tobias Gruetzmacher 472afa24d3 GoComics doesn't allow spiders, disable them...
This removes 757 comics, including quite popular ones like Calvin and
Hobbes, Garfield, FoxTrot, etc. :(
2015-07-16 00:36:10 +02:00
Tobias Gruetzmacher e8af5adcb8 Update list of supported GoComics comics. 2015-04-18 02:04:31 +02:00
Manabi 2b98a9023e Added Peanuts Begins & Wizard of Id Classics 2015-04-13 22:26:12 -04:00
Bastian Kleineidam 641daa738b Updated list of comics 2014-07-03 17:12:25 +02:00
Bastian Kleineidam 0ee5c08771 Match zoom image for GoComics pages. 2014-06-08 10:06:34 +02:00
Peter B 124cf99665 Added Poorly Drawn lines replacing GoComic's version. 2014-01-12 19:08:02 -05:00
Bastian Kleineidam 4d63920434 Updated copyright. 2014-01-05 16:50:57 +01:00
Bastian Kleineidam 5c5aa166c7 Fix gocomic image matcher 2013-12-12 22:54:03 +01:00
Bastian Kleineidam f23aa86a2c Get larger Gocomic images. 2013-12-11 17:53:52 +01:00
Bastian Kleineidam f6fc604745 Fix GoComics image URL. 2013-11-14 21:30:51 +01:00