Dirk Reiners
3724eba835
Cyanide And Happiness: Fix for new website format
2015-01-16 12:05:36 -06:00
Tobias Gruetzmacher
f8531eca57
Move SinFest back to KeenSpot namespace.
2015-01-16 00:16:28 +01:00
Tobias Gruetzmacher
4733153d01
Merge pull request #87 from rpglover64/master
...
Update SinFest to work with new website.
2015-01-16 00:15:04 +01:00
Tobias Gruetzmacher
081d605e81
Merge pull request #89 from rpglover64/zp
...
Update ZenPencils image URL
2015-01-15 23:59:12 +01:00
Alex Rozenshteyn
a0506b22f0
Update ZenPencils URL.
2014-12-16 13:51:52 -05:00
Alex Rozenshteyn
51996e45ed
Update SinFest to work with new website.
2014-12-16 12:01:54 -05:00
Tobias Gruetzmacher
2c1ff889fa
Fix scope in HTML output.
2014-12-10 00:57:17 +01:00
Tobias Gruetzmacher
5b2ce4350e
Update changelog.
2014-12-10 00:23:32 +01:00
Tobias Gruetzmacher
b7bc16650a
Merge branch 'carlosefonseca/master'
2014-12-10 00:07:21 +01:00
Tobias Gruetzmacher
5af4f45505
Merge branch 'zac9/patch-2'
2014-12-10 00:03:08 +01:00
Tobias Gruetzmacher
32265c99d7
Merge branch 'zac9/patch-1'
2014-12-10 00:00:51 +01:00
Carlos Fonseca
04cc07a466
Added comic Nimona
2014-12-08 13:28:37 +00:00
mbrandis
25cf4888ae
- Adapted ShermansLagoon
...
- Better version of OnTheFastTrack
2014-11-14 20:37:06 +01:00
mbrandis
c63f927e5c
- Modified OnTheFastTrack to adapt to the new API.
2014-11-14 20:09:42 +01:00
mbrandis
cd48801b0d
- Added next and previous day at end of page.
2014-11-14 15:39:42 +01:00
Dirk Reiners
fda654b5e0
Some fixes...
...
AbstruseGoose: fixed prev
Carciphona: fixed latest
Curtailed: fixed image and prev (moved to WP)
DorkTower: fixed image search
GrrlPower: fixed site name issue
MadamAndEve: archive not updated in a long time, but current strip is.
Works, but needs to be run daily.
PennyArcade: fixed namer
PvPonline: fixed prev
2014-10-24 16:42:32 -05:00
Dirk Reiners
77a5e09c10
Minor fix for using paths to pick comics
2014-10-24 16:39:40 -05:00
Tobias Gruetzmacher
6769e1eb36
Add StrongFemaleProtagonist.
...
This uses the _ParserScraper and CSS selectors.
2014-10-13 23:39:50 +02:00
Tobias Gruetzmacher
1d52d6a152
Add support for CSS selectors to HTML parser.
...
Each comic module author can decide if she wants to use CSS or XPath,
not a mix of both. Using CSS needs the cssselect python module and the
module gets disabled if it is unavailable.
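
The availability check described above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the helper names `module_available` and `css_disabled_reason`, the `uses_css` flag, and the message text are all assumptions; only the `cssselect` module name comes from the commit message.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def css_disabled_reason(uses_css, available=module_available):
    """Hypothetical helper: give a reason when CSS selectors cannot be used.

    `uses_css` stands for the comic module author's choice of CSS over XPath.
    `available` is injectable so the check can be exercised without
    depending on what is installed locally.
    """
    if uses_css and not available("cssselect"):
        return "This module needs the cssselect Python module, which is missing."
    # XPath-based modules, or CSS modules with cssselect present, stay enabled.
    return None
```

A module that uses XPath is unaffected by the check, so `css_disabled_reason(False)` returns `None` whether or not cssselect is installed.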
2014-10-13 22:43:06 +02:00
Tobias Gruetzmacher
17bc454132
Bugfix: Don't assume RE patterns in base class.
2014-10-13 22:29:47 +02:00
Tobias Gruetzmacher
e92a3fb3a1
New feature: Comic modules can be "disabled".
...
This is modeled parallel to the "adult" feature, except the user can't
override it via the command line. Each comic module can override the
classmethod getDisabledReasons and give the user a reason why this
module is disabled. The user can see the reason in the comic list (-l or
--singlelist) and the comic module refuses to run, showing the same
message.
This is currently used to disable modules that use the _ParserScraper if
the LXML python module is missing.
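
A rough sketch of how such an override might look follows. Only the classmethod name `getDisabledReasons` comes from the commit message; the dict return type, the class names, and the `list_entry` and `run` helpers are assumptions made for illustration.

```python
class Scraper:
    """Hypothetical base class for comic modules."""

    @classmethod
    def getDisabledReasons(cls):
        # Assumption: an empty dict means the module is not disabled.
        return {}


class ExampleParserComic(Scraper):
    """Hypothetical module that reports a missing dependency."""

    @classmethod
    def getDisabledReasons(cls):
        return {"lxml": "This module needs the lxml Python module, which is missing."}


def list_entry(scraper_class):
    """Format one line of a comic list, appending any disabled reasons."""
    reasons = scraper_class.getDisabledReasons()
    if reasons:
        return "%s [disabled: %s]" % (scraper_class.__name__, " ".join(reasons.values()))
    return scraper_class.__name__


def run(scraper_class):
    """Refuse to run a disabled module, showing the same message as the list."""
    reasons = scraper_class.getDisabledReasons()
    if reasons:
        raise RuntimeError(" ".join(reasons.values()))
    return "running %s" % scraper_class.__name__
```

In this sketch no argument can bypass the check in `run`, which mirrors the stated difference from the "adult" feature.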
2014-10-13 21:43:46 +02:00
Tobias Gruetzmacher
d495d95ee0
Refactor: Move repeated check into its own function.
2014-10-13 21:29:54 +02:00
Tobias Gruetzmacher
3235b8b312
Pass unicode strings to lxml.
...
This reverts commit fcde86e9c0 & some more. This lets python-requests do
all the encoding stuff and leaves LXML with (hopefully) clean unicode
HTML to parse.
2014-10-13 19:39:48 +02:00
zac9
6ca200419a
Update s.py
2014-09-28 19:48:26 -07:00
zac9
5b7ab5a711
Update o.py
2014-09-28 19:41:29 -07:00
zac9
491b5457b2
Added comic ShotgunShuffle
2014-09-28 06:29:02 -07:00
Bastian Kleineidam
731291979d
Fixed RedMeat.
2014-09-22 22:14:31 +02:00
Bastian Kleineidam
e43694c156
Don't crash on multiple HTML output runs per day.
2014-09-22 22:00:16 +02:00
Bastian Kleineidam
bed49c19ad
Bump up version.
2014-09-22 21:59:26 +02:00
Bastian Kleineidam
2e5114c2ec
Updated votes
...
[ci skip]
2014-09-10 02:04:30 +02:00
Bastian Kleineidam
e86586226c
Updated votes
...
[ci skip]
2014-08-20 01:49:22 +02:00
Bastian Kleineidam
e87f5993b8
Merge branch 'master' into htmlparser
2014-08-07 18:10:15 +02:00
Bastian Kleineidam
f76006d89d
Merge branch 'master' of github.com:wummel/dosage
2014-08-06 20:01:46 +02:00
Bastian Kleineidam
b9f7fb23e7
Updated votes
...
[ci skip]
2014-08-06 01:56:37 +02:00
Tobias Gruetzmacher
08175d28c9
Fix Ruthe (see #73).
2014-07-31 21:27:49 +02:00
Tobias Gruetzmacher
ca2d722d39
Fix DieFruehreifen (closes #73).
2014-07-31 21:18:15 +02:00
Tobias Gruetzmacher
6c7fb176b1
Add Blade Kitten as an example for the new parser.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
f9f0b75d7c
Create new HTML parser based scraper class.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
fcde86e9c0
Change getPageContent to (optionally) return raw text.
...
This allows LXML to do its own "magic" encoding detection.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
0e03eca8f0
Move all regular expression operations into the new class.
...
- Move fetchUrls, fetchUrl and fetchText.
- Move base URL handling.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
fde1fdced6
Fix some typos.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
2567bd4e57
Convert starters and other helpers to new interface.
...
This allows those starters to work with future scrapers.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
4265053846
Refactor: Move regular expression scraping into a new class.
...
- This also makes "<base href>" handling an internal detail of the regular
expression scraper; future scrapers might not need that or might handle
it in another way.
2014-07-26 11:28:43 +02:00
Bastian Kleineidam
3a929ceea6
Allow comic text to be optional. Patch from TobiX
2014-07-24 20:49:57 +02:00
Bastian Kleineidam
950dd2932c
Remove stray print statement.
2014-07-21 20:20:15 +02:00
Bastian Kleineidam
bc6279f2ab
Merge branch 'master' of github.com:wummel/dosage
2014-07-21 20:19:17 +02:00
Tobias Gruetzmacher
ea5d533e30
Fix index lookups for SnowFlame and SnowFlakes.
2014-07-19 13:23:42 +02:00
Bastian Kleineidam
05f0afdf99
Updated votes
...
[ci skip]
2014-07-16 02:02:14 +02:00
Bastian Kleineidam
dd51f1618d
Updated votes
...
[ci skip]
2014-07-09 01:40:43 +02:00
Bastian Kleineidam
011ef49b94
Updated webpage meta info
...
[ci skip]
2014-07-03 22:01:51 +02:00