Manabi
f85464ccb2
Fixed unclosed ' error
...
Lines 293/294 should have been one line, this is now fixed.
2015-02-02 04:35:49 -05:00
Manabi
190f53ee4d
Fixing name of GunnkriggCourt
...
Existing name was missing a g.
2015-02-02 04:24:32 -05:00
Serenitas50
94004846cd
Added comic Beetlebum ( http://blog.beetlebum.de/ ).
2015-01-31 22:07:35 -02:00
Manabi
a5b0d0c5de
Added definition for Drive comic
2015-01-26 04:21:24 -05:00
Dirk Reiners
b710d3fa81
Merge branch 'master' of https://github.com/wummel/dosage
2015-01-16 13:24:48 -06:00
Dirk Reiners
c6f0dd6117
PiledHigherAndDeeper: Fix for new website format
2015-01-16 12:06:17 -06:00
Dirk Reiners
e25270c866
Dilbert: Fix for new website format
2015-01-16 12:05:53 -06:00
Dirk Reiners
3724eba835
Cyanide And Happiness: Fix for new website format
2015-01-16 12:05:36 -06:00
Tobias Gruetzmacher
f8531eca57
Move SinFest back to KeenSpot namespace.
2015-01-16 00:16:28 +01:00
Tobias Gruetzmacher
4733153d01
Merge pull request #87 from rpglover64/master
...
Update SinFest to work with new website.
2015-01-16 00:15:04 +01:00
Alex Rozenshteyn
a0506b22f0
Update ZenPencils URL.
2014-12-16 13:51:52 -05:00
Alex Rozenshteyn
51996e45ed
Update SinFest to work with new website.
2014-12-16 12:01:54 -05:00
Tobias Gruetzmacher
2c1ff889fa
Fix scope in HTML output.
2014-12-10 00:57:17 +01:00
Tobias Gruetzmacher
b7bc16650a
Merge branch 'carlosefonseca/master'
2014-12-10 00:07:21 +01:00
Tobias Gruetzmacher
5af4f45505
Merge branch 'zac9/patch-2'
2014-12-10 00:03:08 +01:00
Tobias Gruetzmacher
32265c99d7
Merge branch 'zac9/patch-1'
2014-12-10 00:00:51 +01:00
Carlos Fonseca
04cc07a466
Added comic Nimona
2014-12-08 13:28:37 +00:00
mbrandis
25cf4888ae
- Adapted ShermansLagoon
...
- Better version of OnTheFastTrack
2014-11-14 20:37:06 +01:00
mbrandis
c63f927e5c
- Modified OnTheFastTrack to adapt to the new API.
2014-11-14 20:09:42 +01:00
mbrandis
cd48801b0d
- Added next and previous day at end of page.
2014-11-14 15:39:42 +01:00
Dirk Reiners
fda654b5e0
Some fixes...
...
AbstruseGoose: fixed prev
Carciphona: fixed latest
Curtailed: fixed image and prev (moved to WP)
DorkTower: fixed image search
GrrlPower: fixed site name issue
MadamAndEve: archive not updated in a long time, but current strip is.
Works, but needs to be run daily.
PennyArcade: fixed namer
PvPonline: fixed prev
2014-10-24 16:42:32 -05:00
Dirk Reiners
77a5e09c10
Minor fix for using paths to pick comics
2014-10-24 16:39:40 -05:00
Tobias Gruetzmacher
6769e1eb36
Add StrongFemaleProtagonist.
...
This uses the _ParserScraper and CSS selectors.
2014-10-13 23:39:50 +02:00
Tobias Gruetzmacher
1d52d6a152
Add support for CSS selectors to HTML parser.
...
Each comic module author can decide if she wants to use CSS or XPath,
not a mix of both. Using CSS needs the cssselect python module and the
module gets disabled if it is unavailable.
2014-10-13 22:43:06 +02:00
Tobias Gruetzmacher
17bc454132
Bugfix: Don't assume RE patterns in base class.
2014-10-13 22:29:47 +02:00
Tobias Gruetzmacher
e92a3fb3a1
New feature: Comic modules can be "disabled".
...
This is modeled parallel to the "adult" feature, except the user can't
override it via the command line. Each comic module can override the
classmethod getDisabledReasons and give the user a reason why this
module is disabled. The user can see the reason in the comic list (-l or
--singlelist) and the comic module refuses to run, showing the same
message.
This is currently used to disable modules that use the _ParserScraper if
the LXML python module is missing.
2014-10-13 21:43:46 +02:00
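The "disabled" mechanism described in e92a3fb3a1 could look roughly like the sketch below. Only the classmethod name getDisabledReasons comes from the commit message. The Scraper base class, the run method, the required_module attribute, the dict return type and the message wording are assumptions made for illustration, not the project's actual code.

```python
import importlib.util


class Scraper:
    """Hypothetical base class; only getDisabledReasons is named in the commit."""

    @classmethod
    def getDisabledReasons(cls):
        # Assumed contract: map a short key to a human-readable reason.
        # An empty mapping means the module is enabled.
        return {}

    @classmethod
    def run(cls):
        # Hypothetical entry point: refuse to run when disabled and show
        # the same message a comic list would show.
        reasons = cls.getDisabledReasons()
        if reasons:
            raise RuntimeError("; ".join(reasons.values()))
        return "fetching " + cls.__name__


class ParserBasedComic(Scraper):
    """Hypothetical module that depends on an optional python module."""

    required_module = "lxml"  # assumption: name of the optional dependency

    @classmethod
    def getDisabledReasons(cls):
        reasons = {}
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(cls.required_module) is None:
            reasons[cls.required_module] = (
                "This module needs the %s python module, which is not installed."
                % cls.required_module
            )
        return reasons
```

Unlike an "adult" flag, nothing here can be overridden by the user: a module with a non-empty reason mapping never runs.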
Tobias Gruetzmacher
d495d95ee0
Refactor: Move repeated check into its own function.
2014-10-13 21:29:54 +02:00
Tobias Gruetzmacher
3235b8b312
Pass unicode strings to lxml.
...
This reverts commit fcde86e9c0 & some more. This lets python-requests
do all the encoding stuff and leaves LXML with (hopefully) clean
unicode HTML to parse.
2014-10-13 19:39:48 +02:00
zac9
6ca200419a
Update s.py
2014-09-28 19:48:26 -07:00
zac9
5b7ab5a711
Update o.py
2014-09-28 19:41:29 -07:00
zac9
491b5457b2
Added comic ShotgunShuffle
2014-09-28 06:29:02 -07:00
Bastian Kleineidam
731291979d
Fixed RedMeat.
2014-09-22 22:14:31 +02:00
Bastian Kleineidam
e43694c156
Don't crash on multiple HTML output runs per day.
2014-09-22 22:00:16 +02:00
Bastian Kleineidam
e87f5993b8
Merge branch 'master' into htmlparser
2014-08-07 18:10:15 +02:00
Tobias Gruetzmacher
08175d28c9
Fix Ruthe (see #73 ).
2014-07-31 21:27:49 +02:00
Tobias Gruetzmacher
ca2d722d39
Fix DieFruehreifen ( closes #73 ).
2014-07-31 21:18:15 +02:00
Tobias Gruetzmacher
6c7fb176b1
Add Blade Kitten as an example for the new parser.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
f9f0b75d7c
Create new HTML parser based scraper class.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
fcde86e9c0
Change getPageContent to (optionally) return raw text.
...
This allows LXML to do its own "magic" encoding detection
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
0e03eca8f0
Move all regular expression operations into the new class.
...
- Move fetchUrls, fetchUrl and fetchText.
- Move base URL handling.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
fde1fdced6
Fix some typos.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
2567bd4e57
Convert starters and other helpers to new interface.
...
This allows those starters to work with future scrapers.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
4265053846
Refactor: Move regular expression scraping into a new class.
...
- This also makes "<base href>" handling an internal detail of the regular
expression scraper, future scrapers might not need that or handle it in
another way.
2014-07-26 11:28:43 +02:00
Bastian Kleineidam
3a929ceea6
Allow comic text to be optional. Patch from TobiX
2014-07-24 20:49:57 +02:00
Bastian Kleineidam
950dd2932c
Remove stray print statement.
2014-07-21 20:20:15 +02:00
Tobias Gruetzmacher
ea5d533e30
Fix index lookups for SnowFlame and SnowFlakes.
2014-07-19 13:23:42 +02:00
Bastian Kleineidam
4d49d4394b
Fix doc
2014-07-03 18:42:06 +02:00
Bastian Kleineidam
f194e430bc
TheThinHLine: fetch bigger images and name image files from sequence number.
2014-07-03 18:41:25 +02:00
Bastian Kleineidam
4845a4ccc1
Merge branch 'master' of github.com:wummel/dosage
2014-07-03 17:12:42 +02:00
Bastian Kleineidam
641daa738b
Updated list of comics
2014-07-03 17:12:25 +02:00
Bastian Kleineidam
93fe5d5987
Minor useragent refactoring
2014-07-03 17:12:25 +02:00
Bastian Kleineidam
4c2a339e25
Fix some comics.
2014-07-02 19:51:53 +02:00
Luc Fouin
cb76198da7
added the thin H line, fixes #67
2014-07-02 17:14:33 +02:00
Luc Fouin
763f9b02a2
added the thin H line
2014-07-02 17:11:33 +02:00
Bastian Kleineidam
b03ba158ef
Fixed LookingForGroup
2014-07-01 23:44:01 +02:00
Bastian Kleineidam
3485e2ac54
Added Whomp.
2014-06-24 20:48:49 +02:00
wummel
a0086bfcd8
Merge pull request #63 from sehrgut/master
...
Updated GirlGenius to new markup
2014-06-24 20:40:15 +02:00
Peter B
8f1c864ec3
Added Safely Endangered
2014-06-17 01:05:11 -04:00
Keith Beckman
236b840363
Updated GirlGenius to new markup
...
GG markup has changed, so I fixed the prevSearch regex to find the
"previous" button on the redesigned page.
As well, I set multipleImagesPerStrip to true, since there are quite a
few comics with multiple images that were being discarded.
2014-06-13 16:43:40 -04:00
Bastian Kleineidam
68afeaf82d
Make appname lowercase.
2014-06-09 13:24:58 +02:00
Bastian Kleineidam
00e424aed0
Fix zenpencils.
2014-06-08 13:40:42 +02:00
Bastian Kleineidam
687d27d534
Stripping should be done in normaliseUrl.
2014-06-08 10:12:33 +02:00
Bastian Kleineidam
c528fd1822
Merge branch 'master' of github.com:wummel/dosage
2014-06-08 10:07:36 +02:00
Bastian Kleineidam
0ee5c08771
Match zoom image for GoComics pages.
2014-06-08 10:06:34 +02:00
Peter B
78954da9d7
fix StandStillStaySilent, strip urls when downloading
2014-06-04 01:58:16 -04:00
Peter B
71ed9ad69d
fixed foul language
2014-06-04 01:35:40 -04:00
Bastian Kleineidam
62a3a55b82
Fixed LoadingArtist
2014-03-26 19:59:42 +01:00
Bastian Kleineidam
813e6876fc
Add missing @classmethod
2014-03-26 19:59:42 +01:00
Bastian Kleineidam
c2cf58560e
Remove unused import.
2014-03-26 19:59:42 +01:00
Bastian Kleineidam
4bb31953ad
Fix PennyArcade
2014-03-26 19:59:42 +01:00
Freestila
0faf4a722b
Update o.py
...
Removed procedure for "I am over 18" button, since this button no longer exists
2014-03-05 09:28:34 +01:00
Bastian Kleineidam
348dd5e6c0
Add documentation
2014-03-04 20:53:19 +01:00
Bastian Kleineidam
3108c9124a
Fix thread import for py3
2014-03-04 20:50:34 +01:00
Bastian Kleineidam
18972d3830
Remove old waitSeconds parameter.
2014-03-04 18:38:46 +01:00
Bastian Kleineidam
15ef59262a
Make threads interruptible.
2014-03-04 18:38:46 +01:00
Tobias Gruetzmacher
33801376f9
Fix indentation.
2014-02-27 22:31:21 +01:00
Tobias Gruetzmacher
1bcac66c03
Mark MonsieurLeChien as French.
2014-02-27 22:30:02 +01:00
Tobias Gruetzmacher
8e2ba15410
Merge pull request #60 from Freestila/master
...
Added comics - looks good
2014-02-27 22:24:57 +01:00
Luc Fouin
da9f518a7a
add French comic M. Le Chien
2014-02-27 17:45:29 +01:00
Freestila
53ebb51b10
Added comic DungeonsAndDenizens
2014-02-27 15:08:07 +01:00
Freestila
b8fefb37c0
Added comic Underling
2014-02-20 12:54:40 +01:00
Freestila
3d19d45e81
Added a 1-second wait because of permanent timeouts / connection pool exceeded errors from the server
2014-02-20 12:54:13 +01:00
Freestila
67c31284f1
Added comic GrimTales from Down Below
2014-02-18 21:12:29 +01:00
Freestila
de0bb1c9d5
Added comic "The Landscaper"
2014-02-18 21:00:43 +01:00
Freestila
96f61542ee
Added comic "Die Fruehreifen"
2014-02-18 21:00:19 +01:00
Peter B
b44b751efa
Fixed EvilInc comics. Closes #58
2014-02-14 19:33:13 -05:00
Bastian Kleineidam
f50ef910be
Skip CyanideAndHappiness videos
2014-02-10 21:58:26 +01:00
Bastian Kleineidam
875e431edc
Provide page data in shouldSkipUrl() function
2014-02-10 21:58:09 +01:00
Bastian Kleineidam
73e1af7aba
Fixed FredoAndPidjin
2014-02-06 19:57:56 +01:00
Peter B
d86442efed
Added Oh Joy Sex Toy.
2014-01-30 22:45:50 -05:00
Peter B
add63d6d6c
Added The Gentleman's Armchair Comic.
2014-01-30 22:32:46 -05:00
Tobias Gruetzmacher
44ef1831bf
Sluggy Freelance has some pages with multiple comics.
...
See for example SluggyFreelance:010422
2014-01-28 19:08:39 +01:00
wummel
6b8854e7b2
Merge pull request #55 from Lugoues/upstream
...
Added MrLovenstein Comic
2014-01-26 05:49:50 -08:00
Bastian Kleineidam
cc5ee572fb
Fix some comics
2014-01-24 23:17:21 +01:00
Peter B
66f6b08163
Added MrLovenstein Comic
2014-01-23 20:23:24 -05:00
Bastian Kleineidam
1a56fbb3dd
Fix DemolitionSquad
2014-01-20 19:01:47 +01:00
Bastian Kleineidam
8b0f149c2b
Updated copyright
2014-01-19 13:16:22 +01:00
Peter B
740bcb72ce
Added Eat That Toast
2014-01-12 19:08:02 -05:00
Peter B
124cf99665
Added Poorly Drawn Lines, replacing the GoComics version.
2014-01-12 19:08:02 -05:00
Bastian Kleineidam
e738454cb1
Correct drunkduck disablement comment.
2014-01-11 20:04:52 +01:00
Peter B
d0031b65c8
Added "Stand Still. Stay Silent." comic.
2014-01-08 11:08:19 -05:00
Bastian Kleineidam
69bffc9c92
Fix invalid description.
2014-01-06 16:25:42 +01:00
Bastian Kleineidam
264a20a4db
Disable disallowed drunkduck comics.
2014-01-06 09:58:24 +01:00
Bastian Kleineidam
3f4be55332
Merge branch 'upstream' of https://github.com/Lugoues/dosage into Lugoues-upstream
2014-01-06 09:38:25 +01:00
Bastian Kleineidam
d98c2a52dd
Skip phdcomic video URL.
2014-01-06 08:20:58 +01:00
Peter B
ceca4ba102
Added FoulLanguage Comic
2014-01-06 00:34:37 -05:00
Peter B
1de57ea1fe
added Camp Comic
2014-01-05 23:09:19 -05:00
Bastian Kleineidam
ef17268ace
Fix comic list output.
2014-01-05 17:37:13 +01:00
Bastian Kleineidam
5fe48d013a
Increase wait interval.
2014-01-05 17:14:19 +01:00
Bastian Kleineidam
4d63920434
Updated copyright.
2014-01-05 16:50:57 +01:00
Bastian Kleineidam
b6c913e2d5
Wait some time between requests.
2014-01-05 16:23:45 +01:00
Bastian Kleineidam
1affe58370
Use thread name in log output.
2014-01-05 16:17:34 +01:00
Bastian Kleineidam
bb18295798
Use realpath to detect symlinked instances.
2014-01-05 11:16:57 +01:00
Bastian Kleineidam
d9edeb1343
Limit cyanideandhappiness filename length
2014-01-05 11:08:15 +01:00
Bastian Kleineidam
9172aba146
Remove stray print
2014-01-05 10:50:25 +01:00
Bastian Kleineidam
1f38895681
Ensure only one instance of dosage is running to prevent accidental DoS on sites with multiple comics.
2014-01-05 10:36:22 +01:00
Bastian Kleineidam
732b50811d
Only ensure the maximum width.
2013-12-22 13:38:29 +01:00
Bastian Kleineidam
f488935072
Fix AbstruseGoose and QuestionableContent.
2013-12-22 08:01:58 +01:00
Bastian Kleineidam
a1a773dd52
Fix loader in frozen executables.
2013-12-18 20:55:23 +01:00
Bastian Kleineidam
5c5aa166c7
Fix gocomic image matcher
2013-12-12 22:54:03 +01:00
Bastian Kleineidam
799d3040f0
Refactoring
2013-12-11 17:54:39 +01:00
Bastian Kleineidam
f23aa86a2c
Get larger Gocomic images.
2013-12-11 17:53:52 +01:00
Bastian Kleineidam
b5d973e2d4
Only resize really big images.
2013-12-11 00:01:29 +01:00
Bastian Kleineidam
5ad423c15e
Limit image size also in HTML.
2013-12-10 19:59:19 +01:00
Bastian Kleineidam
c3078ed855
Added EdmundFinney, Gaia, GaiaGerman, InternetWebcomic,
...
NotInventedHere, RedsPlanet, RomanticallyApocalyptic,
ScandinaviaAndTheWorld, TheGamerCat, Weregeek
2013-12-10 19:50:21 +01:00
Damjan Košir
4e40f02642
added comic Gaia in German
2013-12-10 18:02:20 +13:00
Damjan Košir
4e5717be57
added comic Gaia
2013-12-10 17:08:15 +13:00
Damjan Košir
f48b22b512
added comic Not Invented Here
2013-12-10 16:40:44 +13:00
Damjan Košir
e181b287c9
added comic Romantically Apocalyptic
2013-12-10 16:39:30 +13:00
Damjan Košir
58b62dbad3
added comic Scandinavia and the World
2013-12-10 16:37:35 +13:00
Damjan Košir
5982e27c7b
added comic Red's Planet
2013-12-10 16:34:47 +13:00
Damjan Košir
4f47792dee
added comic The Gamer Cat
2013-12-10 16:33:07 +13:00
Damjan Košir
b53ca04ee7
added comic Internet Webcomic
2013-12-10 16:32:16 +13:00
Damjan Košir
f095f6309e
added comic Edmund Finney's Quest to Find the Meaning of Life
2013-12-10 16:31:03 +13:00
Bastian Kleineidam
67c2203e7e
Ensure maximum aspect ratio in RSS images.
2013-12-08 15:55:39 +01:00
Bastian Kleineidam
df9a381ae4
Document getfp() function.
2013-12-08 11:46:26 +01:00
Bastian Kleineidam
03fff069ee
Apply same file checks as for image files.
2013-12-05 18:29:15 +01:00
Bastian Kleineidam
599672acbf
Fix xkcd text regex. Closes #46
2013-12-05 18:29:15 +01:00
Bastian Kleineidam
7343932a5a
Strip whitespace from image text.
2013-12-04 18:07:13 +01:00
wummel
0378c9d855
Merge pull request #45 from Lugoues/master
...
Store alt text from AbstruseGoose
2013-12-04 09:01:50 -08:00
Bastian Kleineidam
c583e8717e
Store large xkcd images.
2013-12-04 17:56:54 +01:00
Bastian Kleineidam
0e5c59133c
Provide HTML page data for image URL modifier function.
2013-12-04 17:54:55 +01:00
Peter B
36dcadc7d4
Store alt text from AbstruseGoose
2013-12-03 21:56:54 -05:00
Bastian Kleineidam
3c5424c2ef
Add text in RSS and HTML output.
2013-11-29 20:32:54 +01:00
Bastian Kleineidam
142c418dc0
Store alt text from xkcd comics.
2013-11-29 20:27:11 +01:00
Bastian Kleineidam
0eaf9a3139
Add text search in comic strips.
2013-11-29 20:26:49 +01:00
Bastian Kleineidam
468b34034b
cyanideandhappiness skip URL
2013-11-29 18:31:34 +01:00
Bastian Kleineidam
9514a8eeae
Fixed ForLackOfABetterComic
2013-11-27 20:49:35 +01:00
Bastian Kleineidam
7d05b666da
Updated RSS link name
2013-11-25 21:20:48 +01:00
Bastian Kleineidam
01085d56c2
Regenerated.
2013-11-24 12:19:54 +01:00