Commit graph

677 commits

Author SHA1 Message Date
DirkReiners
273b429fcd Merge branch 'master' of https://github.com/webcomics/dosage 2015-04-29 09:51:47 -05:00
Damjan Košir
391313972c fixed ManlyGuysDoingManlyThings 2015-04-26 23:47:38 +12:00
Damjan Košir
9837a87a43 fixed omake theater 2015-04-26 23:32:22 +12:00
Damjan Košir
8df9d20556 added doctor cat 2015-04-26 22:32:52 +12:00
Damjan Košir
dc427d6066 fixed the gamercat 2015-04-26 21:52:31 +12:00
Damjan Košir
561005887a unneeded max 2015-04-26 00:23:45 +12:00
Damjan Košir
ac7b0d7e0e adding parallel run option 2015-04-26 00:19:08 +12:00
Damjan Košir
1e94a3c7c5 now the same as official version 2015-04-25 20:52:03 +12:00
Damjan Košir
dae2698102 removing mismerge 2015-04-25 20:40:28 +12:00
Damjan Košir
dc014a7cb4 Merge remote-tracking branch 'upstream/master'
Conflicts:
	dosagelib/plugins/e.py
	dosagelib/plugins/i.py
	dosagelib/plugins/n.py
	dosagelib/plugins/s.py
	dosagelib/plugins/t.py
	dosagelib/plugins/w.py
2015-04-25 20:28:27 +12:00
DirkReiners
b8ef6958b9 Merge branch 'master' of https://github.com/webcomics/dosage 2015-04-24 15:38:36 -05:00
Helge Stasch
4cdd92dcd7 Added comic Magellan 2015-04-23 09:12:24 +02:00
Tobias Gruetzmacher
9f33c31c68 Merge pull request #12 from Freestila/master
Changed comic name, since comic is named FowlLanguage instead of FoulLan...

Conflicts:
	dosagelib/plugins/f.py
2015-04-22 22:24:26 +02:00
Tobias Gruetzmacher
bf9f45b380 Switch to setuptools and cleanup metadata.
py2exe support is gone for now, will be restored later.
2015-04-22 22:22:03 +02:00
Helge Stasch
8218e805b2 Changed comic name, since comic is named FowlLanguage instead of FoulLanguage 2015-04-22 21:25:10 +02:00
Tobias Gruetzmacher
bf9bf5e9b0 Merge pull request #11 from Freestila/master
Added "Ralf the Destroyer"
2015-04-21 23:46:11 +02:00
Tobias Gruetzmacher
86b31dc12b Depend on pycountry directly. 2015-04-21 21:56:54 +02:00
Helge Stasch
d7e9c8eb94 Added "Ralf the Destroyer" 2015-04-21 19:12:40 +02:00
Tobias Gruetzmacher
d5e7690419 Fix size comparison for RSS & HTML output.
This was always broken, but somehow worked with Python 2.7 (WTF?). Now
that we test with Pillow, this code path runs with Python 3 and throws
an error.
2015-04-21 00:01:23 +02:00
Tobias Gruetzmacher
ff21df596b Remove descriptions and genres (closes #9).
Maintaining the descriptions creates quite a bit of overhead (finding
them, copying them, checking if they are still correct) for a minimal
user benefit.

PS: Viewing this diff should be easier in a difftool that shows changes
in a line, for example kdiff3.
2015-04-20 20:29:09 +02:00
Tobias Gruetzmacher
3b33129e58 Fix ViiviJaWagner. 2015-04-18 22:45:13 +02:00
Tobias Gruetzmacher
e8af5adcb8 Update list of supported GoComics comics. 2015-04-18 02:04:31 +02:00
Tobias Gruetzmacher
f0831a1f0f Fix and update ArcaMax (fixes #8). 2015-04-17 21:53:13 +02:00
DirkReiners
99f33151e2 Merge branch 'master' of https://github.com/webcomics/dosage 2015-04-16 18:36:42 -05:00
DirkReiners
8f3a9f660a Fixed ASofterWorld 2015-04-16 18:35:21 -05:00
DirkReiners
49b964cb3c Added PS238 2015-04-16 18:20:14 -05:00
Manabi
65c021ef2b Fixed IAmArg 2015-04-15 14:43:06 -04:00
Manabi
475739ea60 Fixing DogHouseDiaries 2015-04-15 12:56:03 -04:00
Manabi
c0619e8dca Fixing DogHouseDiaries 2015-04-15 12:51:45 -04:00
Manabi
2b98a9023e Added Peanuts Begins & Wizard of Id Classics 2015-04-13 22:26:12 -04:00
Tobias Gruetzmacher
974752951b Fix xkcd (closes #3), remove adult tag (fixes wummel#85). 2015-04-12 20:06:34 +02:00
Tobias Gruetzmacher
5934f03453 Merge branch 'htmlparser' - I think it's ready.
This closes pull request #70.
2015-04-01 22:13:55 +02:00
Tobias Gruetzmacher
614c25e278 Fix coding style. 2015-03-22 17:13:53 +01:00
Tobias Gruetzmacher
e94e2ae432 Merge pull request #95 from serenitas50/master
Added comic Beetlebum (http://blog.beetlebum.de/).
2015-03-22 17:04:36 +01:00
Tobias Gruetzmacher
b5ed4c56b6 Merge pull request #94 from Manabi/master
Added definition for Drive comic

Conflicts:
	dosagelib/plugins/g.py
2015-03-22 16:34:07 +01:00
Tobias Gruetzmacher
b5368b366a Merge Gaia(German), SandraAndWoo(German) into common base.
This also fixes #97 by correcting the imageSearch regex.
2015-02-04 19:41:52 +01:00
Manabi
f85464ccb2 Fixed unclosed ' error
Lines 293/294 should have been one line, this is now fixed.
2015-02-02 04:35:49 -05:00
Manabi
190f53ee4d Fixing name of GunnkriggCourt
Existing name was missing a g.
2015-02-02 04:24:32 -05:00
Serenitas50
94004846cd Added comic Beetlebum (http://blog.beetlebum.de/). 2015-01-31 22:07:35 -02:00
Manabi
a5b0d0c5de Added definition for Drive comic 2015-01-26 04:21:24 -05:00
Dirk Reiners
b710d3fa81 Merge branch 'master' of https://github.com/wummel/dosage 2015-01-16 13:24:48 -06:00
Dirk Reiners
c6f0dd6117 PiledHigherAndDeeper: Fix for new website format 2015-01-16 12:06:17 -06:00
Dirk Reiners
e25270c866 Dilbert: Fix for new website format 2015-01-16 12:05:53 -06:00
Dirk Reiners
3724eba835 Cyanide And Happiness: Fix for new website format 2015-01-16 12:05:36 -06:00
Tobias Gruetzmacher
f8531eca57 Move SinFest back to KeenSpot namespace. 2015-01-16 00:16:28 +01:00
Tobias Gruetzmacher
4733153d01 Merge pull request #87 from rpglover64/master
Update SinFest to work with new website.
2015-01-16 00:15:04 +01:00
Alex Rozenshteyn
a0506b22f0 Update ZenPencils URL. 2014-12-16 13:51:52 -05:00
Alex Rozenshteyn
51996e45ed Update SinFest to work with new website. 2014-12-16 12:01:54 -05:00
Tobias Gruetzmacher
2c1ff889fa Fix scope in HTML output. 2014-12-10 00:57:17 +01:00
Tobias Gruetzmacher
b7bc16650a Merge branch 'carlosefonseca/master' 2014-12-10 00:07:21 +01:00
Tobias Gruetzmacher
5af4f45505 Merge branch 'zac9/patch-2' 2014-12-10 00:03:08 +01:00
Tobias Gruetzmacher
32265c99d7 Merge branch 'zac9/patch-1' 2014-12-10 00:00:51 +01:00
Carlos Fonseca
04cc07a466 Added comic Nimona 2014-12-08 13:28:37 +00:00
mbrandis
25cf4888ae - Adapted ShermansLagoon
- Better version of OnTheFastTrack
2014-11-14 20:37:06 +01:00
mbrandis
c63f927e5c - Modified OnTheFastTrack adapting the new API. 2014-11-14 20:09:42 +01:00
mbrandis
cd48801b0d - Added next and previous day at end of page. 2014-11-14 15:39:42 +01:00
Dirk Reiners
fda654b5e0 Some fixes...
AbstruseGoose: fixed prev
Carciphona: fixed latest
Curtailed: fixed image and prev (moved to WP)
DorkTower: fixed image search
GrrlPower: fixed site name issue
MadamAndEve: archive not updated in a long time, but current strip is.
Works, but needs to be run daily.
PennyArcade: fixed namer
PvPonline: fixed prev
2014-10-24 16:42:32 -05:00
Dirk Reiners
77a5e09c10 Minor fix for using paths to pick comics 2014-10-24 16:39:40 -05:00
Tobias Gruetzmacher
6769e1eb36 Add StrongFemaleProtagonist.
This uses the _ParserScraper and CSS selectors.
2014-10-13 23:39:50 +02:00
Tobias Gruetzmacher
1d52d6a152 Add support for CSS selectors to HTML parser.
Each comic module author can decide if she wants to use CSS or XPath,
not a mix of both. Using CSS needs the cssselect python module and the
module gets disabled if it is unavailable.
2014-10-13 22:43:06 +02:00
Tobias Gruetzmacher
17bc454132 Bugfix: Don't assume RE patterns in base class. 2014-10-13 22:29:47 +02:00
Tobias Gruetzmacher
e92a3fb3a1 New feature: Comic modules can be "disabled".
This is modeled parallel to the "adult" feature, except the user can't
override it via the command line. Each comic module can override the
classmethod getDisabledReasons and give the user a reason why this
module is disabled. The user can see the reason in the comic list (-l or
--singlelist) and the comic module refuses to run, showing the same
message.

This is currently used to disable modules that use the _ParserScraper if
the LXML python module is missing.
2014-10-13 21:43:46 +02:00
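The "disabled" mechanism described in commit e92a3fb3a1 can be pictured with a minimal sketch. Only the classmethod name getDisabledReasons comes from the commit message; the class names, the required_module attribute, the run method, and the return type (a dict of reasons) are assumptions made for illustration and are not taken from the project's code.

```python
import importlib.util


class Scraper:
    """Hypothetical base class, not the project's actual implementation."""

    @classmethod
    def getDisabledReasons(cls):
        # Assumption: reasons are returned as a dict mapping a short key
        # to a human-readable message. An empty dict means "enabled".
        return {}

    @classmethod
    def run(cls):
        reasons = cls.getDisabledReasons()
        if reasons:
            # Refuse to run and show the same message a comic list would show.
            raise RuntimeError("; ".join(reasons.values()))
        return "running %s" % cls.__name__


class ParserBasedScraper(Scraper):
    """Hypothetical scraper that needs an optional third-party module."""

    # Assumption: the dependency is named by a class attribute.
    required_module = "lxml"

    @classmethod
    def getDisabledReasons(cls):
        reasons = {}
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(cls.required_module) is None:
            reasons[cls.required_module] = (
                "This module needs the %s Python module, "
                "which is not installed." % cls.required_module
            )
        return reasons
```

With this shape, a listing command can print the values of getDisabledReasons() next to each comic, and the module itself refuses to run with the same text, which matches the behaviour the commit message describes.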
Tobias Gruetzmacher
d495d95ee0 Refactor: Move repeated check into its own function. 2014-10-13 21:29:54 +02:00
Tobias Gruetzmacher
3235b8b312 Pass unicode strings to lxml.
This reverts commit fcde86e9c0 & some
more. This lets python-requests do all the encoding stuff and leaves
LXML with (hopefully) clean unicode HTML to parse.
2014-10-13 19:39:48 +02:00
zac9
6ca200419a Update s.py 2014-09-28 19:48:26 -07:00
zac9
5b7ab5a711 Update o.py 2014-09-28 19:41:29 -07:00
zac9
491b5457b2 Added comic ShotgunShuffle 2014-09-28 06:29:02 -07:00
Bastian Kleineidam
731291979d Fixed RedMeat. 2014-09-22 22:14:31 +02:00
Bastian Kleineidam
e43694c156 Don't crash on multiple HTML output runs per day. 2014-09-22 22:00:16 +02:00
Bastian Kleineidam
e87f5993b8 Merge branch 'master' into htmlparser 2014-08-07 18:10:15 +02:00
Tobias Gruetzmacher
08175d28c9 Fix Ruthe (see #73). 2014-07-31 21:27:49 +02:00
Tobias Gruetzmacher
ca2d722d39 Fix DieFruehreifen (closes #73). 2014-07-31 21:18:15 +02:00
Tobias Gruetzmacher
6c7fb176b1 Add Blade Kitten as an example for the new parser. 2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
f9f0b75d7c Create new HTML parser based scraper class. 2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
fcde86e9c0 Change getPageContent to (optionally) return raw text.
This allows LXML to do its own "magic" encoding detection
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
0e03eca8f0 Move all regular expression operation into the new class.
- Move fetchUrls, fetchUrl and fetchText.
- Move base URL handling.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
fde1fdced6 Fix some typos. 2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
2567bd4e57 Convert starters and other helpers to new interface.
This allows those starters to work with future scrapers.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
4265053846 Refactor: Move regular expression scraping into a new class.
- This also makes "<base href>" handling an internal detail of the regular
  expression scraper, future scrapers might not need that or handle it in
  another way.
2014-07-26 11:28:43 +02:00
Bastian Kleineidam
3a929ceea6 Allow comic text to be optional. Patch from TobiX 2014-07-24 20:49:57 +02:00
Bastian Kleineidam
950dd2932c Remove stray print statement. 2014-07-21 20:20:15 +02:00
Tobias Gruetzmacher
ea5d533e30 Fix index lookups for SnowFlame and SnowFlakes. 2014-07-19 13:23:42 +02:00
Bastian Kleineidam
4d49d4394b Fix doc 2014-07-03 18:42:06 +02:00
Bastian Kleineidam
f194e430bc TheThinHLine: fetch bigger images and name image files from sequence number. 2014-07-03 18:41:25 +02:00
Bastian Kleineidam
4845a4ccc1 Merge branch 'master' of github.com:wummel/dosage 2014-07-03 17:12:42 +02:00
Bastian Kleineidam
641daa738b Updated list of comics 2014-07-03 17:12:25 +02:00
Bastian Kleineidam
93fe5d5987 Minor useragent refactoring 2014-07-03 17:12:25 +02:00
Bastian Kleineidam
4c2a339e25 Fix some comics. 2014-07-02 19:51:53 +02:00
Luc Fouin
cb76198da7 added the thin H line, fixes #67 2014-07-02 17:14:33 +02:00
Luc Fouin
763f9b02a2 added the thin H line 2014-07-02 17:11:33 +02:00
Bastian Kleineidam
b03ba158ef Fixed LookingForGroup 2014-07-01 23:44:01 +02:00
Bastian Kleineidam
3485e2ac54 Added Whomp. 2014-06-24 20:48:49 +02:00
wummel
a0086bfcd8 Merge pull request #63 from sehrgut/master
Updated GirlGenius to new markup
2014-06-24 20:40:15 +02:00
Peter B
8f1c864ec3 Added Safely Endangered 2014-06-17 01:05:11 -04:00
Keith Beckman
236b840363 Updated GirlGenius to new markup
GG markup has changed, so I fixed the prevSearch regex to find the
"previous" button on the redesigned page.

As well, I set multipleImagesPerStrip to true, since there are quite a
few comics with multiple images that were being discarded.
2014-06-13 16:43:40 -04:00
Bastian Kleineidam
68afeaf82d Make appname lowercase. 2014-06-09 13:24:58 +02:00
Bastian Kleineidam
00e424aed0 Fix zenpencils. 2014-06-08 13:40:42 +02:00
Bastian Kleineidam
687d27d534 Stripping should be done in normaliseUrl. 2014-06-08 10:12:33 +02:00
Bastian Kleineidam
c528fd1822 Merge branch 'master' of github.com:wummel/dosage 2014-06-08 10:07:36 +02:00
Bastian Kleineidam
0ee5c08771 Match zoom image for GoComics pages. 2014-06-08 10:06:34 +02:00
Peter B
78954da9d7 fix StandStillStaySilent, strip urls when downloading 2014-06-04 01:58:16 -04:00
Peter B
71ed9ad69d fixed foul language 2014-06-04 01:35:40 -04:00
Bastian Kleineidam
62a3a55b82 Fixed LoadingArtist 2014-03-26 19:59:42 +01:00
Bastian Kleineidam
813e6876fc Add missing @classmethod 2014-03-26 19:59:42 +01:00
Bastian Kleineidam
c2cf58560e Remove unused import. 2014-03-26 19:59:42 +01:00
Bastian Kleineidam
4bb31953ad Fix PennyArcade 2014-03-26 19:59:42 +01:00
Freestila
0faf4a722b Update o.py
Removed procedure for "I am over 18" button, since this button no longer exists
2014-03-05 09:28:34 +01:00
Bastian Kleineidam
348dd5e6c0 Add documentation 2014-03-04 20:53:19 +01:00
Bastian Kleineidam
3108c9124a Fix thread import for py3 2014-03-04 20:50:34 +01:00
Bastian Kleineidam
18972d3830 Remove old waitSeconds parameter. 2014-03-04 18:38:46 +01:00
Bastian Kleineidam
15ef59262a Make threads interruptable. 2014-03-04 18:38:46 +01:00
Tobias Gruetzmacher
33801376f9 Fix indentation. 2014-02-27 22:31:21 +01:00
Tobias Gruetzmacher
1bcac66c03 Mark MonsieurLeChien as French. 2014-02-27 22:30:02 +01:00
Tobias Gruetzmacher
8e2ba15410 Merge pull request #60 from Freestila/master
Added comics - looks good
2014-02-27 22:24:57 +01:00
Luc Fouin
da9f518a7a add french commit M. Le Chien 2014-02-27 17:45:29 +01:00
Freestila
53ebb51b10 Added comic DungeonsAndDenizens 2014-02-27 15:08:07 +01:00
Freestila
b8fefb37c0 Added comic Underling 2014-02-20 12:54:40 +01:00
Freestila
3d19d45e81 Added wait 1 sec because of permanent Timeout / connection pool exceed from server 2014-02-20 12:54:13 +01:00
Freestila
67c31284f1 Added comic GrimTales from Down Below 2014-02-18 21:12:29 +01:00
Freestila
de0bb1c9d5 Added comic "The Landscaper" 2014-02-18 21:00:43 +01:00
Freestila
96f61542ee Added comic "Die Fruehreifen" 2014-02-18 21:00:19 +01:00
Peter B
b44b751efa Fixed EvilInc comics. Closes #58 2014-02-14 19:33:13 -05:00
Bastian Kleineidam
f50ef910be Skip CyanideAndHappiness videos 2014-02-10 21:58:26 +01:00
Bastian Kleineidam
875e431edc Provide page data in shouldSkipUrl() function 2014-02-10 21:58:09 +01:00
Bastian Kleineidam
73e1af7aba Fixed FredoAndPidjin 2014-02-06 19:57:56 +01:00
Peter B
d86442efed Added Oh Joy Sex Toy. 2014-01-30 22:45:50 -05:00
Peter B
add63d6d6c Added The Gentleman's Armchair Comic. 2014-01-30 22:32:46 -05:00
Tobias Gruetzmacher
44ef1831bf Sluggy Freelance has some pages with multiple comics.
See for example SluggyFreelance:010422
2014-01-28 19:08:39 +01:00
wummel
6b8854e7b2 Merge pull request #55 from Lugoues/upstream
Added MrLovenstein Comic
2014-01-26 05:49:50 -08:00
Bastian Kleineidam
cc5ee572fb Fix some comics 2014-01-24 23:17:21 +01:00
Peter B
66f6b08163 Added MrLovenstein Comic 2014-01-23 20:23:24 -05:00
Bastian Kleineidam
1a56fbb3dd Fix DemolitionSquad 2014-01-20 19:01:47 +01:00
Bastian Kleineidam
8b0f149c2b Updated copyright 2014-01-19 13:16:22 +01:00
Peter B
740bcb72ce Added Eat That Toast 2014-01-12 19:08:02 -05:00
Peter B
124cf99665 Added Poorly Drawn Lines replacing GoComics' version. 2014-01-12 19:08:02 -05:00
Bastian Kleineidam
e738454cb1 Correct drunkduck disablement comment. 2014-01-11 20:04:52 +01:00
Peter B
d0031b65c8 Added "Stand Still. Stay Silent." comic. 2014-01-08 11:08:19 -05:00
Bastian Kleineidam
69bffc9c92 Fix invalid description. 2014-01-06 16:25:42 +01:00
Bastian Kleineidam
264a20a4db Disable disallowed drunkduck comics. 2014-01-06 09:58:24 +01:00
Bastian Kleineidam
3f4be55332 Merge branch 'upstream' of https://github.com/Lugoues/dosage into Lugoues-upstream 2014-01-06 09:38:25 +01:00
Bastian Kleineidam
d98c2a52dd Skip phdcomic video URL. 2014-01-06 08:20:58 +01:00
Peter B
ceca4ba102 Added FoulLanguage Comic 2014-01-06 00:34:37 -05:00
Peter B
1de57ea1fe added Camp Comic 2014-01-05 23:09:19 -05:00
Bastian Kleineidam
ef17268ace Fix comic list output. 2014-01-05 17:37:13 +01:00
Bastian Kleineidam
5fe48d013a Increase wait interval. 2014-01-05 17:14:19 +01:00
Bastian Kleineidam
4d63920434 Updated copyright. 2014-01-05 16:50:57 +01:00
Bastian Kleineidam
b6c913e2d5 Wait some time between requests. 2014-01-05 16:23:45 +01:00
Bastian Kleineidam
1affe58370 Use thread name in log output. 2014-01-05 16:17:34 +01:00
Bastian Kleineidam
bb18295798 Use realpath to detect symlinked instances. 2014-01-05 11:16:57 +01:00
Bastian Kleineidam
d9edeb1343 Limit cyanideandhappiness filename length 2014-01-05 11:08:15 +01:00