Tobias Gruetzmacher
7f1e136d8b
Sort comics alphabetically & PEP8 style fixes.
2016-03-31 23:13:54 +02:00
Tobias Gruetzmacher
d6db1d0b81
Fix a conflict with IPython.
2016-03-20 23:57:07 +01:00
Tobias Gruetzmacher
90dfceaeb1
Remove dead modules (& format).
2016-03-20 20:48:42 +01:00
Tobias Gruetzmacher
f243096d49
Fix GastroPhobia, remove GeneralProtectionFault.
...
(& formatting)
2016-03-20 20:11:21 +01:00
Tobias Gruetzmacher
cfcfcc2468
Switch plugin loading to pkgutil.
...
This should work with all PEP-302 loaders that implement iter_modules.
Unfortunately, PyInstaller (which I plan to use for Windows releases)
does not support it, so we can't avoid a special case. Anyway,
this should help with #22.
2016-03-20 15:13:24 +01:00
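The pkgutil-based plugin loading described in cfcfcc2468 could look roughly like the sketch below. This is not the project's actual code: the function name iter_plugin_modules is hypothetical, and the PyInstaller special case mentioned in the commit is left out.

```python
import importlib
import pkgutil


def iter_plugin_modules(package_name):
    """Import and yield all direct submodules of a package, sorted by name."""
    package = importlib.import_module(package_name)
    prefix = package.__name__ + "."
    # iter_modules asks the finder of each path entry for its modules, so
    # this is not limited to plain directories on disk; it works with any
    # finder that implements iter_modules.
    names = sorted(
        name for _finder, name, _ispkg
        in pkgutil.iter_modules(package.__path__, prefix)
    )
    for name in names:
        yield importlib.import_module(name)
```

Under these assumptions, the scraper modules would be collected with something like iter_plugin_modules("dosagelib.plugins"). Sorting the names keeps the load order stable between runs.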
Tobias Gruetzmacher
1af022895e
Fix NuklearPower ( fixes #38 ).
...
Also remove make_scraper magic.
2016-03-17 23:19:52 +01:00
Tobias Gruetzmacher
552f29e5fc
Update ComicFury comics. (+871, -245)
...
- Remove make_scraper magic
- Switch to HTML parser
- Update parsing of comic listing.
2016-03-17 00:44:06 +01:00
Tobias Gruetzmacher
6727e9b559
Use vendored urllib3.
...
As long as requests ships with urllib3, we can't fall back to the
"system" urllib3, since that breaks class-identity checks.
2016-03-16 23:18:19 +01:00
Damjan Košir
615f094ef3
fixing EdmundFinney
2016-03-14 20:32:18 +13:00
Tobias Gruetzmacher
c4fcd985dd
Let urllib3 handle all retries.
2016-03-13 21:30:36 +01:00
Tobias Gruetzmacher
78e13962f9
Sort scraper modules (mostly for test stability).
2016-03-13 20:24:21 +01:00
Tobias Gruetzmacher
017d35cb3c
Fallback version if pkg_resources not available.
...
This helps for Windows packaging.
2016-03-03 01:05:36 +01:00
Johannes Schöpp
351fa7154e
Modified maximum page size
...
Fixes #36
2016-03-01 22:19:44 +01:00
Damjan Košir
b0dc510b08
adding LastNerdsOnEarth
2016-01-03 14:16:58 +13:00
Damjan Košir
a1e79cbbf2
fixing Fragile
2016-01-03 14:08:49 +13:00
Tobias Gruetzmacher
81827f83bc
Use GitHub releases API for update checks.
2015-11-06 23:07:19 +01:00
Tobias Gruetzmacher
a41574e31a
Make version fetching a bit more robust (use pbr).
2015-11-06 22:08:14 +01:00
Tobias Gruetzmacher
64f7e313d5
Remove make_scraper magic from footloosecomic.py.
2015-11-05 00:03:13 +01:00
Tobias Gruetzmacher
7f7a69818b
Remove make_scraper magic from creators module.
2015-11-04 23:43:31 +01:00
Tobias Gruetzmacher
94470d564c
Fix import for Python 3.
2015-11-03 23:40:45 +01:00
Tobias Gruetzmacher
b819afec39
Switch build to PBR.
...
This gets us:
- Automatic changelog
- Automatic authors list
- Automatic git version management
2015-11-03 23:27:53 +01:00
Tobias Gruetzmacher
dc22d7b32a
Add CatNine comic.
2015-11-02 23:29:56 +01:00
Tobias Gruetzmacher
10d9eac574
Remove support for very old versions of "requests".
2015-11-02 23:24:01 +01:00
MariusK
3e1ea816cc
Fixed 'Ruthe'
2015-10-02 13:52:44 +02:00
Helge Stasch
48d8519efd
Changed Goblins comic - moved to new scraper and fixed minor issues with some comics (old scraper was unstable for some comics of Goblins)
2015-09-28 23:50:15 +02:00
Helge Stasch
17fbdf2bf7
Added comic "Ahoy Earth"
2015-09-27 00:44:47 +02:00
Tobias Gruetzmacher
d72ceb92d5
BloomingFaeries: Remove imageUrlModifier (not needed).
2015-09-04 00:37:05 +02:00
Tobias Gruetzmacher
abd80a1d35
Merge pull request #28 from KevinAnthony/master
...
added comic Blooming Faeries
2015-09-03 23:26:37 +02:00
Tobias Gruetzmacher
b737218182
ZenPencils: Allow multiple images per page.
2015-09-03 23:24:28 +02:00
Kevin Anthony
62ec1f1d18
Removed debugging print statement
2015-09-02 11:22:24 -04:00
Kevin Anthony
d7180eaf99
removed bad whitespace
2015-09-02 11:04:32 -04:00
Kevin Anthony
6e8231e78a
Added Namer to BloomingFaeries since the web comic author doesn't seem interested in sticking to any kind of file naming convention
2015-09-02 11:01:48 -04:00
Kevin Anthony
1045bb7d4a
added comic Blooming Faeries
2015-09-02 10:13:42 -04:00
Damjan Košir
11f0aa3989
created Wordpress Scraper class
2015-08-11 21:31:45 +12:00
Damjan Košir
0a5b792c32
added Fragile (English and Spanish)
2015-08-07 23:37:10 +12:00
Damjan Košir
fd9c480d9c
adding bonus panel to SWBC and multiple images flag to ParserScraper
2015-08-03 22:58:44 +12:00
Damjan Košir
f8a163a361
added a CMS ComicControl, moved some existing comics there, added StreetFighter and Metacarpolis
2015-08-03 22:40:06 +12:00
Damjan Košir
648a84e38e
added Sharksplode
2015-08-03 22:20:17 +12:00
Damjan Košir
c19806b681
added AoiHouse
2015-07-31 23:33:30 +12:00
Damjan Košir
2201c9877a
added KiwiBlitz
2015-07-31 23:09:56 +12:00
Damjan Košir
fe22df5e5b
added LetsSpeakEnglish
2015-07-31 23:06:06 +12:00
Damjan Košir
79ec427fc0
added CatVersusHuman
2015-07-30 22:16:34 +12:00
Tobias Gruetzmacher
303432fc68
Also use css expressions for textSearch.
2015-07-18 01:22:40 +02:00
Tobias Gruetzmacher
6a70bf4671
Enable some comics based on current policy.
2015-07-18 01:21:29 +02:00
Tobias Gruetzmacher
6b0046f9b3
Fix small typos.
2015-07-18 00:11:44 +02:00
Tobias Gruetzmacher
68d4dd463a
Revert robots.txt handling.
...
This brings us back to only honouring robots.txt on page downloads, not
on image downloads.
Rationale: Dosage is not a "robot" in the classical sense. It's not
designed to spider huge amounts of web sites in search for some content
to index, it's only intended to help users keep a personal archive of
comics they are interested in. We try very hard to never download any image
twice. This fixes #24.
(Precedent for this rationale: Google Feedfetcher:
https://support.google.com/webmasters/answer/178852?hl=en#robots )
2015-07-17 20:46:56 +02:00
Tobias Gruetzmacher
7d3bd15c2f
Remove AbleAndBaker, site is gone.
2015-07-16 00:49:48 +02:00
Tobias Gruetzmacher
472afa24d3
GoComics doesn't allow spiders, disable them...
...
This removes 757 comics, including quite popular ones like Calvin and
Hobbes, Garfield, FoxTrot, etc. :(
2015-07-16 00:36:10 +02:00
Tobias Gruetzmacher
7c15ea50d8
Also check robots.txt on image downloads.
...
We DO want to honour if images are blocked by robots.txt
2015-07-15 23:50:57 +02:00
Tobias Gruetzmacher
5affd8af68
More relaxed robots.txt handling.
...
This is in line with how Perl's LWP::RobotUA and Google handle server
errors when fetching robots.txt: Just assume access is allowed.
See https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
2015-07-15 19:11:55 +02:00
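The relaxed handling described in 5affd8af68 (treat a failed robots.txt fetch as "access allowed") can be sketched with the standard library's urllib.robotparser. This is an illustration, not Dosage's implementation: build_robot_parser and the fetch_text callable are hypothetical, and the fetcher is assumed to raise OSError on server or network errors.

```python
from urllib.robotparser import RobotFileParser


def build_robot_parser(fetch_text, robots_url):
    """Return a parser for robots_url that allows everything on fetch errors.

    fetch_text is a hypothetical callable: it takes a URL and returns the
    robots.txt body as a string, or raises OSError if the fetch fails.
    """
    parser = RobotFileParser(robots_url)
    try:
        lines = fetch_text(robots_url).splitlines()
    except OSError:
        # Fetch failed: parse an empty rule set, which permits every URL.
        lines = []
    parser.parse(lines)
    return parser
```

A caller would then ask parser.can_fetch(useragent, url) before downloading a page.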
Tobias Gruetzmacher
88e387ad15
Add Sleepless Domain.
2015-07-12 18:31:21 +02:00
Tobias Gruetzmacher
0b6d7425e1
Remove BladeKitten.
...
It's not available online anymore, only in print or as a PDF download.
2015-07-11 01:29:21 +02:00
Tobias Gruetzmacher
808b624e5f
Remove hard dependency on pycountry again.
...
This basically reverts commit 86b31dc12b.
It now works like this: If the user has pycountry installed, it is used.
If not, Dosage falls back to a small internal list generated from
pycountry by scripts/mklanguages.py.
This means additional work if we ever decide to translate Dosage, since
pycountry already has all the translations for language names...
This fixes #23 .
2015-07-11 01:27:39 +02:00
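The optional-dependency pattern from 808b624e5f can be sketched as follows. The names FALLBACK_LANGUAGES and language_name are hypothetical, the four-entry table only stands in for the list generated by scripts/mklanguages.py, and the lookup assumes a recent pycountry release where languages are queried with the alpha_2 keyword.

```python
try:
    import pycountry
except ImportError:
    # pycountry is optional; fall back to the small built-in table below.
    pycountry = None

# Hypothetical stand-in for the generated internal language list.
FALLBACK_LANGUAGES = {
    "de": "German",
    "en": "English",
    "es": "Spanish",
    "fr": "French",
}


def language_name(code):
    """Return the English name for a two-letter language code."""
    if pycountry is not None:
        try:
            lang = pycountry.languages.get(alpha_2=code)
        except LookupError:
            # Some pycountry versions raise instead of returning None.
            lang = None
        if lang is not None:
            return lang.name
    # Unknown codes are returned unchanged.
    return FALLBACK_LANGUAGES.get(code, code)
```

The trade-off named in the commit message still applies: the fallback table carries no translations, so only pycountry users would get localized language names.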
Tobias Gruetzmacher
d97a9c63e4
Add Erstwhile.
2015-07-10 01:14:56 +02:00
Damjan Košir
7abca1222b
added NerfNow
2015-07-07 22:18:06 +12:00
Damjan Košir
119a3cd13a
added text to ScandinaviaAndTheWorld
2015-07-07 19:48:25 +12:00
Damjan Košir
5f243e3868
not a comic
2015-07-05 18:33:14 +12:00
Damjan Košir
5e7ad33fc8
Nnewts disabled
2015-07-05 18:32:33 +12:00
Damjan Košir
45012ff9c3
BladeKitten disabled
2015-07-05 18:31:38 +12:00
Tobias Gruetzmacher
0c6feec8cd
Fix module name EastCoastVsWestCoast.
2015-06-24 00:51:42 +02:00
Damjan Košir
96572e8cba
added TheMelvinChronicles
2015-06-12 21:00:11 +12:00
Damjan Košir
6412e6e542
fixed Spinnerette
2015-06-08 20:31:13 +12:00
Damjan Košir
3d8a49d228
realised TheWebcomicFactory is actually 28 comics... added them
2015-06-07 21:33:59 +12:00
Damjan Košir
05bb22b3ef
added TheWebcomicFactory
2015-06-06 14:25:32 +12:00
Damjan Košir
c98800388e
added Sithrah
2015-06-04 19:24:55 +12:00
Damjan Košir
010b4bf669
renaming comicpress to wordpress (as it's not just for the comicpress theme)
2015-06-04 19:12:40 +12:00
Damjan Košir
bc91f5f1fb
added MistyTheMouse
2015-06-04 19:06:40 +12:00
Damjan Košir
e2d01e4924
fixed ScandinaviaAndTheWorld
2015-06-04 18:58:59 +12:00
Damjan Košir
545a67111e
fixed Alice
2015-06-01 15:15:34 +12:00
Damjan Košir
a08ad2dc80
fixed GoGetARoomie
2015-06-01 15:11:16 +12:00
Damjan Košir
ceb19ed2bc
fixed Wulffmorgenthaler (now Wumo), added TruthFacts and MeAndDanielle
2015-06-01 12:14:52 +12:00
Damjan Košir
4cd88ecdc0
fixed WormWorldSaga
2015-06-01 11:45:22 +12:00
Damjan Košir
ea6cb925a6
fixed LoadingArtist
2015-06-01 11:33:50 +12:00
Damjan Košir
e268b09567
fixed EarthsongSaga
2015-06-01 11:19:02 +12:00
Damjan Košir
29c8d2eea0
fixed Meek
2015-05-31 23:41:12 +12:00
Damjan Košir
9be6f613e4
fixed MysteriesOfTheArcana
2015-05-31 23:39:04 +12:00
Damjan Košir
3ea8236224
fixed FowlLanguage
2015-05-31 23:29:34 +12:00
Damjan Košir
c1245a85ad
moved Footloose, added Cherry, Desigaspring
2015-05-31 23:23:02 +12:00
Damjan Košir
01aeebfbe4
fixed Footloose
2015-05-31 23:16:12 +12:00
Damjan Košir
029fa74067
fixed Bardsworth
2015-05-31 23:03:40 +12:00
Damjan Košir
f3036de8fd
fixed Pimpette
2015-05-31 22:57:25 +12:00
Damjan Košir
df7404fd7c
fixed CatsAndCameras
2015-05-31 22:50:17 +12:00
Damjan Košir
d4cc8ac857
added buni
2015-05-27 20:36:11 +12:00
Damjan Košir
9beeceffad
added BusinessCat and HappyJar
2015-05-27 20:34:51 +12:00
Damjan Košir
d970d27b14
removing duplicate
2015-05-27 00:10:46 +12:00
Damjan Košir
33abd95348
fixed TheGentlemansArmchair
2015-05-26 23:48:22 +12:00
Damjan Košir
5e123ae79e
fixed DarkWings (now available under the real name Eryl as well), added Ashes, Laiyu, NoMoreSavePoints and EasilyAmused
2015-05-26 23:43:15 +12:00
Damjan Košir
9adb020fc2
fixed DemolitionSquad
2015-05-26 22:59:25 +12:00
Damjan Košir
605c5f8619
fixed PokeyThePenguin
2015-05-26 22:31:43 +12:00
Damjan Košir
766b7ba99d
fixed ProperBarn, added 2214 and OTE
2015-05-26 22:16:55 +12:00
Damjan Košir
2c41435ceb
fixing HijiNKS ENSUE and added all 4 comics on that page
2015-05-26 22:06:55 +12:00
Damjan Košir
465e7eaf6f
fixing CowboyJedi kinda... there is currently no comic on the front page and the author knows it
2015-05-26 21:35:36 +12:00
Damjan Košir
529a41397a
fixing CorydonCafe
2015-05-26 21:32:25 +12:00
Damjan Košir
c3abb93e99
fixing ChainsawSuit
2015-05-26 19:53:04 +12:00
Damjan Košir
f8690af029
fixing Curvy
2015-05-26 19:47:31 +12:00
Damjan Košir
36c790fa4b
fixing CraftedFables
2015-05-26 19:32:12 +12:00
Damjan Košir
7067c51056
fixed CheckerboardNightmare
2015-05-25 22:19:36 +12:00
Damjan Košir
5569439c43
fixed 16 comics
2015-05-25 21:57:06 +12:00
Damjan Košir
3edaa97fb9
fixing KatzenfutterGeleespritzer
2015-05-25 20:06:58 +12:00
Damjan Košir
8a245e1d10
fixing BloodBound
2015-05-21 00:04:07 +12:00
Damjan Košir
dc2349951a
moving BroodHollow to comicpress
2015-05-21 00:00:35 +12:00
Damjan Košir
a05ae9c75d
fixing PandyLand
2015-05-20 23:56:49 +12:00
Damjan Košir
fd60065591
fixing OnTheEdge
2015-05-20 23:50:18 +12:00
Damjan Košir
80b783c016
fixing CourtingDisaster
2015-05-20 23:16:54 +12:00
Damjan Košir
ff239ff58e
Merge branch 'comicpress'
2015-05-20 23:12:03 +12:00
Damjan Košir
77c5dbce9b
better prevSearch for comic press
2015-05-20 23:08:02 +12:00
Damjan Košir
bc4e7a03f2
fixed BroodHollow
2015-05-20 23:03:15 +12:00
Damjan Košir
8de620c78b
fixed CigarroAndCerveja
2015-05-20 22:58:13 +12:00
Damjan Košir
4529fdee3b
adding no downsize option
2015-05-20 22:38:29 +12:00
Damjan Košir
77a9cce00d
fixing Hipsters
2015-05-19 19:49:45 +12:00
Damjan Košir
79d775a8d9
adding comicpress scraper
2015-05-16 00:15:32 +12:00
Damjan Košir
962286d391
fixed OctopusPie
2015-05-14 23:06:12 +12:00
Damjan Košir
3bbf2d5c23
fixing neko the kitty
2015-05-14 22:42:04 +12:00
Damjan Košir
f75fc62e84
fixing pebbleversion
2015-05-14 22:33:46 +12:00
Helge Stasch
5a1ef9b791
Fixed problem with LookingForGroup comic
2015-05-07 13:57:10 +02:00
Damjan Košir
9a009018c7
adding strip Moonsticks
2015-05-07 23:00:55 +12:00
Helge Stasch
64a875388f
Added Comic MaxOveracts
2015-05-04 14:06:01 +02:00
Marc Winkelmann
69e5b8ad93
Shermans Lagoon and On The Fastrack working again. Also corrected name.
2015-05-02 22:27:08 +02:00
DirkReiners
1438330a94
Fixes and Additions...
...
Fixed SabrinaOnline
Fixed SMBC
Added StandStillStaySilent (partial, prevsearch not working yet)
2015-04-29 10:37:14 -05:00
DirkReiners
749beff7a3
Added MareInternum (marecomic.com)
2015-04-29 10:36:12 -05:00
DirkReiners
273b429fcd
Merge branch 'master' of https://github.com/webcomics/dosage
2015-04-29 09:51:47 -05:00
Damjan Košir
391313972c
fixed ManlyGuysDoingManlyThings
2015-04-26 23:47:38 +12:00
Damjan Košir
9837a87a43
fixed omake theater
2015-04-26 23:32:22 +12:00
Damjan Košir
8df9d20556
added doctor cat
2015-04-26 22:32:52 +12:00
Damjan Košir
dc427d6066
fixed the gamercat
2015-04-26 21:52:31 +12:00
Damjan Košir
561005887a
unneeded max
2015-04-26 00:23:45 +12:00
Damjan Košir
ac7b0d7e0e
adding parallel run option
2015-04-26 00:19:08 +12:00
Damjan Košir
1e94a3c7c5
now the same as official version
2015-04-25 20:52:03 +12:00
Damjan Košir
dae2698102
removing mismerge
2015-04-25 20:40:28 +12:00
Damjan Košir
dc014a7cb4
Merge remote-tracking branch 'upstream/master'
...
Conflicts:
dosagelib/plugins/e.py
dosagelib/plugins/i.py
dosagelib/plugins/n.py
dosagelib/plugins/s.py
dosagelib/plugins/t.py
dosagelib/plugins/w.py
2015-04-25 20:28:27 +12:00
DirkReiners
b8ef6958b9
Merge branch 'master' of https://github.com/webcomics/dosage
2015-04-24 15:38:36 -05:00
Helge Stasch
4cdd92dcd7
Added comic Magellan
2015-04-23 09:12:24 +02:00
Tobias Gruetzmacher
9f33c31c68
Merge pull request #12 from Freestila/master
...
Changed comic name, since comic is named FowlLanguage instead of FoulLan...
Conflicts:
dosagelib/plugins/f.py
2015-04-22 22:24:26 +02:00
Tobias Gruetzmacher
bf9f45b380
Switch to setuptools and cleanup metadata.
...
py2exe support is gone for now, will be restored later.
2015-04-22 22:22:03 +02:00
Helge Stasch
8218e805b2
Changed comic name, since comic is named FowlLanguage instead of FoulLanguage
2015-04-22 21:25:10 +02:00
Tobias Gruetzmacher
bf9bf5e9b0
Merge pull request #11 from Freestila/master
...
Added "Ralf the Destroyer"
2015-04-21 23:46:11 +02:00
Tobias Gruetzmacher
86b31dc12b
Depend on pycountry directly.
2015-04-21 21:56:54 +02:00
Helge Stasch
d7e9c8eb94
Added "Ralf the Destroyer"
2015-04-21 19:12:40 +02:00
Tobias Gruetzmacher
d5e7690419
Fix size comparison for RSS & HTML output.
...
This was always broken, but somehow worked with Python 2.7 (WTF?). Now
that we test with Pillow, this code path runs with Python 3 and throws
an error.
2015-04-21 00:01:23 +02:00
Tobias Gruetzmacher
ff21df596b
Remove descriptions and genres ( closes #9 ).
...
Maintaining the descriptions creates quite a bit of overhead (finding
them, copying them, checking if they are still correct) for a minimal
user benefit.
PS: Viewing this diff should be easier in a difftool that shows changes
in a line, for example kdiff3.
2015-04-20 20:29:09 +02:00
Tobias Gruetzmacher
3b33129e58
Fix ViiviJaWagner.
2015-04-18 22:45:13 +02:00
Tobias Gruetzmacher
e8af5adcb8
Update list of supported GoComics comics.
2015-04-18 02:04:31 +02:00
Tobias Gruetzmacher
f0831a1f0f
Fix and update ArcaMax ( fixes #8 ).
2015-04-17 21:53:13 +02:00
DirkReiners
99f33151e2
Merge branch 'master' of https://github.com/webcomics/dosage
2015-04-16 18:36:42 -05:00
DirkReiners
8f3a9f660a
Fixed ASofterWorld
2015-04-16 18:35:21 -05:00
DirkReiners
49b964cb3c
Added PS238
2015-04-16 18:20:14 -05:00
Manabi
65c021ef2b
Fixed IAmArg
2015-04-15 14:43:06 -04:00
Manabi
475739ea60
Fixing DogHouseDiaries
2015-04-15 12:56:03 -04:00
Manabi
c0619e8dca
Fixing DogHouseDiaries
2015-04-15 12:51:45 -04:00
Manabi
2b98a9023e
Added Peanuts Begins & Wizard of Id Classics
2015-04-13 22:26:12 -04:00
Tobias Gruetzmacher
974752951b
Fix xkcd ( closes #3 ), remove adult tag (fixes wummel#85).
2015-04-12 20:06:34 +02:00
Tobias Gruetzmacher
5934f03453
Merge branch 'htmlparser' - I think it's ready.
...
This closes pull request #70 .
2015-04-01 22:13:55 +02:00
Tobias Gruetzmacher
614c25e278
Fix coding style.
2015-03-22 17:13:53 +01:00
Tobias Gruetzmacher
e94e2ae432
Merge pull request #95 from serenitas50/master
...
Added comic Beetlebum (http://blog.beetlebum.de/ ).
2015-03-22 17:04:36 +01:00
Tobias Gruetzmacher
b5ed4c56b6
Merge pull request #94 from Manabi/master
...
Added definition for Drive comic
Conflicts:
dosagelib/plugins/g.py
2015-03-22 16:34:07 +01:00
Tobias Gruetzmacher
b5368b366a
Merge Gaia(German), SandraAndWoo(German) into common base.
...
This also fixes #97 by correcting the imageSearch regex.
2015-02-04 19:41:52 +01:00
Manabi
f85464ccb2
Fixed unclosed ' error
...
Lines 293/294 should have been one line, this is now fixed.
2015-02-02 04:35:49 -05:00
Manabi
190f53ee4d
Fixing name of GunnkriggCourt
...
Existing name was missing a g.
2015-02-02 04:24:32 -05:00
Serenitas50
94004846cd
Added comic Beetlebum ( http://blog.beetlebum.de/ ).
2015-01-31 22:07:35 -02:00
Manabi
a5b0d0c5de
Added definition for Drive comic
2015-01-26 04:21:24 -05:00
Dirk Reiners
b710d3fa81
Merge branch 'master' of https://github.com/wummel/dosage
2015-01-16 13:24:48 -06:00
Dirk Reiners
c6f0dd6117
PiledHigherAndDeeper: Fix for new website format
2015-01-16 12:06:17 -06:00
Dirk Reiners
e25270c866
Dilbert: Fix for new website format
2015-01-16 12:05:53 -06:00
Dirk Reiners
3724eba835
Cyanide And Happiness: Fix for new website format
2015-01-16 12:05:36 -06:00
Tobias Gruetzmacher
f8531eca57
Move SinFest back to KeenSpot namespace.
2015-01-16 00:16:28 +01:00
Tobias Gruetzmacher
4733153d01
Merge pull request #87 from rpglover64/master
...
Update SinFest to work with new website.
2015-01-16 00:15:04 +01:00
Alex Rozenshteyn
a0506b22f0
Update ZenPencils URL.
2014-12-16 13:51:52 -05:00
Alex Rozenshteyn
51996e45ed
Update SinFest to work with new website.
2014-12-16 12:01:54 -05:00
Tobias Gruetzmacher
2c1ff889fa
Fix scope in HTML output.
2014-12-10 00:57:17 +01:00
Tobias Gruetzmacher
b7bc16650a
Merge branch 'carlosefonseca/master'
2014-12-10 00:07:21 +01:00
Tobias Gruetzmacher
5af4f45505
Merge branch 'zac9/patch-2'
2014-12-10 00:03:08 +01:00
Tobias Gruetzmacher
32265c99d7
Merge branch 'zac9/patch-1'
2014-12-10 00:00:51 +01:00
Carlos Fonseca
04cc07a466
Added comic Nimona
2014-12-08 13:28:37 +00:00
mbrandis
25cf4888ae
- Adapted ShermansLagoon
...
- Better version of OnTheFastTrack
2014-11-14 20:37:06 +01:00
mbrandis
c63f927e5c
- Modified OnTheFasttrack adapting the new API.
2014-11-14 20:09:42 +01:00
mbrandis
cd48801b0d
- Added next and previous day at end of page.
2014-11-14 15:39:42 +01:00
Dirk Reiners
fda654b5e0
Some fixes...
...
AbstruseGoose: fixed prev
Carciphona: fixed latest
Curtailed: fixed image and prev (moved to WP)
DorkTower: fixed image search
GrrlPower: fixed site name issue
MadamAndEve: archive not updated in a long time, but current strip is.
Works, but needs to be run daily.
PennyArcade: fixed namer
PvPonline: fixed prev
2014-10-24 16:42:32 -05:00
Dirk Reiners
77a5e09c10
Minor fix for using paths to pick comics
2014-10-24 16:39:40 -05:00
Tobias Gruetzmacher
6769e1eb36
Add StrongFemaleProtagonist.
...
This uses the _ParserScraper and CSS selectors.
2014-10-13 23:39:50 +02:00
Tobias Gruetzmacher
1d52d6a152
Add support for CSS selectors to HTML parser.
...
Each comic module author can decide if she wants to use CSS or XPath,
not a mix of both. Using CSS needs the cssselect python module and the
module gets disabled if it is unavailable.
2014-10-13 22:43:06 +02:00
Tobias Gruetzmacher
17bc454132
Bugfix: Don't assume RE patterns in base class.
2014-10-13 22:29:47 +02:00
Tobias Gruetzmacher
e92a3fb3a1
New feature: Comic modules can be "disabled".
...
This is modeled parallel to the "adult" feature, except the user can't
override it via the command line. Each comic module can override the
classmethod getDisabledReasons and give the user a reason why this
module is disabled. The user can see the reason in the comic list (-l or
--singlelist) and the comic module refuses to run, showing the same
message.
This is currently used to disable modules that use the _ParserScraper if
the LXML python module is missing.
2014-10-13 21:43:46 +02:00
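The "disabled" mechanism from e92a3fb3a1 could be sketched like this. Only the classmethod name getDisabledReasons comes from the commit message. The class names, the required_module attribute, the dict return shape, and check_enabled are assumptions made for illustration.

```python
import importlib.util


class Scraper:
    """Hypothetical base class; comic modules are enabled by default."""

    @classmethod
    def getDisabledReasons(cls):
        # Assumed return shape: mapping of short key -> message for the user.
        return {}


class ParserScraperSketch(Scraper):
    """Hypothetical stand-in for a scraper that needs an optional module."""

    required_module = "lxml"

    @classmethod
    def getDisabledReasons(cls):
        reasons = {}
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(cls.required_module) is None:
            reasons[cls.required_module] = (
                "This module needs the %s Python module, which is not installed."
                % cls.required_module
            )
        return reasons


def check_enabled(scraper_class):
    """Refuse to run a disabled module, reporting its reasons to the user."""
    reasons = scraper_class.getDisabledReasons()
    if reasons:
        raise ValueError(
            "Comic module is disabled: " + " ".join(reasons.values())
        )
```

Because the same reasons feed both the comic list and the refusal to run, the user sees one consistent message in both places.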
Tobias Gruetzmacher
d495d95ee0
Refactor: Move repeated check into its own function.
2014-10-13 21:29:54 +02:00
Tobias Gruetzmacher
3235b8b312
Pass unicode strings to lxml.
...
This reverts commit fcde86e9c0 & some
more. This lets python-requests do all the encoding stuff and leaves
LXML with (hopefully) clean unicode HTML to parse.
2014-10-13 19:39:48 +02:00
zac9
6ca200419a
Update s.py
2014-09-28 19:48:26 -07:00
zac9
5b7ab5a711
Update o.py
2014-09-28 19:41:29 -07:00
zac9
491b5457b2
Added comic ShotgunShuffle
2014-09-28 06:29:02 -07:00
Bastian Kleineidam
731291979d
Fixed RedMeat.
2014-09-22 22:14:31 +02:00
Bastian Kleineidam
e43694c156
Don't crash on multiple HTML output runs per day.
2014-09-22 22:00:16 +02:00
Bastian Kleineidam
e87f5993b8
Merge branch 'master' into htmlparser
2014-08-07 18:10:15 +02:00
Tobias Gruetzmacher
08175d28c9
Fix Ruthe (see #73 ).
2014-07-31 21:27:49 +02:00
Tobias Gruetzmacher
ca2d722d39
Fix DieFruehreifen ( closes #73 ).
2014-07-31 21:18:15 +02:00
Tobias Gruetzmacher
6c7fb176b1
Add Blade Kitten as an example for the new parser.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
f9f0b75d7c
Create new HTML parser based scraper class.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
fcde86e9c0
Change getPageContent to (optionally) return raw text.
...
This allows LXML to do its own "magic" encoding detection
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
0e03eca8f0
Move all regular expression operations into the new class.
...
- Move fetchUrls, fetchUrl and fetchText.
- Move base URL handling.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
fde1fdced6
Fix some typos.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
2567bd4e57
Convert starters and other helpers to new interface.
...
This allows those starters to work with future scrapers.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher
4265053846
Refactor: Move regular expression scraping into a new class.
...
- This also makes "<base href>" handling an internal detail of the regular
expression scraper, future scrapers might not need that or handle it in
another way.
2014-07-26 11:28:43 +02:00
Bastian Kleineidam
3a929ceea6
Allow comic text to be optional. Patch from TobiX
2014-07-24 20:49:57 +02:00
Bastian Kleineidam
950dd2932c
Remove stray print statement.
2014-07-21 20:20:15 +02:00
Tobias Gruetzmacher
ea5d533e30
Fix index lookups for SnowFlame and SnowFlakes.
2014-07-19 13:23:42 +02:00
Bastian Kleineidam
4d49d4394b
Fix doc
2014-07-03 18:42:06 +02:00
Bastian Kleineidam
f194e430bc
TheThinHLine: fetch bigger images and name image files from sequence number.
2014-07-03 18:41:25 +02:00
Bastian Kleineidam
4845a4ccc1
Merge branch 'master' of github.com:wummel/dosage
2014-07-03 17:12:42 +02:00
Bastian Kleineidam
641daa738b
Updated list of comics
2014-07-03 17:12:25 +02:00
Bastian Kleineidam
93fe5d5987
Minor useragent refactoring
2014-07-03 17:12:25 +02:00
Bastian Kleineidam
4c2a339e25
Fix some comics.
2014-07-02 19:51:53 +02:00
Luc Fouin
cb76198da7
added the thin H line, fixes #67
2014-07-02 17:14:33 +02:00
Luc Fouin
763f9b02a2
added the thin H line
2014-07-02 17:11:33 +02:00
Bastian Kleineidam
b03ba158ef
Fixed LookingForGroup
2014-07-01 23:44:01 +02:00
Bastian Kleineidam
3485e2ac54
Added Whomp.
2014-06-24 20:48:49 +02:00
wummel
a0086bfcd8
Merge pull request #63 from sehrgut/master
...
Updated GirlGenius to new markup
2014-06-24 20:40:15 +02:00
Peter B
8f1c864ec3
Added Safely Endangered
2014-06-17 01:05:11 -04:00
Keith Beckman
236b840363
Updated GirlGenius to new markup
...
GG markup has changed, so I fixed the prevSearch regex to find the
"previous" button on the redesigned page.
As well, I set multipleImagesPerStrip to true, since there are quite a
few comics with multiple images that were being discarded.
2014-06-13 16:43:40 -04:00
Bastian Kleineidam
68afeaf82d
Make appname lowercase.
2014-06-09 13:24:58 +02:00
Bastian Kleineidam
00e424aed0
Fix zenpencils.
2014-06-08 13:40:42 +02:00
Bastian Kleineidam
687d27d534
Stripping should be done in normaliseUrl.
2014-06-08 10:12:33 +02:00
Bastian Kleineidam
c528fd1822
Merge branch 'master' of github.com:wummel/dosage
2014-06-08 10:07:36 +02:00
Bastian Kleineidam
0ee5c08771
Match zoom image for GoComics pages.
2014-06-08 10:06:34 +02:00
Peter B
78954da9d7
fix StandStillStaySilent, strip urls when downloading
2014-06-04 01:58:16 -04:00
Peter B
71ed9ad69d
fixed foul language
2014-06-04 01:35:40 -04:00
Bastian Kleineidam
62a3a55b82
Fixed LoadingArtist
2014-03-26 19:59:42 +01:00
Bastian Kleineidam
813e6876fc
Add missing @classmethod
2014-03-26 19:59:42 +01:00
Bastian Kleineidam
c2cf58560e
Remove unused import.
2014-03-26 19:59:42 +01:00
Bastian Kleineidam
4bb31953ad
Fix PennyArcade
2014-03-26 19:59:42 +01:00
Freestila
0faf4a722b
Update o.py
...
Removed procedure for "I am over 18" button, since this button no longer exists
2014-03-05 09:28:34 +01:00
Bastian Kleineidam
348dd5e6c0
Add documentation
2014-03-04 20:53:19 +01:00
Bastian Kleineidam
3108c9124a
Fix thread import for py3
2014-03-04 20:50:34 +01:00
Bastian Kleineidam
18972d3830
Remove old waitSeconds parameter.
2014-03-04 18:38:46 +01:00
Bastian Kleineidam
15ef59262a
Make threads interruptible.
2014-03-04 18:38:46 +01:00
Tobias Gruetzmacher
33801376f9
Fix indentation.
2014-02-27 22:31:21 +01:00
Tobias Gruetzmacher
1bcac66c03
Mark MonsieurLeChien as french.
2014-02-27 22:30:02 +01:00
Tobias Gruetzmacher
8e2ba15410
Merge pull request #60 from Freestila/master
...
Added comics - looks good
2014-02-27 22:24:57 +01:00
Luc Fouin
da9f518a7a
add french comic M. Le Chien
2014-02-27 17:45:29 +01:00
Freestila
53ebb51b10
Added comic DungeonsAndDenizens
2014-02-27 15:08:07 +01:00
Freestila
b8fefb37c0
Added comic Underling
2014-02-20 12:54:40 +01:00
Freestila
3d19d45e81
Added a 1 sec wait because of permanent timeout / connection pool exceeded errors from server
2014-02-20 12:54:13 +01:00
Freestila
67c31284f1
Added comic GrimTales from Down Below
2014-02-18 21:12:29 +01:00
Freestila
de0bb1c9d5
Added comic "The Landscaper"
2014-02-18 21:00:43 +01:00
Freestila
96f61542ee
Added comic "Die Fruehreifen"
2014-02-18 21:00:19 +01:00
Peter B
b44b751efa
Fixed EvilInc comics. Closes #58
2014-02-14 19:33:13 -05:00
Bastian Kleineidam
f50ef910be
Skip CyanideAndHappiness videos
2014-02-10 21:58:26 +01:00
Bastian Kleineidam
875e431edc
Provide page data in shouldSkipUrl() function
2014-02-10 21:58:09 +01:00
Bastian Kleineidam
73e1af7aba
Fixed FredoAndPidjin
2014-02-06 19:57:56 +01:00
Peter B
d86442efed
Added Oh Joy Sex Toy.
2014-01-30 22:45:50 -05:00
Peter B
add63d6d6c
Added The Gentleman's Armchair Comic.
2014-01-30 22:32:46 -05:00
Tobias Gruetzmacher
44ef1831bf
Sluggy Freelance has some pages with multiple comics.
...
See for example SluggyFreelance:010422
2014-01-28 19:08:39 +01:00
wummel
6b8854e7b2
Merge pull request #55 from Lugoues/upstream
...
Added MrLovenstein Comic
2014-01-26 05:49:50 -08:00
Bastian Kleineidam
cc5ee572fb
Fix some comics
2014-01-24 23:17:21 +01:00