Commit graph

580 commits

Author SHA1 Message Date
Tobias Gruetzmacher 68d4dd463a Revert robots.txt handling.
This brings us back to only honouring robots.txt on page downloads, not
on image downloads.

Rationale: Dosage is not a "robot" in the classical sense. It's not
designed to spider huge numbers of web sites in search of content
to index; it's only intended to help users keep a personal archive of
comics they are interested in. We try very hard to never download any image
twice. This fixes #24.

(Precedent for this rationale: Google Feedfetcher:
https://support.google.com/webmasters/answer/178852?hl=en#robots)
2015-07-17 20:46:56 +02:00
Tobias Gruetzmacher 7d3bd15c2f Remove AbleAndBaker, site is gone. 2015-07-16 00:49:48 +02:00
Tobias Gruetzmacher 472afa24d3 GoComics doesn't allow spiders, disable them...
This removes 757 comics, including quite popular ones like Calvin and
Hobbes, Garfield, FoxTrot, etc. :(
2015-07-16 00:36:10 +02:00
Tobias Gruetzmacher 88e387ad15 Add Sleepless Domain. 2015-07-12 18:31:21 +02:00
Tobias Gruetzmacher 0b6d7425e1 Remove BladeKitten.
It's not available online anymore, only in print or as a PDF download.
2015-07-11 01:29:21 +02:00
Tobias Gruetzmacher d97a9c63e4 Add Erstwhile. 2015-07-10 01:14:56 +02:00
Damjan Košir 7abca1222b added NerfNow 2015-07-07 22:18:06 +12:00
Damjan Košir 119a3cd13a added text to ScandinaviaAndTheWorld 2015-07-07 19:48:25 +12:00
Damjan Košir 5f243e3868 not a comic 2015-07-05 18:33:14 +12:00
Damjan Košir 5e7ad33fc8 Nnewts disabled 2015-07-05 18:32:33 +12:00
Damjan Košir 45012ff9c3 BladeKitten disabled 2015-07-05 18:31:38 +12:00
Tobias Gruetzmacher 0c6feec8cd Fix module name EastCoastVsWestCoast. 2015-06-24 00:51:42 +02:00
Damjan Košir 96572e8cba added TheMelvinChronicles 2015-06-12 21:00:11 +12:00
Damjan Košir 6412e6e542 fixed Spinnerette 2015-06-08 20:31:13 +12:00
Damjan Košir 3d8a49d228 realised TheWebcomicFactory is actually 28 comics... added them 2015-06-07 21:33:59 +12:00
Damjan Košir 05bb22b3ef added TheWebcomicFactory 2015-06-06 14:25:32 +12:00
Damjan Košir c98800388e added Sithrah 2015-06-04 19:24:55 +12:00
Damjan Košir 010b4bf669 renaming comicpress to wordpress (as it's not just for the comicpress theme) 2015-06-04 19:12:40 +12:00
Damjan Košir bc91f5f1fb added MistyTheMouse 2015-06-04 19:06:40 +12:00
Damjan Košir e2d01e4924 fixed ScandinaviaAndTheWorld 2015-06-04 18:58:59 +12:00
Damjan Košir 545a67111e fixed Alice 2015-06-01 15:15:34 +12:00
Damjan Košir a08ad2dc80 fixed GoGetARoomie 2015-06-01 15:11:16 +12:00
Damjan Košir ceb19ed2bc fixed Wulffmorgenthaler (now Wumo), added TruthFacts and MeAndDanielle 2015-06-01 12:14:52 +12:00
Damjan Košir 4cd88ecdc0 fixed WormWorldSaga 2015-06-01 11:45:22 +12:00
Damjan Košir ea6cb925a6 fixed LoadingArtist 2015-06-01 11:33:50 +12:00
Damjan Košir e268b09567 fixed EarthsongSaga 2015-06-01 11:19:02 +12:00
Damjan Košir 29c8d2eea0 fixed Meek 2015-05-31 23:41:12 +12:00
Damjan Košir 9be6f613e4 fixed MysteriesOfTheArcana 2015-05-31 23:39:04 +12:00
Damjan Košir 3ea8236224 fixed FowlLanguage 2015-05-31 23:29:34 +12:00
Damjan Košir c1245a85ad moved Footloose, added Cherry, Desigaspring 2015-05-31 23:23:02 +12:00
Damjan Košir 01aeebfbe4 fixed Footloose 2015-05-31 23:16:12 +12:00
Damjan Košir 029fa74067 fixed Bardsworth 2015-05-31 23:03:40 +12:00
Damjan Košir f3036de8fd fixed Pimpette 2015-05-31 22:57:25 +12:00
Damjan Košir df7404fd7c fixed CatsAndCameras 2015-05-31 22:50:17 +12:00
Damjan Košir d4cc8ac857 added buni 2015-05-27 20:36:11 +12:00
Damjan Košir 9beeceffad added BusinessCat and HappyJar 2015-05-27 20:34:51 +12:00
Damjan Košir d970d27b14 removing duplicate 2015-05-27 00:10:46 +12:00
Damjan Košir 33abd95348 fixed TheGentlemansArmchair 2015-05-26 23:48:22 +12:00
Damjan Košir 5e123ae79e fixed DarkWings (now available under the real name Eryl as well), added Ashes, Laiyu, NoMoreSavePoints and EasilyAmused 2015-05-26 23:43:15 +12:00
Damjan Košir 9adb020fc2 fixed DemolitionSquad 2015-05-26 22:59:25 +12:00
Damjan Košir 605c5f8619 fixed PokeyThePenguin 2015-05-26 22:31:43 +12:00
Damjan Košir 766b7ba99d fixed ProperBarn, added 2214 and OTE 2015-05-26 22:16:55 +12:00
Damjan Košir 2c41435ceb fixing HijiNKS ENSUE and added all 4 comics on that page 2015-05-26 22:06:55 +12:00
Damjan Košir 465e7eaf6f fixing CowboyJedi kinda... there is currently no comic on the front page and the author knows it 2015-05-26 21:35:36 +12:00
Damjan Košir 529a41397a fixing CorydonCafe 2015-05-26 21:32:25 +12:00
Damjan Košir c3abb93e99 fixing ChainsawSuit 2015-05-26 19:53:04 +12:00
Damjan Košir f8690af029 fixing Curvy 2015-05-26 19:47:31 +12:00
Damjan Košir 36c790fa4b fixing CraftedFables 2015-05-26 19:32:12 +12:00
Damjan Košir 7067c51056 fixed CheckerboardNightmare 2015-05-25 22:19:36 +12:00
Damjan Košir 5569439c43 fixed 16 comics 2015-05-25 21:57:06 +12:00
Damjan Košir 3edaa97fb9 fixing KatzenfutterGeleespritzer 2015-05-25 20:06:58 +12:00
Damjan Košir 8a245e1d10 fixing BloodBound 2015-05-21 00:04:07 +12:00
Damjan Košir dc2349951a moving BroodHollow to comicpress 2015-05-21 00:00:35 +12:00
Damjan Košir a05ae9c75d fixing PandyLand 2015-05-20 23:56:49 +12:00
Damjan Košir fd60065591 fixing OnTheEdge 2015-05-20 23:50:18 +12:00
Damjan Košir 80b783c016 fixing CourtingDisaster 2015-05-20 23:16:54 +12:00
Damjan Košir ff239ff58e Merge branch 'comicpress' 2015-05-20 23:12:03 +12:00
Damjan Košir 77c5dbce9b better prevSearch for comic press 2015-05-20 23:08:02 +12:00
Damjan Košir bc4e7a03f2 fixed BroodHollow 2015-05-20 23:03:15 +12:00
Damjan Košir 8de620c78b fixed CigarroAndCerveja 2015-05-20 22:58:13 +12:00
Damjan Košir 77a9cce00d fixing Hipsters 2015-05-19 19:49:45 +12:00
Damjan Košir 79d775a8d9 adding comicpress scraper 2015-05-16 00:15:32 +12:00
Damjan Košir 962286d391 fixed OctopusPie 2015-05-14 23:06:12 +12:00
Damjan Košir 3bbf2d5c23 fixing neko the kitty 2015-05-14 22:42:04 +12:00
Damjan Košir f75fc62e84 fixing pebbleversion 2015-05-14 22:33:46 +12:00
Helge Stasch 5a1ef9b791 Fixed problem with LookingForGroup comic 2015-05-07 13:57:10 +02:00
Damjan Košir 9a009018c7 adding strip Moonsticks 2015-05-07 23:00:55 +12:00
Helge Stasch 64a875388f Added Comic MaxOveracts 2015-05-04 14:06:01 +02:00
Marc Winkelmann 69e5b8ad93 Shermans Lagoon and On The Fastrack working again. Also corrected name. 2015-05-02 22:27:08 +02:00
DirkReiners 1438330a94 Fixes and Additions...
Fixed SabrinaOnline
Fixed SMBC
Added StandStillStaySilent (partial, prevsearch not working yet)
2015-04-29 10:37:14 -05:00
DirkReiners 749beff7a3 Added MareInternum (marecomic.com) 2015-04-29 10:36:12 -05:00
DirkReiners 273b429fcd Merge branch 'master' of https://github.com/webcomics/dosage 2015-04-29 09:51:47 -05:00
Damjan Košir 391313972c fixed ManlyGuysDoingManlyThings 2015-04-26 23:47:38 +12:00
Damjan Košir 9837a87a43 fixed omake theater 2015-04-26 23:32:22 +12:00
Damjan Košir 8df9d20556 added doctor cat 2015-04-26 22:32:52 +12:00
Damjan Košir dc427d6066 fixed the gamercat 2015-04-26 21:52:31 +12:00
Damjan Košir 1e94a3c7c5 now the same as official version 2015-04-25 20:52:03 +12:00
Damjan Košir dae2698102 removing mismerge 2015-04-25 20:40:28 +12:00
Damjan Košir dc014a7cb4 Merge remote-tracking branch 'upstream/master'
Conflicts:
	dosagelib/plugins/e.py
	dosagelib/plugins/i.py
	dosagelib/plugins/n.py
	dosagelib/plugins/s.py
	dosagelib/plugins/t.py
	dosagelib/plugins/w.py
2015-04-25 20:28:27 +12:00
DirkReiners b8ef6958b9 Merge branch 'master' of https://github.com/webcomics/dosage 2015-04-24 15:38:36 -05:00
Helge Stasch 4cdd92dcd7 Added comic Magellan 2015-04-23 09:12:24 +02:00
Tobias Gruetzmacher 9f33c31c68 Merge pull request #12 from Freestila/master
Changed comic name, since comic is named FowlLanguage instead of FoulLan...

Conflicts:
	dosagelib/plugins/f.py
2015-04-22 22:24:26 +02:00
Helge Stasch 8218e805b2 Changed comic name, since comic is named FowlLanguage instead of FoulLanguage 2015-04-22 21:25:10 +02:00
Tobias Gruetzmacher bf9bf5e9b0 Merge pull request #11 from Freestila/master
Added "Ralf the Destroyer"
2015-04-21 23:46:11 +02:00
Helge Stasch d7e9c8eb94 Added "Ralf the Destroyer" 2015-04-21 19:12:40 +02:00
Tobias Gruetzmacher ff21df596b Remove descriptions and genres (closes #9).
Maintaining the descriptions creates quite a bit of overhead (finding
them, copying them, checking if they are still correct) for a minimal
user benefit.

PS: Viewing this diff should be easier in a difftool that shows changes
in a line, for example kdiff3.
2015-04-20 20:29:09 +02:00
Tobias Gruetzmacher 3b33129e58 Fix ViiviJaWagner. 2015-04-18 22:45:13 +02:00
Tobias Gruetzmacher e8af5adcb8 Update list of supported GoComics comics. 2015-04-18 02:04:31 +02:00
Tobias Gruetzmacher f0831a1f0f Fix and update ArcaMax (fixes #8). 2015-04-17 21:53:13 +02:00
DirkReiners 99f33151e2 Merge branch 'master' of https://github.com/webcomics/dosage 2015-04-16 18:36:42 -05:00
DirkReiners 8f3a9f660a Fixed ASofterWorld 2015-04-16 18:35:21 -05:00
DirkReiners 49b964cb3c Added PS238 2015-04-16 18:20:14 -05:00
Manabi 65c021ef2b Fixed IAmArg 2015-04-15 14:43:06 -04:00
Manabi 475739ea60 Fixing DogHouseDiaries 2015-04-15 12:56:03 -04:00
Manabi c0619e8dca Fixing DogHouseDiaries 2015-04-15 12:51:45 -04:00
Manabi 2b98a9023e Added Peanuts Begins & Wizard of Id Classics 2015-04-13 22:26:12 -04:00
Tobias Gruetzmacher 974752951b Fix xkcd (closes #3), remove adult tag (fixes wummel#85). 2015-04-12 20:06:34 +02:00
Tobias Gruetzmacher 5934f03453 Merge branch 'htmlparser' - I think it's ready.
This closes pull request #70.
2015-04-01 22:13:55 +02:00
Tobias Gruetzmacher 614c25e278 Fix coding style. 2015-03-22 17:13:53 +01:00
Tobias Gruetzmacher e94e2ae432 Merge pull request #95 from serenitas50/master
Added comic Beetlebum (http://blog.beetlebum.de/).
2015-03-22 17:04:36 +01:00
Tobias Gruetzmacher b5ed4c56b6 Merge pull request #94 from Manabi/master
Added definition for Drive comic

Conflicts:
	dosagelib/plugins/g.py
2015-03-22 16:34:07 +01:00
Tobias Gruetzmacher b5368b366a Merge Gaia(German), SandraAndWoo(German) into common base.
This also fixes #97 by correcting the imageSearch regex.
2015-02-04 19:41:52 +01:00
Manabi f85464ccb2 Fixed unclosed ' error
Lines 293/294 should have been one line, this is now fixed.
2015-02-02 04:35:49 -05:00
Manabi 190f53ee4d Fixing name of GunnkriggCourt
Existing name was missing a g.
2015-02-02 04:24:32 -05:00
Serenitas50 94004846cd Added comic Beetlebum (http://blog.beetlebum.de/). 2015-01-31 22:07:35 -02:00
Manabi a5b0d0c5de Added definition for Drive comic 2015-01-26 04:21:24 -05:00
Dirk Reiners b710d3fa81 Merge branch 'master' of https://github.com/wummel/dosage 2015-01-16 13:24:48 -06:00
Dirk Reiners c6f0dd6117 PiledHigherAndDeeper: Fix for new website format 2015-01-16 12:06:17 -06:00
Dirk Reiners e25270c866 Dilbert: Fix for new website format 2015-01-16 12:05:53 -06:00
Dirk Reiners 3724eba835 Cyanide And Happiness: Fix for new website format 2015-01-16 12:05:36 -06:00
Tobias Gruetzmacher f8531eca57 Move SinFest back to KeenSpot namespace. 2015-01-16 00:16:28 +01:00
Tobias Gruetzmacher 4733153d01 Merge pull request #87 from rpglover64/master
Update SinFest to work with new website.
2015-01-16 00:15:04 +01:00
Alex Rozenshteyn a0506b22f0 Update ZenPencils URL. 2014-12-16 13:51:52 -05:00
Alex Rozenshteyn 51996e45ed Update SinFest to work with new website. 2014-12-16 12:01:54 -05:00
Tobias Gruetzmacher b7bc16650a Merge branch 'carlosefonseca/master' 2014-12-10 00:07:21 +01:00
Tobias Gruetzmacher 5af4f45505 Merge branch 'zac9/patch-2' 2014-12-10 00:03:08 +01:00
Tobias Gruetzmacher 32265c99d7 Merge branch 'zac9/patch-1' 2014-12-10 00:00:51 +01:00
Carlos Fonseca 04cc07a466 Added comic Nimona 2014-12-08 13:28:37 +00:00
mbrandis 25cf4888ae - Adapted ShermansLagoon
- Better version of OnTheFastTrack
2014-11-14 20:37:06 +01:00
mbrandis c63f927e5c - Modified OnTheFasttrack adapting the new API. 2014-11-14 20:09:42 +01:00
Dirk Reiners fda654b5e0 Some fixes...
AbstruseGoose: fixed prev
Carciphona: fixed latest
Curtailed: fixed image and prev (moved to WP)
DorkTower: fixed image search
GrrlPower: fixed site name issue
MadamAndEve: archive not updated in a long time, but current strip is.
Works, but needs to be run daily.
PennyArcade: fixed namer
PvPonline: fixed prev
2014-10-24 16:42:32 -05:00
Tobias Gruetzmacher 6769e1eb36 Add StrongFemaleProtagonist.
This uses the _ParserScraper and CSS selectors.
2014-10-13 23:39:50 +02:00
zac9 6ca200419a Update s.py 2014-09-28 19:48:26 -07:00
zac9 5b7ab5a711 Update o.py 2014-09-28 19:41:29 -07:00
zac9 491b5457b2 Added comic ShotgunShuffle 2014-09-28 06:29:02 -07:00
Bastian Kleineidam 731291979d Fixed RedMeat. 2014-09-22 22:14:31 +02:00
Bastian Kleineidam e87f5993b8 Merge branch 'master' into htmlparser 2014-08-07 18:10:15 +02:00
Tobias Gruetzmacher 08175d28c9 Fix Ruthe (see #73). 2014-07-31 21:27:49 +02:00
Tobias Gruetzmacher ca2d722d39 Fix DieFruehreifen (closes #73). 2014-07-31 21:18:15 +02:00
Tobias Gruetzmacher 6c7fb176b1 Add Blade Kitten as an example for the new parser. 2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher 2567bd4e57 Convert starters and other helpers to new interface.
This allows those starters to work with future scrapers.
2014-07-26 11:28:43 +02:00
Tobias Gruetzmacher 4265053846 Refactor: Move regular expression scraping into a new class.
- This also makes "<base href>" handling an internal detail of the regular
  expression scraper; future scrapers might not need that or handle it in
  another way.
2014-07-26 11:28:43 +02:00
Bastian Kleineidam 950dd2932c Remove stray print statement. 2014-07-21 20:20:15 +02:00
Tobias Gruetzmacher ea5d533e30 Fix index lookups for SnowFlame and SnowFlakes. 2014-07-19 13:23:42 +02:00
Bastian Kleineidam 4d49d4394b Fix doc 2014-07-03 18:42:06 +02:00
Bastian Kleineidam f194e430bc TheThinHLine: fetch bigger images and name image files from sequence number. 2014-07-03 18:41:25 +02:00
Bastian Kleineidam 4845a4ccc1 Merge branch 'master' of github.com:wummel/dosage 2014-07-03 17:12:42 +02:00
Bastian Kleineidam 641daa738b Updated list of comics 2014-07-03 17:12:25 +02:00
Bastian Kleineidam 4c2a339e25 Fix some comics. 2014-07-02 19:51:53 +02:00
Luc Fouin cb76198da7 added the thin H line, fixes #67 2014-07-02 17:14:33 +02:00
Luc Fouin 763f9b02a2 added the thin H line 2014-07-02 17:11:33 +02:00
Bastian Kleineidam b03ba158ef Fixed LookingForGroup 2014-07-01 23:44:01 +02:00
Bastian Kleineidam 3485e2ac54 Added Whomp. 2014-06-24 20:48:49 +02:00
wummel a0086bfcd8 Merge pull request #63 from sehrgut/master
Updated GirlGenius to new markup
2014-06-24 20:40:15 +02:00
Peter B 8f1c864ec3 Added Safely Endangered 2014-06-17 01:05:11 -04:00
Keith Beckman 236b840363 Updated GirlGenius to new markup
GG markup has changed, so I fixed the prevSearch regex to find the
"previous" button on the redesigned page.

As well, I set multipleImagesPerStrip to true, since there are quite a
few comics with multiple images that were being discarded.
2014-06-13 16:43:40 -04:00
Bastian Kleineidam 00e424aed0 Fix zenpencils. 2014-06-08 13:40:42 +02:00
Bastian Kleineidam c528fd1822 Merge branch 'master' of github.com:wummel/dosage 2014-06-08 10:07:36 +02:00
Bastian Kleineidam 0ee5c08771 Match zoom image for GoComics pages. 2014-06-08 10:06:34 +02:00
Peter B 78954da9d7 fix StandStillStaySilent, strip urls when downloading 2014-06-04 01:58:16 -04:00