Updated documentation and fix some comics.

Bastian Kleineidam 2012-11-20 18:53:53 +01:00
parent 64d9fd6ac2
commit 54eaadf4fc
41 changed files with 541 additions and 674 deletions


@ -6,13 +6,11 @@ ARCHIVE:=dosage-$(VERSION).tar.gz
PY_FILES_DIRS := dosage dosagelib tests *.py
PY2APPOPTS ?=
NUMPROCESSORS:=$(shell grep -c processor /proc/cpuinfo)
MAXFAILEDTESTS:=10
# Pytest options:
# - stop after MAXFAILEDTESTS failed errors
# - use multiple processors
# - write test results in file
# - run all tests found in the "tests" subdirectory
PYTESTOPTS:=--maxfail=$(MAXFAILEDTESTS) -n $(NUMPROCESSORS) --resultlog=testresults.txt --tb=short
PYTESTOPTS:=-n $(NUMPROCESSORS) --resultlog=testresults.txt --tb=short
CHMODMINUSMINUS:=--
# directory or file with tests to run
TESTS ?= tests


@ -38,16 +38,6 @@ strip of all of them:
For advanced options and features execute `dosage -h` or look at the dosage
manual page.
Offensive comics
-----------------
There are some comics supported by Dosage that may be offensive to readers or
to others that have access to the downloaded images.
SexyLosers is one module that has been discussed. Dosage offers a mechanism
to disable such modules. Modules listed in "/etc/dosage/disabled" and
"~/.dosage/disabled" will be disabled. These files should contain only one
module name per line. Note: Under Windows "~" will also expand to the user's
home directory, usually "C:\Documents and Settings\UserName".
Dependencies
-------------
Dosage requires Python version 2.7 or higher, which can be downloaded


@ -10,11 +10,12 @@ Changes:
- installation: Require and use Python 2.7
- comics: Removed the twisted and zope dependencies by adding
an internal plugin search mechanism.
- comics: Remove the disable mechanism.
- testing: Refactored the test comic routine in proper unit tests.
- cmdline: Improved terminal feature detection.
Fixes:
- comics: Fix a lot of comics.
- comics: Fix a lot of comics; however there are still some that won't work.
- comics: Don't add empty URLs to the list of found URLs.


@ -2,30 +2,27 @@
.SH NAME
dosage \- comic strip downloader
.SH SYNOPSIS
.B dosage
.RI [ options ]
.I module
.RI [ module .\|.\|.]
\fBdosage\fP [\fIoptions\fP] \fImodule\fP...
.SH DESCRIPTION
.B dosage
is an application designed to keep a local \(oqmirror\(cq of specific
is an application designed to keep a local mirror of specific
web comics and other picture\-based content, such as
\(oqPicture Of The Day\(cq sites, with a variety of options
\fIPicture Of The Day\fP sites, with a variety of options
for updating and maintaining collections.
.SH OPTIONS
.TP
.BI \-b " PATH" "\fR,\fP \-\^\-basepath=" PATH
Specifies a base path to put comic subdirectories. The default is \(oqComics\(cq.
\fB\-b\fP \fIPATH\fP, \fB\-\-basepath=\fP\fIPATH\fP
Specifies a base path to put comic subdirectories. The default is \fBComics\fP.
.TP
.BI \-\^\-baseurl= PATH
\fB\-\-baseurl=\fP\fIPATH\fP
Specifies the base URL for output events. The default is a local file URI.
.TP
.BR \-a ", " \-\^\-all
\fB\-a\fP, \fB\-\-all\fP
Traverses all available strips backwards from the current one.
This can be useful if you want a full collection of a new comic strip,
or update an existing one where files are missing.
.
Catchups can start at a specific image by using the index syntax, see
Catchups can start at a specific strip by using the index syntax, see
the
.B INDEX SYNTAX
and
@ -35,34 +32,32 @@ and want only to download the missing files. To make this task easy,
the traversal ends at the first existing image file when starting from
an index (excluding the index itself).
.TP
.BR \-h ", " \-\^\-help
\fB\-h\fP, \fB\-\-help\fP
Output brief help information.
.TP
.BR \-l ", " \-\^\-list
\fB\-l\fP, \fB\-\-list\fP
List available comic modules in multi\-column fashion.
.TP
.BR \-\^\-singlelist
\fB\-\-singlelist\fP
List available comic modules in single-column fashion.
.TP
.BI \-m " MODULE" "\fR,\fP \-\^\-modulehelp=" MODULE
Output module-specific help for
.IR MODULE .
\fB\-m\fP \fIMODULE\fP, \fB\-\-modulehelp=\fP\fIMODULE\fP
Output module-specific help for \fIMODULE\fP.
.TP
.BI \-o " OUTPUT" "\fR,\fP \-\^\-output=" OUTPUT
.I OUTPUT
may be any one of the following:
\fB\-o\fP \fIOUTPUT\fP, \fB\-\-output=\fP\fIOUTPUT\fP
\fIOUTPUT\fP may be any one of the following:
.PP
.RS
.BR "html " \-
Writes out an HTML file linking to the strips actually downloaded in the
current run, named by date (ala dailystrips). The files can be found in the
\'html' directory of your Comics directory.
\fBhtml\fP directory of your \fBComics\fP directory.
.RE
.PP
.RS
.BR "rss " \-
Writes out an RSS feed detailing what strips were downloaded in the last 24
hours. The feed can be found in Comics/dailydose.xml.
hours. The feed can be found in \fBComics/dailydose.xml\fP.
.RE
.PP
.RS
@ -71,13 +66,13 @@ Writes an RSS feed with all of the strips downloaded during the run, for use
with your favourite RSS aggregator.
.RE
.TP
.BR \-t ", " \-\^\-timestamps
\fB\-t\fP, \fB\-\-timestamps\fP
Print timestamps for all output at any level.
.TP
.BR \-v ", " \-\^\-verbose
\fB\-v\fP, \fB\-\-verbose\fP
Increase the output level by one with each occurrence.
.TP
.BR \-V ", " \-\^\-version
\fB\-V\fP, \fB\-\-version\fP
Display the version number.
.I module
At least one valid
@ -90,32 +85,24 @@ arguments can be specified on the command line.
Module names are case insensitive, and it is sufficient to specify a
unique substring of the module name.
.SH INDEX SYNTAX
One can indicate the start of a list of
.B comma seperated
indices using a
.RB \(oq : "\(cq."
Instead of starting at the latest comic strip, an index lets dosage start
at a certain strip. The index can be specified by appending a colon \fB:\fP
and the index name after the module. Multiple comma-separated indices can
also be specified.
.PP
The index format is documented when using the \fB\-\-modulehelp\fP option.
.SH OFFENSIVE COMICS
Some users may find certain comics offensive and wish to disable them.
Modules listed in
.B /etc/dosage/disabled
and
.B ~/.dosage/disabled
will be disabled. These files should contain only one module name per line.
The index name itself usually is the part of the comic strip URL that identifies
a strip, e.g. a number or a date. The expected format is documented when using
the \fB\-\-modulehelp\fP option.
.SH SPECIAL SYNTAX
.TP
.B @
This expands to mean all the comics currently in your \(oqComics\(cq
This expands to mean all the comics currently in your \fBComics\fP
directory. All other specified comic module names will be ignored.
.TP
.B @@
This expands to mean all the comics available to Dosage.
.PP
.B INDEX SYNTAX
can not be used with
.B SPECIAL SYNTAX
.
\fBINDEX SYNTAX\fP can not be used with \fBSPECIAL SYNTAX\fP.
.SH EXAMPLES
Retrieve all Mega Tokyo comics:
.RS
@ -127,7 +114,7 @@ Retrieve the current comic of Cyanide and Happiness:
.B dosage cyanideandhappiness
.RE
.PP
Retrieve the current strip of all comics in your \(oqComics\(cq directory:
Retrieve the current strip of all comics in your \fBComics\fP directory:
.RS
.B dosage @
.RE
@ -149,7 +136,7 @@ the beginning until an existing file is found:
.SH ENVIRONMENT
.IP HTTP_PROXY
.B mainline
will use the specified HTTP proxy whenever possible.
will use the specified HTTP proxy when downloading URL contents.
.SH NOTES
Should retrieval fail on any given strip
.B mainline
@ -172,15 +159,8 @@ the program run was aborted with Ctrl-C
.PP
Else the return value is zero.
.SH BUGS
See
.I http://trac.slipgate.za.net/dosage
for a list of current development tasks and suggestions.
.SH FILES
.IP "\fB/etc/dosage/disabled\fR"
Disables comic modules on a global scale.
.IP "\fB~/.dosage/disabled\fR"
Disables comic modules on a local scale.
Users can report or view bugs, patches or feature suggestions at
.I https://github.com/wummel/dosage/issues
.SH AUTHORS
Jonathan Jacobs <korpse@slipgate.za.net>
.br


@ -13,42 +13,29 @@ dosage - comic strip downloader
<A NAME="lbAC">&nbsp;</A>
<H2>SYNOPSIS</H2>
<B>dosage</B>
[<I>options</I>]
<I>module</I>
[<I>module</I>...]
<B>dosage</B> [<I>options</I>] <I>module</I>...
<A NAME="lbAD">&nbsp;</A>
<H2>DESCRIPTION</H2>
<B>dosage</B>
is an application designed to keep a local 'mirror' of specific
is an application designed to keep a local mirror of specific
web comics and other picture-based content, such as
'Picture Of The Day' sites, with a variety of options
<I>Picture Of The Day</I> sites, with a variety of options
for updating and maintaining collections.
<A NAME="lbAE">&nbsp;</A>
<H2>OPTIONS</H2>
<DL COMPACT>
<DT><B>-b</B><I> PATH</I><B></B>, --basepath=<I>PATH</I>
<DD>
Specifies a base path to put comic subdirectories. The default is 'Comics'.
<DT><B>--baseurl=</B><I>PATH</I>
<DD>
<DT><B>-b</B> <I>PATH</I>, <B>--basepath=</B><I>PATH</I><DD>
Specifies a base path to put comic subdirectories. The default is <B>Comics</B>.
<DT><B>--baseurl=</B><I>PATH</I><DD>
Specifies the base URL for output events. The default is a local file URI.
<DT><B>-a</B>, <B>--all</B>
<DD>
<DT><B>-a</B>, <B>--all</B><DD>
Traverses all available strips backwards from the current one.
This can be useful if you want a full collection of a new comic strip,
or update an existing one where files are missing.
Catchups can start at a specific image by using the index syntax, see
Catchups can start at a specific strip by using the index syntax, see
the
<B>INDEX SYNTAX</B>
@ -59,30 +46,16 @@ sections for more information. This is useful when you missed some days
and want only to download the missing files. To make this task easy,
the traversal ends at the first existing image file when starting from
an index (excluding the index itself).
<DT><B>-h</B>, <B>--help</B>
<DD>
<DT><B>-h</B>, <B>--help</B><DD>
Output brief help information.
<DT><B>-l</B>, <B>--list</B>
<DD>
<DT><B>-l</B>, <B>--list</B><DD>
List available comic modules in multi-column fashion.
<DT><B>--singlelist</B>
<DD>
<DT><B>--singlelist</B><DD>
List available comic modules in single-column fashion.
<DT><B>-m</B><I> MODULE</I><B></B>, --modulehelp=<I>MODULE</I>
<DD>
Output module-specific help for
<I>MODULE</I>.
<DT><B>-o</B><I> OUTPUT</I><B></B>, --output=<I>OUTPUT</I>
<DD>
<I>OUTPUT</I>
may be any one of the following:
<DT><B>-m</B> <I>MODULE</I>, <B>--modulehelp=</B><I>MODULE</I><DD>
Output module-specific help for <I>MODULE</I>.
<DT><B>-o</B> <I>OUTPUT</I>, <B>--output=</B><I>OUTPUT</I><DD>
<I>OUTPUT</I> may be any one of the following:
</DL>
<P>
@ -91,7 +64,7 @@ may be any one of the following:
Writes out an HTML file linking to the strips actually downloaded in the
current run, named by date (ala dailystrips). The files can be found in the
'html' directory of your Comics directory.
<B>html</B> directory of your <B>Comics</B> directory.
</DL>
<P>
@ -100,7 +73,7 @@ current run, named by date (ala dailystrips). The files can be found in the
<B>rss </B>-
Writes out an RSS feed detailing what strips were downloaded in the last 24
hours. The feed can be found in Comics/dailydose.xml.
hours. The feed can be found in <B>Comics/dailydose.xml</B>.
</DL>
<P>
@ -113,17 +86,11 @@ with your favourite RSS aggregator.
</DL>
<DL COMPACT>
<DT><B>-t</B>, <B>--timestamps</B>
<DD>
<DT><B>-t</B>, <B>--timestamps</B><DD>
Print timestamps for all output at any level.
<DT><B>-v</B>, <B>--verbose</B>
<DD>
<DT><B>-v</B>, <B>--verbose</B><DD>
Increase the output level by one with each occurrence.
<DT><B>-V</B>, <B>--version</B>
<DD>
<DT><B>-V</B>, <B>--version</B><DD>
Display the version number.
<I>module</I>
@ -143,34 +110,23 @@ unique substring of the module name.
<A NAME="lbAF">&nbsp;</A>
<H2>INDEX SYNTAX</H2>
One can indicate the start of a list of
<B>comma seperated</B>
indices using a
'<B>:</B>'.
Instead of starting at the latest comic strip, an index lets dosage start
at a certain strip. The index can be specified by appending a colon <B>:</B>
and the index name after the module. Multiple comma-separated indices can
also be specified.
<P>
The index format is documented when using the <B>--modulehelp</B> option.
The index name itself usually is the part of the comic strip URL that identifies
a strip, e.g. a number or a date. The expected format is documented when using
the <B>--modulehelp</B> option.
<A NAME="lbAG">&nbsp;</A>
<H2>OFFENSIVE COMICS</H2>
Some users may find certain comics offensive and wish to disable them.
Modules listed in
<B>/etc/dosage/disabled</B>
and
<B>~/.dosage/disabled</B>
will be disabled. These files should contain only one module name per line.
<A NAME="lbAH">&nbsp;</A>
<H2>SPECIAL SYNTAX</H2>
<DL COMPACT>
<DT><B>@</B>
<DD>
This expands to mean all the comics currently in your 'Comics'
This expands to mean all the comics currently in your <B>Comics</B>
directory. All other specified comic module names will be ignored.
<DT><B>@@</B>
@ -179,12 +135,8 @@ This expands to mean all the comics available to Dosage.
</DL>
<P>
<B>INDEX SYNTAX</B>
can not be used with
<B>SPECIAL SYNTAX</B>
<A NAME="lbAI">&nbsp;</A>
<B>INDEX SYNTAX</B> can not be used with <B>SPECIAL SYNTAX</B>.
<A NAME="lbAH">&nbsp;</A>
<H2>EXAMPLES</H2>
Retrieve all Mega Tokyo comics:
@ -203,7 +155,7 @@ Retrieve the current comic of Cyanide and Happiness:
<P>
Retrieve the current strip of all comics in your 'Comics' directory:
Retrieve the current strip of all comics in your <B>Comics</B> directory:
<DL COMPACT><DT><DD>
<B>dosage @</B>
@ -232,16 +184,16 @@ the beginning until an existing file is found:
</DL>
<A NAME="lbAJ">&nbsp;</A>
<A NAME="lbAI">&nbsp;</A>
<H2>ENVIRONMENT</H2>
<DL COMPACT>
<DT>HTTP_PROXY<DD>
<B>mainline</B>
will use the specified HTTP proxy whenever possible.
will use the specified HTTP proxy when downloading URL contents.
</DL>
<A NAME="lbAK">&nbsp;</A>
<A NAME="lbAJ">&nbsp;</A>
<H2>NOTES</H2>
Should retrieval fail on any given strip
@ -258,7 +210,7 @@ At the time of writing, a
<B>complete</B>
Dosage collection weighs in at around 3.0GB.
<A NAME="lbAL">&nbsp;</A>
<A NAME="lbAK">&nbsp;</A>
<H2>RETURN VALUE</H2>
The return value is greater than zero when
@ -273,24 +225,13 @@ the program run was aborted with Ctrl-C
<P>
Else the return value is zero.
<A NAME="lbAM">&nbsp;</A>
<A NAME="lbAL">&nbsp;</A>
<H2>BUGS</H2>
See
<I><A HREF="http://trac.slipgate.za.net/dosage">http://trac.slipgate.za.net/dosage</A></I>
Users can report or view bugs, patches or feature suggestions at
<I><A HREF="https://github.com/wummel/dosage/issues">https://github.com/wummel/dosage/issues</A></I>
for a list of current development tasks and suggestions.
<P>
<A NAME="lbAN">&nbsp;</A>
<H2>FILES</H2>
<DL COMPACT>
<DT><B>/etc/dosage/disabled</B><DD>
Disables comic modules on a global scale.
<DT><B>~/.dosage/disabled</B><DD>
Disables comic modules on a local scale.
</DL>
<A NAME="lbAO">&nbsp;</A>
<A NAME="lbAM">&nbsp;</A>
<H2>AUTHORS</H2>
Jonathan Jacobs &lt;<A HREF="mailto:korpse@slipgate.za.net">korpse@slipgate.za.net</A>&gt;
@ -300,7 +241,7 @@ Tristan Seligmann &lt;<A HREF="mailto:mithrandi@slipgate.za.net">mithrandi@slipg
<BR>
Bastian Kleineidam &lt;<A HREF="mailto:calvin@users.sourceforge.net">calvin@users.sourceforge.net</A>&gt;
<A NAME="lbAP">&nbsp;</A>
<A NAME="lbAN">&nbsp;</A>
<H2>COPYRIGHT</H2>
Copyright &#169; 2004-2005 Tristan Seligmann and Jonathan Jacobs
@ -317,16 +258,14 @@ Copyright &#169; 2012 Bastian Kleineidam
<DT><A HREF="#lbAD">DESCRIPTION</A><DD>
<DT><A HREF="#lbAE">OPTIONS</A><DD>
<DT><A HREF="#lbAF">INDEX SYNTAX</A><DD>
<DT><A HREF="#lbAG">OFFENSIVE COMICS</A><DD>
<DT><A HREF="#lbAH">SPECIAL SYNTAX</A><DD>
<DT><A HREF="#lbAI">EXAMPLES</A><DD>
<DT><A HREF="#lbAJ">ENVIRONMENT</A><DD>
<DT><A HREF="#lbAK">NOTES</A><DD>
<DT><A HREF="#lbAL">RETURN VALUE</A><DD>
<DT><A HREF="#lbAM">BUGS</A><DD>
<DT><A HREF="#lbAN">FILES</A><DD>
<DT><A HREF="#lbAO">AUTHORS</A><DD>
<DT><A HREF="#lbAP">COPYRIGHT</A><DD>
<DT><A HREF="#lbAG">SPECIAL SYNTAX</A><DD>
<DT><A HREF="#lbAH">EXAMPLES</A><DD>
<DT><A HREF="#lbAI">ENVIRONMENT</A><DD>
<DT><A HREF="#lbAJ">NOTES</A><DD>
<DT><A HREF="#lbAK">RETURN VALUE</A><DD>
<DT><A HREF="#lbAL">BUGS</A><DD>
<DT><A HREF="#lbAM">AUTHORS</A><DD>
<DT><A HREF="#lbAN">COPYRIGHT</A><DD>
</DL>
<HR>
This document was created by

dosage

@ -17,6 +17,7 @@
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
from __future__ import print_function
import sys
import os
import optparse
@ -24,7 +25,7 @@ import optparse
from dosagelib import events, scraper
from dosagelib.output import out
from dosagelib.util import get_columns, internal_error
from dosagelib.configuration import App, Freeware, Copyright
from dosagelib.configuration import App, Freeware, Copyright, SupportUrl
def setupOptions():
"""Construct option parser.
@ -48,9 +49,10 @@ def setupOptions():
def displayVersion():
"""Display application name, version, copyright and license."""
print App
print Copyright
print Freeware
print(App)
print(Copyright)
print(Freeware)
print("For support see", SupportUrl)
return 0
@ -70,7 +72,7 @@ def saveComicStrip(strip, basepath):
filename, saved = image.save(basepath)
if saved:
allskipped = False
except IOError, msg:
except IOError as msg:
out.write('Error saving %s: %s' % (image.filename, msg))
errors += 1
return errors, allskipped
@ -123,7 +125,7 @@ def run(options, comics):
if options.modhelp:
return displayHelp(comics, options.basepath)
return getComics(options, comics)
except ValueError, msg:
except ValueError as msg:
out.write("Error: %s" % msg)
return 1
@ -143,7 +145,7 @@ def doList(columnList):
def doSingleList(scrapers):
"""Get list of scraper names, one per line."""
for num, scraperobj in enumerate(scrapers):
print scraperobj.get_name()
print(scraperobj.get_name())
return num
@ -155,7 +157,7 @@ def doColumnList(scrapers):
maxlen = max([len(name) for name in names])
namesPerLine = int(screenWidth / (maxlen + 1))
while names:
print ''.join([name.ljust(maxlen) for name in names[:namesPerLine]])
print(''.join([name.ljust(maxlen) for name in names[:namesPerLine]]))
del names[:namesPerLine]
return num
@ -192,7 +194,7 @@ def main():
options, args = parser.parse_args()
res = run(options, args)
except KeyboardInterrupt:
print "Aborted."
print("Aborted.")
res = 1
except Exception:
internal_error()
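
The edits to this script swap Python 2-only statements for forms that Python 2.7 and Python 3 both accept: `print(...)` as a function and `except ... as ...` for binding the exception. The following is a minimal sketch of the two idioms together, not code from dosage; `save_all`, `ok` and `failing` are hypothetical names used only for illustration.

```python
# On Python 2.7, `from __future__ import print_function` must be the first
# statement of the file for the multi-argument print() call below to behave
# as a function call. On Python 3 no import is needed.


def save_all(writers):
    """Call each writer and count the ones that fail with IOError.

    `writers` is a hypothetical list of zero-argument callables.
    """
    errors = 0
    for write in writers:
        try:
            write()
        except IOError as msg:  # accepted by Python 2.6+ and Python 3
            print("Error saving:", msg)
            errors += 1
    return errors


def ok():
    """Hypothetical writer that succeeds."""


def failing():
    """Hypothetical writer that always fails."""
    raise IOError("disk full")


print(save_all([ok, failing, ok]))
```

The older `except IOError, msg:` spelling is a syntax error on Python 3, which is why the `as` form is the portable choice.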


@ -53,12 +53,12 @@ class ComicImage(object):
"""Connect to host and get meta information."""
try:
self.urlobj = urlopen(self.url, referrer=self.referrer)
except urllib2.HTTPError, he:
raise FetchComicError, ('Unable to retrieve URL.', self.url, he.code)
except urllib2.HTTPError as he:
raise FetchComicError('Unable to retrieve URL.', self.url, he.code)
if self.urlobj.info().getmaintype() != 'image' and \
self.urlobj.info().gettype() not in ('application/octet-stream', 'application/x-shockwave-flash'):
raise FetchComicError, ('No suitable image found to retrieve.', self.url)
raise FetchComicError('No suitable image found to retrieve.', self.url)
# Always use mime type for file extension if it is sane.
if self.urlobj.info().getmaintype() == 'image':


@ -3,6 +3,7 @@
"""
File and path utilities.
"""
import importlib
def has_module (name):
"""Test if given module can be imported.
@ -10,7 +11,7 @@ def has_module (name):
@rtype: bool
"""
try:
exec "import %s as _bla" % name
importlib.import_module(name)
return True
except (OSError, ImportError):
# some modules (for example HTMLtidy) raise OSError
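
Replacing `exec "import ..."` with `importlib.import_module` removes string-built code and keeps the check valid on Python 3. The hunk is truncated after the `except` clause, so the sketch below fills in the fallback with `return False`; that ending is an assumption about the elided lines.

```python
import importlib


def has_module(name):
    """Return True if the named module can be imported, else False."""
    try:
        importlib.import_module(name)
        return True
    except (OSError, ImportError):
        # Assumed fallback: the diff does not show what follows the except.
        return False


print(has_module("json"))
print(has_module("surely_not_an_installed_module_name"))
```

Note that `import_module` really imports the module, so any import-time side effects of the probed module still run.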


@ -18,12 +18,12 @@ def get_modules(folder='plugins'):
try:
name ="..%s.%s" % (folder, modname)
yield importlib.import_module(name, __name__)
except StandardError, msg:
except ImportError as msg:
print "ERROR: could not load module %s: %s" % (modname, msg)
def get_importable_modules(folder):
"""Find all module files in the given folder that end witn '.py' and
"""Find all module files in the given folder that end with '.py' and
don't start with an underscore.
@return module names
@rtype: iterator of string


@ -1,6 +1,7 @@
# -*- coding: iso-8859-1 -*-
# Copyright (C) 2004-2005 Tristan Seligmann and Jonathan Jacobs
# Copyright (C) 2012 Bastian Kleineidam
from __future__ import print_function
import time
class Output(object):
@ -20,7 +21,7 @@ class Output(object):
timestamp = time.strftime('%H:%M:%S ')
else:
timestamp = ''
print '%s%s> %s' % (timestamp, self.context, s)
print('%s%s> %s' % (timestamp, self.context, s))
def writelines(self, lines, level=0):
"""Write multiple messages."""


@ -3,12 +3,12 @@
from re import compile, MULTILINE
from ..util import tagre
from ..scraper import _BasicScraper
from ..helpers import regexNamer, bounceStarter, indirectStarter
from ..helpers import regexNamer, bounceStarter
class ALessonIsLearned(_BasicScraper):
latestUrl = 'http://www.alessonislearned.com/'
stripUrl = 'http://www.alessonislearned.com/lesson%s.html'
stripUrl = latestUrl + 'index.php?comic=%s'
imageSearch = compile(tagre("img", "src", r"(cmx/lesson\d+\.[a-z]+)"))
prevSearch = compile(tagre("a", "href", r"(index\.php\?comic=\d+)", quote="'")+r"[^>]+previous")
help = 'Index format: nnn'
@ -16,7 +16,7 @@ class ALessonIsLearned(_BasicScraper):
class ASofterWorld(_BasicScraper):
latestUrl = 'http://www.asofterworld.com/'
stripUrl = 'http://www.asofterworld.com/index.php?id=%s'
stripUrl = latestUrl + 'index.php?id=%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.asofterworld\.com/clean/[^"]+)'))
prevSearch = compile(tagre("a", "href", "(index\.php\?id=\d+)")+'< back')
help = 'Index format: n (unpadded)'
@ -24,15 +24,15 @@ class ASofterWorld(_BasicScraper):
class AbleAndBaker(_BasicScraper):
latestUrl = 'http://www.jimburgessdesign.com/comics/index.php'
stripUrl = 'http://www.jimburgessdesign.com/comics/index.php?comic=%s'
stripUrl = latestUrl + '?comic=%s'
imageSearch = compile(tagre('img', 'src', r'(comics/.+)'))
prevSearch = compile(tagre('a', 'href', r'(.+\d+)') + '.+?previous.gif')
help = 'Index format: nnn'
class AbominableCharlesChristopher(_BasicScraper):
latestUrl = 'http://abominable.cc/'
stripUrl = 'http://abominable.cc/%s'
latestUrl = 'http://www.abominable.cc/'
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.abominable\.cc/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'([^"]+)')+"[^<]+Previous")
help = 'Index format: yyyy/mm/dd/comicname'
@ -49,7 +49,7 @@ class AbsurdNotions(_BasicScraper):
class AbstruseGoose(_BasicScraper):
starter = bounceStarter('http://abstrusegoose.com/',
compile(tagre('a', 'href', r'(http://abstrusegoose\.com/\d+)')+"Next &raquo;</a>"))
stripUrl = 'http://abstrusegoose.com/c%s.html'
stripUrl = 'http://abstrusegoose.com/%s'
imageSearch = compile(tagre('img', 'src', r'(http://abstrusegoose\.com/strips/[^<>"]+)'))
prevSearch = compile(tagre('a', 'href', r'(http://abstrusegoose\.com/\d+)') + r'&laquo; Previous</a>')
help = 'Index format: n (unpadded)'
@ -62,57 +62,37 @@ class AbstruseGoose(_BasicScraper):
class AcademyVale(_BasicScraper):
latestUrl = 'http://imagerie.com/vale/'
stripUrl = 'http://imagerie.com/vale/avarch.cgi?%s'
latestUrl = 'http://www.imagerie.com/vale/'
stripUrl = latestUrl + 'avarch.cgi?%s'
imageSearch = compile(tagre('img', 'src', r'(avale\d{4}-\d{2}\.gif)'))
prevSearch = compile(tagre('a', 'href', r'(avarch[^"]+)') + tagre('img', 'src', 'AVNavBack\.gif'))
prevSearch = compile(tagre('a', 'href', r'(avarch[^">]+)', quote="") + tagre('img', 'src', 'AVNavBack\.gif'))
help = 'Index format: nnn'
class Alice(_BasicScraper):
latestUrl = 'http://alice.alicecomics.com/'
stripUrl = 'http://alice.alicecomics.com/wp-content/webcomic/alicecomics/%s.jpg'
stripUrl = latestUrl + '%s/'
imageSearch = compile(tagre("img", "src", r'(http://alice\.alicecomics\.com/wp-content/webcomic/alicecomics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://alice.alicecomics.com/archive/[^"]+)', after="previous"))
help = 'Index format: yyyy-mm-dd'
prevSearch = compile(tagre("a", "href", r'(http://alice\.alicecomics\.com/archive/[^"]+)', after="previous"))
help = 'Index format: name'
class AlienLovesPredator(_BasicScraper):
stripUrl = 'http://alienlovespredator.com/%s'
imageSearch = compile(r'<img src="(.+?)"[^>]+>(<center>\n|\n|</center>\n)<div style="height: 2px;">&nbsp;</div>', MULTILINE)
prevSearch = compile(r'<a href="(.+?)"><img src="/images/nav_previous.jpg"')
help = 'Index format: nnn'
starter = bounceStarter('http://alienlovespredator.com/index.php', compile(r'<a href="(.+?)"><img src="/images/nav_next.jpg"'))
@classmethod
def namer(cls, imageUrl, pageUrl):
vol = pageUrl.split('/')[-5]
num = pageUrl.split('/')[-4]
ccc = pageUrl.split('/')[-3]
ddd = pageUrl.split('/')[-2]
return '%s-%s-%s-%s' % (vol, num, ccc, ddd)
class AnarchySD(_BasicScraper):
stripUrl = 'http://www.anarchycomic.com/page%s.php'
imageSearch = compile(tagre('img', 'src', r'../(images/page\d+\..+?)'))
prevSearch = compile(tagre('a', 'href', r'(page\d+\.php)')+'PREVIOUS PAGE')
help = 'Index format: n (unpadded)'
starter = indirectStarter(
'http://www.anarchycomic.com/page1.php',
compile(r'<a href="(page\d+\.php)" class="style15">LATEST'))
latestUrl = 'http://alienlovespredator.com/'
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r'(http://alienlovespredator\.com/strips/strip_\d\.jpg)'))
prevSearch = compile(tagre("a", "href", r'([^"]+)', after="prev"))
help = 'Index format: yyyy/mm/dd/name/'
class Altermeta(_BasicScraper):
latestUrl = 'http://altermeta.net/'
stripUrl = 'http://altermeta.net/archive.php?comic=%s&view=showfiller'
stripUrl = latestUrl + 'archive.php?comic=%s'
imageSearch = compile(r'<img src="(comics/[^"]+)" />')
prevSearch = compile(r'<a href="([^"]+)"><img src="http://altermeta\.net/template/default/images/sasha/back\.png')
help = 'Index format: n (unpadded)'
class AltermetaOld(Altermeta):
name = 'Altermeta/Old'
latestUrl = 'http://altermeta.net/oldarchive/index.php'
@ -120,19 +100,17 @@ class AltermetaOld(Altermeta):
prevSearch = compile(r'<a href="([^"]+)">Back')
class Angels2200(_BasicScraper):
latestUrl = 'http://www.janahoffmann.com/angels/'
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r"(http://www\.janahoffmann\.com/angels/comics/[^']+)"))
imageSearch = compile(tagre("img", "src", r"(http://www\.janahoffmann\.com/angels/comics/[^'\"]+)"))
prevSearch = compile(tagre("a", "href", r'([^"]+)')+"&laquo; Previous")
help = 'Index format: yyyy/mm/dd/part-<n>-comic-<n>'
class AppleGeeks(_BasicScraper):
latestUrl = 'http://www.applegeeks.com/'
stripUrl = 'http://www.applegeeks.com/comics/viewcomic.php?issue=%s'
stripUrl = latestUrl + 'comics/viewcomic.php?issue=%s'
imageSearch = compile(tagre("img", "src", r'"(strips/\d+?\..+?)"'))
prevSearch = compile(r'<div class="caption">Previous Comic</div>\s*<p><a href="([^"]+)">', MULTILINE)
help = 'Index format: n (unpadded)'
@ -140,14 +118,13 @@ class AppleGeeks(_BasicScraper):
class Achewood(_BasicScraper):
latestUrl = 'http://www.achewood.com/'
stripUrl = 'http://www.achewood.com/index.php?date=%s'
stripUrl = latestUrl + 'index.php?date=%s'
imageSearch = compile(tagre("img", "src", r'(/comic\.php\?date=\d+)'))
prevSearch = compile(tagre("a", "href", r'(index\.php\?date=\d+)', after="Previous"))
help = 'Index format: mmddyyyy'
namer = regexNamer(compile(r'date%3D(\d{8})'))
class AstronomyPOTD(_BasicScraper):
starter = bounceStarter(
'http://antwrp.gsfc.nasa.gov/apod/astropix.html',
@ -163,7 +140,6 @@ class AstronomyPOTD(_BasicScraper):
imageUrl.split('/')[-1].split('.')[0])
class AfterStrife(_BasicScraper):
latestUrl = 'http://afterstrife.com/?p=262'
stripUrl = 'http://afterstrife.com/?p=%s'
@ -172,29 +148,26 @@ class AfterStrife(_BasicScraper):
help = 'Index format: nnn'
class ALLCAPS(_BasicScraper):
latestUrl = 'http://www.allcapscomix.com/'
stripUrl = 'http://www.allcapscomix.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.allcapscomix\.com/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'([^"]+)')+r"[^<]+Previous</a>")
help = 'Index format: yyyy/mm/strip-name'
class ASkeweredParadise(_BasicScraper):
latestUrl = 'http://aspcomics.net/'
stripUrl = 'http://aspcomics.net/archindex.php?strip_id=%s'
stripUrl = latestUrl + 'comic/%s'
imageSearch = compile(tagre("img", "src", r'(http://aspcomics\.net/sites/default/files[^"]*/asp\d+\.jpg)[^"]+'))
prevSearch = compile(tagre("a", "href", "(/comic/\d+)")+r"[^>]+Previous")
help = 'Index format: nnn'
class AGirlAndHerFed(_BasicScraper):
starter = bounceStarter('http://www.agirlandherfed.com/',
compile(r'<a href="([^"]+)">[^>]+Back'))
stripUrl = 'http://www.agirlandherfed.com/img/strip/%s'
stripUrl = 'http://www.agirlandherfed.com/1.%s.html'
imageSearch = compile(tagre("img", "src", r'(img/strip/[^"]+\.jpg)'))
prevSearch = compile(r'<a href="([^"]+)">[^>]+Back')
help = 'Index format: nnn'
@ -204,88 +177,70 @@ class AGirlAndHerFed(_BasicScraper):
return pageUrl.split('?')[-1]
class AetheriaEpics(_BasicScraper):
latestUrl = 'http://aetheria-epics.schala.net/'
stripUrl = 'http://aetheria-epics.schala.net/%s.html'
imageSearch = compile(r'<td><img src="(\d{5}.\w{3,4})"')
prevSearch = compile(r'<a href="(\d{5}.html)"><img src="prev.jpg"\/>')
stripUrl = latestUrl + '%s.html'
imageSearch = compile(tagre("img", "src", r'(\d{5}\.jpg)'))
prevSearch = compile(tagre("a", "href", r'(\d{5}\.html)') + "Previous")
help = 'Index format: nnn'
class Adrift(_BasicScraper):
latestUrl = 'http://www.adriftcomic.com/'
stripUrl = 'http://www.adriftcomic.com/page%s.html'
imageSearch = compile(r'<IMG SRC="(Adrift_Web_Page\d+.jpg)"')
prevSearch = compile(r'<A HREF="(.+?)"><IMG SRC="AdriftBackLink.gif"')
help = 'Index format: nnn'
class AirForceBlues(_BasicScraper):
latestUrl = 'http://www.afblues.com/'
stripUrl = 'http://www.afblues.com/?p=%s'
imageSearch = compile(r'<img src=\'(http://www.afblues.com/comics/.+?)\'>')
prevSearch = compile(r'<a href="(http://www.afblues.com/.+?)">&laquo; Previous')
help = 'Index format: nnn'
stripUrl = latestUrl + 'wordpress/%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.afblues\.com/wordpress/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'([^"]+)', after='Previous'))
help = 'Index format: yyyy/mm/dd/name/'
class AlienShores(_BasicScraper):
latestUrl = 'http://alienshores.com/alienshores_band/'
stripUrl = 'http://alienshores.com/alienshores_band/?p=%s'
imageSearch = compile(r'><img src="(http://alienshores.com/alienshores_band/comics/.+?)"')
prevSearch = compile(r'<a href="(http://alienshores.com/.+?)" rel="prev">')
help = 'Index format: nnn'
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r'(http://alienshores\.com/alienshores_band/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://alienshores\.com/[^"]+)', after="prev"))
help = 'Index format: yyyy/mm/dd/p<nn>/'
class AllTheGrowingThings(_BasicScraper):
latestUrl = 'http://typodmary.com/growingthings/'
stripUrl = 'http://typodmary.com/growingthings/%s/'
imageSearch = compile(r'<img src="(http://typodmary.com/growingthings/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(http://typodmary.com/growingthings/.+?)"')
latestUrl = 'http://growingthings.typodmary.com/'
stripUrl = latestUrl + '%s/'
imageSearch = compile(tagre("img", "src", r'(http://growingthings\.typodmary\.com/files/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://growingthings\.typodmary\.com/[^"]+)', after="prev"))
help = 'Index format: yyyy/mm/dd/strip-name'
class Amya(_BasicScraper):
latestUrl = 'http://www.amyachronicles.com/'
stripUrl = 'http://www.amyachronicles.com/archives/%s'
stripUrl = latestUrl + 'archives/%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.amyachronicles\.com/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://www\.amyachronicles\.com/archives/\d+)', after="Previous"))
help = 'Index format: n'
class Angband(_BasicScraper):
latestUrl = 'http://angband.calamarain.net/'
stripUrl = 'http://angband.calamarain.net/view.php?date=%s'
stripUrl = latestUrl + 'view.php?date=%s'
imageSearch = compile(tagre("img", "src", r'(comics/Scroll[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(view\.php\?date\=[^"]+)')+"Previous")
help = 'Index format: yyyy-mm-dd'
class ActionAthena(_BasicScraper):
latestUrl = 'http://actionathena.com/'
stripUrl = 'http://actionathena.com/2%s'
stripUrl = latestUrl + '2%s'
imageSearch = compile(r'<img src=\'(http://actionathena.com/comics/.+?)\'>')
prevSearch = compile(r'<a href="(http://actionathena.com/.+?)">&laquo; Previous</a>')
help = 'Index format: yyyy/mm/dd/strip-name'
class AlsoBagels(_BasicScraper):
latestUrl = 'http://www.alsobagels.com/'
stripUrl = 'http://alsobagels.com/index.php/comic/%s/'
imageSearch = compile(r'<img src="(http://alsobagels.com/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(http://alsobagels.com/index.php/comic/.+?)">')
latestUrl = 'http://alsobagels.com/'
stripUrl = latestUrl + 'index.php/comic/%s/'
imageSearch = compile(tagre("img", "src", r'(http://alsobagels\.com/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://alsobagels\.com/index\.php/comic/[^"]+)', after="Previous"))
help = 'Index format: strip-name'
class Annyseed(_BasicScraper):
latestUrl = 'http://www.colourofivy.com/annyseed_webcomic_latest.htm'
stripUrl = 'http://www.colourofivy.com/annyseed_webcomic%s.htm'


@ -8,57 +8,47 @@ from ..scraper import _BasicScraper
class BadlyDrawnKitties(_BasicScraper):
latestUrl = 'http://www.badlydrawnkitties.com/'
stripUrl = 'http://www.badlydrawnkitties.com/new/%s.html'
imageSearch = compile(r'<img src="(/new/.+?)">')
stripUrl = latestUrl + '%s.html'
imageSearch = compile(tagre("img", "src", r'(/new/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(/[^"]+)') + tagre("img", "src", r'/images/previous\.gif'))
help = 'Index format: n (unpadded)'
help = 'Index format: n/nn (unpadded)'
class Bardsworth(_BasicScraper):
latestUrl = 'http://www.bardsworth.com/'
stripUrl = 'http://www.bardsworth.com/archive.php?p=s%'
imageSearch = compile(r'(strips/.+?)"')
prevSearch = compile(r'"(http.+?)".+?/prev')
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.bardsworth\.com/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://www\.bardsworth\.com/[^"]+)', after="prev"))
help = 'Index format: nnn'
class BetterDays(_BasicScraper):
latestUrl = 'http://www.jaynaylor.com/betterdays/'
stripUrl = 'http://www.jaynaylor.com/betterdays/archives/%s'
imageSearch = compile(r'<img src=(/betterdays/comic/.+?)>')
prevSearch = compile(r'<a href="(.+)">&laquo; Previous')
help = 'Index format: yyyy/mm/<your guess>.html'
class BetterYouThanMe(_BasicScraper):
latestUrl = 'http://betteryouthanme.net/'
stripUrl = 'http://betteryouthanme.net/archive.php?date=%s.gif'
imageSearch = compile(r'"(comics/.+?)"')
prevSearch = compile(r'"(archive.php\?date=.+?)">.+?previous')
help = 'Index format: yyyymmdd'
latestUrl = 'http://jaynaylor.com/betterdays/'
stripUrl = latestUrl + 'archives/%s.html'
imageSearch = compile(tagre("img", "src", r'(/betterdays/comic/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'([^"]+)') + '&laquo; Previous')
help = 'Index format: yyyy/mm/<your guess>'
class BiggerThanCheeses(_BasicScraper):
latestUrl = 'http://www.biggercheese.com'
stripUrl = 'http://www.biggercheese.com/index.php?comic=%s'
latestUrl = 'http://www.biggercheese.com/'
stripUrl = latestUrl + 'index.php?comic=%s'
imageSearch = compile(r'src="(comics/.+?)" alt')
prevSearch = compile(r'"(index.php\?comic=.+?)".+?_back')
help = 'Index format: n (unpadded)'
class BizarreUprising(_BasicScraper):
latestUrl = 'http://www.bizarreuprising.com/'
stripUrl = 'http://www.bizarreuprising.com/view/%s'
imageSearch = compile(r'<img src="(comic/[^"]+)"')
prevSearch = compile(r'<a href="(view/\d+/[^"]+)"><img src="images/b_prev\.gif"')
stripUrl = latestUrl + 'view/%s'
imageSearch = compile(tagre("img", "src", r'(comic/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(view/\d+/[^"]+)') + tagre("img", "src", r'images/b_prev\.gif'))
help = 'Index format: n/name'
class Blip(_BasicScraper):
latestUrl = 'http://blipcomic.com/'
stripUrl = 'http://blipcomic.com/index.php?strip_id=%s'
stripUrl = latestUrl + 'index.php?strip_id=%s'
imageSearch = compile(r'(istrip_files/strips/.+?)"')
prevSearch = compile(r'First.+?"(index.php\?strip_id=.+?)".+?prev')
help = 'Index format: n'
@ -66,7 +56,7 @@ class Blip(_BasicScraper):
class BlueCrashKit(_BasicScraper):
latestUrl = 'http://www.bluecrashkit.com/cheese/'
stripUrl = 'http://www.bluecrashkit.com/cheese/node/%s'
stripUrl = latestUrl + 'node/%s'
imageSearch = compile(r'(/cheese/files/comics/.+?)"')
prevSearch = compile(r'(/cheese/node/.+?)".+?previous')
help = 'Index format: non'
@ -74,7 +64,7 @@ class BlueCrashKit(_BasicScraper):
class BMovieComic(_BasicScraper):
latestUrl = 'http://www.bmoviecomic.com/'
stripUrl = 'http://www.bmoviecomic.com/?cid=%s'
stripUrl = latestUrl + '?cid=%s'
imageSearch = compile(r'"(comics/.+?)"')
prevSearch = compile(r'(\?cid=.+?)".+?Prev')
help = 'Index format: n'
@ -86,7 +76,7 @@ class BMovieComic(_BasicScraper):
### to get earlier comics
class BratHalla(_BasicScraper):
latestUrl = 'http://brat-halla.com/'
stripUrl = 'http://brat-halla.com/comic/%s'
stripUrl = latestUrl + 'comic/%s'
imageSearch = compile(r"(/comics/.+?)' target='_blank")
prevSearch = compile(r'headernav2".+?"(http.+?)"')
help = 'Index format: non'
@ -94,172 +84,108 @@ class BratHalla(_BasicScraper):
class Brink(_BasicScraper):
latestUrl = 'http://paperfangs.com/brink/'
stripUrl = 'http://paperfangs.com/brink/?p=%s'
imageSearch = compile(r'/(comics/.+?)"')
prevSearch = compile(r'previous.+?/brink/(.+?)".+?Previous')
help = 'Index format: non'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(tagre("img", "src", r'(http://paperfangs\.com/brink/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://paperfangs\.com/brink/[^"]+)', after="prev"))
help = 'Index format: n'
class BoredAndEvil(_BasicScraper):
latestUrl = 'http://www.boredandevil.com/'
stripUrl = 'http://www.boredandevil.com/archive.php?date=%s'
imageSearch = compile(r'<img src="(strips/.+?)"')
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(tagre("img", "src", r'(strips/[^"]+)'))
prevSearch = compile(r'First Comic.+<a href="(.+?)".+previous-on.gif')
help = 'Index format: yyyy-mm-dd'
class BoyOnAStickAndSlither(_BasicScraper):
latestUrl = 'http://www.boasas.com/'
stripUrl = 'http://www.boasas.com/?c=%s'
imageSearch = compile(r'"(boasas/\d+\..+?)"')
prevSearch = compile(r'<a href="(.+?)"><img src="images/left_20.png"')
stripUrl = latestUrl + 'page/%s'
imageSearch = compile(tagre("img", "src", r'(http://25\.media\.tumblr\.com/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(/page/\d+)') + "<span>Next page")
help = 'Index format: n (unpadded)'
class ButternutSquash(_BasicScraper):
latestUrl = 'http://www.butternutsquash.net/'
stripUrl = 'http://www.butternutsquash.net/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.butternutsquash\.net/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://www\.butternutsquash\.net/[^"]+)', after="prev"))
help = 'Index format: yyyy/mm/dd/strip-name-author-name'
def blankLabel(name, baseUrl):
return type('BlankLabel_%s' % name,
(_BasicScraper,),
dict(
name='BlankLabel/' + name,
latestUrl=baseUrl,
stripUrl=baseUrl+'d/%s.html',
imageSearch=compile(tagre("img", "src", r'(/comic[s|/][^"]+)')),
prevSearch=compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif"),
#prevSearch=compile(r'(?:"([^"]*(?:/d/[^"\r\n]*)|(?:/strip/.+?))")(?:(?:.{43}starshift_back.gif)|(?:.+?cxn_previous)|(?:.{43}previous)|(?:[^<>]*>[^<>]*<[^<>]*previous)|(?:.*?back_button)|(?:.*?comicnav-previous))'),
help='Index format: yyyymmdd')
)
checkerboard = blankLabel('CheckerboardNightmare', 'http://www.checkerboardnightmare.com/')
courtingDisaster = blankLabel('CourtingDisaster', 'http://www.courting-disaster.com/')
evilInc = blankLabel('EvilInc', 'http://www.evil-comic.com/')
greystoneInn = blankLabel('GreystoneInn', 'http://www.greystoneinn.net/')
itsWalky = blankLabel('ItsWalky', 'http://www.itswalky.com/')
# one strip name starts with %20
#krazyLarry = blankLabel('KrazyLarry', 'http://www.krazylarry.com/')
melonpool = blankLabel('Melonpool', 'http://www.melonpool.com/')
# strip names = index.php
#realLife = blankLabel('RealLife', 'http://www.reallifecomics.com/')
schlockMercenary = blankLabel('SchlockMercenary', 'http://www.schlockmercenary.com/')
# hosted on ComicsDotCom
#sheldon = blankLabel('Sheldon', 'http://www.sheldoncomics.com/')
shortpacked = blankLabel('Shortpacked', 'http://www.shortpacked.com/')
starslipCrisis = blankLabel('StarslipCrisis', 'http://www.starslipcrisis.com/')
uglyHill = blankLabel('UglyHill', 'http://www.uglyhill.com/')
class BeePower(_BasicScraper):
latestUrl = 'http://comicswithoutviolence.com/d/20080713.html'
stripUrl = 'http://comicswithoutviolence.com/d/%s.html'
imageSearch = compile(r'src="(/comics/.+?)"')
prevSearch = compile(r'(\d+\.html)"><img[^>]+?src="/images/previous_day.png"')
help = 'Index format: yyyy/mm/dd'
class BlankIt(_BasicScraper):
latestUrl = 'http://blankitcomics.com/'
stripUrl = 'http://blankitcomics.com/%s'
imageSearch = compile(r'<img src="(http://blankitcomics.com/bicomics/.+?)"')
prevSearch = compile(r'<a href="([^"]+)" rel="prev">')
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r'(http://blankitcomics\.com/bicomics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'([^"]+)', after='rel="prev"'))
help = 'Index format: yyyy/mm/dd/name'
class BobWhite(_BasicScraper):
latestUrl = 'http://www.bobwhitecomics.com/'
stripUrl = 'http://www.bobwhitecomics.com/?webcomic_post=%s'
stripUrl = latestUrl + '?webcomic_post=%s'
imageSearch = compile(tagre("img", "src", r"(http://www\.bobwhitecomics\.com/wp/wp-content/webcomic/untitled/\d+\.jpg)"))
prevSearch = compile(tagre("a", "href", r"(http://www\.bobwhitecomics\.com/\?webcomic_post=\d+)")+r'[^"]+Previous')
help = 'Index format: yyyymmdd'
class BigFatWhale(_BasicScraper):
latestUrl = 'http://www.bigfatwhale.com/'
stripUrl = 'http://www.bigfatwhale.com/archives/bfw_%s.htm'
imageSearch = compile(r'<img src="(archives/bfw_.+?|bfw_.+?)"')
stripUrl = latestUrl + 'archives/bfw_%s.htm'
imageSearch = compile(tagre("img", "src", r'(archives/bfw_[^"]+|bfw_[^"]+)'))
prevSearch = compile(r' HREF="(.+?)" TARGET="_top" TITLE="Previous Cartoon"')
help = 'Index format: nnn'
class BadassMuthas(_BasicScraper):
latestUrl = 'http://badassmuthas.com/pages/comic.php'
stripUrl = 'http://badassmuthas.com/pages/comic.php?%s'
imageSearch = compile(r'<img src="(/images/comicsissue.+?)"')
prevSearch = compile(r'<a href="(.+?)"><img src="/images/comicsbuttonBack.gif" ')
stripUrl = latestUrl + '?%s'
imageSearch = compile(tagre("img", "src", r'(/images/comicsissue[^"]+)'))
prevSearch = compile(tagre("a", "href", r'([^"]+)') + tagre("img", "src", r'/images/comicsbuttonBack\.gif'))
help = 'Index format: nnn'
class Boozeathon4Billion(_BasicScraper):
latestUrl = 'http://boozeathon4billion.com/'
stripUrl = 'http://boozeathon4billion.com/comics/%s'
imageSearch = compile(r'<img src="(http://boozeathon4billion.com/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)"[^>]+?>Previous</a>')
help = 'Index format: (sometimes chapternumber/)-yyyy-mm-dd/stripname'
class BrightlyWound(_BasicScraper):
latestUrl = 'http://www.brightlywound.com/'
stripUrl = 'http://www.brightlywound.com/?comic=%s'
imageSearch = compile(r'<img src=\'(comic/.+?)\'')
stripUrl = latestUrl + '?comic=%s'
imageSearch = compile(tagre("img", "src", r"(comic/[^']+)"))
prevSearch = compile(r'<div id=\'navback\'><a href=\'(\?comic\=\d+)\'><img src=\'images/previous.png\'')
help = 'Index format: nnn'
class BlueCrashKit(_BasicScraper):
latestUrl = 'http://robhamm.com/bluecrashkit'
stripUrl = 'http://robhamm.com/comics/blue-crash-kit/%s'
imageSearch = compile(r'src="(http://robhamm.com/sites/default/files/comics/.+?)"')
prevSearch = compile(r'<li class="previous"><a href="(.+?)">')
latestUrl = 'http://robhamm.com/bluecrashkit/'
stripUrl = latestUrl + 'comics/blue-crash-kit/%s'
imageSearch = compile(tagre("img", "src", r'(http://robhamm\.com/bluecrashkit/sites/default/files/comics/[^"]+)'))
prevSearch = compile(r'<li class="previous"><a href="([^"]+)">')
help = 'Index format: yyyy-mm-dd'
class BloodBound(_BasicScraper):
latestUrl = 'http://www.bloodboundcomic.com/'
stripUrl = 'http://www.bloodboundcomic.com/d/%s.html'
imageSearch = compile(r' src="(/comics/.+?)"')
prevSearch = compile(r' <a href="(/d/.+?)"><img[^>]+?src="/images/previous_day.jpg"')
help = 'Index format: yyyymmdd'
latestUrl = 'http://bloodboundcomic.com/'
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r'(http://bloodboundcomic\.com/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://bloodboundcomic\.com/[^"]+)', after="prev"))
help = 'Index format: yyyy/mm/name'
class BookOfBiff(_BasicScraper):
latestUrl = 'http://www.thebookofbiff.com/'
stripUrl = 'http://www.thebookofbiff.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r'([^"]+/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'([^"]+)', after="Previous"))
help = 'Index format: yyyy/mm/dd/stripnum-strip-name'
class BillyTheDunce(_BasicScraper):
latestUrl = 'http://www.duncepress.com/'
stripUrl = 'http://www.duncepress.com/%s/'
imageSearch = compile(r'<img src="(http://www.duncepress.com/comics/.+?)"')
stripUrl = latestUrl + '%s/'
imageSearch = compile(tagre("img", "src", r'(http://www\.duncepress\.com/comics/[^"]+)'))
prevSearch = compile(r'<div class="nav-previous"><a href="(http://www.duncepress.com/[^"]+)" rel="prev">')
help = 'Index format: yyyy/mm/strip-name'
class BackwaterPlanet(_BasicScraper):
latestUrl = 'http://www.backwaterplanet.com/current.htm'
stripUrl = 'http://www.backwaterplanet.com/archive/bwp%s.htm'
@ -268,28 +194,26 @@ class BackwaterPlanet(_BasicScraper):
help = 'Index format: yymmdd'
class Baroquen(_BasicScraper):
latestUrl = 'http://www.baroquencomics.com/'
stripUrl = 'http://www.baroquencomics.com/2010/01/04/the-man-from-omi/'
imageSearch = compile(r'<img src="(http://www.baroquencomics.com/Comics/.+?)"')
prevSearch = compile(r'<a href="(http://www.baroquencomics.com/.+?)" rel="prev">')
stripUrl = latestUrl + '%s/'
imageSearch = compile(tagre("img", "src", r'(http://www\.baroquencomics\.com/Comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://www\.baroquencomics\.com/[^"]+)', after='prev'))
help = 'Index format: yyyy/mm/dd/strip-name'
class BetweenFailures(_BasicScraper):
latestUrl = 'http://betweenfailures.com/'
stripUrl = 'http://betweenfailures.com/%s'
imageSearch = compile(r'<img src=\'(http://betweenfailures.com/comics/.+?)\'>')
prevSearch = compile(r'<a href="(http://betweenfailures.com/.+?)">&laquo; Previous</a>')
help = 'Index format: yyyy/mm/dd/stripnum-strip-name'
stripUrl = latestUrl + 'archives/archive/%s'
imageSearch = compile(tagre("img", "src", r'(http://betweenfailures\.com/wp-content/webcomic/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://betweenfailures\.com/archives/archive/[^"]+)', after="previous"))
help = 'Index format: stripnum-strip-name'
class BillyTheBeaker(_BasicScraper):
latestUrl = 'http://billy.defectivejunk.com/'
stripUrl = 'http://billy.defectivejunk.com/index.php?strip=%s'
imageSearch = compile(r'<img src="(bub\d+_\d+.+?)"')
prevSearch = compile(r' <a href="(index.php\?strip\=.+?)" title="Previous strip">')
stripUrl = latestUrl + 'index.php?strip=%s'
imageSearch = compile(tagre("img", "src", r'(bub\d+_\d+[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(index\.php\?strip\=[^"]+)', after="Previous strip"))
help = 'Index format: nnn'


@ -23,7 +23,7 @@ class CalvinAndHobbes(_BasicScraper):
class CandyCartoon(_BasicScraper):
latestUrl = 'http://www.candycartoon.com/'
stripUrl = 'http://www.candycartoon.com/archives/%s.html'
stripUrl = latestUrl + 'archives/%s.html'
imageSearch = compile(r'<img alt="[^"]*" src="(http://www\.candycartoon\.com/archives/[^"]+)"')
prevSearch = compile(r'<a href="(http://www\.candycartoon\.com/archives/\d{6}\.html)">prev')
help = 'Index format: nnnnnn'
@ -32,7 +32,7 @@ class CandyCartoon(_BasicScraper):
class CaptainSNES(_BasicScraper):
latestUrl = 'http://captainsnes.com/'
stripUrl = 'http://captainsnes.com/?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r'<img src=\'(http://www.captainsnes.com/comics/.+?)\'')
prevSearch = compile(r'<a href="http://www.captainsnes.com/(.+?)"><span class="prev">')
help = 'Index format: yyyymmdd'
@ -41,7 +41,7 @@ class CaptainSNES(_BasicScraper):
class CaribbeanBlue(_BasicScraper):
latestUrl = 'http://cblue.katbox.net/'
stripUrl = 'http://cblue.katbox.net/index.php?strip_id=%s'
stripUrl = latestUrl + 'index.php?strip_id=%s'
imageSearch = compile(r'="(.+?strips/.+?)"')
prevSearch = compile(r'<a href="(.+?)"><img src="images/navigation_back.png"')
help = 'Index format: n (unpadded)'
@ -58,7 +58,7 @@ class Catena(_BasicScraper):
class Catharsis(_BasicScraper):
latestUrl = 'http://catharsiscomic.com/'
stripUrl = 'http://catharsiscomic.com/archive.php?strip=%s'
stripUrl = latestUrl + 'archive.php?strip=%s'
imageSearch = compile(r'<img src="(strips/.+?)"')
prevSearch = compile(r'<a href="(.+?)".+"Previous')
help = 'Index format: yymmdd-<your guess>.html'
@ -67,16 +67,23 @@ class Catharsis(_BasicScraper):
class ChasingTheSunset(_BasicScraper):
latestUrl = 'http://www.fantasycomic.com/'
stripUrl = 'http://www.fantasycomic.com/index.php?p=c%s'
stripUrl = latestUrl + 'index.php?p=c%s'
imageSearch = compile(r'(/cmsimg/.+?)".+?comic-img')
prevSearch = compile(r'<a href="(.+?)" title="" ><img src="(images/eye-prev.png|images/cn-prev.png)"')
help = 'Index format: n'
class CheckerboardNightmare(_BasicScraper):
latestUrl = 'http://www.checkerboardnightmare.com/'
stripUrl = latestUrl + 'd/%s.shtml'
imageSearch = compile(tagre("img", "src", r'(/comic[s/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)') + r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'
class Chisuji(_BasicScraper):
latestUrl = 'http://www.chisuji.com/'
stripUrl = 'http://www.chisuji.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'<img src="(http://www.chisuji.com/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(http://www.chisuji.com/.+?)">')
help = 'Index format: yyyy/mm/dd/strip-name'
@ -85,7 +92,7 @@ class Chisuji(_BasicScraper):
class ChugworthAcademy(_BasicScraper):
latestUrl = 'http://chugworth.com/'
stripUrl = 'http://chugworth.com/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'<img src="(.+?)" alt="Comic')
prevSearch = compile(r'<a href="(http://chugworth.com/\?p=\d{1,4})"[^>]+?title="Previous">')
help = 'Index format: n (unpadded)'
@ -103,7 +110,7 @@ class ChugworthAcademyArchive(_BasicScraper):
class CigarroAndCerveja(_BasicScraper):
latestUrl = 'http://www.cigarro.ca/'
stripUrl = 'http://www.cigarro.ca/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r"(/comics/.+?)'")
prevSearch = compile(r'(/\?p=.+?)">&laq')
help = 'Index format: non'
@ -120,7 +127,7 @@ class TinyKittenTeeth(_BasicScraper):
class Comedity(_BasicScraper):
latestUrl = 'http://www.comedity.com/'
stripUrl = 'http://www.comedity.com/index.php?strip_id=%s'
stripUrl = latestUrl + 'index.php?strip_id=%s'
imageSearch = compile(r'<img src="(Comedity_files/.+?)"')
prevSearch = compile(r'<a href="(/?index.php\?strip_id=\d+?)"> *<img alt=\"Prior Strip')
help = 'Index format: n (no padding)'
@ -128,7 +135,7 @@ class Comedity(_BasicScraper):
class Commissioned(_BasicScraper):
latestUrl = 'http://www.commissionedcomic.com/'
stripUrl = 'http://www.commissionedcomic.com/index.php?strip=%s'
stripUrl = latestUrl + 'index.php?strip=%s'
imageSearch = compile(r'<img src="(http://www.commissionedcomic.com/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)">&lsaquo;</a>')
help = 'Index format: n'
@ -137,7 +144,7 @@ class Commissioned(_BasicScraper):
class CoolCatStudio(_BasicScraper):
latestUrl = 'http://www.coolcatstudio.com/'
stripUrl = 'http://www.coolcatstudio.com/strips-cat/ccs%s'
stripUrl = latestUrl + 'strips-cat/ccs%s'
imageSearch = compile(tagre("img", "src", r'(http://www.coolcatstudio.com/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://www\.coolcatstudio\.com/strips-cat/[^"]+)', before="cniprevt"))
help = 'Index format: yyyymmdd'
@ -146,9 +153,9 @@ class CoolCatStudio(_BasicScraper):
class CourtingDisaster(_BasicScraper):
latestUrl = 'http://www.courting-disaster.com/'
stripUrl = 'http://www.courting-disaster.com/archive/%s.html'
stripUrl = latestUrl + 'archive/%s.html'
imageSearch = compile(r'(/comics/.+?)"')
prevSearch = compile(r'</a><a href="(.+?)"><img src="/images/previous.gif"[^>]+?>')
prevSearch = compile(r'<a href="(.+?)"><img src="/images/previous.gif"[^>]+?>')
help = 'Index format: yyyymmdd'
@ -178,7 +185,7 @@ class CtrlAltDelSillies(CtrlAltDel):
class Curvy(_BasicScraper):
latestUrl = 'http://www.c.urvy.org/'
stripUrl = 'http://www.c.urvy.org/?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r'(/c/.+?)"')
prevSearch = compile(r'(/\?date=.+?)">&lt;&lt; Previous page')
help = 'Index format: yyyymmdd'
@ -220,7 +227,7 @@ penny = cloneManga('PennyTribute', 'penny')
class CatAndGirl(_BasicScraper):
latestUrl = 'http://catandgirl.com/'
stripUrl = 'http://catandgirl.com/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(tagre("img", "src", r'(http://catandgirl\.com/archive/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'([^"]+)')+r"[^<]+Previous</a>")
help = 'Index format: n (unpadded)'
@ -341,8 +348,8 @@ def creators(name, shortname):
name='Creators/' + name,
latestUrl='http://www.creators.com/comics_show.cfm?ComicName=%s' % (shortname,),
stripUrl=None,
imageSearch=compile(r'<img alt="[^"]+" src="(\d{4}/.+?/.+?\..+?)">'),
prevSearch=compile(r'<a href="(comics_show\.cfm\?next=\d+&ComicName=.+?)" Title="Previous Comic"'),
imageSearch=compile(tagre("img", "src", r'(\d{4}/[^"]+/[^"]+\.[^"]+)')),
prevSearch=compile(tagre("a", "href", r'(comics_show\.cfm\?next=\d+&ComicName=[^"]+)', after='Previous Comic')),
help='Indexing unsupported')
)
@ -361,7 +368,7 @@ zhi = creators('ZackHill', 'zhi')
class CyanideAndHappiness(_BasicScraper):
latestUrl = 'http://www.explosm.net/comics'
stripUrl = 'http://www.explosm.net/comics/%s'
stripUrl = latestUrl + '/%s'
imageSearch = compile(r'<img alt="Cyanide and Happiness, a daily webcomic" src="(http:\/\/www\.explosm\.net/db/files/Comics/\w+/\S+\.\w+)"')
prevSearch = compile(r'<a href="(/comics/\d+/?)">< Previous</a>')
help = 'Index format: n (unpadded)'
@ -370,7 +377,7 @@ class CyanideAndHappiness(_BasicScraper):
class CrimsonDark(_BasicScraper):
latestUrl = 'http://www.davidcsimon.com/crimsondark/'
stripUrl = 'http://www.davidcsimon.com/crimsondark/index.php?view=comic&strip_id=%s'
stripUrl = latestUrl + 'index.php?view=comic&strip_id=%s'
imageSearch = compile(r'src="(.+?strips/.+?)"')
prevSearch = compile(r'<a href=[\'"](/crimsondark/index\.php\?view=comic&amp;strip_id=\d+)[\'"]><img src=[\'"]themes/cdtheme/images/active_prev.png[\'"]')
help = 'Index format: n (unpadded)'
@ -397,8 +404,8 @@ class CatsAndCameras(_BasicScraper):
class CowboyJedi(_BasicScraper):
latestUrl = 'http://www.cowboyjedi.com/'
stripUrl = 'http://www.cowboyjedi.com/%s'
imageSearch = compile(r'<img src="(http://www.cowboyjedi.com/comics/.+?)"')
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.cowboyjedi\.com/comics/[^"]+)'))
prevSearch = compile(r'<a href="(http://www.cowboyjedi.com/.+?)" class="navi navi-prev"')
help = 'Index format: yyyy/mm/dd/strip-name'
@ -406,16 +413,16 @@ class CowboyJedi(_BasicScraper):
class CasuallyKayla(_BasicScraper):
latestUrl = 'http://casuallykayla.com/'
stripUrl = 'http://casuallykayla.com/?p=%s'
imageSearch = compile(r'<img src="(http://casuallykayla.com/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(.+?)">')
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(tagre("img", "src", r'(http://casuallykayla\.com/comics/[^"]+)'))
prevSearch = compile(tagre("div", "class", r'nav-previous') + tagre("a", "href", r'([^"]+)'))
help = 'Index format: nnn'
class Collar6(_BasicScraper):
latestUrl = 'http://collar6.com/'
stripUrl = 'http://collar6.com/archive/%s'
stripUrl = latestUrl + 'archive/%s'
imageSearch = compile(tagre("img", "src", r'(http://collar6\.com/wp-content/webcomic/collar6/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://collar6\.com/archive/[^"]+)', after="previous"))
help = 'Index format: <name>'
@ -424,8 +431,8 @@ class Collar6(_BasicScraper):
class Chester5000XYV(_BasicScraper):
latestUrl = 'http://jessfink.com/Chester5000XYV/'
stripUrl = 'http://jessfink.com/Chester5000XYV/?p=%s'
imageSearch = compile(r'<img src="(http://jessfink.com/Chester5000XYV/comics/.+?)"')
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(tagre("img", "src", r'(http://jessfink\.com/Chester5000XYV/comics/[^"]+)'))
prevSearch = compile(r'<a href="(.+?)"><span class="prev">')
help = 'Index format: nnn'
@ -433,8 +440,8 @@ class Chester5000XYV(_BasicScraper):
class CalamitiesOfNature(_BasicScraper):
latestUrl = 'http://www.calamitiesofnature.com/'
stripUrl = 'http://www.calamitiesofnature.com/archive/?c=%s'
imageSearch = compile(r'<IMG SRC="(archive/\d+.+?|http://www.calamitiesofnature.com/archive/\d+.+?)"')
stripUrl = latestUrl + 'archive/?c=%s'
imageSearch = compile(tagre("img", "src", r'(archive/\d+[^"]+|http://www\.calamitiesofnature\.com/archive/\d+[^"]+)'))
prevSearch = compile(r'<a id="previous" href="(http://www.calamitiesofnature.com/archive/\?c\=\d+)">')
help = 'Index format: nnn'
@ -451,8 +458,8 @@ class Champ2010(_BasicScraper):
class Chucklebrain(_BasicScraper):
latestUrl = 'http://www.chucklebrain.com/main.php'
stripUrl = 'http://www.chucklebrain.com/main.php?img=%s'
imageSearch = compile(r'<img src="(/images/strip.+?)"')
stripUrl = latestUrl + '?img=%s'
imageSearch = compile(tagre("img", "src", r'(/images/strip[^"]+)'))
prevSearch = compile(r'<a href=\'(/main.php\?img\=\d+)\'><img src=\'/images/previous.jpg\'')
help = 'Index format: nnn'
@ -460,8 +467,8 @@ class Chucklebrain(_BasicScraper):
class CompanyY(_BasicScraper):
latestUrl = 'http://company-y.com/'
stripUrl = 'http://company-y.com/%s/'
imageSearch = compile(r'<img src="(http://company-y.com/comics/.+?)"')
stripUrl = latestUrl + '%s/'
imageSearch = compile(tagre("img", "src", r'(http://company-y\.com/comics/[^"]+)'))
prevSearch = compile(r'<div class="nav-previous"><a href="(http://company-y.com/.+?)"')
help = 'Index format: yyyy/mm/dd/strip-name'
@ -483,7 +490,7 @@ class CorydonCafe(_BasicScraper):
class CraftedFables(_BasicScraper):
latestUrl = 'http://www.craftedfables.com/'
stripUrl = 'http://www.caf-fiends.net/craftedfables/?p=%s'
imageSearch = compile(r'<img src="(http://www.caf-fiends.net/craftedfables/comics/.+?)"')
imageSearch = compile(tagre("img", "src", r'(http://www\.caf-fiends\.net/craftedfables/comics/[^"]+)'))
prevSearch = compile(r'<a href="(http://www.caf-fiends.net/craftedfables/.+?)"><span class="prev">')
help = 'Index format: nnn'
@ -491,7 +498,7 @@ class CraftedFables(_BasicScraper):
class Currhue(_BasicScraper):
latestUrl = 'http://www.currhue.com/'
stripUrl = 'http://www.currhue.com/?p=%s'
imageSearch = compile(r'<img src="(http://www.currhue.com/comics/.+?)"')
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.currhue\.com/comics/[^"]+)'))
prevSearch = compile(r'<div class="nav-previous"><a href="(http://www.currhue.com/.+?)"')
help = 'Index format: nnn'


@ -19,15 +19,15 @@ class DMFA(_BasicScraper):
class DandyAndCompany(_BasicScraper):
latestUrl = 'http://www.dandyandcompany.com/'
stripUrl = 'http://www.dandyandcompany.com/%s'
imageSearch = compile(r'<img src="(.*?/strips/.+?)"')
stripUrl = latestUrl + '%s'
imageSearch = compile(tagre("img", "src", r'([^"]*/strips/[^"]+)'))
prevSearch = compile(r'<a href="(.*)" class="prev"')
help = 'Index format: yyyy/mm/dd'
class DarkWings(_BasicScraper):
latestUrl = 'http://www.flowerlarkstudios.com/dark-wings/'
stripUrl = 'http://www.flowerlarkstudios.com/dark-wings/archive.php?day=%s'
stripUrl = latestUrl + 'archive.php?day=%s'
imageSearch = compile(r'(comics/.+?)" W')
prevSearch = compile(r"first_day.+?/(archive.+?)'.+?previous_day")
help = 'Index format: yyyymmdd'
@ -35,7 +35,7 @@ class DarkWings(_BasicScraper):
class DeathToTheExtremist(_BasicScraper):
latestUrl = 'http://www.dtecomic.com/'
stripUrl = 'http://www.dtecomic.com/?n=%s'
stripUrl = latestUrl + '?n=%s'
imageSearch = compile(r'"(comics/.*?)"')
prevSearch = compile(r'</a> <a href="(\?n=.*?)"><.+?/aprev.gif"')
help = 'Index format: nnn'
@ -43,7 +43,7 @@ class DeathToTheExtremist(_BasicScraper):
class DeepFried(_BasicScraper):
latestUrl = 'http://www.whatisdeepfried.com/'
stripUrl = 'http://www.whatisdeepfried.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'(http://www.whatisdeepfried.com/comics/.+?)"')
prevSearch = compile(r'"(http://www.whatisdeepfried.com/.+?)"><span class="prev">')
help = 'Index format: non'
@ -52,7 +52,7 @@ class DeepFried(_BasicScraper):
class DoemainOfOurOwn(_BasicScraper):
latestUrl = 'http://www.doemain.com/'
stripUrl = 'http://www.doemain.com/index.cgi/%s'
stripUrl = latestUrl + 'index.cgi/%s'
imageSearch = compile(r"<img border='0' width='\d+' height='\d+' src='(/strips/\d{4}/\d{6}-[^\']+)'")
prevSearch = compile(r'<a href="(/index\.cgi/\d{4}-\d{2}-\d{2})"><img width="\d+" height="\d+" border="\d+" alt="Previous Strip"')
help = 'Index format: yyyy-mm-dd'
@ -70,8 +70,8 @@ class DrFun(_BasicScraper):
class Dracula(_BasicScraper):
latestUrl = 'http://draculacomic.net/'
stripUrl = 'http://draculacomic.net/comic.php?comicID=%s'
imageSearch = compile(r'<img src="(comics/.+?)"')
stripUrl = latestUrl + 'comic.php?comicID=%s'
imageSearch = compile(tagre("img", "src", r'(comics/[^"]+)'))
prevSearch = compile(r'&nbsp;<a class="archivelink" href="(.+?)">&laquo; Prev</a>')
help = 'Index format: nnn'
@ -79,7 +79,7 @@ class Dracula(_BasicScraper):
class DragonTails(_BasicScraper):
latestUrl = 'http://www.dragon-tails.com/'
stripUrl = 'http://www.dragon-tails.com/archive.php?date=%s'
stripUrl = latestUrl + 'archive.php?date=%s'
imageSearch = compile(r'"(newcomic/.+?)"')
prevSearch = compile(r'"(archive.+?)">.+n_2')
help = 'Index format: yyyy-mm-dd'
@ -87,7 +87,7 @@ class DragonTails(_BasicScraper):
class DreamKeepersPrelude(_BasicScraper):
latestUrl = 'http://www.dreamkeeperscomic.com/Prelude.php'
stripUrl = 'http://www.dreamkeeperscomic.com/Prelude.php?pg=%s'
stripUrl = latestUrl + '?pg=%s'
imageSearch = compile(r'(images/PreludeNew/.+?)"')
prevSearch = compile(r'(Prelude.php\?pg=.+?)"')
help = 'Index format: n'
@ -95,7 +95,7 @@ class DreamKeepersPrelude(_BasicScraper):
class Drowtales(_BasicScraper):
latestUrl = 'http://www.drowtales.com/mainarchive.php'
stripUrl = 'http://www.drowtales.com/mainarchive.php?location=%s'
stripUrl = latestUrl + '?location=%s'
imageSearch = compile(r'src=".(/tmpmanga/.+?)"')
prevSearch = compile(r'<a href="mainarchive.php(\?location=\d+)"><img src="[^"]*previousday\.gif"')
help = 'Index format: yyyymmdd'
@ -112,7 +112,7 @@ class DungeonCrawlInc(_BasicScraper):
class DieselSweeties(_BasicScraper):
latestUrl = 'http://www.dieselsweeties.com/'
stripUrl = 'http://www.dieselsweeties.com/archive/%s'
stripUrl = latestUrl + 'archive/%s'
imageSearch = compile(r'src="(/hstrips/.+?)"')
prevSearch = compile(r'href="(/archive/.+?)">(<img src="http://www.dieselsweeties.com/ximages/blackbackarrow160.png|previous webcomic)')
help = 'Index format: n (unpadded)'
@ -126,7 +126,7 @@ class DieselSweeties(_BasicScraper):
class DominicDeegan(_BasicScraper):
latestUrl = 'http://www.dominic-deegan.com/'
stripUrl = 'http://www.dominic-deegan.com/view.php?date=%s'
stripUrl = latestUrl + 'view.php?date=%s'
imageSearch = compile(r'<img src="(.+?save-as=.+?)" alt')
prevSearch = compile(r'"(view.php\?date=.+?)".+?prev21')
help = 'Index format: yyyy-mm-dd'
@ -157,7 +157,7 @@ class DresdenCodak(_BasicScraper):
class DonkBirds(_BasicScraper):
latestUrl = 'http://www.donkbirds.com/'
stripUrl = 'http://www.donkbirds.com/index.php?date=%s'
stripUrl = latestUrl + 'index.php?date=%s'
imageSearch = compile(r'<img src="(strips/.+?)"')
prevSearch = compile(r'<a href="(.+?)">Previous</a>')
help = 'Index format: yyyy-mm-dd'
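
Review note: the recurring change in these hunks derives stripUrl from latestUrl instead of repeating the host name. The sketch below illustrates that pattern and the one case where it does not apply. It assumes the index is substituted with the % operator, as the KevinAndKell hunk in this commit does; ExampleScraper, PinnedScraper, strip_url_for and the example.com URLs are hypothetical names for illustration, not part of dosagelib.

```python
class ExampleScraper(object):
    """Hypothetical stand-in for a _BasicScraper subclass."""
    latestUrl = 'http://www.example.com/'
    # Derived from latestUrl so the host name is written only once.
    stripUrl = latestUrl + 'archive.php?date=%s'

    @classmethod
    def strip_url_for(cls, index):
        # Substitute the strip index into the URL template.
        return cls.stripUrl % index


class PinnedScraper(object):
    """Hypothetical scraper whose latestUrl already ends in a strip index."""
    latestUrl = 'http://www.example.com/archives/20030308'
    # Appending '%s' would glue the requested index onto the pinned one.
    naive = latestUrl + '%s'
    # So the template is spelled out in full instead.
    stripUrl = 'http://www.example.com/archives/%s'


print(ExampleScraper.strip_url_for('2012-11-20'))
print(PinnedScraper.naive % '20030309')
print(PinnedScraper.stripUrl % '20030309')
```

The derivation is only safe when latestUrl is a pure prefix of every strip URL; a latestUrl that points at a specific strip needs its own template.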

View file

@ -4,11 +4,12 @@ from re import compile, IGNORECASE
from ..helpers import indirectStarter
from ..scraper import _BasicScraper
from ..util import tagre
class EerieCuties(_BasicScraper):
latestUrl = 'http://www.eeriecuties.com/'
stripUrl = 'http://www.eeriecuties.com/d/%s.html'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(r'(/comics/.+?)"')
prevSearch = compile(r'(/d/.+?.html).+?/previous_day.gif')
help = 'Index format: yyyymmdd'
@ -17,7 +18,7 @@ class EerieCuties(_BasicScraper):
class EdgeTheDevilhunter(_BasicScraper):
name = 'KeenSpot/EdgeTheDevilhunter'
latestUrl = 'http://www.edgethedevilhunter.com/'
stripUrl = 'http://www.edgethedevilhunter.com/comics/%s'
stripUrl = latestUrl + 'comics/%s'
imageSearch = compile(r'(http://www.edgethedevilhunter.com/comics/.+?)" alt')
prevSearch = compile(r'(http://www.edgethedevilhunter.com/comics/.+?)"><span class="prev')
help = 'Index format: mmddyyyy or name'
@ -40,7 +41,7 @@ class Eriadan(_BasicScraper):
class ElGoonishShive(_BasicScraper):
name = 'KeenSpot/ElGoonishShive'
latestUrl = 'http://www.egscomics.com/'
stripUrl = 'http://www.egscomics.com/?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r"'(comics/.+?)'")
prevSearch = compile(r"<a href='(/\?date=.+?)'.+?arrow_prev.gif")
help = 'Index format: yyyy-mm-dd'
@ -50,7 +51,7 @@ class ElGoonishShive(_BasicScraper):
class ElGoonishShiveNP(_BasicScraper):
name = 'KeenSpot/ElGoonishShiveNP'
latestUrl = 'http://www.egscomics.com/egsnp/'
stripUrl = 'http://www.egscomics.com/egsnp/?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r'<div class=\'comic2\'><img src=\'(comics/\d{4}/\d{2}.+?)\'')
prevSearch = compile(r'<a href=\'(.+?)\'[^>]+?onmouseover=\'\$\("navimg(6|2)"\)')
help = 'Index format: yyyy-mm-dd'
@ -68,16 +69,17 @@ class ElsieHooper(_BasicScraper):
class EmergencyExit(_BasicScraper):
latestUrl = 'http://www.eecomics.net/'
stripUrl = ''
stripUrl = None
imageSearch = compile(r'"(comics/.+?)"')
prevSearch = compile(r'START.+?"(.+?)"')
# XXX ?
help = 'God help us now!'
class ErrantStory(_BasicScraper):
latestUrl = 'http://www.errantstory.com/'
stripUrl = 'http://www.errantstory.com/archive.php?date=%s'
stripUrl = latestUrl + 'archive.php?date=%s'
imageSearch = compile(r'<img[^>]+?src="([^"]*?comics/.+?)"')
prevSearch = compile(r'><a href="(.+?)">&lt;Previous</a>')
help = 'Index format: yyyy-mm-dd'
@ -95,7 +97,7 @@ class EternalVenture(_BasicScraper):
class Evercrest(_BasicScraper):
latestUrl = 'http://www.evercrest.com/archives/20030308'
stripUrl = 'http://www.evercrest.com/archives/%s'
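# latestUrl already ends in a strip index, so it cannot serve as the prefix here.
stripUrl = 'http://www.evercrest.com/archives/%s'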
imageSearch = compile(r'<img.+?src="([^"]*/(images/oldstrips|archives/i)/[^"]*)"')
prevSearch = compile(r'<a.+?href="(http://www.evercrest.com/archives/\d+)">&lt; Previous')
help = 'Index format: yyyymmdd'
@ -103,7 +105,7 @@ class Evercrest(_BasicScraper):
class EverybodyLovesEricRaymond(_BasicScraper):
latestUrl = 'http://geekz.co.uk/lovesraymond/'
stripUrl = 'http://geekz.co.uk/lovesraymond/archive/%s'
stripUrl = latestUrl + 'archive/%s'
imageSearch = compile(r'<img src="((?:http://geekz.co.uk)?/lovesraymond/wp-content(?:/images)/ep\d+\w?\.jpg)"', IGNORECASE)
prevSearch = compile(r'&laquo; <a href="(http://geekz.co.uk/lovesraymond/archive/[^/"]*)">')
help = 'Index format: name-of-old-comic'
@ -111,16 +113,22 @@ class EverybodyLovesEricRaymond(_BasicScraper):
class EvilDiva(_BasicScraper):
latestUrl = 'http://www.evildivacomics.com/'
stripUrl = 'http://www.evildivacomics.com/%s.html'
stripUrl = latestUrl + '%s.html'
imageSearch = compile(r'(/comics/.+?)"')
prevSearch = compile(r'http.+?com/(.+?)".+?"prev')
help = 'Index format: cpn (unpadded)'
class EvilInc(_BasicScraper):
latestUrl = 'http://www.evil-comic.com/'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(tagre("img", "src", r'(/comic[s/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'
class Exiern(_BasicScraper):
latestUrl = 'http://www.exiern.com/'
stripUrl = 'http://www.exiern.com/comic/%s'
stripUrl = latestUrl + 'comic/%s'
imageSearch = compile(r'<img src="(http://www.exiern.com/comics/.+?)"')
prevSearch = compile(r'<a href="(http://www.exiern.com/.+?)" class="navi navi-prev"')
help = 'Index format: ChapterName-StripName'
@ -129,7 +137,7 @@ class Exiern(_BasicScraper):
class ExiernDarkReflections(_BasicScraper):
latestUrl = 'http://darkreflections.exiern.com/'
stripUrl = 'http://darkreflections.exiern.com/index.php?strip_id=%s'
stripUrl = latestUrl + 'index.php?strip_id=%s'
imageSearch = compile(r'"(istrip.+?)"')
prevSearch = compile(r'First.+?(/index.+?)".+?prev')
help = 'Index format: n'
@ -138,7 +146,7 @@ class ExiernDarkReflections(_BasicScraper):
class ExtraLife(_BasicScraper):
latestUrl = 'http://www.myextralife.com/'
stripUrl = 'http://www.myextralife.com/comic/%s/'
stripUrl = latestUrl + 'comic/%s/'
imageSearch = compile(r'<img src="(http://www.myextralife.com/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(http://www.myextralife.com/comic/.+?)"')
help = 'Index format: mmddyyyy'
@ -147,7 +155,7 @@ class ExtraLife(_BasicScraper):
class EyeOfRamalach(_BasicScraper):
latestUrl = 'http://theeye.katbox.net/'
stripUrl = 'http://theeye.katbox.net/index.php?strip_id=%s'
stripUrl = latestUrl + 'index.php?strip_id=%s'
imageSearch = compile(r'="(.+?strips/.+?)"')
prevSearch = compile(r'(index.php\?strip_id=.+?)".+?navigation_back')
help = 'Index format: n (unpadded)'
@ -170,7 +178,7 @@ class EarthsongSaga(_BasicScraper):
class ExploitationNow(_BasicScraper):
latestUrl = 'http://exploitationnow.com/'
stripUrl = 'http://exploitationnow.com/comic.php?date=%s'
stripUrl = latestUrl + 'comic.php?date=%s'
imageSearch = compile(r'src="(comics/.+?)"')
prevSearch = compile(r' <a href="(.+?)" title="\[Back\]">')
help = 'Index format: yyyy-mm-dd'
@ -179,7 +187,7 @@ class ExploitationNow(_BasicScraper):
class Ellerbisms(_BasicScraper):
latestUrl = 'http://www.ellerbisms.com/'
stripUrl = 'http://www.ellerbisms.com/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'<img src="(http://www.ellerbisms.com/comics/.+?)"')
prevSearch = compile(r'<a href="(http://www.ellerbisms.com/.+?)"><span class="prev">')
help = 'Index format: nnn'

View file

@ -9,7 +9,7 @@ from ..helpers import indirectStarter
class FalconTwin(_BasicScraper):
latestUrl = 'http://www.falcontwin.com/'
stripUrl = 'http://www.falcontwin.com/index.html?strip=%s'
stripUrl = latestUrl + 'index.html?strip=%s'
imageSearch = compile(r'"(strips/.+?)"')
prevSearch = compile(r'"prev"><a href="(index.+?)"')
help = 'Index format: nnn'
@ -17,7 +17,7 @@ class FalconTwin(_BasicScraper):
class FauxPas(_BasicScraper):
latestUrl = 'http://www.ozfoxes.net/cgi/pl-fp1.cgi'
stripUrl = 'http://www.ozfoxes.net/cgi/pl-fp1.cgi?%s'
stripUrl = latestUrl + '?%s'
imageSearch = compile(r'<img .*src="(.*fp/fp.*(png|jpg|gif))"')
prevSearch = compile(r'<a href="(pl-fp1\.cgi\?\d+)">Previous Strip')
help = 'Index format: nnn'
@ -35,8 +35,8 @@ class FeyWinds(_BasicScraper):
class FightCastOrEvade(_BasicScraper):
latestUrl = 'http://www.fightcastorevade.net/'
stripUrl = 'http://www.fightcastorevade.net/d/%s'
imageSearch = compile(r'<img src="(http://www.fightcastorevade.net/comics/.+?)"')
stripUrl = latestUrl + 'd/%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.fightcastorevade\.net/comics/[^"]+)'))
prevSearch = compile(r'"(.+?/d/.+?)".+?previous')
help = 'Index format: yyyymmdd.html'
@ -44,8 +44,8 @@ class FightCastOrEvade(_BasicScraper):
class FilibusterCartoons(_BasicScraper):
latestUrl = 'http://www.filibustercartoons.com/'
stripUrl = 'http://www.filibustercartoons.com/index.php/%s'
imageSearch = compile(r'<img src="(http://www.filibustercartoons.com/comics/.+?)"')
stripUrl = latestUrl + 'index.php/%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.filibustercartoons\.com/comics/[^"]+)'))
prevSearch = compile(r'<a href="(.+?)"><img src=\'(.+?/arrow-left.gif)\'')
help = 'Index format: yyyy/mm/dd/name'
@ -61,7 +61,7 @@ class FlakyPastry(_BasicScraper):
class Flipside(_BasicScraper):
latestUrl = 'http://www.flipsidecomics.com/comic.php'
stripUrl = 'http://www.flipsidecomics.com/comic.php?i=%s'
stripUrl = latestUrl + '?i=%s'
imageSearch = compile(r'<IMG SRC="(comic/.+?)"')
prevSearch = compile(r'<A HREF="(comic.php\?i=\d+?)">&lt')
help = 'Index format: nnnn'
@ -79,7 +79,7 @@ class Footloose(_BasicScraper):
class FragileGravity(_BasicScraper):
latestUrl = 'http://www.fragilegravity.com/'
stripUrl = 'http://www.fragilegravity.com/core.php?archive=%s'
stripUrl = latestUrl + 'core.php?archive=%s'
imageSearch = compile(r'<IMG SRC="(strips/.+?)"')
prevSearch = compile(r'<A HREF="(.+?)"\nonMouseover="window.status=\'Previous Strip', MULTILINE | IGNORECASE)
help = 'Index format: yyyymmdd'
@ -114,7 +114,7 @@ class FullFrontalNerdity(_BasicScraper):
class FunInJammies(_BasicScraper):
latestUrl = 'http://www.funinjammies.com/'
stripUrl = 'http://www.funinjammies.com/comic.php?issue=%s'
stripUrl = latestUrl + 'comic.php?issue=%s'
imageSearch = compile(r'(/comics/.+?)"')
prevSearch = compile(r'(/comic.php.+?)" id.+?prev')
help = 'Index format: n (unpadded)'

View file

@ -4,11 +4,12 @@ from re import compile
from ..scraper import _BasicScraper
from ..helpers import indirectStarter
from ..util import tagre
class Galaxion(_BasicScraper):
latestUrl = 'http://galaxioncomics.com/'
stripUrl = 'http://galaxioncomics.com/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'(wordpress/comics/.+?)"')
prevSearch = compile(r'\| <a href="http://galaxioncomics.com/(\?p=.+?)".+?vious.gif')
help = 'Index format: non'
@ -16,7 +17,7 @@ class Galaxion(_BasicScraper):
class Garanos(_BasicScraper):
latestUrl = 'http://www.garanos.com/'
stripUrl = 'http://www.garanos.com/pages/page-%s'
stripUrl = latestUrl + 'pages/page-%s'
imageSearch = compile(r'<img src=.+?(/pages/.+?)"')
prevSearch = compile(r'<a href="(http://www.garanos.com/pages/page-.../)">&#9668; Previous<')
help = 'Index format: n (unpadded)'
@ -24,7 +25,7 @@ class Garanos(_BasicScraper):
class GUComics(_BasicScraper):
latestUrl = 'http://www.gucomics.com/comic/'
stripUrl = 'http://www.gucomics.com/comic/?cdate=%s'
stripUrl = latestUrl + '?cdate=%s'
imageSearch = compile(r'<IMG src="(/comics/\d{4}/gu_.*?)"')
prevSearch = compile(r'<A href="(/comic/\?cdate=\d+)"><IMG src="/images/cnav_prev')
help = 'Index format: yyyymmdd'
@ -33,7 +34,7 @@ class GUComics(_BasicScraper):
class GenrezvousPoint(_BasicScraper):
latestUrl = 'http://genrezvouspoint.com/'
stripUrl = 'http://genrezvouspoint.com/index.php?comicID=%s'
stripUrl = latestUrl + 'index.php?comicID=%s'
imageSearch = compile(r'<img src=\'(comics/.+?)\'')
prevSearch = compile(r' <a[^>]+?href="(.+?)">PREVIOUS</a>')
help = 'Index format: nnn'
@ -57,19 +58,25 @@ class GirlsWithSlingshots(_BasicScraper):
help = 'Index format: nnn'
class Girly(_BasicScraper):
latestUrl = 'http://girlyyy.com/'
stripUrl = 'http://girlyyy.com/go/%s'
stripUrl = latestUrl + 'go/%s'
imageSearch = compile(r'<img src="(http://girlyyy.com/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)"> &nbsp;&lt;&nbsp;prev')
help = 'Index format: nnn'
class GleefulNihilism(_BasicScraper):
latestUrl = 'http://gleefulnihilism.com/'
stripUrl = latestUrl + 'comics/%s/'
imageSearch = compile(tagre("img", "src", r'(http://gleefulnihilism\.com/comics/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'(http://gleefulnihilism\.com/comics/[^"]+)') + 'Previous')
help = 'Index format: yyyy/mm/dd/stripname'
class Goats(_BasicScraper):
latestUrl = 'http://www.goats.com/'
stripUrl = 'http://www.goats.com/archive/%s.html'
stripUrl = latestUrl + 'archive/%s.html'
imageSearch = compile(r'<img.+?src="(/comix/.+?)"')
prevSearch = compile(r'<a href="(/archive/\d{6}.html)" class="button" title="go back">')
help = 'Index format: yymmdd'
@ -101,7 +108,7 @@ class GunnerkrigCourt(_BasicScraper):
class Gunshow(_BasicScraper):
latestUrl = 'http://gunshowcomic.com/'
stripUrl = 'http://gunshowcomic.com/d/%s.html'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(r'src="(/comics/.+?)"')
prevSearch = compile(r'(/d/\d+\.html)"><img[^>]+?src="/images/previous_day')
help = 'Index format: yyyy/mm/dd'
@ -110,7 +117,7 @@ class Gunshow(_BasicScraper):
class GleefulNihilism(_BasicScraper):
latestUrl = 'http://gleefulnihilism.com/'
stripUrl = 'http://gleefulnihilism.com/comics/2009/12/01/just-one-of-the-perks/%s'
stripUrl = latestUrl + 'comics/2009/12/01/just-one-of-the-perks/%s'
imageSearch = compile(r'<img src="(http://gleefulnihilism.com/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)"[^>]+?>Previous</a>')
help = 'Index format: yyyy/mm/dd/strip-name'
@ -119,7 +126,7 @@ class GleefulNihilism(_BasicScraper):
class GastroPhobia(_BasicScraper):
latestUrl = 'http://www.gastrophobia.com/'
stripUrl = 'http://www.gastrophobia.com/index.php?date=%s'
stripUrl = latestUrl + 'index.php?date=%s'
imageSearch = compile(r'<img src="(http://gastrophobia.com/comix/[^"]+)"[^>]*>(?!<br>)')
prevSearch = compile(r'<a href="(.+?)"><img src="pix/prev.gif" ')
help = 'Index format: yyyy-mm-dd'
@ -128,7 +135,7 @@ class GastroPhobia(_BasicScraper):
class Geeks(_BasicScraper):
latestUrl = 'http://sevenfloorsdown.com/geeks/'
stripUrl = 'http://sevenfloorsdown.com/geeks/archives/%s'
stripUrl = latestUrl + 'archives/%s'
imageSearch = compile(r'<img src=\'(http://sevenfloorsdown.com/geeks/comics/.+?)\'')
prevSearch = compile(r'<a href="(.+?)">&laquo; Previous')
help = 'Index format: nnn'
@ -137,7 +144,16 @@ class Geeks(_BasicScraper):
class GlassHalfEmpty(_BasicScraper):
latestUrl = 'http://www.defectivity.com/ghe/index.php'
stripUrl = 'http://www.defectivity.com/ghe/index.php?strip_id=%s'
stripUrl = latestUrl + '?strip_id=%s'
imageSearch = compile(r'src="(comics/.+?)"')
prevSearch = compile(r'</a><a href="(.+?)"><img src="\.\./images/onback\.jpg"')
help = 'Index format: nnn'
class GreystoneInn(_BasicScraper):
latestUrl = 'http://www.greystoneinn.net/'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(tagre("img", "src", r'(/comic[s/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'
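
Review note: several scrapers added in this commit (EvilInc, GreystoneInn and others) share the same two tagre-based patterns. The sketch below shows how such patterns match a page. simple_tagre is an assumption written for illustration: it is a simplified stand-in that wraps the value in double quotes itself, and it is not the dosagelib implementation of tagre. The sample HTML is invented.

```python
import re


def simple_tagre(tag, attribute, value):
    """Simplified, hypothetical stand-in for dosagelib's tagre helper.

    Builds a regex for <tag ... attribute="value" ...>. The double quotes
    around the value are added here, so callers must not add their own.
    """
    return r'<\s*%s\s+(?:[^>]*\s+)?%s\s*=\s*"%s"[^>]*>' % (tag, attribute, value)


# [s/] accepts either /comics/... or /comic/... image paths.
image_search = re.compile(simple_tagre("img", "src", r'(/comic[s/][^"]+)'))
# The link must be followed by the "previous day" navigation image.
prev_search = re.compile(
    simple_tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')
    + r"[^>]+/images/(?:nav_02|previous_day)\.gif")

# Invented sample page fragment.
html = ('<a href="http://www.example.com/d/20121119.html">'
        '<img src="/images/previous_day.gif"></a>'
        '<img alt="strip" src="/comics/ex20121120.png">')

print(image_search.search(html).group(1))  # /comics/ex20121120.png
print(prev_search.search(html).group(1))   # /d/20121119.html
```

Because the helper supplies the quotes, a value that starts with its own double quote would require two consecutive quotes in the page and would not match ordinary markup.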

View file

@ -7,7 +7,7 @@ from ..scraper import _BasicScraper
class HappyMedium(_BasicScraper):
latestUrl = 'http://happymedium.fast-bee.com/'
stripUrl = 'http://happymedium.fast-bee.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'(/comics/.+?)"')
prevSearch = compile(r'com(/.+?)".+?"prev">&#9668')
help = 'Index format: yyyy/mm/chapter-n-page-n'
@ -16,7 +16,7 @@ class HappyMedium(_BasicScraper):
class Heliothaumic(_BasicScraper):
latestUrl = 'http://thaumic.net/'
stripUrl = 'http://thaumic.net/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'<img src="(http://thaumic.net/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(http://thaumic.net/.+?)">')
help = 'Index format: yyyy/mm/dd/n(unpadded)-comicname'
@ -34,7 +34,7 @@ class Housd(_BasicScraper):
class HateSong(_BasicScraper):
latestUrl = 'http://hatesong.com/'
stripUrl = 'http://hatesong.com/%s/'
stripUrl = latestUrl + '%s/'
imageSearch = compile(r'src="(http://www.hatesong.com/strips/.+?)"')
prevSearch = compile(r'<div class="headernav"><a href="(http://hatesong.com/\d{4}/\d{2}/\d{2})')
help = 'Index format: yyyy/mm/dd'
@ -52,7 +52,7 @@ class HorribleVille(_BasicScraper):
class HelpDesk(_BasicScraper):
latestUrl = 'http://www.ubersoft.net/'
stripUrl = 'http://www.ubersoft.net/comic/hd/%s/%s/%s'
stripUrl = latestUrl + 'comic/hd/%s/%s/%s'
imageSearch = compile(r'src="(http://www.ubersoft.net/files/comics/hd/hd\d{8}.png)')
prevSearch = compile(r'<a href="(/comic/.+?)">(.+?)previous</a>')
help = 'Index format: yyyy/mm/name'
@ -61,7 +61,7 @@ class HelpDesk(_BasicScraper):
class HardGraft(_BasicScraper):
latestUrl = 'http://hard-graft.net/'
stripUrl = 'http://hard-graft.net/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'<img src="(http://hard-graft.net/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(.+?)"')
help = 'Index format: nnn'

View file

@ -3,11 +3,12 @@
from re import compile, IGNORECASE
from ..scraper import _BasicScraper
from ..util import tagre
class IDreamOfAJeanieBottle(_BasicScraper):
latestUrl = 'http://jeaniebottle.com/'
stripUrl = 'http://jeaniebottle.com/review.php?comicID='
stripUrl = latestUrl + 'review.php?comicID=%s'
imageSearch = compile(r'(/comics/.+?)"')
prevSearch = compile(r'First".+?(review.php.+?)".+?prev_a.gif')
help = 'Index format: n (unpadded)'
@ -15,7 +16,7 @@ class IDreamOfAJeanieBottle(_BasicScraper):
class IrregularWebcomic(_BasicScraper):
latestUrl = 'http://www.irregularwebcomic.net/'
stripUrl = 'http://www.irregularwebcomic.net/cgi-bin/comic.pl?comic=%s'
stripUrl = latestUrl + 'cgi-bin/comic.pl?comic=%s'
imageSearch = compile(r'<img .*src="(.*comics/.*(png|jpg|gif))".*>')
prevSearch = compile(r'<a href="(/\d+\.html|/cgi-bin/comic\.pl\?comic=\d+)">Previous ')
help = 'Index format: nnn'
@ -23,7 +24,7 @@ class IrregularWebcomic(_BasicScraper):
class InsideOut(_BasicScraper):
latestUrl = 'http://www.insideoutcomic.com/'
stripUrl = 'http://www.insideoutcomic.com/html/%s.html'
stripUrl = latestUrl + 'html/%s.html'
imageSearch = compile(r'Picture12LYR.+?C="(.+?/assets/images/.+?)"')
prevSearch = compile(r'Picture7LYR.+?F="(.+?/html/.+?)"')
help = 'Index format: n_comic_name'
@ -75,3 +76,12 @@ class ICantDrawFeet(_BasicScraper):
imageSearch = compile(r'src="(http://icantdrawfeet.com/comics/.+?)"')
prevSearch = compile(r'<a href="(http://icantdrawfeet.com/.+?)"><img src="http://icantdrawfeet.com/pageimages/prev.png"')
help = 'Index format: yyyy/mm/dd/stripname'
class ItsWalky(_BasicScraper):
latestUrl = 'http://www.itswalky.com/'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(tagre("img", "src", r'(/comic[s/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'

View file

@ -7,7 +7,7 @@ from ..scraper import _BasicScraper
class Jack(_BasicScraper):
latestUrl = 'http://www.pholph.com/'
stripUrl = 'http://www.pholph.com/strip.php?id=5&sid=%s'
stripUrl = latestUrl + 'strip.php?id=5&sid=%s'
imageSearch = compile(r'<img src="(./artwork/.+?/Jack.+?)"')
prevSearch = compile(r'\|<a href="(.+?)">Previous Strip</a>')
help = 'Index format: n (unpadded)'
@ -16,7 +16,7 @@ class Jack(_BasicScraper):
class JerkCity(_BasicScraper):
latestUrl = 'http://www.jerkcity.com/'
stripUrl = 'http://www.jerkcity.com/jerkcity%s'
stripUrl = latestUrl + 'jerkcity%s'
imageSearch = compile(r'"jerkcity.+?">.+?"(/jerkcity.+?)"')
prevSearch = compile(r'"(jerkcity.+?)">.+?"/jerkcity.+?"')
help = 'Index format: unknown'
@ -25,7 +25,7 @@ class JerkCity(_BasicScraper):
class JoeAndMonkey(_BasicScraper):
latestUrl = 'http://www.joeandmonkey.com/'
stripUrl = 'http://www.joeandmonkey.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'"(/comic/[^"]+)"')
prevSearch = compile(r"<a href='(/\d+)'>Previous")
help = 'Index format: nnn'

View file

@ -3,11 +3,12 @@
from re import compile, IGNORECASE
from ..scraper import _BasicScraper
from ..util import tagre
class KernelPanic(_BasicScraper):
latestUrl = 'http://www.ubersoft.net/kpanic/'
stripUrl = 'http://www.ubersoft.net/kpanic/d/%s'
stripUrl = latestUrl + 'd/%s'
imageSearch = compile(r'src="(.+?/kp/kp.+?)" ')
prevSearch = compile(r'<li class="previous"><a href="(.+?)">')
help = 'Index format: yyyymmdd.html'
@ -29,7 +30,7 @@ class Key(_BasicScraper):
class Krakow(_BasicScraper):
latestUrl = 'http://www.krakowstudios.com/'
stripUrl = 'http://www.krakowstudios.com/archive.php?date=%s'
stripUrl = latestUrl + 'archive.php?date=%s'
imageSearch = compile(r'<img src="(comics/.+?)"')
prevSearch = compile(r'<a href="(archive\.php\?date=.+?)"><img border=0 name=previous_day')
help = 'Index format: yyyymmdd'
@ -45,7 +46,7 @@ class Kukuburi(_BasicScraper):
class KevinAndKell(_BasicScraper):
latestUrl = 'http://www.kevinandkell.com/'
stripUrl = 'http://www.kevinandkell.com/%s/kk%s%s.html'
stripUrl = latestUrl + '%s/kk%s%s.html'
imageSearch = compile(r'<img.+?src="(/?(\d+/)?strips/kk\d+.gif)"', IGNORECASE)
prevSearch = compile(r'<a.+?href="(/?(\.\./)?\d+/kk\d+\.html)"[^>]*><span>Previous Strip', IGNORECASE)
help = 'Index format: yyyy-mm-dd'
@ -54,10 +55,18 @@ class KevinAndKell(_BasicScraper):
self.currentUrl = self.stripUrl % tuple(map(int, index.split('-')))
class KillerKomics(_BasicScraper):
latestUrl = 'http://www.killerkomics.com/web-comics/index_ang.cfm'
stripUrl = 'http://www.killerkomics.com/web-comics/%s.cfm'
imageSearch = compile(r'<img src="(http://www.killerkomics.com/FichiersUpload/Comics/.+?)"')
prevSearch = compile(r'<div id="precedent"><a href="(.+?)"')
help = 'Index format: strip-name'
class KrazyLarry(_BasicScraper):
latestUrl = 'http://www.krazylarry.com/'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(tagre("img", "src", r'(/comic[s/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'

View file

@ -8,7 +8,7 @@ from ..helpers import indirectStarter
class LasLindas(_BasicScraper):
latestUrl = 'http://www.katbox.net/laslindas/'
stripUrl = 'http://www.katbox.net/laslindas/index.php?strip_id=%s'
stripUrl = latestUrl + 'index.php?strip_id=%s'
imageSearch = compile(r'"(istrip_files/strips/.+?)"')
prevSearch = compile(r'<a href="(.+?)"><[^>]+?alt="Back"')
help = 'Index format: n (unpadded)'
@ -17,7 +17,7 @@ class LasLindas(_BasicScraper):
class LastBlood(_BasicScraper):
latestUrl = 'http://www.lastblood.net/main/'
stripUrl = 'http://www.lastblood.net/main/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'(/comicfolder/.+?)" alt')
prevSearch = compile(r'Previous Comic:</small><br />&laquo; <a href="(.+?)">')
help = 'Index format: yyyy/mm/dd/(page number and name)'
@ -26,7 +26,7 @@ class LastBlood(_BasicScraper):
class LesbianPiratesFromOuterSpace(_BasicScraper):
latestUrl = 'http://rosalarian.com/lesbianpirates/'
stripUrl = 'http://rosalarian.com/lesbianpirates/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'(/lesbianpirates/comics/.+?)"')
prevSearch = compile(r'/(\?p=.+?)">&laquo')
help = 'Index format: n'
@ -35,7 +35,7 @@ class LesbianPiratesFromOuterSpace(_BasicScraper):
class Lint(_BasicScraper):
latestUrl = 'http://www.purnicellin.com/lint/'
stripUrl = 'http://www.purnicellin.com/lint/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'<img src="(http://www.purnicellin.com/lint/comics/.+?)"')
prevSearch = compile(r'\| <a href="([^"]+)" rel="prev">')
help = 'Index format: yyyy/mm/dd/num-name'
@ -58,7 +58,7 @@ class LookingForGroup(_BasicScraper):
class Loserz(_BasicScraper):
latestUrl = 'http://bukucomics.com/loserz/'
stripUrl = 'http://bukucomics.com/loserz/go/%s'
stripUrl = latestUrl + 'go/%s'
imageSearch = compile(r'<img src="(http://bukucomics.com/loserz/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)"> &nbsp;&lt;&nbsp;')
help = 'Index format: n (unpadded)'
@ -67,7 +67,7 @@ class Loserz(_BasicScraper):
class LittleGamers(_BasicScraper):
latestUrl = 'http://www.little-gamers.com/'
stripUrl = 'http://www.little-gamers.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'<img src="(http://www.little-gamers.com/comics/[^"]+)"')
prevSearch = compile(r'href="(.+?)"><img id="comic-nav-prev"')
help = 'Index format: yyyy/mm/dd/name'
@ -76,7 +76,7 @@ class LittleGamers(_BasicScraper):
class LegoRobot(_BasicScraper):
latestUrl = 'http://www.legorobotcomics.com/'
stripUrl = 'http://www.legorobotcomics.com/?id=%s'
stripUrl = latestUrl + '?id=%s'
imageSearch = compile(r'id="the_comic" src="(comics/.+?)"')
prevSearch = compile(r'(\?id=\d+)"><img src="images/back.png"')
help = 'Index format: nnnn'
@ -85,7 +85,7 @@ class LegoRobot(_BasicScraper):
class LeastICouldDo(_BasicScraper):
latestUrl = 'http://www.leasticoulddo.com/'
stripUrl = 'http://www.leasticoulddo.com/comic/%s'
stripUrl = latestUrl + 'comic/%s'
imageSearch = compile(r'<img src="(http://cdn.leasticoulddo.com/comics/\d{8}.\w{1,4})" />')
prevSearch = compile(r'<a href="(/comic/\d{8})">Previous</a>')
help = 'Index format: yyyymmdd'

View file

@ -4,19 +4,19 @@ from re import compile, IGNORECASE
from ..scraper import _BasicScraper
from ..helpers import queryNamer
from ..util import tagre
class MadamAndEve(_BasicScraper):
latestUrl = 'http://www.madamandeve.co.za/week_of_cartns.php'
stripUrl = 'http://www.madamandeve.co.za/week_of_cartns.php'
stripUrl = None
imageSearch = compile(r'<IMG BORDER="0" SRC="(cartoons/me\d{6}\.(gif|jpg))">')
prevSearch = compile(r'<a href="(weekend_cartoon.php)"')
help = 'Index format: (none)'
class MagicHigh(_BasicScraper):
latestUrl = 'http://www.doomnstuff.com/magichigh/index.php'
stripUrl = 'http://www.doomnstuff.com/magichigh/index.php?strip_id=%s'
stripUrl = latestUrl + '?strip_id=%s'
imageSearch = compile(r'(istrip_files/strips/.+?)"')
prevSearch = compile(r'First .+?"(/magichigh.+?)".+?top_back')
help = 'Index format: n'
@ -25,7 +25,7 @@ class MagicHigh(_BasicScraper):
class Marilith(_BasicScraper):
latestUrl = 'http://www.marilith.com/'
stripUrl = 'http://www.marilith.com/archive.php?date=%s'
stripUrl = latestUrl + 'archive.php?date=%s'
imageSearch = compile(r'<img src="(comics/.+?)" border')
prevSearch = compile(r'<a href="(archive\.php\?date=.+?)"><img border=0 name=previous_day')
help = 'Index format: yyyymmdd'
@ -34,7 +34,7 @@ class Marilith(_BasicScraper):
class MarryMe(_BasicScraper):
latestUrl = 'http://marrymemovie.com/main/'
stripUrl = 'http://marrymemovie.com/main/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'(/comicfolder/.+?)"')
prevSearch = compile(r'Previous Comic:</small><br />&#171; <a href="(.+?)">')
help = 'Index format: good luck !'
@ -42,7 +42,7 @@ class MarryMe(_BasicScraper):
class Meek(_BasicScraper):
latestUrl = 'http://www.meekcomic.com/'
stripUrl = 'http://www.meekcomic.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'meekcomic.com(/comics/.+?)"')
prevSearch = compile(r'\s.+?(http://www.meekcomic.com/.+?)".+?Previous<')
help = 'Index format: yyyy/mm/dd/ch-p/'
@ -50,7 +50,7 @@ class Meek(_BasicScraper):
class MegaTokyo(_BasicScraper):
latestUrl = 'http://www.megatokyo.com/'
stripUrl = 'http://www.megatokyo.com/strip/%s'
stripUrl = latestUrl + 'strip/%s'
imageSearch = compile(r'"(strips/.+?)"', IGNORECASE)
prevSearch = compile(r'"(./strip/\d+?)">Prev')
help = 'Index format: nnnn'
@ -58,7 +58,7 @@ class MegaTokyo(_BasicScraper):
class MyPrivateLittleHell(_BasicScraper):
latestUrl = 'http://mutt.purrsia.com/mplh/'
stripUrl = 'http://mutt.purrsia.com/mplh/?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r'<img.+?src="(comics/.+?)"')
prevSearch = compile(r'<a.+?href="(\?date=\d+/\d+/\d+)">Prev</a>')
help = 'Index format: mm/dd/yyyy'
@ -67,16 +67,23 @@ class MyPrivateLittleHell(_BasicScraper):
class MacHall(_BasicScraper):
latestUrl = 'http://www.machall.com/'
stripUrl = 'http://www.machall.com/view.php?date=%s'
stripUrl = latestUrl + 'view.php?date=%s'
imageSearch = compile(r'<img src="(comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)"><img[^>]+?src=\'drop_shadow/previous.gif\'>')
help = 'Index format: yyyy-mm-dd'
class Melonpool(_BasicScraper):
latestUrl = 'http://www.melonpool.com/'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(tagre("img", "src", r'(/comic[s/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'
class Misfile(_BasicScraper):
latestUrl = 'http://www.misfile.com/'
stripUrl = 'http://www.misfile.com/?page=%s'
stripUrl = latestUrl + '?page=%s'
imageSearch = compile(r'<img src="(overlay\.php\?pageCalled=\d+)">')
prevSearch = compile(r'<a href="(\?page=\d+)"><img src="/images/back\.gif"')
help = 'Index format: n (unpadded)'
@ -86,7 +93,7 @@ class Misfile(_BasicScraper):
class MysteriesOfTheArcana(_BasicScraper):
latestUrl = 'http://mysteriesofthearcana.com/'
stripUrl = 'http://mysteriesofthearcana.com/index.php?action=comics&cid='
stripUrl = latestUrl + 'index.php?action=comics&cid=%s'
imageSearch = compile(r'(image.php\?type=com&i=.+?)"')
prevSearch = compile(r'(index.php\?action=comics&cid=.+?)".+?show_prev1')
help = 'Index format: n (unpadded)'
@ -95,7 +102,7 @@ class MysteriesOfTheArcana(_BasicScraper):
class MysticRevolution(_BasicScraper):
latestUrl = 'http://www.mysticrev.com/index.php'
stripUrl = 'http://www.mysticrev.com/index.php?cid=%s'
stripUrl = latestUrl + '?cid=%s'
imageSearch = compile(r'(comics/.+?)"')
prevSearch = compile(r'(\?cid=.+?)".+?prev.gif')
help = 'Index format: n (unpadded)'
@ -104,7 +111,7 @@ class MysticRevolution(_BasicScraper):
class MontyAndWooly(_BasicScraper):
latestUrl = 'http://www.montyandwoolley.co.uk/'
stripUrl = 'http://montyandwoolley.co.uk/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'<img src="(http://montyandwoolley.co.uk/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(.+?)">')
help = 'Index format: yyyy/mm/dd/strip-name'

View file

@ -9,16 +9,16 @@ from ..helpers import indirectStarter, _PHPScraper
class NamirDeiter(_BasicScraper):
latestUrl = 'http://www.namirdeiter.com/'
stripUrl = 'http://www.namirdeiter.com/comics/index.php?date=%s'
stripUrl = latestUrl + 'comics/index.php?date=%s'
imageSearch = compile(r'<img.+?(/comics/\d{8}.+?)[\'|\"]')
prevSearch = compile(r'(/comics/index.php\?date=.+?|http://www.namirdeiter.com/comics/index.php\?date=.+?)[\'|\"].+?previous')
prevSearch = compile(r'(/comics/index.php\?date=.+?|http://www\.namirdeiter\.com/comics/index.php\?date=.+?)[\'|\"].+?previous')
help = 'Index format: yyyymmdd'
class NeoEarth(_BasicScraper):
latestUrl = 'http://www.neo-earth.com/NE/'
stripUrl = 'http://www.neo-earth.com/NE/index.php?date=%s'
stripUrl = latestUrl + 'index.php?date=%s'
imageSearch = compile(r'<img src="(strips/.+?)"')
prevSearch = compile(r'<a href="(.+?)">Previous</a>')
help = 'Index format: yyyy-mm-dd'
@ -27,7 +27,7 @@ class NeoEarth(_BasicScraper):
class Nervillsaga(_BasicScraper):
latestUrl = 'http://www.nervillsaga.com/'
stripUrl = 'http://www.nervillsaga.com/index.php?s=%s'
stripUrl = latestUrl + 'index.php?s=%s'
imageSearch = compile(r'"(pic/.+?)"')
prevSearch = compile(r'"(.+?)">Previous')
help = 'Index format: nnn'
@ -35,8 +35,8 @@ class Nervillsaga(_BasicScraper):
class NewAdventuresOfBobbin(_BasicScraper):
latestUrl = 'http://bobbin-comic.com/'
stripUrl = 'http://www.bobbin-comic.com/wordpress/?p=%s'
latestUrl = 'http://www.bobbin-comic.com/'
stripUrl = latestUrl + 'wordpress/?p=%s'
imageSearch = compile(r'<img src="(http://www.bobbin-comic.com/wordpress/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)"><span class="prev">')
help = 'Index format: n'
@ -45,7 +45,7 @@ class NewAdventuresOfBobbin(_BasicScraper):
class NewWorld(_BasicScraper):
latestUrl = 'http://www.tfsnewworld.com/'
stripUrl = 'http://www.tfsnewworld.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'<img src="(http://www.tfsnewworld.com/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="([^"]+)" rel="prev">')
help = 'Index format: yyyy/mm/dd/stripn'
@ -54,7 +54,7 @@ class NewWorld(_BasicScraper):
class Nicky510(_BasicScraper):
latestUrl = 'http://www.nicky510.com/'
stripUrl = 'http://www.nicky510.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'(http://www.nicky510.com/comics/.+?)"')
prevSearch = compile(r'<a href="(http://www.nicky510.com/.+?)" class="navi navi-prev"')
help = 'Index format: yyyy/mm/dd/stripname/'
@ -72,7 +72,7 @@ class NoNeedForBushido(_BasicScraper):
class Nukees(_BasicScraper):
latestUrl = 'http://www.nukees.com/'
stripUrl = 'http://www.nukees.com/d/%s'
stripUrl = latestUrl + 'd/%s'
imageSearch = compile(r'"comic".+?"(/comics/.+?)"')
prevSearch = compile(r'"(/d/.+?)".+?previous')
help = 'Index format: yyyymmdd.html'
@ -159,7 +159,7 @@ class Nodwick(_BasicScraper):
class NekkoAndJoruba(_BasicScraper):
latestUrl = 'http://www.nekkoandjoruba.com/'
stripUrl = 'http://www.nekkoandjoruba.com/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'<img src="(http://www.nekkoandjoruba.com/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)">&lsaquo;</a>')
help = 'Index format: nnn'
@ -168,7 +168,7 @@ class NekkoAndJoruba(_BasicScraper):
class NobodyScores(_BasicScraper):
latestUrl = 'http://nobodyscores.loosenutstudio.com/'
stripUrl = 'http://nobodyscores.loosenutstudio.com/index.php?id=%s'
imageSearch = compile(r'><img src="(http://nobodyscores.loosenutstudio.com/comix/.+?)"')
prevSearch = compile(r'<a href="(http://nobodyscores.loosenutstudio.com/index.php.+?)">the one before </a>')
stripUrl = latestUrl + 'index.php?id=%s'
imageSearch = compile(r'><img src="(http://nobodyscores\.loosenutstudio\.com/comix/.+?)"')
prevSearch = compile(r'<a href="(http://nobodyscores\.loosenutstudio\.com/index.php.+?)">the one before </a>')
help = 'Index format: nnn'

View file

@ -9,7 +9,6 @@ from ..scraper import _BasicScraper
class NineteenNinetySeven(_BasicScraper):
name = '1997'
latestUrl = 'http://www.1977thecomic.com/'
stripUrl = 'http://www.1977thecomic.com/%s'
imageSearch = compile(tagre("img", "src", r'(http://www\.1977thecomic\.com/comics-1977/[^"]+)'))
prevSearch = compile(tagre("a", "href", r'([^"]+)')+"Previous")
help = 'Index format: yyyy/mm/dd/strip-name'

View file

@ -18,7 +18,7 @@ class OctopusPie(_BasicScraper):
class OddFish(_BasicScraper):
latestUrl = 'http://www.odd-fish.net/'
stripUrl = 'http://www.odd-fish.net/viewing.php?&comic_id=%s'
stripUrl = latestUrl + 'viewing.php?&comic_id=%s'
imageSearch = compile(r'<img src="(images/\d{1,4}.\w{3,4})" ')
prevSearch = compile(r'<a href="(.+?)"><img src="http://www.odd-fishing.net/i/older.gif" ')
help = 'Index format: n (unpadded)'
@ -27,7 +27,7 @@ class OddFish(_BasicScraper):
class OhMyGods(_BasicScraper):
latestUrl = 'http://ohmygods.co.uk/'
stripUrl = 'http://ohmygods.co.uk/strips/%s'
stripUrl = latestUrl + 'strips/%s'
imageSearch = compile(r'<p class="omgs-strip"><img src="(/system/files/.+?)"')
prevSearch = compile(r'<li class="custom_pager_prev"><a href="(/strips/.+?)"')
help = 'Index format: yyyy-mm-dd'
@ -45,7 +45,7 @@ class OnTheEdge(_BasicScraper):
class OneQuestion(_BasicScraper):
latestUrl = 'http://onequestioncomic.com/'
stripUrl = 'http://onequestioncomic.com/comics/%s/'
stripUrl = latestUrl + 'comics/%s/'
imageSearch = compile(r'(istrip_files.+?)"')
prevSearch = compile(r'First.+?"(comic.php.+?)".+?previous.png')
help = 'Index format: n (unpadded)'
@ -54,7 +54,7 @@ class OneQuestion(_BasicScraper):
class OurHomePlanet(_BasicScraper):
latestUrl = 'http://gdk.gd-kun.net/'
stripUrl = 'http://gdk.gd-kun.net/%s.html'
stripUrl = latestUrl + '%s.html'
imageSearch = compile(r'<img src="(pages/comic.+?)"')
prevSearch = compile(r'coords="50,18,95,65".+?href="(.+?\.html)".+?alt=')
help = 'Index format: n (unpadded)'
@ -81,7 +81,7 @@ class Oglaf(_BasicScraper):
class OverCompensating(_BasicScraper):
latestUrl = 'http://www.overcompensating.com/'
stripUrl = 'http://www.overcompensating.com/posts/%s.html'
stripUrl = latestUrl + 'posts/%s.html'
imageSearch = compile(r'<img src="(/comics/.+?)"')
prevSearch = compile(r'"><a href="(.+?)"[^>]+?>&nbsp;\<\- &nbsp;</a>')
help = 'Index format: yyyymmdd'

View file

@ -8,7 +8,7 @@ from ..helpers import bounceStarter, queryNamer
class PartiallyClips(_BasicScraper):
latestUrl = 'http://www.partiallyclips.com/'
stripUrl = 'http://www.partiallyclips.com/index.php?id=%s'
stripUrl = latestUrl + 'index.php?id=%s'
imageSearch = compile(r'"(http://www.partiallyclips.com/storage/.+?)"')
prevSearch = compile(r'"(index.php\?id=.+?)".+?prev')
help = 'Index format: nnnn'
@ -26,7 +26,7 @@ class PastelDefender(_BasicScraper):
class PebbleVersion(_BasicScraper):
latestUrl = 'http://www.pebbleversion.com/'
stripUrl = 'http://www.pebbleversion.com/Archives/Strip%s.html'
stripUrl = latestUrl + 'Archives/Strip%s.html'
imageSearch = compile(r'<img src="(ComicStrips/.+?|../ComicStrips/.+?)"')
prevSearch = compile(r'<a href="((?!.+?">First Comic)Archives/Strip.+?|(?=.+?">Previous Comic)(?!.+?">First Comic)Strip.+?)"')
help = 'Index format: n (unpadded)'
@ -34,7 +34,7 @@ class PebbleVersion(_BasicScraper):
class PennyAndAggie(_BasicScraper):
latestUrl = 'http://www.pennyandaggie.com/index.php'
stripUrl = 'http://www.pennyandaggie.com/index.php\?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'src=".+?(/comics/.+?)"')
prevSearch = compile(r"</a><a href='(index.php\?p=.+?)'.+?prev")
help = 'Index format: n (unpadded)'
@ -58,7 +58,7 @@ class PennyArcade(_BasicScraper):
class PeppermintSaga(_BasicScraper):
latestUrl = 'http://www.pepsaga.com/'
stripUrl = 'http://www.pepsaga.com/comics/%s/'
stripUrl = latestUrl + 'comics/%s/'
imageSearch = compile(r'src=.+?(http.+?/comics/.+?)"')
prevSearch = compile(r'First</a><a href="(http://www.pepsaga.com/comics/.+?/)"')
help = 'Index format: non'
@ -66,7 +66,7 @@ class PeppermintSaga(_BasicScraper):
class PerkiGoth(_BasicScraper):
latestUrl = 'http://mutt.purrsia.com/main.php'
stripUrl = 'http://mutt.purrsia.com/main.php?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r'<img.+?src="(comics/.+?)"')
prevSearch = compile(r'<a.+?href="(\?date=\d+/\d+/\d+)">Prev</a>')
help = 'Index format: mm/dd/yyyy'
@ -74,7 +74,7 @@ class PerkiGoth(_BasicScraper):
class Pixel(_BasicScraper):
latestUrl = 'http://www.chrisdlugosz.net/pixel/'
stripUrl = 'http://www.chrisdlugosz.net/pixel/%s.shtml'
stripUrl = latestUrl + '%s.shtml'
imageSearch = compile(r'<IMG SRC="(\d+\.png)" ALT=""><BR><BR>')
prevSearch = compile(r'<A HREF="(\d+\.shtml)"><IMG SRC="_prev.png" BORDER=0 ALT=""></A>')
help = 'Index format: nnn'
@ -92,7 +92,7 @@ class PiledHigherAndDeeper(_BasicScraper):
class Precocious(_BasicScraper):
latestUrl = 'http://www.precociouscomic.com/'
stripUrl = 'http://www.precociouscomic.com/comic.php?page=%s'
stripUrl = latestUrl + 'comic.php?page=%s'
imageSearch = compile(r'(archive/strips/.+?)"')
prevSearch = compile(r'First.+?(comic.php\?page=.+?)">Previous<')
help = 'Index format: n (unpadded)'
@ -138,7 +138,7 @@ earthbound = pensAndTales('Earthbound', 'http://earthbound.pensandtales.com/')
class ProperBarn(_BasicScraper):
latestUrl = 'http://www.nitrocosm.com/go/gag/'
stripUrl = 'http://www.nitrocosm.com/go/gag/%s/'
stripUrl = latestUrl + '%s/'
imageSearch = compile(r'<img class="gallery_display" src="([^"]+)"')
prevSearch = compile(r'<a href="([^"]+)"[^>]*><button type="submit" class="nav_btn_previous">')
help = 'Index format: nnn'
@ -147,7 +147,7 @@ class ProperBarn(_BasicScraper):
class PunksAndNerds(_BasicScraper):
latestUrl = 'http://www.punksandnerds.com/'
stripUrl = 'http://www.punksandnerds.com/?id=%s/'
stripUrl = latestUrl + '?id=%s/'
imageSearch = compile(r'<img src="(http://www.punksandnerds.com/img/comic/.+?)"')
prevSearch = compile(r'<td><a href="(.+?)"[^>]+?><img src="backcomic.gif"')
help = 'Index format: nnn'
@ -156,7 +156,7 @@ class PunksAndNerds(_BasicScraper):
class PunksAndNerdsOld(_BasicScraper):
latestUrl = 'http://original.punksandnerds.com/'
stripUrl = 'http://original.punksandnerds.com/d/%s.html'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(r' src="(/comics/.+?)"')
prevSearch = compile(r'><strong><a href="(.+?)"[^>]+?><img[^>]+?src="/previouscomic.gif">')
help = 'Index format: yyyymmdd'
@ -165,7 +165,7 @@ class PunksAndNerdsOld(_BasicScraper):
class PlanescapeSurvival(_BasicScraper):
latestUrl = 'http://planescapecomic.com/'
stripUrl = 'http://planescapecomic.com/%s.html'
stripUrl = latestUrl + '%s.html'
imageSearch = compile(r'src="(comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)"><img alt="Previous" ')
help = 'Index format: nnn'

View file

@ -7,7 +7,7 @@ from ..scraper import _BasicScraper
class QuestionableContent(_BasicScraper):
latestUrl = 'http://www.questionablecontent.net/'
stripUrl = 'http://www.questionablecontent.net/view.php?comic=%s'
stripUrl = latestUrl + 'view.php?comic=%s'
imageSearch = compile(r'/(comics/\d+\.png)"')
prevSearch = compile(r'<a href="(view.php\?comic=\d+)">Previous')
help = 'Index format: n (unpadded)'
@ -16,7 +16,7 @@ class QuestionableContent(_BasicScraper):
class Qwantz(_BasicScraper):
latestUrl = 'http://www.qwantz.com/index.php'
stripUrl = 'http://www.qwantz.com/index.php?comic=%s'
stripUrl = latestUrl + '?comic=%s'
imageSearch = compile(r'<img src="(http://www.qwantz.com/comics/.+?)" class="comic"')
prevSearch = compile(r'"><a href="(.+?)">&larr; previous</a>')
help = 'Index format: n'

View file

@ -8,7 +8,7 @@ from ..helpers import bounceStarter
class RadioactivePanda(_BasicScraper):
latestUrl = 'http://www.radioactivepanda.com/'
stripUrl = 'http://www.radioactivepanda.com/comic/%s'
stripUrl = latestUrl + 'comic/%s'
imageSearch = compile(r'<img src="(/Assets/.*?)".+?"comicimg"')
prevSearch = compile(r'<a href="(/comic/.*?)".+?previous_btn')
help = 'Index format: n (no padding)'
@ -24,7 +24,7 @@ class Rascals(_BasicScraper):
class RealLife(_BasicScraper):
latestUrl = 'http://www.reallifecomics.com/'
stripUrl = 'http://www.reallifecomics.com/achive/%s.html'
stripUrl = latestUrl + 'archive/%s.html'
imageSearch = compile(r'"(/comics/.+?)"')
prevSearch = compile(r'"(/archive/.+?)".+?nav_previous')
help = 'Index format: yymmdd'
@ -33,7 +33,7 @@ class RealLife(_BasicScraper):
class RedString(_BasicScraper):
latestUrl = 'http://www.redstring.strawberrycomics.com/'
stripUrl = 'http://www.redstring.strawberrycomics.com/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'<img src="(http://www.redstring.strawberrycomics.com/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)">Previous Comic</a>')
help = 'Index format: nnn'
@ -42,7 +42,7 @@ class RedString(_BasicScraper):
class Roza(_BasicScraper):
latestUrl = 'http://www.junglestudio.com/roza/index.php'
stripUrl = 'http://www.junglestudio.com/roza/index.php\?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r'<img src="(pages/.+?)"')
prevSearch = compile(r'<a href="(index.php\?date=.+?)">[^>].+?navtable_01.gif')
help = 'Index format: yyyy-mm-dd'
@ -61,7 +61,7 @@ class RedMeat(_BasicScraper):
class RunningWild(_BasicScraper):
latestUrl = 'http://runningwild.katbox.net/'
stripUrl = 'http://runningwild.katbox.net/index.php?strip_id=%s'
stripUrl = latestUrl + 'index.php?strip_id=%s'
imageSearch = compile(r'="(.+?strips/.+?)"')
prevSearch = compile(r'(index.php\?strip_id=.+?)".+?navigation_back')
help = 'Index format: n (unpadded)'

View file

@ -5,11 +5,12 @@ from os.path import splitext
from ..scraper import _BasicScraper
from ..helpers import bounceStarter, indirectStarter
from ..util import tagre
class SailorsunOrg(_BasicScraper):
latestUrl = 'http://www.sailorsun.org/'
stripUrl = 'http://www.sailorsun.org/browse.php?comicID=%s'
stripUrl = latestUrl + 'browse.php?comicID=%s'
imageSearch = compile(r'(comics/.+?)"')
prevSearch = compile(r'/(browse.php.+?)".+?/prev.gif')
help = 'Index format: n (unpadded)'
@ -27,7 +28,7 @@ class SamAndFuzzy(_BasicScraper):
class SarahZero(_BasicScraper):
latestUrl = 'http://www.sarahzero.com/'
stripUrl = 'http://www.sarahzero.com/sz_%s.html'
stripUrl = latestUrl + 'sz_%s.html'
imageSearch = compile(r'<img src="(z_(?:(?:spreads)|(?:temp)).+?)" alt=""')
prevSearch = compile(r'onmouseout="changeImages\(\'sz_05_nav\',\'z_site/sz_05_nav.gif\'\);return true" href="(sz_.+?)">')
help = 'Index format: nnnn'
@ -36,25 +37,48 @@ class SarahZero(_BasicScraper):
class ScaryGoRound(_BasicScraper):
latestUrl = 'http://www.scarygoround.com/'
stripUrl = 'http://www.scarygoround.com/?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r'<img src="(strips/\d{8}\..{3})"')
prevSearch = compile(r'f><a href="(.+?)"><img src="site-images/previous.png"')
help = 'Index format: n (unpadded)'
class SchlockMercenary(_BasicScraper):
latestUrl = 'http://www.schlockmercenary.com/'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(tagre("img", "src", r'(/comic[s|/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'
class SchoolBites(_BasicScraper):
latestUrl = 'http://www.schoolbites.net/'
stripUrl = 'http://www.schoolbites.net/d/%s.html'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(r'(/comics/.+?)"')
prevSearch = compile(r'first_day.+?(/d/.+?.html).+?/previous_day.gif')
help = 'Index format: yyyymmdd'
class Sheldon(_BasicScraper):
latestUrl = 'http://www.sheldoncomics.com/'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(tagre("img", "src", r'(/comic[s|/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'
class Shortpacked(_BasicScraper):
latestUrl = 'http://www.shortpacked.com/'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(tagre("img", "src", r'(/comic[s|/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'
class SinFest(_BasicScraper):
name = 'KeenSpot/SinFest'
latestUrl = 'http://www.sinfest.net/'
stripUrl = 'http://www.sinfest.net/archive_page.php?comicID=%s'
stripUrl = latestUrl + 'archive_page.php?comicID=%s'
imageSearch = compile(r'<img src=".+?(/comikaze/comics/.+?)"')
prevSearch = compile(r'(/archive_page.php\?comicID=.+?)".+?prev_a')
help = 'Index format: n (unpadded)'
@ -71,7 +95,7 @@ class SlightlyDamned(_BasicScraper):
class SluggyFreelance(_BasicScraper):
latestUrl = 'http://www.sluggy.com/'
stripUrl = 'http://www.sluggy.com/comics/archives/daily/%s'
stripUrl = latestUrl + 'comics/archives/daily/%s'
imageSearch = compile(r'<img src="(/images/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)"[^>]+?><span class="ui-icon ui-icon-seek-prev">')
help = 'Index format: yymmdd'
@ -90,12 +114,19 @@ class SodiumEyes(_BasicScraper):
class SpareParts(_BasicScraper):
latestUrl = 'http://www.sparepartscomics.com/'
stripUrl = 'http://www.sparepartscomics.com/comics/\\?date=s%'
stripUrl = latestUrl + 'comics/?date=%s'
imageSearch = compile(r'(/comics/2.+?)[" ]')
prevSearch = compile(r'(/comics/.+?|index.php\?.+?)".+?Prev')
help = 'Index format: yyyymmdd'
class StarslipCrisis(_BasicScraper):
latestUrl = 'http://www.starslipcrisis.com/'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(tagre("img", "src", r'(/comic[s|/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'
class Stubble(_BasicScraper):
latestUrl = 'http://www.stubblecomics.com/d/20051230.html'
@ -108,7 +139,7 @@ class Stubble(_BasicScraper):
class StrawberryDeathCake(_BasicScraper):
latestUrl = 'http://rainchildstudios.com/strawberry/'
stripUrl = 'http://rainchildstudios.com/strawberry/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'/(comics/.+?)"')
prevSearch = compile(r'strawberry/(\?p=.+?)".+?span class="prev"')
help = 'Index format: n (good luck)'
@ -117,7 +148,7 @@ class StrawberryDeathCake(_BasicScraper):
class SuburbanTribe(_BasicScraper):
latestUrl = 'http://www.pixelwhip.com/'
stripUrl = 'http://www.pixelwhip.com/?p%s'
stripUrl = latestUrl + '?p%s'
imageSearch = compile(r'<img src="(http://www.pixelwhip.com/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="([^"]+)" rel="prev">')
help = 'Index format: nnnn'
@ -135,7 +166,7 @@ class SuccubusJustice(_BasicScraper):
class Supafine(_BasicScraper):
latestUrl = 'http://www.supafine.com/comics/classic.php'
stripUrl = 'http://www.supafine.com/comics/classic.php?comicID=%s'
stripUrl = latestUrl + '?comicID=%s'
imageSearch = compile(r'<img src="(http://www.supafine.com/comics/.+?)"')
prevSearch = compile(r'<a href="(http://www.supafine.com/comics/classic.php\?.+?)"><img src="http://supafine.com/comikaze/images/previous.gif" ')
help = 'Index format: nnn'
@ -144,7 +175,7 @@ class Supafine(_BasicScraper):
class SomethingPositive(_BasicScraper):
latestUrl = 'http://www.somethingpositive.net/'
stripUrl = 'http://www.somethingpositive.net/sp%s.shtml'
stripUrl = latestUrl + 'sp%s.shtml'
imageSearch = compile(r'<img src="(/arch/sp\d+.\w{3,4}|/sp\d+.\w{3,4})"')
prevSearch = compile(r'<a \n?href="(sp\d{8}\.shtml)">(<font size=1\nface=".+?"\nSTYLE=".+?">Previous|<img src="images2/previous|<img src="images/previous.gif")', MULTILINE | IGNORECASE)
help = 'Index format: mmddyyyy'
@ -296,7 +327,7 @@ globals().update(snafuComics())
class SosiaalisestiRajoittuneet(_BasicScraper):
latestUrl = 'http://sosiaalisestirajoittuneet.fi/index_nocomment.php'
stripUrl = 'http://sosiaalisestirajoittuneet.fi/index_nocomment.php?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r'<img src="(strips/web/\d+.jpg)" alt=".*?" />')
prevSearch = compile(r'<a href="(index_nocomment\.php\?date=\d+)"><img\s+src="images/active_edellinen\.gif"', MULTILINE)
@ -304,7 +335,7 @@ class SosiaalisestiRajoittuneet(_BasicScraper):
class StrangeCandy(_BasicScraper):
latestUrl = 'http://www.strangecandy.net/'
stripUrl = 'http://www.strangecandy.net/d/%s.html'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(r'src="(http://www.strangecandy.net/comics/\d{8}.\w{1,4})"')
prevSearch = compile(r'<a href="(http://www.strangecandy.net/d/\d{8}.html)"><img[^>]+?src="http://www.strangecandy.net/images/previous_day.gif"')
help = 'Index format: yyyyddmm'
@ -313,7 +344,7 @@ class StrangeCandy(_BasicScraper):
class SMBC(_BasicScraper):
latestUrl = 'http://www.smbc-comics.com/'
stripUrl = 'http://www.smbc-comics.com/index.php?db=comics&id=%s'
stripUrl = latestUrl + 'index.php?db=comics&id=%s'
imageSearch = compile(r'<img src=\'(.+?\d{8}.\w{1,4})\'>')
prevSearch = compile(r'131,13,216,84"\n\s+href="(.+?)#comic"\n>', MULTILINE)
help = 'Index format: nnnn'
@ -322,7 +353,7 @@ class SMBC(_BasicScraper):
class SomethingLikeLife(_BasicScraper):
latestUrl = 'http://www.pulledpunches.com/'
stripUrl = 'http://www.pulledpunches.com/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'<img src="(http://www.pulledpunches.com/comics/[^"]*)"')
prevSearch = compile(r'</a> <a href="(http://www.pulledpunches.com/\?p=[^"]*)"><img src="back1.gif"')
help = 'Index format: nn'
@ -331,7 +362,7 @@ class SomethingLikeLife(_BasicScraper):
class StickEmUpComics(_BasicScraper):
latestUrl = 'http://stickemupcomics.com/'
stripUrl = 'http://stickemupcomics.com/%s'
stripUrl = latestUrl + '%s'
imageSearch = compile(r'<img src="(http://stickemupcomics.com/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)"><span class="prev">')
help = 'Index format: yyyy/mm/dd/stripname'
@ -340,7 +371,7 @@ class StickEmUpComics(_BasicScraper):
class SexDemonBag(_BasicScraper):
latestUrl = 'http://www.sexdemonbag.com/'
stripUrl = 'http://www.sexdemonbag.com/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'<img src="(http://www.sexdemonbag.com/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(.+?)">')
help = 'Index format: nnn'

View file

@ -8,7 +8,7 @@ from ..helpers import indirectStarter
class TalesOfPylea(_BasicScraper):
latestUrl = 'http://talesofpylea.com/'
stripUrl = 'http://talesofpylea.com/%s/'
stripUrl = latestUrl + '%s/'
imageSearch = compile(r'<img src="(istrip_files/strips/.+?)"')
prevSearch = compile(r' <a href="(.+?)">Back</a>')
help = 'Index format: nnn'
@ -17,7 +17,7 @@ class TalesOfPylea(_BasicScraper):
class TheNoob(_BasicScraper):
latestUrl = 'http://www.thenoobcomic.com/index.php'
stripUrl = 'http://www.thenoobcomic.com/index.php?pos=%'
stripUrl = latestUrl + '?pos=%s'
imageSearch = compile(r'<img src="(/headquarters/comics/.+?)"')
prevSearch = compile(r'<a class="comic_nav_previous_button" href="(.+?)"></a>')
help = 'Index format: nnnn'
@ -26,7 +26,7 @@ class TheNoob(_BasicScraper):
class TheOrderOfTheStick(_BasicScraper):
latestUrl = 'http://www.giantitp.com/'
stripUrl = 'http://www.giantitp.com/comics/images/%s'
stripUrl = latestUrl + 'comics/images/%s'
imageSearch = compile(r'<IMG src="(/comics/images/.+?)">')
prevSearch = compile(r'<A href="(/comics/oots\d{4}\.html)"><IMG src="/Images/redesign/ComicNav_Back.gif"')
help = 'Index format: n (unpadded)'
@ -45,7 +45,7 @@ class TheParkingLotIsFull(_BasicScraper):
class TheWotch(_BasicScraper):
latestUrl = 'http://www.thewotch.com/'
stripUrl = 'http://www.thewotch.com/?epDate=%s'
stripUrl = latestUrl + '?epDate=%s'
imageSearch = compile(r"<img.+?src='(comics/.+?)'")
prevSearch = compile(r"<link rel='Previous' href='(\?epDate=\d+-\d+-\d+)'")
help = 'Index format: yyyy-mm-dd'
@ -62,7 +62,7 @@ class Thorn(_BasicScraper):
class TwoTwoOneFour(_BasicScraper):
latestUrl = 'http://www.nitrocosm.com/go/2214_classic/'
stripUrl = 'http://www.nitrocosm.com/go/2214_classic/%s/'
stripUrl = latestUrl + '%s/'
imageSearch = compile(r'<img class="gallery_display" src="([^"]+)"')
prevSearch = compile(r'<a href="([^"]+)"[^>]*><button type="submit" class="nav_btn_previous">')
help = 'Index format: n (unpadded)'
@ -71,7 +71,7 @@ class TwoTwoOneFour(_BasicScraper):
class TheWhiteboard(_BasicScraper):
latestUrl = 'http://www.the-whiteboard.com/'
stripUrl = 'http://www.the-whiteboard.com/auto%s.html'
stripUrl = latestUrl + 'auto%s.html'
imageSearch = compile(r'<img SRC="(autotwb\d{1,4}.+?|autowb\d{1,4}.+?)">', IGNORECASE)
prevSearch = compile(r'&nbsp<a href="(.+?)">previous</a>', IGNORECASE)
help = 'Index format: twb or wb + n, e.g. twb1000'
@ -119,7 +119,7 @@ class MalloryChan(_TheFallenAngel):
class HMHigh(_BasicScraper):
name = 'TheFallenAngel/HMHigh'
latestUrl = 'http://www.thefallenangel.co.uk/hmhigh/'
stripUrl = 'http://www.thefallenangel.co.uk/hmhigh/?id=%s'
stripUrl = latestUrl + '?id=%s'
imageSearch = compile(r'<img src="(http://www.thefallenangel.co.uk/hmhigh/img/comic/.+?)"')
prevSearch = compile(r' <a href="(http://www.thefallenangel.co.uk/.+?)" title=".+?">Prev</a>')
help = 'Index format: nnn'
@ -128,7 +128,7 @@ class HMHigh(_BasicScraper):
class TheOuterQuarter(_BasicScraper):
latestUrl = 'http://theouterquarter.com/'
stripUrl = 'http://theouterquarter.com/comic/%s'
stripUrl = latestUrl + 'comic/%s'
imageSearch = compile(r'<img src="(http://theouterquarter.com/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="([^"]+)" rel="prev">')
help = 'Index format: nnn'
@ -137,7 +137,7 @@ class TheOuterQuarter(_BasicScraper):
class TheHorrificAdventuresOfFranky(_BasicScraper):
latestUrl = 'http://www.boneyardfranky.com/'
stripUrl = 'http://www.boneyardfranky.com/?p=%s'
stripUrl = latestUrl + '?p=%s'
imageSearch = compile(r'<img src="(http://www.boneyardfranky.com/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(.+?)">')
help = 'Index format: nnn'

View file

@ -4,12 +4,20 @@ from re import compile, IGNORECASE
from ..scraper import _BasicScraper
from ..helpers import bounceStarter, indirectStarter
from ..util import getQueryParams
from ..util import getQueryParams, tagre
class UglyHill(_BasicScraper):
latestUrl = 'http://www.uglyhill.com/'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(tagre("img", "src", r'(/comic[s|/][^"]+)'))
prevSearch = compile(tagre("a", "href", r'[^"]*(/d/\d+\.s?html)')+r"[^>]+/images/(?:nav_02|previous_day)\.gif")
help = 'Index format: yyyymmdd'
class UnderPower(_BasicScraper):
latestUrl = 'http://underpower.non-essential.com/'
stripUrl = 'http://underpower.non-essential.com/index.php?comic=%s'
stripUrl = latestUrl + 'index.php?comic=%s'
imageSearch = compile(r'<img src="(comics/\d{8}\..+?)"')
prevSearch = compile(r'<a href="(/index.php\?comic=\d{8})"><img src="images/previous-comic\.gif"')
help = 'Index format: yyyymmdd'
@ -46,7 +54,7 @@ class UserFriendly(_BasicScraper):
class UndeadFriend(_BasicScraper):
latestUrl = 'http://www.undeadfriend.com/'
stripUrl = 'http://www.undeadfriend.com/d/%s.html'
stripUrl = latestUrl + 'd/%s.html'
imageSearch = compile(r'src="(http://www\.undeadfriend\.com/comics/.+?)"', IGNORECASE)
prevSearch = compile(r'<a.+?href="(http://www\.undeadfriend\.com/d/\d+?\.html)"><img border="0" name="previous_day" alt="Previous comic" src="http://www\.undeadfriend\.com/images/previous_day\.jpg', IGNORECASE)
help = 'Index format: yyyymmdd'

View file

@ -1,17 +1,17 @@
# -*- coding: iso-8859-1 -*-
# Copyright (C) 2004-2005 Tristan Seligmann and Jonathan Jacobs
from re import compile, IGNORECASE, sub
from re import compile, sub
from ..scraper import _BasicScraper
from ..util import fetchUrl
from ..util import fetchUrl, tagre
class _UClickScraper(_BasicScraper):
homepage = 'http://content.uclick.com/a2z.html'
baseUrl = 'http://www.uclick.com/client/zzz/%s/'
stripUrl = property(lambda self: self.latestUrl + '%s/')
imageSearch = compile(r'<img[^>]+src="(http://synd.imgsrv.uclick.com/comics/\w+/\d{4}/[^"]+\.gif)"', IGNORECASE)
prevSearch = compile(r'<a href="(/client/zzz/\w+/\d{4}/\d{2}/\d{2}/)">Previous date', IGNORECASE)
imageSearch = compile(tagre("img", "src", r'(http://synd\.imgsrv\.uclick\.com/comics/\w+/\d{4}/[^"]+\.gif)'))
prevSearch = compile(tagre("a", "href", r'(/client/zzz/\w+/\d{4}/\d{2}/\d{2}/)') + 'Previous date')
help = 'Index format: yyyy/mm/dd'
@classmethod
@ -20,13 +20,11 @@ class _UClickScraper(_BasicScraper):
@classmethod
def fetchSubmodules(cls):
exclusions = (
'index',
)
exclusions = ('index',)
# XXX refactor this mess
submoduleSearch = compile(r'(<A HREF="http://content.uclick.com/content/\w+.html">[^>]+?</a>)', IGNORECASE)
partsMatch = compile(r'<A HREF="http://content.uclick.com/content/(\w+?).html">([^>]+?)</a>', IGNORECASE)
submoduleSearch = compile(tagre("a", "href", r'(http://content\.uclick\.com/content/\w+\.html)'))
partsMatch = compile(tagre("a", "href", r'http://content\.uclick\.com/content/(\w+?)\.html'))
matches = fetchManyMatches(cls.homepage, (submoduleSearch,))[0]
possibles = [partsMatch.match(match).groups() for match in matches]
@ -37,7 +35,8 @@ class _UClickScraper(_BasicScraper):
def fetchSubmodule(module):
try:
return fetchUrl(cls.baseUrl % module, cls.imageSearch)
except:
except Exception:
# XXX log error
return False
return [normalizeName(name) for part, name in possibles if part not in exclusions and fetchSubmodule(part)]

View file

@ -16,13 +16,11 @@ class _VGCats(_BasicScraper):
return self.latestUrl + '?strip_id=%s'
class Super(_VGCats):
name = 'VGCats/Super'
latestUrl = 'http://www.vgcats.com/super/'
class Adventure(_VGCats):
name = 'VGCats/Adventure'
latestUrl = 'http://www.vgcats.com/ffxi/'
@ -31,7 +29,7 @@ class Adventure(_VGCats):
class ViiviJaWagner(_BasicScraper):
latestUrl = 'http://www.hs.fi/viivijawagner/'
stripUrl = 'http://www.hs.fi/viivijawagner/%s'
imageSearch = compile(r'<img id="strip\d+"\s+src="([^"]+)"', IGNORECASE)
prevSearch = compile(r'<a href="(.+?)"[^>]+?>\nEdellinen&nbsp;\n<img src="http://www.hs.fi/static/hs/img/viivitaakse.gif"', MULTILINE | IGNORECASE)
# XXX ?
help = 'Index format: shrugs!'

View file

@ -8,7 +8,7 @@ from ..helpers import queryNamer, bounceStarter
class WayfarersMoon(_BasicScraper):
latestUrl = 'http://www.wayfarersmoon.com/'
stripUrl = 'http://www.wayfarersmoon.com/index.php\?page=%s'
stripUrl = latestUrl + 'index.php?page=%s'
imageSearch = compile(r'<img src="(/admin.+?)"')
prevSearch = compile(r'<a href="(.+?)".+?btn_back.gif')
help = 'Index format: nn'
@ -42,7 +42,7 @@ class WhyTheLongFace(_BasicScraper):
class Wigu(_BasicScraper):
latestUrl = 'http://www.wigu.com/wigu/'
stripUrl = 'http://www.wigu.com/wigu/?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r'<img src="(strips/\d{8}\..+?)" alt=""')
prevSearch = compile(r'<a href="(.+?)"[^>]+?>< PREV COMIC</a> ')
help = 'Index format: yyyymmdd'
@ -51,7 +51,7 @@ class Wigu(_BasicScraper):
class WiguTV(_BasicScraper):
latestUrl = 'http://jjrowland.com/'
stripUrl = 'http://jjrowland.com/archive/%s.html'
stripUrl = latestUrl + 'archive/%s.html'
imageSearch = compile(r'"(/comics/.+?)"')
prevSearch = compile(r'<a href="(/archive/.+?)"[^>]+?>&nbsp;')
help = 'Index format: yyyymmdd'
@ -60,7 +60,7 @@ class WiguTV(_BasicScraper):
class WotNow(_BasicScraper):
latestUrl = 'http://shadowburn.binmode.com/wotnow/'
stripUrl = 'http://shadowburn.binmode.com/wotnow/comic.php?comic_id=%s'
stripUrl = latestUrl + 'comic.php?comic_id=%s'
imageSearch = compile(r'<IMG SRC="(comics/.+?)"')
prevSearch = compile(r'<A HREF="(.+?)"><IMG SRC="images/b_prev.gif" ')
help = 'Index format: n (unpadded)'
@ -69,15 +69,14 @@ class WotNow(_BasicScraper):
class WorldOfWarcraftEh(_BasicScraper):
latestUrl = 'http://woweh.com/'
stripUrl = 'http://woweh.com/?p='
stripUrl = None
imageSearch = compile(r'http://woweh.com/(comics/.+?)"')
prevSearch = compile(r'woweh.com/(\?p=.+:?)".+:?="prev')
help = 'Index format: non'
class Wulffmorgenthaler(_BasicScraper):
latestUrl = 'http://www.wulffmorgenthaler.com/'
stripUrl = 'http://www.wulffmorgenthaler.com/Default.aspx?id=%s'
stripUrl = latestUrl + 'Default.aspx?id=%s'
imageSearch = compile(r'img id="ctl00_content_Strip1_imgStrip".+?class="strip" src="(striphandler\.ashx\?stripid=[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"')
prevSearch = compile(r'<a href="(/default\.aspx\?id=[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})" id="ctl00_content_Strip1_aPrev">')
help = 'Index format: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (GUID)'
@ -129,7 +128,7 @@ class WhiteNoise(_BasicScraper):
class WapsiSquare(_BasicScraper):
latestUrl = 'http://wapsisquare.com/'
stripUrl = 'http://wapsisquare.com/comic/%s'
stripUrl = latestUrl + 'comic/%s'
imageSearch = compile(r'<img src="(http://wapsisquare.com/comics/.+?)"')
prevSearch = compile(r'<a href="(.+?)"[^>]+?>Previous</a>')
help = 'Index format: strip-name'
@ -138,7 +137,7 @@ class WapsiSquare(_BasicScraper):
class WrongWay(_BasicScraper):
latestUrl = 'http://www.wrongwaycomics.com/'
stripUrl = 'http://www.wrongwaycomics.com/%s.html'
stripUrl = latestUrl + '%s.html'
imageSearch = compile(r'<img src="(comics/.+?)"')
prevSearch = compile(r' <a class="comicNav" href="(.+?)" onmouseover="previousLinkIn\(\)"')
help = 'Index format: nnn'
@ -147,7 +146,6 @@ class WrongWay(_BasicScraper):
class WeCanSleepTomorrow(_BasicScraper):
latestUrl = 'http://wecansleeptomorrow.com/'
stripUrl = 'http://wecansleeptomorrow.com/2009/12/07/smothered/'
imageSearch = compile(r'<img src="(http://wecansleeptomorrow.com/comics/.+?)"')
prevSearch = compile(r'<div class="nav-previous"><a href="(.+?)">')
help = 'Index format: yyyy/mm/dd/stripname'
@ -208,8 +206,8 @@ class Stellar(_WLP):
class Wondermark(_BasicScraper):
latestUrl = 'http://wondermark.com'
stripUrl = 'http://wondermark.com/%s/'
latestUrl = 'http://wondermark.com/'
stripUrl = latestUrl + '%s/'
imageSearch = compile(r'<img src="(http://wondermark.com/c/.+?)"')
prevSearch = compile(r'<a href="(.+?)" rel="prev">')
help = 'Index format: nnn'

View file

@ -23,7 +23,7 @@ class xkcd(_BasicScraper):
class xkcdSpanish(_BasicScraper):
latestUrl = 'http://es.xkcd.com/xkcd-es/'
stripUrl = 'http://es.xkcd.com/xkcd-es/strips/%s/'
stripUrl = latestUrl + 'strips/%s/'
imageSearch = compile(r'src="(/site_media/strips/.+?)"')
prevSearch = compile(r'<a rel="prev" href="(http://es.xkcd.com/xkcd-es/strips/.+?)">Anterior</a>')
help = 'Index format: stripname'

View file

@ -7,7 +7,7 @@ from ..scraper import _BasicScraper
class YAFGC(_BasicScraper):
latestUrl = 'http://yafgc.shipsinker.com/'
stripUrl = 'http://yafgc.shipsinker.com/index.php?strip_id=%s'
stripUrl = latestUrl + 'index.php?strip_id=%s'
imageSearch = compile(r'(istrip_.+?)"')
prevSearch = compile(r'(/.+?)">\r\n.+?prev.gif', MULTILINE)
help = 'Index format: n'
@ -23,7 +23,7 @@ class YouSayItFirst(_BasicScraper):
class Yirmumah(_BasicScraper):
latestUrl = 'http://yirmumah.net/archives.php'
stripUrl = 'http://yirmumah.net/archives.php?date=%s'
stripUrl = latestUrl + '?date=%s'
imageSearch = compile(r'<img src="(strips/\d{8}\..*?)"')
prevSearch = compile(r'<a href="(\?date=\d{8})">.*Previous')
help = 'Index format: yyyymmdd'

@ -14,7 +14,7 @@ class Zapiro(_BasicScraper):
class ZombieHunters(_BasicScraper):
latestUrl = 'http://www.thezombiehunters.com/'
stripUrl = 'http://www.thezombiehunters.com/index.php?strip_id=%s'
stripUrl = latestUrl + 'index.php?strip_id=%s'
imageSearch = compile(r'"(.+?strips/.+?)"')
prevSearch = compile(r'</a><a href="(.+?)"><img id="prevcomic" ')
help = 'Index format: n(unpadded)'
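
Note on the plugin hunks above: the recurring change derives stripUrl from latestUrl instead of repeating the host name. A minimal sketch of that pattern follows. ExampleComic, its URL, and the helper strip_url_for are hypothetical and not part of dosage. The concatenation only works when the base URL ends where the suffix expects it to, which is presumably why the Wondermark hunk also adds a trailing slash to latestUrl.

```python
class ExampleComic(object):
    # Hypothetical plugin; the URL is a placeholder, not a real comic site.
    latestUrl = 'http://comics.example.com/'
    # Names bound earlier in a class body are visible to later statements
    # in the same body, so the base URL is written only once.
    stripUrl = latestUrl + 'index.php?strip_id=%s'


def strip_url_for(scraper, index):
    """Fill the '%s' placeholder of stripUrl with a strip index."""
    return scraper.stripUrl % index
```

Calling strip_url_for(ExampleComic, 42) fills the placeholder with 42, in the same way the 'Index format' help strings describe for each plugin.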

@ -1,25 +1,11 @@
# -*- coding: iso-8859-1 -*-
# Copyright (C) 2004-2005 Tristan Seligmann and Jonathan Jacobs
# Copyright (C) 2012 Bastian Kleineidam
import os
from . import loader
from .util import fetchUrls
from .comic import ComicStrip
from .output import out
disabled = []
def init_disabled():
filename = os.path.expanduser('~/.dosage/disabled')
if os.path.isfile(filename):
with open(filename) as f:
for line in f:
if line and not line.startswith('#'):
disabled.append(line.rstrip())
init_disabled()
class DisabledComicError(ValueError):
pass
class _BasicScraper(object):
'''Base class with scrape functions for comics.

@ -1,7 +1,7 @@
# -*- coding: iso-8859-1 -*-
# Copyright (C) 2004-2005 Tristan Seligmann and Jonathan Jacobs
# Copyright (C) 2012 Bastian Kleineidam
from __future__ import division
from __future__ import division, print_function
import urllib2, urlparse
import sys
@ -185,8 +185,8 @@ def urlopen(url, referrer=None, retries=3, retry_wait_seconds=5):
while True:
try:
return urllib2.urlopen(req)
except IOError, msg:
out.write('URL retrieval failed: %s' % msg)
except IOError as msg:
out.write('URL retrieval of %s failed: %s' % (url, msg))
out.write('waiting %d seconds and retrying (%d)' % (retry_wait_seconds, tries), 2)
time.sleep(retry_wait_seconds)
tries += 1
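
Note on the urlopen hunk above: only the middle of the retry loop is visible here, so how the function gives up is not shown. The sketch below is one plausible shape under that assumption: it re-raises the last IOError once the retry budget is spent. call_with_retries and its sleep parameter are hypothetical names introduced for illustration.

```python
import sys
import time


def call_with_retries(func, retries=3, retry_wait_seconds=5, sleep=time.sleep):
    """Call func(); on IOError wait and retry, at most `retries` more times.

    Assumption: the give-up behaviour (re-raising) is a guess, since the
    hunk does not show it.
    """
    tries = 0
    while True:
        try:
            return func()
        except IOError as msg:
            if tries >= retries:
                # Retry budget spent: let the caller see the last error.
                raise
            sys.stderr.write('call failed: %s\n' % msg)
            sys.stderr.write('waiting %d seconds and retrying (%d)\n'
                             % (retry_wait_seconds, tries))
            sleep(retry_wait_seconds)
            tries += 1
```

Passing sleep as a parameter is a design choice of the sketch so that tests can skip the real wait.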
@ -251,8 +251,8 @@ def getQueryParams(url):
def internal_error(out=sys.stderr, etype=None, evalue=None, tb=None):
"""Print internal error message (output defaults to stderr)."""
print >> out, os.linesep
print >> out, """********** Oops, I did it again. *************
print(os.linesep, file=out)
print("""********** Oops, I did it again. *************
You have found an internal error in %(app)s. Please write a bug report
at %(url)s and include the following information:
@ -262,7 +262,7 @@ at %(url)s and include the following information:
Not disclosing some of the information above due to privacy reasons is ok.
I will try to help you nonetheless, but you have to give me something
I can work with ;) .
""" % dict(app=AppName, url=SupportUrl)
""" % dict(app=AppName, url=SupportUrl), file=out)
if etype is None:
etype = sys.exc_info()[0]
if evalue is None:
@ -274,15 +274,15 @@ I can work with ;) .
print_app_info(out=out)
print_proxy_info(out=out)
print_locale_info(out=out)
print >> out, os.linesep, \
"******** %s internal error, over and out ********" % AppName
print(os.linesep,
"******** %s internal error, over and out ********" % AppName, file=out)
def print_env_info(key, out=sys.stderr):
"""If given environment key is defined, print it out."""
value = os.getenv(key)
if value is not None:
print >> out, key, "=", repr(value)
print(key, "=", repr(value), file=out)
def print_proxy_info(out=sys.stderr):
@ -298,12 +298,12 @@ def print_locale_info(out=sys.stderr):
def print_app_info(out=sys.stderr):
"""Print system and application info (output defaults to stderr)."""
print >> out, "System info:"
print >> out, App
print >> out, "Python %(version)s on %(platform)s" % \
{"version": sys.version, "platform": sys.platform}
print("System info:", file=out)
print(App, file=out)
print("Python %(version)s on %(platform)s" %
{"version": sys.version, "platform": sys.platform}, file=out)
stime = strtime(time.time())
print >> out, "Local time:", stime
print("Local time:", stime, file=out)
def strtime(t):
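
Note on the print conversions above: each `print >> out, ...` statement becomes a print() call with the stream passed as the file= keyword, enabled on Python 2.7 by the print_function future import. A minimal sketch, written to run on Python 3 where that import is a no-op; print_env_value is a hypothetical helper modelled on print_env_info, and the proxy URL is a placeholder.

```python
from __future__ import print_function

import io
import sys


def print_env_value(key, value, out=sys.stderr):
    """Write 'key = repr(value)' to the given stream."""
    # Python 2 statement form: print >> out, key, "=", repr(value)
    print(key, "=", repr(value), file=out)


# Capture the output in memory instead of writing to stderr.
# (On Python 2.7, io.StringIO accepts only unicode text, so
# StringIO.StringIO would be the buffer to use there.)
buf = io.StringIO()
print_env_value("http_proxy", "http://proxy.example.com:8080", out=buf)
captured = buf.getvalue()
```

As with the old statement form, multiple arguments are separated by single spaces and a newline is appended.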

@ -35,20 +35,20 @@ class _ComicTester(TestCase):
urlmatch = "^%s$" % urlmatch
ro = re.compile(urlmatch)
mo = ro.search(strip.stripUrl)
self.check(mo is not None, 'strip URL %r does not match %s' % (strip.stripUrl, urlmatch))
self.check(mo is not None, 'strip URL %r does not match stripUrl pattern %s' % (strip.stripUrl, urlmatch))
else:
empty += 1
num += 1
self.check(num >= 4, 'traversal failed after %d strips.' % num)
self.check(empty <= 1, 'failed to find images on %d pages.' % empty)
self.check(num >= 4, 'traversal failed after %d strips, check the prevSearch pattern.' % num)
self.check(empty <= 1, 'failed to find images on %d pages, check the imageSearch pattern.' % empty)
def save(self, image):
# create a temporary directory
tmpdir = tempfile.mkdtemp()
try:
image.save(tmpdir)
except Exception, msg:
self.check(False, 'could not save to %s: %s' % (tmpdir, msg))
except Exception as msg:
self.check(False, 'could not save %s to %s: %s' % (image.url, tmpdir, msg))
finally:
shutil.rmtree(tmpdir)
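
Note on the save() hunk above: the temporary directory is removed in the finally clause whether or not saving fails, and the new `except Exception as msg` spelling is valid on both Python 2.7 and Python 3. A minimal sketch of the same pattern outside the test class; FakeImage and try_save are hypothetical stand-ins, and returning a message replaces the self.check() call used in the real test.

```python
import os
import shutil
import tempfile


class FakeImage(object):
    """Hypothetical stand-in for an image object with url and save()."""
    url = 'http://comics.example.com/strips/1.png'

    def save(self, basepath):
        with open(os.path.join(basepath, '1.png'), 'wb') as f:
            f.write(b'not a real image')


def try_save(image):
    """Save into a throwaway directory; return an error message or None."""
    tmpdir = tempfile.mkdtemp()
    try:
        image.save(tmpdir)
    except Exception as msg:
        return 'could not save %s to %s: %s' % (image.url, tmpdir, msg)
    finally:
        # Runs on success and on failure, so the directory never leaks.
        shutil.rmtree(tmpdir)
    return None
```

Including image.url in the message, as the commit does, tells the reader which download failed rather than only where it was being written.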