A modified version of https://github.com/webcomics/dosage

Dosage

Dosage is a comic strip downloader and archiver.

Introduction

Dosage is designed to keep a local copy of specific webcomics and other picture-based content such as Picture of the Day sites. With the dosage command-line script you can get the latest strip of a webcomic, catch up to the last strip downloaded, or download a strip for a particular date or index (if the webcomic's site layout makes this possible).

Notice

This software is in no way intended to publicly "broadcast" comic strips; it is purely for personal use. Please be aware that by making downloaded strips publicly available (without the explicit permission of the author) you may be infringing upon various copyrights.

Additionally, Dosage respects the robots.txt exclusion protocol. This makes sure no content is accessed in an automatic way without consent by the publishers.

If you are a publisher of comics and want Dosage to access your files, add the following entry to your robots.txt file:

User-agent: dosage
Allow: *
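
As a rough illustration of how such an entry is evaluated, the following sketch uses the robots.txt parser from the Python 3 standard library. It is not Dosage's own code. The surrounding robots.txt rules, the helper name may_fetch, the example URL and the "otherbot" user agent are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a publisher that blocks all crawlers by
# default and adds the entry from the text above for Dosage.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: dosage
Allow: *
"""


def may_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url.

    may_fetch is a helper invented for this example.
    """
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no
    # network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://comics.example/strips/today.png"  # placeholder URL
    print("dosage allowed:", may_fetch("dosage", url))
    print("otherbot allowed:", may_fetch("otherbot", url))
```

With these assumed rules, the dosage user agent is permitted while a crawler without its own entry falls back to the general Disallow rule.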

Usage

List available comics (ca. 3000 at the moment):

$ dosage --list

Get the latest strip of a comic, for example CalvinAndHobbes, and save it in the "Comics" directory:

$ dosage CalvinAndHobbes

If you have already downloaded several comics and want to get the latest strips of all of them:

$ dosage --continue @

On Unix, xargs can download several comic strips in parallel, for example using up to 4 processes:

$ cd Comics && find . -type d | xargs -n1 -P4 dosage --basedir . --verbose

For advanced options and features, run dosage --help or see the dosage manual page.

Adult content

Some comics contain adult content and require age confirmation. These comics can only be downloaded by using the --adult option, which confirms that you are old enough to view them.

Installation

The most convenient method is to use pip:

pip install requests dosage

If you install Dosage from source, the dosage script can be run directly with ./dosage. Alternatively, you can install Dosage using python distutils by invoking setup.py in the root of the distribution. For example:

python setup.py install

or if you do not have root permissions:

python setup.py install --home=$HOME

Dependencies

Dosage requires Python version 2.7 or higher, which can be downloaded from http://www.python.org/

It also uses the python-requests module, which can be downloaded from http://docs.python-requests.org/en/latest/

Technical Description

Dosage is written in Python and relies on regular expressions to do most of the grunt work.

For each comic Dosage has a plugin module class, found in the "plugins" subdirectory of the dosagelib directory. Each module class is a subclass of the _BasicScraper class and specifies where to download its comic images. Some comic syndicates (GoComics for example) have a standard layout for all comics. For such cases a generator function creates all the _BasicScraper subclasses from a given list of comic strips.
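
The two ideas above, regular-expression scraping and generating scraper classes for a syndicate with a shared layout, can be sketched as follows. This is a minimal illustration under stated assumptions, not the dosagelib API: BasicScraperSketch stands in for _BasicScraper, and the attribute names, the make_syndicate_scrapers function, the HTML layout and the example.com-style URLs are all hypothetical.

```python
import re


class BasicScraperSketch:
    """Stand-in base class for this example; NOT dosagelib's _BasicScraper."""

    # Hypothetical attributes that each comic class overrides.
    url = None            # start page of the comic
    image_search = None   # regex whose group 1 is the strip image URL
    prev_search = None    # regex whose group 1 is the previous-page URL

    @classmethod
    def parse_page(cls, html):
        """Return (image_url, prev_url); either is None if not found."""
        image = cls.image_search.search(html)
        prev = cls.prev_search.search(html)
        return (image.group(1) if image else None,
                prev.group(1) if prev else None)


def make_syndicate_scrapers(names, base_url="http://syndicate.example/"):
    """Create one scraper subclass per comic name for a shared site layout.

    The patterns assume a fixed attribute order in the HTML, which keeps
    the example short but would be fragile on a real site.
    """
    scrapers = {}
    for name in names:
        # type() builds a new subclass at runtime from a name,
        # a tuple of base classes and a dict of class attributes.
        scrapers[name] = type(name, (BasicScraperSketch,), {
            "url": base_url + name.lower(),
            "image_search": re.compile(
                r'<img[^>]+class="strip"[^>]+src="([^"]+)"'),
            "prev_search": re.compile(
                r'<a[^>]+rel="prev"[^>]+href="([^"]+)"'),
        })
    return scrapers


if __name__ == "__main__":
    scrapers = make_syndicate_scrapers(["ExampleComicOne", "ExampleComicTwo"])
    page = ('<a rel="prev" href="/examplecomic/41">Previous</a>'
            '<img class="strip" src="/img/42.png">')
    print(scrapers["ExampleComicOne"].url)
    print(scrapers["ExampleComicOne"].parse_page(page))
```

Generating the classes from a list keeps the per-comic code down to a name, at the cost of assuming that every comic on the site really does share the same page layout.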

Extending Dosage

In order to add a new comic, a new module class has to be created in one of the *.py files in the dosagelib/plugins subdirectory. Look at the existing module classes for examples.

Reporting Bugs

You can submit bug reports, patches or requests at the GitHub issue tracker at https://github.com/wummel/dosage/issues

Dosage currently supports a large number of comics and that number grows on a regular basis. If you feel that there are comics that Dosage does not currently support but should support, please feel free to request them.

Test suite status

Dosage has extensive unit tests to ensure code quality. Travis CI is used for continuous build and test integration.
