From bc0740c9909bbb8be881bb2fd9c8a9bacf04894b Mon Sep 17 00:00:00 2001
From: Bastian Kleineidam
Date: Wed, 28 Aug 2013 20:49:53 +0200
Subject: [PATCH] Add new comic doc [ci skip]

---
 doc/adding_new_comics.txt | 90 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 90 insertions(+)
 create mode 100644 doc/adding_new_comics.txt

diff --git a/doc/adding_new_comics.txt b/doc/adding_new_comics.txt
new file mode 100644
index 000000000..886fee736
--- /dev/null
+++ b/doc/adding_new_comics.txt
@@ -0,0 +1,90 @@
+How to add a new comic to Dosage
+=================================
+
+To add a new comic, add a new class in one of the *.py files
+in the dosagelib/plugins module.
+
+The files in dosagelib/plugins and the classes inside those files are
+sorted alphabetically. Add your comic to the appropriate file.
+For example, if the comic name is "Super duper comic", the new class
+should be added to dosagelib/plugins/s.py.
+
+Here is a complete example which is explained in detail below.
+
+```
+class SuperDuperComic(_BasicScraper):
+    description = u'A super duper comic by Mr Smith!'
+    url = 'http://superdupercomic.com/'
+    rurl = escape(url)
+    stripUrl = url + 'comic/%s'
+    firstStripUrl = stripUrl % '1'
+    imageSearch = compile(tagre("img", "src", r'(%scomicimg/[^"]+)' % rurl))
+    prevSearch = compile(tagre("a", "href", r'(%scomic/\d+)' % rurl, after="prev"))
+    help = 'Index format: n (unpadded)'
+```
+
+Let's look at each line in detail.
+
+```class SuperDuperComic(_BasicScraper):```
+
+All comic plugin classes inherit from ``_BasicScraper``.
+The classname (``SuperDuperComic`` in our example) must be unique,
+regardless of upper/lower case.
+The user finds comics with this classname, so be sure to select
+something descriptive and easy to remember.
+
+```description = u'A super duper comic by Mr Smith!'```
+
+Next, a description should be added to the class in Unicode notation
+(u'...').
+It is displayed when a user requests help for one comic with
+``dosage -m superdupercomic``.
+
+```url = 'http://superdupercomic.com/'```
+
+The URL must display the latest comic picture. This is where the
+comic image search will start. See below for some special cases.
+
+```rurl = escape(url)```
+
+This defines a variable ``rurl`` which is used in the search patterns
+below. It properly escapes all regular expression special characters
+like dots or question marks.
+
+```stripUrl = url + 'comic/%s'```
+
+This defines what a comic strip URL looks like. In our example, all
+comic strip URLs look like ``http://superdupercomic.com/comic/NNN``
+where NNN is the increasing comic number.
+
+```firstStripUrl = stripUrl % '1'```
+
+This tells Dosage what the earliest comic strip URL looks like. Dosage
+stops searching for more comics when it reaches this URL. In our example
+comic numbering starts with ``1``, so the first comic URL is
+``http://superdupercomic.com/comic/1``.
+
+```imageSearch = compile(tagre("img", "src", r'(%scomicimg/[^"]+)' % rurl))```
+
+Each comic page URL has one or more comic strip images. The imageSearch
+pattern must match those images in the HTML content of the page URL.
+To make it easy to match HTML tags, the ``tagre()`` function is
+helpful. The first parameter is the tag name, the second the attribute
+name and the third the attribute value. So in our example the given
+pattern would match a tag like
+````` .
+
+```prevSearch = compile(tagre("a", "href", r'(%scomic/\d+)' % rurl, after="prev"))```
+
+To search for more comics, Dosage has to look for the previous comic URL.
+The ``after=`` value in ``tagre()`` matches anything between the
+attribute value and the end of the tag.
+So this pattern assumes each comic page URL has a link to the previous
+comic, for example ``http://superdupercomic.com/comic/100`` has a
+link ``