2017-05-18 22:31:12 +00:00

# Adding a comic to Dosage

To add a new comic to a local Dosage installation, drop a Python file into
Dosage's "user plugin directory". If you don't know where that is, run
`dosage --help`; the directory is shown at the end of the output.

Here is a complete example, which is explained in detail below. Dosage provides
different base classes for parsing comic pages, but this tutorial only covers
the modern `ParserScraper` base class, which uses an HTML parser (lxml/libxml2)
to find elements in each page's DOM.

```python
from ..scraper import ParserScraper


class SuperDuperComic(ParserScraper):
    url = 'https://superdupercomic.com/'
    stripUrl = url + 'comics/%s'
    firstStripUrl = stripUrl % '1'
    imageSearch = '//div[d:class("comicpane")]//img'
    prevSearch = '//a[@rel="prev"]'
    help = 'Index format: n (unpadded)'
```

Let's look at each line in detail.

```python
class SuperDuperComic(ParserScraper):
```

All comic plugin classes inherit from `ParserScraper`. The class name
(`SuperDuperComic` in our example) must be unique, regardless of letter case.
The user finds comics with this class name, so be sure to select something
descriptive and easy to remember.

```python
    url = 'https://superdupercomic.com/'
```

The URL must display the latest comic picture. This is where the comic image
search will start. See below for some special cases.

```python
    stripUrl = url + 'comics/%s'
```

This defines what a comic strip URL looks like. In our example, all comic strip
URLs look like `https://superdupercomic.com/comics/NNN`, where NNN is the
increasing comic number.

```python
    firstStripUrl = stripUrl % '1'
```

This tells Dosage what the earliest comic strip URL looks like. Dosage stops
searching for more comics when it encounters this URL. In our example, comic
numbering starts with `1`, so the oldest comic URL is
`https://superdupercomic.com/comics/1`.

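The `%s` placeholder is ordinary Python string formatting, so the resulting URLs can be checked in a plain interpreter session. The sketch below only reuses the values from the example class; the `strip_number` and `strip_url` names are introduced here for illustration.

```python
# Same values as in the example class, written outside of any Dosage class.
url = 'https://superdupercomic.com/'
stripUrl = url + 'comics/%s'
firstStripUrl = stripUrl % '1'

# Illustrative only: build the URL of an arbitrary strip number.
strip_number = 100
strip_url = stripUrl % strip_number

print(firstStripUrl)
print(strip_url)
```
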
```python
    imageSearch = '//div[d:class("comicpane")]//img'
```

Each comic page URL has one or more comic strip images. The `imageSearch`
property defines an [XPath](https://quickref.me/xpath) expression to find the
comic strip image inside each page. Most of the time you can use your browser's
console (open it with `F12`) to experiment on the real page. Dosage adds a
custom XPath function (`d:class`) to make it easier to match HTML classes.

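To get a feel for what such an expression selects, here is a rough sketch that runs a similar query against a made-up page. It rests on several assumptions: the markup is invented, and it uses the standard library's `xml.etree.ElementTree` instead of the lxml parser Dosage uses. ElementTree supports only a subset of XPath and knows nothing about `d:class`, so an exact `@class` comparison stands in for it.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed stand-in for a comic page (not a real site's markup).
PAGE = """
<html>
  <body>
    <div class="header"><img src="/static/logo.png" /></div>
    <div class="comicpane">
      <a href="/comics/41" rel="prev">Previous</a>
      <img src="/images/strip-42.png" />
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Simplified stand-in for '//div[d:class("comicpane")]//img':
# ElementTree paths are relative (hence the leading '.'), and the class
# attribute is compared as an exact string here.
images = root.findall(".//div[@class='comicpane']//img")
image_urls = [img.get('src') for img in images]

print(image_urls)
```

Only the image inside the `comicpane` div is selected; the logo in the header is ignored, which is the point of anchoring the search on a container element.
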
```python
    prevSearch = '//a[@rel="prev"]'
```

To search for more comics, Dosage has to look for the previous comic URL. This
property defines an XPath expression to find a link to the previous comic page.

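How `url`, `prevSearch` and `firstStripUrl` work together can be pictured with a small simulation. This is a simplified sketch, not Dosage's actual code: the `PREV_LINKS` table and the `walk_back` helper are hypothetical, and the table stands in for whatever `prevSearch` would find on each page.

```python
url = 'https://superdupercomic.com/'
stripUrl = url + 'comics/%s'
firstStripUrl = stripUrl % '1'

# Invented data: maps a page URL to the URL its "previous" link points to.
# The start page shows the latest strip, so its link leads to strip 2.
PREV_LINKS = {
    url: stripUrl % '2',
    stripUrl % '2': stripUrl % '1',
}


def walk_back(start_url, first_strip_url, prev_links):
    """Yield page URLs from start_url back to first_strip_url (hypothetical helper)."""
    current = start_url
    while current is not None:
        yield current
        if current == first_strip_url:
            # The oldest strip has been reached, so the search stops here.
            break
        current = prev_links.get(current)


visited = list(walk_back(url, firstStripUrl, PREV_LINKS))
for page in visited:
    print(page)
```
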
```python
    help = 'Index format: n (unpadded)'
```

Since the user can search comics from a given start point, the help can
describe how the comic is numbered. Running `dosage superdupercomic:100` would
start at comic number 100 and continue towards earlier comics.

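The relation between the `name:index` argument and `stripUrl` can be sketched as follows. The `parse_comic_argument` helper is hypothetical and only illustrates the idea; it is not how Dosage parses its command line.

```python
stripUrl = 'https://superdupercomic.com/comics/%s'


def parse_comic_argument(argument):
    """Split 'name:index' into (name, index); index is None if absent.

    Hypothetical helper for illustration only.
    """
    name, separator, index = argument.partition(':')
    return name, (index if separator else None)


name, index = parse_comic_argument('superdupercomic:100')

# With an index, the start page is the formatted strip URL.
start_url = stripUrl % index if index is not None else None

print(name, index, start_url)
```

This is why the help text should describe the index format: whatever the user types after the colon ends up in the `%s` placeholder.
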
## Contribute a module to Dosage

If you don't know how to use git and/or set up a Python development environment,
that's fine! You can [create an
issue](https://github.com/webcomics/dosage/issues/new) on GitHub and paste the
source of your new module into it, and a Dosage developer will take care of
integrating the module into Dosage.

Otherwise, integrate your new comic module into one of the `*.py` files in
the dosagelib/plugins module.

The files in dosagelib/plugins and the classes inside those files are sorted
alphabetically. Add your comic to the appropriate file. For example, if the
comic name is "Super duper comic", the new class should be added to
dosagelib/plugins/s.py.
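
The naming rule can be expressed as a tiny function. This is only a sketch of the rule as stated here: `plugin_file_for` is a hypothetical helper, and it deliberately handles only names that start with a letter.

```python
def plugin_file_for(comic_name):
    """Return the plugin file a comic would be sorted into (hypothetical helper)."""
    first = comic_name.strip()[:1].lower()
    if not ('a' <= first <= 'z'):
        # Other cases are outside the scope of this sketch.
        raise ValueError('this sketch only handles names starting with a letter')
    return 'dosagelib/plugins/%s.py' % first


target = plugin_file_for('Super duper comic')
print(target)
```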