Tobias Gruetzmacher
c4a184d173
Remove some vanished modules.
2017-01-12 02:01:10 +01:00
Tobias Gruetzmacher
36ac459bed
Add removed GoComics modules to old list.
2017-01-12 01:22:13 +01:00
Tobias Gruetzmacher
a183e812ae
Update GoComics module for new site layout.
...
(fixes #77)
2017-01-11 02:21:05 +01:00
Tobias Gruetzmacher
061efaac6e
New module for ComicSherpa (removed from GoComics)
2017-01-11 01:34:52 +01:00
John Safrit
969e633877
Fix pattern for The Devils Panties
2017-01-08 17:39:59 -05:00
Tobias Gruetzmacher
3f9feec041
Allow modules to ignore some HTTP error codes.
...
This is necessary since it seems some webservers out there are
misconfigured to deliver actual content with an HTTP error code...
2016-11-01 18:25:02 +01:00
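A minimal sketch of the idea in the commit above, assuming a per-module tuple of tolerated status codes. The names `check_status`, `allowed_errors` and `HTTPStatusError` are hypothetical and are not taken from the project:

```python
class HTTPStatusError(Exception):
    """Raised for an HTTP error status that the module did not opt out of."""


def check_status(status_code, allowed_errors=()):
    """Return True if the response body should be used.

    allowed_errors is a hypothetical per-module tuple of error codes that
    are known to accompany real content on a misconfigured server.
    """
    if status_code < 400 or status_code in allowed_errors:
        return True
    raise HTTPStatusError("unexpected HTTP status %d" % status_code)
```

A module for a site that serves its pages with a 404 status would pass `allowed_errors=(404,)`; every other module keeps the strict default.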
Tobias Gruetzmacher
46b7a374f6
Small GoComics update.
2016-11-01 02:51:00 +01:00
Tobias Gruetzmacher
f7f4e130bf
Small fix to the WLP module.
2016-11-01 02:27:29 +01:00
Tobias Gruetzmacher
bc755d09a3
Apply link modifier to all links.
...
This was previously only the "previous link modifier", now it can also
modify "next" and "latest" links. Additionally, the modifier is given
the current URL, so those cases can be distinguished.
2016-11-01 01:50:44 +01:00
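A sketch of what such a modifier could look like, assuming it receives the URL of the page the link was found on and the link target. The class, method and parameter names are illustrative only:

```python
class ExampleScraper:
    # Hypothetical start page of a comic.
    url = "http://example.com/comic/"

    def link_modifier(self, from_url, to_url):
        """Rewrite a link (previous, next or latest) before following it.

        from_url is the page the link was found on, so the modifier can
        treat links from the start page differently from all others.
        """
        if from_url == self.url:
            # Links on the start page carry a query string we do not want.
            return to_url.split("?", 1)[0]
        return to_url
```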
Tobias Gruetzmacher
7fc05f75f5
Remove broken PetiteSymphony comics.
2016-10-31 07:16:10 +01:00
Tobias Gruetzmacher
69e6318f87
Remove ScurryAndCover, too much JavaScript.
2016-10-31 07:04:00 +01:00
Tobias Gruetzmacher
47e2502ec7
Fix a bunch of comic modules.
2016-10-31 06:57:47 +01:00
Tobias Gruetzmacher
446b81fc45
Fix Wumo and friends.
2016-10-30 15:28:54 +01:00
Tobias Gruetzmacher
51ed898f5d
Fix some SmackJeeves comics.
2016-10-30 14:30:45 +01:00
Tobias Gruetzmacher
b6d99945f6
Merge pull request #73 from acaranta/master
...
Added several SmackJeeves Comics
2016-10-30 11:55:17 +01:00
Tobias Gruetzmacher
3b9f30affd
Update ComicFury modules.
2016-10-30 11:04:45 +01:00
Tobias Gruetzmacher
a02660a7d3
Replace custom @memoized with stdlib @lru_cache.
2016-10-29 00:46:49 +02:00
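For illustration, this is how the standard library decorator memoizes a function, assuming Python 3 where `functools.lru_cache` is available. The function `page_title` and the `calls` list are made up for the example:

```python
from functools import lru_cache

calls = []  # records how often the function body really runs


@lru_cache(maxsize=None)
def page_title(url):
    """Derive a title from the last path segment of a URL (cached)."""
    calls.append(url)
    return url.rstrip("/").rsplit("/", 1)[-1]
```

Calling `page_title` twice with the same URL runs the body only once; the second result comes from the cache.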
Tobias Gruetzmacher
9a6a310b76
Fixup copyright years.
2016-10-29 00:21:41 +02:00
acaranta
83880a3cbd
corrected RainbowMansion
2016-10-27 09:53:34 +02:00
acaranta
0ed823175c
Added even more Smackjeeves comics
2016-10-27 06:58:57 +02:00
acaranta
a5c9a3c35c
Added several SmackJeeves Comics
2016-10-26 05:25:13 +02:00
Peter Brunner
19445a83ae
Fix smbc
2016-10-18 21:28:42 -04:00
Tobias Gruetzmacher
f94caa8a16
Use terminal size calculation from standard library.
2016-10-14 23:55:10 +02:00
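The standard library call in question is presumably `shutil.get_terminal_size` (Python 3.3 and later). A small sketch, with `wrap_width` as a hypothetical helper name:

```python
import shutil


def wrap_width(fallback_columns=80):
    """Return a line width for wrapping output to the current terminal."""
    # The fallback is used when no terminal is attached (pipes, cron jobs).
    size = shutil.get_terminal_size(fallback=(fallback_columns, 24))
    # Leave one column free and never go below a sane minimum.
    return max(20, size.columns - 1)
```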
Tobias Gruetzmacher
06be2a026b
Move some ex-KeenSpot comics to shorter names.
2016-10-14 14:23:33 +02:00
Tobias Gruetzmacher
b17d6e5f22
Rework/fix KeenSpot modules.
2016-10-14 00:14:53 +02:00
Tobias Gruetzmacher
064e7976ec
Add namer for Extra Fabulous Comics.
2016-10-06 00:42:50 +02:00
mostlyuseful
fce7dfff19
Add "Extra Fabulous Comics" comic
2016-10-04 17:06:50 +02:00
Tobias Gruetzmacher
f342a93aa1
Update GoComics module.
2016-10-01 03:39:36 +02:00
Tobias Gruetzmacher
c0d945a563
Update ComicFury modules.
2016-10-01 02:52:33 +02:00
Tobias Gruetzmacher
98c98ddfab
Fix some more comic modules (c-f).
2016-09-30 00:15:45 +02:00
Tobias Gruetzmacher
b1d2650615
Fix some modules (a&b).
2016-09-29 01:29:01 +02:00
Damjan Košir
c04c62e92b
xkcd now done with xpaths
2016-08-18 21:28:25 +12:00
Damjan Košir
9ba184eb43
fixing LoadingArtist
2016-08-16 21:20:35 +12:00
Hubert Figuière
afcd19bf5b
Added Prince of Sartar Comic
2016-08-08 09:18:33 -04:00
Hubert Figuière
81821dc450
Added Space Junk Arlia comic
2016-08-08 09:18:33 -04:00
Tobias Gruetzmacher
fb37f946e0
Speed up comic module tests.
...
This fakes an If-Modified-Since header, so most web servers don't need
to send comic images at all. This should also reduce the amount of data
that needs to be fetched for comic module tests.
2016-08-01 00:44:34 +02:00
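A rough sketch of the conditional request described above, using only the standard library to build the header. The helper names are hypothetical, and the actual HTTP request is left out:

```python
from email.utils import formatdate


def conditional_headers(last_fetch_timestamp):
    """Build request headers asking only for content newer than a timestamp."""
    # usegmt=True yields the HTTP date format, e.g. "... 00:00:00 GMT".
    return {"If-Modified-Since": formatdate(last_fetch_timestamp, usegmt=True)}


def body_needed(status_code):
    """A 304 Not Modified answer means the server sent no image data."""
    return status_code != 304
```

A test can pass the current time as the timestamp, so most servers answer 304 and the image body is never transferred.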
Tobias Gruetzmacher
4f80016bf0
Change robotparser import to make PyInstaller happy.
2016-06-06 22:42:01 +02:00
Tobias Gruetzmacher
64c8e502ca
Ignore case for comic download directories.
...
Since we already match comics case-insensitively on the command line, this
was a logical step, even if this means changing quite a bit of code that
all tries to resolve the "comic directory" in a slightly different
way...
2016-06-06 00:08:29 +02:00
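One way to resolve a comic directory while ignoring case, as a sketch. The function name and the fallback behaviour are assumptions and do not describe the project's code:

```python
import os


def find_comic_dir(basepath, name):
    """Return the existing directory matching name regardless of case.

    Falls back to basepath/name when no directory matches yet.
    """
    wanted = name.lower()
    for entry in sorted(os.listdir(basepath)):
        candidate = os.path.join(basepath, entry)
        if entry.lower() == wanted and os.path.isdir(candidate):
            return candidate
    return os.path.join(basepath, name)
```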
Tobias Gruetzmacher
215d597573
Remove DrunkDuck for now.
...
- It's been disabled for ages
- Needs a major rework
- I don't want to add that many comics anyways...
- This also gets rid of make_scraper :)
2016-06-05 22:22:17 +02:00
Tobias Gruetzmacher
67d0d38100
Migrate SnafuComics to single-class module.
2016-06-05 22:12:16 +02:00
Tobias Gruetzmacher
125c96e9dc
Remove command to download ALL comics...
2016-06-05 21:57:56 +02:00
Tobias Gruetzmacher
df2048cb34
Keep track of removed and moved comics (fixes #41).
...
I plan on keeping this list for at least ~ 2 releases and then purging
older entries...
2016-06-05 21:47:58 +02:00
Tobias Gruetzmacher
9b755a7e6c
Restore BobWhite.
2016-06-05 18:32:27 +02:00
Tobias Gruetzmacher
603fd62a1e
Fix workaround for PyInstaller...
2016-06-05 16:01:35 +02:00
Tobias Gruetzmacher
295b53a2d3
Fix name overrides (broken by 51008a).
2016-06-05 10:03:29 +02:00
Tobias Gruetzmacher
844bec09ba
Remove another dead comic from ComicFury.
2016-06-05 01:06:04 +02:00
Tobias Gruetzmacher
12123961a4
Fix error in PyInstaller packaged application.
2016-06-05 00:34:16 +02:00
André-Patrick Bubel
2b8e948868
Add String Theory comic
2016-06-01 11:19:17 +00:00
André-Patrick Bubel
192751073c
Add KillSixBillionDemons comic
2016-05-31 07:28:32 +00:00
Tobias Gruetzmacher
807bee6342
Migrate GoComics to single-class module.
2016-05-23 00:01:10 +02:00
Tobias Gruetzmacher
2c8e57bdea
Migrate Creators to single-class module.
2016-05-22 23:56:59 +02:00
Tobias Gruetzmacher
f5dff27b0a
Migrate SmackJeeves to single-class module.
2016-05-22 23:54:21 +02:00
Tobias Gruetzmacher
1ea20e1743
Migrate WebcomicFactory to single-class module.
2016-05-22 23:40:58 +02:00
Tobias Gruetzmacher
c62a7283a2
Migrate ComicFury to single-class module.
2016-05-22 23:31:53 +02:00
Tobias Gruetzmacher
1834bf179f
Migrate Arcamax to single-class module.
2016-05-22 23:17:24 +02:00
Tobias Gruetzmacher
f29472c143
Make auto-update script more flexible.
2016-05-22 23:06:05 +02:00
Tobias Gruetzmacher
e4650d5941
Remove make_scraper from Nitrocosm.
2016-05-21 14:35:53 +02:00
Tobias Gruetzmacher
b6eb8ab8ef
Remove make_scraper from SandraAndWoo
2016-05-21 14:12:11 +02:00
Tobias Gruetzmacher
4630ea047c
Implement Oglaf's strange navigation (fixes #33)
...
(also should fix wummel#91)
2016-05-21 02:38:07 +02:00
Tobias Gruetzmacher
51008a975b
Refactor: Introduce generator methods for scrapers
...
This allows one comic module class to generate multiple scrapers. This
change is to support a more dynamic module system as described in #42.
2016-05-21 01:29:36 +02:00
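The generator approach can be pictured like this. It is a sketch with invented names (`SiteScraper`, `comics`, `getmodules`), not the project's actual interface:

```python
class SiteScraper:
    """One class that stands for every comic hosted on a hypothetical site."""

    comics = ("Alpha", "Beta")  # made-up comic names

    def __init__(self, name):
        self.name = "ExampleSite/" + name
        self.url = "http://example.com/" + name.lower() + "/"

    @classmethod
    def getmodules(cls):
        """Yield one scraper instance per comic instead of one class each."""
        for comic in cls.comics:
            yield cls(comic)
```

The loader then iterates over `SiteScraper.getmodules()` and no longer needs a generated class for each comic.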
Tobias Gruetzmacher
89cfd9d310
Add comics from catomix.com.
2016-05-16 23:55:41 +02:00
Tobias Gruetzmacher
a6cf4e7040
Fix some more comic modules.
2016-05-16 23:16:29 +02:00
Tobias Gruetzmacher
be1a63da0c
Update GoComics comic list.
2016-05-16 18:26:45 +02:00
Tobias Gruetzmacher
6d3f74142c
Move command line tool into package.
...
This way we can use the default Python console_scripts install process.
2016-05-16 14:57:47 +02:00
Tobias Gruetzmacher
b9d9564085
Fix Dilbert (fixes #44).
2016-05-16 01:21:23 +02:00
Tobias Gruetzmacher
e9b3c487c0
Remove some dead comics.
2016-05-16 01:10:20 +02:00
Tobias Gruetzmacher
bd60155d9f
Some more ComicFury comics gone...
2016-05-16 00:53:22 +02:00
Tobias Gruetzmacher
849e60e795
Remove make_scraper magic from webcomiceu.
2016-05-07 03:20:01 +02:00
Tobias Gruetzmacher
975d2376bf
Another round of comic module fixes.
2016-05-07 01:50:10 +02:00
Tobias Gruetzmacher
efe1308db2
Replace home-grown Python2/3 compat. with six.
2016-05-05 23:33:48 +02:00
Tobias Gruetzmacher
77ed0218e0
Fix some comic modules.
2016-05-05 20:55:14 +02:00
Tobias Gruetzmacher
bb2ac39639
Fix some URLs.
2016-05-05 10:12:03 +02:00
Tobias Gruetzmacher
d05316e3ac
Seems ComicFury is deleting comics regularly...
...
Well, there's nothing we can do: Remove them.
2016-05-04 08:26:53 +02:00
Tobias Gruetzmacher
0c1aa9e8bd
Move libxml < 2.9.3 workaround to base class.
2016-05-02 23:22:06 +02:00
Tobias Gruetzmacher
b93a8fde65
Move PensAndTales comics and fix them.
2016-05-02 22:32:14 +02:00
Tobias Gruetzmacher
4006ced43d
Move all HijinksEnsue comics into alphabetic files.
2016-05-02 01:25:34 +02:00
Tobias Gruetzmacher
d5f91ecfd2
Fix some modules in m.py.
2016-04-30 01:59:28 +02:00
Tobias Gruetzmacher
1d52d33311
Remove missing SmackJeeves comics.
2016-04-30 00:56:20 +02:00
Tobias Gruetzmacher
d796f3476c
Fix some modules in d.py.
2016-04-30 00:44:18 +02:00
Tobias Gruetzmacher
cc16fea880
Fix some modules in c.py
2016-04-29 00:35:02 +02:00
Tobias Gruetzmacher
1d94439715
Fix some more comic modules.
2016-04-27 00:31:27 +02:00
Tobias Gruetzmacher
8b1ac4eb35
Fix "tagsoup" on SmackJeeves
...
Unfortunately, browsers render < outside of HTML tags differently than
libXML did until recently (libXML 2.9.3), so we need to preprocess pages
before parsing them...
(This was fixed in libXML commit 140c25)
2016-04-26 08:05:38 +02:00
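A preprocessing step of this kind might look as follows. This is a guess at the technique, not the commit's code: it escapes every `<` that cannot start a tag, comment, declaration or processing instruction before the page is handed to the parser.

```python
import re

# A "<" that is not followed by a letter, "/", "!" or "?" cannot open markup.
STRAY_LT = re.compile(r"<(?![A-Za-z/!?])")


def escape_stray_lt(html):
    """Replace stray "<" characters with the entity browsers would show."""
    return STRAY_LT.sub("&lt;", html)
```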
Tobias Gruetzmacher
035d6e94e4
Allow output level for warnings and errors.
2016-04-26 07:53:53 +02:00
Tobias Gruetzmacher
8ddf553eb4
Fix some more SmackJeeves modules.
2016-04-22 01:04:47 +02:00
Tobias Gruetzmacher
fd85c8583a
Unify similar code in fetchUrl and fetchText
2016-04-22 00:42:46 +02:00
Tobias Gruetzmacher
6574997e01
Refactor: All the other class methods.
...
Turns out, it would have been better if all methods had been instance
methods and not class methods. This finished a big chunk of the rework
needed for #42.
2016-04-21 23:52:31 +02:00
Tobias Gruetzmacher
0d436b8ca9
Refactor: url modifiers to normal methods.
...
As before, to implement #42 these might want to access information from
the instance, so they should be normal methods.
2016-04-21 21:39:25 +02:00
Tobias Gruetzmacher
c3f32dfef7
Refactor: Make namer a method.
...
When #42 is realized, the naming of files might differ between comic
modules, so the namer's logical location is the instance, not the class.
2016-04-21 08:20:49 +02:00
Tobias Gruetzmacher
5bd2a49f48
Add debug output on matched XPath/CSS expression.
2016-04-20 23:51:54 +02:00
Tobias Gruetzmacher
fe51a449df
Update SmackJeeves
...
- Now uses _ParserScraper, which makes the pattern quite a bit more
generic and IMHO more readable
- remove make_scraper magic
- No new comics, only fixed existing ones and removed some dead ones.
2016-04-20 23:36:45 +02:00
Tobias Gruetzmacher
190cd3b063
Convert language & getDisabledReasons to methods.
...
Both are more properties of a webcomic (this is part of the design
changes for #42)
2016-04-19 23:53:46 +02:00
Tobias Gruetzmacher
df46907f39
Register EXSLT extensions by default.
...
This allows comic module authors to use the full power of regular
expressions in XPath expressions, see http://exslt.org/regexp/regexp.html
for usage. Please be aware that these use the prefix re: instead of
regexp: here.
2016-04-19 23:48:14 +02:00
Tobias Gruetzmacher
4204f5f1e4
Send "If-Modified-Since" header for images.
2016-04-19 00:36:50 +02:00
Tobias Gruetzmacher
13a3409854
Remove some comics that are gone or block us.
2016-04-17 19:42:43 +02:00
Tobias Gruetzmacher
1fbc844077
Update GoComics.
2016-04-17 18:40:09 +02:00
Tobias Gruetzmacher
73e958670d
Update ComicFury (again).
2016-04-17 16:19:44 +02:00
Tobias Gruetzmacher
b0481a01f7
Update languages.
2016-04-16 13:14:12 +02:00
Tobias Gruetzmacher
3329027e4b
Update ComicFury.
2016-04-16 13:13:47 +02:00
Tobias Gruetzmacher
ee99c087d7
Remove prevUrlMatchesStripUrl.
...
It was only used for one test.
2016-04-16 01:14:26 +02:00
Tobias Gruetzmacher
92a688457a
Remove useless indirection.
2016-04-15 23:42:24 +02:00
Tobias Gruetzmacher
52515b5fc5
Update GoComics.
2016-04-15 00:26:14 +02:00
Tobias Gruetzmacher
031a523846
Fix SnafuComics.
2016-04-14 23:52:35 +02:00
Tobias Gruetzmacher
7626b1e100
Webcomics Nation is gone.
2016-04-14 22:46:52 +02:00
Tobias Gruetzmacher
497653c448
Remove make_scraper magic from Arcamax.
2016-04-14 00:17:59 +02:00
Tobias Gruetzmacher
db87ed95e7
Use new features to make modules simpler.
2016-04-13 23:28:43 +02:00
Tobias Gruetzmacher
b266e28ae1
Remove debugging prints 😭
2016-04-13 22:59:06 +02:00
Tobias Gruetzmacher
ff3b824311
Fix variable shadowing...
2016-04-13 22:43:34 +02:00
Tobias Gruetzmacher
060281e5ff
Use concrete scraper objects everywhere.
...
This is a first step for #42. Since most access to the scraper classes
is through instances, modules can now dynamically override url and name
(name is now a property).
2016-04-13 22:17:30 +02:00
Tobias Gruetzmacher
0468f2f31a
Refactor: Convert starter to simple method.
2016-04-13 20:01:51 +02:00
Tobias Gruetzmacher
16004e43e4
Use default bounceStarter for site modules.
2016-04-13 01:24:13 +02:00
Tobias Gruetzmacher
9028724a74
Clean up update helper scripts.
2016-04-13 00:52:16 +02:00
Tobias Gruetzmacher
42e43fa4e6
Read starter parameters from class.
...
This allows starters to be specified in a more declarative and dynamic way.
2016-04-12 23:11:39 +02:00
Tobias Gruetzmacher
b865a171f9
Remove some broken comics.
2016-04-12 08:21:06 +02:00
Tobias Gruetzmacher
4e2e4ac529
Prevent scraper from moving to a different comic.
2016-04-12 08:10:47 +02:00
Tobias Gruetzmacher
443ab119e9
Refresh GoComics list from online directory.
2016-04-12 00:36:33 +02:00
Tobias Gruetzmacher
0e385a3697
Update GoComics (no change in supported comics)
...
- remove make_scraper magic
- switch to _ParserScraper
2016-04-11 22:42:01 +02:00
Tobias Gruetzmacher
ad7a297964
Fix WLP comics.
2016-04-11 01:07:21 +02:00
Damjan Košir
af2e57d850
Added comic ScurryAndCover...
...
- Yay, funky JavaScript parsing!
- Start page isn't latest comic...
Updated-by: Tobias Gruetzmacher <tobias-git@23.gs>
2016-04-11 00:09:53 +02:00
Tobias Gruetzmacher
fa98f6ddbf
Move more comics to common WordPressScraper.
2016-04-10 23:04:34 +02:00
Tobias Gruetzmacher
f6e605e146
Fix unicode error in text search.
2016-04-10 13:16:30 +02:00
Tobias Gruetzmacher
bc10bd9a4d
Streamline color output.
...
- Depend on external colorama instead of embedding an old copy.
- Move most output code into output module.
- Convert pager to context manager.
2016-04-10 03:45:00 +02:00
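To illustrate the last point, a pager written as a context manager could be shaped like the sketch below. It only buffers text and writes it to a sink on exit; a real pager would hand the text to an external program. All names are hypothetical:

```python
import io
from contextlib import contextmanager


@contextmanager
def pager(sink):
    """Collect output in a buffer and flush it to sink when the block ends."""
    buffer = io.StringIO()
    try:
        yield buffer
    finally:
        # Runs even if the block raises, so partial output is not lost.
        sink.write(buffer.getvalue())
```

Callers write `with pager(sys.stdout) as out:` and print into `out`, so setup and teardown stay in one place.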
Tobias Gruetzmacher
bb5b6ffcec
Fix comics in module a.py.
2016-04-07 23:21:31 +02:00
Tobias Gruetzmacher
0033a8046b
Fix creators module.
2016-04-07 00:20:03 +02:00
Tobias Gruetzmacher
8768ff07b6
Fix AhoiPolloi, be a bit smarter about encoding.
...
HTML character encoding in the context of HTTP is quite tricky to get
right and honestly, I'm not sure if I did get it right this time. But I
think the current behaviour matches best what web browsers try to do:
1. Let Requests figure out the encoding from the HTTP header. This
overrides everything else. We need to "trick" LXML to accept our
decision if the document contains an XML declaration which might
disagree with the HTTP header.
2. If the HTTP headers don't specify any encoding, let LXML guess the
encoding and be done with it.
2016-04-06 22:22:22 +02:00
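The two-step decision can be sketched with the standard library alone. This swaps in `email.message.Message` for header parsing instead of Requests and leaves out the LXML side entirely; `charset_from_header` is a hypothetical name:

```python
from email.message import Message


def charset_from_header(content_type):
    """Return the charset declared in a Content-Type header, or None.

    None means the header gives no encoding, so the HTML parser should be
    left to guess it from the document itself (step 2 above).
    """
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()
```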
Tobias Gruetzmacher
183d18e7bc
Skip non-image on xkcd.
2016-04-06 00:50:01 +02:00
Tobias Gruetzmacher
9feaf245f2
Fixed & removed some comics in s.py.
2016-04-06 00:40:13 +02:00
Tobias Gruetzmacher
6bbdcfb341
BloomingFaeries: Don't download every page twice.
...
(Also, simplify namer, switch to _ParserScraper)
2016-04-05 23:58:43 +02:00
Tobias Gruetzmacher
8db6f8e8b7
Fix ZapComics, remove ZebraGirl.
...
- ZebraGirl is now ComicFury/ZebraGirl...
2016-04-04 00:27:11 +02:00
Tobias Gruetzmacher
0bcfb8a82e
Move ComicControl into common module.
...
- Move all comics using ComicControl into alphabetical files.
- Add BalderDash & Picklewhistle
2016-04-04 00:12:53 +02:00
Tobias Gruetzmacher
0d453a6858
Move Flowerlark Studios into alphabetical files.
2016-04-03 22:58:01 +02:00
Tobias Gruetzmacher
a9f0dfdce4
Merge pull request #39 from peterjanes/peterjanes/sherman-fix
...
Fix Sherman's Lagoon
2016-04-03 22:20:04 +02:00
Tobias Gruetzmacher
926439cd14
Every comic needs a URL.
2016-04-03 22:03:16 +02:00
Tobias Gruetzmacher
2c6decb7f5
Move WebcomicFactory in its own module.
...
Also, add an updater script for it.
2016-04-03 21:31:56 +02:00
Peter Janes
759bd0c360
Fix Sherman's Lagoon
2016-04-03 14:54:41 -04:00
Tobias Gruetzmacher
bb1f20d867
Remove make_scraper for most WordPress comics.
...
- Dropped KatzenfutterGeleespritzer, because robots.txt.
- Move all WordPress/ComicPress scrapers into alphabetical files.
- Move _WordPressScraper & _ComicPress scraper into common.py.
- Some smaller PEP8 fixes.
2016-04-02 00:19:53 +02:00
Tobias Gruetzmacher
7f1e136d8b
Sort comics alphabetically & PEP8 style fixes.
2016-03-31 23:13:54 +02:00
Tobias Gruetzmacher
d6db1d0b81
Fix a conflict with IPython.
2016-03-20 23:57:07 +01:00
Tobias Gruetzmacher
90dfceaeb1
Remove dead modules (& format).
2016-03-20 20:48:42 +01:00
Tobias Gruetzmacher
f243096d49
Fix GastroPhobia, remove GeneralProtectionFault.
...
(& formatting)
2016-03-20 20:11:21 +01:00
Tobias Gruetzmacher
cfcfcc2468
Switch plugin loading to pkgutil.
...
This should work with all PEP-302 loaders that implement iter_modules.
Unfortunately, PyInstaller (which I plan to use for Windows releases)
does not support it, so we can't avoid a special case. Anyways,
this should help for #22.
2016-03-20 15:13:24 +01:00
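For reference, listing the submodules of a package with `pkgutil.iter_modules` looks roughly like this. The helper name is an assumption, and the result is sorted so that the order is stable:

```python
import pkgutil


def iter_plugin_names(package):
    """Return the sorted, fully qualified names of a package's submodules."""
    prefix = package.__name__ + "."
    return sorted(
        name for _, name, _ in pkgutil.iter_modules(package.__path__, prefix)
    )
```

Each returned name can then be passed to `importlib.import_module` to load the plugin.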
Tobias Gruetzmacher
1af022895e
Fix NuklearPower (fixes #38).
...
Also remove make_scraper magic.
2016-03-17 23:19:52 +01:00
Tobias Gruetzmacher
552f29e5fc
Update ComicFury comics. (+871, -245)
...
- Remove make_scraper magic
- Switch to HTML parser
- Update parsing of comic listing.
2016-03-17 00:44:06 +01:00
Tobias Gruetzmacher
6727e9b559
Use vendored urllib3.
...
As long as requests ships with urllib3, we can't fall back to the
"system" urllib3, since that breaks class-identity checks.
2016-03-16 23:18:19 +01:00
Damjan Košir
615f094ef3
fixing EdmundFinney
2016-03-14 20:32:18 +13:00
Tobias Gruetzmacher
c4fcd985dd
Let urllib3 handle all retries.
2016-03-13 21:30:36 +01:00
Tobias Gruetzmacher
78e13962f9
Sort scraper modules (mostly for test stability).
2016-03-13 20:24:21 +01:00
Tobias Gruetzmacher
017d35cb3c
Fallback version if pkg_resources not available.
...
This helps for Windows packaging.
2016-03-03 01:05:36 +01:00
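A fallback of this kind might be written as below. The function name, the distribution name argument and the fallback string are placeholders:

```python
def get_version(dist_name, fallback="unknown"):
    """Look up an installed distribution's version, or return fallback."""
    try:
        import pkg_resources
    except ImportError:
        # Frozen or minimal environments may ship without setuptools.
        return fallback
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return fallback
```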
Johannes Schöpp
351fa7154e
Modified maximum page size
...
Fixes #36
2016-03-01 22:19:44 +01:00
Damjan Košir
b0dc510b08
adding LastNerdsOnEarth
2016-01-03 14:16:58 +13:00
Damjan Košir
a1e79cbbf2
fixing Fragile
2016-01-03 14:08:49 +13:00