mirror of
https://github.com/knightcrawler-stremio/knightcrawler.git
synced 2024-12-20 03:29:51 +00:00
Added back original scrapers, integrated with PGSQL
37
scraper/README.md
Normal file
@@ -0,0 +1,37 @@
# Torrentio Scraper

## Initial dumps

### The Pirate Bay

https://mega.nz/#F!tktzySBS!ndSEaK3Z-Uc3zvycQYxhJA

https://thepiratebay.org/static/dump/csv/

### Kickass

https://mega.nz/#F!tktzySBS!ndSEaK3Z-Uc3zvycQYxhJA

https://web.archive.org/web/20150416071329/http://kickass.to/api

### RARBG

Scrape the movie and TV catalogs with [www.webscraper.io](https://www.webscraper.io/) to collect the available `imdbIds`, then use those ids to search for torrents via the API.

Movies sitemap
```json
{"_id":"rarbg-movies","startUrl":["https://rarbgmirror.org/catalog/movies/[1-4235]"],"selectors":[{"id":"rarbg-movie-imdb-id","type":"SelectorHTML","parentSelectors":["_root"],"selector":".lista-rounded table td[width='110']","multiple":true,"regex":"tt[0-9]+","delay":0}]}
```

TV sitemap
```json
{"_id":"rarbg-tv","startUrl":["https://rarbgmirror.org/catalog/tv/[1-609]"],"selectors":[{"id":"rarbg-tv-imdb-id","type":"SelectorHTML","parentSelectors":["_root"],"selector":".lista-rounded table td[width='110']","multiple":true,"regex":"tt[0-9]+","delay":0}]}
```
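As a rough illustration of what the two sitemaps do, the Python sketch below expands a page range such as `[1-4235]` into catalog URLs and applies the same `tt[0-9]+` regex to a page's HTML. The function names and the sample HTML fragment are hypothetical, and the sketch runs the regex over the whole page rather than over the cells matched by the CSS selector, so treat it as an approximation of the webscraper.io behaviour rather than a replacement for it.

```python
import re

# Same pattern as the "regex" field in the sitemaps above.
IMDB_ID_PATTERN = re.compile(r"tt[0-9]+")


def catalog_urls(kind, last_page):
    """Expand a page range such as [1-4235] into one URL per catalog page."""
    base = "https://rarbgmirror.org/catalog"
    return [f"{base}/{kind}/{page}" for page in range(1, last_page + 1)]


def extract_imdb_ids(html):
    """Return the unique IMDb ids found in a page, in first-seen order.

    Simplification: the regex runs over the whole page instead of only the
    cells selected by the sitemap's CSS selector.
    """
    seen = {}
    for imdb_id in IMDB_ID_PATTERN.findall(html):
        seen.setdefault(imdb_id, None)
    return list(seen)


# Hypothetical HTML fragment, made up for illustration only.
sample_html = (
    '<td width="110"><a href="/torrents.php?imdb=tt0111161">poster</a></td>'
    '<td width="110"><a href="/torrents.php?imdb=tt0068646">poster</a></td>'
    '<td width="110"><a href="/torrents.php?imdb=tt0111161">poster</a></td>'
)

movie_urls = catalog_urls("movies", 4235)
print(len(movie_urls), movie_urls[0])
print(extract_imdb_ids(sample_html))
```

Deduplicating the ids before querying the API avoids repeated searches for titles that appear on more than one catalog page.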

### Migrating Database

When migrating the database to a new instance, restart the `files_id_seq` sequence at the maximum file id plus 1, so that newly generated ids do not collide with the migrated rows.

```sql
ALTER SEQUENCE files_id_seq RESTART WITH <last_file_id + 1>;
```
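The placeholder in the statement above has to be filled in by hand. A minimal Python sketch of that step follows; the helper name `restart_sequence_sql` is hypothetical, and it assumes the maximum file id has already been looked up (for example with a query like `SELECT MAX(id) FROM files`, where the table and column names are guesses based on the sequence name, not something this README confirms).

```python
def restart_sequence_sql(max_file_id, sequence="files_id_seq"):
    """Build the ALTER SEQUENCE statement for a given maximum file id."""
    if not isinstance(max_file_id, int) or max_file_id < 0:
        raise ValueError("max_file_id must be a non-negative integer")
    # The sequence should restart one past the highest id already in use.
    return f"ALTER SEQUENCE {sequence} RESTART WITH {max_file_id + 1};"


# Hypothetical maximum id, for illustration only.
print(restart_sequence_sql(1250))
```

The generated statement can then be run against the new database once the data has been imported.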