purple_emily 64149c55a8 WIP: External access

2024-03-10 13:15:05 +00:00

5.9 KiB

Getting started

Knight Crawler is provided as an all-in-one solution. This means we include all the necessary software you need to get started out of the box.

Before you start

Make sure that you have:

A place to host Knight Crawler
Docker and Compose installed
A GitHub account (optional)

Download the files

Installing Knight Crawler is as simple as downloading a copy of the deployment directory.

A basic installation requires only two files:

deployment/docker/.env.example
deployment/docker/docker-compose.yaml

For this guide I will place them in ~/knightcrawler, a directory in my home folder.

Rename the .env.example file to .env so that the directory looks like this:

~/
└── knightcrawler/
    ├── .env
    └── docker-compose.yaml
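
The layout above can be produced with a few commands. This is a minimal sketch, assuming both files have already been downloaded into the current directory. The function name `setup_knightcrawler_dir` is hypothetical and not part of Knight Crawler.

```shell
# Hypothetical helper: copy the two downloaded files into a target directory
# and rename .env.example to .env on the way.
# Assumes .env.example and docker-compose.yaml are in the current directory.
# Usage: setup_knightcrawler_dir [TARGET_DIR]   (default: ~/knightcrawler)
setup_knightcrawler_dir() {
    target="${1:-$HOME/knightcrawler}"
    mkdir -p "$target" || return 1
    cp docker-compose.yaml "$target/docker-compose.yaml" || return 1
    # Never overwrite an existing .env; it may already hold real credentials.
    if [ ! -e "$target/.env" ]; then
        cp .env.example "$target/.env" || return 1
    fi
}

# Example (not run here): setup_knightcrawler_dir ~/knightcrawler
```

The existence check means the helper can be re-run after an update without clobbering a configured .env.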

Initial configuration

Below are a few recommended configuration changes.

Open the .env file in your favourite editor.

If you are using an external database, configure it in the .env file. Don't forget to disable the ones included in the docker-compose.yaml.

Database credentials

It is strongly recommended that you change the credentials for the databases included with Knight Crawler. Do this before running Knight Crawler for the first time, because changing the passwords is much harder once the services have been started.

POSTGRES_PASSWORD=postgres
...
MONGODB_PASSWORD=mongo
...
RABBITMQ_PASSWORD=guest

Here are a few options for generating a secure password:

# Linux
tr -cd '[:alnum:]' < /dev/urandom | fold -w 64 | head -n 1
# Or you could use openssl
openssl rand -hex 32

# Python
import secrets

print(secrets.token_hex(32))
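
If you would rather not paste the generated passwords in by hand, the substitution can be scripted. The sketch below is one possible approach, not something shipped with Knight Crawler: `set_env_value` is a hypothetical helper that rewrites a single KEY=value line and leaves everything else alone. It assumes the value contains no backslashes (true for hex output), since awk would interpret them.

```shell
# Hypothetical helper: replace the value of KEY in an env file.
# Lines that do not start with "KEY=" are copied through unchanged;
# if KEY is absent, the file content is left as it was.
# Usage: set_env_value FILE KEY VALUE
set_env_value() {
    env_file="$1"; key="$2"; value="$3"
    tmp_file="$(mktemp)" || return 1
    awk -v k="$key" -v v="$value" '
        # Rewrite only lines that begin with exactly "KEY=".
        index($0, k "=") == 1 { print k "=" v; next }
        { print }
    ' "$env_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Write back through cat so the original file keeps its permissions.
    cat "$tmp_file" > "$env_file"
    rm -f "$tmp_file"
}

# Example (not run here): give each bundled service its own random password.
# for key in POSTGRES_PASSWORD MONGODB_PASSWORD RABBITMQ_PASSWORD; do
#     set_env_value .env "$key" "$(openssl rand -hex 32)"
# done
```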

Your time zone

TZ=Europe/London

A list of time zones can be found on Wikipedia
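
Time zone names in the tz database are written in Area/Location order, for example Europe/London. A quick way to catch a mistyped name is to look it up on disk. This sketch assumes a Linux host that keeps the tz database under /usr/share/zoneinfo, which is common but not guaranteed. `tz_exists` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if NAME is a zone file in the tz database.
# The second argument overrides the database location (an assumption:
# /usr/share/zoneinfo is the usual path on Linux, but yours may differ).
# Usage: tz_exists NAME [ZONEINFO_DIR]
tz_exists() {
    [ -n "$1" ] && [ -f "${2:-/usr/share/zoneinfo}/$1" ]
}

# Example (not run here):
# tz_exists Europe/London && echo "valid time zone"
```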

Consumers

JOB_CONCURRENCY=5
...
MAX_CONNECTIONS_PER_TORRENT=10
...
CONSUMER_REPLICAS=3

The right values depend entirely on your machine and network capacity. The defaults above are fairly minimal and will work on most machines.

JOB_CONCURRENCY is how many films and TV shows each consumer should process at once. Because this applies to every consumer, the load on your system is multiplied by the number of consumers. It's probably best to leave this at 5, but you can experiment with it if you wish.

MAX_CONNECTIONS_PER_TORRENT is how many peers a consumer will attempt to connect to while it collects metadata for a torrent. Increasing this value can speed up processing, but you will eventually reach a point where more connections are being made than your router can handle. This then causes a cascading failure in which your internet connection stops working. If you do increase this value, raise it by 10 at a time.

Warning: Increasing this value raises the maximum connections for every parallel job in every consumer. For example, with the default values above, Knight Crawler may be making up to (5 x 3) x 10 = 150 connections at any one time.
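
To see how quickly the connection count grows before changing a setting, the same arithmetic can be run for candidate values. This is a small sketch; `total_connections` is a hypothetical helper, and the result is an upper bound rather than a measured figure.

```shell
# Hypothetical helper: upper bound on simultaneous peer connections.
# Usage: total_connections JOB_CONCURRENCY CONSUMER_REPLICAS MAX_CONNECTIONS_PER_TORRENT
total_connections() {
    echo $(( $1 * $2 * $3 ))
}

total_connections 5 3 10    # the defaults above: prints 150
total_connections 5 3 20    # one step of +10 connections per torrent: prints 300
total_connections 5 6 10    # doubling the consumers instead: prints 300
```

Raising either MAX_CONNECTIONS_PER_TORRENT or CONSUMER_REPLICAS scales the total linearly, so changing both at once compounds the effect.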

CONSUMER_REPLICAS is how many consumers should be initially started. You can increase or decrease the number of consumers whilst the service is running by running the command docker compose up -d --scale consumer=<number>.

GitHub personal access token

This step is optional but strongly recommended. Debrid Media Manager is a media library manager for Debrid services. When a user of this service chooses to export or share their library publicly, it is saved to a public GitHub repository. This is, essentially, a repository containing a vast number of ready-to-go films and TV shows. Knight Crawler can read these exported lists, but it requires a GitHub account to do so.

Knight Crawler needs a personal access token with read-only access to public repositories. This means we cannot access any private repositories you have.

Navigate to the GitHub token settings page. To get there manually:
- Navigate to GitHub settings.
- Click on Developer Settings.
- Select Personal access tokens.
- Choose Fine-grained tokens.
Press Generate new token.

Fill out the form with the following information:

Token name:
    KnightCrawler
Expiration:
    90 days
Description:
    <blank>
Repository access:
    (checked) Public Repositories (read-only)

Click Generate token.
Take the new token and add it to the bottom of the .env file:
```
# Producer
GITHUB_PAT=<YOUR TOKEN HERE>
```

Start Knight Crawler

To start Knight Crawler use the following command:

docker compose up -d

Then we can follow the logs to watch it start:

docker compose logs -f --since 1m

Note: Knight Crawler will only be accessible on the machine you run it on. To make it accessible from other machines, see External access.

To stop following the logs press Ctrl+C at any time.

The Knight Crawler configuration page should now be accessible in your web browser at http://localhost:7000
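
The services can take a little while to come up, so the page may not respond immediately. As a rough sketch, assuming curl is installed on the host, you could poll the address until it answers. `wait_for_url` is a hypothetical helper, and the attempt count and delay are arbitrary defaults.

```shell
# Hypothetical helper: poll URL until it answers or the attempts run out.
# Returns 0 as soon as a request succeeds, 1 if every attempt fails.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
    url="$1"; attempts="${2:-30}"; delay="${3:-2}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: treat HTTP errors as failure, -s: silent, -o /dev/null: discard body
        if curl -fs -o /dev/null "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (not run here):
# wait_for_url http://localhost:7000 && echo "Knight Crawler is up"
```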

Start more consumers

If you wish to speed up the processing of the films and TV shows that Knight Crawler finds, then you'll likely want to increase the number of consumers.

The command below can be used to either increase or decrease the number of running consumers. Gradually increase the number until you encounter issues, then decrease it until the system is stable.

docker compose up -d --scale consumer=<number>

Stop Knight Crawler

Knight Crawler can be stopped with the following command:

docker compose down
