WIP: Run Knight Crawler

purple_emily
2024-03-10 11:40:58 +00:00
parent 108a4a9066
commit f000ae6c12
4 changed files with 33 additions and 18 deletions


# Run Knight Crawler
To run Knight Crawler you need two files, both of which can be found in the [deployment/docker](https://github.com/Gabisonfire/knightcrawler/tree/master/deployment/docker) directory on GitHub:
- <path>deployment/docker/.env.example</path>
- <path>deployment/docker/docker-compose.yaml</path>

Before we start the services, we need to change a few things in the <path>.env</path> file.
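If you haven't already created your <path>.env</path>, copy the example file into place first (a minimal sketch, assuming you are working inside the deployment/docker directory):

```Bash
# Create your local .env from the shipped example, then edit the copy
cp .env.example .env
```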
> If you are using an external database, configure it in the <path>.env</path> file. Don't forget to disable the ones
> included in the <path>docker-compose.yaml</path>.

### Your time zone
```Bash
TZ=Europe/London
```
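Time zones use the IANA `Area/Location` format. If you are not sure of your host's time zone name, you can usually read it from the system (assuming a Linux host running systemd):

```Bash
# Print the host's IANA time zone, e.g. Europe/London
timedatectl show --property=Timezone --value
```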

Further down in the <path>.env</path> file are the settings that control the consumers:
```Bash
JOB_CONCURRENCY=5
MAX_CONNECTIONS_PER_TORRENT=10
CONSUMER_REPLICAS=3
```
These values depend entirely on your machine and network capacity. The defaults above are fairly minimal and will work on most machines.
`JOB_CONCURRENCY` is how many films and TV shows each consumer should process at once. Because this setting applies to every consumer, raising it multiplies the strain on your system. It's probably best to leave this at 5, but you can experiment with it if you wish.
`MAX_CONNECTIONS_PER_TORRENT` is how many peers a consumer will attempt to connect to while collecting metadata. Increasing this value can speed up processing, but you will eventually reach a point where more connections are being made than your router can handle, causing a cascading failure where your internet stops working. If you are going to increase this value, try increasing it by 10 at a time.
> Increasing this value increases the maximum connections for every parallel job, for every consumer. For example,
> with the default values above, Knight Crawler will be making on average `(5 x 3) x 10 = 150` connections at any
> one time.
>
{style="warning"}
`CONSUMER_REPLICAS` is how many consumers should be started initially. This is best kept below 10, as GitHub rate-limits how quickly we can fetch the list of torrent trackers. You can increase or decrease the number of consumers while the services are running with the command `docker compose up --scale consumer=<number>`. This value is best increased by 5 at a time; repeat the process until you reach the desired number of consumers.
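For example, to scale to 8 consumers while the stack is running (assuming the service is named `consumer` in your <path>docker-compose.yaml</path>, and using detached mode so your terminal is not tied up):

```Bash
# Bring the stack up in detached mode with 8 consumer replicas
docker compose up -d --scale consumer=8
```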
### GitHub personal access token