Producer / Consumer / Collector rewrite (#160)

* Converted metadata service to redis

* move to postgres instead

* fix global usings

* [skip ci] optimize wolverine by prebuilding static types

* [skip ci] Stop indexing mac folder indexes

* [skip ci] producer, metadata and migrations

removed mongodb
added redis cache
imdb meta in postgres
Enable pgtrm
Create trigrams index
Add search meta postgres function

* [skip ci] get rid of node folder, replace mongo with redis in consumer

also wire up postgres metadata searches

* [skip ci] change mongo to redis in the addon

* [skip ci] jackettio to redis

* Rest of mongo removed...

* Cleaner rerunning of metadata - without conflicts

* Add akas import as well as basic metadata

* Include episodes file too

* cascade truncate pre-import

* reverse order to avoid cascadeing

* separate out clean to separate handler

* Switch producer to use metadata matching pre-preocessing dmm

* More work

* Still porting PTN

* PTN port, adding tests

* [skip ci] Codec tests

* [skip ci] Complete Collection handler tests

* [skip ci] container tests

* [skip ci] Convert handlers tests

* [skip ci] DateHandler tests

* [skip ci] Dual Audio matching tests

* [skip ci] episode code tests

* [skip ci] Extended handler tests

* [skip ci] group handler tests

* [skip ci] some broken stuff right now

* [skip ci] more ptn

* [skip ci] PTN now in a separate nuget package, rebased this on the redis changes - i need them.

* [skip ci] Wire up PTN port. Tired - will test tomorrow

* [skip ci] Needs a lot of work - too many titles being missed now

* cleaner. done?

* Handle the date in the imdb search

- add integer function to confirm its a valid integer
- use the input date as a range of -+1 year

* [skip ci] Start of collector service for RD

[skip ci] WIP

Implemented metadata saga, along with channels to process up to a maximum of 100 infohashes each time
The saga will rety for each infohas by requeuing up to three times, before just marking as complete for that infoHash - meaning no data will be updated in the db for that torrent.

[skip ci] Ready to test with queue publishing

Will provision a fanout exchange if it doesn't exist, and create and bind a queue to it. Listens to the queue with 50 prefetch count.
Still needs PTN rewrite bringing in to parse the filename response from real debrid, and extract season and episode numbers if the file is a tvshow

[skip ci] Add Debrid Collector Build Job

Debrid Collector ready for testing

New consumer, new collector, producer has meta lookup and anti porn measures

[skip ci] WIP - moving from wolverine to MassTransit.

 not happy that wolverine cannot effectively control saga concurrency. we need to really.

[skip ci] Producer and new Consumer moved to MassTransit

Just the debrid collector to go now, then to write the optional qbit collector.

Collector now switched to mass transit too

hide porn titles in logs, clean up cache name in redis for imdb titles

[skip ci] Allow control of queues

[skip ci] Update deployment

Remove old consumer, fix deployment files, fix dockerfiles for shared project import

fix base deployment

* Add collector missing env var

* edits to kick off builds

* Add optional qbit deployment which qbit collector will use

* Qbit collector done

* reorder compose, and bring both qbit and qbitcollector into the compose, with 0 replicas as default

* Clean up compose file

* Ensure debrid collector errors if no debrid api key
This commit is contained in:
iPromKnight
2024-03-25 23:32:28 +00:00
committed by GitHub
parent 9c6c1ac249
commit 9a831e92d0
443 changed files with 4154 additions and 476262 deletions

View File

@@ -8,48 +8,29 @@ POSTGRES_USER=postgres
POSTGRES_PASSWORD=postgres
POSTGRES_DB=knightcrawler
# MongoDB
MONGODB_HOST=mongodb
MONGODB_PORT=27017
MONGODB_DB=knightcrawler
MONGODB_USER=mongo
MONGODB_PASSWORD=mongo
# Redis
REDIS_CONNECTION_STRING=redis:6379
# RabbitMQ
RABBITMQ_HOST=rabbitmq
RABBITMQ_USER=guest
RABBITMQ_PASSWORD=guest
RABBITMQ_QUEUE_NAME=ingested
RABBITMQ_CONSUMER_QUEUE_NAME=ingested
RABBITMQ_DURABLE=true
RABBITMQ_MAX_QUEUE_SIZE=0
RABBITMQ_MAX_PUBLISH_BATCH_SIZE=500
RABBITMQ_PUBLISH_INTERVAL_IN_SECONDS=10
# Metadata
## Only used if DATA_ONCE is set to false. If true, the schedule is ignored
METADATA_DOWNLOAD_IMDB_DATA_SCHEDULE="0 0 1 * *"
## If true, the metadata will be downloaded once and then the schedule will be ignored
METADATA_DOWNLOAD_IMDB_DATA_ONCE=true
## Controls the amount of records processed in memory at any given time during import, higher values will consume more memory
METADATA_INSERT_BATCH_SIZE=25000
METADATA_INSERT_BATCH_SIZE=50000
# Collectors
COLLECTOR_QBIT_ENABLED=false
COLLECTOR_DEBRID_ENABLED=true
COLLECTOR_REAL_DEBRID_API_KEY=
# Addon
DEBUG_MODE=false
# Consumer
JOB_CONCURRENCY=5
JOBS_ENABLED=true
## can be debug for extra verbosity (a lot more verbosity - useful for development)
LOG_LEVEL=info
MAX_CONNECTIONS_PER_TORRENT=10
MAX_CONNECTIONS_OVERALL=100
TORRENT_TIMEOUT=30000
UDP_TRACKERS_ENABLED=true
CONSUMER_REPLICAS=3
## Fix for #66 - toggle on for development
AUTO_CREATE_AND_APPLY_MIGRATIONS=false
## Allows control of the threshold for matching titles to the IMDB dataset. The closer to 0, the more strict the matching.
TITLE_MATCH_THRESHOLD=0.25
# Producer
GITHUB_PAT=