Description
Currently the bot uses PicklePersistence, provided by the python-telegram-bot library, to store bot data on disk. The data stored this way includes information about subscribed users, nodes, and farmerbot-related minting violations. If the data is lost, users stop getting the alerts they subscribed to and must subscribe again, and some extra alerts regarding violations will likely fire. Overall, it's a bad situation.
So far, deployment of the bot has included syncing this bot data file to remote storage as a backup. At least one recovery from the remotely synced file has been successful, but overall this approach is not very robust. Here's a summary of the issues:
- Python's pickle is a binary file format, and any corruption will likely result in the file being completely unreadable. Such corruption might occur if the machine running the bot crashes or goes offline while data is being synced.
- We are pickling some custom classes, which is also fragile. Moving these classes to a different module, for example, requires a sort of migration for the data to remain readable (this could be solved by using only generic data structures like lists and dicts).
- Examining bot data is not straightforward, because it requires creating an appropriate environment where the unpickling can take place.
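To illustrate the "generic data structures only" idea from the second point, here is a minimal sketch. The `Subscription` class and its fields are hypothetical stand-ins, not the bot's actual classes. Once the data is reduced to dicts and lists, it no longer depends on where a class lives, and it can be inspected as plain text.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Subscription:
    """Hypothetical record; the bot's real classes may look different."""
    chat_id: int
    node_ids: list = field(default_factory=list)


def to_plain(subs):
    """Convert Subscription objects to builtin dicts keyed by chat id."""
    return {str(s.chat_id): asdict(s) for s in subs}


def from_plain(data):
    """Rebuild Subscription objects from the plain structure."""
    return [Subscription(**fields) for fields in data.values()]


subs = [Subscription(1001, [7, 12]), Subscription(1002)]

# Only dicts, lists, ints and strings reach the serializer, so the stored
# form carries no reference to the module that defines Subscription.
text = json.dumps(to_plain(subs), indent=2, sort_keys=True)
restored = from_plain(json.loads(text))
```

The same plain structure could be handed to pickle or to JSON. The point is only that the stored form stops depending on the import path of custom classes.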
There are some alternatives, which are broadly:
- Use DictPersistence and store the data as JSON text on disk
  - Better than binary data, in that at least some manual recovery should be possible in case of corruption
- Database
  - Best for on-disk consistency
  - Gives access to replication mechanisms that can also provide consistency guarantees in case of crashes
  - We have to deal with schema updates/migrations, updating data structure becomes a pain (maybe this can be eased by just storing serialized JSON objects in the database rather than mapping
  - More complicated implementation, see below
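For the JSON-on-disk option, one common way to reduce the risk of a half-written file is to write to a temporary file and rename it into place. The sketch below assumes the bot data has already been reduced to JSON-serializable dicts and lists. The function names are hypothetical and this is not part of python-telegram-bot. It only protects the local file against partial writes and does not address interruptions during the remote sync.

```python
import json
import os
import tempfile


def save_json_atomic(path, data):
    """Write data as JSON so that path always holds a complete file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp, indent=2, sort_keys=True)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # os.replace swaps the new file in; readers see old or new content,
        # never a partially written file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_json(path, default=None):
    """Read the JSON file, returning default if it does not exist yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return default
```

Because the result is indented, key-sorted text, it can be inspected or repaired by hand with any editor, which is the main advantage this option has over the pickle file.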
When it comes to databases, there are further sub-options:
- A third-party persistence class that writes to a database, like python-telegram-bot-django-persistence
  - These projects tend to be small and not terribly active, so there's a decent chance they get abandoned one day
  - Could be very convenient, just a drop-in replacement
- Write a subclass of BasePersistence
  - Bot code needs little to no modification
  - Need to study/integrate with the existing persistence framework
- Handle data persistence separately (stop using the built-in persistence feature)
  - Will need to update code wherever data is stored
  - Best flexibility
In the case of a custom implementation, there's also the question of how to handle interfacing with the database, as well as what database to use:
- SQLite
  - Good fit for this use case
  - Live replication options available (Litestream)
- peewee
  - Simple Python ORM
- Etc...
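As a rough sketch of the SQLite route combined with the "store serialized JSON" idea mentioned earlier, the following uses only the standard library `sqlite3` module instead of an ORM. The `JsonStore` class, the `bot_data` table, and the key names are all hypothetical. How this would plug into the bot (a BasePersistence subclass or separate handling) is left open.

```python
import json
import sqlite3


class JsonStore:
    """Hypothetical key-value store keeping JSON documents in SQLite."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        # Assumption: WAL journaling is enabled because Litestream works
        # with databases in WAL mode.
        self.conn.execute("PRAGMA journal_mode=WAL")
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS bot_data ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        self.conn.commit()

    def put(self, key, obj):
        """Serialize obj to JSON and store it under key in one transaction."""
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO bot_data (key, value) VALUES (?, ?)",
                (key, json.dumps(obj)),
            )

    def get(self, key, default=None):
        """Return the decoded object stored under key, or default."""
        row = self.conn.execute(
            "SELECT value FROM bot_data WHERE key = ?", (key,)
        ).fetchone()
        return default if row is None else json.loads(row[0])

    def close(self):
        self.conn.close()
```

A call such as `store.put("subscriptions", {"1001": [7, 12]})` followed by `store.get("subscriptions")` round-trips the data. Since each value is an opaque JSON document, changing its internal shape does not need a schema migration, at the cost of not being able to query individual fields with plain SQL.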