./bwb-import-bot.py setup_db ./bwb.csv : Parses and cleans all the data, then inserts it into a database called bwb-import-state.db. Every entry starts with a status of TO_BE_IMPORTED and a null value in a column called comment. (This should theoretically take no more than a few minutes even for files > 1 GB, but that still needs testing.) This step will be run every time OL receives data, which is most probably once a month.
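A minimal sketch of what setup_db could look like, assuming the CSV is parsed with pandas and written to SQLite. The table name import_state and the cleaning step on an isbn column are illustrative assumptions; only the DB filename, status, and comment columns come from the plan above.

```python
import sqlite3
import pandas as pd

DB_PATH = "bwb-import-state.db"


def setup_db(csv_path: str) -> None:
    # Parse and clean the raw BWB dump (the cleaning step here is a placeholder).
    df = pd.read_csv(csv_path, dtype=str)
    df = df.dropna(subset=["isbn"])   # hypothetical cleaning step
    df["status"] = "TO_BE_IMPORTED"   # every row starts unimported
    df["comment"] = None              # filled in later if an import fails

    with sqlite3.connect(DB_PATH) as conn:
        # Writing the whole frame in one shot keeps this step fast even for large dumps.
        df.to_sql("import_state", conn, if_exists="replace", index=False)
```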
./bwb-import-bot.py process : Reads a batch (currently 10,000) of records whose status is TO_BE_IMPORTED from the DB and tries to import them into OL. If the request succeeds, the status for that row changes to SUCCESS; otherwise it changes to ERROR, with the error recorded in comment. This process keeps running in the background and stops once no rows with status TO_BE_IMPORTED remain in the DB.
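A rough sketch of the process loop, under the assumptions noted in the comments: the endpoint URL, payload shape, and table name are illustrative; the batch size, status values, and one-request-per-second pacing come from this plan.

```python
import sqlite3
import time

import requests

DB_PATH = "bwb-import-state.db"
BATCH_SIZE = 10_000
IMPORT_URL = "https://openlibrary.org/api/import"  # assumed endpoint


def process() -> None:
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row
    while True:
        rows = conn.execute(
            "SELECT rowid, * FROM import_state WHERE status = 'TO_BE_IMPORTED' LIMIT ?",
            (BATCH_SIZE,),
        ).fetchall()
        if not rows:
            break  # nothing left to import; the background job stops here

        for row in rows:
            # Everything except the bookkeeping columns goes into the request payload.
            payload = {k: row[k] for k in row.keys() if k not in ("rowid", "status", "comment")}
            try:
                resp = requests.post(IMPORT_URL, json=payload)
                resp.raise_for_status()
                conn.execute(
                    "UPDATE import_state SET status = 'SUCCESS' WHERE rowid = ?",
                    (row["rowid"],),
                )
            except requests.RequestException as err:
                conn.execute(
                    "UPDATE import_state SET status = 'ERROR', comment = ? WHERE rowid = ?",
                    (str(err), row["rowid"]),
                )
            conn.commit()
            time.sleep(1)  # stay within OL's one-request-per-second rate limit
    conn.close()
```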
Some more thoughts: we could technically make use of all the cores on the system (using pandarallel) and parallelize the process step, thereby making many import calls in parallel (see the sketch after this list). @mekarpeles raised two good reasons not to do this for now:
The import process is resource-intensive, and queuing many imports at once may impact the server's stability.
OL rate-limits imports to one request per second, so we cannot make multiple calls in parallel anyway.
This optimization is something we might explore later on, but for now everything will run synchronously.
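Purely for later reference, a sketch of what the deferred parallel version could look like with pandarallel. This is not part of the current plan, for the reasons above; import_one is a hypothetical helper wrapping the same request logic as the sync loop.

```python
import pandas as pd
from pandarallel import pandarallel

pandarallel.initialize()  # spreads the apply across all available cores


def import_one(row: pd.Series) -> str:
    """POST a single record to OL and return 'SUCCESS' or 'ERROR'."""
    ...  # same request/error handling as the synchronous loop above


# Hypothetical usage, once rate limiting and server load are no longer concerns:
# df["status"] = df.parallel_apply(import_one, axis=1)
```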
Some ideas we discussed:
cc: @BharatKalluri