Mlos parallelization: storage schema initialization fails when attempted by many processes concurrently #793

DelphianCalamity · 2024-07-12T18:20:23Z

  File "python3.11/site-packages/mysql/connector/connection_cext.py", line 713, in cmd_query
    raise get_mysql_exception(
sqlalchemy.exc.DatabaseError: (mysql.connector.errors.DatabaseError) 1684 (HY000): Table 'mlos'.'config' was skipped since its definition is being modified by concurrent DDL statement

[SQL: DESCRIBE `mlos`.`config`]
(Background on this error at: https://sqlalche.me/e/20/4xp6)

I am running many mlos processes concurrently and get this error when they all try to call self._meta.create_all(self._engine) at the same time.

Possible solutions: use a shared mutex, or move the schema initialization to an external script?
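As a rough illustration of the shared-mutex idea, here is a minimal sketch that serializes schema initialization with an advisory file lock. It assumes all processes run on the same POSIX host; the lock path and the helper names (`schema_init_lock`, `init_schema_once`) are hypothetical and not part of mlos_bench. Processes spread across several machines would need a lock that lives in the shared backend instead (for MySQL, something like `GET_LOCK`).

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock file location; not something mlos_bench defines.
DEFAULT_LOCK_PATH = os.path.join(tempfile.gettempdir(), "mlos_schema_init.lock")


@contextmanager
def schema_init_lock(lock_path=DEFAULT_LOCK_PATH):
    """Hold an exclusive advisory lock (POSIX only) while the block runs."""
    # Append mode creates the file if needed without truncating it.
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def init_schema_once(create_schema, lock_path=DEFAULT_LOCK_PATH):
    """Run create_schema() under the lock so only one process issues DDL at a time."""
    with schema_init_lock(lock_path):
        create_schema()
```

In the situation described above, `create_schema` would presumably be something like `lambda: self._meta.create_all(self._engine)`. Because `create_all` checks for existing tables by default, the processes that acquire the lock later should find the tables already present and skip the DDL.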


bpkroth · 2024-07-12T19:00:05Z

Copying some conversations here:

What's nice about the current system is that you really only need to do pip install mlos_bench; mlos_bench ... and it will mostly just work locally by creating the SQLite DB and everything else you need. No extra setup steps required.

The mutex is really only necessary if you run multiple instances at the same time targeting the same backend storage, which is what happens with this ray system. In that case, I'd argue you need the mutex in ray and not in mlos_bench.

We've talked before (and there is an open issue, #732) about creating mlos_bench_service, which would do some of these things for you, and we could even build it with ray if we wanted (though that adds extra dependencies I'm not sure we need). Anyway, I'd argue we should add these kinds of mutexes there.

Another option is to have a flag that enables schema creation and lets people turn it on or off at will (e.g., --create-update-schema). That is probably easier to implement quickly and could also be used by mlos_bench_service, ray, or custom scripts later on.
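To make the flag idea concrete, here is a small sketch of how such an option might gate schema creation. Only the flag name --create-update-schema comes from the suggestion above. The default of on (to keep the "just works locally" behavior), the paired --no-create-update-schema form, and the helper names are assumptions for illustration, not the actual mlos_bench CLI.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical launcher command line."""
    parser = argparse.ArgumentParser(description="Hypothetical launcher sketch")
    parser.add_argument(
        "--create-update-schema",
        # Also generates --no-create-update-schema (Python 3.9+).
        action=argparse.BooleanOptionalAction,
        default=True,  # assumption: keep the zero-setup local behavior by default
        help="Create or update the storage schema on startup.",
    )
    return parser.parse_args(argv)


def maybe_init_schema(args, create_schema):
    """Call create_schema() only when the flag is enabled; report whether it ran."""
    if args.create_update_schema:
        create_schema()
        return True
    return False
```

With something like this, a parallel launcher could run schema creation once up front and then start every worker with --no-create-update-schema, so no worker issues DDL concurrently.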
