Mlos parallelization: storage schema initialization fails when attempted by many processes concurrently #793

Open
DelphianCalamity opened this issue Jul 12, 2024 · 1 comment

Comments

@DelphianCalamity
Contributor

  File "python3.11/site-packages/mysql/connector/connection_cext.py", line 713, in cmd_query
    raise get_mysql_exception(
sqlalchemy.exc.DatabaseError: (mysql.connector.errors.DatabaseError) 1684 (HY000): Table 'mlos'.'config' was skipped since its definition is being modified by concurrent DDL statement

[SQL: DESCRIBE `mlos`.`config`]
(Background on this error at: https://sqlalche.me/e/20/4xp6)

I am running many mlos processes concurrently and get this error when they all call self._meta.create_all(self._engine) at the same time.

Possible solutions: a shared mutex, or moving the schema initialization to an external script?
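As a minimal sketch of the shared-mutex option, an advisory file lock can serialize the schema-creation call across processes on the same host. The helper names `schema_init_lock` and `init_schema_serialized` and the lock path are hypothetical, `fcntl.flock` is POSIX-only, and the SQLite `CREATE TABLE IF NOT EXISTS` is only a stand-in for `self._meta.create_all(self._engine)`. Processes spread across several hosts would need a lock held by the database itself instead (for example MySQL's `GET_LOCK`).

```python
import fcntl
import sqlite3
from contextlib import closing, contextmanager


@contextmanager
def schema_init_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path (POSIX only, same host only)."""
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks while another process holds it
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def init_schema_serialized(db_path, lock_path):
    """Run the DDL while holding the lock, so only one process runs it at a time."""
    with schema_init_lock(lock_path):
        with closing(sqlite3.connect(db_path)) as conn:
            # Stand-in for self._meta.create_all(self._engine).
            conn.execute(
                "CREATE TABLE IF NOT EXISTS config (config_id INTEGER PRIMARY KEY)"
            )
            conn.commit()
```

Every process would call `init_schema_serialized` with the same lock path. The first one creates the tables, and the later ones wait for the lock and then find the schema already in place.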

@bpkroth
Contributor

bpkroth commented Jul 12, 2024

Copying some conversations here:

What's nice about the current system is you really only need to do pip install mlos_bench; mlos_bench ... and it will mostly just work locally by creating the sqlite db and everything you need for you. No extra setup steps required.

The mutex is really only necessary if you run multiple instances at the same time targeting the same backend storage, which is what happens with this ray system. In that case, I'd argue you need the mutex in ray and not in mlos_bench.
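A rough sketch of handling this on the launcher side rather than in mlos_bench: the driver runs the schema DDL once before fanning out, so no two workers ever issue DDL concurrently. Everything here is an assumption for illustration: a thread pool stands in for the ray tasks, a SQLite file stands in for the MySQL backend, and `init_schema`, `worker`, and `run_all` are hypothetical names.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor
from contextlib import closing


def init_schema(db_path):
    """Driver-side, one-time DDL (stand-in for create_all)."""
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS config (config_id INTEGER PRIMARY KEY)"
        )
        conn.commit()


def worker(db_path):
    """Worker that assumes the schema already exists and issues no DDL."""
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute("SELECT COUNT(*) FROM config").fetchone()[0]


def run_all(db_path, n_workers):
    """Initialize the schema once, then start the workers."""
    init_schema(db_path)  # runs before any worker starts
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(worker, [db_path] * n_workers))
```

With this ordering the workers never race on DDL. The trade-off is that each worker has to trust that the launcher has already prepared the storage.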

We've talked before (and there is an open issue, #732) about creating mlos_bench_service, which would do some of these things for you, and we could even build it with ray if we wanted (though that adds extra dependencies I'm not sure we need). Anyway, I'd argue we should add these kinds of mutexes there.

Another option is that we have a flag to enable that functionality and let people turn it on/off at will (e.g., --create-update-schema), which is probably easier to implement quickly and could be used by either mlos_bench_service or ray or custom scripts later on as well.
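A minimal sketch of how such a flag could look, assuming an `argparse`-based command line. The flag name is the one suggested above. The `parse_args` and `open_storage` helpers are hypothetical, and the SQLite DDL is only a stand-in for `self._meta.create_all(self._engine)`. The default of `True` is an assumption chosen to keep the current "just works locally" behavior.

```python
import argparse
import sqlite3


def parse_args(argv=None):
    """Parse a hypothetical launcher command line."""
    parser = argparse.ArgumentParser(description="schema flag sketch")
    parser.add_argument(
        "--create-update-schema",
        action=argparse.BooleanOptionalAction,  # also adds --no-create-update-schema
        default=True,
        help="Create or update the storage schema on startup.",
    )
    return parser.parse_args(argv)


def open_storage(conn, create_update_schema):
    """Run the schema DDL only when the flag is enabled."""
    if create_update_schema:
        # Stand-in for self._meta.create_all(self._engine).
        conn.execute(
            "CREATE TABLE IF NOT EXISTS config (config_id INTEGER PRIMARY KEY)"
        )
        conn.commit()
    return conn
```

Under this sketch, a launcher that has already prepared the storage would start each worker with `--no-create-update-schema`, while a plain local run keeps the default and still creates its own tables.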
