Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for multiple statements in Cassandra cql migration files #76

Merged
merged 3 commits into from
Jul 24, 2018

Conversation

aggrhm
Copy link
Contributor

@aggrhm aggrhm commented Jul 17, 2018

Currently you cannot include multiple statements in the CQL migration files due to Cassandra only allowing 1 statement per query. This makes creating multiple tables or indices in a single migration difficult. This change splits the CQL migration script by a semi-colon and newline pattern, and iterates them 1 by 1, allowing multiple statements per migration.

@coveralls
Copy link

coveralls commented Jul 17, 2018

Pull Request Test Coverage Report for Build 134

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 49.522%

Totals Coverage Status
Change from base Build 125: 0.0%
Covered Lines: 828
Relevant Lines: 1672

💛 - Coveralls

Copy link
Member

@dhui dhui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you read this?

migrate has avoid parsing the migration files b/c it's more complicate than it seems.

Upfront, I see a few issues:

  1. You could break up a valid query that contains a semi-colon. e.g. embedded in a string
  2. Partial failures. How do you handle a query that fails in the middle of a single migration? Is the migration wrapped in a transaction?
  3. Backwards compatibility. This shouldn't break existing migrations and with the first issue I mentioned, it's likely to. Add an option to the db driver to use batches instead of a single query.

Also, I'm not familiar enough w/ Cassandra to know if multiple statements are disallowed in a single query.

for _, q := range(queries) {
tq := strings.TrimSpace(q)
if (tq == "") { continue }
if err := c.session.Query(tq).Exec(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be better off using Batch since atomicity is guaranteed. But I'm still not sure about parsing the migration data.

@aggrhm
Copy link
Contributor Author

aggrhm commented Jul 18, 2018

I didn't see that link, that's probably worth adding to the CONTRIBUTING.md file. Responses to your other issues are below:

  1. Yes I agree this will break a migration with a string with a semi-colon and newline in it, but this is a pretty uncommon migration. In any event, maybe we could add a flag in the connection url to allow these parsed multi-statement migrations (or some sort of header comment so it only applies to particular migration files?).

  2. Yes, this could result in partial changes to the database, but it will just leave the schema in a dirty state, where the user needs to clean it themselves and force the version number. You cannot wrap CREATE TABLE statements in a transaction.

  3. A param or header comment for allowing parsed multi-statements will allow old migrations to continue to work as normal. BATCH statements only support INSERT, UPDATE, and DELETE (https://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlBatch.html).

I agree with your basic concern, but the issue arises from the gocql library (and somewhat Cassandra itself). Apparently, Cassandra can only support a single statement per "session.Query" call. This makes this library very tedious for Cassandra, in that I need to make a separate migration file per individual schema change.

How do you feel about adding a param to the connection url? Or adding a header comment that denotes this migration should be parsed into multiple statements?

(BTW, this issue was brought up before it was forked here (mattes/migrate#338) and here (apache/cassandra-gocql-driver#1063)

@dhui
Copy link
Member

dhui commented Jul 18, 2018

Thanks for the context!

I think an option like x-multi-statement (disabled by default) for the Cassandra driver, appropriately documented, would work. e.g. warn about simple split on ; not working w/ embedded semi-colons and atomicity.

Although having headers in the migration file offers more control and granularity, it introduces additional parsing of the migration files which I'd like to avoid to maintain simplicity.

Copy link
Member

@dhui dhui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add tests

return database.Error{OrigErr: err, Err: "migration failed", Query: migr}

// split query by semi-colon
queries := strings.Split(query, ";\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't split on \n so we don't need to deal with different line endings

@aggrhm
Copy link
Contributor Author

aggrhm commented Jul 18, 2018

Sounds good. I have some other work to get done but I'll try to make these changes over the next couple of days.

@aggrhm
Copy link
Contributor Author

aggrhm commented Jul 20, 2018

Ok I just pushed support for the x-multi-statement flag and updated the README. Also split on ";" instead of ";\n". Ran this on my migrations for my project and it works great.

Copy link
Member

@dhui dhui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the feedback! There are only a few more minor issues remaining.

@@ -3,6 +3,7 @@
* Drop command will not work on Cassandra 2.X because it rely on
system_schema table which comes with 3.X
* Other commands should work properly but are **not tested**
* The Cassandra driver (gocql) does not natively support executing multipe statements in a single query. To allow for multiple statements in a single migration, you can use the `x-multi-statement` param. Note that this simply splits the migration into separately-executed statements by a semi-colon ';'. The queries are also not executed in any sort of transaction/batch, meaning you are responsible for fixing partial migrations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use `;` instead of ';' and provide an explicit warning about semi-colons in strings causing issues.

return database.Error{OrigErr: err, Err: "migration failed", Query: migr}
}
}
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return in the if block above and remove the else to save a level of indentation
https://github.com/golang/go/wiki/CodeReviewComments#indent-error-flow

@aggrhm
Copy link
Contributor Author

aggrhm commented Jul 24, 2018

Additional requested changes fixed.

@dhui
Copy link
Member

dhui commented Jul 24, 2018

Thanks for your PR!

@dhui dhui merged commit fad64ed into golang-migrate:master Jul 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants