Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BATCH-2478: Improve JdbcPagingItemReader ResultSet access #450

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Kuroshii
Copy link

Changes (within the context of JdbcPagingItemReader):

  • Add logic for determining how to get an column from a result set based on the column name used in the query
  • Add tests that the above changes allow for sortKeys with table-qualified names.

The problem I was made aware of, which is stated somewhat ambiguously in BATCH-2478, is that there is a difference between the name that JdbcPagingItemReader needs to refer to columns that it uses as sort keys and the name that is needed to retrieve that data from a result set.

This means you could have a perfectly formatted query constructed by the queryProvider that then fails because the JdbcPagingItemReader could not retrieve the values it needs to do paging from the ResultSet returned by the query.

I noticed a getSortKeysWithoutAliases method in AbstractSqlPagingQueryProvider, but I think that that approach isn't a very good solution to the problem.

Fundamentally, the problem is that sometimes what is used to refer to a column in the query may not match how it is referred to in the result set. The obvious case is when you have table qualified column names, but in some SQL engines (Postgres, for example) it is perfectly OK to have a '.' appear in an alias as long as the alias is quoted. This isn't a complete solution, because even still, some databases use the original column name in the ResultSet, even when you specify an alias in the query (HSQL, for example).

Some ideas for dealing with this more thoroughly:

  • Parse the select clause for aliases and names of to determine which position in the result set to fetch the sort keys from. (Turn the name problem into a parsing problem).
  • Have the client of JdbcPagingItemReader get sort key values from the ResultSet
  • Have the client of JdbcPagingItemReader specify what names or positions in the original query correspond which sort keys.

@pivotal-issuemaster
Copy link

@Kuroshii Please sign the Contributor License Agreement!

Click here to manually synchronize the status of this Pull Request.

See the FAQ for frequently asked questions.

@pivotal-issuemaster
Copy link

@Kuroshii Thank you for signing the Contributor License Agreement!

if (separatorIndex > 0) {
try {
return rs.getObject(columnName.substring(separatorIndex));
} catch (SQLException ignored) { }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of testing this every time like this. Generating stack traces is pretty performance intensive. A better option would be to allow a Map<String, String> to be configurable to map any fields to the result set field. This allows it to be more flexible as well I think.

@Kuroshii Kuroshii force-pushed the batch-2478-improve-jdbc-paging-resultset-access branch from bc4accb to ef99d47 Compare February 9, 2017 15:50
@Kuroshii
Copy link
Author

Kuroshii commented Feb 9, 2017

@mminella Pull request updated; I think that should be a better solution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants