Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bypass join to TokenIndexEntity table for SearchParameter _id #2721

Open
LZRS opened this issue Nov 14, 2024 · 8 comments · May be fixed by #2743
Open

Bypass join to TokenIndexEntity table for SearchParameter _id #2721

LZRS opened this issue Nov 14, 2024 · 8 comments · May be fixed by #2743
Assignees

Comments

@LZRS
Copy link
Collaborator

LZRS commented Nov 14, 2024

Is your feature request related to a problem? Please describe.
Searching for resources given multiple ids with filter _id, does a join to the TokenIndexEntity, that may seem to incur an extra performance cost

Describe the solution you'd like
When search is done using filter in the form

fhirEngine.search<Patient>{
        filter(Patient.RES_ID, { value = of("1234") }, { value = of("23456") })
}

the query generated is of the form of

SELECT a.resourceUuid, a.serializedResource
 FROM ResourceEntity a
WHERE a.resourceType = 'Patient'
         AND a.resourceUuid IN (
SELECT resourceUuid FROM TokenIndexEntity
WHERE resourceType = 'Patient' AND index_name = '_id' AND (index_value = '1234' OR index_value = '23456')
 )

The suggestion we had was if we could bypass joining to the TokenIndexEntity table for the _id , and instead filter directly from the ResourceEntity table.
The resulting query would therefore look similar to

SELECT a.resourceUuid, a.serializedResource
 FROM ResourceEntity a
WHERE a.resourceType = 'Patient' 
         AND a.resourceId  IN ('1234', '23456')
 )

Additional context
Link to RES_ID constant

Describe alternatives you've considered
Enhancing the fhirEngine.get interface to support multiple resourceId to fetch multiple resources, although this wouldn't work with the rest of the Search DSL

Would you like to work on the issue?
Please state if this issue should be assigned to you or who you think could help to solve this issue.

@LZRS LZRS changed the title Bypass join to TokenIndexEntity table for SearchParameter [_id](https://build.fhir.org/search.html#_id) Bypass join to TokenIndexEntity table for SearchParameter _id Nov 14, 2024
@MJ1998
Copy link
Collaborator

MJ1998 commented Nov 15, 2024

Thanks for raising this @LZRS. This is a valid request. However the SQL generation logic in engine library is generic to deal with all kinds of keys.
So implementing the "bypass TokenIndexEntity" above would require to write custom logic for the RES_ID key.

Let us evaluate if this is possible. @jingtang10 @aditya-07

@aditya-07
Copy link
Collaborator

Thanks for raising this @LZRS. This is a valid request. However the SQL generation logic in engine library is generic to deal with all kinds of keys.

@LZRS Would an overloaded suspend fun get(type: ResourceType, vararg ids: String): Resource be better fit for this usecase?

@LZRS
Copy link
Collaborator Author

LZRS commented Nov 18, 2024

@LZRS Would an overloaded suspend fun get(type: ResourceType, vararg ids: String): Resource be better fit for this usecase?

Yeah, that would work although we have some usecases that would require Search DSL, for example getting Patient's Observations and Conditions using revInclude

Example

fhirEngine.search<Patient>{
        filter(Patient.RES_ID, { value = of("1234") }, { value = of("23456") })
        revInclude<Observation>(Observation.SUBJECT)
        revInclude<Condition>(Condition.SUBJECT)
}

@jingtang10
Copy link
Collaborator

@aditya-07 @dubdabasoduba @google/labs-ai-sdk-contribs we discussed this, the change should happen here:

add a shortcut so that filtering by id goes through a different codepath.

@jingtang10
Copy link
Collaborator

@LZRS Would an overloaded suspend fun get(type: ResourceType, vararg ids: String): Resource be better fit for this usecase?

Yeah, that would work although we have some usecases that would require Search DSL, for example getting Patient's Observations and Conditions using revInclude

Example

fhirEngine.search<Patient>{
        filter(Patient.RES_ID, { value = of("1234") }, { value = of("23456") })
        revInclude<Observation>(Observation.SUBJECT)
        revInclude<Condition>(Condition.SUBJECT)
}

have you tried not using revinclude and just search observations and conditions with a reference to the known patient?

i guess in the applicaiton you want one query to load all the observations and conditions? is that right?

@LZRS
Copy link
Collaborator Author

LZRS commented Nov 21, 2024

@LZRS Would an overloaded suspend fun get(type: ResourceType, vararg ids: String): Resource be better fit for this usecase?

Yeah, that would work although we have some usecases that would require Search DSL, for example getting Patient's Observations and Conditions using revInclude
Example

fhirEngine.search<Patient>{
        filter(Patient.RES_ID, { value = of("1234") }, { value = of("23456") })
        revInclude<Observation>(Observation.SUBJECT)
        revInclude<Condition>(Condition.SUBJECT)
}

have you tried not using revinclude and just search observations and conditions with a reference to the known patient?

i guess in the applicaiton you want one query to load all the observations and conditions? is that right?

No, we hadn't tried that. But I believe that would then require multiple queries, although they wouldn't probably be more than 5....yeah, I think we could try that

In that case, implementing this would just be okay I think
suspend fun get(type: ResourceType, vararg ids: String): Resource

@dubdabasoduba
Copy link
Collaborator

@jingtang10 @aditya-07 Please re-assign this to @qiarie

@jingtang10
Copy link
Collaborator

just to confirm, what we're suggesting as a fix is:

val patients = fhirEngine.get<Patient>("1", "2", ...) // this uses Simon's new API
...
val observations = fhirEngine.search<Observations> {
  filter( ... )
}

there shouldn't be a performance penalty because the search api implementation run these queries separately anyway (the joins would have been massive)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: New
Development

Successfully merging a pull request may close this issue.

6 participants