feat(api): improve caching and place name matching #451
This PR updates the caching layers, introduces a new cache, and adds Levenshtein distance to place name matching.
The BigQuery hosted service now runs on startup and every Sunday (the scheduled query that loads data from Sheets runs on Saturday) and hydrates an in-memory cache. Because the data is static, there was no reason to have more cache levels. I also couldn't figure out Firebase Remote Config :(
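For reviewers, here is a rough sketch of the "refresh every Sunday" timing, written in Python only to illustrate the idea; it is not the code in this PR. The function name `next_sunday_refresh` and the choice of midnight UTC are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def next_sunday_refresh(now: datetime) -> datetime:
    """Return the next Sunday at 00:00 strictly after `now`.

    Assumption: the refresh fires at midnight in the timezone of `now`;
    the real hosted service may use a different time of day.
    """
    # datetime.weekday(): Monday == 0 ... Sunday == 6
    days_ahead = (6 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    # If today is Sunday and midnight has already passed, wait a full week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("next refresh at", next_sunday_refresh(now).isoformat())
```

The service would hydrate the cache once at startup, then sleep until the returned time and repeat, so the Sunday run always follows the Saturday load from Sheets.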
I introduced FusionCache because Microsoft's HybridCache does not allow named caches, and we need different settings for each cache. FusionCache supports an L1 in-memory cache and an L2 Redis cache.
The Firestore FusionCache will help reduce the number of calls to read API keys. We sometimes incur a read cost because there was a 1:1 relationship between requests and Firestore reads.
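To show why a two-level read-through cache breaks the 1:1 relationship, here is a minimal Python sketch of the pattern. It is not FusionCache's API: the `TwoLevelCache` class, the `get_or_set` method, and the plain dictionary standing in for Redis are all hypothetical, and expiration is left out for brevity.

```python
from typing import Callable, Dict


class TwoLevelCache:
    """Read-through cache: L1 is per-process memory, L2 is shared.

    Assumption: `l2` is a plain dict standing in for Redis, and entries
    never expire. A real cache would apply a duration per named cache.
    """

    def __init__(self, l2: Dict[str, str]) -> None:
        self.l1: Dict[str, str] = {}
        self.l2 = l2

    def get_or_set(self, key: str, factory: Callable[[str], str]) -> str:
        # 1. Fastest path: this process has already seen the key.
        if key in self.l1:
            return self.l1[key]
        # 2. Another instance may have populated the shared cache.
        if key in self.l2:
            value = self.l2[key]
        else:
            # 3. Only now pay for the backing read (Firestore in this PR).
            value = factory(key)
            self.l2[key] = value
        self.l1[key] = value
        return value


if __name__ == "__main__":
    reads = []

    def read_from_store(key: str) -> str:
        reads.append(key)  # count the billable reads
        return "record-for-" + key

    cache = TwoLevelCache(l2={})
    for _ in range(3):
        cache.get_or_set("example-key", read_from_store)
    print("backing reads:", len(reads))
```

Repeated requests with the same key reach the backing store once; a second process sharing the same L2 would not reach it at all.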
The place name FusionCache lets us correct minor place name mistakes. First, I check the BigQuery memory cache for our standard names and, on a miss, check the FusionCache. On another miss, I get all of our standard names and use Levenshtein distance to find the closest match. If one is found, I get the grids from the BigQuery memory cache and update the FusionCache with the misspelling. Otherwise, I cache the misspelling with an empty value so it is not run through Levenshtein again.
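The lookup order above can be sketched roughly as follows. This is an illustration in Python, not the code in this PR: `resolve_place`, the `MAX_DISTANCE` threshold of 2, lowercasing the input, storing the matched standard name (rather than the grids) against the misspelling, and the sample data are all assumptions.

```python
from typing import Dict, List

MAX_DISTANCE = 2  # assumption: the largest edit distance accepted as a match


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping two rows."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def resolve_place(name: str,
                  standard: Dict[str, List[str]],
                  corrections: Dict[str, str]) -> List[str]:
    """`standard` plays the memory cache (name -> grids);
    `corrections` plays the second cache (misspelling -> standard name)."""
    key = name.strip().lower()
    if key in standard:                      # 1. exact standard name
        return standard[key]
    if key in corrections:                   # 2. misspelling seen before
        return standard.get(corrections[key], [])
    # 3. Fuzzy match against every standard name.
    best = min(standard, key=lambda s: levenshtein(key, s), default=None)
    if best is not None and levenshtein(key, best) <= MAX_DISTANCE:
        corrections[key] = best
        return standard[best]
    corrections[key] = ""                    # negative entry: skip step 3 next time
    return []


if __name__ == "__main__":
    standard = {"salt lake city": ["grid-a"], "provo": ["grid-b"]}  # made-up data
    corrections: Dict[str, str] = {}
    print(resolve_place("Salt Lake Cty", standard, corrections))
    print(corrections)
```

The empty entry is what keeps a hopeless input from triggering a full scan of the standard names on every request.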
I chose not to add this feature to zip codes and instead expect an exact match: zip codes are very similar to one another, so Levenshtein could produce strange results. Zip lookups use the BigQuery memory cache.