dynamic error handling within eia_data()
from metadata layer
#17
Conversation
…tests based on new dynamic error handling messaging
…r input validation following a first error.
This reminded me we should have some unit tests that make use of all possible frequency values, so we'll probably have to test against at least a couple different data sources for which different frequencies can be requested. But my biggest question is the one about the algo for requesting metadata within a data request.
R/data.R
Outdated
if(!is.null(end)) .end_check(end, freq, md$Frequency, md_end, md_start)
if(!is.null(sort)) .sort_check(sort)
if(!is.null(length)) .lng_check(length)
if(!is.null(offset)) .ofs_check(offset)
General comment about lines 83-105: What do you think about migrating all the instances of this `if(!is.null(x))` stuff into the functions that check or handle the rest of the respective operations? It seems that since we have all these other internal helper functions handling cases and returning something (or nothing), they could also handle the NULL case themselves. Then outer functions like `.eia_data_url()` and `.eia_data_check()` can be further simplified.
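As a rough sketch of the suggestion, assuming a helper like `.sort_check()` (its internal validation logic here is illustrative, not the package's actual implementation), the NULL guard moves inside the helper so callers can invoke it unconditionally:

```r
# Hypothetical sketch: the helper handles the NULL case itself,
# so callers no longer need if(!is.null(x)) guards.
.sort_check <- function(sort){
  if(is.null(sort)) return(invisible(NULL))  # nothing to validate; no-op
  if(!is.list(sort) || !all(c("cols", "order") %in% names(sort)))
    stop("'sort' must be a list containing 'cols' and 'order'.", call. = FALSE)
  invisible(NULL)
}

# Caller side simplifies from
#   if(!is.null(sort)) .sort_check(sort)
# to simply
#   .sort_check(sort)
```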
Yes! I will make this happen. Good catch.
R/data.R
Outdated
@@ -58,6 +58,8 @@ eia_data <- function(dir, data = NULL, facets = NULL,
}

.eia_data <- function(dir, data, facets, freq, start, end, sort, length, offset, tidy, key){
  md <- eia_metadata(dir, TRUE, TRUE, key)
Does this mean that every API request becomes a minimum of two API requests?
Unfortunately, yes. I'm not sure how else having truly dynamic validation of input values could be accomplished - unless we precompiled within the package itself (maybe within `data-raw/`...?) all the available options for every API data endpoint.
My intention was to only check as part of error handling itself, like catching an exception and handling it as needed.
Meaning: a data request is made. Then, if data is returned in the expected manner, do nothing but return the data. This would be the 99% common use case and only ever makes one request. Otherwise, handle the exception when it occurs (whether that is a try/catch around an actual error, or checking for some message or lack of data in the response).
So a second API request, for metadata, would only be made at the end of the parent data request function, if needed based on the result.
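The flow being described could look something like the following (function names other than `eia_metadata()` are hypothetical stand-ins, not the package's actual internals):

```r
# Illustrative only: request data first; fetch metadata for input
# validation only if the data request fails.
.eia_data <- function(dir, ..., key){
  result <- tryCatch(
    .eia_data_request(dir, ..., key = key),  # common path: one API call
    error = function(e) e                     # capture the condition object
  )
  if(inherits(result, "error")){
    md <- eia_metadata(dir, TRUE, TRUE, key)  # second call, only on failure
    .eia_data_check(md, ...)                  # try to raise a more informative error
    stop(result)                              # otherwise re-throw the original
  }
  result
}
```

In the 99% case the `tryCatch()` never triggers and exactly one request is made.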
So, completely dynamic: typical exception handling around the data call, rather than an automatic preemptive request.
Looking at it, I'm thinking I could just re-order the metadata call to come after the data call...
Let me first commit and push the `is.null()` checks to the helper functions, and then I'll play around with handling on error/warning, rather than preemptively.
The other thing to realize is that it will only make two API calls on the first function call to that endpoint. After that initial call, that endpoint and its metadata will be cached, so only one API call is made from then on within the context of that API endpoint (`dir` entry).
Right. That's an excellent benefit of the memoization.
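The caching pattern referred to here can be sketched with the memoise package (the exact signature and internals of `eia_metadata()` below are assumed for illustration):

```r
# Sketch of the memoization being discussed: repeated metadata requests
# for the same endpoint are served from cache instead of hitting the API.
library(memoise)

eia_metadata <- memoise(function(dir, tidy = TRUE, cache = TRUE, key){
  # ...the actual API request happens here, only on a cache miss...
})
```

With this in place, the first call to an endpoint pays for two requests (data plus metadata); every subsequent call to the same `dir` pays for one.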
Another option is to make available a `check_metadata = FALSE` (default) parameter to `eia_data()`, ideally after `tidy` and before `key`, as perhaps the second least likely parameter to be passed an argument by the user.
Then, if they want to run a call with a more robust check that (potentially) makes another API request, they can set `check_metadata = TRUE`. Think of it like a sort of debug mode, though I wouldn't use that name. Then within the code, the metadata request and the subsequent checks could all be wrapped in an `if(check_metadata)`.
It's just another option; if it's not worth the trouble to have the metadata request and checks be fully dynamic/only run as needed, then having the extra checking be optional but off by default is good.
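Concretely, the proposed opt-in parameter might look like this (the full argument list is taken from the discussion above; the body is a hypothetical sketch, not the package's implementation):

```r
# Hypothetical signature: stricter metadata-based validation, off by default.
eia_data <- function(dir, data = NULL, facets = NULL, freq = NULL,
                     start = NULL, end = NULL, sort = NULL,
                     length = NULL, offset = NULL, tidy = TRUE,
                     check_metadata = FALSE, key = eia_get_key()){
  if(check_metadata){
    md <- eia_metadata(dir, TRUE, TRUE, key)  # extra API request, opt-in only
    # ...run the stricter input checks against md here...
  }
  # ...proceed with the data request as usual...
}
```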
Oh no - I like this. Good suggestion.
…defense in handling errors.
…rom metadata check; `sort` made more robust where `order` may now handle a list of varying length respective of `cols` length; messaging made more consistent with API error messaging
…he inputs being tested
All this unit testing work has made me really re-think the user experience and the in-package (non-API) error handling I've written. I've modified this fairly heavily; e.g. removing
One thing I'm still concerned with, and would appreciate your input on, is with the
Further, I am still planning to find an API data endpoint offering hourly granularity so I can add testing around that - just haven't gotten to it yet.
That's a good point about the start and freq formats. There is nothing inherently wrong with imposing more formality, requiring a matching format, in this area simply because the API itself is looser in its requirement around this. But of course it's also fine to only reproduce the API's behavior; I could go either way. You can always indicate in the documentation, in the details, that if both arguments are given and one is a higher temporal resolution than the other, its resolution is reduced to match that of
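The "reduce the resolution to match" behavior mentioned above could be implemented with a small helper along these lines (this function and the frequency-to-width mapping are illustrative assumptions, not part of the package):

```r
# Hypothetical helper: truncate a higher-resolution start/end string to the
# width implied by freq, mirroring the API's looser handling.
# e.g. a daily-format "2020-06-15" with freq = "monthly" becomes "2020-06".
.match_resolution <- function(x, freq){
  width <- switch(freq,
    annual  = 4L,   # "YYYY"
    monthly = 7L,   # "YYYY-MM"
    daily   = 10L,  # "YYYY-MM-DD"
    hourly  = 13L   # "YYYY-MM-DDTHH"
  )
  substr(x, 1, width)
}
```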