Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dynamic error handling within eia_data() from metadata layer #17

Merged
merged 20 commits into from
Nov 16, 2023
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .Rbuildignore
Original file line number Diff line number Diff line change
Expand Up @@ -23,3 +23,4 @@
^doc$
^Meta$
^\.github$
^CRAN-SUBMISSION$
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,4 @@ README.html
doc
Meta
docs
^CRAN-SUBMISSION$
4 changes: 4 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
# eia 0.4.1

* Added dynamic error handling to `eia_data()` via metadata layer.

# eia 0.4.0

* Updated functions and functionality based on the new v2 API, effective March 2023;
Expand Down
153 changes: 110 additions & 43 deletions R/data.R
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,8 @@ eia_data <- function(dir, data = NULL, facets = NULL,
}

.eia_data <- function(dir, data, facets, freq, start, end, sort, length, offset, tidy, key){
md <- eia_metadata(dir, TRUE, TRUE, key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that every API request becomes a minimum of two API requests?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, yes. I'm not sure how else having truly dynamic validation of input values could be accomplished - unless we precompiled within the package itself (maybe within data-raw/...?) all the available options for every API data endpoint.

Copy link
Member

@leonawicz leonawicz Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was to only check as part of error handling itself, like catching an exception and handling it as needed.

Meaning, data request is made. Then, if data is returned in some expected manner, do nothing but return the data. This would be the 99% common use case and only ever makes one request. Otherwise, handle the exception when it occurs (whether that is a try catch around an actual error, or checking for some message or lack of data in the response).

So a second API request, for metadata, would only be made at the end of the parent data request function if needed based on the result.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, completely dynamic- typical exception handling around the data call, rather than an automatic preemptive request.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at it, I'm thinking I could just re-order the metadata call to come after the data call...

Let me first commit and push the is.null() checks to the helper functions, and then I'll play around with error handling on error/warning, rather than preemptively.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other thing to realize, is that it will only make two API calls on the first function call to that endpoint. After that initial call, that endpoint and its metadata will be cached, so only one API call is made from then on within the context of that API endpoint (dir entry).

Copy link
Member

@leonawicz leonawicz Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. That's an excellent benefit of the memoization.

Another option is to make available a check_metadata = FALSE (default) parameter to eia_data(), ideally after tidy and before key, as perhaps the second least likely parameter to be passed an argument by the user.

Then, if they want to run a call with a more robust check that (potentially) makes another API request, they can set check_metadata = TRUE. Think of it like a sort of debug mode, though I wouldn't use that name. Then within the code, the metadata request and the subsequent checks could all be wrapped in an if(check_metadata).

It's just another option; if it's not worth the trouble to have the metadata request and checks be fully dynamic/only run as needed, then having the extra checking be optional but off by default is good.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no - I like this. Good suggestion.

.eia_data_check(md, dir, data, facets, freq, start, end, sort, length, offset)
r <- .eia_get(.eia_data_url(dir, data, facets, freq, start, end, sort, length, offset, key))
if(is.na(tidy)) return(r)
r <- jsonlite::fromJSON(r)
Expand All @@ -78,83 +80,148 @@ eia_data <- function(dir, data = NULL, facets = NULL,
.eia_data_memoized <- memoise::memoise(.eia_data)

.eia_data_url <- function(dir, data, facets, freq, start, end, sort, length, offset, key){
dir <- .eia_url(path = paste0(dir, "/data/?api_key=", key))
dat_spcs <- if(!is.null(data)) .data_specs(data)
fct_spcs <- if(!is.null(facets)) .facet_specs(facets)
frq_spcs <- if(!is.null(freq)) .freq_specs(freq)
str_spcs <- if(!is.null(start)) .start_specs(start)
end_spcs <- if(!is.null(end)) .end_specs(end)
srt_spcs <- if(!is.null(sort)) .sort_specs(sort)
lng_spcs <- if(!is.null(length)) .lng_specs(length)
ofs_spcs <- if(!is.null(offset)) .ofs_specs(offset)
paste0(dir, dat_spcs, fct_spcs, frq_spcs, str_spcs, end_spcs, srt_spcs, lng_spcs, ofs_spcs)
paste0(
.eia_url(path = paste0(dir, "/data/?api_key=", key)),
if(!is.null(data)) .data_specs(data),
if(!is.null(facets)) .facet_specs(facets),
if(!is.null(freq)) .freq_specs(freq),
if(!is.null(start)) .start_specs(start),
if(!is.null(end)) .end_specs(end),
if(!is.null(sort)) .sort_specs(sort),
if(!is.null(length)) .lng_specs(length),
if(!is.null(offset)) .ofs_specs(offset)
)
}

.eia_data_check <- function(md, dir, data, facets, freq, start, end, sort, length, offset){
if(!is.null(data)) .data_check(data, md$Data$id)
if(!is.null(facets)) .facet_check(facets, md$Facets$id)
if(!is.null(freq)) .freq_check(freq, md$Frequency$id)
md_start <- md$Period$start; md_end <- md$Period$end
if(!is.null(start)) .start_check(start, freq, md$Frequency, md_start, md_end)
if(!is.null(end)) .end_check(end, freq, md$Frequency, md_end, md_start)
if(!is.null(sort)) .sort_check(sort)
if(!is.null(length)) .lng_check(length)
if(!is.null(offset)) .ofs_check(offset)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment about lines 83-105: What do you think about migrating all the instances of this if(!is.null(x)) stuff into the functions that check or handle the rest of the respective operations? Seems like since we have all these other internal helper functions handling cases and returning something (or nothing), they could also handle the NULL case themselves as well. Then outer functions like .eia_data_url() and .eia_data_check() can be further simplified.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I will make this happen. Good catch.

}

.data_specs <- function(data){
paste0("&data[]=", data, collapse = "")
}

.data_check <- function(data, ids){
if (!all(data %in% ids))
stop("'data' must be some combination of: ", paste(ids, collapse = ", "),
call. = FALSE)
}

.facet_specs <- function(facets){
paste0(unlist(lapply(
1:length(facets),
paste0(unlist(lapply(1:length(facets),
function(x){
paste0("&facets[", names(facets[x]), "][]=", unlist(facets[x]), collapse = "")
})), collapse = "")
})), collapse = "")
}

.facet_check <- function(facets, ids){
nms <- names(facets)
if (!all(nms %in% ids))
stop("names of the 'facets' list input must be some combination of: ",
paste(ids, collapse = ", "),
call. = FALSE)
}

.freq_specs <- function(freq){
if (!is.character(freq) | length(freq) > 1)
stop("'freq' must be a character value of length 1.")
freqs <- c("annual", "yearly", "quarterly", "monthly", "daily", "hourly")
if (!freq %in% freqs)
stop("'freq' must be one of: 'annual', 'yearly', 'monthly', 'daily', or 'hourly'.")
paste0("&frequency=", freq)
}

.start_specs <- function(start, freq){
if (!is.character(start))
stop("'start' must be a character matching the required frequency format.")
.freq_check <- function(freq, ids){
if (!is.character(freq) | length(freq) > 1)
stop("'freq' must be a character value of length 1.",
"\n'freq' options are: ", paste(ids, collapse = ", "),
call. = FALSE)
if (!(freq %in% ids))
stop("'freq' must be one of: ", paste(ids, collapse = ", "),
call. = FALSE)
}

.start_specs <- function(start){
paste0("&start=", start)
}

.end_specs <- function(end, freq){
if (!is.character(end))
stop("'end' must be a character matching the required frequency format.")
.start_check <- function(start, freq, md_frq_tbl, mds, mde){
fmt <- md_frq_tbl[md_frq_tbl$id == freq, ]$format
if (!is.character(start) | nchar(start) != nchar(fmt))
stop("'start' must be a character string of format: ", fmt,
call. = FALSE)
if (start > mde)
stop("'start' is beyond the end of available data.",
call. = FALSE)
if (start < mds)
warning("'start' is beyond available history. Earliest available: ", mds,
call. = FALSE)
}

.end_specs <- function(end){
paste0("&end=", end)
}

.end_check <- function(end, freq, md_frq_tbl, mde, mds){
fmt <- md_frq_tbl[md_frq_tbl$id == freq, ]$format
if (!is.character(end) | nchar(end) != nchar(fmt))
stop("'end' must be a character string of format: ", fmt,
call. = FALSE)
if (end < mds)
stop("'end' is before the start of available data.",
call. = FALSE)
if (end > mde)
warning("'end' is beyond available history. Latest available: ", mde,
call. = FALSE)
}

.sort_specs <- function(sort){
if (length(sort) != 2 || !all(names(sort) %in% c("cols", "order")))
stop(
"'sort' must be a named list of length 2 containing the following:\n",
"'cols' and 'order' of arbitrary length and of length 1, respectively."
)
cols <- sort$cols
sort_cols <- lapply(1:length(cols),
function(x){paste0("&sort[", x, "][column]=", unlist(cols[x]), collapse = "")}
)
order <- sort$order
sort_cols <- lapply(
1:length(cols),
function(x){
paste0("&sort[", x, "][column]=", unlist(cols[x]), collapse = "")
})
if (length(order) > 1)
stop("must provide a single value for 'order': 'asc' or 'desc'.")
if (!order %in% c("asc", "desc"))
stop("'order' must be one of 'asc' or 'desc'.")
sort_order <- lapply(
1:length(cols),
function(x) { paste0("&sort[", x, "][direction]=", order) }
function(x) {paste0("&sort[", x, "][direction]=", order)}
)
paste0(unlist(sort_cols), sort_order, collapse = "")
}

.sort_check <- function(sort){
if (length(sort) != 2 || !all(names(sort) %in% c("cols", "order")))
stop("'sort' must be a named list of length 2 containing the following:\n",
"'cols' and 'order' of arbitrary length and of length 1, respectively.",
call.=FALSE)
cols <- sort$cols
order <- sort$order
if (length(order) > 1)
stop("must provide a single value for 'order': 'asc' or 'desc'.",
call. = FALSE)
if (!order %in% c("asc", "desc"))
stop("'order' must be one of 'asc' or 'desc'.",
call. = FALSE)
}

.lng_specs <- function(length){
if (!is.numeric(length) | length > 5000)
stop("'length' must be a single numeric value between 0 and 5000.")
paste0("&length=", length)
}

.lng_check <- function(length){
if (!is.numeric(length) | length > 5000)
stop("'length' must be a single numeric value between 0 and 5000.",
call. = FALSE)
}

.ofs_specs <- function(offset){
if (!is.numeric(offset) | offset < 0)
stop("'offset' must be a single numeric value greater than 0.")
paste0("&offset=", offset)
}

.ofs_check <- function(offset){
if (!is.numeric(offset) | offset < 0)
stop("'offset' must be a single numeric value greater than 0.",
call. = FALSE)
}
51 changes: 41 additions & 10 deletions tests/testthat/test-data.R
Original file line number Diff line number Diff line change
Expand Up @@ -10,8 +10,8 @@ test_that("data queries return as expected", {
# Test JSON list returned given `tidy = FALSE`
x <- eia_data("electricity/retail-sales", tidy = FALSE)
expect_type(x, "list")
expect_equal(length(x), 3)
expect_equal(names(x), c("response", "request", "apiVersion"))
expect_equal(length(x), 4)
expect_equal(names(x), c("response", "request", "apiVersion", "ExcelAddInVersion"))

options(eia_antidos = 3)

Expand Down Expand Up @@ -53,19 +53,49 @@ test_that("data queries return as expected", {
# Test no data
expect_error(suppressMessages(eia_data("electricity/zzz")), "Page not found")

# Test "data" input value
err <- "'data' must be some combination of: revenue, sales, price, customers"
expect_error(eia_data("electricity/retail-sales", data = "prce"), err)

# Test "facet" input value
err <- "names of the 'facets' list input must be some combination of: stateid, sectorid"
expect_error(eia_data("electricity/retail-sales", facets = list(statid = "OH")), err)

# Test "freq" input value
err <- "'freq' must be one of: 'annual', 'yearly', 'monthly', 'daily', or 'hourly'."
err <- "'freq' must be one of: monthly, quarterly, annual"
expect_error(eia_data("electricity/retail-sales", freq = "anual"), err)
err <- "'freq' must be a character value of length 1."
expect_error(eia_data("electricity/retail-sales", freq = c("annual", "monthly")), err)

# Test non-character "start" input value
err <- "'start' must be a character matching the required frequency format."
expect_error(eia_data("electricity/retail-sales", freq = "annual", start = 2010), err)
# Test "start" input value
err <- "'start' must be a character string of format: YYYY"
expect_error(eia_data("electricity/retail-sales", freq="annual", start=2010), err)
err <- "'start' is beyond the end of available data."
expect_error(eia_data("electricity/retail-sales", freq="annual", start="2099"), err)
err <- "'start' must be a character string of format: YYYY"
expect_error(eia_data("electricity/retail-sales", freq="annual", start="2020-06"), err)
err <- "'start' must be a character string of format: YYYY-MM"
expect_error(eia_data("electricity/retail-sales", freq="monthly", start="2020"), err)
wrn <- "'start' is beyond available history. Earliest available: 2001-01"
expect_warning(
eia_data("electricity/retail-sales", facets=list(stateid="OH"), freq="annual", start="1980", end="2020"),
wrn
)

# Test non-character "end" input value
err <- "'end' must be a character matching the required frequency format."
expect_error(eia_data("electricity/retail-sales", freq = "annual", end = 2010), err)
# Test "end" input value
err <- "'end' must be a character string of format: YYYY"
expect_error(eia_data("electricity/retail-sales", freq="annual", end=2010), err)
err <- "'end' is before the start of available data."
expect_error(eia_data("electricity/retail-sales", freq="annual", end="1980"), err)
err <- "'end' must be a character string of format: YYYY"
expect_error(eia_data("electricity/retail-sales", freq="annual", end="2020-06"), err)
err <- "'end' must be a character string of format: YYYY-MM"
expect_error(eia_data("electricity/retail-sales", freq="monthly", end="2020"), err)
wrn <- "'end' is beyond available history. Latest available: 2023-08"
expect_warning(
eia_data("electricity/retail-sales", facets=list(stateid="OH"), freq="annual", start="2020", end="2099"),
wrn
)

# Test "length" input value
expect_warning(
Expand Down Expand Up @@ -135,7 +165,8 @@ test_that("metadata helper returns as expected", {

x <- eia_metadata("electricity/retail-sales", tidy = FALSE, cache = FALSE)
expect_type(x, "list")
expect_length(x, 3)
expect_length(x, 4)
expect_equal(names(x), c("response", "request", "apiVersion", "ExcelAddInVersion"))

x <- eia_metadata("electricity/retail-sales", tidy = NA, cache = FALSE)
expect_type(x, "character")
Expand Down
3 changes: 2 additions & 1 deletion tests/testthat/test-dirs.R
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,8 @@ test_that("directory functions returns as expected", {

x <- eia_dir(tidy = FALSE, key = key)
expect_type(x, "list")
expect_equal(names(x), c("response", "request", "apiVersion"))
expect_equal(length(x), 4)
expect_equal(names(x), c("response", "request", "apiVersion", "ExcelAddInVersion"))

x2 <- eia_dir(tidy = FALSE)
expect_identical(x, x2)
Expand Down
4 changes: 2 additions & 2 deletions tests/testthat/test-facets.R
Original file line number Diff line number Diff line change
Expand Up @@ -11,8 +11,8 @@ test_that("facets working as expected", {
# Test JSON list returned given `tidy = FALSE`
x <- eia_facets("electricity/retail-sales", facet = "sectorid", tidy = FALSE)
expect_type(x, "list")
expect_equal(length(x), 3)
expect_equal(names(x), c("response", "request", "apiVersion"))
expect_equal(length(x), 4)
expect_equal(names(x), c("response", "request", "apiVersion", "ExcelAddInVersion"))

# Test character object returned given `tidy = NA`
x <- eia_facets("electricity/retail-sales", facet = "sectorid", tidy = NA)
Expand Down