Sorting on dates with NULL values #227

Open
dhicks opened this issue Dec 30, 2022 · 0 comments

Brief description of the problem: When sorting a csl_dates column that contains missing (NULL) values, the ordering is wrong and an entry is dropped. Replacing the NULL entries with NA resolves the problem, but doing so is tricky.

Diagnosis: This came up when I was trying to use bibliography_entries() with a Zotero export of a group that included some submitted but unpublished papers (so, no publication date). bibliography_entries() calls jsonlite::fromJSON(), which has a longstanding issue of translating JSON null to R NULL: jeroen/jsonlite#70.
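
To illustrate the diagnosis, here is a minimal sketch of how a JSON null can end up as an R NULL after parsing. The JSON fragment is made up for illustration, and the call uses simplifyVector = FALSE as an assumption; I have not checked which arguments bibliography_entries() actually passes to jsonlite::fromJSON().

```r
library(jsonlite)

# Hypothetical CSL-JSON fragment: the second entry has no publication date.
json <- '[
  {"title": "Published paper", "issued": {"date-parts": [[2020]]}},
  {"title": "Submitted paper", "issued": null}
]'

# Assumption: parse without simplification so each entry stays a nested list.
entries <- fromJSON(json, simplifyVector = FALSE)

# Collect the "issued" field of every entry into a plain list.
issued <- lapply(entries, function(entry) entry[["issued"]])

# TRUE wherever the date came through as NULL rather than NA.
missing_date <- vapply(issued, is.null, logical(1))
```

Any entry flagged in missing_date would become a NULL element of the resulting date column, which is the situation the reprex below reproduces by hand.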

Reprex:

library(vitae)
#> 
#> Attaching package: 'vitae'
#> The following object is masked from 'package:stats':
#> 
#>     filter

dates = structure(list(
    structure(list(`date-parts` = list(list(2020L))), class = "csl_date"),
    NULL, 
    structure(list(`date-parts` = list(list(2019L, 3L, 14L))), class = "csl_date"), 
    structure(list(`date-parts` = list(list(2016L, 12L, 22L))), class = "csl_date"), 
    structure(list(`date-parts` = list(list(2020L, 1L))), class = "csl_date")
    ), 
    class = c("csl_dates", "vctrs_vctr", "list"))

dates
#> <csl_dates[5]>
#> [1] 2020       NULL       2019-3-14  2016-12-22 2020-1
## The order is all wrong and the last entry has disappeared
dates[order(dates)]
#> <csl_dates[4]>
#> [1] 2019-3-14  NULL       2020       2016-12-22

## From <https://stackoverflow.com/questions/22870198/is-there-a-more-efficient-way-to-replace-null-with-na-in-a-list/49539022#49539022>
replace_x <- function(x, replacement = NA_character_) {
    if (length(x) == 0 || length(x[[1]]) == 0) {
        replacement
    } else {
        x
    }
}

## Presumably you could use an lapply here, but I can't be bothered to figure that out right now
fixed_dates = purrr::modify_depth(dates, 1, replace_x)
## Sorted correctly, with the missing value at the end
fixed_dates[order(fixed_dates)]
#> <csl_dates[5]>
#> [1] 2016-12-22 2019-3-14  2020       2020-1     NA

Created on 2022-12-30 with reprex v2.0.2
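
For anyone who wants to avoid the purrr dependency, the lapply version mentioned in the reprex could look roughly like the sketch below. replace_null_dates() is a hypothetical helper, not part of vitae, and I have only reasoned about it against the structure shown above rather than run it through bibliography_entries().

```r
# Hypothetical helper (not part of vitae): replace NULL or zero-length
# elements with NA and restore the class of the input afterwards.
replace_null_dates <- function(x, replacement = NA_character_) {
  # unclass() so lapply() iterates over the bare list, bypassing S3 methods.
  fixed <- lapply(unclass(x), function(el) {
    if (length(el) == 0) replacement else el
  })
  class(fixed) <- class(x)
  fixed
}

# Small stand-in for the csl_dates vector from the reprex.
dates <- structure(list(
    structure(list(`date-parts` = list(list(2020L))), class = "csl_date"),
    NULL
    ),
    class = c("csl_dates", "vctrs_vctr", "list"))

fixed_dates <- replace_null_dates(dates)
```

The result should have the same shape as the purrr::modify_depth() output above, so fixed_dates[order(fixed_dates)] would be expected to sort the same way once vitae is loaded.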
