How to extract contents from a fst file when R crashes reading it #264

Open
gabowi opened this issue Nov 24, 2021 · 2 comments

gabowi commented Nov 24, 2021

Hey everyone,

First of all, thank you for this package, which is quite helpful in our work. After writing and reading a lot of files already, I am now experiencing a problem for the first time.

When I try to read a 12 GB fst file (using read_fst(path_fstfile)), R crashes. The error message is: "R Session Aborted. R encountered a fatal error. The session was terminated."

This can be reproduced on different computers and from different sources (network, local drive). It is independent of whether data.table is loaded, and of whether the script is run through RStudio or from the command line using Rscript.exe. There is sufficient memory available (more than 100 GB RAM). Other fst files can be read successfully.

metadata_fst() works well on this file (see output below).

Is there any method to retrieve the contents of this file?

Thank you in advance for your help.
Gabriel

> metadata_fst(path_fstfile)
<fst file>
120534568 rows, 43 columns (demandsimulationResult.fst)

* 'tripId'                   : integer
* 'legId'                    : integer
* 'personnumber'             : integer
* 'householdOid'             : integer
* 'personOid'                : integer
* ....

Note: other columns are of type character, double and logical.

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] fst_0.9.4

loaded via a namespace (and not attached):
[1] compiler_4.1.0 parallel_4.1.0 tools_4.1.0    Rcpp_1.0.7 

Note: On another computer with R version 4.1.2 the error occurs as well.

fox34 commented Dec 21, 2021

Have you tried incrementally reading parts of the file? E.g.

read_fst(path_fstfile, from=1, to=100)
read_fst(path_fstfile, from=100, to=1000)
read_fst(path_fstfile, from=120534468)
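
To cover the whole file systematically rather than with a few spot checks, the same idea could be wrapped in a loop. The following is only a sketch: `read_in_chunks`, `chunk_size` and `log_file` are hypothetical names, and it assumes `metadata_fst()` exposes the row count as `nrOfRows`. Because a hard session abort cannot be caught with `tryCatch()`, each chunk range is written to a log file before it is read, so the last line of the log shows where the crash happened.

```r
library(fst)

# Hypothetical helper: read rows in fixed-size chunks and log each range
# before reading it, so the log survives a crash of the R session.
read_in_chunks <- function(path, chunk_size = 1e6, log_file = "fst_chunks.log") {
  n_rows <- metadata_fst(path)$nrOfRows  # assumed field name of the metadata object
  starts <- seq(1, n_rows, by = chunk_size)
  chunks <- vector("list", length(starts))
  for (i in seq_along(starts)) {
    from <- starts[i]
    to <- min(from + chunk_size - 1, n_rows)
    # Log first: if the read below aborts the session, this line remains.
    cat(sprintf("reading rows %.0f-%.0f\n", from, to),
        file = log_file, append = TRUE)
    chunks[[i]] <- read_fst(path, from = from, to = to)
  }
  # Combine the chunks back into one data frame.
  do.call(rbind, chunks)
}
```

If every chunk reads fine but the full read still crashes, that would point at the size of the result rather than at a damaged block in the file.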

MarcusKlik (Collaborator) commented

Hi @gabowi, did you check your memory consumption while the fst file is loading from disk? This sounds like your system doesn't have enough memory to read this file, but that shouldn't crash R. Were the partial reads suggested by @fox34 successful?
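
One way to get a rough picture of the memory needed, and to see whether a particular column triggers the crash, might be to read the file one column at a time. This is a sketch under assumptions: `probe_columns` and `log_file` are hypothetical names, and it assumes the metadata object exposes the column names as `columnNames`. The column name is logged before each read, so after a session abort the log shows which column was being read.

```r
library(fst)

# Hypothetical helper: read each column separately, logging its name first
# and recording its in-memory size (in bytes) afterwards.
probe_columns <- function(path, log_file = "fst_columns.log") {
  col_names <- metadata_fst(path)$columnNames  # assumed field name
  sizes <- numeric(length(col_names))
  names(sizes) <- col_names
  for (col in col_names) {
    # Log first: this line remains even if the read aborts the session.
    cat("reading column ", col, "\n", sep = "", file = log_file, append = TRUE)
    x <- read_fst(path, columns = col)
    sizes[col] <- as.numeric(object.size(x))
    rm(x)  # drop the column so only one is held in memory at a time
  }
  sizes
}
```

The sum of the returned sizes gives only an approximate figure for the full table, since character columns share R's global string cache.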
