Generation of the data set report fails #3326

Open
AndyDaniel1 opened this issue Mar 28, 2024 · 9 comments

Comments

@AndyDaniel1
Member

The generation of the data set report fails.

Error message:

[Screenshot of the error message; image not preserved]

@SaCodematix @tilovillwock @moellerth

@tilovillwock
Collaborator

@AndyDaniel1 @thorsteneuler as far as I can tell the report task error message indicates that there's a rogue unicode character in the dataset as described here. This character would be rendered as in a modern text editor (e.g. Notepad++ or Visual Studio Code).

I don't really have a good explanation for how this particular sequence would end up in a dataset, other than that the dataset may have gone through a lossy conversion, e.g. from a legacy encoding like ISO-8859-1 to UTF-8. I'm unfortunately not that familiar with SPSS.
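For anyone who wants to check a dataset export for such characters before re-running the report, here is a minimal Python sketch. It is not the check the report task itself performs. The set of "suspicious" Unicode categories and the function name `find_suspicious_chars` are assumptions made for illustration, so adjust them to your data.

```python
import unicodedata

# Assumption: these general categories are treated as "rogue" here.
# Cc = control, Cf = format (e.g. zero-width space, BOM), Co = private use,
# Cn = unassigned, Cs = surrogate.
SUSPICIOUS_CATEGORIES = {"Cc", "Cf", "Co", "Cn", "Cs"}

# Ordinary whitespace control characters that should not be reported.
ALLOWED = {"\t", "\n", "\r"}


def find_suspicious_chars(text):
    """Return (line, column, code point, name) for each suspicious character."""
    hits = []
    # Split on "\n" only: str.splitlines() would also split on some of the
    # control characters we are trying to find.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in ALLOWED:
                continue
            # U+FFFD (replacement character) has category "So", so test it explicitly.
            if ch == "\ufffd" or unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((line_no, col_no, f"U+{ord(ch):04X}", name))
    return hits
```

To scan a file, one option is to read it with `open(path, encoding="utf-8", errors="replace")` (where `path` is your own export file), so that bytes which are not valid UTF-8 turn into U+FFFD and show up in the result. Characters left over from a mixed-up legacy encoding often land in the C1 control range (U+0080 to U+009F), which this sketch reports as category `Cc`.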

tilovillwock self-assigned this Mar 28, 2024
@thorsteneuler

@tilovillwock Thank you for the information. I've checked for the rogue unicode character (found 19 of them) and removed them. I'll start the generation of the data set report again with the updated information to check if it works now.

I do have two guesses as to how this particular character got into the dataset (neither of them due to SPSS). It seems to have originated from words in quotation marks within sentences that are themselves in quotation marks.

@thorsteneuler

@tilovillwock Generation failed again. Are there other rogue characters which I can check for?

@tilovillwock
Collaborator

@thorsteneuler this time the report generation seems to be failing because we've hit some kind of memory limit. I'm still investigating but this might take some time. I'll report back as soon as possible. Sorry for the inconvenience.

tilovillwock added a commit to dzhw/report-task that referenced this issue Apr 29, 2024
tilovillwock added a commit to dzhw/report-task that referenced this issue Apr 29, 2024
* Upgrade GitHub workflow actions (#45)
* Added a memory configuration to allow for processing of larger
  reports (dzhw/metadatamanagement#3326)
tilovillwock added a commit to dzhw/report-task that referenced this issue Apr 29, 2024
@tilovillwock
Collaborator

I finally found a configuration that seems to be able to process the workload. We've deployed a fix to production.

@thorsteneuler your report should be listed now since it went through successfully when I tried it.

@anneweber please try creating a report for dat-nac2018-ds1 again.

@AndyDaniel1 we should discuss the underlying issues in detail during our next Jour Fixe.

@AndyDaniel1
Member Author

@tilovillwock thank you!!

@AndyDaniel1
Member Author

The generation failed again for nac2018.

[Screenshot; image not preserved]

@tilovillwock
Collaborator

@anneweber I made another adjustment. Seems like it went through. Your PDF report is listed now.

@anneweber
Contributor

@tilovillwock Yes, it seems to work now. :) Thank you! 👍
