Be robust to invalid utf-8 characters in task db #115

Open
wants to merge 1 commit into
base: develop


bergercookie
Contributor

Sometimes the taskwarrior `pending.data` file contains non-printable characters, for example because an emoji was added to a task description and is not yet parsed properly by Python. In these cases, we'd want `tasklib` not to crash but rather to ignore the offending characters and keep parsing the results of the command.
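
To make the requested behavior concrete, here is a minimal sketch of tolerant decoding using only Python's built-in codec error handlers. The helper name `decode_task_output` and the sample output are hypothetical and are not taken from tasklib or from this PR's diff; the sketch only assumes that the command output arrives as raw bytes.

```python
from typing import List


def decode_task_output(raw: bytes, errors: str = "replace") -> List[str]:
    """Decode raw command output line by line without raising on bad bytes.

    Hypothetical helper for illustration; not part of tasklib's API.
    """
    return [line.decode("utf-8", errors=errors) for line in raw.splitlines()]


# 0xff can never occur in well-formed UTF-8, so it stands in for corrupt data.
raw_output = (
    b'{"id":1,"description":"buy milk"}\n'
    b'{"id":2,"description":"bad \xff byte"}\n'
)

# Strict decoding (the default) raises on the first invalid byte.
try:
    raw_output.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "replace" substitutes U+FFFD for each invalid byte; "ignore" drops it silently.
replaced = decode_task_output(raw_output)
ignored = decode_task_output(raw_output, errors="ignore")
```

With `errors="ignore"` the bad byte disappears without a trace, while `errors="replace"` leaves a visible U+FFFD marker in the decoded text, which at least makes the data loss noticeable to the caller.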

smemsh commented Mar 13, 2024

I'm not entirely sure that ignoring the error is the way to go, because it means the records obtained from taskwarrior will be inconsistent with the data actually stored. In fact, it probably shouldn't be getting undecodable text at all... it doesn't support arbitrary binary data.

Here is a small reproducer which tries to store the character '🦀':

# add the rustlang crab emoji as task annotation from shell prompt:
task 1 annotate -- $'\U0001f980'

After this, tasklib crashes when reading the task database. I am not sure whether this is actually a bug in taskwarrior, as the character doesn't seem to get encoded correctly: after the above annotation, `task 1 edit` shows the annotation as two 16-bit code units, 0xd83e and 0xdd80 (the UTF-16 surrogate pair for this character), rather than the single code point 0x1f980. However, if the character is inserted manually with `task 1 edit` and saved, both taskwarrior and tasklib seem to handle it just fine.
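
The relationship between 0x1f980 and the pair 0xd83e/0xdd80 can be checked with a short Python sketch. It also shows why such a pair would break a strict UTF-8 reader, under the assumption (not confirmed anywhere in this thread) that each surrogate ends up written out as if it were a separate character. The function names are made up for illustration.

```python
from typing import Tuple


def utf16_surrogates(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into its UTF-16 high and low surrogates."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly with the strict UTF-8 codec."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


high, low = utf16_surrogates(0x1F980)

# Assumption: each surrogate is serialized individually. "surrogatepass"
# lets Python emit those bytes so the result can be inspected.
as_surrogates = (chr(high) + chr(low)).encode("utf-8", errors="surrogatepass")

# The well-formed UTF-8 encoding of the crab emoji, for comparison.
well_formed = chr(0x1F980).encode("utf-8")

print(hex(high), hex(low))           # 0xd83e 0xdd80
print(is_valid_utf8(well_formed))    # True
print(is_valid_utf8(as_surrogates))  # False
```

Python's strict UTF-8 decoder rejects encoded surrogates, so bytes of this shape would raise `UnicodeDecodeError` in any reader that does not pass an `errors` argument.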

@bergercookie did you get the character into the database specifically by using `task annotate`?


smemsh commented Mar 14, 2024

Note that the latest development version of taskwarrior seems to store this fine; see GothenburgBitFactory/taskwarrior#3286.

@bergercookie
Contributor Author

> @bergercookie did you get the character into the database specifically by using `task annotate`?

This PR is from about 2 years ago, so I honestly have no idea how I reproduced this 😅


smemsh commented Jun 6, 2024

I think the issue should be closed, because the problem doesn't exist anymore.
