Weird EOT character #48

Open
leokim-l opened this issue Sep 18, 2024 · 1 comment

The YAML output by ontogpt contained a \x04 character (U+0004), which is the End-of-Transmission (EOT) control character; see
https://en.wikipedia.org/wiki/End-of-Transmission_character

The following fixes it:

return list(yaml.safe_load_all(raw_result.read().replace('\x04', '')))  # Strip EOT (U+0004), then load all YAML documents into a list
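
The one-line fix removes only U+0004. If other stray control characters turn up, a slightly broader filter may be useful. The following is a minimal sketch, not ontogpt's code: the name `strip_control_chars` is hypothetical, and the set of characters removed (C0 controls other than tab, LF, and CR, plus DEL) is an assumption about what is safe to drop before YAML parsing.

```python
import re

# C0 control characters other than tab (\x09), LF (\x0a) and CR (\x0d), plus DEL (\x7f).
# This range includes EOT (\x04).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_chars(text: str) -> str:
    """Return `text` with stray control characters removed (hypothetical helper)."""
    return _CONTROL_CHARS.sub("", text)


# Example: the EOT character is dropped, the newline is kept.
cleaned = strip_control_chars("label: apnea\x04\n")
```

The cleaned string could then be passed to yaml.safe_load_all in place of the `.replace('\x04', '')` call.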

It could have been produced by the following:

try:
    with open(yaml_file, 'r') as file2concat:
        with open(old_yaml_file, 'a') as original_file:
            # Append the contents of the new YAML file to the existing one
            shutil.copyfileobj(file2concat, original_file)
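
To check whether the character is already present in each input file before concatenation, or only shows up afterwards, something like the following could help. It is a rough diagnostic sketch; `find_char` is a hypothetical helper and not part of this project or of ontogpt.

```python
def find_char(text: str, char: str = "\x04") -> list[tuple[int, int]]:
    """Return 1-based (line, column) positions of `char` in `text`."""
    positions = []
    # Split on "\n" only, so that other control characters stay inside their line.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == char:
                positions.append((line_no, col))
    return positions


# Example: EOT sits at the end of the second line.
hits = find_char("a: 1\nb: 2\x04\n")
```

Running it on the contents of both files (for example `find_char(open(yaml_file).read())`) before the copy would show whether the EOT comes from the raw output or from a later step.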

This issue is not pressing and is more of a reminder to me to eventually look into it, but @caufieldjh let me know if you are interested.

caufieldjh commented Sep 18, 2024

If that character is in the raw output (that is, it's coming directly from the LLM) then ontogpt should catch it (but it doesn't right now AFAIK).
Opened monarch-initiative/ontogpt#454
