[Bug]: Wrong display of special characters such as á,ó when streaming response #13157

Open
r0llingclouds opened this issue Apr 29, 2024 · 2 comments
Labels: bug (Something isn't working), triage (Issue needs to be triaged/prioritized)

Comments

@r0llingclouds

Bug Description

When streaming is enabled in a query engine, the displayed response shows wrong characters for anything outside the standard English character set.

For example, "í" and "ó" are each displayed as a garbled sequence of characters instead of the intended letter.

This only happens when streaming=True in the index.as_query_engine() method.

Version

0.10.33

Steps to Reproduce

Create an index and turn it into a query engine via index.as_query_engine(), setting streaming=True.

Then execute a query via response = query_engine.query()

When running response.print_response_stream(), special characters are not displayed properly.

Relevant Logs/Tracebacks

No response

r0llingclouds added the bug and triage labels on Apr 29, 2024

dosubot bot commented Apr 29, 2024

The garbled display of special characters such as "í" when using index.as_query_engine(streaming=True) is likely due to a character encoding mismatch. This often happens when output that is encoded as UTF-8 is interpreted using a different encoding.
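The kind of mismatch described above can be reproduced in isolation, without any query engine. The sketch below is only an illustration, assuming the garbling comes from UTF-8 bytes being decoded as Latin-1 (a common cause, but not confirmed for this issue). The helper names `garble` and `ungarble` are hypothetical and are not part of any library.

```python
def garble(text: str, wrong_encoding: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_encoding)


def ungarble(text: str, wrong_encoding: str = "latin-1") -> str:
    """Reverse garble(): recover the original bytes and decode them as UTF-8."""
    return text.encode(wrong_encoding).decode("utf-8")


garbled = garble("\u00f3")        # "\u00f3" is the letter o with an acute accent
print(len(garbled))               # the single letter has become two characters
print(ascii(garbled))             # escaped form, safe to print on any console
print(ascii(ungarble(garbled)))   # the round trip restores the original letter
```

If the garbled text seen in the stream survives the `ungarble` round trip, that points to a decode step using the wrong codec somewhere between the model response and the display, rather than to the terminal font or the data itself.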

To address this, ensure that the environment where you're displaying or processing the text is set to use UTF-8 encoding. For example, in Python, you can set the standard output to handle UTF-8 characters like this:

import sys
import io

# Ensure stdout can handle UTF-8 characters
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')

This code snippet explicitly sets the standard output (stdout) to use UTF-8 encoding, which should help in correctly displaying special characters. If you're writing to a file or another stream, make sure it's opened or configured to use UTF-8 encoding as well.

Keep in mind that the response.print_response_stream() method used for printing the streaming response does not handle character encoding explicitly. It relies on the default behavior of the Python print function, which uses the encoding set for sys.stdout. Therefore, adjusting the encoding of sys.stdout as shown above is a crucial step in ensuring special characters are displayed correctly.

If you continue to face issues or have any further questions, feel free to ask for more assistance.


@r0llingclouds
Author

r0llingclouds commented Apr 29, 2024

@dosu That did not work. It actually raised an error:
AttributeError: 'OutStream' object has no attribute 'buffer'
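The AttributeError suggests that sys.stdout in this environment has been replaced by a stream object that does not expose a `buffer` attribute, so it cannot be rewrapped with io.TextIOWrapper. Notebook kernels commonly install such a stream, but that is an assumption about the reporter's setup. A more defensive variant of the earlier snippet, sketched below with hypothetical helper names (`utf8_strategy`, `ensure_utf8_stdout`), checks what the stream supports before touching it.

```python
import io
import sys


def utf8_strategy(stream) -> str:
    """Name the safest way to make `stream` emit UTF-8 (hypothetical helper)."""
    encoding = (getattr(stream, "encoding", None) or "").lower().replace("-", "")
    if encoding == "utf8":
        return "already-utf8"      # nothing to change
    if hasattr(stream, "reconfigure"):
        return "reconfigure"       # io.TextIOWrapper, Python 3.7+
    if hasattr(stream, "buffer"):
        return "rewrap"            # text stream over a byte buffer
    return "unsupported"           # e.g. a stream with no `buffer` attribute


def ensure_utf8_stdout() -> str:
    """Apply the chosen strategy to sys.stdout and report which one was used."""
    strategy = utf8_strategy(sys.stdout)
    if strategy == "reconfigure":
        sys.stdout.reconfigure(encoding="utf-8")
    elif strategy == "rewrap":
        sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding="utf-8")
    return strategy
```

Calling `ensure_utf8_stdout()` before `response.print_response_stream()` avoids the AttributeError. If it returns "already-utf8" or "unsupported", changing stdout is unlikely to help, and the garbling probably happens earlier, before the text reaches print.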
