Commit

Fix multi-byte character handling in read_file (Significant-Gravita…
…s#3173)

Co-authored-by: Reinier van der Leer <[email protected]>
sidewaysthought and Pwuts authored May 1, 2023
1 parent 7fc6f2a commit a5f8563
Showing 2 changed files with 6 additions and 3 deletions.
8 changes: 5 additions & 3 deletions autogpt/commands/file_operations.py
@@ -6,6 +6,7 @@
 import os.path
 from typing import Dict, Generator, Literal, Tuple

+import charset_normalizer
 import requests
 from colorama import Back, Fore
 from requests.adapters import HTTPAdapter, Retry
@@ -153,9 +154,10 @@ def read_file(filename: str) -> str:
         str: The contents of the file
     """
     try:
-        with open(filename, "r", encoding="utf-8") as f:
-            content = f.read()
-        return content
+        charset_match = charset_normalizer.from_path(filename).best()
+        encoding = charset_match.encoding
+        logger.debug(f"Read file '{filename}' with encoding '{encoding}'")
+        return str(charset_match)
     except Exception as err:
         return f"Error: {err}"
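
One edge case in the hunk above: `best()` returns `None` when charset_normalizer finds no plausible encoding (for example, on binary data). The `.encoding` access then raises `AttributeError`, which the broad `except` reports as a confusing `NoneType` error. The sketch below is a hypothetical variant, not part of this commit. The name `read_file_safely`, the UTF-8 fallback with `errors="replace"`, and the optional import are all assumptions made for illustration.

```python
import logging
from pathlib import Path

try:
    import charset_normalizer  # third-party; this commit adds it to requirements.txt
except ImportError:
    # Assumption for this sketch only: fall back to plain UTF-8 if it is missing.
    charset_normalizer = None

logger = logging.getLogger(__name__)


def read_file_safely(filename: str) -> str:
    """Hypothetical variant of read_file that guards against a failed detection."""
    try:
        raw = Path(filename).read_bytes()
        match = charset_normalizer.from_bytes(raw).best() if charset_normalizer else None
        if match is None:
            # No encoding detected (or library unavailable): decode as UTF-8 and
            # replace undecodable bytes instead of raising.
            logger.debug("No encoding detected for %r; decoding as UTF-8", filename)
            return raw.decode("utf-8", errors="replace")
        logger.debug("Read file %r with encoding %r", filename, match.encoding)
        # str() on a CharsetMatch yields the payload decoded with the detected encoding.
        return str(match)
    except Exception as err:
        # Same error convention as the original read_file: return a message string.
        return f"Error: {err}"
```

Whether replacing undecodable bytes is preferable to returning an explicit error depends on how callers use the result. The commit itself keeps the simpler behaviour of returning `Error: ...` for any failure.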

1 change: 1 addition & 0 deletions requirements.txt
@@ -21,6 +21,7 @@ webdriver-manager
 jsonschema
 tweepy
 click
+charset-normalizer>=3.1.0
 spacy>=3.0.0,<4.0.0
 en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.5.0/en_core_web_sm-3.5.0-py3-none-any.whl
