Error when using a local variable in Python 3.7 #101

Open
robin-kkk opened this issue Jun 1, 2020 · 2 comments

Comments

@robin-kkk

Python 3.7.4 (default, May 29 2020, 10:08:52)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from soynlp.utils import DoublespaceLineCorpus
>>> from soynlp.noun import LRNounExtractor_v2
>>> corpus_path ='test_news'
>>> sents = DoublespaceLineCorpus(corpus_path, iter_sent=True)
>>> noun_extractor = LRNounExtractor_v2(verbose=True)
[Noun Extractor] use default predictors
[Noun Extractor] num features: pos=3929, neg=2321, common=107
>>> nouns = noun_extractor.train_extract(sents)
[Noun Extractor] counting eojeols
'utf-8' codec can't decode byte 0xbc in position 0: invalid start byte
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/user/.pyenv/versions/venv/lib/python3.7/site-packages/soynlp/noun/_noun_ver2.py", line 143, in train_extract
    self.train(inputs, min_eojeol_frequency)
  File "/Users/user/.pyenv/versions/venv/lib/python3.7/site-packages/soynlp/noun/_noun_ver2.py", line 153, in train
    self._train_with_sentences(inputs, min_eojeol_frequency)
  File "/Users/user/.pyenv/versions/venv/lib/python3.7/site-packages/soynlp/noun/_noun_ver2.py", line 170, in _train_with_sentences
    preprocess = preprocess
  File "/Users/user/.pyenv/versions/venv/lib/python3.7/site-packages/soynlp/utils/utils.py", line 183, in __init__
    self._counter = self._counting_from_sents(sents)
  File "/Users/user/.pyenv/versions/venv/lib/python3.7/site-packages/soynlp/utils/utils.py", line 223, in _counting_from_sents
    len(_counter), i_sent + 1, '%.3f'%get_process_memory(), ' '*20), flush=True)
UnboundLocalError: local variable 'i_sent' referenced before assignment

This seems to happen because the `i_sent` variable used in the for loop is referenced outside the loop.
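A minimal sketch of that failure mode, independent of soynlp: a loop variable is only bound once the loop body runs at least once, so reading it after a loop over an empty iterable raises `UnboundLocalError`. The function names here are hypothetical and are not taken from `soynlp/utils/utils.py`.

```python
def last_index_plus_one(sents):
    # Hypothetical stand-in for a counting loop that reports progress afterwards.
    for i_sent, sent in enumerate(sents):
        pass
    # i_sent is unbound here if `sents` yielded nothing.
    return i_sent + 1


def try_count(sents):
    # Returns the count, or the name of the exception raised while counting.
    try:
        return last_index_plus_one(sents)
    except UnboundLocalError as err:
        return type(err).__name__


print(try_count(["first sentence", "second sentence"]))
print(try_count([]))
```

The second call takes the error path because the loop body never executes.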

@lovit
Owner

lovit commented Jun 1, 2020

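@ke2ek Thanks for the issue. I will fix that part. In addition, I see the line below in the message you posted. The text file passed as `corpus_path` is assumed to be UTF-8 encoded. It looks like the decode error kept the for loop from running at all, so `i_sent` was never assigned. If you convert the text file to the expected encoding, that part should work right away.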

'utf-8' codec can't decode byte 0xbc in position 0: invalid start byte
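One way to check and convert the corpus file is sketched below. The source encoding is an assumption: `cp949` is common for Korean text, but the thread does not say what encoding `test_news` actually uses. Both helper names are hypothetical.

```python
def is_valid_utf8(path):
    # True if the whole file decodes as UTF-8.
    try:
        with open(path, encoding="utf-8") as f:
            for _ in f:
                pass
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(src_path, dst_path, src_encoding="cp949"):
    # src_encoding is an assumption; replace it with the file's real encoding.
    # newline="" keeps the original line endings untouched.
    with open(src_path, encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

After converting, pointing `corpus_path` at the new file should let the UTF-8 decoding succeed.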

Even so, as @ke2ek pointed out, I will fix the part where the failure surfaces as a local variable assignment error.
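A possible shape for such a fix, as a sketch only and not the actual soynlp patch: bind the counter before the loop so it is defined even when the loop body never runs. `count_eojeols` is a hypothetical name, and splitting on whitespace stands in for whatever tokenization the library really does.

```python
from collections import Counter


def count_eojeols(sents):
    counter = Counter()
    n_sents = 0  # defined up front, so an empty corpus is reported as 0
    for n_sents, sent in enumerate(sents, start=1):
        counter.update(sent.split())
    return counter, n_sents


counter, n_sents = count_eojeols([])
print(len(counter), n_sents)
```

With this shape an unreadable or empty corpus yields an empty result instead of masking the original decode error behind an `UnboundLocalError`.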

@soheekim911

Has the local variable assignment error also been fixed in the soynlp pip package? I am getting the same error.
