Replace ocamllex with sedlex for unicode support #254

Open
ufl-taeber opened this issue Apr 23, 2020 · 3 comments
Labels
enhancement Improvements and feature requests

Comments

@ufl-taeber

Pyre fails to parse files with θ (and most likely other non-ASCII characters supported by CPython).

import math
from typing import Tuple


def cartesian(r: int, θ: int) -> Tuple[float, float]:
    rad = math.radians(θ)
    return r*math.cos(rad), r*math.sin(rad)


r = 12
θ = 195

x,y = cartesian(r, θ)
print(f"({r}, {θ}°) = ({x:.2f}, {y:.2f})")

$ python3 src/example.py
(12, 195°) = (-11.59, -3.11)

$ pyre --source-directory src check
Setting up a `.pyre_configuration` with `pyre init` may reduce overhead.
 ƛ Could not parse file at example.py:5:22-5:22
 ƛ Could not parse 1 file due to syntax errors!
 ƛ No type errors found

I'm using Python 3.7.7 installed via homebrew on macOS Catalina 10.15.4.

$ pyre --version
No watchman binary found. 
To enable pyre incremental, you can install watchman: https://facebook.github.io/watchman/docs/install.html
Defaulting to non-incremental check.
Binary version: 8a73fbdd6f7e74fa832ecbf5a9e2ebe20be46cd5
Client version: internal-dev
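
As a sanity check on the report above, the following minimal sketch asks CPython's own `ast` module to parse the function from `example.py` and prints where the `θ` parameter sits. It only shows that CPython accepts the identifier. Reading Pyre's `5:22` as a 0-based column is an assumption, not something the Pyre output states.

```python
import ast

# The first lines of example.py from the report, copied verbatim.
source = '''import math
from typing import Tuple


def cartesian(r: int, θ: int) -> Tuple[float, float]:
    rad = math.radians(θ)
    return r*math.cos(rad), r*math.sin(rad)
'''

# ast.parse raises SyntaxError if CPython cannot parse the source.
tree = ast.parse(source)

# body[0] and body[1] are the imports; body[2] is the function definition.
func = tree.body[2]
arg = func.args.args[1]

# lineno is 1-based; col_offset is a 0-based UTF-8 byte offset.
print(arg.arg, arg.lineno, arg.col_offset)  # θ 5 22
```

Everything before `θ` on that line is ASCII, so the byte offset and the character offset agree, and both match the position in the "Could not parse file" message.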
ufl-taeber changed the title from 'Bug - "Could not prase file" with theta' to 'Bug - "Could not parse file" with theta' on Apr 23, 2020
mrkmndz added the "enhancement" (Improvements and feature requests) label on Apr 23, 2020
@mrkmndz
Contributor

mrkmndz commented Apr 23, 2020

The background here is that our lexer library is not Unicode-compatible :/. Supporting this would therefore require reimplementing our lexer on top of a different library, which would be a significant amount of work.

We do plan on doing this at some point, but it will likely be a while.
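
To illustrate the kind of limitation described here, the sketch below applies an ASCII-only identifier rule to the UTF-8 bytes of a name. The rule `ASCII_IDENT` and the helper `lexes_as_identifier` are hypothetical stand-ins for a byte-oriented lexer rule; they are not Pyre's actual lexer definition.

```python
import re

# Hypothetical ASCII-only identifier rule (an assumption, not Pyre's rule):
# a byte-oriented lexer sees input as bytes, not as Unicode code points.
ASCII_IDENT = re.compile(rb"[A-Za-z_][A-Za-z0-9_]*")


def lexes_as_identifier(name: str) -> bool:
    """Return True if the UTF-8 bytes of `name` match the ASCII-only rule."""
    return ASCII_IDENT.fullmatch(name.encode("utf-8")) is not None


print(lexes_as_identifier("rad"))  # True
print(lexes_as_identifier("θ"))    # False
print("θ".encode("utf-8"))         # b'\xce\xb8'
print("θ".isidentifier())          # True
```

`θ` is a valid identifier for CPython, but in UTF-8 it is the two bytes `0xCE 0xB8`, neither of which falls in an ASCII letter class, so a rule of this shape rejects it.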

mrkmndz changed the title from 'Bug - "Could not parse file" with theta' to "Support Unicode" on Apr 23, 2020
mrkmndz pinned this issue on Apr 23, 2020
mrkmndz changed the title from "Support Unicode" to "Replace ocamllex with sedlex for unicode support" on Apr 23, 2020
@EricLiclair

Does this mean changing line 1 here?

Or will it require much more than just this?

@aryx

aryx commented Jan 17, 2022

Actually, you can hack ocamllex to recognize some Unicode characters.
See https://github.com/returntocorp/pfff/pull/502/files for examples.
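
One general way to make a byte-oriented lexer tolerate non-ASCII identifiers is to spell out UTF-8 multi-byte sequences as byte ranges. The sketch below shows that idea with Python byte regexes. It is not a reproduction of the linked PR, and the names `UTF8_MULTIBYTE` and `IDENT` are made up for illustration.

```python
import re

# One UTF-8 multi-byte sequence, described purely by byte ranges:
# a 2-, 3-, or 4-byte lead byte followed by continuation bytes 0x80-0xBF.
UTF8_MULTIBYTE = (
    rb"(?:[\xC2-\xDF][\x80-\xBF]"
    rb"|[\xE0-\xEF][\x80-\xBF]{2}"
    rb"|[\xF0-\xF4][\x80-\xBF]{3})"
)

# Identifier rule widened to accept any such sequence as a "letter".
IDENT_START = rb"(?:[A-Za-z_]|" + UTF8_MULTIBYTE + rb")"
IDENT_CONT = rb"(?:[A-Za-z0-9_]|" + UTF8_MULTIBYTE + rb")"
IDENT = re.compile(IDENT_START + IDENT_CONT + rb"*")

print(IDENT.fullmatch("θ".encode("utf-8")) is not None)  # True
print(IDENT.fullmatch(b"2rad") is not None)              # False

# The rule is deliberately loose: it also accepts "°", which CPython rejects.
print(IDENT.fullmatch("°".encode("utf-8")) is not None)  # True
print("°".isidentifier())                                # False
```

The trade-off is visible in the last two lines: a byte-range rule accepts every non-ASCII character, so it is more permissive than CPython's identifier rules. That may be acceptable for a type checker that only needs to get past the lexer, but it is not a full implementation of Unicode identifiers.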
