Feature to handle individual digits in parse_number #42

Open
arnavkapoor opened this issue Aug 26, 2020 · 2 comments
Labels
enhancement New feature or request

Comments

@arnavkapoor
Collaborator

One use case I can think of is parsing phone numbers or zip codes. These might be written with each digit spelled out, for example "two three zero two five eight". Using parse would return the space-separated string 2 3 0 2 5 8, while parse_number would give None. Neither gives the wanted output, the number 230258. (Of course, the user can do some additional processing on the parse output, which would work, but having this feature in the library itself might be better.)

We could add a parameter to parse_number, say relaxed, which when set to True would build these digits up into one large number.
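To make the idea concrete, here is a minimal standalone sketch of what the relaxed behaviour could look like. The names DIGIT_WORDS and join_spelled_digits are hypothetical and not part of the library; the sketch only handles English digit words and is not how parse_number is implemented.

```python
# Hypothetical lookup table (an assumption for illustration, English only).
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def join_spelled_digits(text):
    """Return the integer formed by a run of spelled-out digits, or None.

    None is returned if the text is empty or contains any token that is
    not a single digit word.
    """
    tokens = text.lower().split()
    if not tokens:
        return None
    digits = []
    for token in tokens:
        digit = DIGIT_WORDS.get(token)
        if digit is None:
            return None
        digits.append(digit)
    return int("".join(digits))


print(join_spelled_digits("two three zero two five eight"))
```

One thing to keep in mind: returning an int drops leading zeros, so a zip code such as "zero two one three four" would come back as 2134. Returning the joined string instead might suit phone numbers and zip codes better.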

@noviluni
Contributor

It sounds interesting.

We could also use a parameter such as join_delimiter to choose how consecutive numbers are joined (numbers count as consecutive when only spaces, commas, etc. separate them).

Examples:

>>> parse('I have three numbers: one, two, three', join_delimiter='-')
'I have 3 numbers: 1-2-3'

>>> parse('I have three numbers: one, two, three', join_delimiter='')
'I have 3 numbers: 123'

>>> parse('I have three numbers: one, two, three', join_delimiter='/')
'I have 3 numbers: 1/2/3'


>>> parse('two three zero two five eight', join_delimiter='.')
'2.3.0.2.5.8'
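The joining behaviour in these examples can be approximated as a post-processing step on a string that already contains digits. This is only a sketch under that assumption: join_adjacent_numbers is a hypothetical helper, not an existing function of the library, and it treats numbers as adjacent when only spaces or commas separate them.

```python
import re

# A run is two or more numbers separated only by spaces and/or commas.
_NUMBER_RUN = re.compile(r"\d+(?:[ ,]+\d+)+")


def join_adjacent_numbers(text, join_delimiter=" "):
    """Rejoin each run of adjacent numbers in text with join_delimiter."""
    def _rejoin(match):
        # Pull the individual numbers out of the run and glue them back together.
        return join_delimiter.join(re.findall(r"\d+", match.group(0)))

    return _NUMBER_RUN.sub(_rejoin, text)


print(join_adjacent_numbers("I have 3 numbers: 1, 2, 3", join_delimiter="-"))
print(join_adjacent_numbers("2 3 0 2 5 8", join_delimiter="."))
```

The standalone 3 in "3 numbers" is left alone because it is followed by a word rather than another number.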

@NEERAJAP2001

@noviluni sir, this is what I am thinking. What do you suggest?

From this code

        myvalue = _build_number(tokens_taken, lang_data)
        for each_number in myvalue:
            current_sentence.append(each_number)
            current_sentence.append(" ") 

To this code

    if tokens_taken:
        myvalue = _build_number(tokens_taken, lang_data)
        # Use the caller's delimiter when one is given, otherwise a single space.
        # Checking against None lets an empty delimiter ('') join the numbers directly.
        separator = delimiter if delimiter is not None else " "
        for each_number in myvalue:
            current_sentence.append(each_number)
            current_sentence.append(separator)

Here, I have added a new parameter, delimiter, to the parse function.
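One detail worth checking with this approach is that appending the delimiter after every number leaves a trailing delimiter at the end of the run. A possible variation, sketched below as a standalone function, puts the separator only between numbers. The name append_numbers is hypothetical, and the numbers argument stands in for whatever _build_number returns, which is assumed here to be a list of number strings.

```python
def append_numbers(current_sentence, numbers, delimiter=None):
    """Append numbers to current_sentence with a separator between them.

    delimiter=None falls back to a single space; an empty string joins
    the numbers with nothing in between.
    """
    separator = " " if delimiter is None else delimiter
    for index, each_number in enumerate(numbers):
        if index > 0:
            # Separator goes between numbers only, never after the last one.
            current_sentence.append(separator)
        current_sentence.append(str(each_number))
    return current_sentence


print("".join(append_numbers([], ["2", "3", "0", "2", "5", "8"], delimiter=".")))
```

This differs from the snippet above, which also appends a separator after the last number, so the surrounding code in parse would need to account for the missing trailing space.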

Links for the code

if tokens_taken:

if compare_token in SENTENCE_SEPARATORS:

noviluni added the enhancement label on Dec 28, 2020
3 participants