Treating the marker of incomplete words as a word-internal character #13

Open
aarppe opened this issue Sep 16, 2022 · 2 comments
Contributor

aarppe commented Sep 16, 2022

... the |-sign [which is added to the end of incomplete words as an indication of such] is treated as a word-separator, rather than as a part of the word (though the error-model deletes all |-signs before the end of a string, so something should be done about that ....
Originally posted by @aarppe in #10 (comment)

Generally, I've thought that the marker of incomplete words could alternatively be the tilde (~), as it resembles a hyphen and is not generally expected to occur in words. Another option would be the hyphen (-), which already doubles as a separator of preverbs/prenouns and reduplicative elements, though that could confuse end-users. With the hyphen, one could either always add it at the end of incomplete words (e.g. ni- or nikî--), or append it to an incomplete word only when the correct version of the string would not itself end in one of the aforementioned prefixes (e.g. ni- or nikî-).
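
The two hyphen options can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and "ends in a prefix" is approximated here by "already ends in a hyphen", which is an assumption rather than anything the model currently does.

```python
def mark_always(incomplete: str) -> str:
    """Option 1: always append a hyphen to an incomplete word."""
    return incomplete + "-"


def mark_if_needed(incomplete: str) -> str:
    """Option 2: append a hyphen only if the string does not already end in one.

    Assumption: a string that ends in a preverb/prenoun prefix already
    carries that prefix's trailing hyphen.
    """
    return incomplete if incomplete.endswith("-") else incomplete + "-"


# The examples from the comment above.
print(mark_always("ni"), mark_always("nikî-"))        # ni- nikî--
print(mark_if_needed("ni"), mark_if_needed("nikî-"))  # ni- nikî-
```

Under option 2, a suggestion ending in a single hyphen is ambiguous between "ends in a prefix" and "incomplete", which is part of why the hyphen could be confusing.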

Whichever character we use, it should be treated as a word-internal character.
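
As a rough sketch of what "word-internal" would mean for tokenization (not the project's actual word breaker; `tokenize` and `INCOMPLETE_MARKER` are hypothetical names), the marker is simply added to the set of characters allowed inside a word:

```python
import re
import unicodedata

# Hypothetical choice of marker; the thread discusses "|", "~" and "-".
INCOMPLETE_MARKER = "~"


def tokenize(text: str, word_internal: str = "-" + INCOMPLETE_MARKER) -> list[str]:
    """Split text into words, keeping the given characters inside words."""
    # Normalize to NFC so that letters such as î are single code points,
    # which \w matches as word characters.
    text = unicodedata.normalize("NFC", text)
    pattern = r"[\w" + re.escape(word_internal) + r"]+"
    return re.findall(pattern, text)


# With the marker listed as word-internal, it stays attached to the word.
print(tokenize("nikî~ ni-"))                    # ['nikî~', 'ni-']
# If the marker is not listed, it acts as a separator and is dropped.
print(tokenize("nikî| ni", word_internal="-"))  # ['nikî', 'ni']
```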

Contributor Author

aarppe commented Sep 16, 2022

Furthermore, when a selected suggestion ends with the character marking an incomplete word, no space character should be appended after it, as the word still needs to be completed (again, this warrants a separate issue).
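
A minimal sketch of that behaviour, assuming the suggestion replaces the last whitespace-delimited word of the typed context (`accept_suggestion` and `INCOMPLETE_MARKER` are hypothetical names, not part of any existing API):

```python
# Hypothetical marker for incomplete words.
INCOMPLETE_MARKER = "~"


def accept_suggestion(context: str, suggestion: str) -> str:
    """Replace the word being typed with the suggestion.

    A trailing space is added only when the suggestion is a complete
    word, i.e. when it does not end with the incomplete-word marker.
    """
    # Everything up to and including the last space is kept as-is.
    head, sep, _ = context.rpartition(" ")
    trailing = "" if suggestion.endswith(INCOMPLETE_MARKER) else " "
    return head + sep + suggestion + trailing


print(repr(accept_suggestion("ni", "nikî~")))  # 'nikî~'  (no space, word unfinished)
print(repr(accept_suggestion("ni", "nikî")))   # 'nikî '  (space after a complete word)
```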

Contributor Author

aarppe commented Sep 17, 2022

Swapped the marker for incomplete words to the tilde (~) as a possible word-internal character, since it is similar to, but not the same as, the hyphen. Nevertheless, whichever incomplete-word marker we use would still need to be specified as a word-internal character.
