add complex Chinese to simplified Chinese for voice in group chat #1083

Open
wants to merge 2 commits into
base: master

Conversation

Zhaoyi-Yan
Contributor

No description provided.

@Zhaoyi-Yan
Contributor Author

fix #1083

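I also experimented with fuzzy matching. For now it just uses jieba to segment the text, converts the first word to pinyin, and then applies fuzzy-sound matching similar to the Sogou input method. I'm not sure whether this approach is appropriate.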
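To make the fuzzy-sound idea concrete, here is a minimal sketch of normalizing toneless pinyin syllables before comparing them. The pairs below (zh/z, ch/c, sh/s, ang/an, eng/en, ing/in) are an assumed subset of typical input-method fuzzy sounds, not the pairs used in this PR, and the names `normalize_syllable` and `fuzzy_equal` are hypothetical.

```python
# Assumed fuzzy-sound pairs (hypothetical; the PR's actual pairs are not shown here).
FUZZY_INITIALS = (("zh", "z"), ("ch", "c"), ("sh", "s"))
FUZZY_FINALS = (("ang", "an"), ("eng", "en"), ("ing", "in"))


def normalize_syllable(syllable: str) -> str:
    """Collapse commonly confused initials and finals in one toneless syllable."""
    # Map a retroflex initial (zh/ch/sh) to its flat counterpart (z/c/s).
    for retroflex, flat in FUZZY_INITIALS:
        if syllable.startswith(retroflex):
            syllable = flat + syllable[len(retroflex):]
            break
    # Map a back-nasal final (ang/eng/ing) to its front-nasal counterpart.
    for back_nasal, front_nasal in FUZZY_FINALS:
        if syllable.endswith(back_nasal):
            syllable = syllable[: -len(back_nasal)] + front_nasal
            break
    return syllable


def fuzzy_equal(a: list[str], b: list[str]) -> bool:
    """Compare two toneless pinyin sequences after fuzzy normalization."""
    return [normalize_syllable(s) for s in a] == [normalize_syllable(s) for s in b]


print(fuzzy_equal(["shi", "zhang"], ["si", "zan"]))  # True
```

Normalizing both sides to one canonical form keeps the comparison a plain equality check, so no pairwise rule lookup is needed at match time.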

@lanvent
Collaborator

lanvent commented May 12, 2023

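Thanks for the PR; I'll build on top of it later. Word segmentation isn't really needed. The approach I have in mind: split the sentence on spaces, take perhaps the first x words of the first part (say 20), convert them to pinyin with tones using pypinyin (a fuzzier match could skip the tones), and match on that.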
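A rough sketch of that idea, under several assumptions: "first x words" is taken as the first x characters, the match is a prefix comparison of pinyin syllables, and the names `starts_with_trigger` and `default_to_pinyin` are hypothetical. The converter is passed in as a parameter so the matching logic can be tried without pypinyin installed; the default uses pypinyin's `lazy_pinyin` with `Style.TONE3` (tone digits) or `Style.NORMAL` (no tones).

```python
from typing import Callable


def default_to_pinyin(text: str, keep_tone: bool) -> list[str]:
    """Convert text to pinyin syllables with pypinyin (third-party package)."""
    # Imported lazily so the matching logic below works with any converter.
    from pypinyin import Style, lazy_pinyin

    style = Style.TONE3 if keep_tone else Style.NORMAL
    return lazy_pinyin(text, style=style)


def starts_with_trigger(
    sentence: str,
    trigger: str,
    max_chars: int = 20,  # the "x" from the comment above; 20 is the suggested value
    keep_tone: bool = True,
    to_pinyin: Callable[[str, bool], list[str]] = default_to_pinyin,
) -> bool:
    """Return True if the first space-separated part starts with the trigger's pinyin."""
    parts = sentence.split()
    if not parts:
        return False
    # Only the first max_chars characters of the first part are considered.
    head_py = to_pinyin(parts[0][:max_chars], keep_tone)
    trigger_py = to_pinyin(trigger, keep_tone)
    return bool(trigger_py) and head_py[: len(trigger_py)] == trigger_py
```

Passing `keep_tone=False` gives the fuzzier variant, where homophones that differ only in tone also match.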

@Zhaoyi-Yan
Contributor Author

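Actually, I don't think tones need to be matched. After all, when people send voice messages in a group, collisions between words with similar tones are rare. For example, there are only a few trigger words, so there are also very few Chinese words whose pinyin differs from them only in tone. Converting directly to pinyin and matching on that is enough. The problem is cases where people accidentally add filler sounds when sending a voice message, such as "uh, robot, xxx" or "uh uh um robot, xxx". Either way, suppose we keep the first and second parts after splitting the sentence; if either contains a trigger word, we treat it as a match.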

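This is because voice messages in a group should be fairly rare, and there are only a few trigger words, so even an occasional false trigger is not a problem. So this approach seems better.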
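The proposal above could look roughly like the following sketch. The separator set, the contiguous-run matching rule, and the names `voice_triggered`, `contains_run`, and `default_to_pinyin` are all assumptions for illustration; only pypinyin's `lazy_pinyin` (which returns toneless syllables by default) is a real library call.

```python
import re
from typing import Callable, Iterable

# Assumed separators: whitespace plus common ASCII and full-width punctuation.
SEPARATORS = re.compile(r"[\s,，。、!！?？]+")


def default_to_pinyin(text: str) -> list[str]:
    """Toneless pinyin via pypinyin (third-party package), imported lazily."""
    from pypinyin import lazy_pinyin

    return lazy_pinyin(text)


def contains_run(haystack: list[str], needle: list[str]) -> bool:
    """Return True if needle occurs as a contiguous run inside haystack."""
    n = len(needle)
    return n > 0 and any(
        haystack[i : i + n] == needle for i in range(len(haystack) - n + 1)
    )


def voice_triggered(
    sentence: str,
    triggers: Iterable[str],
    max_parts: int = 2,  # keep only the first and second parts
    to_pinyin: Callable[[str], list[str]] = default_to_pinyin,
) -> bool:
    """Check the first max_parts parts of the sentence for any trigger, by pinyin."""
    trigger_pinyin = [to_pinyin(t) for t in triggers]
    parts = [p for p in SEPARATORS.split(sentence) if p][:max_parts]
    for part in parts:
        part_pinyin = to_pinyin(part)
        if any(contains_run(part_pinyin, t) for t in trigger_pinyin):
            return True
    return False
```

Searching for the trigger anywhere inside a part, rather than only at its start, is what lets leading filler sounds in the same part pass through; the cost is a somewhat higher chance of false triggers, which the comment above argues is acceptable.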


2 participants