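📙 Chinese Xinhua Dictionary database. Includes xiehouyu (two-part allegorical sayings), chengyu (idioms), words, and Chinese characters.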
Updated Dec 26, 2023 - Python
Large Scale Chinese Corpus for NLP
Language Technology Platform
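Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation Extraction with biGRU+2ATT
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese dialogue large language model)
Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, and word importance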
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
AirLLM: 70B inference with a single 4GB GPU
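Fengshenbang-LM (封神榜大模型) is an open-source large-model ecosystem led by the Cognitive Computing and Natural Language Research Center of the IDEA Research Institute, intended to serve as infrastructure for Chinese AIGC and cognitive intelligence.
Datasets and SOTA results for every field of Chinese NLP
MNBVC (Massive Never-ending BT Vast Chinese corpus) is an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even Martian script (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripted dialogue, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction fine-tuning datasets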
Jcseg is a lightweight NLP framework developed in Java. It provides CJK and English segmentation based on the MMSEG algorithm, plus keyword extraction, key sentence extraction, and summary extraction based on the TextRank algorithm. Jcseg has a built-in HTTP server and search modules for Lucene, Solr, Elasticsearch, and OpenSearch.
🍀 Another Chinese chatbot implemented in PyTorch, a sub-module of an intelligent work order processing robot. 👩‍🔧
WeChat Official Accounts corpus