Chinese Handwritten Recognition

A Flask web app that recognizes 3,755 handwritten Chinese characters.

Description

The model is a convolutional neural network trained in PyTorch, starting from a ResNet50 pretrained on the ImageNet 1000-class dataset. Training runs in two stages:

  • Fixed feature extractor: All network weights are frozen except those of the final fully connected layer. This stage runs for only 2-3 epochs to avoid overfitting.

  • Fine-tuning: All network weights are trained with discriminative learning rates. Layers closer to the input get a low learning rate because they tend to learn general features, such as lines and edges. Later layers get a higher learning rate because they learn more detailed, task-specific features.
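The two stages above can be sketched in PyTorch as follows. This is a minimal illustration, not the code in train.py: `TinyBackbone` is a hypothetical stand-in for ResNet50 so the example stays light, and the helper names, learning rates, decay factor and optimizer choice are all assumptions.

```python
import torch
from torch import nn

NUM_CLASSES = 3755  # number of characters stated in this README


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for ResNet50: two feature stages plus an `fc` head."""

    def __init__(self, num_classes=1000):
        super().__init__()
        self.layer1 = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        self.layer2 = nn.Sequential(nn.Conv2d(8, 16, 3, padding=1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(16, num_classes)

    def forward(self, x):
        x = self.pool(self.layer2(self.layer1(x)))
        return self.fc(torch.flatten(x, 1))


def freeze_all_but_head(model, num_classes):
    """Stage 1: freeze every weight, then attach a fresh, trainable head."""
    for p in model.parameters():
        p.requires_grad = False
    # A newly created layer has requires_grad=True by default.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model


def discriminative_param_groups(model, base_lr=1e-3, decay=0.1):
    """Stage 2: unfreeze everything; stages nearer the input get smaller rates."""
    for p in model.parameters():
        p.requires_grad = True
    stages = [model.layer1, model.layer2, model.fc]  # input side first
    last = len(stages) - 1
    return [
        {"params": list(stage.parameters()), "lr": base_lr * decay ** (last - i)}
        for i, stage in enumerate(stages)
    ]


# Stage 1: only the new head is handed to the optimizer.
model = freeze_all_but_head(TinyBackbone(), NUM_CLASSES)
head_only = [p for p in model.parameters() if p.requires_grad]
stage1_opt = torch.optim.SGD(head_only, lr=1e-2, momentum=0.9)

# Stage 2: one parameter group per stage, each with its own learning rate.
groups = discriminative_param_groups(model)
stage2_opt = torch.optim.SGD(groups, lr=1e-3, momentum=0.9)
```

With a real `torchvision.models.resnet50`, the same idea applies: the head is also called `fc`, and the stage list would name its `layer1` to `layer4` blocks instead.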

          Top 1    Top 5
Accuracy  95.41%   99.27%
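For reference, top-1 and top-5 accuracy can be computed from a batch of model outputs roughly as below. This is a sketch under assumptions: `topk_accuracy` is a hypothetical helper, not a function from this repository, and the evaluation code used to produce the numbers above may differ.

```python
import torch


def topk_accuracy(logits, targets, ks=(1, 5)):
    """Fraction of samples whose true class is among the k highest scores."""
    _, pred = logits.topk(max(ks), dim=1)   # (N, max_k) class indices, best first
    hits = pred.eq(targets.unsqueeze(1))    # (N, max_k) True where the guess is right
    return {k: hits[:, :k].any(dim=1).float().mean().item() for k in ks}


# Two samples, six classes: the first is right at rank 1, the second at rank 3.
logits = torch.tensor([
    [0.1, 0.9, 0.0, 0.2, 0.3, 0.4],
    [0.6, 0.5, 0.4, 0.1, 0.0, 0.2],
])
targets = torch.tensor([1, 2])
acc = topk_accuracy(logits, targets)
```

Here `acc[1]` is 0.5 and `acc[5]` is 1.0, since the second sample is only correct once the top five guesses are allowed.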

Screenshots

demo

Prerequisites

  • Python <= 3.7.9
  • Anaconda (optional; used to create the environment, but you can use Python venv instead)
  • HSK 3 at least (Just kidding >.<)

Installation

  1. Clone the repository:
$ git clone https://github.com/yoogun143/Chinese-Handwritten-Recognition.git
$ cd app
  2. Install dependencies using Anaconda and pip:
$ conda create -n chinese-handwritten-app python=3.7.9  # Create a new environment
$ conda activate chinese-handwritten-app  # Activate the environment
$ conda install pip  # Install pip inside the environment
$ pip install -r requirements.txt  # Install the required dependencies
  3. Download the weights file resnet50-transfer-4-bestmodel.pth from here and place it in the train_model folder:
train_model
├─code_word.pkl
└─resnet50-transfer-4-bestmodel.pth
  4. Run the app:
$ python views_pytorch.py
 * Running on http://127.0.0.1:5000/

Voilà! The app is now running locally. Head over to http://127.0.0.1:5000/ and you should see your app roaring.
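If the app fails at startup, a missing model file is a likely cause. The check below is a small sketch, assuming it is run from the directory that contains train_model; `missing_model_files` is a hypothetical helper and not part of the repository.

```python
from pathlib import Path

# File names taken from the train_model tree shown in the installation steps.
REQUIRED = ("code_word.pkl", "resnet50-transfer-4-bestmodel.pth")


def missing_model_files(model_dir="train_model"):
    """Return the required file names that are not present in model_dir."""
    folder = Path(model_dir)
    return [name for name in REQUIRED if not (folder / name).is_file()]


missing = missing_model_files()
print("All model files found." if not missing else "Missing: " + ", ".join(missing))
```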

Training instructions

Download the HWDB1.1 training and test datasets from the links below:

http://www.nlpr.ia.ac.cn/databases/download/feature_data/HWDB1.1trn_gnt.zip

http://www.nlpr.ia.ac.cn/databases/download/feature_data/HWDB1.1tst_gnt.zip

Put the two zip files into the data folder as shown below and unzip them. The data folder tree:

data
├─test
│   └─HWDB1.1tst_gnt.zip
└─train
    └─HWDB1.1trn_gnt.zip
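The unzip step can also be done from Python with the standard library. This is a sketch assuming the layout shown above (one zip per subfolder of data); `extract_archives` is a hypothetical helper, and if an archive turns out to contain further nested archives, those need to be handled separately.

```python
import zipfile
from pathlib import Path


def extract_archives(data_dir="data"):
    """Extract every zip found one level below data_dir, next to the zip itself."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    extracted = []
    for archive in sorted(root.glob("*/*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(archive.parent)
            extracted.extend(archive.parent / name for name in zf.namelist())
    return extracted


if __name__ == "__main__":
    for path in extract_archives():
        print("extracted", path)
```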

Convert the gnt files to png:

$ python gnt2png.py
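To give an idea of what the conversion involves, here is a sketch of a reader for gnt files. It assumes the commonly documented record layout (a 4-byte little-endian sample size, a 2-byte GB-encoded character label, 2-byte width and height, then width × height grayscale bytes); `read_gnt` is a hypothetical helper, and gnt2png.py in this repository may be implemented differently.

```python
import io
import struct

# Assumed record header: sample size, GB tag code, width, height (10 bytes).
HEADER = struct.Struct("<I2sHH")


def read_gnt(stream):
    """Yield (character, width, height, pixels) for each record in a gnt stream."""
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            return  # end of file
        _sample_size, tag, width, height = HEADER.unpack(header)
        pixels = stream.read(width * height)
        if len(pixels) < width * height:
            raise ValueError("truncated gnt record")
        yield tag.decode("gbk"), width, height, pixels


# Build one synthetic 2x3 record in memory to exercise the reader.
label = "啊".encode("gb2312")
bitmap = bytes([0, 255, 128, 64, 32, 16])
record = HEADER.pack(HEADER.size + len(bitmap), label, 2, 3) + bitmap
samples = list(read_gnt(io.BytesIO(record)))
```

Each yielded bitmap is a row-major grayscale image, so it could then be written out as a png with an imaging library such as Pillow.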

Start training:

$ python train.py

Roadmap

  • Reduce web app latency, loading function
  • Handwritten keyboard
  • Train with more words

Credits

cnn_handwritten_chinese_recognition

tf28: 手写汉字识别 (handwritten Chinese character recognition)

drawingboard.js

License

MIT License

Copyright (c) [2021] [Thanh Hoang]