If I want to recognize a picture, how can I do it? Do I still need to generate the LMDB format? Could you provide a predict interface? Thank you #15

Open
DYF-AI opened this issue Sep 7, 2020 · 8 comments

Comments

@DYF-AI

DYF-AI commented Sep 7, 2020

If I want to recognize a picture, how can I do it? Do I still need to generate the LMDB format? Could you provide a predict interface? Thank you

@DYF-AI
Author

DYF-AI commented Sep 7, 2020

Can you provide a simple demo? Thank you.

@kadirbeytorun

+1

1 similar comment
@Key-Lab-of-Intelligent-Robot-WIT

+1

@lc1314555

Thanks for your work!
Can you provide a demo for recognizing a single image?

@Agiroy4712

Agiroy4712 commented Nov 3, 2021

@DYF-AI @kadirbeytorun @lc1314555
1. demo.py can look like this:

```python
from __future__ import absolute_import
import sys
sys.path.append('./')

import argparse
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

import os.path as osp
import numpy as np
import math
import time

# new
from PIL import Image, ImageFile

import torch
from torch import nn, optim
from torch.backends import cudnn
from torch.utils.data import DataLoader, SubsetRandomSampler
# new
from torchvision import transforms

from config import get_args
from lib import datasets, evaluation_metrics, models
from lib.models.model_builder import ModelBuilder
from lib.datasets.dataset import LmdbDataset, AlignCollate, CustomDataset
from lib.datasets.concatdataset import ConcatDataset
from lib.loss import SequenceCrossEntropyLoss
from lib.trainers import Trainer
from lib.evaluators import Evaluator
from lib.utils.logging import Logger, TFLogger
from lib.utils.serialization import load_checkpoint, save_checkpoint
from lib.utils.osutils import make_symlink_if_not_exists

# new
from lib.evaluation_metrics.metrics import get_str_list
from lib.utils.labelmaps import get_vocabulary, labels2strs

global_args = get_args(sys.argv[1:])


def image_process(image_path, imgH=32, imgW=100, keep_ratio=False, min_ratio=1):
    """Load an image, resize it, and normalize it to [-1, 1]."""
    img = Image.open(image_path).convert('RGB')

    if keep_ratio:
        # Derive the width from the aspect ratio, but never go below imgH * min_ratio.
        w, h = img.size
        ratio = w / float(h)
        imgW = int(np.floor(ratio * imgH))
        imgW = max(imgH * min_ratio, imgW)

    img = img.resize((imgW, imgH), Image.BILINEAR)
    img = transforms.ToTensor()(img)
    img.sub_(0.5).div_(0.5)

    return img


# new
class DataInfo(object):
    """
    Save the info about the dataset.
    This is a code snippet from dataset.py
    """
    def __init__(self, voc_type):
        super(DataInfo, self).__init__()
        self.voc_type = voc_type

        assert voc_type in ['LOWERCASE', 'ALLCASES', 'ALLCASES_SYMBOLS']
        self.EOS = 'EOS'
        self.PADDING = 'PADDING'
        self.UNKNOWN = 'UNKNOWN'
        self.voc = get_vocabulary(voc_type, EOS=self.EOS, PADDING=self.PADDING, UNKNOWN=self.UNKNOWN)
        self.char2id = dict(zip(self.voc, range(len(self.voc))))
        self.id2char = dict(zip(range(len(self.voc)), self.voc))

        self.rec_num_classes = len(self.voc)


def main(args):
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)
    torch.cuda.manual_seed(args.seed)
    torch.cuda.manual_seed_all(args.seed)
    cudnn.benchmark = True
    torch.backends.cudnn.deterministic = True

    args.cuda = args.cuda and torch.cuda.is_available()
    if args.cuda:
        print('using cuda.')
        torch.set_default_tensor_type('torch.cuda.FloatTensor')
    else:
        torch.set_default_tensor_type('torch.FloatTensor')

    # Create data loaders
    if args.height is None or args.width is None:
        args.height, args.width = (32, 100)

    dataset_info = DataInfo(args.voc_type)

    # Create model
    model = ModelBuilder(arch=args.arch, rec_num_classes=dataset_info.rec_num_classes,
                         sDim=args.decoder_sdim, attDim=args.attDim, max_len_labels=args.max_len,
                         eos=dataset_info.char2id[dataset_info.EOS], STN_ON=args.STN_ON)

    # Load from checkpoint
    if args.resume:
        checkpoint = load_checkpoint(args.resume)
        model.load_state_dict(checkpoint['state_dict'])

    # Define the device in both cases so that img.to(device) also works on CPU.
    device = torch.device("cuda" if args.cuda else "cpu")
    if args.cuda:
        model = model.to(device)
        model = nn.DataParallel(model)

    # Evaluator
    model.eval()
    # args.image_path is expected to be an option defined in config.py.
    img = image_process(args.image_path)
    with torch.no_grad():
        img = img.to(device)
        input_dict = {}
        input_dict['images'] = img.unsqueeze(0)
        # TODO: testing should be more clean.
        # to be compatible with the lmdb-based testing, need to construct some meaningless variables.
        rec_targets = torch.IntTensor(1, args.max_len).fill_(1)
        rec_targets[:, args.max_len - 1] = dataset_info.char2id[dataset_info.EOS]
        input_dict['rec_targets'] = rec_targets
        input_dict['rec_lengths'] = [args.max_len]
        output_dict = model(input_dict)
        pred_rec = output_dict['output']['pred_rec']
        pred_str, _ = get_str_list(pred_rec, input_dict['rec_targets'], dataset=dataset_info)
        print('Recognition result: {0}'.format(pred_str[0]))


if __name__ == '__main__':
    # parse the config
    # os.environ['CUDA_VISIBLE_DEVICES'] = '8'
    # torch.backends.cudnn.enabled = False
    args = get_args(sys.argv[1:])
    main(args)
```

2. You also need to debug the "loss_embed" part of models/model_builder.py.
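
To make the preprocessing in `image_process` easier to follow, here is a small standalone sketch of its two numeric steps: the width computed when `keep_ratio=True`, and the `sub_(0.5).div_(0.5)` normalization that maps `ToTensor` output from [0, 1] to [-1, 1]. The helper names `target_width` and `normalize` are made up for illustration and are not part of the repository. Note that the demo calls `image_process(args.image_path)` with the defaults, so the image is resized to 100x32 without keeping the aspect ratio.

```python
import math


def target_width(w, h, img_h=32, min_ratio=1):
    """Width used when keep_ratio=True: scale to height img_h, keep the aspect ratio."""
    ratio = w / float(h)
    img_w = int(math.floor(ratio * img_h))
    # Very narrow images are widened to at least img_h * min_ratio.
    return max(img_h * min_ratio, img_w)


def normalize(pixel):
    """Map one pixel value from [0, 1] to [-1, 1], like img.sub_(0.5).div_(0.5)."""
    return (pixel - 0.5) / 0.5


print(target_width(200, 50))            # a 200x50 image becomes 128 wide at height 32
print(target_width(10, 50))             # a 10x50 image is clamped to width 32
print(normalize(0.0), normalize(1.0))   # -1.0 1.0
```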

@lc1314555

lc1314555 commented Nov 4, 2021 via email

@Agiroy4712

/SEED/lib/models/model_builder.py", line 73, in forward
input_dict['rec_embeds']
KeyError: 'rec_embeds'
You just need to comment out the code related to 'rec_embeds', such as lines 100, 102, 107, and 109 of model_builder.py.
After I commented out these lines, my code ran fine.
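
An alternative to commenting the lines out might be to guard the lookup, so the same forward pass works with or without target embeddings. This is only a minimal sketch, assuming the embedding loss is computed from `input_dict['rec_embeds']` inside `forward`. The function and argument names below (`embedding_loss_or_none`, `predicted_embedding`, `loss_fn`, `squared_error`) are hypothetical and are not taken from model_builder.py.

```python
def embedding_loss_or_none(input_dict, predicted_embedding, loss_fn):
    """Compute the embedding loss only when a target embedding is supplied."""
    # .get() returns None instead of raising KeyError when the key is absent,
    # which is the single-image inference case from the traceback above.
    target = input_dict.get('rec_embeds')
    if target is None:
        return None
    return loss_fn(predicted_embedding, target)


def squared_error(pred, target):
    """Stand-in loss for illustration only (hypothetical, not the repository's loss)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


# Inference: no 'rec_embeds' key, so the loss is skipped.
print(embedding_loss_or_none({'images': 'dummy'}, [0.5, 0.5], squared_error))

# Training: the key is present, so the loss is computed.
print(embedding_loss_or_none({'rec_embeds': [1.0, 0.0]}, [0.5, 0.5], squared_error))
```

With a guard like this, the caller would skip adding the embedding loss to the output whenever the function returns `None`.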

@lc1314555

lc1314555 commented Nov 4, 2021 via email
