If I want to recognize a picture, how can I do it? Do I still need to generate the LMDB format? Can you provide a predict interface? Thank you #15
Comments
Can you provide a simple demo? Thank you.
+1
+1
Thanks for your work!
Thanks for your reply! I have found this demo from ASTER, but it doesn't work. Can you tell me how to debug the "loss_embed" part of models/model_builder.py?
------------------ Original message ------------------
From: "Pay20Y/SEED"
Date: Wednesday, November 3, 2021, 4:15 PM
Subject: Re: [Pay20Y/SEED] If I want to recognize a picture, how can I do it? Do I still need to generate the LMDB format? Can you provide a predict interface? Thank you (#15)
@DYF-AI @kadirbeytorun @lc1314555
1. The demo.py can be like this:

```python
from __future__ import absolute_import
import sys
sys.path.append('./')

import argparse
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"
import os.path as osp
import numpy as np
import math
import time

# new
from PIL import Image, ImageFile

import torch
from torch import nn, optim
from torch.backends import cudnn
from torch.utils.data import DataLoader, SubsetRandomSampler
# new
from torchvision import transforms

from config import get_args
from lib import datasets, evaluation_metrics, models
from lib.models.model_builder import ModelBuilder
from lib.datasets.dataset import LmdbDataset, AlignCollate, CustomDataset
from lib.datasets.concatdataset import ConcatDataset
from lib.loss import SequenceCrossEntropyLoss
from lib.trainers import Trainer
from lib.evaluators import Evaluator
from lib.utils.logging import Logger, TFLogger
from lib.utils.serialization import load_checkpoint, save_checkpoint
from lib.utils.osutils import make_symlink_if_not_exists
# new
from lib.evaluation_metrics.metrics import get_str_list
from lib.utils.labelmaps import get_vocabulary, labels2strs

global_args = get_args(sys.argv[1:])


def image_process(image_path, imgH=32, imgW=100, keep_ratio=False, min_ratio=1):
    img = Image.open(image_path).convert('RGB')

    if keep_ratio:
        w, h = img.size
        ratio = w / float(h)
        imgW = int(np.floor(ratio * imgH))
        imgW = max(imgH * min_ratio, imgW)

    img = img.resize((imgW, imgH), Image.BILINEAR)
    img = transforms.ToTensor()(img)
    img.sub_(0.5).div_(0.5)

    return img


# new
class DataInfo(object):
    """
    Save the info about the dataset.
    This is a code snippet from dataset.py.
    """
    def __init__(self, voc_type):
        super(DataInfo, self).__init__()
        self.voc_type = voc_type

        assert voc_type in ['LOWERCASE', 'ALLCASES', 'ALLCASES_SYMBOLS']
        self.EOS = 'EOS'
        self.PADDING = 'PADDING'
        self.UNKNOWN = 'UNKNOWN'
        self.voc = get_vocabulary(voc_type, EOS=self.EOS, PADDING=self.PADDING, UNKNOWN=self.UNKNOWN)
        self.char2id = dict(zip(self.voc, range(len(self.voc))))
        self.id2char = dict(zip(range(len(self.voc)), self.voc))
        self.rec_num_classes = len(self.voc)


def main(args):
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)
    torch.cuda.manual_seed(args.seed)
    torch.cuda.manual_seed_all(args.seed)
    cudnn.benchmark = True
    torch.backends.cudnn.deterministic = True

    args.cuda = args.cuda and torch.cuda.is_available()
    if args.cuda:
        print('using cuda.')
        torch.set_default_tensor_type('torch.cuda.FloatTensor')
    else:
        torch.set_default_tensor_type('torch.FloatTensor')

    # Create data loaders
    if args.height is None or args.width is None:
        args.height, args.width = (32, 100)

    dataset_info = DataInfo(args.voc_type)

    # Create model
    model = ModelBuilder(arch=args.arch, rec_num_classes=dataset_info.rec_num_classes,
                         sDim=args.decoder_sdim, attDim=args.attDim, max_len_labels=args.max_len,
                         eos=dataset_info.char2id[dataset_info.EOS], STN_ON=args.STN_ON)

    # Load from checkpoint
    if args.resume:
        checkpoint = load_checkpoint(args.resume)
        model.load_state_dict(checkpoint['state_dict'])

    device = torch.device("cuda" if args.cuda else "cpu")
    if args.cuda:
        model = model.to(device)
        model = nn.DataParallel(model)

    # Evaluator
    model.eval()
    img = image_process(args.image_path)
    with torch.no_grad():
        img = img.to(device)
        input_dict = {}
        input_dict['images'] = img.unsqueeze(0)
        # TODO: testing should be cleaner.
        # To be compatible with the lmdb-based testing, we need to construct some meaningless variables.
        rec_targets = torch.IntTensor(1, args.max_len).fill_(1)
        rec_targets[:, args.max_len - 1] = dataset_info.char2id[dataset_info.EOS]
        input_dict['rec_targets'] = rec_targets
        input_dict['rec_lengths'] = [args.max_len]
        output_dict = model(input_dict)
    pred_rec = output_dict['output']['pred_rec']
    pred_str, _ = get_str_list(pred_rec, input_dict['rec_targets'], dataset=dataset_info)
    print('Recognition result: {0}'.format(pred_str[0]))


if __name__ == '__main__':
    # parse the config
    os.environ['CUDA_VISIBLE_DEVICES'] = '8'
    torch.backends.cudnn.enabled = False
    args = get_args(sys.argv[1:])
    main(args)
```
2. You also need to debug the "loss_embed" part in models/model_builder.py.
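For anyone who wants to try the demo above without building an LMDB, here is a rough invocation sketch. It assumes the code above is saved as demo.py and that config.py's get_args defines --image_path, --resume, and --voc_type flags; those names are only inferred from the args.* attributes the script reads, so check config.py for the real ones and adjust the paths to your own image and checkpoint.

```python
# Hypothetical way to run the demo above on one cropped word image.
# Flag names and file paths are assumptions, not confirmed by the repo.
import subprocess

subprocess.run([
    "python", "demo.py",
    "--image_path", "./data/word_crop.jpg",          # the single picture to recognize
    "--resume", "./checkpoints/model_best.pth.tar",  # trained SEED checkpoint
    "--voc_type", "ALLCASES_SYMBOLS",                 # must match the training vocabulary
], check=True)
```

Note that --voc_type has to match whatever vocabulary the checkpoint was trained with; otherwise rec_num_classes will not line up with the checkpoint's output layer and loading the state dict will fail.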
Thanks for your reply! I have solved my problem after following your suggestions. You were so kind and patient with my questions, thank you!
------------------ Original message ------------------
From: "Pay20Y/SEED"
Date: Thursday, November 4, 2021, 2:51 PM
Subject: Re: [Pay20Y/SEED] If I want to recognize a picture, how can I do it? Do I still need to generate the LMDB format? Can you provide a predict interface? Thank you (#15)
```
/SEED/lib/models/model_builder.py", line 73, in forward
    input_dict['rec_embeds']
KeyError: 'rec_embeds'
```
You just need to comment out the code related to 'rec_embeds', such as lines 100, 102, 107, and 109 in model_builder.py.
Once these lines are commented out, the code runs fine.
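If you prefer not to comment those lines out permanently, a guard in the same spirit can simply skip the embedding loss whenever 'rec_embeds' is missing from the input. The sketch below is only an illustration under assumptions: the helper name, the placeholder MSE loss, and the 300-dim embedding size are mine, not the actual code in model_builder.py.

```python
import torch
import torch.nn.functional as F

# Hypothetical helper showing the guard pattern; compute_embedding_loss, pred_embeds,
# and the MSE placeholder are illustrative assumptions, not SEED's real implementation.
def compute_embedding_loss(pred_embeds, input_dict):
    if 'rec_embeds' in input_dict:
        # Training case: ground-truth word embeddings are present in the batch.
        return F.mse_loss(pred_embeds, input_dict['rec_embeds'])
    # Single-image inference: no embedding targets, so return a zero loss
    # instead of indexing the missing key (which is what raised the KeyError).
    return torch.zeros((), device=pred_embeds.device)

# Inference-style call: input_dict only carries the image, no 'rec_embeds'.
pred_embeds = torch.randn(1, 300)
print(compute_embedding_loss(pred_embeds, {'images': torch.randn(1, 3, 32, 100)}))  # tensor(0.)
```

Guarding on key presence keeps training unchanged (assuming the training loader supplies 'rec_embeds') while letting the single-image demo run without it.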