add chinese flickr30k and flickr8k from https://github.com/li-xirong/cross-lingual-cap @yangapku #101

Merged: 5 commits merged into main on Jul 10, 2023

Conversation

mehdidc
Collaborator

@mehdidc mehdidc commented Jul 9, 2023

Issue #33

@mehdidc mehdidc changed the title add chinese flickr30k and flickr8k from https://github.com/li-xirong/cross-lingual-capp @yangapku add chinese flickr30k and flickr8k from https://github.com/li-xirong/cross-lingual-cap @yangapku Jul 10, 2023
@mehdidc
Collaborator Author

mehdidc commented Jul 10, 2023

xlm-roberta-large-ViT-H-14 results on the test set:

{
  "dataset": "flickr30k",
  "model": "xlm-roberta-large-ViT-H-14",
  "pretrained": "frozen_laion5b_s13b_b90k",
  "task": "zeroshot_retrieval",
  "metrics": {
    "image_retrieval_recall@5": 0.738547682762146,
    "text_retrieval_recall@5": 0.9359999895095825
  },
  "language": "zh"
}

Chinese CLIP (https://github.com/OFA-Sys/Chinese-CLIP/tree/master) reports 0.914 for R@5 image retrieval and 0.975 for R@5 text retrieval.
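
For readers comparing the numbers above, here is a minimal sketch of how a recall@k retrieval metric can be computed from a query-by-candidate similarity matrix. It is an illustration only, not necessarily how this benchmark computes it; the function name `recall_at_k` and the `positives` structure are hypothetical.

```python
import numpy as np

def recall_at_k(scores, positives, k):
    """Fraction of queries with at least one correct candidate in the top k.

    scores:    array of shape (n_queries, n_candidates), higher = more similar.
    positives: list of sets; positives[i] holds the correct candidate
               indices for query i (hypothetical structure for this sketch).
    """
    # Indices of the k highest-scoring candidates for each query.
    topk = np.argsort(-scores, axis=1)[:, :k]
    hits = [bool(set(row.tolist()) & pos) for row, pos in zip(topk, positives)]
    return float(np.mean(hits))

# Toy example: 3 queries, 3 candidates, one correct candidate per query.
scores = np.array([
    [0.9, 0.1, 0.2],
    [0.1, 0.2, 0.8],
    [0.3, 0.5, 0.4],
])
positives = [{0}, {1}, {2}]
r1 = recall_at_k(scores, positives, k=1)
r2 = recall_at_k(scores, positives, k=2)
```

For image retrieval the queries are captions and the candidates are images; for text retrieval the roles are swapped, and each image may have several correct captions.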

@mehdidc mehdidc merged commit d0e83ab into main Jul 10, 2023
@czczup

czczup commented Sep 4, 2023

I think this result for xlm-roberta-large-ViT-H-14 may be wrong, because the Flickr-CN annotation file used in the code includes part-of-speech tags for the Chinese words (such as v for verb, n for noun), and the dataloader does not handle them.

In my experiments, the results are:
[image]
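
As an illustration of the kind of preprocessing described above, here is a minimal sketch that strips part-of-speech tags from a caption before tokenization. The tag format is an assumption: it supposes whitespace-separated tokens of the form `word:tag`, which may not match the actual annotation file. The function name `strip_pos_tags` is hypothetical and is not part of the dataloader in this repository.

```python
def strip_pos_tags(caption: str, sep: str = ":") -> str:
    """Remove trailing part-of-speech tags from a segmented caption.

    ASSUMPTION: tokens are whitespace-separated and look like "word:tag"
    (e.g. "跑步:v"). Tokens without the separator are kept unchanged.
    """
    words = []
    for token in caption.split():
        # Split on the last separator so the tag is the final piece.
        word, _, _tag = token.rpartition(sep)
        words.append(word if word else token)
    # Chinese text is written without spaces, so join the words directly.
    return "".join(words)

cleaned = strip_pos_tags("一个:m 男人:n 在:p 跑步:v")
```

If the real file uses a different separator or tag layout, the `sep` argument or the splitting logic would need to change accordingly.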

@czczup

czczup commented Sep 4, 2023

I use this annotation file.

flickr30k_cn_test.txt

[image]
