D-STAR: Demonstrative Self-Training for Source-free Domain Adaptation of Entity Linking with Foundation Models
This repository contains the code for D-STAR and the FandomWiki dataset for evaluating source-free domain adaptation. In this work, we present D-STAR, a framework that solves unsupervised entity linking through Demonstrative Self-Training and source-free domain adaptation.
*** UPDATE ***
We have uploaded a comparison of running examples to illustrate our method.
We have uploaded the D-STAR query generation scripts with GPT-3.5 as the foundation model.
We have uploaded the D-STAR query generation scripts with a quantized LLaMA as the foundation model.
Our approach uses few-shot examples to prompt a foundation model to generate factoid, context-related questions for mention-entity pairs. The order of these examples is determined by a path sampled from a graph encoded by the retriever. We then adapt the retrieval model directly to the generated queries and label the retrieved entity documents with the model's prior knowledge, aided by a pseudo-label denoising strategy. Our group contrastive learning strategy shares negative samples within subgraphs. The updated model recomputes distances within the unvisited graph and refreshes the demonstration priority queue for the next self-training cycle. In this way, demonstrative self-training updates question generation and question answering simultaneously, without accessing source-domain data. A sketch of one cycle is shown below.
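The following is a minimal Python sketch of one self-training cycle. Every name in it (`sample_path`, `build_prompt`, `denoise`, the `retriever`/`graph` methods) is a hypothetical placeholder passed in by the caller, not this repository's actual API.

```python
# Illustrative sketch of one D-STAR self-training cycle; all helpers
# are hypothetical stand-ins supplied by the caller.
def dstar_cycle(retriever, llm_generate, graph, demo_queue,
                sample_path, build_prompt, denoise):
    # 1. Order few-shot demonstrations along a path sampled from the
    #    retriever-encoded graph, guided by the priority queue.
    demos = sample_path(graph, demo_queue)

    # 2. Prompt the foundation model for a factoid, context-related
    #    question per unvisited mention-entity pair.
    queries = [llm_generate(build_prompt(demos, pair))
               for pair in graph.unvisited_pairs()]

    # 3. Pseudo-label the retrieved entity documents with the
    #    retriever's prior knowledge, then denoise the labels.
    labels = denoise(retriever.retrieve(queries))

    # 4. Group contrastive learning: negatives are shared only
    #    within each subgraph.
    retriever.train_group_contrastive(queries, labels)

    # 5. Recompute distances on the unvisited graph and refresh the
    #    demonstration priority queue for the next cycle.
    graph.recompute_distances(retriever)
    demo_queue.update(graph)
```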
Our evaluation code has been tested on Ubuntu 20.04 with an RTX 3090. To install the required packages:
pip install -r requirements.txt
Download and unzip the datasets (a minimal loading example follows the directory tree below):
├── data
├── documents
│ ├── american_football.json
│ ├── coronation_street.json
│ ├── doctor_who.json
│ ├── elder_scrolls.json
│ ├── fallout.json
│ ├── final_fantasy.json
│ ├── forgotten_realms.json
│ ├── ice_hockey.json
│ ├── lego.json
│ ├── military.json
│ ├── muppets.json
│ ├── pro_wrestling.json
│ ├── star_trek.json
│ ├── starwars.json
│ ├── world_of_warcraft.json
│ └── yugioh.json
├── entity2mention.json
├── mention2entity.json
├── Fandomwiki
│ ├── mentions
│ │ ├── test.json
│ │ ├── train.json
│ │ └── valid.json
│ └── tfidf_candidates
│ ├── test_tfidfs.json
│ ├── train_tfidfs.json
│ └── valid_tfidfs.json
└── Zeshel
├── mentions
│ ├── all.json
│ ├── test.json
│ ├── train.json
│ └── valid.json
└── tfidf_candidates
├── test_tfidfs.json
├── train_tfidfs.json
└── valid_tfidfs.json
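As a quick sanity check after unpacking, here is a minimal sketch of loading the files with plain `json`. The exact schema of each file is an assumption, so inspect the files before relying on field names.

```python
import json
from pathlib import Path

DATA = Path("data")

# Per-domain entity documents (see the tree above).
with open(DATA / "documents" / "doctor_who.json") as f:
    documents = json.load(f)

# FandomWiki mention splits; Zeshel follows the same layout.
with open(DATA / "Fandomwiki" / "mentions" / "test.json") as f:
    mentions = json.load(f)

# Global mention-to-entity mapping (schema assumed; verify first).
with open(DATA / "mention2entity.json") as f:
    mention2entity = json.load(f)

print(len(mentions), "test mentions loaded")
```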
Download checkpoints
Name | Size | Download Link |
---|---|---|
bi_encoder (D-STAR) | 831 MB | Result/Checkpoint |
bi_encoder_cand1_group_contrastive_learning | 831 MB | Result/Checkpoint |
cross_encoder | 831 MB | Result/Checkpoint |
ColBERT-v2 | 406 MB | Checkpoint |
Checkpoint structure
├── bi_encoder_cand1_group_contrastive_learning
│ ├── cross_domain_test_metric.json
│ ├── Fandomwiki_test_metric.json
│ └── model_best.ckpt
├── bi_encoder
│ ├── cross_domain_test_metric.json
│ ├── Fandomwiki_test_metric.json
│ └── model_best.ckpt
├── cross_encoder
│ ├── cross_domain_test_metric.json
│ ├── Fandomwiki_test_metric.json
│ └── model_best.ckpt
Run the evaluation scripts:
bash scripts/eval_fandomwiki.sh
bash scripts/eval_zeshel.sh
D-STAR query generation using GPT-3.5
cd colbert
bash scripts/query_generation_chatgpt.sh
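For intuition, here is a minimal sketch of few-shot query generation with GPT-3.5 via the OpenAI Python client. The prompt layout and the demonstration triples are illustrative assumptions; see `scripts/query_generation_chatgpt.sh` for the actual pipeline. Requires `OPENAI_API_KEY` in the environment.

```python
from openai import OpenAI

client = OpenAI()

def build_prompt(demonstrations, context, mention):
    # demonstrations: (context, mention, question) triples, ordered by
    # the sampled graph path (hypothetical structure).
    shots = "\n\n".join(
        f"Context: {c}\nMention: {m}\nQuestion: {q}"
        for c, m, q in demonstrations
    )
    return f"{shots}\n\nContext: {context}\nMention: {mention}\nQuestion:"

def generate_query(demonstrations, context, mention):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": build_prompt(demonstrations, context, mention)}],
        temperature=0.7,
        max_tokens=64,
    )
    return response.choices[0].message.content.strip()
```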
D-STAR query generation using LLaMA
cd colbert
bash scripts/query_generation_llama.sh
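A minimal sketch of loading a quantized LLaMA with `transformers` + `bitsandbytes` for the same query generation step. The checkpoint name and the int8 setting are assumptions; adapt them to the weights you actually quantized.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "huggyllama/llama-7b"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,   # bitsandbytes int8 quantization
    device_map="auto",
)

# Same few-shot prompt layout as the GPT-3.5 sketch above.
prompt = "Context: ...\nMention: ...\nQuestion:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```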
D-STAR self-training
cd colbert
bash scripts/self_training.sh
Run group contrastive learning with 4-8 GPUs to achieve similar retrieval performance on FandomWiki and Zeshel.
bash train.sh
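For intuition, here is a minimal PyTorch sketch of a group contrastive objective, assuming L2-normalized query/document embeddings and a subgraph id per pair; the tensor shapes and temperature are our assumptions, not the repository's configuration.

```python
import torch
import torch.nn.functional as F

def group_contrastive_loss(q, d, group_ids, temperature=0.05):
    """q, d: [N, dim] L2-normalized embeddings of aligned query-document
    pairs; group_ids: [N] subgraph id of each pair (hypothetical)."""
    sim = q @ d.t() / temperature                        # [N, N] similarities
    same_group = group_ids.unsqueeze(0) == group_ids.unsqueeze(1)
    # Only documents from the same subgraph act as (shared) negatives;
    # the diagonal holds each query's positive document.
    sim = sim.masked_fill(~same_group, float("-inf"))
    targets = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(sim, targets)
```

Masking out cross-subgraph pairs keeps the negatives topically close to each query, which is what makes sharing negatives within a subgraph effective.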
Run the PEFT version of group contrastive learning on a single GPU with BitFit / LoRA / Adapter / PromptTuning!
bash train_peft.sh
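As an illustration, the LoRA variant could be wired up with HuggingFace PEFT roughly as follows; the base checkpoint and target modules are assumptions, not the repository's config, and BitFit, Adapter, or prompt tuning would swap in analogous configs.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModel

base = AutoModel.from_pretrained("bert-base-uncased")  # placeholder encoder
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["query", "value"],  # BERT attention projections
    lora_dropout=0.1,
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the LoRA weights train
```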
We compare questions generated with random demonstrations against questions generated with demonstrations drawn from subgraphs to illustrate diversity. Although perplexity can vary with model scale, as shown in Figure 7, the diversity of questions generated by D-STAR remains satisfactory, as the examples below show. Compared with demonstrations randomly sampled from other topics or domains, the foundation model is better at understanding and generating questions when given demonstrations from the same domain. Demonstrations from the same domain, or better still from the same subgraph neighborhood, help the model extrapolate question generation to low-overlap mention-entity pairs, which demand a higher level of knowledge.
Table 4: Comparison of question generation with random demonstrations (shown in grey in the paper) and D-STAR on FandomWiki. Each mention appears twice, once per demonstration strategy.
Domain | Mention | Matching | Question |
---|---|---|---|
Doctor Who | Project FXX Q84 | Low | What was the stolen project that Sheldukher used to search for Sakkrat called? |
Doctor Who | Project FXX Q84 | Low | _________________? |
Doctor Who | Quinn | Medium | Which planet did Quinn end up on? |
Doctor Who | Quinn | Medium | Which is not true? |
Doctor Who | Joan Redfern | High | Who did the Doctor give his jacket to after wearing it through several adventures? |
Doctor Who | Joan Redfern | High | Who was Redfern? |
Star wars | colleague | Low | Which sephi aide served navi during the clone wars? |
Star wars | colleague | Low | What? |
Star wars | attacked | Medium | Where was alderaanian senator bail prestor organa attacked by pirate forces? |
Star wars | attacked | Medium | Question: Question:? |
Star wars | Nightbrother | Other | Who was the Nightbrother? |
Star wars | Nightbrother | Other | Who was maul's brother? |
Military | Fort Knox | Low | What is the name of the fort that is used as a location in the game? |
Military | Fort Knox | Low | american forts during 18th and 19th centuries french british and american nations built and |
Military | Bloody Creek | Medium | What is the significance of the battle of bloody creek? |
Military | Bloody Creek | Medium | What is the name of the french fort that was built in 1632 to protect acadia from attacks by the wabanaki the french? |
Military | Springfield rifled musket | Other | What was the name of the primary weapon used by the Union Army during the Civil War? |
Military | Springfield rifled musket | Other | What was armory's primary weapon of union infantry during war? |
American Football | UGA | Low | How did the season end when Georgia had won the acc coastal division but lost its last 3 games to its rival uga? |
American Football | UGA | Low | _________________? |
American Football | J . T . Thomas | Medium | Which position did J T Thomas play for the Buffalo Bills? |
American Football | J . T . Thomas | Medium | Question: |
American Football | Jimmy Graham | High | Who is Jimmy Graham? |
American Football | Jimmy Graham | High | What is the name of Jimmy Graham? |
Muppets | Birds - of - a - Feather club | Low | What badge is Maxwell trying to earn with his fellow club members at the Birds-of-a-Feather club? |
Muppets | Birds - of - a - Feather club | Low | What is scouting? |
Muppets | Fraggle Rock | Medium | Which character has the most similar view about life? |
Muppets | Fraggle Rock | Medium | What is Fraggle Rock? |
Muppets | Orn | High | How long did Orn serve under Crais? |
Muppets | Orn | High | Who played Orn on farscape episode? |
Here, Matching indicates the degree of textual overlap between a mention and its entity, measured by BM25 score.
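For example, such an overlap score can be reproduced with any BM25 implementation; the sketch below uses the `rank_bm25` package on a toy corpus, which is our choice of toolkit rather than the paper's exact setup.

```python
# pip install rank-bm25
from rank_bm25 import BM25Okapi

# Tokenized entity documents; in practice these come from data/documents/*.json.
entity_docs = [
    "project fxx q84 was a research project stolen by sheldukher".split(),
    "joan redfern was a nurse at farringham school in 1913".split(),
]
bm25 = BM25Okapi(entity_docs)

mention_context = "the stolen project sheldukher used to search for sakkrat".split()
print(bm25.get_scores(mention_context))  # higher score = higher textual overlap
```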
- Evaluation scripts uploaded.
- Datasets and checkpoints uploaded.
- D-STAR query generation with GPT-3.5.
- D-STAR query generation with LLaMA.
- Self-training scripts.
- Contrastive training scripts.
- General self-training pipeline.