When is random_seed used in training? Multi-label classification gives me different results on each run #280

YanLiang1102 · 2021-01-15T06:21:01Z

I see the seed being set in the notebook, but I could not find where it is referenced in the source code.


args = Box({
    "run_text": "multilabel toxic comments with freezable layers",
    "train_size": -1,
    "val_size": -1,
    "log_path": LOG_PATH,
    "full_data_dir": DATA_PATH,
    "data_dir": DATA_PATH,
    "task_name": "toxic_classification_lib",
    "no_cuda": False,
    "bert_model": BERT_PRETRAINED_PATH,
    "output_dir": OUTPUT_PATH,
    "max_seq_length": 512,
    "do_train": True,
    "do_eval": True,
    "do_lower_case": True,
    "train_batch_size": 8,
    "eval_batch_size": 16,
    "learning_rate": 5e-5,
    "num_train_epochs": 6,
    "warmup_proportion": 0.0,
    "no_cuda": False,
    "local_rank": -1,
    "seed": 42,
    "gradient_accumulation_steps": 1,
    "optimize_on_cpu": False,
    "fp16": True,
    "fp16_opt_level": "O1",
    "weight_decay": 0.0,
    "adam_epsilon": 1e-8,
    "max_grad_norm": 1.0,
    "max_steps": -1,
    "warmup_steps": 500,
    "logging_steps": 50,
    "eval_all_checkpoints": True,
    "overwrite_output_dir": True,
    "overwrite_cache": False,
    "seed": 42,
    "loss_scale": 128,
    "task_name": 'intent',
    "model_name": 'xlnet-base-cased',
    "model_type": 'xlnet'
})

lingdoc · 2022-10-20T03:38:46Z

If you're using CUDA, you probably need to define and run a function at the beginning of your code that sets the seed manually for each library. Something like:

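import random
import numpy as np
import torch

def fix_seeds(seed):
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy's global RNG
    torch.manual_seed(seed)           # PyTorch RNG
    torch.cuda.manual_seed_all(seed)  # all CUDA devices; ignored if CUDA is unavailable
    # Ask cuDNN for deterministic algorithms and turn off autotuning
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

fix_seeds(42)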

This is because each library keeps its own random number generator with its own seed, so seeding one library does not seed the others.
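To illustrate the point about separate generators, here is a minimal sketch using only the standard library. The two `random.Random` instances (`lib_a`, `lib_b`) and the `run` helper are hypothetical stand-ins for two libraries with independent state; they are not part of this repository or of PyTorch.

```python
import random

# Hypothetical stand-ins for two libraries, each holding its own RNG state.
lib_a = random.Random()
lib_b = random.Random()

def draw(rng, n=3):
    """Return n pseudo-random floats from the given generator."""
    return [rng.random() for _ in range(n)]

def run(seed_a=None, seed_b=None):
    """Reseed only the generators that were given a seed, then draw from both."""
    if seed_a is not None:
        lib_a.seed(seed_a)
    if seed_b is not None:
        lib_b.seed(seed_b)
    return draw(lib_a), draw(lib_b)

# Both generators reseeded before each run: the two runs match.
first = run(seed_a=42, seed_b=42)
second = run(seed_a=42, seed_b=42)
print("both seeded, runs match:", first == second)

# Only lib_a reseeded: lib_b keeps advancing from its previous state.
third = run(seed_a=42)
fourth = run(seed_a=42)
print("lib_a repeats:", third[0] == fourth[0])
print("lib_b repeats:", third[1] == fourth[1])
```

If a run is still not reproducible after seeding everything, it may be worth checking for any remaining unseeded source of randomness, such as data shuffling.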
