This repo contains links to the materials for the Learning to Learn workshop at Machine Learning Prague 2023.
The primary playground environment for the exercises below is Google Colab.
The linked Colab notebooks resolve their dependencies themselves, but if you'd like to run the exercises elsewhere,
simply install the attached requirements.txt into any environment:
git clone https://github.com/gaussalgo/L2L_MLPrague23.git
pip install -r L2L_MLPrague23/requirements.txt
Workshop outline:

- Architectures
  - Differences from other architectures: the attention layer (sketched below)
  - Tasks (= objectives)
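To give a feel for the attention layer that distinguishes Transformers from earlier architectures, here is a minimal scaled dot-product attention in PyTorch. This is a simplified sketch (single head, no masking or learned projections), not the workshop's code:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """softmax(Q K^T / sqrt(d)) V for a single attention head."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5  # query-key similarity, shape (seq_q, seq_k)
    weights = F.softmax(scores, dim=-1)          # each query position attends over all keys
    return weights @ v                           # weighted mixture of value vectors

# toy example: a sequence of 4 token vectors with hidden size 8
q = k = v = torch.randn(4, 8)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([4, 8])
```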
- Pre-training & Fine-tuning
- Inputs and outputs
  - Single token prediction
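A minimal sketch of what single (next-)token prediction looks like with a causal language model from HuggingFace transformers; the model choice (gpt2) is purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Machine Learning Prague is a", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits          # shape: (batch, seq_len, vocab_size)

next_token_id = int(logits[0, -1].argmax())  # the single most probable next token
print(tokenizer.decode([next_token_id]))
```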
- Generation
  - Iterative prediction
  - Other generation strategies
  - [Hands-on] Constraining generated output (forcing & disabling)
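The constrained-generation hands-on lives in the linked notebook; as a rough taste, transformers' generate() can force or disable words like this (flan-t5-small and the prompt are just illustrative assumptions):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

inputs = tokenizer("Translate to German: How old are you?", return_tensors="pt")

# forcing words requires beam search (num_beams > 1); bad_words_ids disables the listed tokens
outputs = model.generate(
    **inputs,
    num_beams=4,
    force_words_ids=[tokenizer("Sie", add_special_tokens=False).input_ids],
    bad_words_ids=[tokenizer("du", add_special_tokens=False).input_ids],
    max_new_tokens=30,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```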
- Problem definition (usage)
  - Contrast with supervised ML
  - Zero-shot vs. few-shot
  - Examples
  - [Hands-on] Comparison of zero-shot vs. few-shot performance with a chosen ICL model (see the prompting sketch after this outline)
  - Heterogeneity of demonstrations
- Prompt engineering
  - Promptsource: a database of prompts
  - [Hands-on] Prompt engineering (inspired by the training data)
- Training strategies + existing models
  - Training in an explicit few-shot format (QA)
  - Instruction tuning
  - Multitask learning
  - Chain-of-Thought
  - Pre-training on code
  - Fine-tuning with human feedback
- Data properties fostering ICL
  - Experiments
  - Explanations of the existing models
- [Hands-on] Customizing few-shot ICL to specialized data
- Practical training pipeline (see the fine-tuning sketch after this outline)
  - Overview of the training pipeline
  - Adaptor example
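To make the zero-shot vs. few-shot distinction from the outline concrete, here is a minimal prompting sketch. The model and the toy sentiment task are illustrative assumptions, not the workshop's data:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

def answer(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=10)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# zero-shot: only the instruction and the query
zero_shot = ("Decide whether the review is positive or negative.\n"
             "Review: I loved every minute.\nSentiment:")

# few-shot (in-context learning): prepend demonstrations in the same format
few_shot = ("Review: The plot was dull and predictable.\nSentiment: negative\n\n"
            "Review: A delightful surprise from start to finish.\nSentiment: positive\n\n"
            "Review: I loved every minute.\nSentiment:")

print("zero-shot:", answer(zero_shot))
print("few-shot: ", answer(few_shot))
```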
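The practical training pipeline and the Adaptor example are covered in the linked notebook; as a rough stand-in, the sketch below shows one sequence-to-sequence fine-tuning step with plain transformers and PyTorch. The model, data, and hyperparameters are placeholder assumptions:

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# placeholder instruction-formatted pairs; a real pipeline would stream these from a dataset
train_pairs = [
    ("Decide the sentiment: I loved every minute.", "positive"),
    ("Decide the sentiment: The plot was dull and predictable.", "negative"),
]

model.train()
for prompt, target in train_pairs:
    batch = tokenizer(prompt, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    loss = model(**batch, labels=labels).loss  # standard seq2seq cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```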
If you have trained your own great few-shot ICL model, it would be a pity not to test it on some unseen reasoning tasks.
See the competition README for how to evaluate the model and, if it beats the baseline, how to spread the word!