Timer (Large Time Series Model)

This repo provides official code, datasets and checkpoints for Timer: Generative Pre-trained Transformers Are Large Time Series Models. [Poster], [Slides].

Updates

🚩 News (2024.6) The pre-training dataset (UTSD) is available on HuggingFace. A dataloader is also included.

🚩 News (2024.5) Timer was accepted by ICML 2024; the 31-page camera-ready version is available.

🚩 News (2024.4) The pre-training scale has been extended, enabling zero-shot forecasting.

🚩 News (2024.2) Model checkpoints and code for adaptation are released.

Introduction

Time Series Transformer (Timer) is a Generative Pre-trained Transformer for general time series analysis. You can visit our Homepage for a more detailed introduction.

Datasets

We curate Unified Time Series Datasets (UTSD), comprising 1 billion time points across 4 volumes, to facilitate research on large time series models and pre-training.

The dataset is released on HuggingFace to support research on large models and pre-training in the time series field.

Usage

You can download and load UTSD in the style of TSLib with the following commands:

# huggingface-cli login
# export HF_ENDPOINT=https://hf-mirror.com 

python ./scripts/UTSD/download_dataset.py

# dataloader
python ./scripts/UTSD/utsdataset.py
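
If you prefer to use the HuggingFace datasets library directly instead of the scripts above, a minimal sketch could look like the one below. The dataset id thuml/UTSD and the subset name are assumptions here; check the HuggingFace page or ./scripts/UTSD/utsdataset.py for the actual layout and field names.

# A rough sketch (not the official loader): pull one UTSD subset into the local HF cache.
from datasets import load_dataset

# Dataset id and subset name are assumptions; a different subset may need to be specified.
utsd = load_dataset("thuml/UTSD", "UTSD-1G", split="train")
print(utsd)         # inspect the available fields
sample = utsd[0]    # one single-series record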

Tasks

Forecasting: We provide all scripts as well as datasets for few-shot forecasting in this repo.

Imputation: We propose segment-level imputation, which is more challenging than point-level imputation.

Anomaly Detection: We provide new benchmarks of predictive anomaly detection on the UCR Anomaly Archive.

We provide detailed README files describing each task under the folder ./scripts/.

Code for Fine-tuning

  1. Install PyTorch and the necessary dependencies.

pip install -r requirements.txt

  2. Put the downstream datasets from Google Drive or Tsinghua Cloud under the folder ./dataset/.

  3. Put the checkpoint from Google Drive or Tsinghua Cloud under the folder ./checkpoints/.

  4. Train and evaluate the model. We provide scripts for the above tasks under the folder ./scripts/.

# forecasting
bash ./scripts/forecast/ECL.sh

# segment-level imputation
bash ./scripts/imputation/ECL.sh

# anomaly detection
bash ./scripts/anomaly_detection/UCR.sh

Train on Custom Dataset

To fine-tune Timer on your own time series dataset, you can follow these steps:

  1. The essence is to plug in your customized dataloader and load the pre-trained checkpoint (see the ./scripts/ folder); a minimal dataset sketch follows this list.
  2. CIDatasetBenchmark / CIAutoRegressionDatasetBenchmark in the data_provider folder can train and evaluate models in direct or iterative multi-step mode.
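
As a starting point, a customized dataloader only needs to yield (input, target) windows with the lengths the scripts expect. The sketch below is a generic PyTorch Dataset over a single-column CSV; the file path, column name, and window lengths are illustrative placeholders, and you would register such a class alongside the existing benchmarks in the data_provider folder rather than use it as-is.

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset

class CustomWindowDataset(Dataset):
    # Hypothetical example: slide a fixed lookback/horizon window over one series.
    def __init__(self, csv_path="./dataset/custom.csv", column="value",
                 lookback=672, horizon=96):
        self.series = pd.read_csv(csv_path)[column].to_numpy(dtype=np.float32)
        self.lookback = lookback
        self.horizon = horizon

    def __len__(self):
        return len(self.series) - self.lookback - self.horizon + 1

    def __getitem__(self, idx):
        end = idx + self.lookback
        x = self.series[idx:end]                    # model input (lookback window)
        y = self.series[end:end + self.horizon]     # forecast target (horizon window)
        return torch.from_numpy(x), torch.from_numpy(y)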

Approach

Pre-training and Adaptation

To pre-train on heterogeneous time series, we propose the single-series sequence (S3) format, which preserves series variations within a unified context length. Further, we convert forecasting, imputation, and anomaly detection into a unified generative task.
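
The idea behind S3 can be illustrated with a few lines of preprocessing: each variable of a (possibly multivariate) series is treated as an independent single series, normalized, and cut into windows of one unified context length. The snippet below is only a schematic sketch of this idea, not the exact pipeline used for pre-training.

import numpy as np

def to_single_series_sequences(series_2d, context_len=672):
    # series_2d: array of shape (time, variables); each column becomes its own series.
    sequences = []
    for column in series_2d.T:
        col = (column - column.mean()) / (column.std() + 1e-8)   # per-series normalization
        n = len(col) // context_len
        sequences.extend(col[: n * context_len].reshape(n, context_len))
    return np.stack(sequences) if sequences else np.empty((0, context_len))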

Model Architecture

Given the limited exploration of backbones for large time series models, we extensively evaluate candidate backbones and adopt the decoder-only Transformer with autoregressive generation for LTSMs.
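
For intuition, the sketch below shows the overall shape of such a backbone in PyTorch: the input series is split into patch tokens, passed through a causally masked (decoder-only) Transformer, and every token predicts the next patch. Layer counts, dimensions, and patch length are placeholders rather than the released configuration, and positional information is omitted for brevity.

import torch
import torch.nn as nn

class DecoderOnlyTimeSeriesModel(nn.Module):
    # A minimal decoder-only backbone over patch tokens (illustrative, not the official model).
    def __init__(self, patch_len=96, d_model=256, n_heads=8, n_layers=4):
        super().__init__()
        self.embed = nn.Linear(patch_len, d_model)              # patch -> token embedding
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=4 * d_model,
                                           batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)    # stacked with a causal mask
        self.head = nn.Linear(d_model, patch_len)               # token -> next-patch prediction

    def forward(self, x):
        # x: (batch, num_patches, patch_len); each token autoregressively predicts the next patch.
        tokens = self.embed(x)
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1)).to(x.device)
        hidden = self.blocks(tokens, mask=mask)
        return self.head(hidden)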

Performance

Timer achieves state-of-the-art performance in each task, and we present the benefit of pre-training in few-shot scenarios.

Scalability

By increasing the parameters and pre-training scale, Timer achieves notable performance improvement: 0.231 $\to$ 0.138 (−40.3%), surpassing the previous state-of-the-art deep forecasters.


Flexible Sequence Length

The decoder-only architecture provides the flexibility to accommodate time series of different lookback and forecast lengths.
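
Concretely, this flexibility comes from autoregressive rollout: whatever the lookback length, the model keeps appending its own predicted patches until the desired forecast length is covered. The sketch below shows such a rollout for a generic patch-level model like the one sketched above; the function name and shapes are illustrative only.

import torch

@torch.no_grad()
def autoregressive_forecast(model, lookback, patch_len=96, forecast_len=192):
    # lookback: (batch, lookback_len) with lookback_len a multiple of patch_len.
    patches = lookback.reshape(lookback.size(0), -1, patch_len)
    num_steps = -(-forecast_len // patch_len)            # ceil division: patches to generate
    for _ in range(num_steps):
        next_patch = model(patches)[:, -1:, :]           # prediction at the last token = next patch
        patches = torch.cat([patches, next_patch], dim=1)
    forecast = patches[:, -num_steps:, :].reshape(lookback.size(0), -1)
    return forecast[:, :forecast_len]                    # trim to the requested horizon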


Benchmark

Given the significant value to researchers and practitioners, we provide a summary of concurrent LTSMs:


We also establish the first zero-shot forecasting benchmark in our paper (See Section 4.6 for the details).

Future Work

We are preparing an online service for zero-shot forecasting. Please stay tuned!

Citation

If you find this repo helpful, please cite our paper.

@inproceedings{liutimer,
  title={Timer: Generative Pre-trained Transformers Are Large Time Series Models},
  author={Liu, Yong and Zhang, Haoran and Li, Chenyu and Huang, Xiangdong and Wang, Jianmin and Long, Mingsheng},
  booktitle={Forty-first International Conference on Machine Learning}
}

Acknowledgement

We appreciate the following GitHub repos for their valuable code and contributions.

Contact

If you have any questions or want to use the code, feel free to contact:
