A Go-based simulator for analyzing and optimizing serverless function cold starts and resource management policies, based on the paper "Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider" (USENIX ATC '20).
This simulator implements different keep-alive policies for serverless function management and compares their effectiveness in:
- Reducing cold starts
- Optimizing memory usage
- Managing resource allocation
- Predicting function invocation patterns
The project includes implementations of:
- Fixed keep-alive policy (baseline)
- Hybrid histogram-based policy
- ARIMA-based prediction for out-of-bounds cases
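To make the baseline concrete, here is a minimal sketch of how a fixed keep-alive policy decides whether an invocation is a cold start. It is not taken from the simulator's source: the function name `countColdStarts`, the use of minutes as the time unit, and the simplification that execution time is zero are all assumptions made for illustration.

```go
package main

import "fmt"

// countColdStarts replays invocation timestamps (minutes, ascending) for one
// function under a fixed keep-alive policy. Hypothetical helper, not the
// simulator's API. Assumes zero execution time: the instance stays loaded for
// keepAlive minutes after each invocation, and an invocation that arrives
// later than that is a cold start. The first invocation is always cold.
func countColdStarts(arrivals []float64, keepAlive float64) int {
	cold := 0
	for i, t := range arrivals {
		if i == 0 || t-arrivals[i-1] > keepAlive {
			cold++
		}
	}
	return cold
}

func main() {
	// Idle gaps are 5, 25, 15 and 75 minutes; with a 20-minute keep-alive
	// the 25- and 75-minute gaps each cause a cold start.
	arrivals := []float64{0, 5, 30, 45, 120}
	cold := countColdStarts(arrivals, 20)
	fmt.Printf("cold starts: %d of %d\n", cold, len(arrivals)) // cold starts: 3 of 5
}
```

A longer keep-alive lowers the cold start count at the cost of keeping idle instances in memory, which is the trade-off the hybrid policy tries to improve on.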
Prerequisites:
- Go 1.19 or later
- Python 3.8 or later (for visualization)
- Azure Functions dataset
Installation:
- Clone the repository:
git clone https://github.com/yourusername/serverless-simulator.git
cd serverless-simulator
- Install Go dependencies:
go mod download
- Install Python dependencies for visualization:
pip install -r scripts/requirements.txt
- Download the Azure Functions dataset:
mkdir data
cd data
wget https://azurepublicdatasettraces.blob.core.windows.net/azurepublicdatasetv2/azurefunctions_dataset2019/azurefunctions-dataset2019.tar.xz
tar xf azurefunctions-dataset2019.tar.xz
- Run simulation with fixed keep-alive policy:
go run cmd/simulator/main.go \
--policy fixed \
--keepalive 20 \
--data ./data
- Run simulation with hybrid policy:
go run cmd/simulator/main.go \
--policy hybrid \
--range 240 \
--prewarm 5.0 \
--keepalivepct 99.0 \
--cv 2.0 \
--data ./data
- Generate visualization plots:
python scripts/visualize.py
serverless-simulator/
├── cmd/ # Command-line entry points
│ └── simulator/ # Main simulator executable
├── internal/ # Internal packages
│ ├── config/ # Configuration handling
│ ├── model/ # Core data structures
│ ├── metrics/ # Metrics for exporting data
│ ├── policy/ # Policy implementations
│ ├── simulator/ # Simulator core
│ └── dataloader/ # Dataset loading
├── pkg/ # Public packages
│ ├── histogram/ # Histogram implementation
│ └── timeseries/ # Time series prediction
├── scripts/ # Analysis and visualization
└── data/ # Dataset directory
- --policy: Policy type (fixed or hybrid)
- --keepalive: Keep-alive period in minutes
- --range: Histogram range in minutes
- --prewarm: Pre-warm percentile
- --keepalivepct: Keep-alive percentile
- --cv: Coefficient of variation threshold
- --data: Path to data directory
- --output: Output file path
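As an illustration of how these options could be wired up, the following sketch parses them with Go's standard `flag` package. The flag names come from the list above; the `options` struct, the `parseOptions` helper, the value types, and the defaults for `--policy`, `--data` and `--output` are assumptions, and the real `cmd/simulator/main.go` may be organized differently.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options mirrors the documented command-line flags. The struct and its
// field types are hypothetical, chosen only for this sketch.
type options struct {
	Policy       string
	KeepAlive    int
	Range        int
	Prewarm      float64
	KeepAlivePct float64
	CV           float64
	Data         string
	Output       string
}

// parseOptions parses args (without the program name) into options.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("simulator", flag.ContinueOnError)
	fs.StringVar(&o.Policy, "policy", "fixed", "policy type (fixed or hybrid)") // default is an assumption
	fs.IntVar(&o.KeepAlive, "keepalive", 20, "keep-alive period in minutes")
	fs.IntVar(&o.Range, "range", 240, "histogram range in minutes")
	fs.Float64Var(&o.Prewarm, "prewarm", 5.0, "pre-warm percentile")
	fs.Float64Var(&o.KeepAlivePct, "keepalivepct", 99.0, "keep-alive percentile")
	fs.Float64Var(&o.CV, "cv", 2.0, "coefficient of variation threshold")
	fs.StringVar(&o.Data, "data", "./data", "path to data directory") // default is an assumption
	fs.StringVar(&o.Output, "output", "", "output file path")         // default is an assumption
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.Policy != "fixed" && o.Policy != "hybrid" {
		return o, fmt.Errorf("unknown policy %q (want fixed or hybrid)", o.Policy)
	}
	return o, nil
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

The `flag` package accepts both `-policy` and `--policy`, so the double-dash form used in the commands above works unchanged.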
- Fixed Policy:
  - Keep-alive period (default: 20 minutes)
- Hybrid Policy:
  - Histogram range (default: 240 minutes)
  - Pre-warm percentile (default: 5.0)
  - Keep-alive percentile (default: 99.0)
  - CV threshold (default: 2.0)
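The sketch below shows one way the hybrid policy's parameters can fit together, following the general idea in the paper: idle times between invocations are collected in a histogram of one-minute bins, the pre-warm percentile marks the head of the distribution, the keep-alive percentile marks the tail, and the coefficient of variation (CV) of the bin counts decides whether the histogram is concentrated enough to trust. All function names here are hypothetical, the safety margins described in the paper are omitted, and the ARIMA path for out-of-bounds idle times is not covered; the simulator's own code may differ.

```go
package main

import (
	"fmt"
	"math"
)

// percentileBin returns the index of the first bin at which the cumulative
// count reaches pct percent of all samples.
func percentileBin(bins []int, pct float64) int {
	total := 0
	for _, c := range bins {
		total += c
	}
	target := pct / 100 * float64(total)
	cum := 0
	for i, c := range bins {
		cum += c
		if cum > 0 && float64(cum) >= target {
			return i
		}
	}
	return len(bins) - 1
}

// coefficientOfVariation returns stddev/mean of the bin counts.
func coefficientOfVariation(bins []int) float64 {
	n := float64(len(bins))
	if n == 0 {
		return 0
	}
	sum := 0.0
	for _, c := range bins {
		sum += float64(c)
	}
	mean := sum / n
	if mean == 0 {
		return 0
	}
	variance := 0.0
	for _, c := range bins {
		d := float64(c) - mean
		variance += d * d
	}
	return math.Sqrt(variance/n) / mean
}

// windows returns the pre-warm and keep-alive windows in minutes for a
// histogram of one-minute idle-time bins (hypothetical helper).
func windows(bins []int, prewarmPct, keepAlivePct, cvThreshold float64) (prewarm, keepAlive float64) {
	if coefficientOfVariation(bins) < cvThreshold {
		// Histogram is too flat to be representative: do not unload after
		// an execution and keep alive for the whole histogram range.
		return 0, float64(len(bins))
	}
	head := percentileBin(bins, prewarmPct)   // lower edge of this bin
	tail := percentileBin(bins, keepAlivePct) // upper edge of this bin
	prewarm = float64(head)
	keepAlive = float64(tail+1) - prewarm
	return prewarm, keepAlive
}

func main() {
	bins := make([]int, 240) // --range 240
	bins[10], bins[11], bins[12] = 50, 40, 10
	p, k := windows(bins, 5.0, 99.0, 2.0)
	fmt.Printf("pre-warm: %.0f min, keep-alive: %.0f min\n", p, k) // pre-warm: 10 min, keep-alive: 3 min
}
```

In this example the idle times cluster between 10 and 13 minutes, so the instance can be unloaded right after an execution, reloaded 10 minutes later, and kept alive for only 3 minutes, instead of staying resident for a fixed 20.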
The simulator generates:
- Detailed metrics for:
  - Cold start ratios
  - Memory waste time
- Visualization plots:
  - Cold start rate
  - Wasted memory time comparison
  - Policy comparisons
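To show what the memory waste metric measures, the following sketch computes wasted memory time for a single function under a fixed keep-alive policy, taken here as the total time an instance sits loaded without serving a request. The helper name `wastedMinutes`, the minute time unit, and the zero-execution-time simplification are assumptions for illustration; the simulator's metrics package may define the metric differently.

```go
package main

import (
	"fmt"
	"math"
)

// wastedMinutes returns how long an instance is loaded but idle under a fixed
// keep-alive policy. Hypothetical helper, not the simulator's API. arrivals
// are invocation timestamps in minutes, ascending; execution time is assumed
// to be zero.
func wastedMinutes(arrivals []float64, keepAlive float64) float64 {
	wasted := 0.0
	for i := range arrivals {
		// After the last invocation the instance idles for the full window.
		idle := keepAlive
		if i+1 < len(arrivals) {
			// Otherwise it idles until the next invocation or until the
			// keep-alive expires, whichever comes first.
			idle = math.Min(arrivals[i+1]-arrivals[i], keepAlive)
		}
		wasted += idle
	}
	return wasted
}

func main() {
	arrivals := []float64{0, 5, 30, 45, 120}
	// Idle periods: 5 + 20 + 15 + 20 + 20 minutes.
	fmt.Printf("wasted memory time: %.0f minutes\n", wastedMinutes(arrivals, 20)) // wasted memory time: 80 minutes
}
```

Comparing this figure against the cold start count for the same trace gives the two axes along which the policies are evaluated.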
Acknowledgments:
- Based on research from the USENIX ATC '20 paper
- Uses the Azure Functions dataset