Disclaimer: This project is a work in progress and is not yet ready for production use. Please check back later for updates.
Spark streamlines job execution within a Kubernetes cluster by providing a set of endpoints to schedule, track, and manage jobs programmatically. Its integrated job tracking and logging make it well suited to environments that require dynamic job scheduling and comprehensive monitoring.
- Dynamic Job Scheduling: Automate the deployment of Kubernetes jobs through structured job specifications (image, command, and related configuration).
- Concurrency Management: Control and limit the number of jobs that can run concurrently, allowing for effective resource utilization and system stability.
- Task Queuing System: Utilize an internal queuing system to manage job tasks, ensuring that job submissions are handled efficiently and executed in order.
- Comprehensive Monitoring: Continuously monitor the status of each job, capturing and reacting to job completions, failures, and timeouts in real-time.
- Log Retrieval and Storage: Automatically fetch and store logs from job executions, providing immediate access to job outputs for debugging and verification purposes.
- Rate Limiting and Timeouts: Implement client-side rate limiting and configurable timeouts to manage the load on the Kubernetes API and ensure jobs complete within expected time frames (see the sketch after this list).
- Local Persistence: Use BuntDB, a fast in-memory key/value store with disk persistence, to track job statuses and logs across job operations.
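The scheduling and rate-limiting bullets above describe behavior rather than API. As a concrete illustration only (this is not Spark's actual code), the sketch below uses client-go to create a Kubernetes Job with client-side QPS/burst limits and a context timeout; the kubeconfig path, namespace, and job spec are placeholders:

```go
package main

import (
	"context"
	"fmt"
	"time"

	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from a kubeconfig file (path is a placeholder).
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		panic(err)
	}
	// Client-side rate limiting: cap sustained and burst request rates
	// against the Kubernetes API server.
	config.QPS = 5
	config.Burst = 10

	cs, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Bound the API call with a configurable timeout.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// A minimal job specification: image, command, and restart policy.
	job := &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{Name: "example-job"},
		Spec: batchv1.JobSpec{
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:    "main",
						Image:   "busybox",
						Command: []string{"echo", "Hello World"},
					}},
					RestartPolicy: corev1.RestartPolicyNever,
				},
			},
		},
	}

	created, err := cs.BatchV1().Jobs("default").Create(ctx, job, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("created job:", created.Name)
}
```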
Typical use cases include:
- Data processing applications: Managing batch jobs for data transformation, analysis, or ML model training.
- General automation: Running maintenance scripts, backups, and other periodic tasks within a Kubernetes cluster.
- CI/CD pipelines: Automating deployment tasks, testing, and other operations that can be encapsulated as Kubernetes jobs.
The Runner (the `k8sJobs` package) is a Go library for managing Kubernetes jobs. It provides functionality for creating, monitoring, and deleting jobs, managing concurrency, and maintaining a record of job statuses and logs in a local BuntDB database.
- Dynamic Job Management: Create and monitor Kubernetes jobs dynamically within your application.
- Concurrency Control: Manage multiple jobs concurrently with a configurable limit.
- Task Queuing: Queue tasks with a channel-based mechanism (see the sketch after this list).
- Local Persistence: Utilize BuntDB to store job statuses and logs.
- Timeout Handling: Automatically handle job execution with configurable timeouts.
- Error Handling: Robust error handling throughout the job lifecycle.
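As a rough, self-contained illustration of channel-based queuing with a concurrency cap (the package's actual implementation may differ):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Task is a stand-in for the package's task type.
type Task struct {
	ID string
}

func main() {
	const maxConcurrent = 5

	taskChan := make(chan Task, 100) // buffered channel acts as the queue
	quit := make(chan struct{})      // closed to signal shutdown
	var wg sync.WaitGroup

	// A fixed pool of workers bounds how many tasks run at once.
	for i := 0; i < maxConcurrent; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case t := <-taskChan:
					fmt.Println("running", t.ID)
					time.Sleep(100 * time.Millisecond) // stand-in for job execution
				case <-quit:
					return
				}
			}
		}()
	}

	// Enqueue tasks; sends block if the queue is full.
	for i := 0; i < 10; i++ {
		taskChan <- Task{ID: fmt.Sprintf("task-%d", i)}
	}

	time.Sleep(time.Second) // let workers drain the queue
	close(quit)             // stop the workers
	wg.Wait()
}
```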
The `Runner` struct orchestrates job tasks and interacts with the Kubernetes API.
- Fields:
  - `cs`: Kubernetes clientset used to interact with the Kubernetes API.
  - `maxConcurrentJobs`: Maximum number of jobs that can run concurrently.
  - `taskChan`: Channel for queuing tasks.
  - `quit`: Channel that signals the shutdown of task processing.
  - `namespace`: Kubernetes namespace where jobs are deployed.
  - `db`: BuntDB instance for local data storage.
The `Task` struct defines the structure of a job task (a sketch of both structs follows the field list).
- Fields:
  - `ID`: Unique identifier of the task.
  - `Command`: Command run in the Docker container.
  - `Image`: Docker image used for the job.
  - `Timeout`: Execution timeout in seconds.
  - `Status`: Current status of the task.
  - `Logs`: Logs generated by the task.
  - `CreatedAt`: Timestamp of task creation.
  - `StartedAt`: Timestamp of task start.
  - `CompletedAt`: Timestamp of task completion.
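Taken together, the two types presumably look roughly like the sketch below; the field types are assumptions inferred from the descriptions above, not the package's verbatim definitions:

```go
package k8sJobs

import (
	"time"

	"github.com/tidwall/buntdb"
	"k8s.io/client-go/kubernetes"
)

// Runner orchestrates job tasks and talks to the Kubernetes API.
// Field types are assumed from the field descriptions above.
type Runner struct {
	cs                *kubernetes.Clientset // Kubernetes clientset
	maxConcurrentJobs int                   // concurrency limit
	taskChan          chan Task             // queued tasks
	quit              chan struct{}         // shutdown signal
	namespace         string                // namespace where jobs are deployed
	db                *buntdb.DB            // local persistence
}

// Task describes a single job to run; types are likewise assumed.
type Task struct {
	ID          string    // unique identifier
	Command     []string  // container command
	Image       string    // Docker image
	Timeout     int       // execution timeout in seconds
	Status      string    // current status
	Logs        string    // captured logs
	CreatedAt   time.Time // creation timestamp
	StartedAt   time.Time // start timestamp
	CompletedAt time.Time // completion timestamp
}
```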
```go
package main

import (
	"context"
	"log"

	// Import path is illustrative; adjust it to this repository's module path.
	"example.com/spark/pkg/k8sJobs"
)

func main() {
	// Replace "default" with the desired kubeconfig path, or pass an empty
	// string to use the in-cluster configuration and its current namespace.
	runner, err := k8sJobs.New(context.Background(),
		"default", // kubernetes config location
		5,         // max concurrent jobs
		100,       // max job queue size
		60,        // default job timeout in seconds
	)
	if err != nil {
		log.Fatalf("Failed to start runner: %v", err)
	}

	task := k8sJobs.Task{
		ID:      "unique-job-id",
		Command: []string{"echo", "Hello World"},
		Image:   "busybox",
		Timeout: 30,
	}

	if err := runner.AddTask(task); err != nil {
		log.Printf("Failed to add task: %v", err)
	}

	// Perform additional operations...

	runner.Shutdown()
}
```
The dashboard provides a minimalist, read-only view of job statuses and logs.
The CLI provides a set of commands to interact with the Spark system. To get started, run:

```sh
go run ./cmd/cli/main.go --mode rest --image "busybox" --cmd "echo Hello World" --timeout 60
```
The command above will create a new job with the specified image, command, and timeout. You can also use the following flags to customize the job:
- `--mode`: The mode of operation (`rest` or `grpc`).
- `--image`: The Docker image to use for the job.
- `--cmd`: The command to run inside the container.
- `--timeout`: The maximum time, in seconds, allowed for the job to complete.
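For example, assuming the gRPC server is running, the same job can be submitted in gRPC mode:

```sh
go run ./cmd/cli/main.go --mode grpc --image "busybox" --cmd "echo Hello World" --timeout 60
```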