Let an LLM handle the labeling!
This GitHub Action uses AI to label your issues and PRs, keeping your repo organized so you can focus on what matters.
- Smart Analysis: Understands context from titles, descriptions, and code changes
- Context-Aware: Uses repository files (CODEOWNERS, templates, etc.) to make informed decisions
- Incremental: Works alongside other label management tools and manual labeling
- Zero Config: Works out of the box with your existing GitHub labels
- Customizable: Fine-tune behavior with optional configuration
- Reliable: Supports the latest OpenAI and Anthropic models, orchestrated with ControlFlow
To get started with the default model (OpenAI's `gpt-4o-mini`), follow these steps:

- Add the following workflow definition to your repo at `.github/workflows/ai-labeler.yml`:
```yaml
name: AI Labeler

on:
  issues:
    types: [opened]
  pull_request:
    types: [opened]

jobs:
  ai-labeler:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      issues: write
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - uses: jlowin/[email protected]
        with:
          include-repo-labels: true # Set to false if you're providing a config file with labels
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
```
- Add an OpenAI API key to your repository's secrets as `OPENAI_API_KEY`.

- Optionally, add a fine-tuning configuration file to `.github/ai-labeler.yml`. This example will add the contents of `README.md` and `CONTRIBUTING.md` to the LLM's context when labeling issues and PRs. (See below for a more comprehensive example.)
```yaml
context-files:
  - README.md
  - CONTRIBUTING.md
```
That's it!
Whenever an issue or PR is opened, the action will read your repository's existing labels and their descriptions to make smart labeling decisions. To improve its accuracy, update your label descriptions in GitHub's UI or provide a fine-tuning configuration file as described below.
The following settings can be provided as part of the workflow definition file:
By default, the AI labeler will use all labels found in your repository, as well as those specifically defined in your config file. This can lead to unexpected results if your repository contains labels that don't make sense for the AI to apply, or contains a large number of labels.

To disable this behavior, set `include-repo-labels` to `false`. In this case, the AI will only use labels defined in your config file. See the fine-tuning section below for more details.
```yaml
- uses: jlowin/[email protected]
  with:
    include-repo-labels: false # Only use labels defined in config
```
You must specify an LLM provider and supply a matching API key. Both OpenAI and Anthropic models are supported.

By default, the AI labeler uses OpenAI's `gpt-4o-mini` model. This is an excellent and affordable choice for most users, though `gpt-4o-mini` can get confused by complex per-label instructions. You can specify a different model if you'd like:
```yaml
- uses: jlowin/[email protected]
  with:
    controlflow-llm-model: openai/gpt-4o-mini
```
Supported formats:

- OpenAI: `openai/<model-name>` (e.g., `openai/gpt-4o-mini`)
- Anthropic: `anthropic/<model-name>` (e.g., `anthropic/claude-3-5-sonnet-20241022`)
See the ControlFlow LLM documentation for more information on supported models.
Note that you must provide an appropriate API key for your selected LLM provider.
```yaml
- uses: jlowin/[email protected]
  with:
    openai-api-key: ${{ secrets.OPENAI_API_KEY }}
    # Optionally specify a different model
    controlflow-llm-model: openai/gpt-4o
```
Set your OpenAI API key as a repository secret named `OPENAI_API_KEY`. Since the default model is `openai/gpt-4o-mini`, you don't need to specify a model unless you want to change it.
```yaml
- uses: jlowin/[email protected]
  with:
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
    # You must specify a model
    controlflow-llm-model: anthropic/claude-3-5-sonnet-20241022
```
Set your Anthropic API key as a repository secret named `ANTHROPIC_API_KEY`. To use Anthropic, you must specify a model.
By default, the action looks for additional configuration in `.github/ai-labeler.yml`. You can specify a different location:
```yaml
- uses: jlowin/[email protected]
  with:
    config-path: .github/my-custom-config.yml
```
This file controls the labeling behavior - see the Fine-Tuning section below for details.
In addition to choosing a model, you can create a config file to fine-tune the labeling behavior. By default, the action looks for a file at `.github/ai-labeler.yml`. If no file is found, it will use the default behavior.
This file should have the following format; each section is optional and described in full below.

```yaml
instructions: |
  "..."
labels:
  - ...
context-files:
  - ...
```
This example provides a focused set of default labels with clear instructions for minimal, accurate labeling. For best results, turn off the `include-repo-labels` option so that the AI uses only the labels defined in your config file.
```yaml
instructions: |
  Apply the minimal set of labels that accurately characterize the issue/PR:
  - Use at most 1-2 labels unless there's a compelling reason for more
  - Prefer specific labels (bug, feature) over generic ones (question, help wanted)
  - For PRs that fix bugs, use 'bug' not 'enhancement'
  - Never combine: bug + enhancement, feature + enhancement. For these labels, only choose the most relevant one.
  - Reserve 'question' and 'help wanted' for when they're the primary characteristic

labels:
  - bug:
      description: "Something isn't working as expected"
      instructions: |
        Apply when describing or fixing unexpected behavior:
        - Issues: Clear error messages or unexpected outcomes
        - PRs: Fixes for broken functionality
        Don't apply enhancement/feature for bug fixes unless they add significant
        new functionality beyond fixing the bug

  - documentation:
      description: "Improvements or additions to documentation"
      instructions: |
        Apply only when documentation is the primary focus:
        - README updates
        - Code comments and docstrings
        - API documentation
        - Usage examples
        Don't apply for minor doc updates alongside code changes

  - enhancement:
      description: "Improvements to existing features"
      instructions: |
        Apply only for improvements to existing functionality:
        - Performance improvements
        - UI/UX improvements
        - Expanded capabilities of existing features
        Don't apply to:
        - Bug fixes
        - New features
        - Minor tweaks

  - feature:
      description: "New functionality"
      instructions: |
        Apply only for net-new functionality:
        - New API endpoints
        - New commands or tools
        - New user-facing capabilities
        Don't apply to:
        - Improvements to existing features (use enhancement)
        - Bug fixes

  - good first issue:
      description: "Good for newcomers"
      instructions: |
        Apply very selectively to issues that:
        - Are small in scope
        - Are well-documented
        - Require minimal context
        - Have clear success criteria
        Don't apply if the task requires significant background knowledge

  - help wanted:
      description: "Extra attention is needed"
      instructions: |
        Apply only when it's the primary characteristic:
        - Issue needs external expertise
        - Current maintainers can't address it
        - Additional contributors would be valuable
        Don't apply just because an issue is open or needs work

  - question:
      description: "Further information is requested"
      instructions: |
        Apply only when the primary purpose is seeking information:
        - Clarification needed before work can begin
        - Architectural discussions
        - Implementation strategy questions
        Don't apply to:
        - Bug reports that need more details
        - Feature requests that need refinement

# These files will be included in the context if they exist
context-files:
  - README.md
  - CONTRIBUTING.md
  - CODE_OF_CONDUCT.md
  - .github/ISSUE_TEMPLATE/bug_report.md
  - .github/ISSUE_TEMPLATE/feature_request.md
```
Global guidance for the AI labeler:
```yaml
instructions: |
  You're our labeling expert! Please help keep our repository organized by:
  - Using 'bug' only for confirmed issues, not feature requests
  - Applying 'help wanted' to good first-time contributor opportunities
  - Being generous with 'good first issue' to encourage new contributors
```
In the labels section, you can define or enhance specific labels that you want the AI to use. Any labels defined here that do not already exist in your repository will be created automatically. Whether these labels are used alongside existing repository labels is controlled by the `include-repo-labels` action input.
```yaml
labels:
  # Simple form: just the name
  - question

  # Expanded form with description and instructions
  - documentation:
      description: "Documentation changes"
      instructions: |
        Apply when changes are primarily documentation-focused:
        - Changes to README, guides, or other .md files
        - Updates to docstrings or inline documentation
```
By default, the LLM context includes a variety of information about the issue or PR in question, as well as information about available labels. You can specify additional files the AI should consider when making decisions:
```yaml
context-files:
  - .github/CODEOWNERS
  - CONTRIBUTING.md
  - .github/ISSUE_TEMPLATE/bug_report.md
```
Here are some examples of interesting labeling behaviors you can configure:
This example shows a basic configuration for a `bug` label, including instructions for the AI to follow when labeling issues.
```yaml
labels:
  - bug:
      description: "Something isn't working"
      instructions: |
        Apply when the issue describes unexpected behavior with:
        - Clear error messages
        - Steps to reproduce
        - Expected vs actual behavior
```
Adding `CODEOWNERS` to the context files lets the AI use ownership information to label issues and PRs that need review from specific teams.
```yaml
labels:
  - frontend-review:
      description: "Needs review from frontend team"
      instructions: |
        Apply when changes touch frontend code:
        - Check if files are in frontend/ or ui/ directories
        - Check CODEOWNERS for @frontend-team ownership
        - Look for changes to CSS, JavaScript, or React components

  - backend-review:
      description: "Needs review from backend team"
      instructions: |
        Apply for changes to backend systems:
        - Check if files are in backend/ or api/ directories
        - Check CODEOWNERS for @backend-team ownership
        - Look for database or API changes

context-files:
  - .github/CODEOWNERS
```
GitHub can automatically generate release notes for each release, using labels to categorize changes. You can use the AI labeler to decide whether a change should be excluded from release notes, or whether it appears to introduce a breaking change; both decisions can be reflected in the generated notes.
```yaml
labels:
  - skip-release-notes:
      description: "Exclude from release notes"
      instructions: |
        Apply to changes that don't need release notes:
        - Simple typo fixes
        - Internal documentation updates
        - CI/CD tweaks
        - Version bumps in test files
        - Changes to development tools

  - breaking-change:
      description: "Introduces breaking changes"
      instructions: |
        Apply when changes require user action:
        - API signature changes
        - Configuration format updates
        - Dependency requirement changes
        - Removed features or endpoints
```
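If you use GitHub's automatically generated release notes, a `.github/release.yml` along these lines would pick up those labels. This is a sketch of GitHub's release-notes configuration, not part of this action; the category titles are illustrative:

```yaml
# .github/release.yml -- read by GitHub's release notes generator,
# not by the AI labeler itself
changelog:
  exclude:
    labels:
      - skip-release-notes # hide these PRs from the notes
  categories:
    - title: Breaking Changes
      labels:
        - breaking-change
    - title: Other Changes
      labels:
        - "*" # everything else
```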
Labels are often used for triaging issues, and the AI labeler can use the provided context to assist with a first pass.
```yaml
labels:
  - needs-reproduction:
      description: "Issue needs steps to reproduce"
      instructions: |
        Apply to bug reports that need more info:
        - Check issue templates for missing required info
        - Look for clear reproduction steps
        - Check for environment details

  - good-first-issue:
      description: "Good for newcomers"
      instructions: |
        Apply to encourage new contributors:
        - Small, well-defined scope
        - Clear success criteria
        - Minimal prerequisite knowledge
        - Good documentation exists

context-files:
  - .github/ISSUE_TEMPLATE/bug_report.md
  - CONTRIBUTING.md
```
Global instructions for labeling PRs based on the number of lines changed:
```yaml
instructions: |
  When labeling pull requests, apply size labels based on these criteria:
  - 'size/XS': 0-9 lines changed
  - 'size/S': 10-29 lines changed
  - 'size/M': 30-99 lines changed
  - 'size/L': 100-499 lines changed
  - 'size/XL': 500+ lines changed
  Don't count changes to:
  - Auto-generated files
  - package-lock.json or similar
  - Simple formatting changes
```
Based on the contents of the PR, the AI can apply a `security-review` label and mark the change as `high-priority`.
```yaml
instructions: |
  Apply 'security-review' label if the changes involve:
  - Authentication/authorization code
  - Cryptographic operations
  - File system access
  - Network requests
  - Environment variables
  - Dependencies with known vulnerabilities

  Also apply 'high-priority' if the changes are in:
  - auth/*
  - security/*
  - crypto/*

context-files:
  - .github/SECURITY.md
  - .github/CODEOWNERS
```
For dependency-related changes, the AI can apply a `dependencies` label and add further labels based on the type of change.
```yaml
instructions: |
  For dependency-related changes:
  1. Apply 'dependencies' label to all dependency updates
  2. Additionally:
     - Apply 'security' if it's a security update
     - Apply 'breaking-change' if it's a major version bump
     - Apply 'ci-only' if it only affects dev/test dependencies

  For package.json changes:
  - Apply 'frontend-deps' if touching frontend dependencies
  - Apply 'backend-deps' if touching backend dependencies
```
The AI can help maintain PR quality by applying labels based on the PR contents.
```yaml
labels:
  - needs-improvement:
      description: "PR needs substantial improvements to meet quality standards"
      instructions: |
        Apply this label to PRs that show signs of being low-effort or opportunistic:

        Documentation:
        - Unnecessary formatting changes
        - Broken or circular links
        - Machine-translated content

        Code:
        - Changes that introduce complexity without justification
        - Copy-pasted code without attribution
        - Changes that bypass tests or reduce coverage
        - Trivial variable renaming

        Patterns:
        - PRs that ignore project conventions
        - Auto-generated or templated content
        - PRs that copy issues without adding value

        However, be careful not to discourage genuine first-time contributors who may
        be unfamiliar with best practices. If the PR can be improved with guidance,
        also apply the 'help-wanted' label.

  - invalid:
      description: "PR does not meet contribution guidelines or appears to be spam"
      instructions: |
        Apply this label when a PR appears to be:
        - Automated spam content
        - Deliberately gaming contribution counts
        - Excessive self-promotion
        - Pure promotional content without value

        When this label is applied, include a comment explaining why and link to
        contributing guidelines.

context-files:
  - .github/pull_request_template.md
  - CONTRIBUTING.md
  - CODE_OF_CONDUCT.md
```
The cost of using the AI labeler depends on the LLM provider and the model you choose, as well as the size of the issue or PR. Since the LLM output is minimal (a few labels), the cost will be primarily driven by input tokens.
As a rough estimate, this README is approximately 3,000 tokens, which is longer than the typical issue or PR. Processing it would cost about $0.00045 with OpenAI's `gpt-4o-mini`. That means you could process 10,000 PRs of similar length for under $5.
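For reference, here's the arithmetic behind that estimate, assuming `gpt-4o-mini`'s input price of $0.15 per million tokens (pricing at the time of writing; check your provider's current rates):

```
 3,000 tokens × ($0.15 / 1,000,000 tokens) ≈ $0.00045 per issue/PR
10,000 PRs   ×  $0.00045 per PR            ≈ $4.50 total
```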
Issues and PRs welcome! And don't worry about labels; we've got that covered!