Requests a code review from a large language model.
Software developers want reviews so that they can improve their code. However, developers are busy. Peers may be unavailable or lack the context to provide comments. Backseat Driver prompts a large language model to give code a letter grade for readability, expressiveness, and organization. It also instructs the model to explain its reasoning. Developers gain the benefits of code review without needing another engineer.
pip install backseat-driver
Users can run Backseat Driver from the command line or as a GitHub Action.
Users need to create an OpenAI account and API key.
Backseat Driver needs the API key to be able to request code review from a
language model.
Set the OPENAI_API_KEY environment variable to your API key.
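A script that wraps Backseat Driver may want to confirm that the key is present before any request is made. The following is a minimal sketch of such a check. The helper name `require_api_key` and its `env` parameter are hypothetical and are not part of Backseat Driver.

```python
import os


def require_api_key(env=None):
    """Return the OpenAI API key, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of Backseat Driver.
    The optional `env` mapping defaults to the process environment.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first.")
    return key
```

Calling `require_api_key()` at the top of a wrapper script fails early with a clear message instead of failing later during the review request.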
Note that OpenAI will charge your account for Backseat Driver's requests. Each Backseat Driver invocation makes a request of at most 4096 tokens to a language model. At the time of writing, ChatGPT's pricing is $0.002 per 1K tokens. At this price, each call to Backseat Driver costs at most $0.002 * 4.096 = $0.008192, or just under 1 cent.
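The same arithmetic can be written as a small function for budgeting many runs. This is only a sketch: `estimate_cost` is a hypothetical name, the defaults are the figures quoted above, and real pricing may differ from them.

```python
def estimate_cost(tokens=4096, price_per_1k_tokens=0.002):
    """Return the cost in USD of one request that uses `tokens` tokens.

    Defaults are the figures quoted in this README (assumed, not fetched).
    """
    return tokens / 1000 * price_per_1k_tokens


# Upper bound for a single call, then for 100 calls at the maximum size.
print(f"${estimate_cost():.6f} per call")
print(f"${estimate_cost() * 100:.4f} per 100 calls")
```

Because 4096 tokens is the maximum request size, these values are upper bounds; reviews of short files should cost less.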
Run Backseat Driver on the command line with the following.
backseat-driver my_script.py
Provide multiple scripts for simultaneous code review of all given files.
backseat-driver script1.py script2.py script3.py
Wildcard operators also work as expected.
backseat-driver *.py
Here is an example that gets all Python files under the current directory on a Unix system.
backseat-driver $(find . -name "*.py")
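On systems without `find`, the same file list can be gathered in Python. The sketch below assumes that `backseat-driver` is installed and on the PATH; `build_command` and `review_all` are hypothetical helper names, not part of the package.

```python
import subprocess
from pathlib import Path


def build_command(root="."):
    """Build a backseat-driver command covering every .py file under root."""
    files = sorted(str(path) for path in Path(root).rglob("*.py"))
    return ["backseat-driver", *files]


def review_all(root="."):
    """Run the review and return its exit status (0 if no files were found)."""
    command = build_command(root)
    if len(command) == 1:
        return 0  # Nothing to review.
    return subprocess.run(command, check=False).returncode
```

Passing the files as a list avoids shell quoting problems with paths that contain spaces.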
Set the --fail_under flag to cause Backseat Driver to exit with an error if the model gives the code a lower grade than the one you specify.
backseat-driver --fail_under B *.py
Get a code review on every push by adding Backseat Driver to your CI.
Include the following code in your GitHub Workflow .yml file.
Adjust the parameters as necessary to fit your use case.
See the GitHub Action page for more details.
- uses: actions/checkout@v3
- name: Run Backseat Driver on this repository
  uses: kostaleonard/backseat-driver-action@v2
  with:
    openai-api-key: ${{ secrets.OPENAI_API_KEY }}
    filenames: '**/*.py'
    fail-under: B
foo@bar:~$ backseat-driver -h
usage: backseat-driver [-h] [--fail_under {A,B,C,D}] filenames [filenames ...]

Requests a code review from a large language model (LLM). The model will grade the code based on readability, expressiveness, and organization. The output will include a letter grade in ['A', 'B', 'C', 'D', 'F'], as well as the model's reasoning.

positional arguments:
  filenames             The files to pass to the LLM for code review.

options:
  -h, --help            show this help message and exit
  --fail_under {A,B,C,D}
                        If specified, exit with non-zero status if the LLM's grade falls below the given value. This value is not inclusive: if this value is "B" and the LLM gives a final grade of "B," then the program will exit with a zero status. If not specified, then the program will exit with a zero status no matter the LLM's grade.
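The threshold rule in the help text can be illustrated with a short sketch. This is written from the description above and is not Backseat Driver's actual implementation; `GRADES` and `exit_status` are hypothetical names.

```python
# Grades from best to worst, in the order listed in the help text.
GRADES = ["A", "B", "C", "D", "F"]


def exit_status(grade, fail_under=None):
    """Return 1 if `grade` is strictly worse than `fail_under`, else 0.

    With no threshold, the status is always 0.
    """
    if fail_under is None:
        return 0
    return 1 if GRADES.index(grade) > GRADES.index(fail_under) else 0
```

Under this reading, `exit_status("B", "B")` is 0 because a grade equal to the threshold passes, while `exit_status("C", "B")` is 1.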
You can try Backseat Driver on the input program below to see how it works.
Copy the code into test.py.
"""A file for testing Backseat Driver."""


def fib(n):
    """Returns the nth fibonacci number."""
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


def fact(n):
    """Returns n factorial."""
    if n <= 1:
        return 1
    return n * fact(n - 1)


def hailstone(n):
    """Returns the hailstone sequence starting with positive integer n."""
    if n <= 0:
        raise ValueError(f"Cannot compute hailstone of negative number {n}")
    if n == 1:
        return [n]
    if n % 2 == 0:
        return [n] + hailstone(n // 2)
    return [n] + hailstone(3 * n + 1)
Run Backseat Driver with the following.
backseat-driver test.py
What Backseat Driver says:
Grade: A
This code is well-written, easy to read, and well-organized. The function names are clear and descriptive, and the docstrings provide useful information about the functions' purpose and behavior. The code also follows the recommended Python style guidelines (PEP 8), including appropriate indentation, whitespace, and naming conventions. Overall, there are no major issues with the code, and it is highly readable and maintainable.
One possible improvement could be to add some error handling to the fib() and fact() functions, for cases where the input is not a positive integer. Another potential improvement could be to add some more comments to explain the logic behind the hailstone() function. However, these are minor suggestions and are not necessary for the code to function correctly.
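As an illustration of the first suggestion, here is one way the error handling might look for fib(). This is a sketch of a possible response to the review, not output from Backseat Driver; the exception types and messages are assumptions.

```python
def fib(n):
    """Returns the nth fibonacci number for a positive integer n."""
    # Reject non-integers; bool is excluded because it subclasses int.
    if isinstance(n, bool) or not isinstance(n, int):
        raise TypeError(f"Expected a positive integer, got {n!r}")
    if n <= 0:
        raise ValueError(f"Cannot compute fibonacci of non-positive number {n}")
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)
```

The same two checks could be applied to fact() if it is likewise restricted to positive integers.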