docs(readme): typescript implementation of examples in readme
Jack Hopkins committed Jan 23, 2024
1 parent 057a261 commit ee64724
Showing 5 changed files with 181 additions and 97 deletions.
208 changes: 125 additions & 83 deletions README.md
@@ -61,61 +61,95 @@ Tanuki.align(async (it) => {
- **Easy and seamless integration** - Add LLM augmented functions to any workflow within seconds. Create a function with inline `patch` syntax, with types and docstrings to guide the execution. That’s it.
- **Type aware** - Ensure that the outputs of the LLM adhere to the type constraints of the function (arbitrary Types, Literals, Generics etc) to guard against bugs or unexpected side-effects of using LLMs.
- **RAG support** - Get embedding outputs for downstream RAG (Retrieval Augmented Generation) implementations. Output embeddings can then be easily stored and used for relevant document retrieval to reduce cost & latency and improve performance on long-form content.
-- **Aligned outputs** - LLMs are unreliable, which makes them difficult to use in place of classically programmed functions. Using simple assert statements in a function decorated with `@tanuki.align`, you can align the behaviour of your patched function to what you expect.
+- **Aligned outputs** - LLMs are unreliable, which makes them difficult to use in place of classically programmed functions. Using simple Jest-like `expect` syntax in a `Tanuki.align` block, you can align the behaviour of your patched function to what you need.
- **Lower cost and latency** - Achieve up to 90% lower cost and 80% lower latency with increased usage. The package will take care of model training, MLOps and DataOps efforts to improve LLM capabilities through distillation.
-- **Batteries included** - No remote dependencies other than OpenAI.
+- **Batteries included** - No remote dependencies other than your model provider (OpenAI / AWS Bedrock).

<!-- TOC --><a name="installation-and-getting-started"></a>
## Installation and Getting Started
<!-- TOC --><a name="installation"></a>
### Installation
-```
-pip install tanuki.py
+```bash
+npm install tanuki.ts
 ```

-or with Poetry
-
-```
-poetry add tanuki.py
-```
-
-Set your OpenAI key using:
-
-export OPENAI_API_KEY=sk-...
+Set your OpenAI / AWS key in your `.env` file or export your key as an environment variable:
+
+```
+// for OpenAI
+export OPENAI_API_KEY=sk-...
+// for AWS Bedrock
+export AWS_SECRET_ACCESS_KEY=...
+export AWS_ACCESS_KEY_ID=...
+```

Next, you need to add the Tanuki transformer to your `tsconfig.json` file:

```json
{
  "compilerOptions": {
    "plugins": [
      {
        "transform": "tanuki.ts/tanukiTransformer"
      }
    ]
  }
}
```
This is required for Tanuki to be aware of your patched functions and types at runtime, as these types are usually erased by the TypeScript compiler when transpiling into JavaScript.
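
As a quick illustration of why this matters (a sketch; the exact metadata the transformer records is internal to Tanuki):

```typescript
// At compile time, this union constrains what a patched function may return:
type Sentiment = 'Good' | 'Bad' | null;

const result: Sentiment = 'Good';

// After transpilation, the annotation is erased; the emitted JavaScript is just
//   const result = 'Good';
// so without the transformer, nothing at runtime records that only
// 'Good' | 'Bad' | null are valid outputs.
```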



<!-- TOC --><a name="getting-started"></a>
### Getting Started

To get started:
-1. Create a python function stub decorated with `@tanuki.patch` including type hints and a docstring.
-2. (Optional) Create another function decorated with `@tanuki.align` containing normal `assert` statements declaring the expected behaviour of your patched function with different inputs.
+1. Create a `patch` function stub, including your input and output types, and an instruction.
+2. (Optional) Create Jest-like assertions in a `Tanuki.align` block, declaring the expected behaviour of your patched function with different inputs.

-The patched function can now be called as normal in the rest of your code.
+Once you have built your code (to make Tanuki aware of your types), the `patch` function will be registered and can be invoked as normal.

-To add functional alignment, the functions annotated with `align` must also be called if:
+Your `align` block must also be called if:
 - It is the first time calling the patched function (including any updates to the function signature, i.e. docstring, input arguments, input type hints, naming or the output type hint)
-- You have made changes to your assert statements.
+- You have made changes to your desired behaviour.

Here is what it could look like for a simple classification function:

-```python
-@tanuki.patch
-def classify_sentiment(msg: str) -> Optional[Literal['Good', 'Bad']]:
-    """Classifies a message from the user into Good, Bad or None."""
-
-@tanuki.align
-def align_classify_sentiment():
-    assert classify_sentiment("I love you") == 'Good'
-    assert classify_sentiment("I hate you") == 'Bad'
-    assert not classify_sentiment("People from Phoenix are called Phoenicians")
-
-if __name__ == "__main__":
-    align_classify_sentiment()
-    print(classify_sentiment("I like you")) # Good
-    print(classify_sentiment("Apples might be red")) # None
+```typescript
+// Assuming TypedOutput is a union type of 'Good', 'Bad', or null
+type Sentiment = 'Good' | 'Bad' | null;
+
+// TypedInput is assumed to be a string
+type Message = string;
+
+/**
+ * Declare the function that you want Tanuki to provide.
+ */
+class Functions {
+  classifySentiment = patch<Sentiment, Message>()`Classifies message from the user based on sentiment`;
+}
+
+/**
+ * Align your function to the expected behavior using Jest-like assertions.
+ */
+Tanuki.align(async (it) => {
+  it("alignClassifySentiment", async (expect) => {
+    const functions = new Functions();
+    expect(await functions.classifySentiment("I love you")).toEqual('Good');
+    expect(await functions.classifySentiment("I hate you")).toEqual('Bad');
+    expect(await functions.classifySentiment("People from Phoenix are called Phoenicians")).toBeNull();
+  });
+});
+
+// Example usage of the patched function somewhere else in your code
+const runExamples = async () => {
+  const functions = new Functions();
+  console.log(await functions.classifySentiment("I like you")); // Expect 'Good' or null
+  console.log(await functions.classifySentiment("Apples might be red")); // Expect null
+};
+
+runExamples();
 ```

<!-- TOC --><a name="how-it-works"></a>
@@ -124,7 +158,7 @@ See [here](https://github.com/monkeypatch/tanuki.py/blob/update_docs/docs/functi

## How It Works

-When you call a tanuki-patched function during development, an LLM in an n-shot configuration is invoked to generate the typed response.
+When you call a Tanuki-patched function during development, an LLM in an n-shot configuration is invoked to generate the typed response.

The number of examples used depends on the number of align statements supplied for the function.
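
Conceptually, the prompt is assembled roughly like this (an illustrative sketch only, not Tanuki's actual prompt format):

```typescript
// Illustrative sketch: each expect(...) pair from an align block becomes an
// input/output demonstration in the prompt; the final message is the live call.
const fewShotMessages = [
  { role: "system", content: "Classify the message as 'Good', 'Bad' or null." },
  // from: expect(classifySentiment("I love you")).toEqual('Good')
  { role: "user", content: JSON.stringify("I love you") },
  { role: "assistant", content: JSON.stringify("Good") },
  // from: expect(classifySentiment("I hate you")).toEqual('Bad')
  { role: "user", content: JSON.stringify("I hate you") },
  { role: "assistant", content: JSON.stringify("Bad") },
  // the live call whose typed response we want:
  { role: "user", content: JSON.stringify("I like you") },
];
```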

@@ -146,55 +180,72 @@ LLM API outputs are typically in natural language. In many instances, it’s pre

A core concept of Tanuki is the support for typed parameters and outputs. Supporting typed outputs of patched functions allows you to declare *rules about what kind of data the patched function is allowed to pass back* for use in the rest of your program. This will guard against the verbose or inconsistent outputs of the LLMs that are trained to be as “helpful as possible”.

-You can use Literals or create custom types in Pydantic to express very complex rules about what the patched function can return. These act as guard-rails for the model preventing a patched function breaking the code or downstream workflows, and means you can avoid having to write custom validation logic in your application.
+The types you provide to the patched functions act as guard-rails for the model, preventing a patched function from breaking the code or downstream workflows, and mean you can avoid writing custom validation logic in your application.

-```python
-@dataclass
-class ActionItem:
-    goal: str = Field(description="What task must be completed")
-    deadline: datetime = Field(description="The date the goal needs to be achieved")
-
-@tanuki.patch
-def action_items(input: str) -> List[ActionItem]:
-    """Generate a list of Action Items"""
-
-@tanuki.align
-def align_action_items():
-    goal = "Can you please get the presentation to me by Tuesday?"
-    next_tuesday = (datetime.now() + timedelta((1 - datetime.now().weekday() + 7) % 7)).replace(hour=0, minute=0, second=0, microsecond=0)
-
-    assert action_items(goal) == ActionItem(goal="Prepare the presentation", deadline=next_tuesday)
+```typescript
+// Define the ActionItem class
+class ActionItem {
+  goal: string;
+  deadline: Date;
+
+  constructor(goal: string, deadline: Date) {
+    this.goal = goal;
+    this.deadline = deadline;
+  }
+}
+
+// Assuming we have a similar setup for the patch and align methods
+class Functions {
+  actionItems = patch<ActionItem[], string>()`Generate a list of Action Items`;
+}
+
+// Define the alignment for the actionItems method
+Tanuki.align(async (it) => {
+  it("alignActionItems", async (expect) => {
+    const goal = "Can you please get the presentation to me by Tuesday?";
+    const nextTuesday = new Date();
+    // 2 = Tuesday (JavaScript getDay() counts from Sunday = 0)
+    nextTuesday.setDate(nextTuesday.getDate() + ((2 - nextTuesday.getDay() + 7) % 7));
+    nextTuesday.setHours(0, 0, 0, 0);

+    const expectedActionItem = new ActionItem("Prepare the presentation", nextTuesday);
+    const result = await new Functions().actionItems(goal);
+
+    // Assuming the result is an array of ActionItems
+    expect(result[0]).toEqual(expectedActionItem);
+  });
+});
 ```

By constraining the types of data that can pass through your patched function, you are declaring the potential outputs that the model can return and specifying the world in which the program exists.

-You can add integer constraints to the outputs for Pydantic field values, and generics if you wish.
+You can constrain integer outputs using union types.

-```python
-@tanuki.patch
-def score_sentiment(input: str) -> Optional[Annotated[int, Field(gt=0, lt=10)]]:
-    """Scores the input between 0-10"""
-
-@tanuki.align
-def align_score_sentiment():
-    """Register several examples to align your function"""
-    assert score_sentiment("I love you") == 10
-    assert score_sentiment("I hate you") == 0
-    assert score_sentiment("You're okay I guess") == 5
-
-# This is a normal test that can be invoked with pytest or unittest
-def test_score_sentiment():
-    """We can test the function as normal using Pytest or Unittest"""
-    score = score_sentiment("I like you")
-    assert score >= 7
-
-if __name__ == "__main__":
-    align_score_sentiment()
-    print(score_sentiment("I like you")) # 7
-    print(score_sentiment("Apples might be red")) # None
+```typescript
+type ZeroToNine = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9;
+
+class Functions {
+  scoreSentiment = patch<ZeroToNine, string>()`Scores the input between 0-9`;
+}
+
+// Define the alignment for the scoreSentiment method
+Tanuki.align(async (it) => {
+  it("alignScoreSentiment", async (expect) => {
+    expect(await new Functions().scoreSentiment("I love you")).toBe(9);
+    expect(await new Functions().scoreSentiment("I hate you")).toBe(0);
+    expect(await new Functions().scoreSentiment("You're okay I guess")).toBe(5);
+  });
+});
+
+// Example test using Jest
+describe('testScoreSentiment', () => {
+  it('should return a score >= 7 for positive sentiment', async () => {
+    const score = await new Functions().scoreSentiment("I like you");
+    expect(score).toBeGreaterThanOrEqual(7);
+  });
+});
 ```

-To see more examples using Tanuki for different use cases (including how to integrate with FastAPI), have a look at [examples](https://github.com/monkeypatch/tanuki.py/tree/master/examples).
+<!--To see more examples using Tanuki for different use cases (including how to integrate with FastAPI), have a look at [examples](https://github.com/monkeypatch/tanuki.py/tree/master/examples).-->

For embedding outputs for RAG support, see [here](https://github.com/monkeypatch/tanuki.py/blob/update_docs/docs/embeddings_support.md)

@@ -226,16 +277,7 @@ By writing a test that encapsulates the expected behaviour of the tanuki-patched

Unlike traditional TDD, where the objective is to write code that passes the test, TDA flips the script: **tests do not fail**. Their existence and the form they take are sufficient for LLMs to align themselves with the expected behavior.

-TDA offers a lean yet robust methodology for grafting machine learning onto existing or new Python codebases. It combines the preventive virtues of TDD while addressing the specific challenges posed by the dynamism of LLMs.
-
----
-(Aligning function chains is work in progress)
-```python
-def test_score_sentiment():
-    """We can test the function as normal using Pytest or Unittest"""
-    assert multiply_by_two(score_sentiment("I like you")) == 14
-    assert 2*score_sentiment("I like you") == 14
-```
+TDA offers a lean yet robust methodology for grafting machine learning onto existing or new TypeScript codebases.

<!-- TOC --><a name="scaling-and-finetuning"></a>
## Scaling and Finetuning
@@ -301,15 +343,15 @@

<!-- TOC --><a name="does-it-only-work-with-openai"></a>
#### Does it only work with OpenAI?
-Currently yes but there are plans to support Anthropic and popular open-source models. If you have a specific request, either join [our Discord server](https://discord.gg/kEGS5sQU), or create a Github issue.
+Bedrock is also supported, giving access to Anthropic models and popular open-source models like Llama 2. If you have a specific request, either join [our Discord server](https://discord.gg/kEGS5sQU), or create a GitHub issue.

<!-- TOC --><a name="how-it-works-1"></a>
### How It Works
<!-- TOC --><a name="how-does-the-llm-get-cheaper-and-faster-over-time-and-by-how-much"></a>
#### How does the LLM get cheaper and faster over time? And by how much?
-In short, we use distillation of LLM models.
+In short, we distill LLM models.

-Expanded, using the outputs of the larger (teacher) model, a smaller (student) model will be trained to emulate the teacher model behaviour while being faster and cheaper to run due to smaller size. In some cases it is possible to achieve up to 90% lower cost and 80% lower latency with a small number of executions of your patched functions.
+Using the outputs of the larger (teacher) model, a smaller (student) model is trained to emulate the teacher's behaviour while being faster and cheaper to run due to its smaller size. In some cases it is possible to achieve up to 90% lower cost and 80% lower latency with a small number of executions of your patched functions.
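
Conceptually, the lifecycle looks something like the sketch below. All names are hypothetical; this is not Tanuki's actual API, just the shape of the idea:

```typescript
// Conceptual sketch of distillation; every name here is hypothetical.
type TrainingPair = { input: string; output: string };

const MIN_CALLS_FOR_DISTILLATION = 200; // the default minimum quoted below
const dataset: TrainingPair[] = [];
let studentReady = false;

// Stand-ins for the real model calls.
async function callTeacher(input: string): Promise<string> { return `teacher(${input})`; }
async function callStudent(input: string): Promise<string> { return `student(${input})`; }

async function patchedCall(input: string): Promise<string> {
  if (studentReady) return callStudent(input); // cheaper and faster once trained
  const output = await callTeacher(input);
  dataset.push({ input, output }); // each validated call becomes a training example
  if (dataset.length >= MIN_CALLS_FOR_DISTILLATION) {
    // a fine-tuning job on `dataset` would be kicked off here, after which:
    studentReady = true;
  }
  return output;
}
```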
<!-- TOC --><a name="how-many-calls-does-it-require-to-get-the-improvement"></a>
#### How many calls does it require to get the improvement?
The default minimum is 200 calls, although this can be changed by adding flags to the patch decorator.
@@ -325,13 +367,13 @@ Not necessarily. Currently the only way to improve the LLM performance is to hav
### Accuracy & Reliability
<!-- TOC --><a name="how-do-you-guarantee-consistency-in-the-output-of-patched-functions"></a>
#### How do you guarantee consistency in the output of patched functions?
-Each output of the LLM will be programmatically instantiated into the output class ensuring the output will be of the correct type, just like your Python functions. If the output is incorrect and instantiating the correct output object fails, an automatic feedback repair loop kicks in to correct the mistake.
+Each output of the LLM will be programmatically instantiated into the output class or type. If the output is incorrect and instantiating the correct output object fails, an automatic feedback repair loop kicks in to correct the mistake.
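
A minimal sketch of that repair loop, with hypothetical helper names (not Tanuki's actual implementation):

```typescript
// Validate the raw LLM output against the expected type; on failure, feed the
// error back to the model and retry. All helper names here are hypothetical.
async function callWithRepair<T>(
  callLlm: (feedback?: string) => Promise<string>,
  parse: (raw: string) => T, // throws if `raw` is not a valid T
  maxAttempts = 3
): Promise<T> {
  let feedback: string | undefined;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const raw = await callLlm(feedback);
    try {
      return parse(raw); // programmatic instantiation into the output type
    } catch (err) {
      feedback = `Previous output was invalid: ${String(err)}. Respond with a valid value only.`;
    }
  }
  throw new Error("Could not obtain a well-typed output");
}
```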
<!-- TOC --><a name="how-reliable-are-the-typed-outputs"></a>
#### How reliable are the typed outputs?
For classes of simple to medium complexity, GPT-4 with align statements has been shown to be very reliable in outputting the correct type. Additionally, we have implemented a repair loop with error feedback to “fix” incorrect outputs and add the correct output to the training dataset.
<!-- TOC --><a name="how-do-you-deal-with-hallucinations"></a>
#### How do you deal with hallucinations?
-Hallucinations can’t be 100% removed from LLMs at the moment, if ever. However, by creating test functions decorated with `@tanuki.align`, you can use normal `assert` statements to align the model to behave in the way that you expect. Additionally, you can create types with Pydantic, which act as guardrails to prevent any nasty surprises and provide correct error handling.
+Hallucinations can’t be 100% removed from LLMs at the moment, if ever. However, by creating `expect` declarations (like in Jest) inside `Tanuki.align` blocks, you can align the model to behave in the way that you expect and minimize hallucinations.
<!-- TOC --><a name="how-do-you-deal-with-bias"></a>
#### How do you deal with bias?
By adding more align statements that cover a wider range of inputs, you can ensure that the model is less biased.
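
For example (illustrative only, reusing the `classifySentiment` function declared in the Getting Started example), widening the align set with neutral and off-topic inputs helps prevent the model from over-predicting a label:

```typescript
// Illustrative only: broader alignment coverage for classifySentiment.
Tanuki.align(async (it) => {
  it("broadCoverageSentiment", async (expect) => {
    const functions = new Functions();
    expect(await functions.classifySentiment("I love you")).toEqual('Good');
    expect(await functions.classifySentiment("I hate you")).toEqual('Bad');
    // Neutral and off-topic inputs keep the model from over-predicting a label.
    expect(await functions.classifySentiment("The meeting is at 3pm")).toBeNull();
    expect(await functions.classifySentiment("Water boils at 100°C")).toBeNull();
  });
});
```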
10 changes: 5 additions & 5 deletions package.json
@@ -1,7 +1,7 @@
{
"name": "typescript-npm-package-template",
"version": "0.0.0-development",
"description": "A template for creating npm packages using TypeScript and VSCode",
"name": "tanuki.ts",
"version": "0.0.1-development",
"description": "TypeScript client for building LLM-powered applications",
"main": "./lib/index.js",
"type": "module",
"files": [
@@ -21,12 +21,12 @@
},
"repository": {
"type": "git",
"url": "git+https://github.com/ryansonshine/typescript-npm-package-template.git"
"url": "git+https://github.com/Tanuki/tanuki.ts.git"
},
"license": "MIT",
"author": {
"name": "Jack Hopkins",
"email": "ryansonshine@users.noreply.github.com",
"email": "jackhopkins@users.noreply.github.com",
"url": "https://github.com/ryansonshine"
},
"engines": {
51 changes: 51 additions & 0 deletions tests/testPatch/testClassInstantiation.ts
@@ -0,0 +1,51 @@
import { patch, Tanuki } from "../../src/tanuki";

// Define the ActionItem class
class ActionItem {
  goal: string;
  deadline: Date;

  constructor(goal: string, deadline: Date) {
    this.goal = goal;
    this.deadline = deadline;
  }
}

// Assuming we have a similar setup for the patch and align methods
class Functions {
  actionItems = patch<ActionItem[], string>()`Generate a list of Action Items`;
}

describe('Instantiate Class Tests', () => {

  // Assuming tanuki.align functionality is handled within the test itself
  it('align_action_items', async () => {
    Tanuki.align(async (it) => {
      it("alignActionItems", async (expect) => {
        const goal = "Can you please get the presentation to me by Tuesday?";
        const nextTuesday = new Date();
        // 2 = Tuesday (JavaScript getDay() counts from Sunday = 0)
        nextTuesday.setDate(nextTuesday.getDate() + ((2 - nextTuesday.getDay() + 7) % 7));
        nextTuesday.setHours(0, 0, 0, 0);

        const expectedActionItem = new ActionItem("Prepare the presentation", nextTuesday);
        const result = await new Functions().actionItems(goal);

        // Assuming the result is an array of ActionItems
        expect(result[0]).toEqual(expectedActionItem);
      });
    });
  });

  // Assuming tanuki.align functionality is handled within the test itself
  it('create_action_items', async () => {
    const goal = "Can you please get the presentation to me by Wednesday?";
    const nextWednesday = new Date();
    // 3 = Wednesday (JavaScript getDay() counts from Sunday = 0)
    nextWednesday.setDate(nextWednesday.getDate() + ((3 - nextWednesday.getDay() + 7) % 7));
    nextWednesday.setHours(0, 0, 0, 0);

    const expectedActionItem = new ActionItem("Prepare the presentation", nextWednesday);
    const result = await new Functions().actionItems(goal);

    // Assuming the result is an array of ActionItems
    expect(result[0]).toEqual(expectedActionItem);
  });
});
2 changes: 0 additions & 2 deletions tests/testPatch/testClassification.test.ts
@@ -1,6 +1,4 @@
import { patch, Tanuki } from "../../src/tanuki";
-import { LiteralType } from "typescript";
-import { Sentiment } from "../testTypes/testTypes.test";

class Classifier {
  static classifySentiment2 = patch<"Good" | "Bad", [string, string]>()`The sentiment of the input objects`;
7 changes: 0 additions & 7 deletions tsconfig.json
@@ -28,13 +28,6 @@
"plugins": [
{
"transform": "./src/tanukiTransformer.ts",
//"after": true,
/*"resolvePathAliases": true,
/*"resolvePathAliases": true,
"tsConfig": "./transformer.tsconfig.json"
/*
"after": true,
"transformProgram": true*/
}
],
/*"traceResolution": true,*/
