Skip to content

Commit

Permalink
doc: add ai-token-ratelimit example (#245)
Browse files Browse the repository at this point in the history
  • Loading branch information
cr7258 committed Jul 2, 2024
1 parent 46cbc5a commit 6beabbd
Show file tree
Hide file tree
Showing 2 changed files with 209 additions and 3 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -182,3 +182,208 @@ rejected_msg: '{"code":-1,"msg":"Too many requests"}'
redis:
service_name: redis.static
```

## 完整示例

AI Token 限流插件依赖 Redis 记录剩余可用的 token 数,因此首先需要部署 Redis 服务。

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: redis
labels:
app: redis
spec:
replicas: 1
selector:
matchLabels:
app: redis
template:
metadata:
labels:
app: redis
spec:
containers:
- name: redis
image: redis
ports:
- containerPort: 6379
---
apiVersion: v1
kind: Service
metadata:
name: redis
labels:
app: redis
spec:
ports:
- port: 6379
targetPort: 6379
selector:
app: redis
---
```

在本例中,使用通义千问作为 AI 服务提供商。另外还需要设置 AI 统计插件,因为 AI Token 限流插件依赖 AI 统计插件计算每次请求消耗的 token 数,以下配置限制每分钟的 input 和 output token 总数为 200 个。

```yaml
apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
name: ai-proxy
namespace: higress-system
spec:
matchRules:
- config:
provider:
type: qwen
apiTokens:
- "<YOUR_API_TOKEN>"
modelMapping:
'gpt-3': "qwen-turbo"
'gpt-35-turbo': "qwen-plus"
'gpt-4-turbo': "qwen-max"
'*': "qwen-turbo"
ingress:
- qwen
url: oci://higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/wasm-go-ai-proxy:v1.0.0
phase: UNSPECIFIED_PHASE
priority: 100
---
apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
name: ai-statistics
namespace: higress-system
spec:
defaultConfig:
enable: true
url: oci://higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/wasm-go-ai-token-statistics:v1.0.0
phase: UNSPECIFIED_PHASE
priority: 200
---
apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
name: ai-token-ratelimit
namespace: higress-system
spec:
defaultConfig:
rule_name: default_limit_by_param_apikey
rule_items:
- limit_by_param: apikey
limit_keys:
- key: 123456
token_per_minute: 200
redis:
# 默认情况下,为了减轻数据面的压力,Higress 的 global.onlyPushRouteCluster 配置参数被设置为 true,意味着不会自动发现 Kubernetes Service
# 如果需要使用 Kubernetes Service 作为服务发现,可以将 global.onlyPushRouteCluster 参数设置为 false,
# 这样就可以直接将 service_name 设置为 Kubernetes Service, 而无须为 Redis 创建 McpBridge 以及 Ingress 路由
# service_name: redis.default.svc.cluster.local
service_name: redis.dns
service_port: 6379
url: oci://higress-registry.cn-hangzhou.cr.aliyuncs.com/plugins/wasm-go-ai-token-ratelimit:v1.0.0
phase: UNSPECIFIED_PHASE
priority: 600
```

注意,AI Token 限流插件中的 Redis 配置项 `service_name` 来自 McpBridge 中配置的服务来源,另外我们还需要在 McpBridge 中配置通义千问服务的访问地址。

```yaml
apiVersion: networking.higress.io/v1
kind: McpBridge
metadata:
name: default
namespace: higress-system
spec:
registries:
- domain: dashscope.aliyuncs.com
name: qwen
port: 443
type: dns
- domain: redis.default.svc.cluster.local # Kubernetes Service
name: redis
type: dns
port: 6379
```

分别创建两条路由规则。

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
annotations:
higress.io/backend-protocol: HTTPS
higress.io/destination: qwen.dns
higress.io/proxy-ssl-name: dashscope.aliyuncs.com
higress.io/proxy-ssl-server-name: "on"
labels:
higress.io/resource-definer: higress
name: qwen
namespace: higress-system
spec:
ingressClassName: higress
rules:
- host: qwen-test.com
http:
paths:
- backend:
resource:
apiGroup: networking.higress.io
kind: McpBridge
name: default
path: /
pathType: Prefix
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
annotations:
higress.io/destination: redis.dns
higress.io/ignore-path-case: "false"
labels:
higress.io/resource-definer: higress
name: redis
spec:
ingressClassName: higress
rules:
- http:
paths:
- backend:
resource:
apiGroup: networking.higress.io
kind: McpBridge
name: default
path: /
pathType: Prefix
```

触发限流效果如下:

```bash
curl "http://qwen-test.com:18000/v1/chat/completions?apikey=123456" -H "Content-Type: application/json" -d '{
"model": "gpt-3",
"messages": [
{
"role": "user",
"content": "你好,你是谁?"
}
],
"stream": false
}'
{"id":"88cfa80f-545d-93b4-8ff3-3f5245ca33ba","choices":[{"index":0,"message":{"role":"assistant","content":"我是通义千问,由阿里云开发的AI助手。我可以回答各种问题、提供信息和与用户进行对话。有什么我可以帮助你的吗?"},"finish_reason":"stop"}],"created":1719909825,"model":"qwen-turbo","object":"chat.completion","usage":{"prompt_tokens":13,"completion_tokens":33,"total_tokens":46}}

curl "http://qwen-test.com:18000/v1/chat/completions?apikey=123456" -H "Content-Type: application/json" -d '{
"model": "gpt-3",
"messages": [
{
"role": "user",
"content": "你好,你是谁?"
}
],
"stream": false
}'
Too many requests # 限流成功
```
Original file line number Diff line number Diff line change
Expand Up @@ -8,9 +8,10 @@ custom_edit_url: https://github.com/higress-group/higress-group.github.io/blob/m
# Mcp Bridge 配置说明

## McpBridge 字段说明
| 字段 | 类型 | 说明 | 示例值 | 是否必填 |
| --- | --- | --- | --- | --- |
| registries | RegistryConfig 数组 | 支持配置多个不同注册中心的服务来源 | [] ||
| 字段 | 类型 | 说明 | 示例值 | 是否必填 |
|------------| --- |-------------------|---------|-------------------|
| registries | RegistryConfig 数组 | 支持配置多个不同注册中心的服务来源 | [] ||
| name | 字符串 | McpBridge 名称 | default | 是,当前该值只能是 default |

## RegistryConfig 字段说明
| 字段 | 类型 | 说明 | 示例值 | 是否必填 |
Expand Down

0 comments on commit 6beabbd

Please sign in to comment.