Commit

Merge 0.6
perklet committed Dec 31, 2023
2 parents dd4635b + 1748982 commit 9d83c94
Showing 23 changed files with 531 additions and 44 deletions.
1 change: 0 additions & 1 deletion .github/workflows/build.yaml
@@ -50,4 +50,3 @@ jobs:
- uses: pypa/[email protected]
with:
password: ${{ secrets.PYPI_TOKEN }}

1 change: 0 additions & 1 deletion .github/workflows/test.yaml
@@ -6,7 +6,6 @@ on:
push:
branches:
- main
- master
- bugfix/*
- feature/*
jobs:
8 changes: 4 additions & 4 deletions Makefile
@@ -1,7 +1,7 @@
.ONESHELL:
SHELL := bash
VERSION := 0.5.4
CURL_VERSION := curl-7.84.0
VERSION := 0.6.0b6
CURL_VERSION := curl-8.1.1

.preprocessed: curl_cffi/include/curl/curl.h .so_downloaded
	touch .preprocessed
@@ -15,7 +15,7 @@ $(CURL_VERSION):
	tar -xf $(CURL_VERSION).tar.xz

curl-impersonate-$(VERSION)/chrome/patches: $(CURL_VERSION)
	curl -L "https://github.com/lwthiker/curl-impersonate/archive/refs/tags/v$(VERSION).tar.gz" \
	curl -L "https://github.com/yifeikong/curl-impersonate/archive/refs/tags/v$(VERSION).tar.gz" \
		-o "curl-impersonate-$(VERSION).tar.gz"
	tar -xf curl-impersonate-$(VERSION).tar.gz

@@ -28,7 +28,7 @@ curl_cffi/include/curl/curl.h: curl-impersonate-$(VERSION)/chrome/patches
	cp -R include/curl/* ../curl_cffi/include/curl/

.so_downloaded:
	python preprocess/download_so.py
	python preprocess/download_so.py $(VERSION)
	touch .so_downloaded

preprocess: .preprocessed
46 changes: 44 additions & 2 deletions README-zh.md
@@ -14,6 +14,16 @@ TLS or JA3 fingerprints. If some website blocks you for no obvious reason, you can
- Pre-compiled, so you don't have to compile on your machine.
- Supports `asyncio`, with proxy rotation on each request.
- Supports http 2.0, which requests does not.
- Supports websocket.

||requests|aiohttp|httpx|pycurl|curl_cffi|
|---|---|---|---|---|---|
|http2||||||
|sync||||||
|async||||||
|websocket||||||
|fingerprints||||||
|speed|🐇|🐇🐇|🐇|🐇🐇|🐇🐇|

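## Install

@@ -23,8 +33,14 @@ TLS or JA3 fingerprints. If some website blocks you for no obvious reason, you can
On other, less common platforms, you may need to compile and install `curl-impersonate` first
and set environment variables such as `LD_LIBRARY_PATH`.

To install beta releases:

pip install curl_cffi --pre

## Usage

Impersonate a reasonably recent browser where possible; do not copy `chrome110` straight from the examples below.

### requests-like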

```python
@@ -59,19 +75,25 @@ print(r.json())
# {'cookies': {'foo': 'bar'}}
```

Supported impersonate versions, matching [curl-impersonate](https://github.com/lwthiker/curl-impersonate):
Supported impersonate versions, matching my [fork](https://github.com/yifeikong/curl-impersonate) of [curl-impersonate](https://github.com/lwthiker/curl-impersonate):

However, only Chrome-like browsers are supported. Progress on Firefox support is tracked in #55

- chrome99
- chrome100
- chrome101
- chrome104
- chrome107
- chrome110
- chrome116
- chrome119
- chrome120
- chrome99_android
- edge99
- edge101
- safari15_3
- safari15_5
- safari17_2_ios

### asyncio

@@ -102,6 +124,22 @@ async with AsyncSession() as s:
results = await asyncio.gather(*tasks)
```

### WebSockets

```python
from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()
```

### curl-like

Alternatively, you can use the low-level curl-like API:
@@ -125,7 +163,10 @@ print(body.decode())

See the [English docs](https://curl-cffi.readthedocs.io) for more details.

If you are using scrapy, check out this middleware: [tieyongjie/scrapy-fingerprint](https://github.com/tieyongjie/scrapy-fingerprint)
If you are using scrapy, check out these middlewares:

- [tieyongjie/scrapy-fingerprint](https://github.com/tieyongjie/scrapy-fingerprint)
- [jxlil/scrapy-impersonate](https://github.com/jxlil/scrapy-impersonate)

For questions and suggestions, please open an issue first, in either Chinese or English; you can also join the [Telegram group](https://t.me/+lL9n33eZp480MGM1) or the WeChat group to discuss:

@@ -136,6 +177,7 @@ print(body.decode())
- This project was forked from [multippt/python_curl_cffi](https://github.com/multippt/python_curl_cffi), which is released under the MIT license.
- The Headers/Cookies code comes from [httpx](https://github.com/encode/httpx/blob/master/httpx/_models.py), which is released under the BSD license.
- Asyncio support is inspired by Tornado's curl http client.
- The WebSocket API design comes from [websocket_client](https://github.com/websocket-client/websocket-client)

## Sponsor

34 changes: 32 additions & 2 deletions README.md
@@ -17,12 +17,14 @@ website for no obvious reason, you can give this package a try.
- Pre-compiled, so you don't have to compile on your machine.
- Supports `asyncio` with proxy rotation on each request.
- Supports http 2.0, which requests does not.
- Supports websocket.

|library|requests|aiohttp|httpx|pycurl|curl_cffi|
|---|---|---|---|---|---|
|http2||||||
|sync||||||
|async||||||
|websocket||||||
|fingerprints||||||
|speed|🐇|🐇🐇|🐇|🐇🐇|🐇🐇|

@@ -40,6 +42,8 @@ To install beta releases:

## Usage

Use the latest impersonate versions; do NOT copy `chrome110` from the examples here without changing it.

### requests-like

```python
@@ -74,19 +78,25 @@ print(r.json())
# {'cookies': {'foo': 'bar'}}
```

Supported impersonate versions, as supported by [curl-impersonate](https://github.com/lwthiker/curl-impersonate):
Supported impersonate versions, as supported by my [fork](https://github.com/yifeikong/curl-impersonate) of [curl-impersonate](https://github.com/lwthiker/curl-impersonate):

However, only Chrome-like browsers are supported. Firefox support is tracked in #55

- chrome99
- chrome100
- chrome101
- chrome104
- chrome107
- chrome110
- chrome116
- chrome119
- chrome120
- chrome99_android
- edge99
- edge101
- safari15_3
- safari15_5
- safari17_2_ios
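
Since the usage note asks for a recent target rather than a copied `chrome110`, here is a minimal sketch of picking the newest desktop Chrome entry from the list above. `newest_chrome` is a hypothetical helper written for illustration; it is not part of curl_cffi, and the list is simply copied from this README.

```python
import re

# Targets copied from the supported list above.
SUPPORTED = [
    "chrome99", "chrome100", "chrome101", "chrome104", "chrome107",
    "chrome110", "chrome116", "chrome119", "chrome120", "chrome99_android",
    "edge99", "edge101", "safari15_3", "safari15_5", "safari17_2_ios",
]


def newest_chrome(targets):
    """Return the desktop Chrome target with the highest version number."""
    versions = []
    for name in targets:
        # fullmatch skips chrome99_android as well as edge* and safari* entries.
        m = re.fullmatch(r"chrome(\d+)", name)
        if m:
            versions.append((int(m.group(1)), name))
    if not versions:
        raise ValueError("no desktop Chrome target found")
    return max(versions)[1]


print(newest_chrome(SUPPORTED))  # chrome120
```

The returned string is the kind of value the requests-like API takes as its `impersonate` argument. Comparing the numeric part matters because a plain string sort would rank `chrome99` above `chrome120`.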

### asyncio

@@ -117,6 +127,22 @@ async with AsyncSession() as s:
results = await asyncio.gather(*tasks)
```

### WebSockets

```python
from curl_cffi.requests import Session, WebSocket

def on_message(ws: WebSocket, message):
    print(message)

with Session() as s:
    ws = s.ws_connect(
        "wss://api.gemini.com/v1/marketdata/BTCUSD",
        on_message=on_message,
    )
    ws.run_forever()
```

### curl-like

Alternatively, you can use the low-level curl-like API:
@@ -140,13 +166,17 @@ print(body.decode())

See the [docs](https://curl-cffi.readthedocs.io) for more details.

If you are using scrapy, check out this middleware: [tieyongjie/scrapy-fingerprint](https://github.com/tieyongjie/scrapy-fingerprint)
If you are using scrapy, check out these middlewares:

- [tieyongjie/scrapy-fingerprint](https://github.com/tieyongjie/scrapy-fingerprint)
- [jxlil/scrapy-impersonate](https://github.com/jxlil/scrapy-impersonate)

## Acknowledgement

- Originally forked from [multippt/python_curl_cffi](https://github.com/multippt/python_curl_cffi), which is under the MIT license.
- Headers/Cookies files are copied from [httpx](https://github.com/encode/httpx/blob/master/httpx/_models.py), which is under the BSD license.
- Asyncio support is inspired by Tornado's curl http client.
- The WebSocket API is inspired by [websocket_client](https://github.com/websocket-client/websocket-client)

## Sponsor

12 changes: 12 additions & 0 deletions bump_version.sh
@@ -0,0 +1,12 @@
#!/bin/bash

VERSION=$1

# Makefile
gsed "s/^VERSION := .*/VERSION := ${VERSION}/g" -i Makefile

# curl_cffi/__version__.py
gsed "s/^__version__ = .*/__version__ = \"${VERSION}\"/g" -i curl_cffi/__version__.py

# pyproject.toml
gsed "s/^version = .*/version = \"${VERSION}\"/g" -i pyproject.toml
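
The script depends on `gsed`, the name GNU sed usually has on macOS. As a rough, portable illustration of the same three line-anchored substitutions, here is a minimal Python sketch. `RULES` and `bump_text` are hypothetical names; the sketch rewrites text in memory rather than editing files in place.

```python
import re

# (file path, line pattern, replacement template), mirroring the three gsed calls.
RULES = [
    ("Makefile", r"^VERSION := .*", "VERSION := {v}"),
    ("curl_cffi/__version__.py", r"^__version__ = .*", '__version__ = "{v}"'),
    ("pyproject.toml", r"^version = .*", 'version = "{v}"'),
]


def bump_text(text, pattern, template, version):
    """Replace every line matching `pattern` with the filled-in template."""
    replacement = template.format(v=version)
    # A function replacement keeps re.sub from interpreting backslashes
    # or group references inside the version string.
    return re.sub(pattern, lambda _m: replacement, text, flags=re.MULTILINE)


sample = "SHELL := bash\nVERSION := 0.5.4\nCURL_VERSION := curl-7.84.0\n"
_, pattern, template = RULES[0]
print(bump_text(sample, pattern, template, "0.6.0b6"))
```

With `re.MULTILINE`, `^` anchors at the start of each line as it does in sed, so `CURL_VERSION := ...` is left untouched.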
2 changes: 1 addition & 1 deletion curl_cffi/__version__.py
@@ -7,5 +7,5 @@
# __description__ = metadata.metadata("curl_cffi")["Summary"]
# __version__ = metadata.version("curl_cffi")
__description__ = "libcurl ffi bindings for Python, with impersonation support"
__version__ = "0.5.10"
__version__ = "0.6.0b6"
__curl_version__ = Curl().version().decode()
46 changes: 27 additions & 19 deletions curl_cffi/const.py
@@ -106,7 +106,7 @@ class CurlOpt(IntEnum):
SSL_CTX_DATA = 10000 + 109
FTP_CREATE_MISSING_DIRS = 0 + 110
PROXYAUTH = 0 + 111
FTP_RESPONSE_TIMEOUT = 0 + 112
SERVER_RESPONSE_TIMEOUT = 0 + 112
IPRESOLVE = 0 + 113
MAXFILESIZE = 0 + 114
INFILESIZE_LARGE = 30000 + 115
@@ -303,14 +303,21 @@ class CurlOpt(IntEnum):
MIME_OPTIONS = 0 + 315
SSH_HOSTKEYFUNCTION = 20000 + 316
SSH_HOSTKEYDATA = 10000 + 317
HTTPBASEHEADER = 10000 + 318
SSL_SIG_HASH_ALGS = 10000 + 319
SSL_ENABLE_ALPS = 0 + 320
SSL_CERT_COMPRESSION = 10000 + 321
SSL_ENABLE_TICKET = 0 + 322
HTTP2_PSEUDO_HEADERS_ORDER = 10000 + 323
HTTP2_NO_SERVER_PUSH = 0 + 324
SSL_PERMUTE_EXTENSIONS = 0 + 325
PROTOCOLS_STR = 10000 + 318
REDIR_PROTOCOLS_STR = 10000 + 319
WS_OPTIONS = 0 + 320
CA_CACHE_TIMEOUT = 0 + 321
QUICK_EXIT = 0 + 322
HTTPBASEHEADER = 10000 + 323
SSL_SIG_HASH_ALGS = 10000 + 324
SSL_ENABLE_ALPS = 0 + 325
SSL_CERT_COMPRESSION = 10000 + 326
SSL_ENABLE_TICKET = 0 + 327
HTTP2_PSEUDO_HEADERS_ORDER = 10000 + 328
HTTP2_SETTINGS = 10000 + 329
SSL_PERMUTE_EXTENSIONS = 0 + 330
HTTP2_WINDOW_UPDATE = 0 + 331
ECH = 10000 + 332

if locals().get("WRITEDATA"):
FILE = locals().get("WRITEDATA")
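
The `0 +`, `10000 +`, `20000 +` and `30000 +` prefixes in `CurlOpt` follow libcurl's `CURLOPTTYPE_*` convention: the base encodes the argument type and the remainder is the option number. A minimal sketch of splitting a value back into those two parts follows. `OPT_TYPES` and `split_option` are hypothetical names used only for illustration; they are not part of curl_cffi.

```python
# Base offsets as defined by CURLOPTTYPE_* in curl.h.
OPT_TYPES = {
    0: "long",
    10000: "objectpoint",    # strings, slists and other pointers
    20000: "functionpoint",  # callbacks
    30000: "off_t",
    40000: "blob",
}


def split_option(value):
    """Split a CurlOpt value into (argument type, option number)."""
    base = value // 10000 * 10000
    return OPT_TYPES[base], value - base


print(split_option(10000 + 329))  # HTTP2_SETTINGS
print(split_option(0 + 320))      # WS_OPTIONS
```

This shows why the impersonation options renumbered in this hunk keep their prefixes. Only the option number shifts to make room for the options added upstream; the argument type does not change.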
@@ -328,22 +335,16 @@ class CurlInfo(IntEnum):
NAMELOOKUP_TIME = 0x300000 + 4
CONNECT_TIME = 0x300000 + 5
PRETRANSFER_TIME = 0x300000 + 6
SIZE_UPLOAD = 0x300000 + 7
SIZE_UPLOAD_T = 0x600000 + 7
SIZE_DOWNLOAD = 0x300000 + 8
SIZE_DOWNLOAD_T = 0x600000 + 8
SPEED_DOWNLOAD = 0x300000 + 9
SPEED_DOWNLOAD_T = 0x600000 + 9
SPEED_UPLOAD = 0x300000 + 10
SPEED_UPLOAD_T = 0x600000 + 10
HEADER_SIZE = 0x200000 + 11
REQUEST_SIZE = 0x200000 + 12
SSL_VERIFYRESULT = 0x200000 + 13
FILETIME = 0x200000 + 14
FILETIME_T = 0x600000 + 14
CONTENT_LENGTH_DOWNLOAD = 0x300000 + 15
CONTENT_LENGTH_DOWNLOAD_T = 0x600000 + 15
CONTENT_LENGTH_UPLOAD = 0x300000 + 16
CONTENT_LENGTH_UPLOAD_T = 0x600000 + 16
STARTTRANSFER_TIME = 0x300000 + 17
CONTENT_TYPE = 0x100000 + 18
@@ -357,7 +358,6 @@ class CurlInfo(IntEnum):
NUM_CONNECTS = 0x200000 + 26
SSL_ENGINES = 0x400000 + 27
COOKIELIST = 0x400000 + 28
LASTSOCKET = 0x200000 + 29
FTP_ENTRY_PATH = 0x100000 + 30
REDIRECT_URL = 0x100000 + 31
PRIMARY_IP = 0x100000 + 32
@@ -371,12 +371,10 @@ class CurlInfo(IntEnum):
PRIMARY_PORT = 0x200000 + 40
LOCAL_IP = 0x100000 + 41
LOCAL_PORT = 0x200000 + 42
TLS_SESSION = 0x400000 + 43
ACTIVESOCKET = 0x500000 + 44
TLS_SSL_PTR = 0x400000 + 45
HTTP_VERSION = 0x200000 + 46
PROXY_SSL_VERIFYRESULT = 0x200000 + 47
PROTOCOL = 0x200000 + 48
SCHEME = 0x100000 + 49
TOTAL_TIME_T = 0x600000 + 50
NAMELOOKUP_TIME_T = 0x600000 + 51
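
`CurlInfo` values use a different encoding: the high bits carry the return type and the low 20 bits carry the info number. That is why `SIZE_UPLOAD` and `SIZE_UPLOAD_T` share the number 7 but differ in prefix. A minimal sketch of the split follows. The masks are the ones libcurl defines as `CURLINFO_TYPEMASK` and `CURLINFO_MASK`; `INFO_TYPES` and `split_info` are hypothetical helper names.

```python
# Type tags as defined by CURLINFO_* in curl.h.
INFO_TYPES = {
    0x100000: "string",
    0x200000: "long",
    0x300000: "double",
    0x400000: "slist/pointer",
    0x500000: "socket",
    0x600000: "off_t",
}

TYPE_MASK = 0xF00000    # CURLINFO_TYPEMASK
NUMBER_MASK = 0x0FFFFF  # CURLINFO_MASK


def split_info(value):
    """Split a CurlInfo value into (return type, info number)."""
    return INFO_TYPES[value & TYPE_MASK], value & NUMBER_MASK


print(split_info(0x300000 + 7))  # SIZE_UPLOAD
print(split_info(0x600000 + 7))  # SIZE_UPLOAD_T
```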
@@ -492,7 +490,7 @@ class CurlECode(IntEnum):
TFTP_UNKNOWNID = 72
REMOTE_FILE_EXISTS = 73
TFTP_NOSUCHUSER = 74
CONV_FAILED = 75
OBSOLETE75 = 75
OBSOLETE76 = 76
SSL_CACERT_BADFILE = 77
REMOTE_FILE_NOT_FOUND = 78
@@ -517,6 +515,7 @@ class CurlECode(IntEnum):
PROXY = 97
SSL_CLIENTCERT = 98
UNRECOVERABLE_POLL = 99
ECH_REQUIRED = 100


class CurlHttpVersion(IntEnum):
@@ -527,3 +526,12 @@ class CurlHttpVersion(IntEnum):
V2TLS = 4 # use version 2 for HTTPS, version 1.1 for HTTP */
V2_PRIOR_KNOWLEDGE = 5 # please use HTTP 2 without HTTP/1.1 Upgrade */
V3 = 30 # Makes use of explicit HTTP/3 without fallback.


class CurlWsFlag(IntEnum):
    TEXT = 1 << 0
    BINARY = 1 << 1
    CONT = 1 << 2
    CLOSE = 1 << 3
    PING = 1 << 4
    OFFSET = 1 << 5
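
Each `CurlWsFlag` member occupies a single bit, so one integer can carry several flags at once, for example a text frame that is also a continuation. A minimal, self-contained sketch of decoding such a value follows. The enum is copied from the diff above; `decode_flags` is a hypothetical helper, not part of curl_cffi.

```python
from enum import IntEnum


class CurlWsFlag(IntEnum):
    TEXT = 1 << 0
    BINARY = 1 << 1
    CONT = 1 << 2
    CLOSE = 1 << 3
    PING = 1 << 4
    OFFSET = 1 << 5


def decode_flags(bits):
    """Return the names of all flags set in `bits`, in definition order."""
    return [flag.name for flag in CurlWsFlag if bits & flag]


combined = CurlWsFlag.TEXT | CurlWsFlag.CONT  # plain int 5
print(decode_flags(combined))  # ['TEXT', 'CONT']
```

Because the class is an `IntEnum` rather than an `IntFlag`, combining members with `|` yields a plain `int`, so the sketch tests membership with a bitwise AND.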