Add Firefox curl-impersonate #186

Closed
wants to merge 2 commits into from

Conversation

bjia56
Contributor

@bjia56 bjia56 commented Dec 26, 2023

Adds the Firefox build of curl-impersonate as a separate cffi module, built into and distributed with the curl_cffi package. This change parameterizes all references to ffi and lib to use ones loaded from either chrome or ff. No external modules needed.

Firefox's libnss requires the libnss3 package to be installed on the host in order to get the system certificates, which are bundled into a shared object. An alternative is to use libnsspem to support PEM certificates as the Chrome build does, but this library is not currently built into the main nss distribution. Perhaps curl-impersonate could someday be extended to link against it by default, or curl_cffi could vendor it using auditwheel.

I will keep it as a draft for now to gather feedback.

Resolves #59
Resolves #177

@perklet
Collaborator

perklet commented Dec 26, 2023

Can certifi help this situation? If it helps, we can just switch to that.

@bjia56
Contributor Author

bjia56 commented Dec 26, 2023

certifi probably won't help much directly, since it provides certs in PEM format. libnss encodes the certs within one of its shared objects, libnssckbi.so, which can be installed with something like apt install libnss3.
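
A rough way to check whether the host already provides that shared object is to ask the dynamic loader for it. This is only a best-effort sketch: `find_nss_builtin_roots` is a hypothetical helper name, and `ctypes.util.find_library` can return `None` even when libnss3 is installed if the distribution places libnssckbi.so outside the loader's default search path.

```python
import ctypes.util
from typing import Optional


def find_nss_builtin_roots() -> Optional[str]:
    """Return what the loader reports for libnssckbi, or None if it is not found.

    Hypothetical helper for illustration; it is not part of curl_cffi.
    """
    # find_library expects the bare name, without the "lib" prefix or ".so" suffix.
    return ctypes.util.find_library("nssckbi")


if __name__ == "__main__":
    found = find_nss_builtin_roots()
    if found is None:
        print("libnssckbi not found by the loader; on Debian/Ubuntu, try: apt install libnss3")
    else:
        print(f"NSS built-in root certificates module found: {found}")
```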

Comment on lines +25 to +26
from .curl import Curl, CurlChrome, CurlFirefox, CurlError
from .aio import AsyncCurl, AsyncCurlChrome, AsyncCurlFirefox
Contributor

Would this require deprecating Curl and AsyncCurl in favor of CurlChrome and AsyncCurlChrome?

Comment on lines +101 to +102
self._ffi = _ffi
self._lib = _lib
Contributor

As Curl is an exposed API, it could auto-detect the backend based on the impersonate parameter (like requests.Session):

class Curl:
    """
    Wrapper for `curl_easy_*` functions of libcurl.
    """

    def __init__(self, cacert: str = DEFAULT_CACERT, debug: bool = False, handle=None, impersonate=None):
        """
        Parameters:
            cacert: CA cert path to use, by default, curl_cffi uses its own bundled cert.
            debug: whether to show curl debug messages.
            handle: an existing curl handle to wrap instead of creating a new one.
            impersonate: browser to impersonate, used to pick the chrome or firefox backend.
        """
        self._ffi, self._lib = get_backend_lib(impersonate)  # return chrome by default
        self._curl = self._lib.curl_easy_init() if not handle else handle
        self._headers = self._ffi.NULL
        self._resolve = self._ffi.NULL
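
The `get_backend_lib` helper in the suggestion above is not defined anywhere in the thread. A minimal sketch of what it might look like follows, with several assumptions: the `(ffi, lib)` pairs are stand-in objects rather than the real cffi modules, the target names such as "chrome110" and "ff109" are only illustrative, and any non-Firefox target is assumed to fall back to the Chrome build.

```python
from types import SimpleNamespace
from typing import Optional, Tuple

# Stand-ins for the (ffi, lib) pairs of the two cffi modules. In the real
# package these would come from the compiled chrome and firefox extensions.
_BACKENDS = {
    "chrome": (SimpleNamespace(name="chrome_ffi"), SimpleNamespace(name="chrome_lib")),
    "firefox": (SimpleNamespace(name="ff_ffi"), SimpleNamespace(name="ff_lib")),
}


def get_backend_lib(impersonate: Optional[str] = None) -> Tuple[object, object]:
    """Pick a backend from an impersonate target (hypothetical helper)."""
    if impersonate is None:
        # Chrome stays the default, matching the comment in the suggestion.
        return _BACKENDS["chrome"]
    target = impersonate.lower()
    if target.startswith(("ff", "firefox")):
        return _BACKENDS["firefox"]
    # Assumption: every other target is served by the Chrome build.
    return _BACKENDS["chrome"]
```

Note that this only resolves the backend once, at construction time, so it does not by itself handle an impersonate value that changes per request.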

Contributor Author

I'm not sure there is an easy way to achieve this auto-detect. In requests.Session, the impersonate parameter could be set in the constructor, or it could be set (and changed) for each individual request. For example, if a user omits the constructor impersonate, then Session would default to the Chrome version, but what if the user tries to use Firefox in the request itself? I think it could lead to confusion on how to use or not use this class.

The addition of extra classes makes it explicitly clear what the underlying implementation is, and keeps the existing API intact. Firefox is then opt-in by customizing the curl instance and the impersonate parameter.
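
The explicit-class approach can be illustrated roughly as follows. This is a sketch, not the PR's actual code: `CurlBase` and `_fake_backend` are hypothetical names, the backends are stand-in objects instead of the real cffi modules, and binding the backend as a class attribute is just one possible way to wire it.

```python
from types import SimpleNamespace


def _fake_backend(name: str):
    """Build a stand-in (ffi, lib) pair; NOT the real cffi modules."""
    ffi = SimpleNamespace(NULL=None)
    lib = SimpleNamespace(backend=name, curl_easy_init=lambda: f"{name}-handle")
    return ffi, lib


class CurlBase:
    """Shared logic; each subclass binds exactly one backend."""

    _ffi, _lib = None, None  # set by subclasses

    def __init__(self, handle=None):
        self._curl = self._lib.curl_easy_init() if not handle else handle
        self._headers = self._ffi.NULL


class CurlChrome(CurlBase):
    _ffi, _lib = _fake_backend("chrome")


class CurlFirefox(CurlBase):
    _ffi, _lib = _fake_backend("firefox")


# Existing code that uses Curl keeps working and keeps getting Chrome.
Curl = CurlChrome
```

With this layout the backend is fixed by the class the caller picks, so a per-request impersonate value can never silently require a different library than the one the handle was created with.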

@T-256
Contributor

T-256 commented Dec 26, 2023

Pointing to what was discussed before in #163 (comment):

As for Firefox, it's really challenging to pack an additional .so file in a Python wheel. There are two options to bypass this:

  1. release another package, i.e. curl_cffi_ff, as suggested by one of our users
  2. Try to use boringssl(chrome) to emulate nss(firefox)

At least one of them should work, I just haven't had time to try them out. Maybe I can experiment with them during the Chinese New Year.

If, in the near future, we could somehow patch boringssl to impersonate Firefox (you can track this at lexiforest/curl-impersonate#6), then I think it would be hard to remove the public APIs exposed in this PR.

As I said in #163 (comment), IMO it's more important for us to support Firefox by patching boringssl (its maintenance cost is lower).

@bjia56
Contributor Author

bjia56 commented Dec 27, 2023

In my opinion, emulation is prone to errors and getting the real software to work is more reliable. However, if @yifeikong thinks that a Boringssl implementation is the right way to go, then I am happy to close this PR, though I'm personally not able to take on an attempt at emulation and someone else will need to do it.

@bjia56
Contributor Author

bjia56 commented Dec 27, 2023

If we decide to keep this approach of linking in curl-impersonate-ff, I'm happy to also investigate incorporating libnsspem.so so certifi will work.

@perklet
Collaborator

perklet commented Dec 27, 2023

As of 0.6, I don't intend to investigate boringssl anymore. It's such a headache for me, which is why the milestone for lexiforest/curl-impersonate#6 is set to 0.7. And I'm not sure it will ever work.

Please keep going with this approach. It will be the way forward for at least the whole lifecycle of 0.6.

@bjia56
Contributor Author

bjia56 commented Dec 27, 2023

@yifeikong I see that curl-impersonate-win only has a chrome build. Would you like to keep that repo as the source of Windows builds, or are you planning to merge it into your other curl-impersonate fork? Just wondering where I should start getting a Firefox build for Windows.

@perklet
Collaborator

perklet commented Dec 27, 2023

I would like to merge them as stated in lexiforest/curl-impersonate#4

@bjia56
Contributor Author

bjia56 commented Dec 27, 2023

Something that will need addressing is the glibc target of the firefox library. For example, the upstream curl-impersonate chrome requires glibc 2.17 (hence the ability to produce manylinux2014 images), but curl-impersonate firefox requires a more recent glibc (up to GLIBC_2.30 in the ldd output below). This prevents auditwheel from moving the shared objects into the wheel.

I suggest that in the curl-impersonate repo, we consider building libcurl within the manylinux2014 docker images (the ones used by cibuildwheel), so we can guarantee glibc 2.17 compatibility. The docker images have Red Hat's devtoolset, which should give us newer C/C++ features (if needed by libcurl dependencies) while maintaining glibc 2.17.

I can take a look at submitting a PR for switching to this kind of build.

$ ldd -v libcurl-impersonate-chrome.so
        linux-vdso.so.1 (0x00007fff52bd1000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007ff777336000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ff777331000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff777109000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ff777642000)

        Version information:
        ./libcurl-impersonate-chrome.so:
                libpthread.so.0 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libpthread.so.0
                libc.so.6 (GLIBC_2.16) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.3) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.7) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.14) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.15) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.4) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.17) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.3.4) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libc.so.6
        /lib/x86_64-linux-gnu/libz.so.1:
                libc.so.6 (GLIBC_2.14) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.4) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.3.4) => /lib/x86_64-linux-gnu/libc.so.6
        /lib/x86_64-linux-gnu/libpthread.so.0:
                libc.so.6 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libc.so.6
        /lib/x86_64-linux-gnu/libc.so.6:
                ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /lib64/ld-linux-x86-64.so.2
                ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
                ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
$ ldd -v libcurl-impersonate-ff.so
        linux-vdso.so.1 (0x00007ffcbfb14000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f195c17a000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f195c15e000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f195c159000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f195bf31000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f195c571000)

        Version information:
        ./libcurl-impersonate-ff.so:
                libdl.so.2 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libdl.so.2
                libpthread.so.0 (GLIBC_2.3.2) => /lib/x86_64-linux-gnu/libpthread.so.0
                libpthread.so.0 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libpthread.so.0
                libc.so.6 (GLIBC_2.15) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.14) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.4) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.30) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.28) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.25) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.3) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.7) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.17) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.3.4) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libc.so.6
        /lib/x86_64-linux-gnu/libdl.so.2:
                libc.so.6 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libc.so.6
        /lib/x86_64-linux-gnu/libz.so.1:
                libc.so.6 (GLIBC_2.14) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.4) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libc.so.6
                libc.so.6 (GLIBC_2.3.4) => /lib/x86_64-linux-gnu/libc.so.6
        /lib/x86_64-linux-gnu/libpthread.so.0:
                libc.so.6 (GLIBC_2.2.5) => /lib/x86_64-linux-gnu/libc.so.6
        /lib/x86_64-linux-gnu/libc.so.6:
                ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /lib64/ld-linux-x86-64.so.2
                ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
                ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2
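
The glibc requirement visible in the ldd output above can also be extracted mechanically. The following is a minimal sketch, assuming the output has been captured as text; `max_glibc_version` and `fits_manylinux2014` are hypothetical helper names, and the 2.17 ceiling is the glibc baseline of manylinux2014.

```python
import re
from typing import Optional, Tuple

# Matches versioned symbol tags such as GLIBC_2.17 or GLIBC_2.3.4,
# and deliberately skips non-numeric tags like GLIBC_PRIVATE.
_GLIBC_RE = re.compile(r"GLIBC_(\d+(?:\.\d+)+)")


def max_glibc_version(ldd_output: str) -> Optional[Tuple[int, ...]]:
    """Return the newest GLIBC version mentioned in the text, or None."""
    versions = [
        tuple(int(part) for part in match.split("."))
        for match in _GLIBC_RE.findall(ldd_output)
    ]
    return max(versions) if versions else None


def fits_manylinux2014(ldd_output: str) -> bool:
    """True if nothing newer than glibc 2.17 is required."""
    newest = max_glibc_version(ldd_output)
    return newest is None or newest <= (2, 17)


if __name__ == "__main__":
    chrome = "libc.so.6 (GLIBC_2.17)\nlibc.so.6 (GLIBC_2.3.4)\nld.so (GLIBC_PRIVATE)"
    firefox = "libc.so.6 (GLIBC_2.30)\nlibc.so.6 (GLIBC_2.28)\nlibc.so.6 (GLIBC_2.17)"
    print(max_glibc_version(chrome), fits_manylinux2014(chrome))
    print(max_glibc_version(firefox), fits_manylinux2014(firefox))
```

Versions are compared as integer tuples so that 2.3.4 sorts below 2.17. Because `ldd -v` also lists the requirements of the host's own libraries, feeding it only the section for the library under test gives a tighter answer.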

@perklet
Collaborator

perklet commented Dec 28, 2023 via email

@bjia56
Contributor Author

bjia56 commented Jan 9, 2024

Unfortunately, it's probably no longer worth pursuing this option due to curl removing support for NSS, as discussed in lexiforest/curl-impersonate#20

I'll close this for now

@bjia56 bjia56 closed this Jan 9, 2024
@perklet
Collaborator

perklet commented Jan 10, 2024

@bjia56 Thanks a lot for your exploration! We would never know where the path leads to until we have gone through it.

@bjia56 bjia56 deleted the firefox branch August 9, 2024 16:26
Development

Successfully merging this pull request may close these issues.

Add support for Firefox Can't use firefox impersolate
3 participants