[YouTube] Unable to extract yt initial data (live) #32973

Open
fluca1978 opened this issue Nov 13, 2024 · 7 comments

Comments

@fluca1978

Using head commit c509896:

% ./bin/youtube-dl 'https://www.youtube.com/live/QJ42yFBHgZk' --verbose
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['https://www.youtube.com/live/QJ42yFBHgZk', '--verbose']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.12.3 (CPython) - Linux-6.8.0-48-generic-x86_64-with-glibc2.39
[debug] exe versions: ffmpeg 6.1.1, ffprobe 6.1.1
[debug] Proxy map: {}
[youtube:tab] live: Downloading webpage
ERROR: Unable to extract yt initial data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
                ^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/common.py", line 534, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/youtube.py", line 2841, in _real_extract
    data = self._extract_yt_initial_data(item_id, webpage)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/youtube.py", line 299, in _extract_yt_initial_data
    self._search_regex(
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/common.py", line 1012, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract yt initial data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

@dirkf
Contributor

dirkf commented Nov 14, 2024

Yet:

$ ./youtube-dl-20240807 -vF 'https://www.youtube.com/live/QJ42yFBHgZk'
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-vF', u'https://www.youtube.com/live/QJ42yFBHgZk']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2024.08.07 [c5098961b] (single file build)
[debug] ** This version was built from the latest master code at https://github.com/ytdl-org/youtube-dl.
[debug] ** For support, visit the main site.
[debug] Python 2.7.15 (CPython i686 32bit) - Linux-6.1.0-26-686-pae-i686-with-debian-12.8 - OpenSSL 1.1.1a  20 Nov 2018 - glibc 2.1.3
[debug] exe versions: ffmpeg 5.1.6-0, ffprobe 5.1.6-0
[debug] Proxy map: {}
[youtube:tab] live: Downloading webpage
[youtube] QJ42yFBHgZk: Downloading webpage
[debug] [youtube] Decrypted nsig tqHWgWT1ggyyPbVj => -_FrJhdsXbosmQ
[debug] [youtube] Decrypted nsig Drg_nvve4GYgn5kM => Zv0ej5DSAzR7nA
[info] Available formats for QJ42yFBHgZk:
format code  extension  resolution note
251          webm       audio only audio_quality_medium   89k , webm_dash container, opus  (48000Hz), 130.87MiB
140          m4a        audio only audio_quality_medium  129k , m4a_dash container, mp4a.40.2 (44100Hz), 189.55MiB
160          mp4        256x144    144p   56k , mp4_dash container, avc1.4d400c, 30fps, video only, 83.29MiB
243          webm       640x360    360p  173k , webm_dash container, vp9, 30fps, video only, 254.31MiB
134          mp4        640x360    360p  235k , mp4_dash container, avc1.4d401e, 30fps, video only, 345.13MiB
136          mp4        1280x720   720p  832k , mp4_dash container, avc1.64001f, 30fps, video only, 1.19GiB
18           mp4        640x360    360p  363k , avc1.42001E, 30fps, mp4a.40.2 (44100Hz), 532.79MiB (best)
$ 

(same with Py3.11)

Presumably you are getting some captcha or blocking page that does not contain the expected initial data block.
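A quick way to test that guess is to save the page that the failing client receives and look for the data block. The following is a minimal sketch, not youtube-dl's own logic: the function name `classify_page`, the simplified `ytInitialData` pattern, and the use of `consent.youtube.com` as a marker for a consent page are all assumptions made for illustration.

```python
import re

# Simplified pattern for illustration only; youtube-dl's own regex may differ.
INITIAL_DATA_RE = re.compile(r'ytInitialData\s*=\s*\{')


def classify_page(html):
    """Roughly classify a saved YouTube page (hypothetical helper).

    Returns "ok" if an initial data assignment is present, "consent" if the
    page mentions the consent host instead, and "unknown" otherwise.
    """
    if INITIAL_DATA_RE.search(html):
        return "ok"
    # Assumption: a consent interstitial references consent.youtube.com.
    if "consent.youtube.com" in html:
        return "consent"
    return "unknown"
```

Running it on the HTML saved from the failing environment would show whether the extractor had any data block to find in the first place.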

You could try to provoke that page in a private/incognito browser session, solve any resulting challenge, and then close the session. Passing the resulting cookies to youtube-dl, perhaps with the same User-Agent, might then succeed.
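Outside youtube-dl, the same idea (exported cookies plus a matching User-Agent) can be sketched with the Python standard library. This is only an illustration under stated assumptions: `make_opener` is a hypothetical helper, and it does not reproduce how youtube-dl handles `--cookies` internally. Note that session cookies exported with an expiry of `0` are dropped by `MozillaCookieJar.load()` unless `ignore_expires=True` is passed.

```python
import http.cookiejar
import urllib.request


def make_opener(cookie_path, user_agent):
    """Build a urllib opener that sends cookies from a Netscape-format file.

    Hypothetical helper for illustration; not part of youtube-dl.
    """
    jar = http.cookiejar.MozillaCookieJar(cookie_path)
    # Keep session cookies (no expiry) and cookies whose expiry is 0,
    # which browser export extensions commonly write.
    jar.load(ignore_discard=True, ignore_expires=True)
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    # Send the same User-Agent as the browser that obtained the cookies.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar
```

Calling `opener.open(url)` with the returned opener would then send both the loaded cookies and the chosen User-Agent.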

dirkf changed the title from "Unable to extract" to "[YouTube] Unable to extract yt initial data (live)" Nov 14, 2024
@fluca1978
Author

@dirkf this is what I did:

  1. downloaded cookies from a private window (there was indeed the classic "accept cookies" page)
  2. copied the file into another one, since youtube-dl would overwrite it
  3. invoked youtube-dl.
% head cookies.private.txt
# HTTP Cookie File for domains related to youtube.com.
# Downloaded with cookies.txt Chrome Extension (https://chrome.google.com/webstore/detail/njabckikapfpffapmjgojcnbfjonfjfg)
# Example:  wget -x --load-cookies cookies.txt https://www.youtube.com/live/QJ42yFBHgZk
#
.youtube.com    TRUE    /       TRUE    0       YSC     yyZ3myNhZnw

...


% cp cookies.private.txt cookies.txt

% ./youtube-dl --cookies cookies.txt --verbose https://www.youtube.com/live/QJ42yFBHgZk
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--cookies', 'cookies.txt', '--verbose', 'https://www.youtube.com/live/QJ42yFBHgZk']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.12.3 (CPython) - Linux-6.8.0-48-generic-x86_64-with-glibc2.39
[debug] exe versions: ffmpeg 6.1.1, ffprobe 6.1.1
[debug] Proxy map: {}
[youtube:tab] live: Downloading webpage
[youtube] QJ42yFBHgZk: Downloading webpage
ERROR: Unable to extract uploader id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
                ^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/common.py", line 534, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/youtube.py", line 1794, in _real_extract
    'uploader_id': self._search_regex(r'/(?:channel|user)/([^/?&#]+)', owner_profile_url, 'uploader id') if owner_profile_url else None,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/common.py", line 1012, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract uploader id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.


% diff cookies.private.txt cookies.txt        
1,4c1,3
< # HTTP Cookie File for domains related to youtube.com.
< # Downloaded with cookies.txt Chrome Extension (https://chrome.google.com/webstore/detail/njabckikapfpffapmjgojcnbfjonfjfg)
< # Example:  wget -x --load-cookies cookies.txt https://www.youtube.com/live/QJ42yFBHgZk
< #
---
> # Netscape HTTP Cookie File
> # This file is generated by youtube-dl.  Do not edit.
> 
6d4
< consent.youtube.com   FALSE   /       TRUE    1734166928      OTZ     7821182_52_52_123900_48_436380
11c9,11
< .youtube.com  TRUE    /       TRUE    1794646933      PREF    tz=Europe.Rome&amp;f6=40000000
---
> .youtube.com  TRUE    /       TRUE    1752612982      PREF    tz=Europe.Rome
> .youtube.com  TRUE    /       FALSE   0       CONSENT YES+cb.20210328-17-p0.en+FX+986
> consent.youtube.com   FALSE   /       TRUE    1734166928      OTZ     7821182_52_52_123900_48_436380
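
The diff above is sensitive to line order and to the rewritten file header. A comparison by cookie identity can be easier to read. The sketch below is an assumption-laden illustration: `read_cookies` and `compare_cookies` are hypothetical helpers, the parser only handles plain tab-separated lines with seven fields, and it skips every line starting with `#` (including any `#HttpOnly_` entries).

```python
def read_cookies(text):
    """Parse Netscape cookie-file text into {(domain, name): value}."""
    cookies = {}
    for line in text.splitlines():
        # Skip blank lines and comments (this also skips #HttpOnly_ lines).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # not a well-formed cookie line
        domain, _flag, _path, _secure, _expires, name, value = fields
        cookies[(domain, name)] = value
    return cookies


def compare_cookies(before_text, after_text):
    """Report cookies added, removed, or changed in value between two files."""
    before, after = read_cookies(before_text), read_cookies(after_text)
    shared = set(before) & set(after)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "changed": sorted(k for k in shared if before[k] != after[k]),
    }
```

Applied to the text of the two files, this would list which cookies were added, dropped, or changed in value, independent of ordering and header comments.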

It also fails when specifying the user agent (after restoring the cookies):

% ./youtube-dl --cookies cookies.txt --user-agent 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:131.0) Gecko/20100101 Firefox/131.0'  --verbose https://www.youtube.com/live/QJ42yFBHgZk
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--cookies', 'cookies.txt', '--user-agent', 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:131.0) Gecko/20100101 Firefox/131.0', '--verbose', 'https://www.youtube.com/live/QJ42yFBHgZk']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.12.3 (CPython) - Linux-6.8.0-48-generic-x86_64-with-glibc2.39
[debug] exe versions: ffmpeg 6.1.1, ffprobe 6.1.1
[debug] Proxy map: {}
[youtube:tab] live: Downloading webpage
[youtube] QJ42yFBHgZk: Downloading webpage
ERROR: Unable to extract uploader id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
                ^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/common.py", line 534, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/youtube.py", line 1794, in _real_extract
    'uploader_id': self._search_regex(r'/(?:channel|user)/([^/?&#]+)', owner_profile_url, 'uploader id') if owner_profile_url else None,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/common.py", line 1012, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract uploader id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

@dirkf
Contributor

dirkf commented Nov 15, 2024

Please try your command without and with cookies using the single-file build from the latest nightly release.

@fluca1978
Author

Please try your command without and with cookies using the single-file build from the latest nightly release.

Sorry @dirkf, I'm a little confused: I'm using commit c509896 and the binary in the bin directory. This should be the most up-to-date version, so which nightly build are you referring to?

@dirkf
Contributor

dirkf commented Nov 19, 2024

See #30839 that you were asked to read and also https://github.com/ytdl-org/youtube-dl/releases/latest.

@fluca1978
Author

@dirkf thanks for the explanation; I thought that running from head was better.

However, the following is using cookies:

% ./bin/youtube-dl --cookies cookies-youtube-com.txt  --verbose https://www.youtube.com/live/QJ42yFBHgZk 
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--cookies', 'cookies-youtube-com.txt', '--verbose', 'https://www.youtube.com/live/QJ42yFBHgZk']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.12.3 (CPython) - Linux-6.8.0-48-generic-x86_64-with-glibc2.39
[debug] exe versions: ffmpeg 6.1.1, ffprobe 6.1.1
[debug] Proxy map: {}
[youtube:tab] live: Downloading webpage
[youtube] QJ42yFBHgZk: Downloading webpage
ERROR: Unable to extract uploader id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
                ^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/common.py", line 534, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/youtube.py", line 1794, in _real_extract
    'uploader_id': self._search_regex(r'/(?:channel|user)/([^/?&#]+)', owner_profile_url, 'uploader id') if owner_profile_url else None,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/common.py", line 1012, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract uploader id; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to upda

and without cookies:

% ./bin/youtube-dl --verbose https://www.youtube.com/live/QJ42yFBHgZk
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--verbose', 'https://www.youtube.com/live/QJ42yFBHgZk']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.12.3 (CPython) - Linux-6.8.0-48-generic-x86_64-with-glibc2.39
[debug] exe versions: ffmpeg 6.1.1, ffprobe 6.1.1
[debug] Proxy map: {}
[youtube:tab] live: Downloading webpage
ERROR: Unable to extract yt initial data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
Traceback (most recent call last):
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/YoutubeDL.py", line 815, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/YoutubeDL.py", line 836, in __extract_info
    ie_result = ie.extract(url)
                ^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/common.py", line 534, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/youtube.py", line 2841, in _real_extract
    data = self._extract_yt_initial_data(item_id, webpage)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/youtube.py", line 299, in _extract_yt_initial_data
    self._search_regex(
  File "/home/luca/.python.venv.d/django/lib/python3.12/site-packages/youtube_dl/extractor/common.py", line 1012, in _search_regex
    raise RegexNotFoundError('Unable to extract %s' % _name)
youtube_dl.utils.RegexNotFoundError: Unable to extract yt initial data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

I was running this one https://github.com/ytdl-org/ytdl-nightly/releases/download/2024.08.07/youtube-dl-2024.08.07.tar.gz.
I've used the Firefox extensions "cookies.txt" and "Export cookies" to obtain the cookies, if that matters.

@dirkf
Contributor

dirkf commented Nov 19, 2024

If something is tricky to reproduce, running the single-file nightly build for the platform sets a known baseline for testing. Also, its verbose log shows the build and platform details more fully.

The first log looks like the expected behaviour for the stable release, which was fixed in PR #31675.
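
One way to confirm which build produced a given log is to read the version line from the verbose output. The sketch below is illustrative only: the function name `parse_version_line` and the regular expression are assumptions, derived from the two `[debug] youtube-dl version` lines shown in this thread rather than from youtube-dl code.

```python
import re

# Matches e.g. "[debug] youtube-dl version 2024.08.07 [c5098961b] (single file build)"
# and "[debug] youtube-dl version 2021.12.17" (no commit hash).
VERSION_RE = re.compile(
    r'^\[debug\] youtube-dl version (\S+)(?: \[([0-9a-f]+)\])?',
    re.MULTILINE)


def parse_version_line(log_text):
    """Return (version, commit_or_None) from a verbose log, or None if absent."""
    match = VERSION_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

For the logs in this thread, the failing runs report version 2021.12.17 with no commit hash, while the working run reports 2024.08.07 with commit c5098961b.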
