Compare commits

24 Commits

Author SHA1 Message Date
github-actions[bot]
cfa76f35d2 Release 2024.12.03
Created by: bashonly

:ci skip all
2024-12-03 20:30:33 +00:00
bashonly
2b67ac300a [cleanup] Misc (#11716)
Authored by: bashonly, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2024-12-03 20:22:21 +00:00
bashonly
c038a7b187 [ie/vk] Fix extractors (#11715)
Closes #5832, Closes #11471, Closes #11646, Closes #11670
Authored by: bashonly
2024-12-03 14:28:43 +00:00
Link
a13a336aa6 [ie/bilibili] Fix subtitles and chapters extraction (#11708)
Authored by: xiaomac
2024-12-03 04:08:46 +00:00
N/Ame
dc16876480 [ie/bilibili] Always try to extract HD formats (#10559)
Closes #10554
Authored by: grqz
2024-12-03 03:44:03 +00:00
N/Ame
f05a1cd149 [ie/bilibili] Fix supporter-only video extraction (#11711)
Fix bug in 239f5f36fe
Closes #11702
Authored by: grqz, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-12-03 01:19:22 +00:00
sepro
d8fb349086 [cleanup] Bump ruff to 0.8.x (#11608)
Authored by: seproDev
2024-12-02 16:29:30 +01:00
sepro
2bea793632 [ie/MicrosoftEmbed] Make format extraction non fatal (#11654)
Authored by: seproDev
2024-12-02 16:22:16 +01:00
Elan Ruusamäe
62cba8a1be [ie/duoplay] Fix extractor (#11588)
Authored by: glensc, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2024-12-01 22:33:11 +00:00
N/Ame
239f5f36fe [ie/bilibili] Fix extractor (#11667)
Closes #11665
Authored by: grqz
2024-12-01 21:55:18 +00:00
bashonly
0d146c1e36 [ie/youtube] Adjust player clients for site changes (#11663)
Closes #11640
Authored by: bashonly
2024-12-01 15:25:09 +00:00
DarkZeros
cd0f934604 [ie/mitele] Fix extractor (#11683)
Closes #11690
Authored by: DarkZeros
2024-12-01 14:21:57 +00:00
N/Ame
360aed810a [ie/instagram] Support share URLs (#11677)
Closes #11630
Authored by: grqz
2024-12-01 14:16:50 +00:00
bashonly
00dcde7286 [ie/dropbox] Fix password-protected video extraction (#11636)
Closes #11634
Authored by: bashonly
2024-11-27 01:47:28 +00:00
bashonly
910ecc4229 [ie/tiktok] Deprioritize animated thumbnails (#11645)
Closes #11641
Authored by: bashonly
2024-11-27 00:45:01 +00:00
bashonly
0a0d80800b [ie/dacast] Fix HLS AES formats extraction (#11644)
Closes #11643
Authored by: bashonly
2024-11-26 23:18:48 +00:00
Simon Sawicki
e0500cbf79 [ie] Handle fragmented formats in _remove_duplicate_formats (#11637)
Authored by: Grub4K
2024-11-27 00:05:07 +01:00
Jakob Kruse
4b5eec0aaa [ie/chaturbate] Fix support for non-public streams (#11624)
Fix bug in 720b3dc453

Closes #11623
Authored by: jkruse
2024-11-24 22:20:30 +00:00
sepro
fe70f20aed [ie/youtube:tab] Fix playlists tab extraction (#11615)
Closes #11524
Authored by: seproDev
2024-11-23 22:46:50 +01:00
coletdjnz
c7316373c0 [rh:websockets] Support websockets 14.0+ (#11616)
Authored by: coletdjnz
2024-11-24 10:30:00 +13:00
N/Ame
e0f1ae813b [ie/facebook] Support more groups URLs (#11576)
Authored by: grqz
2024-11-23 19:47:37 +00:00
sepro
7d6c259a03 Add playlist_webpage_url field (#11613)
Closes #10827
Authored by: seproDev
2024-11-23 20:42:35 +01:00
gitninja1234
16336c51d0 [ie/stripchat] Fix extractor (#11596)
Closes #11587
Authored by: gitninja1234
2024-11-23 19:40:45 +00:00
bashonly
ccf0a6b86b [cleanup] Misc (#11574)
Authored by: bashonly, pzhlkj6612

Co-authored-by: Mozi <29089388+pzhlkj6612@users.noreply.github.com>
2024-11-23 18:51:51 +00:00
36 changed files with 493 additions and 301 deletions

View File

@@ -707,3 +707,6 @@ Sakura286
SamDecrock
stratus-ss
subrat-lima
gitninja1234
jkruse
xiaomac

View File

@@ -4,11 +4,47 @@
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->
### 2024.12.03
#### Core changes
- [Add `playlist_webpage_url` field](https://github.com/yt-dlp/yt-dlp/commit/7d6c259a03bc4707a319e5e8c6eff0278707874b) ([#11613](https://github.com/yt-dlp/yt-dlp/issues/11613)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- [Handle fragmented formats in `_remove_duplicate_formats`](https://github.com/yt-dlp/yt-dlp/commit/e0500cbf796323551bbabe5b8ed8c75a511ba47a) ([#11637](https://github.com/yt-dlp/yt-dlp/issues/11637)) by [Grub4K](https://github.com/Grub4K)
- **bilibili**
- [Always try to extract HD formats](https://github.com/yt-dlp/yt-dlp/commit/dc1687648077c5bf64863b307ecc5ab7e029bd8d) ([#10559](https://github.com/yt-dlp/yt-dlp/issues/10559)) by [grqz](https://github.com/grqz)
- [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/239f5f36fe04603bec59c8b975f6a792f10246db) ([#11667](https://github.com/yt-dlp/yt-dlp/issues/11667)) by [grqz](https://github.com/grqz) (With fixes in [f05a1cd](https://github.com/yt-dlp/yt-dlp/commit/f05a1cd1492fc98dc8d80d2081d632a1879913d2) by [bashonly](https://github.com/bashonly), [grqz](https://github.com/grqz))
- [Fix subtitles and chapters extraction](https://github.com/yt-dlp/yt-dlp/commit/a13a336aa6f906812701abec8101b73b73db8ff7) ([#11708](https://github.com/yt-dlp/yt-dlp/issues/11708)) by [xiaomac](https://github.com/xiaomac)
- **chaturbate**: [Fix support for non-public streams](https://github.com/yt-dlp/yt-dlp/commit/4b5eec0aaa7c02627f27a386591b735b90e681a8) ([#11624](https://github.com/yt-dlp/yt-dlp/issues/11624)) by [jkruse](https://github.com/jkruse)
- **dacast**: [Fix HLS AES formats extraction](https://github.com/yt-dlp/yt-dlp/commit/0a0d80800b9350d1a4c4b18d82cfb77ffbc3c507) ([#11644](https://github.com/yt-dlp/yt-dlp/issues/11644)) by [bashonly](https://github.com/bashonly)
- **dropbox**: [Fix password-protected video extraction](https://github.com/yt-dlp/yt-dlp/commit/00dcde728635633eee969ad4d498b9f233c4a94e) ([#11636](https://github.com/yt-dlp/yt-dlp/issues/11636)) by [bashonly](https://github.com/bashonly)
- **duoplay**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/62cba8a1bedbfc0ddde7267ae57b72bf5f7ea7b1) ([#11588](https://github.com/yt-dlp/yt-dlp/issues/11588)) by [bashonly](https://github.com/bashonly), [glensc](https://github.com/glensc)
- **facebook**: [Support more groups URLs](https://github.com/yt-dlp/yt-dlp/commit/e0f1ae813b36e783e2348ba2a1566e12f5cd8f6e) ([#11576](https://github.com/yt-dlp/yt-dlp/issues/11576)) by [grqz](https://github.com/grqz)
- **instagram**: [Support `share` URLs](https://github.com/yt-dlp/yt-dlp/commit/360aed810ad85db950df586282d256516c98cd2d) ([#11677](https://github.com/yt-dlp/yt-dlp/issues/11677)) by [grqz](https://github.com/grqz)
- **microsoftembed**: [Make format extraction non fatal](https://github.com/yt-dlp/yt-dlp/commit/2bea7936323ca4b6f3b9b1fdd892566223e30efa) ([#11654](https://github.com/yt-dlp/yt-dlp/issues/11654)) by [seproDev](https://github.com/seproDev)
- **mitele**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/cd0f934604587ed793e9177f6a127e5dcf99a7dd) ([#11683](https://github.com/yt-dlp/yt-dlp/issues/11683)) by [DarkZeros](https://github.com/DarkZeros)
- **stripchat**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/16336c51d0848a6868a4fa04e749fa03548b4913) ([#11596](https://github.com/yt-dlp/yt-dlp/issues/11596)) by [gitninja1234](https://github.com/gitninja1234)
- **tiktok**: [Deprioritize animated thumbnails](https://github.com/yt-dlp/yt-dlp/commit/910ecc422930bca14e2abe4986f5f92359e3cea8) ([#11645](https://github.com/yt-dlp/yt-dlp/issues/11645)) by [bashonly](https://github.com/bashonly)
- **vk**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/c038a7b187ba24360f14134842a7a2cf897c33b1) ([#11715](https://github.com/yt-dlp/yt-dlp/issues/11715)) by [bashonly](https://github.com/bashonly)
- **youtube**
- [Adjust player clients for site changes](https://github.com/yt-dlp/yt-dlp/commit/0d146c1e36f467af30e87b7af651bdee67b73500) ([#11663](https://github.com/yt-dlp/yt-dlp/issues/11663)) by [bashonly](https://github.com/bashonly)
- tab: [Fix playlists tab extraction](https://github.com/yt-dlp/yt-dlp/commit/fe70f20aedf528fdee332131bc9b6710e54e6f10) ([#11615](https://github.com/yt-dlp/yt-dlp/issues/11615)) by [seproDev](https://github.com/seproDev)
#### Networking changes
- **Request Handler**: websockets: [Support websockets 14.0+](https://github.com/yt-dlp/yt-dlp/commit/c7316373c0a886f65a07a51e50ee147bb3294c85) ([#11616](https://github.com/yt-dlp/yt-dlp/issues/11616)) by [coletdjnz](https://github.com/coletdjnz)
#### Misc. changes
- **cleanup**
- [Bump ruff to 0.8.x](https://github.com/yt-dlp/yt-dlp/commit/d8fb3490863653182864d2a53522f350d67a9ff8) ([#11608](https://github.com/yt-dlp/yt-dlp/issues/11608)) by [seproDev](https://github.com/seproDev)
- Miscellaneous
- [ccf0a6b](https://github.com/yt-dlp/yt-dlp/commit/ccf0a6b86b7f68a75463804fe485ec240b8635f0) by [bashonly](https://github.com/bashonly), [pzhlkj6612](https://github.com/pzhlkj6612)
- [2b67ac3](https://github.com/yt-dlp/yt-dlp/commit/2b67ac300ac8b44368fb121637d1743cea8c5b6b) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
### 2024.11.18
#### Important changes
- **Login with OAuth is no longer supported for YouTube**
Due to a change made by the site, yt-dlp is longer able to support OAuth login for YouTube. [Read more](https://github.com/yt-dlp/yt-dlp/issues/11462#issuecomment-2471703090)
Due to a change made by the site, yt-dlp is no longer able to support OAuth login for YouTube. [Read more](https://github.com/yt-dlp/yt-dlp/issues/11462#issuecomment-2471703090)
#### Core changes
- [Catch broken Cryptodome installations](https://github.com/yt-dlp/yt-dlp/commit/b83ca24eb72e1e558b0185bd73975586c0bc0546) ([#11486](https://github.com/yt-dlp/yt-dlp/issues/11486)) by [seproDev](https://github.com/seproDev)

View File

@@ -1294,6 +1294,7 @@ The available fields are:
- `playlist_uploader_id` (string): Nickname or id of the playlist uploader
- `playlist_channel` (string): Display name of the channel that uploaded the playlist
- `playlist_channel_id` (string): Identifier of the channel that uploaded the playlist
- `playlist_webpage_url` (string): URL of the playlist webpage
- `webpage_url` (string): A URL to the video webpage which, if given to yt-dlp, should yield the same result again
- `webpage_url_basename` (string): The basename of the webpage URL
- `webpage_url_domain` (string): The domain of the webpage URL
@@ -1760,7 +1761,7 @@ $ yt-dlp --replace-in-metadata "title,uploader" "[ _]" "-"
# EXTRACTOR ARGUMENTS
Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. E.g. `--extractor-args "youtube:player-client=mediaconnect,web;formats=incomplete" --extractor-args "funimation:version=uncut"`
Some extractors accept additional arguments which can be passed using `--extractor-args KEY:ARGS`. `ARGS` is a `;` (semicolon) separated string of `ARG=VAL1,VAL2`. E.g. `--extractor-args "youtube:player-client=tv,mweb;formats=incomplete" --extractor-args "funimation:version=uncut"`
Note: In CLI, `ARG` can use `-` instead of `_`; e.g. `youtube:player-client"` becomes `youtube:player_client"`
@@ -1769,7 +1770,7 @@ The following extractors use this feature:
#### youtube
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The main clients are `web`, `ios` and `android`, with variants `_music` and `_creator` (e.g. `ios_creator`); and `mweb`, `mediaconnect`, `android_vr`, `web_safari`, `web_embedded`, `tv` and `tv_embedded` with no variants. By default, `ios,mweb` is used, and `web_creator` is added as needed for age-gated videos when account age verification is required. Similarly, the `_music` variants are added for `music.youtube.com` URLs. Some clients, such as `web` and `android`, require a `po_token` for their formats to be downloadable. Some clients, such as the `_creator` variants, will only work with authentication. You can use `all` to use all the clients, and `default` for the default clients. You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=all,-web`
* `player_client`: Clients to extract video data from. The main clients are `web`, `ios` and `android`, with variants `_music` and `_creator` (e.g. `ios_creator`); and `mweb`, `android_vr`, `web_safari`, `web_embedded`, `tv` and `tv_embedded` with no variants. By default, `ios,mweb` is used, or `web_creator,mweb` is used when authenticating with cookies. The `_music` variants are added for `music.youtube.com` URLs. Some clients, such as `web` and `android`, require a `po_token` for their formats to be downloadable. Some clients, such as the `_creator` variants, will only work with authentication. Not all clients support authentication via cookies. You can use `all` to use all the clients, and `default` for the default clients. You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=all,-web`
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
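
The `KEY:ARGS` syntax described in the README hunk above can be illustrated with a small standalone parser. This is only a sketch of the documented format (`;`-separated `ARG=VAL1,VAL2` items, with `-` accepted in place of `_` in `ARG`); `parse_extractor_args` is a hypothetical helper and is not yt-dlp's own implementation.

```python
def parse_extractor_args(spec):
    """Split 'KEY:ARG=VAL1,VAL2;ARG2=VAL' into (key, {arg: [values]}).

    Hypothetical helper for illustration; not part of yt-dlp.
    """
    key, _, args = spec.partition(':')
    parsed = {}
    for item in filter(None, args.split(';')):
        name, _, values = item.partition('=')
        # The README notes that `-` may be used instead of `_` in ARG names
        name = name.strip().replace('-', '_')
        parsed[name] = values.split(',') if values else []
    return key, parsed


key, args = parse_extractor_args('youtube:player-client=tv,mweb;formats=incomplete')
print(key, args)
```

With the example string from the updated README line, the sketch yields the key `youtube` with `player_client` mapped to `tv` and `mweb`, and `formats` mapped to `incomplete`.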

View File

@@ -238,6 +238,6 @@
{
"action": "add",
"when": "52c0ffe40ad6e8404d93296f575007b05b04c686",
"short": "[priority] **Login with OAuth is no longer supported for YouTube**\nDue to a change made by the site, yt-dlp is longer able to support OAuth login for YouTube. [Read more](https://github.com/yt-dlp/yt-dlp/issues/11462#issuecomment-2471703090)"
"short": "[priority] **Login with OAuth is no longer supported for YouTube**\nDue to a change made by the site, yt-dlp is no longer able to support OAuth login for YouTube. [Read more](https://github.com/yt-dlp/yt-dlp/issues/11462#issuecomment-2471703090)"
}
]

View File

@@ -52,7 +52,7 @@ default = [
"pycryptodomex",
"requests>=2.32.2,<3",
"urllib3>=1.26.17,<3",
"websockets>=13.0,<14",
"websockets>=13.0",
]
curl-cffi = [
"curl-cffi==0.5.10; os_name=='nt' and implementation_name=='cpython'",
@@ -76,7 +76,7 @@ dev = [
]
static-analysis = [
"autopep8~=2.0",
"ruff~=0.7.0",
"ruff~=0.8.0",
]
test = [
"pytest~=8.1",
@@ -186,6 +186,7 @@ ignore = [
"E501", # line-too-long
"E731", # lambda-assignment
"E741", # ambiguous-variable-name
"UP031", # printf-string-formatting
"UP036", # outdated-version-block
"B006", # mutable-argument-default
"B008", # function-call-in-default-argument
@@ -258,9 +259,6 @@ select = [
"A002", # builtin-argument-shadowing
"C408", # unnecessary-collection-call
]
"yt_dlp/jsinterp.py" = [
"UP031", # printf-string-formatting
]
[tool.ruff.lint.isort]
known-first-party = [

View File

@@ -216,7 +216,9 @@ class SocksWebSocketTestRequestHandler(SocksTestRequestHandler):
protocol = websockets.ServerProtocol()
connection = websockets.sync.server.ServerConnection(socket=self.request, protocol=protocol, close_timeout=0)
connection.handshake()
connection.send(json.dumps(self.socks_info))
for message in connection:
if message == 'socks_info':
connection.send(json.dumps(self.socks_info))
connection.close()
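
The test handler change above replaces an unconditional send with a reply that is only sent when the client asks for `socks_info`. A minimal sketch of that request/reply loop follows, assuming the `close()` call runs once the message loop ends (indentation is not visible in this view). `FakeConnection` and `serve` are hypothetical stand-ins and do not use the `websockets` API.

```python
import json


class FakeConnection:
    """Hypothetical stand-in for a synchronous websocket server connection."""

    def __init__(self, incoming):
        self._incoming = list(incoming)
        self.sent = []
        self.closed = False

    def __iter__(self):
        # Iterating a connection yields the messages received from the client
        return iter(self._incoming)

    def send(self, message):
        self.sent.append(message)

    def close(self):
        self.closed = True


def serve(connection, socks_info):
    # Reply only to an explicit 'socks_info' request, as in the updated handler
    for message in connection:
        if message == 'socks_info':
            connection.send(json.dumps(socks_info))
    connection.close()


conn = FakeConnection(['hello', 'socks_info'])
serve(conn, {'version': 5})
print(conn.sent, conn.closed)
```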

View File

@@ -1116,7 +1116,7 @@ class YoutubeDL:
def raise_no_formats(self, info, forced=False, *, msg=None):
has_drm = info.get('_has_drm')
ignored, expected = self.params.get('ignore_no_formats_error'), bool(msg)
msg = msg or has_drm and 'This video is DRM protected' or 'No video formats found!'
msg = msg or (has_drm and 'This video is DRM protected') or 'No video formats found!'
if forced or not ignored:
raise ExtractorError(msg, video_id=info['id'], ie=info['extractor'],
expected=has_drm or ignored or expected)
@@ -1947,6 +1947,7 @@ class YoutubeDL:
'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_channel': ie_result.get('channel'),
'playlist_channel_id': ie_result.get('channel_id'),
'playlist_webpage_url': ie_result.get('webpage_url'),
**kwargs,
}
if strict:
@@ -2195,7 +2196,7 @@ class YoutubeDL:
def _default_format_spec(self, info_dict):
prefer_best = (
self.params['outtmpl']['default'] == '-'
or info_dict.get('is_live') and not self.params.get('live_from_start'))
or (info_dict.get('is_live') and not self.params.get('live_from_start')))
def can_merge():
merger = FFmpegMergerPP(self)
@@ -2364,7 +2365,7 @@ class YoutubeDL:
vexts=[f['ext'] for f in video_fmts],
aexts=[f['ext'] for f in audio_fmts],
preferences=(try_call(lambda: self.params['merge_output_format'].split('/'))
or self.params.get('prefer_free_formats') and ('webm', 'mkv')))
or (self.params.get('prefer_free_formats') and ('webm', 'mkv'))))
filtered = lambda *keys: filter(None, (traverse_obj(fmt, *keys) for fmt in formats_info))
@@ -3540,8 +3541,8 @@ class YoutubeDL:
and info_dict.get('container') == 'm4a_dash',
'writing DASH m4a. Only some players support this container',
FFmpegFixupM4aPP)
ffmpeg_fixup(downloader == 'hlsnative' and not self.params.get('hls_use_mpegts')
or info_dict.get('is_live') and self.params.get('hls_use_mpegts') is None,
ffmpeg_fixup((downloader == 'hlsnative' and not self.params.get('hls_use_mpegts'))
or (info_dict.get('is_live') and self.params.get('hls_use_mpegts') is None),
'Possible MPEG-TS in MP4 container or malformed AAC timestamps',
FFmpegFixupM3u8PP)
ffmpeg_fixup(downloader == 'dashsegments'
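
Several hunks in this comparison only add parentheses around `and` groups inside `or` chains. Because `and` binds more tightly than `or` in Python, this makes the grouping explicit without changing behavior. The sketch below checks that equivalence and reuses the message expression from `raise_no_formats`; the two function names are hypothetical.

```python
from itertools import product


def pick_message(msg, has_drm):
    # Parenthesized form, as in the updated line
    return msg or (has_drm and 'This video is DRM protected') or 'No video formats found!'


def pick_message_unparenthesized(msg, has_drm):
    # Original form; `and` is evaluated before `or`
    return msg or has_drm and 'This video is DRM protected' or 'No video formats found!'


# Exhaustive check over booleans: `a or b and c` is `a or (b and c)`
for a, b, c in product([False, True], repeat=3):
    assert (a or b and c) == (a or (b and c))

# Both forms pick the same message for every input combination tried here
for msg, has_drm in product([None, 'custom message'], [None, False, True]):
    assert pick_message(msg, has_drm) == pick_message_unparenthesized(msg, has_drm)
```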

View File

@@ -1062,7 +1062,7 @@ def _real_main(argv=None):
# If we only have a single process attached, then the executable was double clicked
# When using `pyinstaller` with `--onefile`, two processes get attached
is_onefile = hasattr(sys, '_MEIPASS') and os.path.basename(sys._MEIPASS).startswith('_MEI')
if attached_processes == 1 or is_onefile and attached_processes == 2:
if attached_processes == 1 or (is_onefile and attached_processes == 2):
print(parser._generate_error_message(
'Do not double-click the executable, instead call it from a command line.\n'
'Please read the README for further information on how to use yt-dlp: '
@@ -1109,9 +1109,9 @@ def main(argv=None):
from .extractor import gen_extractors, list_extractors
__all__ = [
'main',
'YoutubeDL',
'parse_options',
'gen_extractors',
'list_extractors',
'main',
'parse_options',
]

View File

@@ -534,19 +534,17 @@ def ghash(subkey, data):
__all__ = [
'aes_cbc_decrypt',
'aes_cbc_decrypt_bytes',
'aes_ctr_decrypt',
'aes_decrypt_text',
'aes_decrypt',
'aes_ecb_decrypt',
'aes_gcm_decrypt_and_verify',
'aes_gcm_decrypt_and_verify_bytes',
'aes_cbc_encrypt',
'aes_cbc_encrypt_bytes',
'aes_ctr_decrypt',
'aes_ctr_encrypt',
'aes_decrypt',
'aes_decrypt_text',
'aes_ecb_decrypt',
'aes_ecb_encrypt',
'aes_encrypt',
'aes_gcm_decrypt_and_verify',
'aes_gcm_decrypt_and_verify_bytes',
'key_expansion',
'pad_block',
'pkcs7_padding',

View File

@@ -1276,8 +1276,8 @@ class YoutubeDLCookieJar(http.cookiejar.MozillaCookieJar):
def _really_save(self, f, ignore_discard, ignore_expires):
now = time.time()
for cookie in self:
if (not ignore_discard and cookie.discard
or not ignore_expires and cookie.is_expired(now)):
if ((not ignore_discard and cookie.discard)
or (not ignore_expires and cookie.is_expired(now))):
continue
name, value = cookie.name, cookie.value
if value is None:

View File

@@ -119,12 +119,12 @@ class HlsFD(FragmentFD):
self.to_screen(f'[{self.FD_NAME}] Fragment downloads will be delegated to {real_downloader.get_basename()}')
def is_ad_fragment_start(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s
or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad'))
return ((s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s)
or (s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad')))
def is_ad_fragment_end(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=master' in s
or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',segment'))
return ((s.startswith('#ANVATO-SEGMENT-INFO') and 'type=master' in s)
or (s.startswith('#UPLYNK-SEGMENT') and s.endswith(',segment')))
fragments = []

View File

@@ -123,8 +123,8 @@ class YoutubeLiveChatFD(FragmentFD):
data,
lambda x: x['continuationContents']['liveChatContinuation'], dict) or {}
func = (info_dict['protocol'] == 'youtube_live_chat' and parse_actions_live
or frag_index == 1 and try_refresh_replay_beginning
func = ((info_dict['protocol'] == 'youtube_live_chat' and parse_actions_live)
or (frag_index == 1 and try_refresh_replay_beginning)
or parse_actions_replay)
return (True, *func(live_chat_continuation))
except HTTPError as err:

View File

@@ -232,7 +232,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
error = self._parse_json(e.cause.response.read(), video_id)
message = error.get('message')
if e.cause.code == 403 and error.get('code') == 'player-bad-geolocation-country':
if e.cause.status == 403 and error.get('code') == 'player-bad-geolocation-country':
self.raise_geo_restricted(msg=message)
raise ExtractorError(message)
else:

View File

@@ -18,7 +18,6 @@ from ..utils import (
InAdvancePagedList,
OnDemandPagedList,
bool_or_none,
clean_html,
determine_ext,
filter_dict,
float_or_none,
@@ -63,7 +62,7 @@ class BilibiliBaseIE(InfoExtractor):
'support_formats', lambda _, v: v['quality'] not in parsed_qualities))], delim=', ')
if missing_formats:
self.to_screen(
f'Format(s) {missing_formats} are missing; you have to login or '
f'Format(s) {missing_formats} are missing; you have to '
f'become a premium member to download them. {self._login_hint()}')
def extract_formats(self, play_info):
@@ -165,14 +164,18 @@ class BilibiliBaseIE(InfoExtractor):
params['w_rid'] = hashlib.md5(f'{query}{self._get_wbi_key(video_id)}'.encode()).hexdigest()
return params
def _download_playinfo(self, bvid, cid, headers=None, qn=None):
params = {'bvid': bvid, 'cid': cid, 'fnval': 4048}
if qn:
params['qn'] = qn
def _download_playinfo(self, bvid, cid, headers=None, query=None):
params = {'bvid': bvid, 'cid': cid, 'fnval': 4048, **(query or {})}
if self.is_logged_in:
params.pop('try_look', None)
if qn := params.get('qn'):
note = f'Downloading video format {qn} for cid {cid}'
else:
note = f'Downloading video formats for cid {cid}'
return self._download_json(
'https://api.bilibili.com/x/player/wbi/playurl', bvid,
query=self._sign_wbi(params, bvid), headers=headers,
note=f'Downloading video formats for cid {cid} {qn or ""}')['data']
query=self._sign_wbi(params, bvid), headers=headers, note=note)['data']
def json2srt(self, json_data):
srt_data = ''
@@ -191,7 +194,7 @@ class BilibiliBaseIE(InfoExtractor):
}
video_info = self._download_json(
'https://api.bilibili.com/x/player/v2', video_id,
'https://api.bilibili.com/x/player/wbi/v2', video_id,
query={'aid': aid, 'cid': cid} if aid else {'bvid': video_id, 'cid': cid},
note=f'Extracting subtitle info {cid}', headers=self._HEADERS)
if traverse_obj(video_info, ('data', 'need_login_subtitle')):
@@ -207,7 +210,7 @@ class BilibiliBaseIE(InfoExtractor):
def _get_chapters(self, aid, cid):
chapters = aid and cid and self._download_json(
'https://api.bilibili.com/x/player/v2', aid, query={'aid': aid, 'cid': cid},
'https://api.bilibili.com/x/player/wbi/v2', aid, query={'aid': aid, 'cid': cid},
note='Extracting chapters', fatal=False, headers=self._HEADERS)
return traverse_obj(chapters, ('data', 'view_points', ..., {
'title': 'content',
@@ -286,7 +289,7 @@ class BilibiliBaseIE(InfoExtractor):
('data', 'interaction', 'graph_version', {int_or_none}))
cid_edges = self._get_divisions(video_id, graph_version, {1: {'cid': cid}}, 1)
for cid, edges in cid_edges.items():
play_info = self._download_playinfo(video_id, cid, headers=headers)
play_info = self._download_playinfo(video_id, cid, headers=headers, query={'try_look': 1})
yield {
**metainfo,
'id': f'{video_id}_{cid}',
@@ -639,40 +642,29 @@ class BiliBiliIE(BilibiliBaseIE):
headers['Referer'] = url
initial_state = self._search_json(r'window\.__INITIAL_STATE__\s*=', webpage, 'initial state', video_id)
if traverse_obj(initial_state, ('error', 'trueCode')) == -403:
self.raise_login_required()
if traverse_obj(initial_state, ('error', 'trueCode')) == -404:
raise ExtractorError(
'This video may be deleted or geo-restricted. '
'You might want to try a VPN or a proxy server (with --proxy)', expected=True)
is_festival = 'videoData' not in initial_state
if is_festival:
video_data = initial_state['videoInfo']
else:
play_info_obj = self._search_json(
r'window\.__playinfo__\s*=', webpage, 'play info', video_id, fatal=False)
if not play_info_obj:
if traverse_obj(initial_state, ('error', 'trueCode')) == -403:
self.raise_login_required()
if traverse_obj(initial_state, ('error', 'trueCode')) == -404:
raise ExtractorError(
'This video may be deleted or geo-restricted. '
'You might want to try a VPN or a proxy server (with --proxy)', expected=True)
play_info = traverse_obj(play_info_obj, ('data', {dict}))
if not play_info:
if traverse_obj(play_info_obj, 'code') == 87007:
toast = get_element_by_class('tips-toast', webpage) or ''
msg = clean_html(
f'{get_element_by_class("belongs-to", toast) or ""}'
+ (get_element_by_class('level', toast) or ''))
raise ExtractorError(
f'This is a supporter-only video: {msg}. {self._login_hint()}', expected=True)
raise ExtractorError('Failed to extract play info')
video_data = initial_state['videoData']
video_id, title = video_data['bvid'], video_data.get('title')
# Bilibili anthologies are similar to playlists but all videos share the same video ID as the anthology itself.
page_list_json = not is_festival and traverse_obj(
page_list_json = (not is_festival and traverse_obj(
self._download_json(
'https://api.bilibili.com/x/player/pagelist', video_id,
fatal=False, query={'bvid': video_id, 'jsonp': 'jsonp'},
note='Extracting videos in anthology', headers=headers),
'data', expected_type=list) or []
'data', expected_type=list)) or []
is_anthology = len(page_list_json) > 1
part_id = int_or_none(parse_qs(url).get('p', [None])[-1])
@@ -689,10 +681,14 @@ class BiliBiliIE(BilibiliBaseIE):
old_video_id = format_field(aid, None, f'%s_part{part_id or 1}')
cid = traverse_obj(video_data, ('pages', part_id - 1, 'cid')) if part_id else video_data.get('cid')
play_info = (
traverse_obj(
self._search_json(r'window\.__playinfo__\s*=', webpage, 'play info', video_id, default=None),
('data', {dict}))
or self._download_playinfo(video_id, cid, headers=headers, query={'try_look': 1}))
festival_info = {}
if is_festival:
play_info = self._download_playinfo(video_id, cid, headers=headers)
festival_info = traverse_obj(initial_state, {
'uploader': ('videoInfo', 'upName'),
'uploader_id': ('videoInfo', 'upMid', {str_or_none}),
@@ -727,62 +723,72 @@ class BiliBiliIE(BilibiliBaseIE):
self._get_interactive_entries(video_id, cid, metainfo, headers=headers), **metainfo,
duration=traverse_obj(initial_state, ('videoData', 'duration', {int_or_none})),
__post_extractor=self.extract_comments(aid))
else:
formats = self.extract_formats(play_info)
if not traverse_obj(play_info, ('dash')):
# we only have legacy formats and need additional work
has_qn = lambda x: x in traverse_obj(formats, (..., 'quality'))
for qn in traverse_obj(play_info, ('accept_quality', lambda _, v: not has_qn(v), {int})):
formats.extend(traverse_obj(
self.extract_formats(self._download_playinfo(video_id, cid, headers=headers, qn=qn)),
lambda _, v: not has_qn(v['quality'])))
self._check_missing_formats(play_info, formats)
flv_formats = traverse_obj(formats, lambda _, v: v['fragments'])
if flv_formats and len(flv_formats) < len(formats):
# Flv and mp4 are incompatible due to `multi_video` workaround, so drop one
if not self._configuration_arg('prefer_multi_flv'):
dropped_fmts = ', '.join(
f'{f.get("format_note")} ({f.get("format_id")})' for f in flv_formats)
formats = traverse_obj(formats, lambda _, v: not v.get('fragments'))
if dropped_fmts:
self.to_screen(
f'Dropping incompatible flv format(s) {dropped_fmts} since mp4 is available. '
'To extract flv, pass --extractor-args "bilibili:prefer_multi_flv"')
else:
formats = traverse_obj(
# XXX: Filtering by extractor-arg is for testing purposes
formats, lambda _, v: v['quality'] == int(self._configuration_arg('prefer_multi_flv')[0]),
) or [max(flv_formats, key=lambda x: x['quality'])]
formats = self.extract_formats(play_info)
if traverse_obj(formats, (0, 'fragments')):
# We have flv formats, which are individual short videos with their own timestamps and metainfo
# Binary concatenation corrupts their timestamps, so we need a `multi_video` workaround
return {
**metainfo,
'_type': 'multi_video',
'entries': [{
'id': f'{metainfo["id"]}_{idx}',
'title': metainfo['title'],
'http_headers': metainfo['http_headers'],
'formats': [{
**fragment,
'format_id': formats[0].get('format_id'),
}],
'subtitles': self.extract_subtitles(video_id, cid) if idx == 0 else None,
'__post_extractor': self.extract_comments(aid) if idx == 0 else None,
} for idx, fragment in enumerate(formats[0]['fragments'])],
'duration': float_or_none(play_info.get('timelength'), scale=1000),
}
else:
return {
**metainfo,
'formats': formats,
'duration': float_or_none(play_info.get('timelength'), scale=1000),
'chapters': self._get_chapters(aid, cid),
'subtitles': self.extract_subtitles(video_id, cid),
'__post_extractor': self.extract_comments(aid),
}
if video_data.get('is_upower_exclusive'):
high_level = traverse_obj(initial_state, ('elecFullInfo', 'show_info', 'high_level', {dict})) or {}
msg = f'{join_nonempty("title", "sub_title", from_dict=high_level, delim="")}. {self._login_hint()}'
if not formats:
raise ExtractorError(f'This is a supporter-only video: {msg}', expected=True)
if '试看' in traverse_obj(play_info, ('accept_description', ..., {str})):
self.report_warning(
f'This is a supporter-only video, only the preview will be extracted: {msg}',
video_id=video_id)
if not traverse_obj(play_info, 'dash'):
# we only have legacy formats and need additional work
has_qn = lambda x: x in traverse_obj(formats, (..., 'quality'))
for qn in traverse_obj(play_info, ('accept_quality', lambda _, v: not has_qn(v), {int})):
formats.extend(traverse_obj(
self.extract_formats(self._download_playinfo(video_id, cid, headers=headers, query={'qn': qn})),
lambda _, v: not has_qn(v['quality'])))
self._check_missing_formats(play_info, formats)
flv_formats = traverse_obj(formats, lambda _, v: v['fragments'])
if flv_formats and len(flv_formats) < len(formats):
# Flv and mp4 are incompatible due to `multi_video` workaround, so drop one
if not self._configuration_arg('prefer_multi_flv'):
dropped_fmts = ', '.join(
f'{f.get("format_note")} ({f.get("format_id")})' for f in flv_formats)
formats = traverse_obj(formats, lambda _, v: not v.get('fragments'))
if dropped_fmts:
self.to_screen(
f'Dropping incompatible flv format(s) {dropped_fmts} since mp4 is available. '
'To extract flv, pass --extractor-args "bilibili:prefer_multi_flv"')
else:
formats = traverse_obj(
# XXX: Filtering by extractor-arg is for testing purposes
formats, lambda _, v: v['quality'] == int(self._configuration_arg('prefer_multi_flv')[0]),
) or [max(flv_formats, key=lambda x: x['quality'])]
if traverse_obj(formats, (0, 'fragments')):
# We have flv formats, which are individual short videos with their own timestamps and metainfo
# Binary concatenation corrupts their timestamps, so we need a `multi_video` workaround
return {
**metainfo,
'_type': 'multi_video',
'entries': [{
'id': f'{metainfo["id"]}_{idx}',
'title': metainfo['title'],
'http_headers': metainfo['http_headers'],
'formats': [{
**fragment,
'format_id': formats[0].get('format_id'),
}],
'subtitles': self.extract_subtitles(video_id, cid) if idx == 0 else None,
'__post_extractor': self.extract_comments(aid) if idx == 0 else None,
} for idx, fragment in enumerate(formats[0]['fragments'])],
'duration': float_or_none(play_info.get('timelength'), scale=1000),
}
return {
**metainfo,
'formats': formats,
'duration': float_or_none(play_info.get('timelength'), scale=1000),
'chapters': self._get_chapters(aid, cid),
'subtitles': self.extract_subtitles(video_id, cid),
'__post_extractor': self.extract_comments(aid),
}
class BiliBiliBangumiIE(BilibiliBaseIE):
@@ -860,10 +866,16 @@ class BiliBiliBangumiIE(BilibiliBaseIE):
self.raise_login_required('This video is for premium members only')
headers['Referer'] = url
play_info = self._download_json(
'https://api.bilibili.com/pgc/player/web/v2/playurl', episode_id,
'Extracting episode', query={'fnval': '4048', 'ep_id': episode_id},
headers=headers)
play_info = (
self._search_json(
r'playurlSSRData\s*=', webpage, 'embedded page info', episode_id,
end_pattern='\n', default=None)
or self._download_json(
'https://api.bilibili.com/pgc/player/web/v2/playurl', episode_id,
'Extracting episode', query={'fnval': 12240, 'ep_id': episode_id},
headers=headers))
premium_only = play_info.get('code') == -10403
play_info = traverse_obj(play_info, ('result', 'video_info', {dict})) or {}


@@ -59,16 +59,15 @@ class ChaturbateIE(InfoExtractor):
'Accept': 'application/json',
}, fatal=False, impersonate=True) or {}
status = response.get('room_status')
if status != 'public':
if error := self._ERROR_MAP.get(status):
raise ExtractorError(error, expected=True)
self.report_warning('Falling back to webpage extraction')
return None
m3u8_url = response.get('url')
if not m3u8_url:
self.raise_geo_restricted()
status = response.get('room_status')
if error := self._ERROR_MAP.get(status):
raise ExtractorError(error, expected=True)
if status == 'public':
self.raise_geo_restricted()
self.report_warning(f'Got status "{status}" from API; falling back to webpage extraction')
return None
return {
'id': video_id,


@@ -1854,12 +1854,26 @@ class InfoExtractor:
@staticmethod
def _remove_duplicate_formats(formats):
format_urls = set()
seen_urls = set()
seen_fragment_urls = set()
unique_formats = []
for f in formats:
if f['url'] not in format_urls:
format_urls.add(f['url'])
fragments = f.get('fragments')
if callable(fragments):
unique_formats.append(f)
elif fragments:
fragment_urls = frozenset(
fragment.get('url') or urljoin(f['fragment_base_url'], fragment['path'])
for fragment in fragments)
if fragment_urls not in seen_fragment_urls:
seen_fragment_urls.add(fragment_urls)
unique_formats.append(f)
elif f['url'] not in seen_urls:
seen_urls.add(f['url'])
unique_formats.append(f)
formats[:] = unique_formats
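The new de-duplication logic above can be exercised outside the extractor. The following is a minimal standalone sketch, assuming the standard library's `urllib.parse.urljoin` is an acceptable stand-in for the `urljoin` helper used in the hunk (the two may differ in edge cases); the function name and sample URLs are illustrative only.

```python
from urllib.parse import urljoin  # assumption: stand-in for the helper used in the diff


def remove_duplicate_formats(formats):
    """De-duplicate in place: by fragment URL set when fragments exist, else by URL."""
    seen_urls = set()
    seen_fragment_urls = set()
    unique_formats = []
    for f in formats:
        fragments = f.get('fragments')
        if callable(fragments):
            # Lazily generated fragments cannot be compared, so always keep them
            unique_formats.append(f)
        elif fragments:
            fragment_urls = frozenset(
                fragment.get('url') or urljoin(f['fragment_base_url'], fragment['path'])
                for fragment in fragments)
            if fragment_urls not in seen_fragment_urls:
                seen_fragment_urls.add(fragment_urls)
                unique_formats.append(f)
        elif f['url'] not in seen_urls:
            seen_urls.add(f['url'])
            unique_formats.append(f)
    formats[:] = unique_formats


sample_formats = [
    {'url': 'https://example.com/a.mp4'},
    {'url': 'https://example.com/a.mp4'},  # duplicate by URL
    {'url': 'https://example.com/list', 'fragments': [
        {'url': 'https://example.com/1.flv'}, {'url': 'https://example.com/2.flv'}]},
    {'url': 'https://example.com/list', 'fragments': [{'url': 'https://example.com/3.flv'}]},
    {'url': 'https://example.com/list', 'fragment_base_url': 'https://example.com/',
     'fragments': [{'path': '3.flv'}]},  # same fragment set as the previous entry
]
remove_duplicate_formats(sample_formats)
```

Two fragmented formats that share the same top-level `url` are both kept as long as their fragment sets differ, which the old URL-only check would not have allowed.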
def _is_valid_url(self, url, video_id, item='video', headers={}):
@@ -3789,7 +3803,7 @@ class InfoExtractor:
def mark_watched(self, *args, **kwargs):
if not self.get_param('mark_watched', False):
return
if self.supports_login() and self._get_login_info()[0] is not None or self._cookies_passed:
if (self.supports_login() and self._get_login_info()[0] is not None) or self._cookies_passed:
self._mark_watched(*args, **kwargs)
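The added parentheses in the condition above do not change behaviour, since `and` already binds tighter than `or` in Python. A small sketch (function names are hypothetical) that checks this exhaustively over booleans, and shows which inputs would differ under the other possible grouping:

```python
import itertools


def without_parens(a, b, c):
    return a and b or c


def with_parens(a, b, c):
    return (a and b) or c


def other_grouping(a, b, c):
    # Not what Python does by default; shown only for contrast
    return a and (b or c)


cases = list(itertools.product([False, True], repeat=3))
same = all(without_parens(*case) == with_parens(*case) for case in cases)
differs = [case for case in cases if without_parens(*case) != other_grouping(*case)]
```

Here `same` is true for all eight combinations, so the change appears to be purely for readability.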
def _mark_watched(self, *args, **kwargs):


@@ -1,7 +1,4 @@
import time
from .common import InfoExtractor
from ..networking import HEADRequest
from ..utils import int_or_none
@@ -31,9 +28,6 @@ class CultureUnpluggedIE(InfoExtractor):
video_id = mobj.group('id')
display_id = mobj.group('display_id') or video_id
# request setClientTimezone.php to get the PHPSESSID cookie, which is needed to get valid JSON data in the next request
self._request_webpage(HEADRequest(
'http://www.cultureunplugged.com/setClientTimezone.php?timeOffset=%d' % -(time.timezone / 3600)), display_id)
movie_data = self._download_json(
f'http://www.cultureunplugged.com/movie-data/cu-{video_id}.json', display_id)


@@ -1,3 +1,4 @@
import functools
import hashlib
import re
import time
@@ -51,6 +52,15 @@ class DacastVODIE(DacastBaseIE):
'thumbnail': 'https://universe-files.dacast.com/26137208-5858-65c1-5e9a-9d6b6bd2b6c2',
},
'params': {'skip_download': 'm3u8'},
}, { # /uspaes/ in hls_url
'url': 'https://iframe.dacast.com/vod/f9823fc6-faba-b98f-0d00-4a7b50a58c5b/348c5c84-b6af-4859-bb9d-1d01009c795b',
'info_dict': {
'id': '348c5c84-b6af-4859-bb9d-1d01009c795b',
'ext': 'mp4',
'title': 'pl1-edyta-rubas-211124.mp4',
'uploader_id': 'f9823fc6-faba-b98f-0d00-4a7b50a58c5b',
'thumbnail': 'https://universe-files.dacast.com/4d0bd042-a536-752d-fc34-ad2fa44bbcbb.png',
},
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.dacast.com/support/knowledgebase/how-can-i-embed-a-video-on-my-website/',
@@ -74,6 +84,15 @@ class DacastVODIE(DacastBaseIE):
'params': {'skip_download': 'm3u8'},
}]
@functools.cached_property
def _usp_signing_secret(self):
player_js = self._download_webpage(
'https://player.dacast.com/js/player.js', None, 'Downloading player JS')
# Rotates every so often, but hardcode a fallback in case of JS change/breakage before rotation
return self._search_regex(
r'\bUSP_SIGNING_SECRET\s*=\s*(["\'])(?P<secret>(?:(?!\1).)+)', player_js,
'usp signing secret', group='secret', fatal=False) or 'odnInCGqhvtyRTtIiddxtuRtawYYICZP'
def _real_extract(self, url):
user_id, video_id = self._match_valid_url(url).group('user_id', 'id')
query = {'contentId': f'{user_id}-vod-{video_id}', 'provider': 'universe'}
@@ -94,10 +113,10 @@ class DacastVODIE(DacastBaseIE):
if 'DRM_EXT' in hls_url:
self.report_drm(video_id)
elif '/uspaes/' in hls_url:
# From https://player.dacast.com/js/player.js
# Ref: https://player.dacast.com/js/player.js
ts = int(time.time())
signature = hashlib.sha1(
f'{10413792000 - ts}{ts}YfaKtquEEpDeusCKbvYszIEZnWmBcSvw').digest().hex()
f'{10413792000 - ts}{ts}{self._usp_signing_secret}'.encode()).digest().hex()
hls_aes['uri'] = f'https://keys.dacast.com/uspaes/{video_id}.key?s={signature}&ts={ts}'
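The key URL signing shown above can be sketched as a standalone function. This assumes only what the hunk shows: the SHA-1 input is the concatenation of `10413792000 - ts`, `ts` and the secret. The function name and the placeholder secret are hypothetical. Note that `hashlib.sha1` only accepts bytes, which is why the string is encoded first.

```python
import hashlib
import time


def sign_usp_key_request(video_id, secret, ts=None):
    """Build the key URL the way the hunk does; `ts` may be fixed for reproducibility."""
    ts = int(time.time()) if ts is None else ts
    payload = f'{10413792000 - ts}{ts}{secret}'
    # hashlib requires bytes, so the str payload must be encoded before hashing
    signature = hashlib.sha1(payload.encode()).digest().hex()
    return f'https://keys.dacast.com/uspaes/{video_id}.key?s={signature}&ts={ts}'


key_url = sign_usp_key_request('example-video-id', 'placeholder-secret', ts=1733256000)
```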
for retry in self.RetryManager():


@@ -261,6 +261,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'tags': [],
'view_count': int,
'like_count': int,
'thumbnail': r're:https://\w+.dmcdn.net/v/WnEY61cmvMxt2Fi6d/x1080',
},
}, {
# https://geo.dailymotion.com/player/xf7zn.html?playlist=x7wdsj
@@ -288,6 +289,25 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'description': 'À bord du « véloto », l’alternative à la voiture pour la campagne',
'tags': ['biclou', 'vélo', 'véloto', 'campagne', 'voiture', 'environnement', 'véhicules intermédiaires'],
},
}, {
# https://geo.dailymotion.com/player/xry80.html?video=x8vu47w
'url': 'https://www.metatube.com/en/videos/546765/This-frogs-decorates-Christmas-tree/',
'info_dict': {
'id': 'x8vu47w',
'ext': 'mp4',
'like_count': int,
'uploader': 'Metatube',
'thumbnail': r're:https://\w+.dmcdn.net/v/W1G_S1coGSFTfkTeR/x1080',
'upload_date': '20240326',
'view_count': int,
'timestamp': 1711496732,
'age_limit': 0,
'uploader_id': 'x2xpy74',
'title': 'Está lindas ranitas ponen su arbolito',
'duration': 28,
'description': 'Que lindura',
'tags': [],
},
}]
_GEO_BYPASS = False
_COMMON_MEDIA_FIELDS = '''description
@@ -302,7 +322,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
yield from super()._extract_embed_urls(url, webpage)
for mobj in re.finditer(
r'(?s)DM\.player\([^,]+,\s*{.*?video[\'"]?\s*:\s*["\']?(?P<id>[0-9a-zA-Z]+).+?}\s*\);', webpage):
yield from 'https://www.dailymotion.com/embed/video/' + mobj.group('id')
yield 'https://www.dailymotion.com/embed/video/' + mobj.group('id')
for mobj in re.finditer(
r'(?s)<script [^>]*\bsrc=(["\'])(?:https?:)?//[\w-]+\.dailymotion\.com/player/(?:(?!\1).)+\1[^>]*>', webpage):
attrs = extract_attributes(mobj.group(0))
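The `yield from` to `yield` change in this hunk matters because `yield from` iterates its operand, and iterating a string produces single characters. A minimal sketch of the difference (generator names are hypothetical):

```python
def embed_urls_buggy(video_id):
    # `yield from` on a str yields it one character at a time
    yield from 'https://www.dailymotion.com/embed/video/' + video_id


def embed_urls_fixed(video_id):
    # `yield` produces the whole URL as a single item
    yield 'https://www.dailymotion.com/embed/video/' + video_id


buggy = list(embed_urls_buggy('x8vu47w'))
fixed = list(embed_urls_fixed('x8vu47w'))
```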


@@ -48,32 +48,30 @@ class DropboxIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
fn = urllib.parse.unquote(url_basename(url))
title = os.path.splitext(fn)[0]
password = self.get_param('videopassword')
content_id = None
for part in self._yield_decoded_parts(webpage):
if '/sm/password' in part:
webpage = self._download_webpage(
update_url('https://www.dropbox.com/sm/password', query=part.partition('?')[2]), video_id)
content_id = self._search_regex(r'content_id=([\w.+=/-]+)', part, 'content ID')
break
if (self._og_search_title(webpage, default=None) == 'Dropbox - Password Required'
or 'Enter the password for this link' in webpage):
if password:
response = self._download_json(
'https://www.dropbox.com/sm/auth', video_id, 'POSTing video password',
headers={'content-type': 'application/x-www-form-urlencoded; charset=UTF-8'},
data=urlencode_postdata({
'is_xhr': 'true',
't': self._get_cookies('https://www.dropbox.com')['t'].value,
'content_id': self._search_regex(r'content_id=([\w.+=/-]+)["\']', webpage, 'content id'),
'password': password,
'url': url,
}))
if response.get('status') != 'authed':
raise ExtractorError('Invalid password', expected=True)
elif not self._get_cookies('https://dropbox.com').get('sm_auth'):
if content_id:
password = self.get_param('videopassword')
if not password:
raise ExtractorError('Password protected video, use --video-password <password>', expected=True)
response = self._download_json(
'https://www.dropbox.com/sm/auth', video_id, 'POSTing video password',
data=urlencode_postdata({
'is_xhr': 'true',
't': self._get_cookies('https://www.dropbox.com')['t'].value,
'content_id': content_id,
'password': password,
'url': update_url(url, scheme='', netloc=''),
}))
if response.get('status') != 'authed':
raise ExtractorError('Invalid password', expected=True)
webpage = self._download_webpage(url, video_id)
formats, subtitles = [], {}


@@ -5,15 +5,16 @@ from ..utils import (
get_element_text_and_html_by_tag,
int_or_none,
join_nonempty,
parse_qs,
str_or_none,
try_call,
unified_timestamp,
)
from ..utils.traversal import traverse_obj
from ..utils.traversal import traverse_obj, value
class DuoplayIE(InfoExtractor):
_VALID_URL = r'https?://duoplay\.ee/(?P<id>\d+)/[\w-]+/?(?:\?(?:[^#]+&)?ep=(?P<ep>\d+))?'
_VALID_URL = r'https?://duoplay\.ee/(?P<id>\d+)(?:[/?#]|$)'
_TESTS = [{
'note': 'Siberi võmm S02E12',
'url': 'https://duoplay.ee/4312/siberi-vomm?ep=24',
@@ -34,15 +35,16 @@ class DuoplayIE(InfoExtractor):
'episode_number': 12,
'episode_id': '24',
},
'skip': 'No video found',
}, {
'note': 'Empty title',
'url': 'https://duoplay.ee/17/uhikarotid?ep=14',
'md5': '6aca68be71112314738dd17cced7f8bf',
'md5': 'cba9f5dabf2582b224d80ac44fb80e47',
'info_dict': {
'id': '17_14',
'ext': 'mp4',
'title': 'Ühikarotid',
'thumbnail': r're:https://.+\.jpg(?:\?c=\d+)?$',
'title': 'Episode 14',
'thumbnail': r're:https?://.+\.jpg',
'description': 'md5:4719b418e058c209def41d48b601276e',
'upload_date': '20100916',
'timestamp': 1284661800,
@@ -52,6 +54,8 @@ class DuoplayIE(InfoExtractor):
'season_number': 2,
'episode_id': '14',
'release_year': 2010,
'episode': 'Episode 14',
'episode_number': 14,
},
}, {
'note': 'Movie without expiry',
@@ -68,10 +72,32 @@ class DuoplayIE(InfoExtractor):
'timestamp': 1671054000,
'release_year': 2018,
},
'skip': 'No video found',
}, {
'note': 'Episode url without show name',
'url': 'https://duoplay.ee/9644?ep=185',
'md5': '63f324b4fe2dbd8194dca16a6d52184a',
'info_dict': {
'id': '9644_185',
'ext': 'mp4',
'title': 'Episode 185',
'thumbnail': r're:https?://.+\.jpg',
'description': 'md5:ed25ba4e9e5d54bc291a4a0cdd241467',
'upload_date': '20241120',
'timestamp': 1732077000,
'episode': 'Episode 63',
'episode_id': '185',
'episode_number': 63,
'season': 'Season 2',
'season_number': 2,
'series': 'Telehommik',
'series_id': '9644',
},
}]
def _real_extract(self, url):
telecast_id, episode = self._match_valid_url(url).group('id', 'ep')
telecast_id = self._match_id(url)
episode = traverse_obj(parse_qs(url), ('ep', 0, {int_or_none}, {str_or_none}))
video_id = join_nonempty(telecast_id, episode, delim='_')
webpage = self._download_webpage(url, video_id)
video_player = try_call(lambda: extract_attributes(
@@ -79,25 +105,33 @@ class DuoplayIE(InfoExtractor):
if not video_player or not video_player.get('manifest-url'):
raise ExtractorError('No video found', expected=True)
manifest_url = video_player['manifest-url']
session_token = self._download_json(
'https://sts.postimees.ee/session/register', video_id, 'Registering session',
'Unable to register session', headers={
'Accept': 'application/json',
'X-Original-URI': manifest_url,
})['session']
episode_attr = self._parse_json(video_player.get(':episode') or '', video_id, fatal=False) or {}
return {
'id': video_id,
'formats': self._extract_m3u8_formats(video_player['manifest-url'], video_id, 'mp4'),
'formats': self._extract_m3u8_formats(manifest_url, video_id, 'mp4', query={'s': session_token}),
**traverse_obj(episode_attr, {
'title': 'title',
'description': 'synopsis',
'title': ('title', {str}),
'description': ('synopsis', {str}),
'thumbnail': ('images', 'original'),
'timestamp': ('airtime', {lambda x: unified_timestamp(x + ' +0200')}),
'cast': ('cast', {lambda x: x.split(', ')}),
'cast': ('cast', filter, {lambda x: x.split(', ')}),
'release_year': ('year', {int_or_none}),
}),
**(traverse_obj(episode_attr, {
'title': (None, ('subtitle', ('episode_nr', {lambda x: f'Episode {x}' if x else None}))),
'series': 'title',
'title': (None, (('subtitle', {str}, filter), {value(f'Episode {episode}' if episode else None)})),
'series': ('title', {str}),
'series_id': ('telecast_id', {str_or_none}),
'season_number': ('season_id', {int_or_none}),
'episode': 'subtitle',
'episode': ('subtitle', {str}, filter),
'episode_number': ('episode_nr', {int_or_none}),
'episode_id': ('episode_id', {str_or_none}),
}, get_all=False) if episode_attr.get('category') != 'movies' else {}),
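With the simplified `_VALID_URL`, the episode number is now read from the query string rather than captured by the regex. The sketch below approximates that with the standard library only; `build_video_id` is a hypothetical helper, and the `int` round trip stands in for the `{int_or_none}, {str_or_none}` chain in the hunk.

```python
import re
from urllib.parse import parse_qs, urlparse

VALID_URL = r'https?://duoplay\.ee/(?P<id>\d+)(?:[/?#]|$)'


def build_video_id(url):
    telecast_id = re.match(VALID_URL, url).group('id')
    ep_values = parse_qs(urlparse(url).query).get('ep', [])
    try:
        # Normalise through int so that only numeric episode values are accepted
        episode = str(int(ep_values[0]))
    except (IndexError, ValueError):
        episode = None
    # Join only the parts that are present, like join_nonempty(..., delim='_')
    return '_'.join(filter(None, (telecast_id, episode)))
```

This also covers URLs without a show name in the path, such as the new `https://duoplay.ee/9644?ep=185` test case.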


@@ -50,7 +50,7 @@ class FacebookIE(InfoExtractor):
[^/]+/videos/(?:[^/]+/)?|
[^/]+/posts/|
events/(?:[^/]+/)?|
groups/[^/]+/(?:permalink|posts)/|
groups/[^/]+/(?:permalink|posts)/(?:[\da-f]+/)?|
watchparty/
)|
facebook:
@@ -410,6 +410,9 @@ class FacebookIE(InfoExtractor):
'uploader': 'Comitato Liberi Pensatori',
'uploader_id': '100065709540881',
},
}, {
'url': 'https://www.facebook.com/groups/1513990329015294/posts/d41d8cd9/2013209885760000/?app=fbl',
'only_matching': True,
}]
_SUPPORTED_PAGLETS_REGEX = r'(?:pagelet_group_mall|permalink_video_pagelet|hyperfeed_story_id_[0-9a-f]+)'
_api_config = {


@@ -193,9 +193,9 @@ class FunimationIE(FunimationBaseIE):
for lang, version, fmt in self._get_experiences(episode):
experience_id = str(fmt['experienceId'])
if (only_initial_experience and experience_id != initial_experience_id
or requested_languages and lang.lower() not in requested_languages
or requested_versions and version.lower() not in requested_versions):
if ((only_initial_experience and experience_id != initial_experience_id)
or (requested_languages and lang.lower() not in requested_languages)
or (requested_versions and version.lower() not in requested_versions)):
continue
thumbnails.append({'url': fmt.get('poster')})
duration = max(duration, fmt.get('duration', 0))


@@ -254,7 +254,7 @@ class InstagramIOSIE(InfoExtractor):
class InstagramIE(InstagramBaseIE):
_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com(?:/[^/]+)?/(?:p|tv|reels?(?!/audio/))/(?P<id>[^/?#&]+))'
_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com(?:/(?!share/)[^/?#]+)?/(?:p|tv|reels?(?!/audio/))/(?P<id>[^/?#&]+))'
_EMBED_REGEX = [r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//(?:www\.)?instagram\.com/p/[^/]+/embed.*?)\1']
_TESTS = [{
'url': 'https://instagram.com/p/aye83DjauH/?foo=bar#abc',
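The effect of the added `(?!share/)` lookahead can be checked directly with `re`. In this sketch both patterns are copied from the hunk; the share URL is an illustrative example, not one taken from the test suite.

```python
import re

OLD_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com(?:/[^/]+)?/(?:p|tv|reels?(?!/audio/))/(?P<id>[^/?#&]+))'
NEW_VALID_URL = r'(?P<url>https?://(?:www\.)?instagram\.com(?:/(?!share/)[^/?#]+)?/(?:p|tv|reels?(?!/audio/))/(?P<id>[^/?#&]+))'

share_url = 'https://www.instagram.com/share/reel/ABC123'  # hypothetical share link
post_url = 'https://instagram.com/p/aye83DjauH/?foo=bar#abc'

# The old pattern treats "share" as a username segment and captures the share token as the id
old_share_match = re.match(OLD_VALID_URL, share_url)
# The new pattern refuses "share/" in that position, leaving such URLs to a dedicated extractor
new_share_match = re.match(NEW_VALID_URL, share_url)
new_post_match = re.match(NEW_VALID_URL, post_url)
```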


@@ -26,6 +26,7 @@ class MicrosoftEmbedIE(InfoExtractor):
'timestamp': 1631658316,
'upload_date': '20210914',
},
'expected_warnings': ['Failed to parse XML: syntax error: line 1, column 0'],
}]
_API_URL = 'https://prod-video-cms-rt-microsoft-com.akamaized.net/vhs/api/videos/'
@@ -36,11 +37,11 @@ class MicrosoftEmbedIE(InfoExtractor):
formats = []
for source_type, source in metadata['streams'].items():
if source_type == 'smooth_Streaming':
formats.extend(self._extract_ism_formats(source['url'], video_id, 'mss'))
formats.extend(self._extract_ism_formats(source['url'], video_id, 'mss', fatal=False))
elif source_type == 'apple_HTTP_Live_Streaming':
formats.extend(self._extract_m3u8_formats(source['url'], video_id, 'mp4'))
formats.extend(self._extract_m3u8_formats(source['url'], video_id, 'mp4', fatal=False))
elif source_type == 'mPEG_DASH':
formats.extend(self._extract_mpd_formats(source['url'], video_id))
formats.extend(self._extract_mpd_formats(source['url'], video_id, fatal=False))
else:
formats.append({
'format_id': source_type,


@@ -80,9 +80,9 @@ class MiTeleIE(TelecincoBaseIE):
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
pre_player = self._parse_json(self._search_regex(
r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=\s*({.+})',
webpage, 'Pre Player'), display_id)['prePlayer']
pre_player = self._search_json(
r'window\.\$REACTBASE_STATE\.prePlayer_mtweb\s*=',
webpage, 'Pre Player', display_id)['prePlayer']
title = pre_player['title']
video_info = self._parse_content(pre_player['video'], url)
content = pre_player.get('content') or {}


@@ -1,4 +1,5 @@
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
traverse_obj,
@@ -110,8 +111,8 @@ class PixivSketchUserIE(PixivSketchBaseIE):
if not traverse_obj(data, 'is_broadcasting'):
try:
self._call_api(user_id, 'users/current.json', url, 'Investigating reason for request failure')
except ExtractorError as ex:
if ex.cause and ex.cause.code == 401:
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 401:
self.raise_login_required(f'Please log in, or use direct link like https://sketch.pixiv.net/@{user_id}/1234567890', method='cookies')
raise ExtractorError('This user is offline', expected=True)


@@ -28,24 +28,21 @@ class StripchatIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id, headers=self.geo_verification_headers())
data = self._search_json(
r'<script\b[^>]*>\s*window\.__PRELOADED_STATE__\s*=',
webpage, 'data', video_id, transform_source=lowercase_escape)
data = self._parse_json(
self._search_regex(
r'<script\b[^>]*>\s*window\.__PRELOADED_STATE__\s*=(?P<value>.*?)<\/script>',
webpage, 'data', default='{}', group='value'),
video_id, transform_source=lowercase_escape, fatal=False)
if not data:
raise ExtractorError('Unable to find configuration for stream.')
if traverse_obj(data, ('viewCam', 'show'), expected_type=dict):
raise ExtractorError('Model is in private show', expected=True)
elif not traverse_obj(data, ('viewCam', 'model', 'isLive'), expected_type=bool):
if traverse_obj(data, ('viewCam', 'show', {dict})):
raise ExtractorError('Model is in a private show', expected=True)
if not traverse_obj(data, ('viewCam', 'model', 'isLive', {bool})):
raise UserNotLive(video_id=video_id)
model_id = traverse_obj(data, ('viewCam', 'model', 'id'), expected_type=int)
model_id = data['viewCam']['model']['id']
formats = []
for host in traverse_obj(data, ('config', 'data', (
# HLS hosts are currently found in .configV3.static.features.hlsFallback.fallbackDomains[]
# The rest of the path is for backwards compatibility and to guard against A/B testing
for host in traverse_obj(data, ((('config', 'data'), ('configV3', 'static')), (
(('features', 'featuresV2'), 'hlsFallback', 'fallbackDomains', ...), 'hlsStreamHost'))):
formats = self._extract_m3u8_formats(
f'https://edge-hls.{host}/hls/{model_id}/master/{model_id}_auto.m3u8',
@@ -53,7 +50,7 @@ class StripchatIE(InfoExtractor):
if formats:
break
if not formats:
self.raise_no_formats('No active streams found', expected=True)
self.raise_no_formats('Unable to extract stream host', video_id=video_id)
return {
'id': video_id,


@@ -413,15 +413,6 @@ class TikTokBaseIE(InfoExtractor):
for f in formats:
self._set_cookie(urllib.parse.urlparse(f['url']).hostname, 'sid_tt', auth_cookie.value)
thumbnails = []
for cover_id in ('cover', 'ai_dynamic_cover', 'animated_cover', 'ai_dynamic_cover_bak',
'origin_cover', 'dynamic_cover'):
for cover_url in traverse_obj(video_info, (cover_id, 'url_list', ...)):
thumbnails.append({
'id': cover_id,
'url': cover_url,
})
stats_info = aweme_detail.get('statistics') or {}
music_info = aweme_detail.get('music') or {}
labels = traverse_obj(aweme_detail, ('hybrid_label', ..., 'text'), expected_type=str)
@@ -467,7 +458,17 @@ class TikTokBaseIE(InfoExtractor):
'formats': formats,
'subtitles': self.extract_subtitles(
aweme_detail, aweme_id, traverse_obj(author_info, 'uploader', 'uploader_id', 'channel_id')),
'thumbnails': thumbnails,
'thumbnails': [
{
'id': cover_id,
'url': cover_url,
'preference': -1 if cover_id in ('cover', 'origin_cover') else -2,
}
for cover_id in (
'cover', 'ai_dynamic_cover', 'animated_cover',
'ai_dynamic_cover_bak', 'origin_cover', 'dynamic_cover')
for cover_url in traverse_obj(video_info, (cover_id, 'url_list', ...))
],
'duration': (traverse_obj(video_info, (
(None, 'download_addr'), 'duration', {int_or_none(scale=1000)}, any))
or traverse_obj(music_info, ('duration', {int_or_none}))),
@@ -600,11 +601,15 @@ class TikTokBaseIE(InfoExtractor):
'repost_count': 'shareCount',
'comment_count': 'commentCount',
}), expected_type=int_or_none),
'thumbnails': traverse_obj(aweme_detail, (
(None, 'video'), ('thumbnail', 'cover', 'dynamicCover', 'originCover'), {
'url': ({url_or_none}, {self._proto_relative_url}),
},
)),
'thumbnails': [
{
'id': cover_id,
'url': self._proto_relative_url(cover_url),
'preference': -2 if cover_id == 'dynamicCover' else -1,
}
for cover_id in ('thumbnail', 'cover', 'dynamicCover', 'originCover')
for cover_url in traverse_obj(aweme_detail, ((None, 'video'), cover_id, {url_or_none}))
],
}
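The first `thumbnails` comprehension in this file's diff (the one reading `url_list`) can be approximated with plain dictionary access. This is a rough sketch: `traverse_obj` additionally tolerates missing or malformed data in ways this version does not, and the sample `video_info` is invented for illustration.

```python
COVER_IDS = (
    'cover', 'ai_dynamic_cover', 'animated_cover',
    'ai_dynamic_cover_bak', 'origin_cover', 'dynamic_cover')


def build_thumbnails(video_info):
    return [
        {
            'id': cover_id,
            'url': cover_url,
            # Static covers are preferred over the dynamic/animated variants
            'preference': -1 if cover_id in ('cover', 'origin_cover') else -2,
        }
        for cover_id in COVER_IDS
        for cover_url in (video_info.get(cover_id) or {}).get('url_list') or []
    ]


thumbnails = build_thumbnails({
    'cover': {'url_list': ['https://example.com/c1.jpg', 'https://example.com/c2.jpg']},
    'dynamic_cover': {'url_list': ['https://example.com/d.webp']},
    'origin_cover': {},
})
```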


@@ -17,10 +17,10 @@ from ..utils import (
get_element_html_by_id,
int_or_none,
join_nonempty,
parse_qs,
parse_resolution,
str_or_none,
str_to_int,
traverse_obj,
try_call,
unescapeHTML,
unified_timestamp,
@@ -29,6 +29,7 @@ from ..utils import (
urlencode_postdata,
urljoin,
)
from ..utils.traversal import require, traverse_obj
class VKBaseIE(InfoExtractor):
@@ -91,17 +92,17 @@ class VKBaseIE(InfoExtractor):
class VKIE(VKBaseIE):
IE_NAME = 'vk'
IE_DESC = 'VK'
_EMBED_REGEX = [r'<iframe[^>]+?src=(["\'])(?P<url>https?://vk\.com/video_ext\.php.+?)\1']
_EMBED_REGEX = [r'<iframe[^>]+?src=(["\'])(?P<url>https?://vk(?:(?:video)?\.ru|\.com)/video_ext\.php.+?)\1']
_VALID_URL = r'''(?x)
https?://
(?:
(?:
(?:(?:m|new)\.)?vk\.com/video_|
(?:(?:m|new)\.)?vk(?:(?:video)?\.ru|\.com)/video_|
(?:www\.)?daxab\.com/
)
ext\.php\?(?P<embed_query>.*?\boid=(?P<oid>-?\d+).*?\bid=(?P<id>\d+).*)|
(?:
(?:(?:m|new)\.)?vk\.com/(?:.+?\?.*?z=)?(?:video|clip)|
(?:(?:m|new)\.)?vk(?:(?:video)?\.ru|\.com)/(?:.+?\?.*?z=)?(?:video|clip)|
(?:www\.)?daxab\.com/embed/
)
(?P<videoid>-?\d+_\d+)(?:.*\blist=(?P<list_id>([\da-f]+)|(ln-[\da-zA-Z]+)))?
@@ -110,7 +111,7 @@ class VKIE(VKBaseIE):
_TESTS = [
{
'url': 'http://vk.com/videos-77521?z=video-77521_162222515%2Fclub77521',
'url': 'https://vk.com/videos-77521?z=video-77521_162222515%2Fclub77521',
'info_dict': {
'id': '-77521_162222515',
'ext': 'mp4',
@@ -127,7 +128,7 @@ class VKIE(VKBaseIE):
'params': {'skip_download': 'm3u8'},
},
{
'url': 'http://vk.com/video205387401_165548505',
'url': 'https://vk.com/video205387401_165548505',
'info_dict': {
'id': '205387401_165548505',
'ext': 'mp4',
@@ -182,10 +183,10 @@ class VKIE(VKBaseIE):
'ext': 'mp4',
'title': "DSWD Awards 'Children's Joy Foundation, Inc.' Certificate of Registration and License to Operate",
'description': 'md5:bf9c26cfa4acdfb146362682edd3827a',
'duration': 178,
'duration': 179,
'upload_date': '20130117',
'uploader': "Children's Joy Foundation Inc.",
'uploader_id': 'thecjf',
'uploader_id': '@CJFIofficial',
'view_count': int,
'channel_id': 'UCgzCNQ11TmR9V97ECnhi3gw',
'availability': 'public',
@@ -193,7 +194,7 @@ class VKIE(VKBaseIE):
'live_status': 'not_live',
'playable_in_embed': True,
'channel': 'Children\'s Joy Foundation Inc.',
'uploader_url': 'http://www.youtube.com/user/thecjf',
'uploader_url': 'https://www.youtube.com/@CJFIofficial',
'thumbnail': r're:https?://.+\.jpg$',
'tags': 'count:27',
'start_time': 0.0,
@@ -201,6 +202,7 @@ class VKIE(VKBaseIE):
'channel_url': 'https://www.youtube.com/channel/UCgzCNQ11TmR9V97ECnhi3gw',
'channel_follower_count': int,
'age_limit': 0,
'timestamp': 1358394935,
},
},
{
@@ -222,6 +224,7 @@ class VKIE(VKBaseIE):
'thumbnail': r're:https?://.+x1080$',
'tags': list,
},
'skip': 'This video has been deleted and is no longer available.',
},
{
'url': 'https://vk.com/clips-74006511?z=clip-74006511_456247211',
@@ -235,13 +238,13 @@ class VKIE(VKBaseIE):
'timestamp': 1664995597,
'title': 'Clip by @madempress',
'upload_date': '20221005',
'uploader': 'Шальная императрица',
'uploader': 'Шальная Императрица',
'uploader_id': '-74006511',
},
},
{
# video key is extra_data not url\d+
'url': 'http://vk.com/video-110305615_171782105',
'url': 'https://vk.com/video-110305615_171782105',
'md5': 'e13fcda136f99764872e739d13fac1d1',
'info_dict': {
'id': '-110305615_171782105',
@@ -273,6 +276,7 @@ class VKIE(VKBaseIE):
'params': {
'skip_download': True,
},
'skip': 'No formats found',
},
{
# live stream, hls and rtmp links, most likely already finished live
@@ -312,7 +316,16 @@ class VKIE(VKBaseIE):
{
'url': 'https://vk.com/clip30014565_456240946',
'only_matching': True,
}]
},
{
'url': 'https://vkvideo.ru/video-127553155_456242961',
'only_matching': True,
},
{
'url': 'https://vk.ru/video-220754053_456242564',
'only_matching': True,
},
]
def _real_extract(self, url):
mobj = self._match_valid_url(url)
@@ -338,7 +351,7 @@ class VKIE(VKBaseIE):
video_id = '{}_{}'.format(mobj.group('oid'), mobj.group('id'))
info_page = self._download_webpage(
'http://vk.com/video_ext.php?' + mobj.group('embed_query'), video_id)
'https://vk.com/video_ext.php?' + mobj.group('embed_query'), video_id)
error_message = self._html_search_regex(
[r'(?s)<!><div[^>]+class="video_layer_message"[^>]*>(.+?)</div>',
@@ -432,7 +445,7 @@ class VKIE(VKBaseIE):
if m_opts_url:
opts_url = m_opts_url.group(1)
if opts_url.startswith('//'):
opts_url = 'http:' + opts_url
opts_url = 'https:' + opts_url
return self.url_result(opts_url)
data = player['params'][0]
@@ -512,8 +525,11 @@ class VKIE(VKBaseIE):
class VKUserVideosIE(VKBaseIE):
IE_NAME = 'vk:uservideos'
IE_DESC = "VK - User's Videos"
_VALID_URL = r'https?://(?:(?:m|new)\.)?vk\.com/video/(?:playlist/)?(?P<id>[^?$#/&]+)(?!\?.*\bz=video)(?:[/?#&](?:.*?\bsection=(?P<section>\w+))?|$)'
_TEMPLATE_URL = 'https://vk.com/videos'
_BASE_URL_RE = r'https?://(?:(?:m|new)\.)?vk(?:video\.ru|\.com/video)'
_VALID_URL = [
rf'{_BASE_URL_RE}/playlist/(?P<id>-?\d+_\d+)',
rf'{_BASE_URL_RE}/(?P<id>@[^/?#]+)(?:/all)?/?(?!\?.*\bz=video)(?:[?#]|$)',
]
_TESTS = [{
'url': 'https://vk.com/video/@mobidevices',
'info_dict': {
@@ -527,12 +543,20 @@ class VKUserVideosIE(VKBaseIE):
},
'playlist_mincount': 182,
}, {
'url': 'https://vk.com/video/playlist/-174476437_2',
'url': 'https://vkvideo.ru/playlist/-204353299_426',
'info_dict': {
'id': '-174476437_playlist_2',
'title': 'Анонсы',
'id': '-204353299_playlist_426',
},
'playlist_mincount': 108,
'playlist_mincount': 33,
}, {
'url': 'https://vk.com/video/@gorkyfilmstudio/all',
'only_matching': True,
}, {
'url': 'https://vkvideo.ru/@mobidevices',
'only_matching': True,
}, {
'url': 'https://vk.com/video/playlist/-174476437_2',
'only_matching': True,
}]
_VIDEO = collections.namedtuple('Video', ['owner_id', 'id'])
@@ -552,7 +576,7 @@ class VKUserVideosIE(VKBaseIE):
v = self._VIDEO._make(video[:2])
video_id = '%d_%d' % (v.owner_id, v.id)
yield self.url_result(
'http://vk.com/video' + video_id, VKIE.ie_key(), video_id)
'https://vk.com/video' + video_id, VKIE.ie_key(), video_id)
if count >= total:
break
video_list_json = self._download_payload('al_video', page_id, {
@@ -561,23 +585,25 @@ class VKUserVideosIE(VKBaseIE):
'oid': page_id,
'section': section,
})[0][section]
count += video_list_json['count']
new_count = video_list_json['count']
if not new_count:
self.to_screen(f'{page_id}: Skipping {total - count} unavailable videos')
break
count += new_count
video_list = video_list_json['list']
def _real_extract(self, url):
u_id, section = self._match_valid_url(url).groups()
u_id = self._match_id(url)
webpage = self._download_webpage(url, u_id)
if u_id.startswith('@'):
page_id = self._search_regex(r'data-owner-id\s?=\s?"([^"]+)"', webpage, 'page_id')
elif '_' in u_id:
page_id, section = u_id.split('_', 1)
section = f'playlist_{section}'
page_id = traverse_obj(
self._search_json(r'\bvar newCur\s*=', webpage, 'cursor data', u_id),
('oid', {int}, {str_or_none}, {require('page id')}))
section = traverse_obj(parse_qs(url), ('section', 0)) or 'all'
else:
raise ExtractorError('Invalid URL', expected=True)
if not section:
section = 'all'
page_id, _, section = u_id.partition('_')
section = f'playlist_{section}'
playlist_title = clean_html(get_element_by_class('VideoInfoPanel__title', webpage))
return self.playlist_result(self._entries(page_id, section), f'{page_id}_{section}', playlist_title)
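For playlist URLs, the new `else` branch derives both the page ID and the section from the matched ID with `str.partition`. A small sketch of how that produces the playlist ID expected by the updated test (`playlist_result_id` is a hypothetical helper):

```python
def playlist_result_id(u_id):
    # '-204353299_426' -> page_id '-204353299', section 'playlist_426'
    page_id, _, section = u_id.partition('_')
    section = f'playlist_{section}'
    return f'{page_id}_{section}'


result_id = playlist_result_id('-204353299_426')
```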
@@ -717,7 +743,7 @@ class VKWallPostIE(VKBaseIE):
class VKPlayBaseIE(InfoExtractor):
_BASE_URL_RE = r'https?://(?:vkplay\.live|live\.vkplay\.ru)/'
_BASE_URL_RE = r'https?://(?:vkplay\.live|live\.vk(?:play|video)\.ru)/'
_RESOLUTIONS = {
'tiny': '256x144',
'lowest': '426x240',
@@ -797,6 +823,9 @@ class VKPlayIE(VKPlayBaseIE):
}, {
'url': 'https://live.vkplay.ru/lebwa/record/33a4e4ce-e3ef-49db-bb14-f006cc6fabc9/records',
'only_matching': True,
}, {
'url': 'https://live.vkvideo.ru/lebwa/record/33a4e4ce-e3ef-49db-bb14-f006cc6fabc9/records',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -839,6 +868,9 @@ class VKPlayLiveIE(VKPlayBaseIE):
}, {
'url': 'https://live.vkplay.ru/lebwa',
'only_matching': True,
}, {
'url': 'https://live.vkvideo.ru/panterka',
'only_matching': True,
}]
def _real_extract(self, url):


@@ -83,6 +83,7 @@ INNERTUBE_CLIENTS = {
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 1,
'REQUIRE_PO_TOKEN': True,
'SUPPORTS_COOKIES': True,
},
# Safari UA returns pre-merged video+audio 144p/240p/360p/720p/1080p HLS formats
'web_safari': {
@@ -95,6 +96,7 @@ INNERTUBE_CLIENTS = {
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 1,
'REQUIRE_PO_TOKEN': True,
'SUPPORTS_COOKIES': True,
},
'web_embedded': {
'INNERTUBE_CONTEXT': {
@@ -104,6 +106,7 @@ INNERTUBE_CLIENTS = {
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 56,
'SUPPORTS_COOKIES': True,
},
'web_music': {
'INNERTUBE_HOST': 'music.youtube.com',
@@ -114,6 +117,7 @@ INNERTUBE_CLIENTS = {
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 67,
'SUPPORTS_COOKIES': True,
},
# This client now requires sign-in for every video
'web_creator': {
@@ -125,6 +129,7 @@ INNERTUBE_CLIENTS = {
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 62,
'REQUIRE_AUTH': True,
'SUPPORTS_COOKIES': True,
},
'android': {
'INNERTUBE_CONTEXT': {
@@ -157,6 +162,7 @@ INNERTUBE_CLIENTS = {
'REQUIRE_JS_PLAYER': False,
'REQUIRE_PO_TOKEN': True,
'REQUIRE_AUTH': True,
'SUPPORTS_COOKIES': True,
},
# This client now requires sign-in for every video
'android_creator': {
@@ -191,6 +197,7 @@ INNERTUBE_CLIENTS = {
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 28,
'REQUIRE_JS_PLAYER': False,
'SUPPORTS_COOKIES': True,
},
# iOS clients have HLS live streams. Setting device model to get 60fps formats.
# See: https://github.com/TeamNewPipe/NewPipeExtractor/issues/680#issuecomment-1002724558
@@ -225,6 +232,7 @@ INNERTUBE_CLIENTS = {
'INNERTUBE_CONTEXT_CLIENT_NAME': 26,
'REQUIRE_JS_PLAYER': False,
'REQUIRE_AUTH': True,
'SUPPORTS_COOKIES': True,
},
# This client now requires sign-in for every video
'ios_creator': {
@@ -253,6 +261,7 @@ INNERTUBE_CLIENTS = {
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 2,
'SUPPORTS_COOKIES': True,
},
'tv': {
'INNERTUBE_CONTEXT': {
@@ -262,6 +271,7 @@ INNERTUBE_CLIENTS = {
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 7,
'SUPPORTS_COOKIES': True,
},
# This client now requires sign-in for every video
# It was previously an age-gate workaround for videos that were `playable_in_embed`
@@ -275,19 +285,7 @@ INNERTUBE_CLIENTS = {
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 85,
'REQUIRE_AUTH': True,
},
# This client now requires sign-in for every video
# It may be able to receive pre-merged video+audio 720p/1080p streams
'mediaconnect': {
'INNERTUBE_CONTEXT': {
'client': {
'clientName': 'MEDIA_CONNECT_FRONTEND',
'clientVersion': '0.1',
},
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 95,
'REQUIRE_JS_PLAYER': False,
'REQUIRE_AUTH': True,
'SUPPORTS_COOKIES': True,
},
}
@@ -317,6 +315,7 @@ def build_innertube_clients():
ytcfg.setdefault('REQUIRE_JS_PLAYER', True)
ytcfg.setdefault('REQUIRE_PO_TOKEN', False)
ytcfg.setdefault('REQUIRE_AUTH', False)
ytcfg.setdefault('SUPPORTS_COOKIES', False)
ytcfg.setdefault('PLAYER_PARAMS', None)
ytcfg['INNERTUBE_CONTEXT']['client'].setdefault('hl', 'en')
@@ -1357,6 +1356,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
}
_SUBTITLE_FORMATS = ('json3', 'srv1', 'srv2', 'srv3', 'ttml', 'vtt')
_DEFAULT_CLIENTS = ('ios', 'mweb')
_DEFAULT_AUTHED_CLIENTS = ('web_creator', 'mweb')
_GEO_BYPASS = False
@@ -2925,7 +2925,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
# Obtain from MPD's maximum seq value
old_mpd_url = mpd_url
last_error = ctx.pop('last_error', None)
expire_fast = immediate or last_error and isinstance(last_error, HTTPError) and last_error.status == 403
expire_fast = immediate or (last_error and isinstance(last_error, HTTPError) and last_error.status == 403)
mpd_url, stream_number, is_live = (mpd_feed(format_id, 5 if expire_fast else 18000)
or (mpd_url, stream_number, False))
if not refresh_sequence:
@@ -3823,12 +3823,13 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def _get_requested_clients(self, url, smuggled_data):
requested_clients = []
excluded_clients = []
default_clients = self._DEFAULT_AUTHED_CLIENTS if self.is_authenticated else self._DEFAULT_CLIENTS
allowed_clients = sorted(
(client for client in INNERTUBE_CLIENTS if client[:1] != '_'),
key=lambda client: INNERTUBE_CLIENTS[client]['priority'], reverse=True)
for client in self._configuration_arg('player_client'):
if client == 'default':
requested_clients.extend(self._DEFAULT_CLIENTS)
requested_clients.extend(default_clients)
elif client == 'all':
requested_clients.extend(allowed_clients)
elif client.startswith('-'):
@@ -3838,7 +3839,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
else:
requested_clients.append(client)
if not requested_clients:
requested_clients.extend(self._DEFAULT_CLIENTS)
requested_clients.extend(default_clients)
for excluded_client in excluded_clients:
if excluded_client in requested_clients:
requested_clients.remove(excluded_client)
@@ -3850,9 +3851,18 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
_, base_client, variant = _split_innertube_client(requested_client)
music_client = f'{base_client}_music' if base_client != 'mweb' else 'web_music'
if variant != 'music' and music_client in INNERTUBE_CLIENTS:
if not INNERTUBE_CLIENTS[music_client]['REQUIRE_AUTH'] or self.is_authenticated:
client_info = INNERTUBE_CLIENTS[music_client]
if not client_info['REQUIRE_AUTH'] or (self.is_authenticated and client_info['SUPPORTS_COOKIES']):
requested_clients.append(music_client)
if self.is_authenticated:
unsupported_clients = [
client for client in requested_clients if not INNERTUBE_CLIENTS[client]['SUPPORTS_COOKIES']
]
for client in unsupported_clients:
self.report_warning(f'Skipping client "{client}" since it does not support cookies', only_once=True)
requested_clients.remove(client)
return orderedSet(requested_clients)
def _invalid_player_response(self, pr, video_id):
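Review note: the client selection in the hunk above can be sketched in isolation as follows. This is a minimal sketch, not the extractor's code. The `CLIENTS` table and the `select_clients` helper are hypothetical, and the flag values (in particular `ios` lacking cookie support) are assumptions for illustration only. The two default tuples are copied from the diff.

```python
# Hypothetical, trimmed-down client table; the flag values are assumptions.
CLIENTS = {
    'web_creator': {'REQUIRE_AUTH': True, 'SUPPORTS_COOKIES': True},
    'mweb': {'REQUIRE_AUTH': False, 'SUPPORTS_COOKIES': True},
    'ios': {'REQUIRE_AUTH': False, 'SUPPORTS_COOKIES': False},
}
DEFAULT_CLIENTS = ('ios', 'mweb')                 # as in the diff
DEFAULT_AUTHED_CLIENTS = ('web_creator', 'mweb')  # as in the diff


def select_clients(requested, is_authenticated):
    """Return (clients, skipped), roughly following the hunk above."""
    default_clients = DEFAULT_AUTHED_CLIENTS if is_authenticated else DEFAULT_CLIENTS
    clients = []
    # An empty request falls back to the defaults, like 'default' does.
    for name in requested or ['default']:
        clients.extend(default_clients if name == 'default' else [name])
    skipped = []
    if is_authenticated:
        # Clients that cannot use cookies are dropped for logged-in sessions.
        skipped = [c for c in clients if not CLIENTS[c]['SUPPORTS_COOKIES']]
        clients = [c for c in clients if c not in skipped]
    # dict.fromkeys de-duplicates while keeping first-seen order.
    return list(dict.fromkeys(clients)), skipped


print(select_clients(['ios', 'default'], is_authenticated=True))
```

The sketch leaves out the `all` keyword, `-client` exclusions, and the automatic music-client variants that the real method also handles.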
@@ -3958,6 +3968,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
else:
prs.append(pr)
''' This code is pointless while web_creator is in _DEFAULT_AUTHED_CLIENTS
# EU countries require age-verification for accounts to access age-restricted videos
# If account is not age-verified, _is_agegated() will be truthy for non-embedded clients
if self.is_authenticated and self._is_agegated(pr):
@@ -3965,9 +3976,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
f'{video_id}: This video is age-restricted and YouTube is requiring '
'account age-verification; some formats may be missing', only_once=True)
# web_creator can work around the age-verification requirement
# android_vr and mediaconnect may also be able to work around age-verification
# android_vr may also be able to work around age-verification
# tv_embedded may(?) still work around age-verification if the video is embeddable
append_client('web_creator')
'''
prs.extend(deprioritized_prs)
@@ -3983,8 +3995,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
return prs, player_url
def _needs_live_processing(self, live_status, duration):
if (live_status == 'is_live' and self.get_param('live_from_start')
or live_status == 'post_live' and (duration or 0) > 2 * 3600):
if ((live_status == 'is_live' and self.get_param('live_from_start'))
or (live_status == 'post_live' and (duration or 0) > 2 * 3600)):
return live_status
def _extract_formats_and_subtitles(self, streaming_data, video_id, player_url, live_status, duration):
@@ -4180,7 +4192,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
skip_manifests = set(self._configuration_arg('skip'))
if (not self.get_param('youtube_include_hls_manifest', True)
or needs_live_processing == 'is_live' # These will be filtered out by YoutubeDL anyway
or needs_live_processing and skip_bad_formats):
or (needs_live_processing and skip_bad_formats)):
skip_manifests.add('hls')
if not self.get_param('youtube_include_dash_manifest', True):
@@ -4378,14 +4390,14 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
expected_type=dict)
translated_title = self._get_text(microformats, (..., 'title'))
video_title = (self._preferred_lang and translated_title
video_title = ((self._preferred_lang and translated_title)
or get_first(video_details, 'title') # primary
or translated_title
or search_meta(['og:title', 'twitter:title', 'title']))
translated_description = self._get_text(microformats, (..., 'description'))
original_description = get_first(video_details, 'shortDescription')
video_description = (
self._preferred_lang and translated_description
(self._preferred_lang and translated_description)
# If original description is blank, it will be an empty string.
# Do not prefer translated description in this case.
or original_description if original_description is not None else translated_description)
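Review note: the added parentheses make the existing `and`/`or` grouping explicit, but the trailing conditional expression still binds more loosely than `or`. The sketch below, with hypothetical function names, checks that the expression shape used for `video_description` behaves like the fully spelled-out form.

```python
from itertools import product


def pick_description(preferred_lang, translated, original):
    # Same shape as the video_description expression in the hunk above.
    return ((preferred_lang and translated)
            or original if original is not None else translated)


def pick_description_explicit(preferred_lang, translated, original):
    # Spelled out: the conditional expression wraps the whole `or` chain.
    if original is not None:
        return (preferred_lang and translated) or original
    return translated


# Compare both forms over a small grid of inputs.
for args in product((None, 'de'), (None, 'T'), (None, '', 'O')):
    assert pick_description(*args) == pick_description_explicit(*args)
```

With no preferred language and a blank original description, both forms return the empty string instead of falling back to the translation.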
@@ -4986,6 +4998,10 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
for item in grid_renderer['items']:
if not isinstance(item, dict):
continue
if lockup_view_model := traverse_obj(item, ('lockupViewModel', {dict})):
if entry := self._extract_lockup_view_model(lockup_view_model):
yield entry
continue
renderer = self._extract_basic_item_renderer(item)
if not isinstance(renderer, dict):
continue
@@ -5084,10 +5100,30 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
continue
yield self._extract_video(renderer)
def _extract_lockup_view_model(self, view_model):
content_id = view_model.get('contentId')
if not content_id:
return
content_type = view_model.get('contentType')
if content_type not in ('LOCKUP_CONTENT_TYPE_PLAYLIST', 'LOCKUP_CONTENT_TYPE_PODCAST'):
self.report_warning(
f'Unsupported lockup view model content type "{content_type}"{bug_reports_message()}', only_once=True)
return
return self.url_result(
f'https://www.youtube.com/playlist?list={content_id}', ie=YoutubeTabIE, video_id=content_id,
title=traverse_obj(view_model, (
'metadata', 'lockupMetadataViewModel', 'title', 'content', {str})),
thumbnails=self._extract_thumbnails(view_model, (
'contentImage', 'collectionThumbnailViewModel', 'primaryThumbnail', 'thumbnailViewModel', 'image'), final_key='sources'))
def _rich_entries(self, rich_grid_renderer):
if lockup_view_model := traverse_obj(rich_grid_renderer, ('content', 'lockupViewModel', {dict})):
if entry := self._extract_lockup_view_model(lockup_view_model):
yield entry
return
renderer = traverse_obj(
rich_grid_renderer,
('content', ('videoRenderer', 'reelItemRenderer', 'playlistRenderer', 'shortsLockupViewModel', 'lockupViewModel'), any)) or {}
('content', ('videoRenderer', 'reelItemRenderer', 'playlistRenderer', 'shortsLockupViewModel'), any)) or {}
video_id = renderer.get('videoId')
if video_id:
yield self._extract_video(renderer)
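Review note: the mapping done by `_extract_lockup_view_model` can be illustrated with plain dictionaries. This is a simplified sketch. `lockup_to_playlist` and the sample payload are hypothetical, thumbnails and the warning for unsupported types are left out, and chained `.get()` calls stand in for `traverse_obj`, which is more defensive about malformed input.

```python
SUPPORTED_TYPES = ('LOCKUP_CONTENT_TYPE_PLAYLIST', 'LOCKUP_CONTENT_TYPE_PODCAST')


def lockup_to_playlist(view_model):
    """Map a lockupViewModel-like dict to a playlist entry, or None."""
    content_id = view_model.get('contentId')
    if not content_id:
        return None
    if view_model.get('contentType') not in SUPPORTED_TYPES:
        return None
    title = (view_model.get('metadata', {})
             .get('lockupMetadataViewModel', {})
             .get('title', {})
             .get('content'))
    return {
        'id': content_id,
        'url': f'https://www.youtube.com/playlist?list={content_id}',
        'title': title if isinstance(title, str) else None,
    }


# Made-up sample payload, for illustration only.
sample = {
    'contentId': 'PL123',
    'contentType': 'LOCKUP_CONTENT_TYPE_PLAYLIST',
    'metadata': {'lockupMetadataViewModel': {'title': {'content': 'Example'}}},
}
print(lockup_to_playlist(sample))
```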
@@ -5114,18 +5150,6 @@ class YoutubeTabBaseInfoExtractor(YoutubeBaseInfoExtractor):
})),
thumbnails=self._extract_thumbnails(renderer, 'thumbnail', final_key='sources'))
return
# lockupViewModel extraction
content_id = renderer.get('contentId')
if content_id and renderer.get('contentType') == 'LOCKUP_CONTENT_TYPE_PODCAST':
yield self.url_result(
f'https://www.youtube.com/playlist?list={content_id}',
ie=YoutubeTabIE, video_id=content_id,
**traverse_obj(renderer, {
'title': ('metadata', 'lockupMetadataViewModel', 'title', 'content', {str}),
}),
thumbnails=self._extract_thumbnails(renderer, (
'contentImage', 'collectionThumbnailViewModel', 'primaryThumbnail', 'thumbnailViewModel', 'image'), final_key='sources'))
return
def _video_entry(self, video_renderer):
video_id = video_renderer.get('videoId')
@@ -5794,7 +5818,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'info_dict': {
'id': 'UCYO_jab_esuFRV4b17AJtAw',
'title': '3Blue1Brown - Playlists',
'description': 'md5:4d1da95432004b7ba840ebc895b6b4c9',
'description': 'md5:602e3789e6a0cb7d9d352186b720e395',
'channel_url': 'https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw',
'channel': '3Blue1Brown',
'channel_id': 'UCYO_jab_esuFRV4b17AJtAw',
@@ -6813,7 +6837,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
tab_url = urljoin(base_url, traverse_obj(
tab, ('endpoint', 'commandMetadata', 'webCommandMetadata', 'url')))
tab_id = (tab_url and self._get_url_mobj(tab_url)['tab'][1:]
tab_id = ((tab_url and self._get_url_mobj(tab_url)['tab'][1:])
or traverse_obj(tab, 'tabIdentifier', expected_type=str))
if tab_id:
return {


@@ -183,4 +183,4 @@ def load_plugins(name, suffix):
sys.meta_path.insert(0, PluginFinder(f'{PACKAGE_NAME}.extractor', f'{PACKAGE_NAME}.postprocessor'))
__all__ = ['directories', 'load_plugins', 'PACKAGE_NAME', 'COMPAT_PACKAGE_NAME']
__all__ = ['COMPAT_PACKAGE_NAME', 'PACKAGE_NAME', 'directories', 'load_plugins']

@@ -44,4 +44,4 @@ def get_postprocessor(key):
globals().update(_PLUGIN_CLASSES)
__all__ = [name for name in globals() if name.endswith('PP')]
__all__.extend(('PostProcessor', 'FFmpegPostProcessor'))
__all__.extend(('FFmpegPostProcessor', 'PostProcessor'))

@@ -626,7 +626,7 @@ class FFmpegEmbedSubtitlePP(FFmpegPostProcessor):
sub_ext = sub_info['ext']
if sub_ext == 'json':
self.report_warning('JSON subtitles cannot be embedded')
elif ext != 'webm' or ext == 'webm' and sub_ext == 'vtt':
elif ext != 'webm' or (ext == 'webm' and sub_ext == 'vtt'):
sub_langs.append(lang)
sub_names.append(sub_info.get('name'))
sub_filenames.append(sub_info['filepath'])

@@ -2683,8 +2683,8 @@ def merge_dicts(*dicts):
merged = {}
for a_dict in dicts:
for k, v in a_dict.items():
if (v is not None and k not in merged
or isinstance(v, str) and merged[k] == ''):
if ((v is not None and k not in merged)
or (isinstance(v, str) and merged[k] == '')):
merged[k] = v
return merged
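Review note: the function below is `merge_dicts` as it reads after this change, with indentation reconstructed since the compare view flattened it. The example shows the behavior the condition encodes: the first non-`None` value for a key wins, except that an empty string can be overridden by a later string.

```python
def merge_dicts(*dicts):
    merged = {}
    for a_dict in dicts:
        for k, v in a_dict.items():
            # The second clause only runs when k is already in merged:
            # a string v is never None, so the first clause can only
            # have failed because k was present.
            if ((v is not None and k not in merged)
                    or (isinstance(v, str) and merged[k] == '')):
                merged[k] = v
    return merged


result = merge_dicts({'a': '', 'b': None}, {'a': 'x', 'b': 2, 'c': 3})
print(result)  # {'a': 'x', 'b': 2, 'c': 3}
```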

@@ -1,8 +1,8 @@
# Autogenerated by devscripts/update-version.py
__version__ = '2024.11.18'
__version__ = '2024.12.03'
RELEASE_GIT_HEAD = '7ea2787920cccc6b8ea30791993d114fbd564434'
RELEASE_GIT_HEAD = '2b67ac300ac8b44368fb121637d1743cea8c5b6b'
VARIANT = None
@@ -12,4 +12,4 @@ CHANNEL = 'stable'
ORIGIN = 'yt-dlp/yt-dlp'
_pkg_version = '2024.11.18'
_pkg_version = '2024.12.03'