Have you ever tried to download videos from YouTube? I mean manually, without relying on software like youtube-dl, yt-dlp or one of “those” websites. It’s much more complicated than you might think.
YouTube generates revenue from user ad views, and it’s logical for the platform to implement restrictions to prevent people from downloading videos or even watching them on an unofficial client like YouTube Vanced. In this article, I’ll explain the technical details of these security mechanisms and how it’s possible to bypass them.
A Google search for: youtube downloader
The first step is to find the exact URL containing the video file. For this, we can communicate with the YouTube API. Specifically, the /youtubei/v1/player endpoint allows us to retrieve all the details of a video, such as the title, description, thumbnails, and most importantly, the formats. It is within these formats that we can locate the URL of the file based on the desired quality (SD, HD, UHD, etc.).
Here’s an example for the video with the ID aqz-KE-bpKQ, where we’ll get the URL for one of the formats. Note that the other variables contained within the context object are preconditions validated by the API. The accepted values were found by observing the requests sent by the web browser.

    echo -n '{"videoId":"aqz-KE-bpKQ","context":{"client":{"clientName":"WEB","clientVersion":"2.20230810.05.00"}}}' |
      http post 'https://www.youtube.com/youtubei/v1/player' |
      jq -r '.streamingData.adaptiveFormats[0].url'

    https://rr1---sn-8qu-t0aee.googlevideo.com/videoplayback?expire=1691828966&ei=hu7WZOCJHI7T8wTSrr_QBg [TRUNCATED]

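The command above simply takes the first entry of adaptiveFormats. To pick a format by quality instead, the list can be filtered and sorted. The following is a minimal sketch, assuming each entry carries mimeType, height, bitrate and url fields (names taken from observed responses, not from documentation). The helper pickVideoFormat and the sample data are hypothetical.

```javascript
// Hypothetical helper: pick the best video format at or below a target height.
// Assumes each entry has `mimeType`, `height`, `bitrate` and `url` fields.
function pickVideoFormat(formats, mimePrefix, maxHeight) {
    const candidates = formats
        .filter(f => f.mimeType.startsWith(mimePrefix))
        .filter(f => typeof f.height === "number" && f.height <= maxHeight);

    // Prefer the tallest picture, then the highest bitrate
    candidates.sort((a, b) => (b.height - a.height) || (b.bitrate - a.bitrate));
    return candidates[0] ?? null;
}

// Made-up sample data shaped like `streamingData.adaptiveFormats`
const sampleFormats = [
    { itag: 315, mimeType: 'video/webm; codecs="vp9"', height: 2160, bitrate: 26000000, url: "https://example.invalid/a" },
    { itag: 248, mimeType: 'video/webm; codecs="vp9"', height: 1080, bitrate: 2600000, url: "https://example.invalid/b" },
    { itag: 137, mimeType: 'video/mp4; codecs="avc1.640028"', height: 1080, bitrate: 4300000, url: "https://example.invalid/c" },
    { itag: 251, mimeType: 'audio/webm; codecs="opus"', bitrate: 130000, url: "https://example.invalid/d" },
];

const chosen = pickVideoFormat(sampleFormats, "video/webm", 1080);
console.log(chosen ? chosen.itag : "no matching format");
```

Returning null when nothing matches keeps the caller in charge of the fallback, for example retrying with a different container or a lower height.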
However, attempting to download from this URL results in a really slow download:

    http --print=b --download 'https://rr1---sn-8qu-t0aee.googlevideo.com/videoplayback?expire=1691828966&ei=hu7WZOCJHI7T8wTSrr_QBg [TRUNCATED]'

    Downloading to videoplayback.webm
    [ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ] 0% ( 0.0/1.5 GB ) 6:23:45 66.7 kB/s

The speed is always limited to around 40-70 kB/s. Unfortunately, for this 10-minute video, it would take roughly six and a half hours to download the entire file. Clearly, the video is not downloaded at this speed when using a web browser. So what’s different?
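As a quick sanity check of that estimate, the download time follows directly from the file size and the throttled rate. This is a small sketch. The size of 1,536,155,487 bytes is the clen value of this format's URL, assumed here to be the file size in bytes, and formatDuration is a hypothetical helper.

```javascript
// Format a number of seconds as h:mm:ss.
function formatDuration(totalSeconds) {
    const s = Math.round(totalSeconds);
    const hours = Math.floor(s / 3600);
    const minutes = Math.floor((s % 3600) / 60);
    const seconds = s % 60;
    const pad = v => String(v).padStart(2, "0");
    return `${hours}:${pad(minutes)}:${pad(seconds)}`;
}

const fileSizeBytes = 1536155487;   // `clen` value of this format (assumed to be bytes)
const throttledRate = 66.7 * 1000;  // 66.7 kB/s, as reported by the download above

console.log(formatDuration(fileSizeBytes / throttledRate));
```

The result lands within a few seconds of the 6:23:45 shown by the download tool.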
Here’s the complete URL broken down. It’s rather complicated, but there’s a specific parameter that interests us.

    Protocol: https
    Hostname: rr1---sn-8qu-t0aee.googlevideo.com
    Pathname: /videoplayback
    Query Parameters:
      expire: 1691829310
      ei: 3u_WZJT7Cbag_9EPn7mi0A8
      ip: 203.0.113.30
      id: o-ABGboQn9qMKsUdClvQHd6cHm6l1dWkRw4WNj3V7wBgY1
      itag: 315
      aitags: 133,134,135,136,160,242,243,244,247,278,298,299,302,303,308,315,394,395,396,397,398,399,400,401
      source: youtube
      requiressl: yes
      mh: aP
      mm: 31,29
      mn: sn-8qu-t0aee,sn-t0a7ln7d
      ms: au,rdu
      mv: m
      mvi: 1
      pcm2cms: yes
      pl: 18
      initcwndbps: 1422500
      spc: UWF9fzkQbIbHWdKe8-ahg0uWbE_UrbUM0U6LbQfFxg
      vprv: 1
      svpuc: 1
      mime: video/webm
      ns: dn5MLRkBtM4BWwzNNOhVxHIP
      gir: yes
      clen: 1536155487
      dur: 634.566
      lmt: 1662347928284893
      mt: 1691807356
      fvip: 3
      keepalive: yes
      fexp: 24007246,24363392
      c: WEB
      txp: 553C434
      n: mAq3ayrWqdeV_7wbIgP
      sparams: expire,ei,ip,id,aitags,source,requiressl,spc,vprv,svpuc,mime,ns,gir,clen,dur,lmt
      sig: AOq0QJ8wRgIhAOx29gNeoiOLRe1GhEfE52PAiXW64ZEWX7nNdAiJE6ezAiEA0Plw6Yn0kmSFFZHO2JZPZyMGd0O-gEblUXPRrexQgrY=
      lsparams: mh,mm,mn,ms,mv,mvi,pcm2cms,pl,initcwndbps
      lsig: AG3C_xAwRQIgZVOkDl4rGPGnlK6IGCAXpzxk-cB5RRFmXDesEqOWTRoCIQCzIdPKE6C6_JQVpH6OKMF3woIJ2yVYaztT9mXIVtE6xw==

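A breakdown like this can be produced with the standard URL class. The sketch below extracts a handful of parameters. The interpretation of expire as a Unix timestamp in seconds and of clen as a byte count are assumptions based on the values, and describeMediaUrl and the shortened URL are made up for illustration.

```javascript
// Summarise the query parameters of a media URL that matter for this article.
function describeMediaUrl(mediaUrl) {
    const url = new URL(mediaUrl);
    const params = url.searchParams;
    const list = name => (params.get(name) ?? "").split(",").filter(Boolean);

    return {
        host: url.hostname,
        itag: Number(params.get("itag")),
        n: params.get("n"),
        // `expire` looks like a Unix timestamp in seconds (assumption)
        expiresAt: new Date(Number(params.get("expire")) * 1000),
        sizeBytes: Number(params.get("clen")),
        durationSeconds: Number(params.get("dur")),
        sparams: list("sparams"),
    };
}

// Shortened, made-up URL that reuses a few values from the breakdown above
const exampleUrl = "https://rr1---sn-8qu-t0aee.googlevideo.com/videoplayback"
    + "?expire=1691829310&itag=315&clen=1536155487&dur=634.566"
    + "&n=mAq3ayrWqdeV_7wbIgP&sparams=expire%2Cei%2Cip";

const info = describeMediaUrl(exampleUrl);
console.log(info.n, info.expiresAt.toISOString(), info.sparams);
```

In the full URL above, n does not appear in the sparams list. Presumably sparams names the parameters covered by sig, which would explain why n can be rewritten without invalidating the URL, but that reading is an inference rather than something the URL states.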
Since mid-2021, YouTube has included the query parameter n in the majority of file URLs. This parameter needs to be transformed using a JavaScript algorithm located in the file base.js, which is distributed with the web page. YouTube uses this parameter as a challenge to verify that the download originates from an “official” client. If the challenge is not solved and n is not transformed correctly, YouTube will silently apply throttling to the video download.
The JavaScript algorithm is obfuscated and changes frequently, so it’s not practical to attempt reverse engineering to understand it. The solution is simply to download the JavaScript file, extract the algorithm code, and execute it by passing the n parameter to it. The following code accomplishes this.

    import axios from 'axios';
    import vm from 'vm'

    const videoId = 'aqz-KE-bpKQ';

    /**
     * From the Youtube API, retrieve metadata about the video (title, video format and audio format)
     */
    async function retrieveMetadata(videoId) {
        const response = await axios.post('https://www.youtube.com/youtubei/v1/player', {
            "videoId": videoId,
            "context": {
                "client": { "clientName": "WEB", "clientVersion": "2.20230810.05.00" }
            }
        });

        const formats = response.data.streamingData.adaptiveFormats;

        return [
            response.data.videoDetails.title,
            formats.filter(w => w.mimeType.startsWith("video/webm"))[0],
            formats.filter(w => w.mimeType.startsWith("audio/webm"))[0],
        ];
    }

    /**
     * From the Youtube Web Page, retrieve the challenge algorithm for the n query parameter
     */
    async function retrieveChallenge(video_id){

        /**
         * Find the URL of the javascript file for the current player version
         */
        async function retrieve_player_url(video_id) {
            let response = await axios.get('https://www.youtube.com/embed/' + video_id);
            let player_hash = /\/s\/player\/(\w+)\/player_ias.vflset\/\w+\/base.js/.exec(response.data)[1]
            return `https://www.youtube.com/s/player/${player_hash}/player_ias.vflset/en_US/base.js`
        }

        const player_url = await retrieve_player_url(video_id);

        const response = await axios.get(player_url);
        // Name of the array referenced where the n parameter is read
        let challenge_name = /\.get\("n"\)\)&&\(b=([a-zA-Z0-9$]+)(?:\[(\d+)\])?\([a-zA-Z0-9]\)/.exec(response.data)[1];
        // Name of the function stored in that array
        challenge_name = new RegExp(`var ${challenge_name}\\s*=\\s*\\[(.+?)\\]\\s*[,;]`).exec(response.data)[1];

        // Body of the function (group 2); group 1 is the name of its parameter
        const challenge = new RegExp(`${challenge_name}\\s*=\\s*function\\s*\\(([\\w$]+)\\)\\s*{(.+?}\\s*return [\\w$]+\\.join\\(""\\))};`, "s").exec(response.data)[2];

        return challenge;
    }

    /**
     * Solve the challenge and replace the n query parameter from the url
     */
    function solveChallenge(challenge, formatUrl) {
        const url = new URL(formatUrl);

        const n = url.searchParams.get("n");
        // Note: this assumes the extracted function names its parameter `a`
        const n_transformed = vm.runInNewContext(`((a) => {${challenge}})('${n}')`);

        url.searchParams.set("n", n_transformed);
        return url.toString();
    }

    // Top-level await: run this file as an ES module
    const [title, video, audio] = await retrieveMetadata(videoId);
    const challenge = await retrieveChallenge(videoId);

    video.url = solveChallenge(challenge, video.url);
    audio.url = solveChallenge(challenge, audio.url);

    console.log(video.url);

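To see what the three regular expressions do without fetching the real player, they can be run against a tiny stand-in for base.js. Everything in the mock is made up: the names Xyz and abc are hypothetical and the "algorithm" only reverses the string. The sketch also differs from the code above in two ways: it passes the captured parameter name along instead of assuming it is a, and it evaluates the body with the Function constructor rather than vm.runInNewContext, purely to keep the example free of imports.

```javascript
// Made-up stand-in for base.js (hypothetical names, trivial algorithm)
const mockPlayerSource = [
    'var Xyz=[abc];',
    'abc=function(a){var b=a.split("");try{b.reverse()}catch(e){return "error"}return b.join("")};',
    'c=d.get("n"))&&(b=Xyz[0](c),d.set("n",b));',
].join("\n");

// Escape characters such as `$` that are valid in identifiers but special in a RegExp
const escapeRegExp = s => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

function extractChallenge(source) {
    // 1. Name of the array referenced where the n parameter is read
    const arrayName = /\.get\("n"\)\)&&\(b=([a-zA-Z0-9$]+)(?:\[(\d+)\])?\([a-zA-Z0-9]\)/.exec(source)[1];
    // 2. Name of the function stored in that array
    const functionName = new RegExp(`var ${escapeRegExp(arrayName)}\\s*=\\s*\\[(.+?)\\]\\s*[,;]`).exec(source)[1];
    // 3. Parameter name (group 1) and body (group 2) of that function
    const match = new RegExp(`${escapeRegExp(functionName)}\\s*=\\s*function\\s*\\(([\\w$]+)\\)\\s*\\{(.+?\\}\\s*return [\\w$]+\\.join\\(""\\))\\};`, "s").exec(source);
    return { parameter: match[1], body: match[2] };
}

function transformN(challenge, n) {
    // Swapped-in technique: Function constructor instead of vm.runInNewContext.
    // Neither is a security sandbox for untrusted code.
    return new Function(challenge.parameter, challenge.body)(n);
}

const mockChallenge = extractChallenge(mockPlayerSource);
console.log(mockChallenge.parameter, transformN(mockChallenge, "mAq3ayrWqdeV_7wbIgP"));
```

Note that the third expression only matches when a closing brace directly precedes the final return statement, which is why the mock function contains a try/catch block.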
With this new URL containing the correctly transformed n parameter, the next step is to download the video. However, YouTube still enforces a throttling rule. This rule imposes a variable download speed limit based on the size and length of the video, aiming to provide a download time that’s roughly half the duration of the video. This aligns with the streaming nature of videos. It would be a huge waste of bandwidth for YouTube to always provide the media file as quickly as possible.

    http --print=b --download 'https://rr1---sn-8qu-t0aee.googlevideo.com/videoplayback?expire=1691888702&ei=3tfXZIXSI72c_9EP1NGHqA8 [TRUNCATED]'

    Downloading to videoplayback.webm
    [ ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ] 4% ( 0.1/1.5 GB ) 0:06:07 4.0 MB/s

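The "half the duration" rule of thumb can be checked against the numbers in the URL. The sketch below computes the rate at which a file would finish in half of its playback time, using the clen and dur values from the breakdown earlier (assumed to be bytes and seconds). halfDurationRate is a hypothetical helper, and the rule itself is only an approximation of the observed behaviour.

```javascript
// Rate (bytes per second) at which `sizeBytes` finishes in half of `durationSeconds`.
function halfDurationRate(sizeBytes, durationSeconds) {
    return sizeBytes / (durationSeconds / 2);
}

const impliedRate = halfDurationRate(1536155487, 634.566); // `clen` and `dur` from the URL
console.log(`${(impliedRate / 1e6).toFixed(1)} MB/s`);
```

This gives a figure of the same order as the 4.0 MB/s shown above, which fits a limit that is only roughly tied to the video duration.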
To bypass this limitation, we can break the download into several smaller parts using the HTTP Range header. This header allows you to specify which part of the file you want to download with each request (e.g. Range: bytes=2000-3000). The following code implements this logic.

    import fs from 'fs';

    /**
     * Download a media file by breaking it into several 10MB segments
     */
    async function download(url, length, file){
        const MEGABYTE = 1024 * 1024;

        await fs.promises.rm(file, { force: true });

        let downloadedBytes = 0;

        while (downloadedBytes < length) {
            let nextSegment = downloadedBytes + 10 * MEGABYTE;
            if (nextSegment > length) nextSegment = length;

            // Download segment
            const start = Date.now();
            let response = await axios.get(url, { headers: { "Range": `bytes=${downloadedBytes}-${nextSegment}` }, responseType: 'stream' });

            // Write segment
            await fs.promises.writeFile(file, response.data, {flag: 'a'});
            const end = Date.now();

            // Print download stats
            const progress = (nextSegment / length * 100).toFixed(2);
            const total = (length / MEGABYTE).toFixed(2);
            const speed = ((nextSegment - downloadedBytes) / (end - start) * 1000 / MEGABYTE).toFixed(2);
            console.log(`${progress}% of ${total}MB at ${speed}MB/s`);

            // Range bounds are inclusive, so the next segment starts one byte later
            downloadedBytes = nextSegment + 1;
        }
    }

This works because the throttling rule takes some time to apply, and the small segments are downloaded very quickly, always using a new connection.

    node index.js

    0.68% of 1464.99MB at 46.73MB/s
    1.37% of 1464.99MB at 60.98MB/s
    2.05% of 1464.99MB at 71.94MB/s
    2.73% of 1464.99MB at 70.42MB/s
    3.41% of 1464.99MB at 68.49MB/s
    4.10% of 1464.99MB at 68.97MB/s
    4.78% of 1464.99MB at 74.07MB/s
    5.46% of 1464.99MB at 81.97MB/s
    6.14% of 1464.99MB at 104.17MB/s

We are now able to download videos much faster. During my tests, certain downloads came close to fully utilizing a 1 Gb/s connection. However, the average speeds typically ranged between 50-70 MB/s, or 400-560 Mb/s, which is still quite fast.
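The segment arithmetic behind the Range header is easy to get wrong by one byte, because both bounds of a range are inclusive. The following sketch isolates that calculation in a generator. byteRanges is a hypothetical helper, and unlike the download function above it makes every segment exactly segmentSize bytes long (except the last one).

```javascript
// Split `length` bytes into inclusive, non-overlapping [start, end] ranges.
function* byteRanges(length, segmentSize) {
    for (let start = 0; start < length; start += segmentSize) {
        // The last byte of a file of `length` bytes has index `length - 1`
        const end = Math.min(start + segmentSize, length) - 1;
        yield { start, end, header: `bytes=${start}-${end}` };
    }
}

const ranges = [...byteRanges(25, 10)];
console.log(ranges.map(r => r.header).join(", "));
// bytes=0-9, bytes=10-19, bytes=20-24
```

Each header value could then be sent as the Range header of one request, as the download function does.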
Post-processing
YouTube distributes the video and audio channels in two separate files. This approach helps save space, as an HD or UHD video can reuse the same audio file. Additionally, some videos now offer different audio channels based on the language. Therefore, the final step is to combine these two channels into a single file, and for that, we can simply use ffmpeg.

    import child_process from 'child_process';
    // fs is the same 'fs' module used by download()

    /**
     * Using ffmpeg, combine the audio and video file into one
     */
    async function combineChannels(destinationFile, videoFile, audioFile) {
        await fs.promises.rm(destinationFile, { force: true });
        child_process.spawnSync('ffmpeg', [
            "-y",
            "-i", videoFile,
            "-i", audioFile,
            "-c", "copy",
            "-map", "0:v:0",
            "-map", "1:a:0",
            destinationFile
        ]);

        await fs.promises.rm(videoFile, { force: true });
        await fs.promises.rm(audioFile, { force: true });
    }

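One caveat: the function above deletes the two source files even if ffmpeg fails or is not installed. A possible safeguard, sketched here under the assumption that the object returned by spawnSync is inspected before the cleanup, is a small check of its error, status and stderr properties. assertProcessSucceeded is a hypothetical helper, and the result objects below are hand-written stand-ins.

```javascript
// Throw if a spawnSync-style result indicates that the command failed.
function assertProcessSucceeded(command, result) {
    if (result.error) {
        // For example: the binary is not installed or not on the PATH
        throw new Error(`${command} could not be started: ${result.error.message}`);
    }
    if (result.status !== 0) {
        const stderr = result.stderr ? String(result.stderr).trim() : "";
        throw new Error(`${command} exited with status ${result.status}: ${stderr}`);
    }
}

// Hand-written result objects, shaped like the return value of spawnSync
assertProcessSucceeded("ffmpeg", { status: 0, stderr: "" });
try {
    assertProcessSucceeded("ffmpeg", { status: 1, stderr: "Invalid data found" });
} catch (err) {
    console.log(err.message);
}
```

In combineChannels, the return value of child_process.spawnSync could be passed to this helper before the two fs.promises.rm calls, so the downloaded files survive a failed merge.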
Finally, for those interested, the complete code can be downloaded here.
Conclusion
Many projects currently use these techniques to bypass the restrictions put in place by YouTube to prevent video downloads. The most popular one is yt-dlp (a fork of youtube-dl), programmed in Python, but it includes its own custom JavaScript interpreter to transform the n parameter.
- yt-dlp
- https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/youtube.py
- VLC media player
- https://github.com/videolan/vlc/blob/master/share/lua/playlist/youtube.lua
- NewPipe
- https://github.com/Theta-Dev/NewPipeExtractor/blob/dev/extractor/src/main/java/org/schabi/newpipe/extractor/services/youtube/YoutubeJavaScriptExtractor.java
- node-ytdl-core
- https://github.com/fent/node-ytdl-core/blob/master/lib/sig.js
Copyright for syndicated content belongs to the linked source: Hacker News – https://blog.0x7d0.dev/history/how-they-bypass-youtube-video-download-throttling/