Check-in [02b80a0c9c]
Overview
Comment: | yaydl 0.10.0, on the road to 1.0.0: * New feature: Playlist support for handlers! Might fix issues like #1. * New site: xHamster. (Uses the new playlist support. Yay!) * The WebDriver port can be set as an environment variable now to save some typing. * The progress bar is now cleared after a download is finished. * cargo will now strip the resulting binary when compiling in release mode. |
---|---|
SHA3-256: |
02b80a0c9c36164e180a69463d3dec19 |
User & Date: | Cthulhux 2022-05-25 21:54:15 |
Context
2022-06-30
| ||
22:33 | yaydl 0.10.1: Fixed YouTube regex for extended URLs (tbd: playlists...) check-in: 8b8d3607ec user: Cthulhux tags: trunk, release-0.10.1 | |
2022-05-25
| ||
21:54 | yaydl 0.10.0, on the road to 1.0.0: * New feature: Playlist support for handlers! Might fix issues like #1. * New site: xHamster. (Uses the new playlist support. Yay!) * The WebDriver port can be set as an environment variable now to save some typing. * The progress bar is now cleared after a download is finished. * cargo will now strip the resulting binary when compiling in release mode. check-in: 02b80a0c9c user: Cthulhux tags: trunk, release-0.10.0 | |
2022-05-24
| ||
02:46 | README clarifications check-in: 5e38d5f13c user: Cthulhux tags: trunk | |
Changes
Changes to Cargo.lock.
︙ | ︙ | |||
688 689 690 691 692 693 694 695 696 697 698 699 700 701 | name = "log" version = "0.4.17" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "abb12e687cfb44aa40f41fc3978ef76448f9b6038cad6aef4259d3c095a2382e" dependencies = [ "cfg-if", ] [[package]] name = "mac" version = "0.1.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c41e0c4fef86961ac6d6f8a82609f55f31b05e4fce149ac5710e439df7619ba4" | > > > > > > > > > | 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 | name = "log" version = "0.4.17" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "abb12e687cfb44aa40f41fc3978ef76448f9b6038cad6aef4259d3c095a2382e" dependencies = [ "cfg-if", ] [[package]] name = "m3u8-rs" version = "4.0.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c27f4a86278e7d10f93c8c97f0191f85a071a45fa4245c261539465729c6d947" dependencies = [ "nom", ] [[package]] name = "mac" version = "0.1.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c41e0c4fef86961ac6d6f8a82609f55f31b05e4fce149ac5710e439df7619ba4" |
︙ | ︙ | |||
727 728 729 730 731 732 733 734 735 736 737 738 739 740 | [[package]] name = "mime" version = "0.3.16" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "2a60c7ce501c71e03a9c9c0d35b861413ae925bd979cc7a4e30d060069aaac8d" [[package]] name = "miniz_oxide" version = "0.5.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "d2b29bd4bc3f33391105ebee3589c19197c4271e3e5a9ec9bfe8127eeff8f082" dependencies = [ "adler", | > > > > > > | 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 | [[package]] name = "mime" version = "0.3.16" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "2a60c7ce501c71e03a9c9c0d35b861413ae925bd979cc7a4e30d060069aaac8d" [[package]] name = "minimal-lexical" version = "0.2.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "68354c5c6bd36d73ff3feceb05efa59b6acb7626617f4962be322a825e61f79a" [[package]] name = "miniz_oxide" version = "0.5.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "d2b29bd4bc3f33391105ebee3589c19197c4271e3e5a9ec9bfe8127eeff8f082" dependencies = [ "adler", |
︙ | ︙ | |||
777 778 779 780 781 782 783 784 785 786 787 788 789 790 | checksum = "e4a24736216ec316047a1fc4252e27dabb04218aa4a3f37c6e7ddbf1f9782b54" [[package]] name = "nodrop" version = "0.1.14" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "72ef4a56884ca558e5ddb05a1d1e7e1bfd9a68d9ed024c21704cc98872dae1bb" [[package]] name = "num-integer" version = "0.1.45" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "225d3389fb3509a24c93f5c29eb6bde2586b98d9f016636dff58d7c6f7569cd9" dependencies = [ | > > > > > > > > > > | 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 | checksum = "e4a24736216ec316047a1fc4252e27dabb04218aa4a3f37c6e7ddbf1f9782b54" [[package]] name = "nodrop" version = "0.1.14" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "72ef4a56884ca558e5ddb05a1d1e7e1bfd9a68d9ed024c21704cc98872dae1bb" [[package]] name = "nom" version = "7.1.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "a8903e5a29a317527874d0402f867152a3d21c908bb0b933e416c65e301d4c36" dependencies = [ "memchr", "minimal-lexical", ] [[package]] name = "num-integer" version = "0.1.45" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "225d3389fb3509a24c93f5c29eb6bde2586b98d9f016636dff58d7c6f7569cd9" dependencies = [ |
︙ | ︙ | |||
1896 1897 1898 1899 1900 1901 1902 | name = "windows_x86_64_msvc" version = "0.36.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c811ca4a8c853ef420abd8592ba53ddbbac90410fab6903b3e79972a631f7680" [[package]] name = "yaydl" | | > > | 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 | name = "windows_x86_64_msvc" version = "0.36.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c811ca4a8c853ef420abd8592ba53ddbbac90410fab6903b3e79972a631f7680" [[package]] name = "yaydl" version = "0.10.0" dependencies = [ "anyhow", "cienli", "clap", "fantoccini", "indicatif", "inventory", "m3u8-rs", "nom", "regex", "scraper", "serde_json", "tokio", "ureq", "url", "urlencoding", ] |
Changes to Cargo.toml.
1 2 3 | [package] name = "yaydl" description = "yet another youtube (and more) down loader" | | > > > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | [package] name = "yaydl" description = "yet another youtube (and more) down loader" version = "0.10.0" authors = ["Cthulhux <git@tuxproject.de>"] edition = "2021" license = "CDDL-1.0" repository = "https://code.rosaelefanten.org/yaydl" categories = ["command-line-utilities"] keywords = ["youtube", "downloading", "video"] [dependencies] anyhow = "1.0" cienli = "0.3" clap = { version = "3.1", features = ["derive"] } fantoccini = "0.19" indicatif = "0.16" inventory = "0.1" m3u8-rs = "4.0" nom = "7.1" regex = "1.5" scraper = "0.13" serde_json = "1.0" tokio = { version = "1", features = ["rt"] } ureq = { version = "2.4", features = ["json"] } url = "2.2" urlencoding = "2.1" [profile.release] lto = true strip = true |
Changes to README.md.
︙ | ︙ | |||
10 11 12 13 14 15 16 | % yaydl --help # Features * Can download videos. * Can optionally keep only the audio part of them. | | < | < < < < < < < < < < | 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 | % yaydl --help # Features * Can download videos. * Can optionally keep only the audio part of them. * Could convert the resulting file to something else (requires the `ffmpeg` binary). * Comes as a single binary (once compiled) - take it everywhere on your thumbdrive, no Python cruft required. ## Currently supported sites * porndoe.com · vidoza.net · vimeo.com · vivo.sx · voe.sx · watchmdh.to · xhamster.com · youtube.com There is an easy way to add more supported sites, see below for details. ## Non-features The list of features is deliberately kept short: * No output quality choice. `yaydl` assumes that you have a large hard drive and your internet connection is good enough, or else you would stream, not download. * No complex filters. This is a downloading tool. * No image file support. Videos only. ## How to install ### From the source code Install Rust (e.g. with [rustup](https://rustup.rs)), then: **using Fossil:** |
︙ | ︙ | |||
76 77 78 79 80 81 82 | Other package managers: * Nobody has provided any other packages for `yaydl` yet. You can help! # How to use the web driver (very beta, at your own risk!) | | | | > | 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 | Other package managers: * Nobody has provided any other packages for `yaydl` yet. You can help! # How to use the web driver (very beta, at your own risk!) For some video sites, `yaydl` needs to be able to parse a JavaScript on them. For this, it needs to be able to spawn a headless web browser. It requires Google Chrome, Microsoft Edge or Mozilla Firefox to be installed and running on your system. 1. Install and run [ChromeDriver](https://chromedriver.chromium.org) (if you use Chrome), the *typically* named [Microsoft Edge WebDriver](https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/) (if you use Edge) or [geckodriver](https://github.com/mozilla/geckodriver/releases) (if you use Firefox) for your platform. 2. Tell `yaydl` that you have a web driver running: `yaydl --webdriver <port> <video URL>`. (The drivers usually run on port 4444 or 9515, please consult their documentation if you are not sure.) *Hint:* If you need this feature regularly, you can also use the environment variable `YAYDL_WEBDRIVER_PORT` to set the port number for all further requests. 3. In theory, it should be possible to use more sites with `yaydl` now. :-) # How to contribute code 1. Read and agree to the [Code of ~~Conduct~~ Merit](CODE_OF_CONDUCT.md). 2. Implicitly agree to the [LICENSE](LICENSE). Nobody reads those. I don't either. 3. Find out if anyone has filed a GitHub Issue or even sent a Pull Request yet. Act accordingly. |
︙ | ︙ | |||
120 121 122 123 124 125 126 | // - onlyaudio: true if only the audio part of the video should be // kept, else false. fn can_handle_url<'a>(&'a self, url: &'a str, webdriver_port: u16) -> bool { // Return true here if <url> can be covered by this handler. // Note that yaydl will skip all other handlers then. true } | | | | | > > > > > | > | | | 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 | // - onlyaudio: true if only the audio part of the video should be // kept, else false. fn can_handle_url<'a>(&'a self, url: &'a str, webdriver_port: u16) -> bool { // Return true here if <url> can be covered by this handler. // Note that yaydl will skip all other handlers then. true } fn does_video_exist<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<bool> { // Return true here, if the video exists. Ok(false) } fn is_playlist<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<bool> { // Return true here, if the download link is a playlist. Ok(false) } fn find_video_title<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<String> { // Return the video title from <url> here. Ok("".to_string()) } fn find_video_direct_url<'a>(&'a self, url: &'a str, webdriver_port: u16, onlyaudio: bool) -> Result<String> { // Return the direct download URL of the video (or its audio version) here. // Exception: If is_playlist() is true, return the playlist URL here instead. Ok("".to_string()) } fn find_video_file_extension<'a>(&'a self, url: &'a str, webdriver_port: u16, onlyaudio: bool) -> Result<String> { // Return the designated file extension of the video (or audio) file here. Ok("mp4".to_string()) } fn display_name<'a>(&'a self) -> String { // For cosmetics, this is the display name of this handler. 
"NoopExample" } fn web_driver_required<'a>(&'a self) -> bool { // Return true here, if the implementation requires a web driver to be running. false } } // Push the site definition to the list of known handlers: inventory::submit! { &NoopExampleHandler as &dyn SiteDefinition } ``` ### Fix some bugs or add new features |
︙ | ︙ | |||
175 176 177 178 179 180 181 182 183 | * Liberapay: [Cthulhux](https://liberapay.com/Cthulhux/donate) Thank you. ## Contact * Twitter: [@tux0r](https://twitter.com/tux0r) * IRC: `irc.oftc.net/yaydl` * Matrix: @tux0r:matrix.org | > | 171 172 173 174 175 176 177 178 179 180 | * Liberapay: [Cthulhux](https://liberapay.com/Cthulhux/donate) Thank you. ## Contact * Twitter: [@tux0r](https://twitter.com/tux0r) * Forum: [DonationCoder.com](https://www.donationcoder.com/forum/index.php?topic=50691.0) * IRC: `irc.oftc.net/yaydl` * Matrix: @tux0r:matrix.org |
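The `YAYDL_WEBDRIVER_PORT` hint in the README changes above can be illustrated with a small sketch. It assumes the precedence described in the `src/main.rs` comment of this check-in: the command-line value wins, the environment variable is the fallback, and 0 is the default. The helper names `resolve_webdriver_port` and `webdriver_port_from_env` are hypothetical and not part of yaydl; an unparsable environment value is treated here as "not set", which is an assumption as well.

```rust
use std::env;

/// Hypothetical helper: picks the web driver port.
/// `cli_port` is the value of `--webdriver` (if given), `env_value` is the
/// raw content of the environment variable (if set).
fn resolve_webdriver_port(cli_port: Option<u16>, env_value: Option<&str>) -> u16 {
    cli_port
        // Fall back to the environment only if no CLI value was given.
        .or_else(|| env_value.and_then(|v| v.trim().parse::<u16>().ok()))
        // Neither source provided a usable port: default to 0.
        .unwrap_or(0)
}

/// Hypothetical wrapper that reads the real process environment.
fn webdriver_port_from_env(cli_port: Option<u16>) -> u16 {
    let env_value = env::var("YAYDL_WEBDRIVER_PORT").ok();
    resolve_webdriver_port(cli_port, env_value.as_deref())
}
```

Keeping the decision in a function that takes plain values makes it testable without touching the process environment; only the thin wrapper reads `YAYDL_WEBDRIVER_PORT`.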
Changes to src/definitions.rs.
︙ | ︙ | |||
22 23 24 25 26 27 28 29 30 31 | pub trait SiteDefinition { // true, if this site can handle <url>. fn can_handle_url<'a>(&'a self, url: &'a str) -> bool; // true, if the video exists. fn does_video_exist<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<bool>; // returns the title of a video. fn find_video_title<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<String>; | > > > | | 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 | pub trait SiteDefinition { // true, if this site can handle <url>. fn can_handle_url<'a>(&'a self, url: &'a str) -> bool; // true, if the video exists. fn does_video_exist<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<bool>; // true, if the URL is a playlist. fn is_playlist<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<bool>; // returns the title of a video. fn find_video_title<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<String>; // returns the download URL of a video or playlist. fn find_video_direct_url<'a>( &'a self, url: &'a str, webdriver_port: u16, onlyaudio: bool, ) -> Result<String>; |
︙ | ︙ |
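How a caller might use the new `is_playlist` method can be sketched as follows. This is a simplified stand-in, not yaydl's actual dispatch code: the trait `Site`, the enum `DownloadPlan`, the function `plan_download` and both example sites are hypothetical, the web driver port and `onlyaudio` parameters are left out, and `String` errors replace `anyhow`.

```rust
/// Simplified stand-in for the `SiteDefinition` trait (assumption: only the
/// two methods that matter for the playlist decision).
trait Site {
    fn is_playlist(&self, url: &str) -> Result<bool, String>;
    fn find_video_direct_url(&self, url: &str) -> Result<String, String>;
}

/// What the caller should do with the URL the handler returned.
#[derive(Debug, PartialEq)]
enum DownloadPlan {
    /// Fetch the file in one request.
    Direct(String),
    /// Fetch the playlist, then fetch and concatenate its segments.
    Playlist(String),
}

/// Asks the handler for its URL and tags it as direct or playlist.
fn plan_download(site: &dyn Site, url: &str) -> Result<DownloadPlan, String> {
    let target = site.find_video_direct_url(url)?;
    if site.is_playlist(url)? {
        Ok(DownloadPlan::Playlist(target))
    } else {
        Ok(DownloadPlan::Direct(target))
    }
}

/// Example handler without playlists.
struct DirectSite;
impl Site for DirectSite {
    fn is_playlist(&self, _url: &str) -> Result<bool, String> {
        Ok(false)
    }
    fn find_video_direct_url(&self, url: &str) -> Result<String, String> {
        Ok(format!("{}/video.mp4", url))
    }
}

/// Example handler that always returns a playlist URL.
struct PlaylistSite;
impl Site for PlaylistSite {
    fn is_playlist(&self, _url: &str) -> Result<bool, String> {
        Ok(true)
    }
    fn find_video_direct_url(&self, url: &str) -> Result<String, String> {
        Ok(format!("{}/index.m3u8", url))
    }
}
```

The point of the sketch is that the same URL-returning method serves both cases, which matches the updated trait comment ("returns the download URL of a video or playlist").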
Added src/download.rs.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 | /* * The contents of this file are subject to the terms of the * Common Development and Distribution License, Version 1.0 only * (the "License"). You may not use this file except in compliance * with the License. * * See the file LICENSE in this distribution for details. * A copy of the CDDL is also available via the Internet at * http://www.opensource.org/licenses/cddl1.txt * * When distributing Covered Code, include this CDDL HEADER in each * file and include the contents of the LICENSE file from this * distribution. */ // Yet Another Youtube Down Loader // - download.rs file - use anyhow::Result; use indicatif::{ProgressBar, ProgressStyle}; use nom::Finish; use std::{ fs, io::{self, copy, Read}, path::Path, }; use url::Url; struct DownloadProgress<'a, R> { inner: R, progress_bar: &'a ProgressBar, } impl<R: Read> Read for DownloadProgress<'_, R> { fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> { self.inner.read(buf).map(|n| { self.progress_bar.inc(n as u64); n }) } } pub fn download_from_playlist(url: &str, filename: &str, verbose: bool) -> Result<()> { // Download the playlist file into the temporary directory: if verbose { println!("{}", "Found a playlist. 
Fetching ..."); } let mut url = Url::parse(url)?; let request = ureq::get(url.as_str()); let playlist_text = request.call()?.into_string()?; if verbose { println!("{}", "Parsing ..."); } // Parse the playlist: let playlist = m3u8_rs::parse_media_playlist(&playlist_text.as_bytes()) .finish() .unwrap(); // Grab and concatenate the segments from the playlist: let file = Path::new(&filename); let mut dest = fs::OpenOptions::new() .create(true) .append(true) .open(&file)?; // Display a progress bar: let total_cnt = playlist.1.segments.len() as u64; let pb = ProgressBar::new(total_cnt); pb.set_style( ProgressStyle::default_bar() .template("{spinner:.green} [{elapsed_precise}] [{bar:40.green/blue}] {percent}%") .progress_chars("#>-"), ); for segment in &playlist.1.segments { // .m3u8 playlists are usually relative. // Take the original path (from the playlist) and replace // the playlist itself by the video (e.g): // playlist URL: https://foo.bar/play/file.m3u8 // playlist item: file1.ts // result: https://foo.bar/play/file1.ts url.path_segments_mut().unwrap().pop().push(&segment.uri); let request = ureq::get(url.as_str()); let mut source = request.call()?.into_reader(); // Note: As we opened the file for appending only, // file concatenation happens automatically. 
let _ = copy(&mut source, &mut dest)?; // Update the progress bar: pb.inc(1); } pb.finish_and_clear(); Ok(()) } pub fn download(url: &str, filename: &str) -> Result<()> { let url = Url::parse(url)?; let resp = ureq::get(url.as_str()).call()?; // Find the video size: let total_size = resp .header("Content-Length") .unwrap_or("0") .parse::<u64>()?; let mut request = ureq::get(url.as_str()); // Display a progress bar: let pb = ProgressBar::new(total_size); pb.set_style(ProgressStyle::default_bar().template("{spinner:.green} [{elapsed_precise}] [{bar:40.green/blue}] {bytes}/{total_bytes} ({eta})").progress_chars("#>-")); let file = Path::new(filename); if file.exists() { // Continue the file: let size = file.metadata()?.len() - 1; // Override the range: request = ureq::get(url.as_str()) .set("Range", &format!("bytes={}-", size)) .to_owned(); pb.inc(size); } let resp = request.call()?; let mut source = DownloadProgress { progress_bar: &pb, inner: resp.into_reader(), }; let mut dest = fs::OpenOptions::new() .create(true) .append(true) .open(&file)?; let _ = copy(&mut source, &mut dest)?; pb.finish_and_clear(); Ok(()) } |
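The relative-segment handling described in the comments of `download_from_playlist` can be sketched without the `url` crate. The function `resolve_segment_url` is hypothetical and deliberately simpler than the committed code: it works on plain strings, drops any query string or fragment of the playlist URL, and returns absolute segment URIs unchanged. Those three behaviours are assumptions of this sketch, not statements about what yaydl does.

```rust
/// Hypothetical helper: replaces the last path segment of `playlist_url`
/// with `segment_uri`.
fn resolve_segment_url(playlist_url: &str, segment_uri: &str) -> String {
    // Absolute segment URIs need no resolution.
    if segment_uri.contains("://") {
        return segment_uri.to_string();
    }

    // Simplification: ignore the query string and fragment of the playlist URL.
    let base = playlist_url
        .split(|c: char| c == '?' || c == '#')
        .next()
        .unwrap_or(playlist_url);

    // Skip the "scheme://" part so its slashes are not mistaken for the path.
    let authority_start = base.find("://").map(|i| i + 3).unwrap_or(0);

    match base[authority_start..].rfind('/') {
        // Keep everything up to and including the last '/' of the path.
        Some(idx) => format!("{}{}", &base[..=authority_start + idx], segment_uri),
        // No path at all: append the segment directly below the host.
        None => format!("{}/{}", base, segment_uri),
    }
}
```

With the example from the source comment, `resolve_segment_url("https://foo.bar/play/file.m3u8", "file1.ts")` yields `https://foo.bar/play/file1.ts`.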
Changes to src/ffmpeg.rs.
︙ | ︙ | |||
27 28 29 30 31 32 33 | .arg("-i") .arg(inputfile) .arg("-vn") // Skip the video streams. .arg("-loglevel") .arg("quiet") // Shut the fuck up. .arg(outputfile) .output() | > > | > > > > > > > > > > > > > | 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 | .arg("-i") .arg(inputfile) .arg("-vn") // Skip the video streams. .arg("-loglevel") .arg("quiet") // Shut the fuck up. .arg(outputfile) .output() .expect("Please install ffmpeg to convert the file into audio."); } pub fn ts_to_mp4(inputfile: &Path, outputfile: &Path) { Command::new("ffmpeg") .arg("-i") .arg(inputfile) .arg("-acodec") .arg("copy") .arg("-vcodec") .arg("copy") .arg("-loglevel") .arg("quiet") // Shut the fuck up. .arg(outputfile) .output() .expect("Please install ffmpeg to convert the file into MP4."); } |
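In `ts_to_mp4` above, `.expect(...)` only fires when the `ffmpeg` process cannot be started; a non-zero exit code from a started `ffmpeg` is not inspected. The following sketch shows one possible variant that also checks the exit status. The function names `ts_to_mp4_args` and `ts_to_mp4_checked` are hypothetical, and returning `io::Result` is a design choice of this sketch, not the committed behaviour. The `ffmpeg` arguments are the same as in the diff.

```rust
use std::ffi::OsString;
use std::io;
use std::path::Path;
use std::process::Command;

/// Builds the argument list used for the .ts -> .mp4 stream copy.
fn ts_to_mp4_args(inputfile: &Path, outputfile: &Path) -> Vec<OsString> {
    vec![
        OsString::from("-i"),
        inputfile.as_os_str().to_os_string(),
        OsString::from("-acodec"),
        OsString::from("copy"), // Copy the audio stream without re-encoding.
        OsString::from("-vcodec"),
        OsString::from("copy"), // Copy the video stream without re-encoding.
        OsString::from("-loglevel"),
        OsString::from("quiet"),
        outputfile.as_os_str().to_os_string(),
    ]
}

/// Runs ffmpeg and reports both start-up failures and non-zero exit codes.
fn ts_to_mp4_checked(inputfile: &Path, outputfile: &Path) -> io::Result<()> {
    let status = Command::new("ffmpeg")
        .args(ts_to_mp4_args(inputfile, outputfile))
        .status()?; // Fails here if ffmpeg is not installed or not in PATH.

    if status.success() {
        Ok(())
    } else {
        Err(io::Error::new(
            io::ErrorKind::Other,
            format!("ffmpeg exited with {}", status),
        ))
    }
}
```

Separating the argument list from the process call keeps the flags testable on machines where `ffmpeg` is not installed.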
Changes to src/handlers.rs.
︙ | ︙ | |||
18 19 20 21 22 23 24 25 26 27 | mod porndoe; mod vidoza; mod vimeo; mod vivo; mod voe; mod watchmdh; mod youtube; // Add your own modules here. | > | 18 19 20 21 22 23 24 25 26 27 28 | mod porndoe; mod vidoza; mod vimeo; mod vivo; mod voe; mod watchmdh; mod xhamster; mod youtube; // Add your own modules here. |
Changes to src/handlers/porndoe.rs.
︙ | ︙ | |||
69 70 71 72 73 74 75 76 77 78 79 80 81 82 | // Implement the site definition: struct PornDoeHandler; impl SiteDefinition for PornDoeHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"porndoe.com/.+").unwrap().is_match(url) } fn find_video_title<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url, webdriver_port)?; let h1_selector = Selector::parse("h1.-heading").unwrap(); let text = video_info.select(&h1_selector).next(); | > > > > > | 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 | // Implement the site definition: struct PornDoeHandler; impl SiteDefinition for PornDoeHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"porndoe.com/.+").unwrap().is_match(url) } fn is_playlist<'a>(&'a self, _url: &'a str, _webdriver_port: u16) -> Result<bool> { // PornDoe has no playlists. Ok(false) } fn find_video_title<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url, webdriver_port)?; let h1_selector = Selector::parse("h1.-heading").unwrap(); let text = video_info.select(&h1_selector).next(); |
︙ | ︙ |
Changes to src/handlers/vidoza.rs.
︙ | ︙ | |||
41 42 43 44 45 46 47 48 49 50 51 52 53 54 | // Implement the site definition: struct VidozaHandler; impl SiteDefinition for VidozaHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"vidoza.net/.+").unwrap().is_match(url) } fn find_video_title<'a>(&'a self, url: &'a str, _webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url)?; // Currently, there only is one <H1> on Vidoza. Good for us. let h1_selector = Selector::parse("h1").unwrap(); | > > > > > | 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 | // Implement the site definition: struct VidozaHandler; impl SiteDefinition for VidozaHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"vidoza.net/.+").unwrap().is_match(url) } fn is_playlist<'a>(&'a self, _url: &'a str, _webdriver_port: u16) -> Result<bool> { // Vidoza does not seem to have playlists? Ok(false) } fn find_video_title<'a>(&'a self, url: &'a str, _webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url)?; // Currently, there only is one <H1> on Vidoza. Good for us. let h1_selector = Selector::parse("h1").unwrap(); |
︙ | ︙ |
Changes to src/handlers/vimeo.rs.
︙ | ︙ | |||
66 67 68 69 70 71 72 73 74 75 76 77 78 79 | // Implement the site definition: struct VimeoHandler; impl SiteDefinition for VimeoHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"(?:www\.)?vimeo.com/.+").unwrap().is_match(url) } fn find_video_title<'a>(&'a self, _url: &'a str, _webdriver_port: u16) -> Result<String> { unsafe { let ret = &VIDEO_TITLE; Ok(ret.to_string()) } } | > > > > > | 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 | // Implement the site definition: struct VimeoHandler; impl SiteDefinition for VimeoHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"(?:www\.)?vimeo.com/.+").unwrap().is_match(url) } fn is_playlist<'a>(&'a self, _url: &'a str, _webdriver_port: u16) -> Result<bool> { // Vimeo seems to have no playlists? Ok(false) } fn find_video_title<'a>(&'a self, _url: &'a str, _webdriver_port: u16) -> Result<String> { unsafe { let ret = &VIDEO_TITLE; Ok(ret.to_string()) } } |
︙ | ︙ |
Changes to src/handlers/vivo.rs.
︙ | ︙ | |||
43 44 45 46 47 48 49 50 51 52 53 54 55 56 | // Implement the site definition: struct VivoHandler; impl SiteDefinition for VivoHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"vivo.sx/.+").unwrap().is_match(url) } fn find_video_title<'a>(&'a self, url: &'a str, _webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url)?; let title_selector = Selector::parse("div.stream-content").unwrap(); let title_elem = video_info.select(&title_selector).next().unwrap(); | > > > > > | 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 | // Implement the site definition: struct VivoHandler; impl SiteDefinition for VivoHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"vivo.sx/.+").unwrap().is_match(url) } fn is_playlist<'a>(&'a self, _url: &'a str, _webdriver_port: u16) -> Result<bool> { // Vivo has no playlists. Ok(false) } fn find_video_title<'a>(&'a self, url: &'a str, _webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url)?; let title_selector = Selector::parse("div.stream-content").unwrap(); let title_elem = video_info.select(&title_selector).next().unwrap(); |
︙ | ︙ |
Changes to src/handlers/voe.rs.
︙ | ︙ | |||
41 42 43 44 45 46 47 48 49 50 51 52 53 54 | // Implement the site definition: struct VoeHandler; impl SiteDefinition for VoeHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"(?:\.)?voe.sx/.+").unwrap().is_match(url) } fn find_video_title<'a>(&'a self, url: &'a str, _webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url)?; let h1_selector = Selector::parse("h1.mt-1").unwrap(); let text = video_info.select(&h1_selector).next(); | > > > > > | 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 | // Implement the site definition: struct VoeHandler; impl SiteDefinition for VoeHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"(?:\.)?voe.sx/.+").unwrap().is_match(url) } fn is_playlist<'a>(&'a self, _url: &'a str, _webdriver_port: u16) -> Result<bool> { // TODO: Does VOE still have playlists? Ok(false) } fn find_video_title<'a>(&'a self, url: &'a str, _webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url)?; let h1_selector = Selector::parse("h1.mt-1").unwrap(); let text = video_info.select(&h1_selector).next(); |
︙ | ︙ |
Changes to src/handlers/watchmdh.rs.
︙ | ︙ | |||
58 59 60 61 62 63 64 65 66 67 68 69 70 71 | // Implement the site definition: struct WatchMDHHandler; impl SiteDefinition for WatchMDHHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"watchmdh.to/.+").unwrap().is_match(url) } fn find_video_title<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url, webdriver_port)?; let title_selector = Selector::parse(r#"meta[property="og:title"]"#).unwrap(); let title_elem = video_info.select(&title_selector).next().unwrap(); | > > > > > | 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 | // Implement the site definition: struct WatchMDHHandler; impl SiteDefinition for WatchMDHHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"watchmdh.to/.+").unwrap().is_match(url) } fn is_playlist<'a>(&'a self, _url: &'a str, _webdriver_port: u16) -> Result<bool> { // WatchMDH has no playlists. Ok(false) } fn find_video_title<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url, webdriver_port)?; let title_selector = Selector::parse(r#"meta[property="og:title"]"#).unwrap(); let title_elem = video_info.select(&title_selector).next().unwrap(); |
︙ | ︙ |
Added src/handlers/xhamster.rs.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 | /* * The contents of this file are subject to the terms of the * Common Development and Distribution License, Version 1.0 only * (the "License"). You may not use this file except in compliance * with the License. * * See the file LICENSE in this distribution for details. * A copy of the CDDL is also available via the Internet at * http://www.opensource.org/licenses/cddl1.txt * * When distributing Covered Code, include this CDDL HEADER in each * file and include the contents of the LICENSE file from this * distribution. */ // Yet Another Youtube Down Loader // - xHamster handler - use crate::definitions::SiteDefinition; use anyhow::{anyhow, Result}; use fantoccini::ClientBuilder; use nom::Finish; use regex::Regex; use scraper::{Html, Selector}; use tokio::runtime; use url::Url; static mut VIDEO_INFO: String = String::new(); unsafe fn get_video_info(url: &str, webdriver_port: u16) -> Result<Html> { if VIDEO_INFO.is_empty() { // We need to fetch the video information first. // It will contain the whole body for now. 
let local_url = url.to_owned(); let rt = runtime::Builder::new_current_thread() .enable_time() .enable_io() .build() .unwrap(); rt.block_on(async move { let webdriver_url = format!("http://localhost:{}", webdriver_port); let c = ClientBuilder::native() .connect(&webdriver_url) .await .expect("failed to connect to web driver"); c.goto(&local_url).await.expect("could not go to the URL"); let body = c.source().await.expect("could not read the site source"); c.close_window().await.expect("could not close the window"); VIDEO_INFO = body; }); } // Return it: let d = Html::parse_document(&VIDEO_INFO); Ok(d) } // Implement the site definition: struct XHamsterHandler; impl SiteDefinition for XHamsterHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"xhamster.com/.+").unwrap().is_match(url) } fn is_playlist<'a>(&'a self, _url: &'a str, _webdriver_port: u16) -> Result<bool> { // xHamster has playlists. Ok(true) } fn find_video_title<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<String> { unsafe { let video_info = get_video_info(url, webdriver_port)?; let h1_selector = Selector::parse("h1").unwrap(); let text = video_info.select(&h1_selector).next(); let result = match text { Some(txt) => txt.text().collect(), None => return Err(anyhow!("Could not extract the video title.")), }; Ok(result) } } fn find_video_direct_url<'a>( &'a self, url: &'a str, webdriver_port: u16, _onlyaudio: bool, ) -> Result<String> { unsafe { let video_info = get_video_info(url, webdriver_port)?; // Find the playlist first: let url_selector = Selector::parse(r#"link[rel="preload"][as="fetch"]"#).unwrap(); let url_elem = video_info.select(&url_selector).next().unwrap(); let url_contents = url_elem.value().attr("href").unwrap(); let mut playlist_url = Url::parse(url_contents)?; let request = ureq::get(playlist_url.as_str()); let playlist_text = request.call()?.into_string()?; // Parse the playlist: let playlist = 
m3u8_rs::parse_media_playlist(&playlist_text.as_bytes()) .finish() .unwrap(); // Grab the last (= best) segment from the media playlist to find the video "playlist" // (which contains all segments of the video): let video_uri = &playlist.1.segments.last().ok_or("").unwrap().uri; // xHamster uses relative URIs in its playlists, so we'll only need to replace // the last URL segment: playlist_url .path_segments_mut() .unwrap() .pop() .push(video_uri); Ok(playlist_url.to_string()) } } fn does_video_exist<'a>(&'a self, url: &'a str, webdriver_port: u16) -> Result<bool> { unsafe { let _video_info = get_video_info(url, webdriver_port); Ok(!VIDEO_INFO.is_empty()) } } fn display_name<'a>(&'a self) -> String { "xHamster".to_string() } fn find_video_file_extension<'a>( &'a self, _url: &'a str, _webdriver_port: u16, _onlyaudio: bool, ) -> Result<String> { Ok("ts".to_string()) } fn web_driver_required<'a>(&'a self) -> bool { true } } // Push the site definition to the list of known handlers: inventory::submit! { &XHamsterHandler as &dyn SiteDefinition } |
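The "last (= best) segment" step in `find_video_direct_url` can be illustrated with a std-only sketch. It does not use `m3u8-rs`; instead it scans the playlist text line by line, which is a swapped-in simplification. The function `last_playlist_uri` is hypothetical, and the idea that the last entry is the best variant is taken from the source comment, not verified here. Unlike the committed code, the sketch returns `None` for a playlist without entries instead of panicking.

```rust
/// Hypothetical helper: returns the last URI line of an .m3u8 text.
/// Lines starting with '#' are tags or comments; everything else is
/// treated as a URI (assumption: no further validation is needed).
fn last_playlist_uri(playlist_text: &str) -> Option<&str> {
    playlist_text
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .last()
}
```

For a playlist listing `low.m3u8` before `high.m3u8`, the function returns `Some("high.m3u8")`; the result would then be resolved against the playlist URL, as the handler does with `path_segments_mut()`.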
Changes to src/handlers/youtube.rs.
︙ | ︙ | |||
51 52 53 54 55 56 57 58 59 60 61 62 63 64 | struct YouTubeHandler; impl SiteDefinition for YouTubeHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"(?:www\.)?youtu(?:be\.com|\.be)/") .unwrap() .is_match(url) } fn find_video_title<'a>(&'a self, url: &'a str, _webdriver_port: u16) -> Result<String> { let id_regex = Regex::new(r"(?:v=|\.be/)(.*$)").unwrap(); let id = id_regex.captures(url).unwrap().get(1).unwrap().as_str(); unsafe { let video_info = get_video_info(id)?; let video_info_title = video_info["videoDetails"]["title"].as_str().unwrap_or(""); | > > > > > | 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 | struct YouTubeHandler; impl SiteDefinition for YouTubeHandler { fn can_handle_url<'a>(&'a self, url: &'a str) -> bool { Regex::new(r"(?:www\.)?youtu(?:be\.com|\.be)/") .unwrap() .is_match(url) } fn is_playlist<'a>(&'a self, _url: &'a str, _webdriver_port: u16) -> Result<bool> { // YouTube has broken domains, but no playlists. :-) Ok(false) } fn find_video_title<'a>(&'a self, url: &'a str, _webdriver_port: u16) -> Result<String> { let id_regex = Regex::new(r"(?:v=|\.be/)(.*$)").unwrap(); let id = id_regex.captures(url).unwrap().get(1).unwrap().as_str(); unsafe { let video_info = get_video_info(id)?; let video_info_title = video_info["videoDetails"]["title"].as_str().unwrap_or(""); |
︙
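With this change, every handler answers `is_playlist`, and the caller picks the download path from that answer. The sketch below shows the idea with a simplified stand-in trait. It is an assumption-laden illustration: yaydl's real `SiteDefinition` has more methods and returns `anyhow::Result`, and the three handler types and `download_strategy` are hypothetical.

```rust
// Simplified stand-in for yaydl's SiteDefinition trait (assumption: the real
// trait has more methods and uses anyhow::Result instead of Result<_, String>).
trait SiteDefinition {
    fn display_name(&self) -> String;
    fn is_playlist(&self, url: &str, webdriver_port: u16) -> Result<bool, String>;
}

// Hypothetical handler for a site that serves one file per video.
struct SingleFileSite;
impl SiteDefinition for SingleFileSite {
    fn display_name(&self) -> String {
        "SingleFileSite".to_string()
    }
    fn is_playlist(&self, _url: &str, _webdriver_port: u16) -> Result<bool, String> {
        Ok(false)
    }
}

// Hypothetical handler for a site that serves segmented playlists.
struct PlaylistSite;
impl SiteDefinition for PlaylistSite {
    fn display_name(&self) -> String {
        "PlaylistSite".to_string()
    }
    fn is_playlist(&self, _url: &str, _webdriver_port: u16) -> Result<bool, String> {
        Ok(true)
    }
}

// Hypothetical handler whose playlist check fails.
struct BrokenSite;
impl SiteDefinition for BrokenSite {
    fn display_name(&self) -> String {
        "BrokenSite".to_string()
    }
    fn is_playlist(&self, _url: &str, _webdriver_port: u16) -> Result<bool, String> {
        Err("lookup failed".to_string())
    }
}

// A failed check is treated as "not a playlist", like `unwrap_or(false)` in main.rs.
fn download_strategy(handler: &dyn SiteDefinition, url: &str, port: u16) -> &'static str {
    if handler.is_playlist(url, port).unwrap_or(false) {
        "playlist"
    } else {
        "single file"
    }
}

fn main() {
    let handlers: [&dyn SiteDefinition; 3] = [&SingleFileSite, &PlaylistSite, &BrokenSite];
    for handler in handlers.iter() {
        let strategy = download_strategy(*handler, "https://example.com/v/1", 0);
        println!("{}: {}", handler.display_name(), strategy);
    }
}
```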
Changes to src/main.rs.
︙
 */

// Yet Another Youtube Down Loader
// - main.rs file -

use anyhow::Result;
use clap::Parser;
use std::{
    env, fs,
    path::{Path, PathBuf},
    str::FromStr,
};

mod definitions;
mod download;
mod ffmpeg;
mod handlers;

#[derive(Parser)]
#[clap(version, about = "Yet Another Youtube Down Loader", long_about = None)]
struct Args {
    #[clap(long = "only-audio", short = 'x', help = "Only keeps the audio stream")]
︙
    #[clap(long, help = "The port of your web driver (required for some sites)")]
    webdriver: Option<u16>,

    #[clap(help = "Sets the input URL to use", index = 1)]
    url: String,
}

fn main() -> Result<()> {
    // Argument parsing:
    let args = Args::parse();
    let in_url = &args.url;

    inventory::collect!(&'static dyn definitions::SiteDefinition);
    let mut site_def_found = false;

    for handler in inventory::iter::<&dyn definitions::SiteDefinition> {
        // "15:15 And he found a pair of eyes, scanning the directories for files."
        // https://kingjamesprogramming.tumblr.com/post/123368869357/1515-and-he-found-a-pair-of-eyes-scanning-the
        // ------------------------------------

        // Find a known handler for <in_url>:
        if !handler.can_handle_url(in_url) {
            continue;
        }

        // This one is it.
        site_def_found = true;
        println!("Fetching from {}.", handler.display_name());

        // The WebDriver port could be an argument from the command line
        // or, to make life easier, from the environment variables
        // ("YAYDL_WEBDRIVER_PORT") if not specified there. It defaults
        // to 0.
        let mut webdriverport: u16 = 0;

        let webdriver_env = env::var("YAYDL_WEBDRIVER_PORT");

        if args.webdriver.is_some() {
            webdriverport = args.webdriver.unwrap();
        } else if webdriver_env.is_ok() {
            webdriverport = u16::from_str(&webdriver_env.unwrap_or("0".to_string())).unwrap_or(0);
        }

        if handler.web_driver_required() && webdriverport == 0 {
            // This handler would need a web driver, but none is supplied to yaydl.
            println!("{} requires a web driver installed and running as described in the README. Please tell yaydl which port to use (yaydl --webdriver <PORT>) and try again.", handler.display_name());
            continue;
        }

        let video_exists = handler.does_video_exist(in_url, webdriverport)?;

        if !video_exists {
            println!("The video could not be found. Invalid link?");
        } else {
            if args.verbose {
                println!("The requested video was found. Processing...");
            }

            let video_title = handler.find_video_title(in_url, webdriverport);
            let vt = match video_title {
                Err(_e) => "".to_string(),
                Ok(title) => title,
            };

            // Usually, we already find errors here.

            if vt.is_empty() {
                println!("The video title could not be extracted. Invalid link?");
            } else {
                if args.verbose {
                    println!("Title: {}", vt);
                }
︙
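The WebDriver port lookup in the hunk above follows a fixed precedence: the command-line value wins, then the `YAYDL_WEBDRIVER_PORT` environment variable, then 0. A compact way to express the same precedence might look like the sketch below. The helper `resolve_webdriver_port` is hypothetical and not part of yaydl. It takes the environment value as a parameter so the logic can be checked without touching the real environment.

```rust
use std::env;

/// Hypothetical helper (not part of yaydl): the command-line value wins,
/// then a parseable environment value, then the default of 0.
fn resolve_webdriver_port(cli: Option<u16>, env_value: Option<String>) -> u16 {
    cli.or_else(|| env_value.and_then(|v| v.parse::<u16>().ok()))
        .unwrap_or(0)
}

fn main() {
    // Same variable name as in the diff; it may well be unset here.
    let from_env = env::var("YAYDL_WEBDRIVER_PORT").ok();
    let port = resolve_webdriver_port(None, from_env);
    println!("WebDriver port: {}", port);
}
```

As in the original code, a value that does not parse as a `u16` falls back to 0, which the caller then treats as "no web driver supplied".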
                    targetfile = in_targetfile.to_string();
                }

                if args.verbose {
                    println!("Starting the download.");
                }

                let mut force_ffmpeg = false;

                if handler.is_playlist(in_url, webdriverport).unwrap_or(false) {
                    // Multi-part download.
                    download::download_from_playlist(&url, &targetfile, args.verbose)?;
                    force_ffmpeg = true;
                } else {
                    // Single-file download.
                    download::download(&url, &targetfile)?;
                }

                // Convert the file if needed.
                let outputext = args.audioformat;
                if args.onlyaudio && ext != outputext || force_ffmpeg {
                    if args.verbose {
                        println!("Post-processing.");
                    }

                    let inpath = Path::new(&targetfile);
                    let mut outpathbuf = PathBuf::from(&targetfile);

                    if args.onlyaudio {
                        // Convert to audio-only:
                        outpathbuf.set_extension(outputext);
                        let outpath = &outpathbuf.as_path();

                        ffmpeg::to_audio(inpath, outpath);
                    } else {
                        // Convert from .ts to .mp4:
                        outpathbuf.set_extension("mp4");
                        let outpath = &outpathbuf.as_path();

                        ffmpeg::ts_to_mp4(inpath, outpath);
                    }

                    // Get rid of the evidence.
                    if !args.keeptempfile {
                        fs::remove_file(&targetfile)?;
                    }

                    // Success!
︙
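The post-processing condition in the hunk above mixes `&&` and `||` without parentheses. In Rust, `&&` binds tighter than `||`, so it reads as `(onlyaudio && ext != outputext) || force_ffmpeg`. The sketch below spells out which target extension results from each combination. The function `conversion_target` is a hypothetical helper for illustration only and does not call ffmpeg.

```rust
/// Hypothetical helper (not part of yaydl): returns the extension the
/// post-processing step would convert to, or None if no conversion runs.
fn conversion_target(
    only_audio: bool,
    ext: &str,
    audio_format: &str,
    from_playlist: bool,
) -> Option<String> {
    // Same grouping as in the diff: (a && b) || c.
    let needs_conversion = (only_audio && ext != audio_format) || from_playlist;
    if !needs_conversion {
        return None;
    }
    if only_audio {
        // Audio-only output uses the requested audio format.
        Some(audio_format.to_string())
    } else {
        // Playlist downloads arrive as .ts and are remuxed to .mp4.
        Some("mp4".to_string())
    }
}

fn main() {
    // A playlist download without --only-audio ends up as .mp4.
    println!("{:?}", conversion_target(false, "ts", "mp3", true));
}
```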