csv:
Improved error messages
Exits early if it hasnt found any items in the first few lines
zip:
Now checks all CSV files instead of hard-coded paths
final qualifiers for immutable locals and parameters
Co-authored-by: litetex <40789489+litetex@users.noreply.github.com>
* Faster iframe api based player extraction.
Uses the IFrame API to reduce the required download to less than 1/50 of the size.
* Remove debug code.
* Extract to two methods.
* Add tests for player URL extraction.
* Add assertThat for tests.
Without removing RunWith and SuiteClasses annotations (and the corresponding imports) in YoutubePlaylistExtractorTest and YoutubeMixPlaylistExtractorTest, some mocks cannot be generated, so the CI fails because of the missing mocks. Mocks of workings tests have been also updated.
Migrate YouTube comments to the desktop version by using the `next` endpoint of the InnerTube internal API.
With the desktop version, we are able to get the exact like count of YouTube comments (by parsing the accessibility data) (the current extraction is used as a fallback). We are also now able to get if the uploader of the comment is verified or not.
Co-authored-by: TiA4f8R <74829229+TiA4f8R@users.noreply.github.com>
Here is now the requests which will be made by the `onFetchPage` method of `YoutubeStreamExtractor`:
- the desktop API is fetched.
If there is no streaming data, the desktop player API with the embed client screen will be fetched (and also the player code), then the Android mobile API.
- if there is no streaming data, a `ContentNotAvailableException` will be thrown by using the message provided in playability status
If the video is age restricted, a request to the next endpoint of the desktop player with the embed client screen will be sent.
Otherwise, the next endpoint will be fetched normally, if the content is available.
If the video is not age-restricted, a request to the player endpoint of the Android mobile API will be made.
We can get more streams by using the Android mobile API but some streams may be not available on this API, so the streaming data of the Android mobile API will be first used to get itags and then the streaming data of the desktop internal API will be used.
If the parsing of the Android mobile API went wrong, only the streams of the desktop API will be used.
Other code changes:
- `prepareJsonBuilder` in `YoutubeParsingHelper` was renamed to `prepareDesktopJsonBuilder`
- `prepareMobileJsonBuilder` in `YoutubeParsingHelper` was renamed to `prepareAndroidMobileJsonBuilder`
- two new methods in `YoutubeParsingHelper` were added: `prepareDesktopEmbedVideoJsonBuilder` and `prepareAndroidMobileEmbedVideoJsonBuilder`
- `createPlayerBodyWithSts` is now public and was moved to `YoutubeParsingHelper`
- a new method in `YoutubeJavaScriptExtractor` was added: `resetJavaScriptCode`, which was needed for the method `resetDebofuscationCode` of `YoutubeStreamExtractor`
- `areHardcodedClientVersionAndKeyValid` in `YoutubeParsingHelper` returns now a `boolean` instead of an `Optional<Boolean>`
- the `fetchVideoInfoPage` method of `YoutubeStreamExtractor` was removed because YouTube returns now 404 for every client with the `get_video_info` page
- some unused objects and some warnings in `YoutubeStreamExtractor` were removed and fixed
Co-authored-by: TiA4f8R <74829229+TiA4f8R@users.noreply.github.com>
This method is needed for YouTube stream tests, because when all YouTube tests are ran, the signatureTimestamp is known (the sts string) so a different body than the body present in the mocks is send by the extractor instance.
As a result, running all YouTube stream tests with the MockDownloader (like the CI does) will fail if this method is not called before fetching the page of a test.
The strings playerJsUrl, sts and playerCode are now static in order to don't fetch again the JavaScript player at each time the signatureTimestamp is needed.
Catch every exception instead of only IOException and ExtractionException.
Add JavaDoc for fetchAndroidMobileJsonPlayer method of YoutubeStreamExtractor
Try to don't fetch again the first page of a YouTube channel when requesting a continuation of it by trying to store the channel name and the channel id into the next page using the ids field of the Page class.
Use the youtubei API for YouTube mixes. The corresponding has been updated because the new API breaks the tests of YoutubeMixPlaylistExtractorTest.
Remove some deprecated code (the old search code with the pbj JSON) and do some other improvements.
Use the Android mobile API to get the itag 22 (720p with audio), removed when the content is protected by signatureCiphers.
Also use this API when they are OTF streams, to get the itag 17 and 36, low 3GPP quality streams but also the itag 139.
Update the web client version.
Use again www.youtube.com and music.youtube.com domains instead of youtubei.googleapis.com domain because it spoofs more a web client of YouTube or YouTube Music and may reduce Google's detection of NewPipe Extractor users.
This commit updates mocks and reenables the test invalidId of the NotAvailable class of the YoutubePlaylistExtractorTest class beacuse with the youtubei/innertube API, it returns "Not found" and doesn't redirect to the YouTube homepage.
The expectedMetaInfo test of the MetaInfoTest class of the YoutubeSearchExtractorTest class was broken because YouTube removes the vaccine progress link of the WHO from the meta info so this commit removes it and the test is now passing.
Add the origin and the referer headers with the https://www.youtube.com value for YouTube JSON POST requests.
Don't add the consent cookie header for the requests which use the youtubei/innertube API because it's uneeded.
Fix some tests and update YouTube mocks
Get the real name of the uploader (for autogenerated channels and music artist channels), like before the migration to the JSON pbj.
Do some other improvements, especially reformatting some code to be in the 100 characters line limit and use final where possible.
Change the domain from music.youtube.com to youtubei.googleapis.com.
Use a lightweight request to check if the hardcoded YouTubeMusic keys are valid. Increase the length of the response to 500 because if the key is invalid, the length of the response returned is higher than 250 and the response when the key is valid is higher than 1500.
Format the YoutubeMusicSearchExtractor file.
Update YouTube web client version and mocks
Add getSearchParameter, a new method in YoutubeSearchQueryHandlerFactory class which returns the params field for a search, or an empty string if there is no one.
Update mocks of YoutubeSearchExtractorTest.
Without this commit, the n param is only decrypted for streams extracted in getVideoStreams (so only for streams in the formats object of the player response).
To fix ``java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.String[] java.lang.String.split(java.lang.String)' on a null object reference``
On Windows, mocks are recorded and read with the Cp1252 encoding so it breaks the mocks on non ASCII characters for Linux OS (and so the CI).
The project is in Java 8, so we can't use FileReader(File, Charset) and FileReader(File, Charset) because these methods require Java 11. Instead of changing the Java version of the extractor, use OutputStreamWriter and FileOutputStream instead of FileWriter and InputStreamReader and FileInputStream instead of FileReader.
Request the api-v2 host with the client_id instead of checking if the streams of a SoundCloud track are not empty: if it is valid, the API returns 404, otherwise it should return 401.
Use final where possible in the package and format code to be in the 100 caracters per line limit.
Fix some warnings generated by Android Studio and do some code improvements
resolveIdWithEmbedPlayer() does not work anymore because the JSON data has been extracted to an API call. For this reason, replace resolveIdWithEmbedPlayer() with resolveIdWithWidgetApi)( which performs the API call.