Commit Graph

2561 Commits

Author SHA1 Message Date
Tobi
5ab1f784e8
Merge pull request #1117 from TeamNewPipe/dependabot/gradle/org.jsoup-jsoup-1.16.2
Bump org.jsoup:jsoup from 1.16.1 to 1.16.2
2023-10-21 19:26:36 +02:00
dependabot[bot]
9d7bcba050
Bump org.jsoup:jsoup from 1.16.1 to 1.16.2
Bumps [org.jsoup:jsoup](https://github.com/jhy/jsoup) from 1.16.1 to 1.16.2.
- [Release notes](https://github.com/jhy/jsoup/releases)
- [Changelog](https://github.com/jhy/jsoup/blob/master/CHANGES)
- [Commits](https://github.com/jhy/jsoup/compare/jsoup-1.16.1...jsoup-1.16.2)

---
updated-dependencies:
- dependency-name: org.jsoup:jsoup
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-20 09:13:21 +00:00
dependabot[bot]
e26065148a Bump com.github.spotbugs:spotbugs-annotations from 4.7.3 to 4.8.0
Bumps [com.github.spotbugs:spotbugs-annotations](https://github.com/spotbugs/spotbugs) from 4.7.3 to 4.8.0.
- [Release notes](https://github.com/spotbugs/spotbugs/releases)
- [Changelog](https://github.com/spotbugs/spotbugs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/spotbugs/spotbugs/compare/4.7.3...4.8.0)

---
updated-dependencies:
- dependency-name: com.github.spotbugs:spotbugs-annotations
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-12 14:19:10 +02:00
FineFindus
34b05a0dda
feat(youtube/comments): support creator replies 2023-10-09 16:33:43 +02:00
TobiGr
0821f09114
Add missing mocks 2023-10-09 16:33:43 +02:00
FineFindus
c1784a4bdb
[YouTube] Add channel owner to comments 2023-10-09 16:33:43 +02:00
TobiGr
f9846352ea Fix wrong @Nullable annotation 2023-10-09 16:02:57 +02:00
Tobi
d6f5cba6e2
Merge pull request #1111 from FineFindus/feat/creator-reply
Add `hasCreatorReply()` to CommentsInfoItem
2023-10-09 12:45:56 +02:00
TobiGr
9d63c75623 Add missing mocks 2023-10-09 11:24:39 +02:00
TobiGr
d49f8411d7 [PeerTube] Implement CommentsInfoItemExtractor.hasCreatorReply() 2023-10-09 02:47:12 +02:00
Stypox
bb132167d5
Merge pull request #1113 from AudricV/snd_fix-non-jpg-images
[SoundCloud] Fix extraction of non-JPG images
2023-10-02 19:40:57 +02:00
AudricV
c98695fcea
[SoundCloud] Fix extraction of non-JPG images
Default image qualities were only removed from image URLs with the jpg
extension, so the image suffix was appended to full non-JPG image URLs,
producing invalid image URLs.

Now, only the image quality name, together with its leading "-" character and
the "." character after it, is removed and replaced by a format specifier,
which is itself replaced with the image quality name for each quality.

As the image suffixes do not contain the image extension, the names of the image
quality lists have been adapted to these changes, and some related comments
have also been improved.
2023-10-01 20:33:25 +02:00
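The replacement strategy described in commit c98695fcea above can be sketched roughly as follows. This is a minimal illustration, not the extractor's actual code: the class name, the method names, the quality names and the example URL are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of NewPipe Extractor.
final class ImageQualityTemplateSketch {
    private ImageQualityTemplateSketch() {
    }

    // Turns "-<defaultQuality>." into "-%s." and leaves the file extension untouched.
    static String toTemplate(final String imageUrl, final String defaultQuality) {
        // Escape literal percent signs first so that String.format keeps them as-is
        return imageUrl.replace("%", "%%")
                .replace("-" + defaultQuality + ".", "-%s.");
    }

    // Builds one URL per quality name from the template.
    static List<String> buildUrls(final String imageUrl,
                                  final String defaultQuality,
                                  final List<String> qualities) {
        final String template = toTemplate(imageUrl, defaultQuality);
        final List<String> urls = new ArrayList<>();
        for (final String quality : qualities) {
            urls.add(String.format(template, quality));
        }
        return urls;
    }
}
```

Because only the quality name between "-" and "." is swapped, a PNG URL stays a PNG URL and a JPG URL stays a JPG URL.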
AudricV
ac00459c1a
Change requirement of image extensions in ImageSuffix class' Javadoc to a possibility
Some services may provide different image formats using the same suffix,
without us knowing which format the service provides. Enforcing an image
extension could therefore lead to invalid image URLs, as is currently the case
for SoundCloud PNG images.

With this documentation change, it is now clear that users of this class decide
whether they want to include image extensions in the suffix. The previous
behavior described in the Javadoc was not enforced.
2023-09-30 21:11:09 +02:00
FineFindus
dd7b2d9798
feat(youtube/comments): support creator replies 2023-09-25 10:40:45 +02:00
Youssif Shaaban Alsager
917554acc4
[YouTube] Add support for ultralow audio formats (#1063) 2023-09-24 19:04:34 +02:00
Tobi
8b0068f8f4
Merge pull request #1110 from christian-2hu/chore/update-copyright
chore: Update copyright notices
2023-09-23 00:29:19 +02:00
Christian
fc67d49f59 Update copyright notices
Update copyright notices to comply with GPLv3 and change NewPipe to NewPipe Extractor on some notices that were not updated.
2023-09-22 19:10:15 -03:00
Stypox
289db1178a
Merge pull request #1108 from AudricV/yt_refactor-js-usage
[YouTube] Refactor JavaScript usage and fix extraction of obfuscated signature deobfuscation function
2023-09-22 10:41:57 +02:00
AudricV
6ed22099a2
[YouTube] Update stream mocks 2023-09-21 21:59:34 +02:00
AudricV
714b141ecb
[YouTube] Catch any exception when extracting something from JavaScript's base player 2023-09-21 21:59:33 +02:00
AudricV
588c6a8422
[YouTube] Quote signature deobfuscation function name and add semicolon only where needed 2023-09-21 21:59:33 +02:00
AudricV
1fa85ec6ca
[YouTube] Add tests for signature timestamp extraction and signature deobfuscation function extraction and execution 2023-09-21 21:59:33 +02:00
AudricV
a04bc320de
[YouTube] Convert signature timestamp to integer
The signature timestamp is used as a number by HTML5 clients, so it should be
used in the same way by the extractor too instead of being a string.

As the timestamp doesn't seem to exceed 5 digits, an integer is used to store
its value.
2023-09-21 21:59:32 +02:00
AudricV
7de3753a81
[YouTube] Refactor JavaScript player management API
This commit introduces breaking changes.

For clients, everything is managed in a new class called
YoutubeJavaScriptPlayerManager:
- caching JavaScript base player code and its extracted code (functions and
variables);
- getting player signature timestamp;
- getting deobfuscated signatures of streaming URLs;
- getting streaming URLs with a throttling parameter deobfuscated, if
applicable.

The class delegates the extraction parts to external package-private classes:
- YoutubeJavaScriptExtractor, to extract and download YouTube's JavaScript base
player code: it was already present before and has been edited, mainly to
remove the previous caching system and to make it package-private;
- YoutubeSignatureUtils, for player signature timestamp and signature
deobfuscation function of streaming URLs, added in a recent commit;
- YoutubeThrottlingParameterUtils, which was originally
YoutubeThrottlingDecrypter, for the deobfuscation function of the throttling
parameter of streaming URLs and for checking whether this parameter is present
in a streaming URL.

YoutubeJavaScriptPlayerManager caches and then runs the extracted code if it
has been executed successfully. The cache of deobfuscated throttling parameter
values has been kept: its size can be obtained using the
getThrottlingParametersCacheSize method and it can be cleared independently
using the clearThrottlingParametersCache method.

If an exception occurs during the extraction or the parsing of a function or
property which is not related to JavaScript base player code fetching, it is
stored until caches are cleared, making subsequent failing extraction calls of
the requested function or property faster and consuming fewer resources, as the
result should be the same until the base player code changes.

All caches can be reset using the clearAllCaches method of
YoutubeJavaScriptPlayerManager.

Classes using JavaScript base player code and utilities directly (in the code
and its tests) have also been updated in this commit.
2023-09-21 21:59:32 +02:00
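The caching behaviour described in commit 7de3753a81 above, where both a successful result and a failure are remembered until the caches are cleared, could look roughly like this. It is a generic sketch under stated assumptions, not the YoutubeJavaScriptPlayerManager implementation: the class name, the Supplier-based design and the use of unchecked exceptions are all hypothetical.

```java
import java.util.function.Supplier;

// Hypothetical sketch; not the code of YoutubeJavaScriptPlayerManager.
final class CachedExtractionSketch<T> {
    private final Supplier<T> extractor;
    private T cachedValue;
    private RuntimeException cachedFailure;

    CachedExtractionSketch(final Supplier<T> extractor) {
        this.extractor = extractor;
    }

    // Runs the extraction at most once until clear() is called.
    // Note: a null result is not cached and triggers a new extraction.
    synchronized T get() {
        if (cachedFailure != null) {
            // Fail fast: the result should not change until caches are cleared
            throw cachedFailure;
        }
        if (cachedValue == null) {
            try {
                cachedValue = extractor.get();
            } catch (final RuntimeException e) {
                cachedFailure = e;
                throw e;
            }
        }
        return cachedValue;
    }

    // Resets both the cached value and the cached failure.
    synchronized void clear() {
        cachedValue = null;
        cachedFailure = null;
    }
}
```

Storing the exception trades a little memory for avoiding repeated parsing work that is expected to fail in the same way.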
AudricV
6884d191cd
[YouTube] Add utility class around signatures and fix signature deobfuscation function extraction
The goal of this class is to decouple the extraction of signature timestamp and
signature deobfuscation function from YoutubeStreamExtractor.

The extraction of the signature deobfuscation function has also been adapted to
support the latest YouTube player versions.

This new class, YoutubeSignatureUtils, doesn't store anything temporary such as
a copy of the player code, which has to be passed where required. It is not
public, as it will be used by a JavaScript player manager class in the future,
in order to better handle fetching, caching and cache resetting of the player
code.
2023-09-21 21:59:26 +02:00
Tobi
3be76a6406
Merge pull request #1107 from Isira-Seneviratne/Locale_forLanguageTag
Use Locale.forLanguageTag() in tests
2023-09-18 16:48:30 +02:00
TobiGr
17790328cd Improve doc 2023-09-18 16:44:51 +02:00
Isira Seneviratne
4bc8ae7812 Use Locale.forLanguageTag() in tests 2023-09-18 08:59:13 +05:30
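For readers unfamiliar with the API named in commit 4bc8ae7812 above, here is a minimal usage sketch of the standard `java.util.Locale.forLanguageTag(String)` method. The wrapper class and the example tag are assumptions for illustration and are not taken from the extractor's tests.

```java
import java.util.Locale;

// Hypothetical wrapper for illustration only.
final class LocaleTagSketch {
    private LocaleTagSketch() {
    }

    // Parses an IETF BCP 47 language tag such as "en-GB" into a Locale.
    static Locale parse(final String languageTag) {
        return Locale.forLanguageTag(languageTag);
    }
}
```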
Tobi
90aed06a63
Merge pull request #1105 from TeamNewPipe/fix/bandcamp-streame-extractor-test
[Bandcamp] Fix StreamExtractorTest
2023-09-18 01:49:04 +02:00
TobiGr
cf49f4a31c [Bandcamp] Fix StreamExtractorTest
The song was renamed and the URL changed
2023-09-17 23:58:07 +02:00
Tobi
7c7ceaceab
Merge pull request #1103 from TeamNewPipe/dependabot/github_actions/actions/checkout-4
Bump actions/checkout from 3 to 4
2023-09-17 22:42:01 +02:00
dependabot[bot]
72c475d944
Bump actions/checkout from 3 to 4
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-05 10:00:13 +00:00
Stypox
1f08d28ae5
Merge pull request #889 from AudricV/multiple-images-support
Multiple images support
2023-08-13 11:35:11 +02:00
AudricV
e8bfd20170
[MediaCCC] Apply changes in extractor tests
Also remove the public modifier from some test methods.
2023-08-12 22:56:33 +02:00
AudricV
0292c4f3e8
[Bandcamp] Apply changes in extractor tests
Also remove the public modifier from some test methods, add missing Test
annotations on old JUnit 4 tests (and update them if needed), and use final in
some places where it was possible.

BandcampChannelExtractorTest.testLength has been removed as the test is always
true.
2023-08-12 22:56:32 +02:00
AudricV
2578f22054
[Bandcamp] Add utility test method to test images
This method, testImages(Collection<Image>), will first run the default image
collection test in DefaultTests and then will check that each image URL
contains f4.bcbits.com/img and ends with .jpg or .png.

To do so, a new non-instantiable final class has been added: BandcampTestUtils.
2023-08-12 22:56:32 +02:00
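The URL check described in commit 2578f22054 above can be sketched as follows. This is a simplified stand-in rather than BandcampTestUtils itself: it works on plain URL strings instead of Image objects, and the class and method names are hypothetical.

```java
import java.util.Collection;

// Hypothetical sketch; the real utility operates on Collection<Image>.
final class BandcampImageUrlCheckSketch {
    private BandcampImageUrlCheckSketch() {
    }

    // True if the URL points to the Bandcamp image host and ends with .jpg or .png.
    static boolean looksLikeBandcampImageUrl(final String url) {
        return url != null
                && url.contains("f4.bcbits.com/img")
                && (url.endsWith(".jpg") || url.endsWith(".png"));
    }

    // Throws an AssertionError on the first URL that does not match.
    static void assertAllBandcampImageUrls(final Collection<String> urls) {
        for (final String url : urls) {
            if (!looksLikeBandcampImageUrl(url)) {
                throw new AssertionError("Unexpected Bandcamp image URL: " + url);
            }
        }
    }
}
```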
AudricV
ba5315c72d
[PeerTube] Apply changes in extractor tests
Also remove the public modifier from some test methods, add missing Test
annotations on old JUnit 4 tests (and update them if needed), and improve some
code.
2023-08-12 22:56:32 +02:00
AudricV
1d72bac53d
[SoundCloud] Apply changes in extractor tests 2023-08-12 22:56:32 +02:00
AudricV
93a210394d
[YouTube] Apply changes in extractor tests
Also remove the public modifier from some test methods, add missing Test
annotations on old JUnit 4 tests (and update them if needed), and use final in
some places where it was possible.
2023-08-12 22:56:31 +02:00
AudricV
2c436d428c
[YouTube] Add utility test method to test images in YoutubeTestsUtils
This method, testImages(Collection<Image>), will first run the default image
collection test in DefaultTests and then will check that each image URL
contains the string yt.

The Javadoc of the class has also been updated to reflect the changes made in
it (it is now more general).
2023-08-12 22:56:31 +02:00
AudricV
d381f3b70b
Update avatar, banners and thumbnail methods' name and apply changes in DefaultStreamExtractorTest 2023-08-12 22:56:31 +02:00
AudricV
434e885708
Add utility methods in ExtractorAsserts to check whether a collection is empty and to test image collections
Two new methods have been added in ExtractorAsserts to check if a collection is
empty:

- assertNotEmpty(String, Collection<?>), checking:
  - that the collection is not null;
  - that it is not empty (if that's not the case, an exception will be thrown
    using the provided message).

- assertNotEmpty(Collection<?>), calling assertNotEmpty(String, Collection<?>)
  with null as the value of the string argument.

A new one has been added to this assertion class to check the contrary:
assertEmpty(Collection<?>), checking emptiness of the collection only if it is
not null.

Three new methods have been added in ExtractorAsserts as utility test methods
for image collections:

- assertContainsImageUrlInImageCollection(String, Collection<Image>), checking
that:
  - the provided URL and image collection are not null;
  - the image collection contains at least one image whose URL property (a
    string) is equal to the provided string value.

- assertContainsOnlyEquivalentImages(Collection<Image>, Collection<Image>),
  checking that:
  - both collections are not null;
  - they have the same size;
  - each image of the first collection has its equivalent in the second one.
    This means that the properties of an image in the first collection must be
    equal to those of an image in the second one.

- assertNotOnlyContainsEquivalentImages(Collection<Image>, Collection<Image>),
  checking that:
  - both collections are not null;
  - one of the following conditions is met:
    - they have different sizes;
    - an image of the first collection has no equivalent in the second one.
      This means that the properties of an image in the first collection must
      not be equal to those of an image in the second one.

These methods will be used by services extractors tests (and default ones) to
test image collections.
2023-08-12 22:56:31 +02:00
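The emptiness assertions described in commit 434e885708 above could be sketched like this. It is a plain-Java approximation that throws AssertionError directly. The class name is hypothetical, and the real ExtractorAsserts methods may rely on a test framework and differ in detail.

```java
import java.util.Collection;

// Hypothetical sketch; not the ExtractorAsserts implementation.
final class CollectionAssertsSketch {
    private CollectionAssertsSketch() {
    }

    // Fails if the collection is null, or if it is empty (using the given message).
    static void assertNotEmpty(final String message, final Collection<?> collection) {
        if (collection == null) {
            throw new AssertionError("collection is null");
        }
        if (collection.isEmpty()) {
            throw new AssertionError(message);
        }
    }

    // Same check with null as the message, as described in the commit message.
    static void assertNotEmpty(final Collection<?> collection) {
        assertNotEmpty(null, collection);
    }

    // Checks emptiness only when the collection is not null.
    static void assertEmpty(final Collection<?> collection) {
        if (collection != null && !collection.isEmpty()) {
            throw new AssertionError("collection is not empty");
        }
    }
}
```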
AudricV
5158472852
Apply changes in DefaultTests and add utility method to test image lists
This new method, defaultTestImageList(List<Image>), will check that the image
list is not null.

For each image, it will test that its URL is secure and that its height and
width are greater than or equal to their relevant unknown constants in the
Image class (HEIGHT_UNKNOWN and WIDTH_UNKNOWN).
2023-08-12 22:56:31 +02:00
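The checks listed in commit 5158472852 above might be approximated as follows. This sketch uses a stand-in image type so that it is self-contained. The stand-in class, the value -1 for the unknown constants and the interpretation of "secure" as an https:// prefix are assumptions, not facts taken from the commit.

```java
import java.util.List;

// Hypothetical sketch; not the DefaultTests implementation.
final class ImageListCheckSketch {
    private ImageListCheckSketch() {
    }

    // Stand-in for the extractor's Image class (assumed shape and constant values).
    static final class SketchImage {
        static final int HEIGHT_UNKNOWN = -1; // assumed value
        static final int WIDTH_UNKNOWN = -1; // assumed value
        final String url;
        final int height;
        final int width;

        SketchImage(final String url, final int height, final int width) {
            this.url = url;
            this.height = height;
            this.width = width;
        }
    }

    // Fails on a null list, a non-HTTPS URL or a dimension below its unknown constant.
    static void checkImageList(final List<SketchImage> images) {
        if (images == null) {
            throw new AssertionError("image list is null");
        }
        for (final SketchImage image : images) {
            if (image.url == null || !image.url.startsWith("https://")) {
                throw new AssertionError("URL is not secure: " + image.url);
            }
            if (image.height < SketchImage.HEIGHT_UNKNOWN) {
                throw new AssertionError("invalid height: " + image.height);
            }
            if (image.width < SketchImage.WIDTH_UNKNOWN) {
                throw new AssertionError("invalid width: " + image.width);
            }
        }
    }
}
```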
AudricV
70fb3aa38e
Update BaseExtractorTests image methods' name
Also suppress unused warnings in BaseStreamExtractorTest, as is done in other
BaseExtractorTests interfaces.
2023-08-12 22:56:30 +02:00
AudricV
e16d521b7b
[MediaCCC] Apply changes in Extractors
Also remove usage of the conference logo as the banner of a conference, as it
is a logo and not a banner.
2023-08-12 22:56:30 +02:00
AudricV
306068a63b
[MediaCCC] Apply changes in InfoItemExtractors 2023-08-12 22:56:30 +02:00
AudricV
2f40861428
[MediaCCC] Add utility methods to get image lists from conference logos and streams
These three new methods, added in MediaCCCParsingHelper,
getImageListFromImageUrl(String), getThumbnailsFromStreamItem(JsonObject) and
getThumbnailsFromLiveStreamItem(JsonObject) (the last two are based on a common
method, getThumbnailsFromObject(JsonObject, String, String)), return an empty
list if no image URL could be extracted.

Images returned have their height and width unknown and a resolution level
depending on the image key of the JSON API response.
2023-08-12 22:56:30 +02:00
AudricV
71cda03c4c
[Bandcamp] Apply changes in Extractors 2023-08-12 22:56:29 +02:00
AudricV
7e01eaac33
[Bandcamp] Apply changes in InfoItemExtractors 2023-08-12 22:56:29 +02:00
AudricV
4b80d737a4
[Bandcamp] Add utility methods to get multiple images
Bandcamp images work with image IDs, which provide different resolutions.

Images on Bandcamp are not always squares, and some IDs preserve aspect ratios
whereas others do not.

The extractor will only use the ones which preserve aspect ratio and will not
provide original images, for performance and size purposes.

Because of this aspect ratio preservation constraint, only one dimension will
be known at a time.

The image IDs with their respective dimension used are:

- 10: 1200w;
- 101: 90h;
- 170: 422h;
- 171: 646h;
- 20: 1024w;
- 200: 420h;
- 201: 280h;
- 202: 140h;
- 204: 360h;
- 205: 240h;
- 206: 180h;
- 207: 120h;
- 43: 100h;
- 44: 200h.

(Where w represents the width of the image and h the height of the image)

Note that these dimensions are theoretical because if the image size is less
than the dimensions of the image ID, it will not be upscaled but kept at its
original size.

All these resolutions are stored in a private static list of ThumbnailSuffixes
in BandcampExtractorHelper, in which the methods to get multiple images have
been added:

- getImagesFromImageUrl(String): public method to get images from an image URL;
- getImagesFromImageId(long, boolean): public method to get images from an
  image ID;
- getImagesFromImageBaseUrl(String): private utility method to get images from
  the static list of ThumbnailSuffixes from a given image base URL, containing
  the path to the image, an "a" letter if it comes from an album, its ID and an
  underscore.

Some existing methods have also been edited:

- the documentation of getImageUrl(long, boolean) has been changed to reflect
  the Bandcamp images findings;
- getThumbnailUrlFromSearchResult has been renamed to
  getImagesFromSearchResult, and documentation has been added to this method.

The method replaceHttpWithHttps of the Utils class is now also used in
BandcampExtractorHelper instead of manually doing what the method does.
2023-08-12 22:56:29 +02:00
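The image ID table from commit 4b80d737a4 above lends itself to a small lookup structure. The sketch below builds one URL per image ID from a base URL that ends with an underscore, as the commit message describes. The class and method names, the ".jpg" extension appended to each URL and the example base URL are assumptions and are not taken from BandcampExtractorHelper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not the BandcampExtractorHelper implementation.
final class BandcampImageIdSketch {
    private BandcampImageIdSketch() {
    }

    // One image ID with its single known dimension (width or height).
    static final class Suffix {
        final int id;
        final int size;
        final boolean sizeIsWidth;

        Suffix(final int id, final int size, final boolean sizeIsWidth) {
            this.id = id;
            this.size = size;
            this.sizeIsWidth = sizeIsWidth;
        }
    }

    // IDs and dimensions as listed in the commit message (w = width, h = height).
    static final List<Suffix> SUFFIXES = Collections.unmodifiableList(Arrays.asList(
            new Suffix(10, 1200, true), new Suffix(101, 90, false),
            new Suffix(170, 422, false), new Suffix(171, 646, false),
            new Suffix(20, 1024, true), new Suffix(200, 420, false),
            new Suffix(201, 280, false), new Suffix(202, 140, false),
            new Suffix(204, 360, false), new Suffix(205, 240, false),
            new Suffix(206, 180, false), new Suffix(207, 120, false),
            new Suffix(43, 100, false), new Suffix(44, 200, false)));

    // Appends each image ID to the base URL; the ".jpg" extension is an assumption.
    static List<String> urlsFromBaseUrl(final String baseUrl) {
        final List<String> urls = new ArrayList<>();
        for (final Suffix suffix : SUFFIXES) {
            urls.add(baseUrl + suffix.id + ".jpg");
        }
        return urls;
    }
}
```

Keeping only one dimension per entry mirrors the aspect ratio constraint mentioned in the commit message: the other dimension depends on the source image.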