chore(deps): bump com.github.crawler-commons:crawler-commons from 1.4 to 1.6 #15

Open
dependabot[bot] wants to merge 1 commit into master from dependabot/gradle/com.github.crawler-commons-crawler-commons-1.6

dependabot[bot] commented on behalf of github on Apr 19, 2026

Bumps com.github.crawler-commons:crawler-commons from 1.4 to 1.6.

Release notes

Sourced from com.github.crawler-commons:crawler-commons's releases.

crawler-commons-1.6

Important Changes

  • This release adds support for IDNA2008 domain names and public suffixes in EffectiveTldFinder. If you rely on a recent version of the public suffix list, please upgrade to release 1.6! See [issue report #551](crawler-commons/crawler-commons#551) for more information.
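
For reviewers unfamiliar with the flag named in the changelog: the JDK's `java.net.IDN` converter is based on IDNA2003 and by default rejects code points that were unassigned in Unicode 3.2. Passing `IDN.ALLOW_UNASSIGNED` relaxes that check. The sketch below only illustrates the JDK call; the class and method names are hypothetical, and it is not the crawler-commons implementation.

```java
import java.net.IDN;

// Hypothetical illustration of the JDK flag named in the changelog;
// this is not code taken from crawler-commons.
class IdnSketch {

    // Converts a Unicode host name to its ASCII-compatible (Punycode) form.
    // IDN.ALLOW_UNASSIGNED lets the converter accept code points that were
    // unassigned in Unicode 3.2 instead of throwing IllegalArgumentException.
    static String toAscii(String host) {
        return IDN.toASCII(host, IDN.ALLOW_UNASSIGNED);
    }

    public static void main(String[] args) {
        // The escape below spells the German word for "books" with a u-umlaut.
        System.out.println(toAscii("b\u00fccher.de")); // prints xn--bcher-kva.de
    }
}
```

Whether this affects the project depends on whether EffectiveTldFinder or the URL normalizer is called with internationalized host names.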

Full List of Changes

  • Support IDNA2008 Unicode domains by using ALLOW_UNASSIGNED in IDN methods within TldFinder / Normalizer. (Richard Zowalla, sebastian-nagel) #551, #552
  • Add URLUtils class for URL resolution functionality (HamzaElzarw-2022, Richard Zowalla, sebastian-nagel) #526
  • Replace deprecated URL constructor in BasicURLNormalizer (HamzaElzarw-2022, Richard Zowalla, sebastian-nagel) #531
  • Add matchedWildcard flag to BaseRobotRules (CoGiang, sebastian-nagel) #530
  • [Domains] Update unit tests after change in public suffix list (sebastian-nagel, Richard Zowalla) #544
  • [Domains] Replace deleted *.uberspace.de with a wildcard from Google (Richard Zowalla) #532
  • Partial replacement of deprecated URL constructors (HamzaElzarw-2022, kkrugler, sebastian-nagel) #522, #524, #545, #536
  • Upgrade dependencies (dependabot) #521, #533, #534, #547, #549
  • Upgrade Maven plugins (dependabot) #520, #538, #539, #540, #546, #550, #553, #554, #555
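
Several items above replace the `java.net.URL` constructors, which the JDK has deprecated since Java 20. A common JDK-only substitute is to go through `java.net.URI`, roughly as sketched below. The class and method names are hypothetical, and this is not the API of the new `URLUtils` class, whose signatures are not shown in these notes.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper showing the JDK-only replacement for `new URL(base, relative)`.
// It is not the crawler-commons URLUtils class.
class UrlResolveSketch {

    // Resolves a relative reference against an absolute base URL via java.net.URI.
    // Invalid input surfaces as IllegalArgumentException (unchecked).
    static URL resolve(String base, String relative) {
        try {
            return URI.create(base).resolve(relative).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Unsupported URL: " + base, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("https://example.com/a/b", "../c")); // prints https://example.com/c
    }
}
```

Note that `URI` parses more strictly than the old `URL` constructors (for example, it rejects unescaped spaces), so callers that feed it raw crawled links may see exceptions where none occurred before.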

crawler-commons-1.5

Important Changes

  • The robots.txt parser is now pedantic regarding the user-agent names passed to the parseContent() method. The names in the robotNames parameter must be lower-case and the wildcard agent name "*" must not be included. An exception is thrown if these conditions are not met. Please see the Javadoc and #453.
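
Because this bump crosses 1.5, callers of `parseContent()` may need to clean up their agent names first. A minimal sketch of such a pre-processing step follows, assuming the names come from configuration in arbitrary case; the class and method names are hypothetical and not part of crawler-commons.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical pre-processing helper; not part of crawler-commons.
class AgentNameSketch {

    // Lower-cases agent names and drops the wildcard "*", matching the two
    // conditions stated in the 1.5 release note. Duplicates and blanks are removed.
    static Set<String> normalize(Collection<String> names) {
        Set<String> result = new LinkedHashSet<>();
        for (String name : names) {
            String cleaned = name.trim().toLowerCase(Locale.ROOT);
            if (!cleaned.isEmpty() && !cleaned.equals("*")) {
                result.add(cleaned);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The result would then be passed as the robotNames argument of parseContent().
        System.out.println(normalize(java.util.Arrays.asList("MyBot", "*", "mybot"))); // prints [mybot]
    }
}
```

If the input contains only `"*"`, the result is an empty set; per the 1.4 changelog entry further down, an empty collection selects the rules for any robot.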

Full List of Changes

  • Migrate publishing from OSSRH to Central Portal (jnioche, sebastian-nagel, Richard Zowalla, aecio) #510, #516
  • [Sitemaps] Add cross-submit feature (Avi Hayun, kkrugler, sebastian-nagel, Richard Zowalla) #85, #515
  • [Sitemaps] Complete sitemap extension attributes (sebastian-nagel, Richard Zowalla) #513, #514
  • [Sitemaps] Allow partial extension metadata (adriabonetmrf, sebastian-nagel, Richard Zowalla) #456, #458, #512
  • [Domains] EffectiveTldFinder to also take shorter suffix matches into account (sebastian-nagel, Richard Zowalla) #479, #505
  • Add package-info.java to all packages (sebastian-nagel, Richard Zowalla) #432, #504
  • [Robots.txt] Extend API to allow to check java.net.URL objects (sebastian-nagel, aecio, Richard Zowalla) #502
  • [Robots.txt] Incorrect robots.txt result for uppercase user agents (teammakdi, sebastian-nagel, aecio, Richard Zowalla) #453, #500
  • Remove class utils.Strings (sebastian-nagel, Richard Zowalla) #503
  • [BasicNormalizer] Complete normalization feature list of BasicURLNormalizer (sebastian-nagel, kkrugler) #494
  • [Robots] Document that URLs not properly normalized may not be matched by robots.txt parser (sebastian-nagel, kkrugler) #492, #493
  • [Sitemaps] Added https variants of namespaces (jnioche) #487
  • [Domains] Add version of public suffix list shipped with release packages enhancement (sebastian-nagel, Richard Zowalla) #433, #484
  • [Domains] Improve representation of public suffix match results by class EffectiveTLD (sebastian-nagel, Richard Zowalla) #478
  • Javadoc: fix links to Java core classes (sebastian-nagel, Richard Zowalla) #417, #483
  • [Sitemaps] Improve logging done by SiteMapParser (Valery Yatsynovich, sebastian-nagel) #457
  • [Sitemaps] Google Sitemap PageMap extensions (josepowera, sebastian-nagel, Richard Zowalla, jnioche) #388, #442
  • [Domains] Installation of a gzip-compressed public suffix list from Maven cache breaks EffectiveTldFinder (sebastian-nagel, Richard Zowalla) #441, #443
  • Upgrade dependencies (dependabot) #437, #444, #448, #451, #473, #465, #466, #468, #488, #491, #506, #511, #517
  • Upgrade Maven plugins (dependabot) #434, #438, #439, #449, #445, #452, #455, #459, #460, #464, #469, #467, #470, #471, #472, #474, #475, #476, #477, #480, #481, #482, #489, #490, #495, #496, #497, #498, #499, #508, #509, #518
  • Upgrade GitHub workflow actions v2 -> v4 (sebastian-nagel, Richard Zowalla) #501

Changelog

Sourced from com.github.crawler-commons:crawler-commons's changelog.

Crawler-Commons Change Log

Current Development 1.7-SNAPSHOT (yyyy-mm-dd)

Release 1.6 (2025-12-04)

  • Support IDNA2008 Unicode domains by using ALLOW_UNASSIGNED in IDN methods within TldFinder / Normalizer. (Richard Zowalla, sebastian-nagel) #551, #552
  • Add URLUtils class for URL resolution functionality (HamzaElzarw-2022, Richard Zowalla, sebastian-nagel) #526
  • Replace deprecated URL constructor in BasicURLNormalizer (HamzaElzarw-2022, Richard Zowalla, sebastian-nagel) #531
  • Add matchedWildcard flag to BaseRobotRules (CoGiang, sebastian-nagel) #530
  • [Domains] Update unit tests after change in public suffix list (sebastian-nagel, Richard Zowalla) #544
  • [Domains] Replace deleted *.uberspace.de with a wildcard from Google (Richard Zowalla) #532
  • Partial replacement of deprecated URL constructors (HamzaElzarw-2022, kkrugler, sebastian-nagel) #522, #524, #545, #536
  • Upgrade dependencies (dependabot) #521, #533, #534, #547, #549
  • Upgrade Maven plugins (dependabot) #520, #538, #539, #540, #546, #550, #553, #554, #555

Release 1.5 (2025-06-27)

  • Migrate publishing from OSSRH to Central Portal (jnioche, sebastian-nagel, Richard Zowalla, aecio) #510, #516
  • [Sitemaps] Add cross-submit feature (Avi Hayun, kkrugler, sebastian-nagel, Richard Zowalla) #85, #515
  • [Sitemaps] Complete sitemap extension attributes (sebastian-nagel, Richard Zowalla) #513, #514
  • [Sitemaps] Allow partial extension metadata (adriabonetmrf, sebastian-nagel, Richard Zowalla) #456, #458, #512
  • [Domains] EffectiveTldFinder to also take shorter suffix matches into account (sebastian-nagel, Richard Zowalla) #479, #505
  • Add package-info.java to all packages (sebastian-nagel, Richard Zowalla) #432, #504
  • [Robots.txt] Extend API to allow to check java.net.URL objects (sebastian-nagel, aecio, Richard Zowalla) #502
  • [Robots.txt] Incorrect robots.txt result for uppercase user agents (teammakdi, sebastian-nagel, aecio, Richard Zowalla) #453, #500
  • Remove class utils.Strings (sebastian-nagel, Richard Zowalla) #503
  • [BasicNormalizer] Complete normalization feature list of BasicURLNormalizer (sebastian-nagel, kkrugler) #494
  • [Robots] Document that URLs not properly normalized may not be matched by robots.txt parser (sebastian-nagel, kkrugler) #492, #493
  • [Sitemaps] Added https variants of namespaces (jnioche) #487
  • [Domains] Add version of public suffix list shipped with release packages enhancement (sebastian-nagel, Richard Zowalla) #433, #484
  • [Domains] Improve representation of public suffix match results by class EffectiveTLD (sebastian-nagel, Richard Zowalla) #478
  • Javadoc: fix links to Java core classes (sebastian-nagel, Richard Zowalla) #417, #483
  • [Sitemaps] Improve logging done by SiteMapParser (Valery Yatsynovich, sebastian-nagel) #457
  • [Sitemaps] Google Sitemap PageMap extensions (josepowera, sebastian-nagel, Richard Zowalla, jnioche) #388, #442
  • [Domains] Installation of a gzip-compressed public suffix list from Maven cache breaks EffectiveTldFinder (sebastian-nagel, Richard Zowalla) #441, #443
  • Upgrade dependencies (dependabot) #437, #444, #448, #451, #473, #465, #466, #468, #488, #491, #506, #511, #517
  • Upgrade Maven plugins (dependabot) #434, #438, #439, #449, #445, #452, #455, #459, #460, #464, #469, #467, #470, #471, #472, #474, #475, #476, #477, #480, #481, #482, #489, #490, #495, #496, #497, #498, #499, #508, #509, #518
  • Upgrade GitHub workflow actions v2 -> v4 (sebastian-nagel, Richard Zowalla) #501

Release 1.4 (2023-07-13)

  • [Robots.txt] Implement Robots Exclusion Protocol (REP) IETF Draft: port unit tests (sebastian-nagel, Richard Zowalla) #245, #360
  • [Robots.txt] Close groups of rules as defined in RFC 9309 (kkrugler, garyillyes, jnioche, sebastian-nagel) #114, #390, #430
  • [Robots.txt] Empty disallow statement not to clear other rules (sebastian-nagel, jnioche) #422, #424
  • [Robots.txt] SimpleRobotRulesParser main() to follow five redirects (sebastian-nagel, jnioche) #428
  • [Robots.txt] Add more spelling variants and typos of robots.txt directives (sebastian-nagel, jnioche) #425
  • [Robots.txt] Document effect of rules merging in combination with multiple agent names (sebastian-nagel, Richard Zowalla) #423, #426
  • [Robots.txt] Pass empty collection of agent names to select rules for any robot (wildcard user-agent name) (sebastian-nagel, Richard Zowalla) #427
  • [Robots.txt] Rename default user-agent / robot name in unit tests (sebastian-nagel, Richard Zowalla) #429
  • [Robots.txt] Add unit tests based on examples in RFC 9309 (sebastian-nagel, Richard Zowalla) #420
  • [BasicNormalizer] Query parameters normalization in BasicURLNormalizer (aecio, sebastian-nagel, Richard Zowalla) #308, #421
  • [Robots.txt] Deduplicate robots rules before matching (sebastian-nagel, jnioche) #416

... (truncated)

Commits
  • ce0fcb3 [maven-release-plugin] prepare release crawler-commons-1.6
  • 3d24b45 Update CHANGES.txt for release of 1.6
  • 9c82251 Bump org.apache.maven.plugins:maven-source-plugin from 3.3.1 to 3.4.0
  • 59cb351 Bump de.thetaphi:forbiddenapis from 3.9 to 3.10
  • 4730575 Bump org.apache.maven.plugins:maven-jar-plugin from 3.4.2 to 3.5.0
  • 5164d06 Update changelog
  • e58ecbc #551 – Support IDNA2008 Unicode domains by using ALLOW_UNASSIGNED in IDN meth...
  • 23aaeb2 Update Changelog
  • 8eb9c8f Unit tests for URLUtils.resolve(...)
  • 94658e3 Update Changelog
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [com.github.crawler-commons:crawler-commons](https://github.com/crawler-commons/crawler-commons) from 1.4 to 1.6.
- [Release notes](https://github.com/crawler-commons/crawler-commons/releases)
- [Changelog](https://github.com/crawler-commons/crawler-commons/blob/master/CHANGES.txt)
- [Commits](crawler-commons/crawler-commons@crawler-commons-1.4...crawler-commons-1.6)

---
updated-dependencies:
- dependency-name: com.github.crawler-commons:crawler-commons
  dependency-version: '1.6'
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot[bot] commented on behalf of github on Apr 19, 2026

Labels

The following labels could not be found: dependencies. Please create it before Dependabot can add it to a pull request.

Please fix the above issues or remove invalid values from dependabot.yml.
