Releases: apache/opennlp

OpenNLP 3.0.0-M3

01 May 17:37

Apache OpenNLP 3.0.0-M3

This release focuses on security hardening, new NLP capabilities, and dependency maintenance.

Security Fixes

Three security issues are addressed in this release (also backported to 2.5.9).

XXE in DictionaryEntryPersistor (OPENNLP-1819)

The DictionaryEntryPersistor previously used a SAXParserFactory that did not enable secure processing or disable DTD handling, leaving external entity resolution active. A malicious dictionary file could exploit this for local file disclosure or SSRF before any dictionary entry was processed.

The parsing path is now aligned with the project's existing XmlUtil helper, which properly sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl.
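The hardening described above can be illustrated with plain JDK APIs. This is a minimal sketch, not the project's XmlUtil code: the class name `HardenedSaxDemo`, its helper methods, and the sample XML are hypothetical, and it assumes the JDK's built-in parser, which recognizes the Xerces-style `disallow-doctype-decl` feature URI.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; it only mirrors the two settings named in the release notes.
public class HardenedSaxDemo {

    /** Builds a factory with secure processing enabled and DOCTYPE declarations rejected. */
    public static SAXParserFactory hardenedFactory() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory;
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Parser does not support the hardening features", e);
        }
    }

    /** Returns true if the document parses, false if the hardened parser rejects it. */
    public static boolean parses(String xml) {
        SAXParser parser;
        try {
            parser = hardenedFactory().newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            parser.parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // a forbidden DOCTYPE surfaces as a SAXParseException
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String benign = "<dictionary><entry><token>NLP</token></entry></dictionary>";
        // Classic XXE shape: an external entity declared inside a DOCTYPE.
        String hostile = "<!DOCTYPE d [<!ENTITY x SYSTEM \"file:///etc/passwd\">]>"
                + "<dictionary><entry><token>&x;</token></entry></dictionary>";
        System.out.println("benign accepted: " + parses(benign));
        System.out.println("hostile accepted: " + parses(hostile));
    }
}
```

Because the parser fails as soon as it encounters the DOCTYPE declaration, the external entity is never resolved, which is the property the fix relies on.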

Arbitrary Class Instantiation in ExtensionLoader (OPENNLP-1820)

ExtensionLoader.instantiateExtension() performed its isAssignableFrom type check after Class.forName() had already executed the target class's static initializer, allowing a crafted model archive to trigger the static initializer of any class on the classpath.

The fix introduces a package-prefix allowlist consulted before Class.forName() is invoked:

  • Classes under opennlp.* remain permitted by default.
  • Other packages must be opted in via ExtensionLoader.registerAllowedPackage(String) or the OPENNLP_EXT_ALLOWED_PACKAGES system property (comma-separated list).
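The ordering described above (consult the allowlist first, load the class second) might look roughly like the following. This is a standalone sketch and not the actual ExtensionLoader source: the class `AllowlistSketch`, the methods `isAllowed` and `instantiate`, the trailing-dot prefix matching, and the exception types are all assumptions. Only the `registerAllowedPackage` name and the `OPENNLP_EXT_ALLOWED_PACKAGES` property come from the notes.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical re-implementation of a package-prefix allowlist, for illustration only.
public class AllowlistSketch {

    // Allowed package prefixes, stored with a trailing dot so "opennlp." does not match "opennlpx.".
    private static final Set<String> ALLOWED = ConcurrentHashMap.newKeySet();

    static {
        ALLOWED.add("opennlp.");
        // Assumed handling of the comma-separated system property named in the notes.
        String prop = System.getProperty("OPENNLP_EXT_ALLOWED_PACKAGES", "");
        for (String pkg : prop.split(",")) {
            if (!pkg.trim().isEmpty()) {
                registerAllowedPackage(pkg.trim());
            }
        }
    }

    /** Opts an additional package (and its sub-packages) into the allowlist. */
    public static void registerAllowedPackage(String pkg) {
        ALLOWED.add(pkg.endsWith(".") ? pkg : pkg + ".");
    }

    /** Pure string check; no class loading happens here. */
    public static boolean isAllowed(String className) {
        for (String prefix : ALLOWED) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Checks the allowlist BEFORE Class.forName, so no static initializer runs for rejected names. */
    public static <T> T instantiate(Class<T> type, String className) {
        if (!isAllowed(className)) {
            throw new IllegalArgumentException("Package not on allowlist: " + className);
        }
        try {
            Class<?> loaded = Class.forName(className);
            return loaded.asSubclass(type).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("opennlp.tools.Example")); // default prefix
        System.out.println(isAllowed("java.lang.StringBuilder")); // not opted in yet
        registerAllowedPackage("java.lang");
        CharSequence cs = instantiate(CharSequence.class, "java.lang.StringBuilder");
        System.out.println(cs.getClass().getName());
    }
}
```

The point of the design is that rejection is a string comparison: a disallowed name never reaches the class loader, so its static initializer cannot run.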

OOM via Unbounded Array Allocation in AbstractModelReader (OPENNLP-1821)

getOutcomes(), getOutcomePatterns(), and getPredicates() read attacker-controlled 32-bit count fields from binary model streams and passed them directly to array allocations. A crafted .bin file could trigger an immediate OutOfMemoryError and crash the JVM.

Each count is now bounded (default 10,000,000, configurable via -DOPENNLP_MAX_ENTRIES=<n>), with negative or oversized values failing fast via IllegalArgumentException.
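As an illustration of the bounds check, the sketch below validates a count read from a stream before allocating an array. It is not the AbstractModelReader code: `BoundedCountSketch`, `checkCount`, and `readLabels` are hypothetical names, and reading the limit with `Integer.getInteger` is an assumption about how the `OPENNLP_MAX_ENTRIES` property could be consumed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper showing a fail-fast bound on attacker-controlled counts.
public class BoundedCountSketch {

    // Default taken from the release notes; override assumed via -DOPENNLP_MAX_ENTRIES=<n>.
    static final int MAX_ENTRIES = Integer.getInteger("OPENNLP_MAX_ENTRIES", 10_000_000);

    /** Returns the count unchanged if it lies in [0, MAX_ENTRIES], otherwise throws. */
    public static int checkCount(int count, String what) {
        if (count < 0 || count > MAX_ENTRIES) {
            throw new IllegalArgumentException(
                    "Invalid " + what + " count: " + count + " (limit " + MAX_ENTRIES + ")");
        }
        return count;
    }

    /** Reads a count-prefixed list of strings, validating the count before allocating. */
    public static String[] readLabels(DataInputStream in) throws IOException {
        int n = checkCount(in.readInt(), "label");
        String[] labels = new String[n];
        for (int i = 0; i < n; i++) {
            labels[i] = in.readUTF();
        }
        return labels;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(Integer.MAX_VALUE); // hostile header claiming about 2 billion entries
        }
        try {
            readLabels(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the check, the four-byte header alone would drive an allocation of roughly two billion references before a single entry had been read.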

⚠️ For all three issues, users who cannot upgrade immediately should restrict input (dictionary and model files) to trusted sources only.

New Features & Improvements

What's Changed

New Contributors

Full Changelog: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12311215&version=12356813

OpenNLP 2.5.9

01 May 17:39

Apache OpenNLP 2.5.9

This is a maintenance and security release on the 2.x line. It backports the security fixes shipped in 3.0.0-M3 and refreshes several dependencies.

Security Fixes

Three security issues are addressed in this release (also fixed in 3.0.0-M3 on the 3.x line).

XXE in DictionaryEntryPersistor (OPENNLP-1819)

The DictionaryEntryPersistor previously used a SAXParserFactory that did not enable secure processing or disable DTD handling, leaving external entity resolution active. A malicious dictionary file could exploit this for local file disclosure or SSRF before any dictionary entry was processed.

The parsing path is now aligned with the project's existing XmlUtil helper, which properly sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl.

Arbitrary Class Instantiation in ExtensionLoader (OPENNLP-1820)

ExtensionLoader.instantiateExtension() performed its isAssignableFrom type check after Class.forName() had already executed the target class's static initializer, allowing a crafted model archive to trigger the static initializer of any class on the classpath.

The fix introduces a package-prefix allowlist consulted before Class.forName() is invoked:

  • Classes under opennlp.* remain permitted by default.
  • Other packages must be opted in via ExtensionLoader.registerAllowedPackage(String) or the OPENNLP_EXT_ALLOWED_PACKAGES system property (comma-separated list).

OOM via Unbounded Array Allocation in AbstractModelReader (OPENNLP-1821)

getOutcomes(), getOutcomePatterns(), and getPredicates() read attacker-controlled 32-bit count fields from binary model streams and passed them directly to array allocations. A crafted .bin file could trigger an immediate OutOfMemoryError and crash the JVM.

Each count is now bounded (default 10,000,000, configurable via -DOPENNLP_MAX_ENTRIES=<n>), with negative or oversized values failing fast via IllegalArgumentException.

⚠️ For all three issues, users who cannot upgrade immediately should restrict input (dictionary and model files) to trusted sources only.

What's Changed

Full Changelog: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12311215&version=12356814

OpenNLP 2.5.8

31 Mar 05:34

Summary

Maintenance Notes:

  • Bug Fixes:
  • Improvements:
    • The UIMA section of the OpenNLP developer manual (HTML + PDF) was substantially extended (OPENNLP-49)
    • Several dependency updates

What's Changed

Full Changelog: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12356577&projectId=12311215

OpenNLP 3.0.0-M2

31 Mar 05:27

Summary

The 3.x release line of Apache OpenNLP introduces no known breaking changes while significantly modularizing the project to improve library usage and future extensibility. The core API remains stable and fully compatible with 2.x, so existing projects can continue using the opennlp-tools artifact without (substantial) modifications.

Key Highlights:

  • Notable Changes:
  • New Features:
    • Apache OpenNLP can now detect sentiment from text (OPENNLP-855)
    • The eval corpus format for GermEval2014 is now supported (OPENNLP-976)
    • Document Categorization is now possible via a binding to LibSVM (OPENNLP-1808)
  • Bug Fixes:
    • The SentenceDetector received three fixes for edge cases involving abbreviation dictionaries (OPENNLP-1809, OPENNLP-1810, OPENNLP-1811). NOTE: These fixes will also be back-ported to the upcoming OpenNLP 2.5.8 release.
  • Improvements:
    • Language codes passed in are now validated more strictly for compliance with the ISO 639 standard (OPENNLP-991)
    • The UIMA section of the OpenNLP developer manual (HTML + PDF) was substantially extended (OPENNLP-49)

What's Changed

Full Changelog: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12356724&projectId=12311215

OpenNLP 3.0.0-M1

23 Feb 11:11

Summary

The 3.x release line of Apache OpenNLP introduces no known breaking changes while significantly modularizing the project to improve library usage and future extensibility.
The core API remains stable and fully compatible with 2.x, so existing projects can continue using the opennlp-tools artifact without modifications.

Key Highlights and Recommendations:

  • Modularization: The project is now organized into multiple modules: opennlp-api, opennlp-core, opennlp-cli, opennlp-extensions, ML modules (e.g., opennlp-ml-maxent, opennlp-ml-perceptron), and more.
  • Users can include only the modules needed, reducing dependency footprint.
  • Only opennlp-runtime is mandatory for basic functionality.
  • CLI Stability: Existing command-line usage remains unchanged.

What's Changed

OpenNLP 2.5.7

11 Dec 10:23

What's Changed

Full Changelog: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12311215&version=12356333

OpenNLP 2.5.6.1

20 Oct 06:59

What's Changed

Full Changelog: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12311215&version=12356312

OpenNLP 2.5.6

12 Oct 06:32

What's Changed

Full Changelog: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12311215&version=12355253

OpenNLP 2.5.5

23 Jul 10:08

What's Changed

Full Changelog: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12311215&version=12355873

OpenNLP 2.5.4

16 Apr 19:56

What's Changed

New Contributors

Full Changelog: opennlp-2.5.3...opennlp-2.5.4