Implements rule connection algorithm #155

Closed
kodymoodley wants to merge 5 commits into main from implements-rule-connections

@kodymoodley kodymoodley commented Mar 26, 2025

Collaborator
  • Implements rule connection prediction based on fuzzy string matching (for action, outcome and sanction connections)
  • Implements basic caching for WordNet lookups (to speed them up)
  • Implements a negation detection submodule (required to detect sanction-driven connections)
  • Updates the deontics dictionary
  • Adds WordNet lemmatization for matching text
  • Cleans up the code a bit to separate concerns
  • Adds some test data and unit tests (vitest)

A combination of WordNet and compromise.js is used.
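
For readers unfamiliar with fuzzy string matching, here is a minimal sketch of how connection prediction of this kind could work. It is not the code in this PR: the `Statement` shape, the function names, the normalized Levenshtein metric, the 0.8 threshold, and the rule "an outcome that resembles another statement's action suggests a connection" are all assumptions made for illustration.

```typescript
// Hypothetical shape; the Statement type used in the PR may differ.
interface Statement {
  id: string;
  action: string;
  outcome: string;
}

// Classic dynamic-programming Levenshtein edit distance (two-row variant).
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity in [0, 1]; 1 means identical after trimming and lowercasing.
function similarity(a: string, b: string): number {
  const x = a.trim().toLowerCase();
  const y = b.trim().toLowerCase();
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 1 : 1 - levenshtein(x, y) / longest;
}

// Assumed rule: propose an edge source -> target when the source's outcome
// is close enough to the target's action. Empty fields never match.
function findOutcomeConnections(
  statements: Statement[],
  threshold = 0.8,
): Array<[string, string]> {
  const edges: Array<[string, string]> = [];
  for (const source of statements) {
    for (const target of statements) {
      if (source.id === target.id) continue;
      if (source.outcome.trim() === "" || target.action.trim() === "") continue;
      if (similarity(source.outcome, target.action) >= threshold) {
        edges.push([source.id, target.id]);
      }
    }
  }
  return edges;
}
```

A plain edit-distance metric only catches near-identical wording, which is presumably why the PR also adds lemmatization and dictionary lookups on top of the string comparison.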

@kodymoodley kodymoodley requested a review from sverhoeven March 26, 2025 17:07
@kodymoodley

Collaborator Author

I'm happy to continue working on this to optimise performance. This is just a first implementation. Would be good to hear your thoughts @sverhoeven

@kodymoodley

Collaborator Author

Actually, it seems I was wrong: WordNet is much smaller (https://wordnet.princeton.edu/download/current-version). That's good news! Just tens of MBs. But I am also making calls to the WordNet API for every pair of token comparisons using the natural.js library. So perhaps one upfront call to download the entire db into localStorage could make it much faster too. Then we would only query the local db.

@kodymoodley

Collaborator Author

Turns out that natural.js downloads WordNet (it does not make remote API calls), so that was not the reason for the very slow computation. See here. So I just implemented basic caching of word lookups and filtering of WordNet down to the parts relevant to the tokens in the Statement array inputs. It's much faster: 5 seconds for 20 statements. Yay! But I think we should consider using an alternative to natural.js, because the natural.js documentation page says: "Keep in mind that the WordNet integration is to be considered experimental at this point, and not production-ready. The API is also subject to change. For an implementation with vastly increased performance, as well as a command-line interface, see wordpos."
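
The two optimisations mentioned above (caching word lookups and restricting work to tokens that occur in the inputs) can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `Lookup`, `withCache` and `relevantTokens` are hypothetical names, and the lookup is a synchronous stand-in for whatever WordNet query is actually used.

```typescript
// Hypothetical lookup signature standing in for a WordNet query.
type Lookup = (word: string) => string[];

// Wrap a lookup so each distinct (lowercased) word is resolved at most once.
function withCache(lookup: Lookup): Lookup {
  const cache = new Map<string, string[]>();
  return (word) => {
    const key = word.toLowerCase();
    let hit = cache.get(key);
    if (hit === undefined) {
      hit = lookup(key);
      cache.set(key, hit);
    }
    return hit;
  };
}

// Collect the distinct alphabetic tokens in the input texts, so that only
// the dictionary entries for those tokens need to be loaded or queried.
function relevantTokens(texts: string[]): Set<string> {
  const tokens = new Set<string>();
  for (const text of texts) {
    for (const token of text.toLowerCase().match(/[a-z]+/g) ?? []) {
      tokens.add(token);
    }
  }
  return tokens;
}
```

Because pairwise token comparison repeats the same words many times, a cache like this turns most lookups into a single `Map` read.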

@sverhoeven sverhoeven added this to the spa-release milestone Apr 1, 2025
@sverhoeven

sverhoeven commented Apr 1, 2025

Collaborator

In the nlp-route branch I tested whether I could run the code in the browser. Sadly, it fails with:

natural.js?v=a925bc7b:337516 Uncaught TypeError: Class extends value [object Object] is not a constructor or null
    at node_modules/natural/lib/natural/classifiers/classifier.js (natural.js?v=a925bc7b:337516:48)
    at __require2 (chunk-PLDDJCW6.js?v=a925bc7b:17:50)
    at node_modules/natural/lib/natural/classifiers/bayes_classifier.js (natural.js?v=a925bc7b:339867:22)
    at __require2 (chunk-PLDDJCW6.js?v=a925bc7b:17:50)
    at node_modules/natural/lib/natural/classifiers/index.js (natural.js?v=a925bc7b:341265:31)
    at __require2 (chunk-PLDDJCW6.js?v=a925bc7b:17:50)
    at node_modules/natural/lib/natural/index.js (natural.js?v=a925bc7b:852297:7)
    at __require2 (chunk-PLDDJCW6.js?v=a925bc7b:17:50)
    at natural.js?v=a925bc7b:852316:16

Can you replace natural with https://github.com/moos/wordpos-web, which is made to run in the browser? natural performs Node.js calls, which can never run in the browser.

@kodymoodley

Collaborator Author

Okay, I will look into this @sverhoeven

@kinsta

kinsta Bot commented Apr 1, 2025


Preview deployments for INA-tool ⚡️

Status Branch preview Commit preview
✅ Ready Visit preview Visit preview

Commit: cec01136d1fb376420d64042936e34a234cb4a41

Deployment ID: a704d775-feb8-4da6-add2-35d51a0bb4b4

Static site name: ina-tool-5f19y

…ionary subsets - no raw wordnet files are used to reduce load time
@kodymoodley

Collaborator Author

@sverhoeven I took a different approach: I extracted the subsets of WordNet I need at this stage and stored them as JS objects here and here, and I use compromise.js for the rest. The code passes my current test suite, but I imagine I will have to extend these dictionaries and the test suite to make it more robust. Let me know if you can at least build and test it for the browser though :)
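
To illustrate the idea of shipping a dictionary subset as plain in-memory data instead of raw WordNet files, here is a small sketch. The entries, the names `LEMMAS`, `SYNONYMS`, `lemmatize` and `sameConcept`, and the matching rule are all hypothetical; the actual dictionaries in the PR are not shown in this thread and may be organised differently.

```typescript
// Hypothetical excerpt of inflected form -> lemma pairs.
const LEMMAS = new Map<string, string>([
  ["paid", "pay"],
  ["pays", "pay"],
  ["fined", "fine"],
]);

// Hypothetical excerpt of lemma -> synonyms.
const SYNONYMS = new Map<string, string[]>([
  ["fine", ["penalty", "sanction"]],
  ["pay", ["remit"]],
]);

// Map a token to its lemma; unknown tokens fall back to their lowercase form.
function lemmatize(token: string): string {
  const word = token.toLowerCase();
  return LEMMAS.get(word) ?? word;
}

// Two tokens refer to the same concept if their lemmas are equal or if one
// lemma is listed as a synonym of the other.
function sameConcept(a: string, b: string): boolean {
  const x = lemmatize(a);
  const y = lemmatize(b);
  if (x === y) return true;
  return (SYNONYMS.get(x) ?? []).includes(y) || (SYNONYMS.get(y) ?? []).includes(x);
}
```

Since this is plain data with no file system or network access, it bundles for the browser without Node.js-specific calls, at the cost of having to extend the dictionaries by hand as coverage gaps show up.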

@sverhoeven

Collaborator

Yep, it works in #161, which is now also prettier and moves the computation from the main UI thread to a web worker.

@kodymoodley

Collaborator Author

Thanks @sverhoeven! Do you intend to add a button for this functionality to the component-level and statement-level network panels, so that when the user clicks on it, the edges are drawn automatically after computation? Or do you foresee that the user would prefer to use the separate "NLP analysis" section in the menu?

@sverhoeven

Collaborator

Dunno yet. Users might want to approve or dismiss the connections that were found, in which case a separate page is easier. A separate page was also easier to implement for debugging.

@sverhoeven

Collaborator

I combined your work with my UI in #161.

@sverhoeven sverhoeven closed this Apr 25, 2025