feat: add support for per-level alphabet in NFT#618
Conversation
|
I think it would be cleaner if we had a single The abstract alphabet would probably have to allow for translating optionally on a specific level, so Most, if not all, of the additional functions here would then be unnecessary. |
|
I agree that it would be cleaner, reduce a lot of the confusion and potentially be better for the serialisation/deserialisation of the .mata format later on. I will work on it as soon as possible. |
|
@Adda0 Please take a look. I refactored the code to use This is my first C++ project, so I may have missed some conventions or done something in a way that is uncommon in C++. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## devel #618 +/- ##
==========================================
- Coverage 72.91% 72.65% -0.26%
==========================================
Files 45 45
Lines 6796 7289 +493
Branches 1538 1646 +108
==========================================
+ Hits 4955 5296 +341
- Misses 1227 1347 +120
- Partials 614 646 +32 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| virtual std::string reverse_translate_symbol(Symbol symbol, size_t level) const { | ||
| (void)level; | ||
| return reverse_translate_symbol(symbol); | ||
| } |
There was a problem hiding this comment.
Side note: We definitely need to rename these monstrosities when we are refactoring alphabets. It is awful to work with this in a function call, e.g., something like minimize(nfa, nfa->alphabet.reverse_translate_symbol(symbol), true)...
|
It looks like you can ignore the failing CI actions. Not related. The approach and the interface look good to me overall. It is a schame we cannot hide the invalid operations (with/without levels) for specific alphabet types without something like std::variant etc. Maybe even exploring something like |
|
One idea crossed my mind about handling the levels: what if we introduce another abstract class, LevelAlphabet, deriving from Alphabet, and provide default implementations such that anything implementing Alphabet is also automatically a valid LevelAlphabet? That way, NFA would keep using Alphabet as usual, while NFT would use LevelAlphabet, which could either be a specialized implementation, or just fall back to a plain Alphabet behaving as if all levels share the same alphabet. This would let us leave the current Alphabet implementation untouched. Does this sound like a reasonable direction in C++, or is there a downside I'm missing? As for the Also sorry for my late response, I was focusing on other projects |
This sounds great, but how would you want to implement this? |
|
My idea was something like this class LevelAlphabet : public Alphabet {
public:
~LevelAlphabet() override = default;
// --- Level-aware interface ------------------------------------------
// Defaults: ignore `level`, defer to the flat Alphabet API.
virtual Symbol translate_symb_at_level(const std::string& symb,
Level /*level*/) {
return this->translate_symb(symb);
}
virtual std::string reverse_translate_symbol_at_level(Symbol symbol,
Level /*level*/) const {
return this->reverse_translate_symbol(symbol);
}
virtual utils::OrdVector<Symbol>
get_alphabet_symbols_at_level(Level /*level*/) const {
return this->get_alphabet_symbols();
}
virtual utils::OrdVector<Symbol>
get_complement_at_level(const utils::OrdVector<Symbol>& symbols,
Level /*level*/) const {
return this->get_complement(symbols);
}
virtual bool empty_at_level(Level /*level*/) const {
return this->empty();
}
};Then we provide an adapter over a regular When constructing an NFT, we allow it to put in either Alphabet (then we wrap it into the adapter inside the constructor) or LevelAlphabet. The advantage of this over storing a The only real concern I see is that edit: of course the nft constructor can, in case of single Alphabet, create a vector containing all same pointer to that one alphabet, this would eliminate the if else branch I mentioned above, so maybe the real question is that will in the future have more leveled alphabet representation or not, or simply just leave it to when that future comes and we go for simplicity first. |
|
Just to clarify, I meant the It is an intriguing idea indeed. However, this approach has issues with inheritance. NFTs derive from NFAs, so each NFT has an The idea that we modify the original
|
|
Oh my bad, sorry for the confusion. Personally I prefer the first approach, it is straightforward and hard to gone wrong. My only concern with the second one is that since AlphabetLevels derives from Alphabet, nothing stops someone from nesting a level alphabet inside another level alphabet. It will throw error as soon as someone try to create and run it, but it can be completely avoided by design (this was the main reason I proposed keeping LevelAlphabet separate from normal Alphabet). |
|
Mata has always gone with the assumption that the user knows what they are doing, and any mistakes such as this are their issue. However, if we can make it so that the issue cannot even be created, even better. |
|
I discussed this with @koniksedy, and we agreed that the way forward seems to be the following (approach 1 from above):
What it means for this PR? That, probably, we should create |
|
That sounds clean, I will work on this ASAP. |
|
@Adda0 I've just pushed an update. Could you please check it again? There is now a standalone class |
Adda0
left a comment
There was a problem hiding this comment.
I am really happy with the changes. Thank you. It looks good. Only a few nits remaining, and we can merge.
|
Could you rebase onto master? I want to see whether that fixes the CI. |
This is a very naive implementation, not sure if it is actually correct or I missed something. (cherry picked from commit 705db38)
Co-authored-by: David Chocholatý <chocholaty.david@protonmail.com>
afac0cf to
854aeb5
Compare
|
I hope you mean the |
Oh, yeah. Of course. My bad. I see this as ready for merge. I might first disable auto-tagging on new merges in |
This PR adds support for per-level alphabets in
mata::nft::Nft.The main addition is:
std::vector<Alphabet*> level_alphabetswith the following semantics:
Nftcan fallback to use the inheritednfa::Nfa::alphabetNft::alphabetis set tonullptr, and the per-level alphabets are stored inlevel_alphabetsThe PR also adds two helper functions:
alphabet_of_levelreturns the alphabet for a given level and falls back to the inheritedNfaalphabet when no level-specific alphabet is availableset_level_alphabetsto assign or update alphabetsThis PR does not yet introduce any changes to NFT operations or to the
.mataformat for NFTs. The.mataformat part likely deserves a separate discussion.