The RuleWritable is probably the most important datatype in Thrax. It represents a single synchronous context-free grammar (SCFG) rule. It has these fields:
lhs, a Text (a Hadoop datatype for fast-comparison Strings) representing the left-hand-side nonterminal of the rule.
source, a Text representation of the source side of the rule.
target, a Text representation of the target side of the rule.
e2f and f2e, two AlignmentArrays giving the target-to-source alignments and source-to-target alignments, respectively.
features, a MapWritable holding the rule's feature values.
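To make the field list concrete, here is a rough sketch of the shape of such a rule. It is not the actual Thrax class: `RuleSketch` and `toRuleString` are hypothetical names, and plain Java types stand in for the Hadoop ones (String for Text, String[][] for AlignmentArray, a HashMap for MapWritable) so that it compiles without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for RuleWritable, for illustration only.
// Assumed substitutions: String for Text, String[][] for AlignmentArray,
// Map<String, Double> for MapWritable.
class RuleSketch {
    final String lhs;     // left-hand-side nonterminal, e.g. "[X]"
    final String source;  // source side of the rule
    final String target;  // target side of the rule
    String[][] e2f;       // target-to-source alignments
    String[][] f2e;       // source-to-target alignments
    final Map<String, Double> features = new HashMap<>();

    RuleSketch(String lhs, String source, String target) {
        this.lhs = lhs;
        this.source = source;
        this.target = target;
    }

    // Renders the rule in the "lhs ||| source ||| target |||" form used on this page.
    String toRuleString() {
        return lhs + " ||| " + source + " ||| " + target + " |||";
    }
}
```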
Here are some notes:
The AlignmentArray is a two-dimensional array of Text. It has one row per terminal symbol on a given side of the rule, and the first item of each row is that terminal symbol. The remaining items in the row are the terminals on the other side that it has been aligned to, or the single item "/UNALIGNED/" if the word is unaligned. For example, let's say we have a rule
[X] ||| foo [X] bar baz ||| a b [X] c |||
where foo is aligned to a and b, baz is aligned to c, and bar is unaligned. Then the AlignmentArrays would look like this:
e2f:
[ a | foo ]
[ b | foo ]
[ c | baz ]
f2e:
[ foo | a | b ]
[ bar | /UNALIGNED/ ]
[ baz | c ]
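The layout above can be reproduced with a small sketch. This is not Thrax's own code: `AlignmentSketch`, `build`, and `flip` are hypothetical names, String[][] stands in for AlignmentArray, and the alignment is assumed to arrive as index pairs over the terminals of each side (nonterminals already removed).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
class AlignmentSketch {
    static final String UNALIGNED = "/UNALIGNED/";

    // words:  terminals on one side of the rule, in order
    // others: terminals on the opposite side
    // links:  pairs {i, j} meaning words[i] is aligned to others[j]
    static String[][] build(String[] words, String[] others, int[][] links) {
        String[][] rows = new String[words.length][];
        for (int i = 0; i < words.length; i++) {
            List<String> row = new ArrayList<>();
            row.add(words[i]);                 // first item: the terminal itself
            for (int[] link : links) {
                if (link[0] == i) {
                    row.add(others[link[1]]);  // then each terminal it is aligned to
                }
            }
            if (row.size() == 1) {
                row.add(UNALIGNED);            // no links found for this terminal
            }
            rows[i] = row.toArray(new String[0]);
        }
        return rows;
    }

    // Swaps each {i, j} pair to {j, i} so the same links can be read from the other side.
    static int[][] flip(int[][] links) {
        int[][] flipped = new int[links.length][];
        for (int k = 0; k < links.length; k++) {
            flipped[k] = new int[] {links[k][1], links[k][0]};
        }
        return flipped;
    }

    public static void main(String[] args) {
        String[] src = {"foo", "bar", "baz"};
        String[] tgt = {"a", "b", "c"};
        int[][] links = {{0, 0}, {0, 1}, {2, 2}};  // foo-a, foo-b, baz-c
        System.out.println(Arrays.deepToString(build(src, tgt, links)));
        // prints [[foo, a, b], [bar, /UNALIGNED/], [baz, c]]
        System.out.println(Arrays.deepToString(build(tgt, src, flip(links))));
        // prints [[a, foo], [b, foo], [c, baz]]
    }
}
```

The first call corresponds to f2e and the second to e2f in the example.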
features is a MapWritable. This is what you will want to modify to add new feature values to a rule. Once you calculate a feature value, you can simply put it into the map. Easy.
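As a rough illustration of that last step, the sketch below computes a made-up feature and puts it into a map. Everything here is an assumption for the sake of the example: `FeatureSketch`, `countSourceTerminals`, and the feature name "SourceTerminalCount" are hypothetical, and a plain HashMap stands in for the MapWritable. Hadoop's MapWritable implements java.util.Map over Writable keys and values, so the real call would be the same put, with the key and value wrapped in Writable types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical feature computation, for illustration only.
class FeatureSketch {
    // Counts source-side tokens that are not bracketed nonterminals such as [X].
    static int countSourceTerminals(String source) {
        int n = 0;
        for (String tok : source.trim().split("\\s+")) {
            boolean isNonterminal = tok.startsWith("[") && tok.endsWith("]");
            if (!isNonterminal) {
                n++;
            }
        }
        return n;
    }

    // Stores the computed value under a made-up feature name.
    static void addFeature(Map<String, Double> features, String source) {
        features.put("SourceTerminalCount", (double) countSourceTerminals(source));
    }

    public static void main(String[] args) {
        Map<String, Double> features = new HashMap<>();  // stand-in for MapWritable
        addFeature(features, "foo [X] bar baz");
        System.out.println(features);  // prints {SourceTerminalCount=3.0}
    }
}
```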