You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I propose extending the key-bracket binding grammar so a key can bind to a temporal range rather than a single instant. Today foo[t] binds a key
to the value(s) that existed at one point in time (parsed into a TemporalKeySymbol wrapping a single TimestampSymbol, or folded onto a NavigationKeyStop). This ticket adds three new forms:
foo[t1...t2] binds the key to a half-open interval [t1, t2)
(start-inclusive, end-exclusive).
foo[...t2] is the open-start form, bounded only on the end side
(exclusive).
foo[t1...] is the open-end form, bounded only on the start side
(inclusive).
This is the grammar, AST, and parse side only. It defines the syntax,
the new AST node, and the canonical, re-parseable string form. The AST
records the two endpoints; it does not decide what a range read returns.
Server-side evaluation (resolving a range binding against stored revisions)
is a separate concourse change and is explicitly out of scope here, and the
precise read semantics described below under "Evaluation semantics are
deferred" are part of that change, not this one.
The motivating use case is a data-sync capability in cinchapi-server, a
load-balanced service whose instances scale up and down on Kubernetes.
Connector configuration (sources, credentials, schedule, and a lock field)
lives in a shared Concourse instance with no external coordinator: each
instance claims an available connection, records a lock, runs the sync, and
releases the lock. Because an instance can die mid-sync and never release
its lock, a lock must be treated as expired when it was last asserted longer
ago than a timeout window, and that staleness must be decided at query time
without mutating stored state. A range binding on the lock key is the
primitive that lets a caller scope a lock read to a freshness window in a
single key expression. Whether that window is interpreted as "values
present at some moment in the window" or "values whose write instant falls
in the window" is an evaluation-semantics decision that belongs to the
concourse change; see "Evaluation semantics are deferred" below.
Proposed API / Syntax
I propose three new bracket-content shapes, recognized only where the
existing single-timestamp bracket binding is already accepted. The
range separator is the literal token ... (three dots).
key[<start>...<end>] closed range: [start, end)
key[...<end>] open start: bounded on the end side (exclusive)
key[<start>...] open end: bounded on the start side (inclusive)
Precise parse-side semantics:
Inclusivity. The interval is half-open, start-inclusive and
end-exclusive, matching Concourse's existing BETWEEN operator
convention (verified in IndexRecord: the BETWEEN branch scans subSet(start, true, end, false) and guards historical matches with >= start and < end). An endpoint equal to start is included; an
endpoint equal to end is excluded.
Endpoint values. Each endpoint is a timestamp in exactly the forms
the single-timestamp bracket already accepts: a microsecond epoch, a
date string, or a natural-language phrase. Each endpoint is parsed
independently by NaturalLanguage.parseMicros
(src/main/java/com/cinchapi/ccl/util/NaturalLanguage.java:78), which
cannot parse a range string, so the two endpoints must be split on ... and parsed one at a time.
At least one endpoint required.key[...] with both endpoints
omitted is rejected at parse time; a range must pin at least one side.
Ordering. When both endpoints are present and resolve to micros, I
recommend the parser reject start > end at parse time with a SyntaxException (a backwards interval is always a user error and is
cheap to catch). start == end denotes an empty interval under the
half-open rule; I recommend accepting it at parse time and letting the
evaluator return nothing, so the parser does not need to special-case
it. Note that natural-language endpoints are resolved relative to the
current instant, so the ordering check applies to the resolved micros at
parse time.
At most one binding per key. A key carries at most one
bracket-timestamp binding, whether single or range. foo[t1][t2...t3]
and any other doubled bracket remain rejected, consistent with the
current invariant.
Single-instant form unchanged.foo[t] continues to parse to a TemporalKeySymbol. The range forms parse to a new sibling node so the
single-instant contract and its toString() are untouched.
New AST node (sibling to TemporalKeySymbol, e.g. TemporalRangeKeySymbol):
It wraps a KeyTokenSymbol plus an optional start TimestampSymbol and
an optional end TimestampSymbol, with the invariant that at least one
is present.
It mirrors the existing TemporalKeySymbol constructor guards
(src/main/java/com/cinchapi/ccl/grammar/TemporalKeySymbol.java:61-74):
it rejects wrapping a NavigationKeySymbol (navigation timestamps live
on stops) and rejects wrapping an already-temporal key.
isParameterized() returns true; stripParameters() and baseKey()
delegate to the wrapped key, exactly as TemporalKeySymbol does
(TemporalKeySymbol.java:108-122).
toString() emits a canonical, re-parseable string using the resolved
micros of whichever endpoints are present: key[<m1>...<m2>], key[...<m2>], or key[<m1>...]. This canonical form is a cross-repo
contract: concourse's Keys.parse must re-parse the exact same string
(see the related concourse change in Scope and dependencies). The
existing single form emits key[<micros>]
(TemporalKeySymbol.java:104-106); the range form follows the same
bracket-of-micros style.
No new programmatic builder API is required. Per the concourse-driver-java Criteria contract, the binding is expressed inside the key string (for
example .key("lock[t1...]")) and parsed by CCL; the Criteria Javadoc
documents this where().key("name[t]") pattern directly. The
operation-level .at(Timestamp) builder
(concourse-driver-java/.../lang/Criteria.java:111) is an unrelated
point-in-time fallback and is not touched.
Evaluation semantics are deferred
This ticket records two endpoints on the AST; it does not decide what a
range read returns. There are two coherent interpretations, and choosing
between them is an evaluation-semantics decision that belongs to the
concourse change:
Present-during: the values that existed at some moment in the
interval. This is the natural generalization of the single foo[t]
form, which reads the state as of t (per CCL_REFERENCE.md section 8.1,
a leaf bracket pins the leaf's evaluation timestamp). Under this
interpretation a value written before start and never changed is still
present throughout the interval and is in range.
Asserted-during: the values whose write instant falls in the
interval. Under this interpretation a value written before start is
not in range, even if it is still the current value.
These differ precisely for the lock-freshness case. A lock written once,
long before now - timeout, and never re-asserted is present-during the
freshness window but was not asserted-during it; freshness wants the
asserted-during answer. The parser does not need to take a position to
record the endpoints, and it must not, because evaluation is out of scope.
The concourse evaluation change is where this is pinned down and where the
endpoints described here are translated into a revision scan. I am calling
this out so the cross-repo follow-up does not silently inherit the wrong
interpretation.
Examples
Connector lock freshness (the motivating case). The caller scopes a lock
read to a freshness window in a single key expression; the exact
freshness rule is decided by the concourse evaluation change (see
"Evaluation semantics are deferred"):
lock[1718380800000000...] = "claimed"
Read commands with a range-bound key:
select lock[1718380800000000...] from 1
get status[...1718380800000000] from 1
find lock[1718380800000000...1718384400000000] = "claimed"
Endpoint value forms (each endpoint independently, as in the single form):
This is a planning ticket; the following is the recommended approach. The
existing single-timestamp binding lives in two parallel parse paths that
must change in lockstep, exactly as the single form does.
1. Token: add an explicit ... separator. Add a RANGE_SEPARATOR
token for the literal ... rather than leaning on the NON_ALPHANUMERIC
catch-all (grammar.jjt:316). Hazard: . is shared across the NON_ALPHANUMERIC catch-all, SIGNED_DECIMAL (grammar.jjt:302), and the
dotted key tokens, so an unquoted run such as 123...456 is lexically
ambiguous and must be disambiguated deliberately rather than left to the
default token competition. (Note that 123...456 does not itself match SIGNED_DECIMAL, which requires exactly one period with digits on both
sides; the risk is mis-tokenization across the shared . rules, not a
single-token collision.) Guard the new production with LOOKAHEAD, and to
sidestep the lexer hazard entirely for a first cut I recommend requiring
numeric-micros or quoted endpoints around ... (the single form's tests
already exercise both numeric and quoted endpoints). Note that the
lexer-path #KEY_BRACKET fragment (grammar.jjt:310) already admits any
character except ], [, \n, \r inside brackets, so ... is already
lexically legal inside a lexer-captured bracket; the work there is in the
content parser, not the token.
2. Parser path (flat / leaf and scope-prefix keys).Key()
(grammar.jjt:543-573) calls BracketedTimestamp() (grammar.jjt:771-780),
which returns a single TimestampSymbol. Extend the bracket production (or
add a sibling, e.g. BracketedTimestampRange(), selected by LOOKAHEAD on
the presence of ...) to return either a single timestamp or a start/end
pair. In Key(), when the bracket yields a range and result is not a NavigationKeySymbol, construct the new TemporalRangeKeySymbol instead of TemporalKeySymbol (grammar.jjt:572). The SearchKey() path
(grammar.jjt:529-541) reaches flat search keys via the lexer-captured
bracket and does not call BracketedTimestamp(), so it is covered by the
lexer-path work below.
3. Lexer path (bracket content captured inside key tokens). For keys
captured as PERIOD_SEPARATED_STRING / NAVIGATION_SCOPE_OPEN / ASTERISK_SUFFIXED_STRING, the bracket content is parsed by NavigationKeyStop.parseBracketContent
(src/main/java/com/cinchapi/ccl/grammar/NavigationKeyStop.java:208-230),
which today returns one TimestampSymbol and splits on whitespace. Teach
it to detect the ... separator, split into endpoints, parse each via NaturalLanguage.parseMicros, and carry the result.
4. AST node and per-stop carrier. Add TemporalRangeKeySymbol modeled
on TemporalKeySymbol (file cited above), including equals/hashCode
over the wrapped key plus both endpoints, and the parameter-aware hooks
(isParameterized, stripParameters, baseKey) defined on KeyTokenSymbol
(src/main/java/com/cinchapi/ccl/grammar/KeyTokenSymbol.java:106-132).
5. Scope. I recommend implementing leaf keys first. The per-stop
navigation carrier (NavigationKeyStop, which currently holds a single @Nullable TimestampSymbol at NavigationKeyStop.java:265-266, with withTimestamp at line 323 and segment() at line 364) and the
scope-prefix forms multiply the work across NavigationKeyStop and NavigationKeySymbol, so range support on navigation stops and scope
prefixes is a sensible follow-up rather than part of the first cut.
6. Command acceptance. Range bindings should be accepted on the same
commands that accept the single bracket binding and rejected on the same
ones that reject it, via the existing requireNotParameterized guard
(KeyTokenSymbol.java:34-57), which is already invoked from write-command and
range-history grammar actions (grammar.jjt:1059, 1081, 1099, 1124, 1136,
1595, 1612, 1696, 1710, 1780, and others). Because the new node returns isParameterized() == true, writes and the history-range commands
(audit, chronicle, diff) reject it for free; those commands already
take a from ... to window, as noted in CCL_REFERENCE.md section 8.2. For
a first cut I recommend accepting the range binding wherever the single form
is accepted (read and find paths).
7. No grammar-visitor change. As with the single form, the binding
lives on the key symbol and the expression node merely wraps it, so the
compiler visitor needs no change.
8. Regenerate the parser. The grammar is JavaCC/JJTree at
grammar/grammar.jjt (not ANTLR). After editing the .jjt, regenerate the
checked-in sources under src/main/java/com/cinchapi/ccl/generated by
running ./javacc-parser-generator.sh (a self-contained jar in the repo
root). Edit the .jjt, never the generated files.
Performance
Parse cost. Parsing is unchanged in order of growth: one extra LOOKAHEAD to detect ... and at most two NaturalLanguage.parseMicros
calls per bracket instead of one. There is no full scan and no nested
iteration introduced at parse time.
No O(n^2) risk in the grammar. The LOOKAHEAD guard that
disambiguates ... from SIGNED_DECIMAL and the dotted-key tokens must
be bounded (fixed-token lookahead), not a syntactic lookahead over the
whole bracket, so it does not turn parsing super-linear.
Evaluation is out of scope, but the AST must enable push-down. The
node should expose start and end as resolved micros so the concourse
evaluator can translate a range binding into an index-backed range scan
over a key's revisions rather than materializing and filtering full
history. Carrying the endpoints as plain micros (not as opaque phrases)
is what keeps that push-down possible on large datasets. The parser
resolving natural-language endpoints once, at parse time, also avoids
repeated re-resolution downstream.
Testing
This repo writes tests but does not run them (a live server is required);
reproduction and behavior tests still belong in the ticket. Follow the
existing conventions: JUnit 4, and the mandatory Goal / Start state /
Workflow / Expected Javadoc block that the project guidelines require on
every @Test. BracketTimestampMatrixTest already follows this block on
every method and is the model for the new tests; the new tests must carry
the same block.
closed foo[t1...t2], open-start foo[...t2], open-end foo[t1...]
each parse to the new node with the expected resolved endpoints;
half-open boundary intent is documented in the Goal (an endpoint equal
to start is in range, an endpoint equal to end is not), even though
the parser only records endpoints;
foo[...] (both endpoints omitted) is rejected;
backwards foo[t2...t1] with t2 > t1 is rejected;
double binding foo[t1][t2...t3] is rejected, mirroring the existing testF7_DoubleBracketOnLeafRejected flat double-bracket case;
keyword-equivalence is unaffected (the at/on/during keyword is a
single-instant concept and does not combine with ...);
round-trip: extend the assertRoundTrip helper coverage so each
new form survives tokenize then toString then re-parse to an equal
AST. This is the gate that proves the canonical toString() is
re-parseable.
Node unit tests mirroring
src/test/java/com/cinchapi/ccl/grammar/TemporalKeySymbolTest.java for the
new TemporalRangeKeySymbol: null-key and null-endpoint rejection,
rejection of a fully-open (both-null) range, rejection of wrapping a NavigationKeySymbol or an already-temporal key, accessor behavior, equals/hashCode, and the three toString() shapes.
Command tests in
src/test/java/com/cinchapi/ccl/BracketTimestampCommandTest.java:
range binding accepted on the read/find commands chosen for the first
cut, and rejected on writes and on audit/chronicle/diff with the
context-naming SyntaxException the existing assertCommandRejected
helper checks for.
Docs. Update CCL_REFERENCE.md section 8.4 (bracket annotation, in
detail) to document the three range forms and the half-open semantics,
and confirm section 8.2 (where brackets are accepted) still describes the
accept/reject split correctly for the range forms.
Scope and dependencies
In scope (first cut):
The ... range separator token and the three range forms on leaf
keys, in both the parser path and the lexer path.
The new TemporalRangeKeySymbol AST node with a canonical,
re-parseable toString().
Parse-time validation: at least one endpoint, reject backwards ranges,
reject double bindings.
Command acceptance via the existing requireNotParameterized guard, and
the matrix, node, command, and reference-doc updates above.
Out of scope:
Range bindings on navigation stops (a[t1...t2].b) and on scope
prefixes (a[t1...].(...)). These multiply the work across NavigationKeyStop and NavigationKeySymbol and are a natural
follow-up.
Server-side evaluation of a range binding, including the choice
between the present-during and asserted-during interpretations
documented above. That is a separate concourse change.
Cross-repo dependency. Concourse pins ccl 4.0.0
(concourse-driver-java/build.gradle); the local ccl base version is 4.1.0
(.version). This grammar change must ship in a ccl release, after which
concourse bumps its ccl dependency before the matching concourse
range-evaluation change can land. That concourse change is where Keys.parse
must be taught to re-parse the canonical range toString() produced here,
making this node's canonical string form a cross-repo contract. I am
naming the related repo and feature in prose deliberately and am not
inventing an issue number for it.
Related
Part of the connector data-sync locking initiative (cinchapi-server).
Server-side evaluation of this grammar: cinchapi/concourse#789.
Summary
I propose extending the key-bracket binding grammar so a key can bind to a
temporal range rather than a single instant. Today
foo[t]binds a keyto the value(s) that existed at one point in time (parsed into a
TemporalKeySymbolwrapping a singleTimestampSymbol, or folded onto aNavigationKeyStop). This ticket adds three new forms:foo[t1...t2]binds the key to a half-open interval[t1, t2)(start-inclusive, end-exclusive).
foo[...t2]is the open-start form, bounded only on the end side(exclusive).
foo[t1...]is the open-end form, bounded only on the start side(inclusive).
This is the grammar, AST, and parse side only. It defines the syntax,
the new AST node, and the canonical, re-parseable string form. The AST
records the two endpoints; it does not decide what a range read returns.
Server-side evaluation (resolving a range binding against stored revisions)
is a separate concourse change and is explicitly out of scope here, and the
precise read semantics described below under "Evaluation semantics are
deferred" are part of that change, not this one.
The motivating use case is a data-sync capability in cinchapi-server, a
load-balanced service whose instances scale up and down on Kubernetes.
Connector configuration (sources, credentials, schedule, and a lock field)
lives in a shared Concourse instance with no external coordinator: each
instance claims an available connection, records a lock, runs the sync, and
releases the lock. Because an instance can die mid-sync and never release
its lock, a lock must be treated as expired when it was last asserted longer
ago than a timeout window, and that staleness must be decided at query time
without mutating stored state. A range binding on the lock key is the
primitive that lets a caller scope a lock read to a freshness window in a
single key expression. Whether that window is interpreted as "values
present at some moment in the window" or "values whose write instant falls
in the window" is an evaluation-semantics decision that belongs to the
concourse change; see "Evaluation semantics are deferred" below.
Proposed API / Syntax
I propose three new bracket-content shapes, recognized only where the
existing single-timestamp bracket binding is already accepted. The
range separator is the literal token
...(three dots).Precise parse-side semantics:
end-exclusive, matching Concourse's existing
BETWEENoperatorconvention (verified in
IndexRecord: theBETWEENbranch scanssubSet(start, true, end, false)and guards historical matches with>= startand< end). An endpoint equal tostartis included; anendpoint equal to
endis excluded.the single-timestamp bracket already accepts: a microsecond epoch, a
date string, or a natural-language phrase. Each endpoint is parsed
independently by
NaturalLanguage.parseMicros(src/main/java/com/cinchapi/ccl/util/NaturalLanguage.java:78), which
cannot parse a range string, so the two endpoints must be split on
...and parsed one at a time.key[...]with both endpointsomitted is rejected at parse time; a range must pin at least one side.
recommend the parser reject
start > endat parse time with aSyntaxException(a backwards interval is always a user error and ischeap to catch).
start == enddenotes an empty interval under thehalf-open rule; I recommend accepting it at parse time and letting the
evaluator return nothing, so the parser does not need to special-case
it. Note that natural-language endpoints are resolved relative to the
current instant, so the ordering check applies to the resolved micros at
parse time.
bracket-timestamp binding, whether single or range.
foo[t1][t2...t3]and any other doubled bracket remain rejected, consistent with the
current invariant.
foo[t]continues to parse to aTemporalKeySymbol. The range forms parse to a new sibling node so thesingle-instant contract and its
toString()are untouched.New AST node (sibling to
TemporalKeySymbol, e.g.TemporalRangeKeySymbol):KeyTokenSymbolplus an optional startTimestampSymbolandan optional end
TimestampSymbol, with the invariant that at least oneis present.
TemporalKeySymbolconstructor guards(src/main/java/com/cinchapi/ccl/grammar/TemporalKeySymbol.java:61-74):
it rejects wrapping a
NavigationKeySymbol(navigation timestamps liveon stops) and rejects wrapping an already-temporal key.
isParameterized()returnstrue;stripParameters()andbaseKey()delegate to the wrapped key, exactly as
TemporalKeySymboldoes(TemporalKeySymbol.java:108-122).
toString()emits a canonical, re-parseable string using the resolvedmicros of whichever endpoints are present:
key[<m1>...<m2>],key[...<m2>], orkey[<m1>...]. This canonical form is a cross-repocontract: concourse's
Keys.parsemust re-parse the exact same string(see the related concourse change in Scope and dependencies). The
existing single form emits
key[<micros>](TemporalKeySymbol.java:104-106); the range form follows the same
bracket-of-micros style.
No new programmatic builder API is required. Per the concourse-driver-java
Criteriacontract, the binding is expressed inside the key string (forexample
.key("lock[t1...]")) and parsed by CCL; theCriteriaJavadocdocuments this
where().key("name[t]")pattern directly. Theoperation-level
.at(Timestamp)builder(concourse-driver-java/.../lang/Criteria.java:111) is an unrelated
point-in-time fallback and is not touched.
Evaluation semantics are deferred
This ticket records two endpoints on the AST; it does not decide what a
range read returns. There are two coherent interpretations, and choosing
between them is an evaluation-semantics decision that belongs to the
concourse change:
interval. This is the natural generalization of the single
foo[t]form, which reads the state as of
t(per CCL_REFERENCE.md section 8.1,a leaf bracket pins the leaf's evaluation timestamp). Under this
interpretation a value written before
startand never changed is stillpresent throughout the interval and is in range.
interval. Under this interpretation a value written before
startisnot in range, even if it is still the current value.
These differ precisely for the lock-freshness case. A lock written once,
long before
now - timeout, and never re-asserted is present-during thefreshness window but was not asserted-during it; freshness wants the
asserted-during answer. The parser does not need to take a position to
record the endpoints, and it must not, because evaluation is out of scope.
The concourse evaluation change is where this is pinned down and where the
endpoints described here are translated into a revision scan. I am calling
this out so the cross-repo follow-up does not silently inherit the wrong
interpretation.
Examples
Connector lock freshness (the motivating case). The caller scopes a lock
read to a freshness window in a single key expression; the exact
freshness rule is decided by the concourse evaluation change (see
"Evaluation semantics are deferred"):
Read commands with a range-bound key:
Endpoint value forms (each endpoint independently, as in the single form):
Canonical round-trip (what
toString()re-emits, all-micros):Implementation pointers
This is a planning ticket; the following is the recommended approach. The
existing single-timestamp binding lives in two parallel parse paths that
must change in lockstep, exactly as the single form does.
1. Token: add an explicit
...separator. Add aRANGE_SEPARATORtoken for the literal
...rather than leaning on theNON_ALPHANUMERICcatch-all (grammar.jjt:316). Hazard:
.is shared across theNON_ALPHANUMERICcatch-all,SIGNED_DECIMAL(grammar.jjt:302), and thedotted key tokens, so an unquoted run such as
123...456is lexicallyambiguous and must be disambiguated deliberately rather than left to the
default token competition. (Note that
123...456does not itself matchSIGNED_DECIMAL, which requires exactly one period with digits on bothsides; the risk is mis-tokenization across the shared
.rules, not asingle-token collision.) Guard the new production with
LOOKAHEAD, and tosidestep the lexer hazard entirely for a first cut I recommend requiring
numeric-micros or quoted endpoints around
...(the single form's testsalready exercise both numeric and quoted endpoints). Note that the
lexer-path
#KEY_BRACKETfragment (grammar.jjt:310) already admits anycharacter except
],[,\n,\rinside brackets, so...is alreadylexically legal inside a lexer-captured bracket; the work there is in the
content parser, not the token.
2. Parser path (flat / leaf and scope-prefix keys).
Key()(grammar.jjt:543-573) calls
BracketedTimestamp()(grammar.jjt:771-780),which returns a single
TimestampSymbol. Extend the bracket production (oradd a sibling, e.g.
BracketedTimestampRange(), selected byLOOKAHEADonthe presence of
...) to return either a single timestamp or a start/endpair. In
Key(), when the bracket yields a range andresultis not aNavigationKeySymbol, construct the newTemporalRangeKeySymbolinstead ofTemporalKeySymbol(grammar.jjt:572). TheSearchKey()path(grammar.jjt:529-541) reaches flat search keys via the lexer-captured
bracket and does not call
BracketedTimestamp(), so it is covered by thelexer-path work below.
3. Lexer path (bracket content captured inside key tokens). For keys
captured as
PERIOD_SEPARATED_STRING/NAVIGATION_SCOPE_OPEN/ASTERISK_SUFFIXED_STRING, the bracket content is parsed byNavigationKeyStop.parseBracketContent(src/main/java/com/cinchapi/ccl/grammar/NavigationKeyStop.java:208-230),
which today returns one
TimestampSymboland splits on whitespace. Teachit to detect the
...separator, split into endpoints, parse each viaNaturalLanguage.parseMicros, and carry the result.4. AST node and per-stop carrier. Add
TemporalRangeKeySymbolmodeledon
TemporalKeySymbol(file cited above), includingequals/hashCodeover the wrapped key plus both endpoints, and the parameter-aware hooks
(
isParameterized,stripParameters,baseKey) defined onKeyTokenSymbol(src/main/java/com/cinchapi/ccl/grammar/KeyTokenSymbol.java:106-132).
5. Scope. I recommend implementing leaf keys first. The per-stop
navigation carrier (
NavigationKeyStop, which currently holds a single@Nullable TimestampSymbolat NavigationKeyStop.java:265-266, withwithTimestampat line 323 andsegment()at line 364) and thescope-prefix forms multiply the work across
NavigationKeyStopandNavigationKeySymbol, so range support on navigation stops and scopeprefixes is a sensible follow-up rather than part of the first cut.
6. Command acceptance. Range bindings should be accepted on the same
commands that accept the single bracket binding and rejected on the same
ones that reject it, via the existing
requireNotParameterizedguard(KeyTokenSymbol.java:34-57), which is already invoked from write-command and
range-history grammar actions (grammar.jjt:1059, 1081, 1099, 1124, 1136,
1595, 1612, 1696, 1710, 1780, and others). Because the new node returns
isParameterized() == true, writes and the history-range commands(
audit,chronicle,diff) reject it for free; those commands alreadytake a
from ... towindow, as noted in CCL_REFERENCE.md section 8.2. Fora first cut I recommend accepting the range binding wherever the single form
is accepted (read and find paths).
7. No grammar-visitor change. As with the single form, the binding
lives on the key symbol and the expression node merely wraps it, so the
compiler visitor needs no change.
8. Regenerate the parser. The grammar is JavaCC/JJTree at
grammar/grammar.jjt (not ANTLR). After editing the
.jjt, regenerate thechecked-in sources under src/main/java/com/cinchapi/ccl/generated by
running ./javacc-parser-generator.sh (a self-contained jar in the repo
root). Edit the
.jjt, never the generated files.Performance
LOOKAHEADto detect...and at most twoNaturalLanguage.parseMicroscalls per bracket instead of one. There is no full scan and no nested
iteration introduced at parse time.
LOOKAHEADguard thatdisambiguates
...fromSIGNED_DECIMALand the dotted-key tokens mustbe bounded (fixed-token lookahead), not a syntactic lookahead over the
whole bracket, so it does not turn parsing super-linear.
node should expose start and end as resolved micros so the concourse
evaluator can translate a range binding into an index-backed range scan
over a key's revisions rather than materializing and filtering full
history. Carrying the endpoints as plain micros (not as opaque phrases)
is what keeps that push-down possible on large datasets. The parser
resolving natural-language endpoints once, at parse time, also avoids
repeated re-resolution downstream.
Testing
This repo writes tests but does not run them (a live server is required);
reproduction and behavior tests still belong in the ticket. Follow the
existing conventions: JUnit 4, and the mandatory Goal / Start state /
Workflow / Expected Javadoc block that the project guidelines require on
every
@Test.BracketTimestampMatrixTestalready follows this block onevery method and is the model for the new tests; the new tests must carry
the same block.
src/test/java/com/cinchapi/ccl/BracketTimestampMatrixTest.java (the
Bracket-timestamp syntax for key annotations #58 matrix). Add range rows alongside the existing flat cases:
foo[t1...t2], open-startfoo[...t2], open-endfoo[t1...]each parse to the new node with the expected resolved endpoints;
to
startis in range, an endpoint equal toendis not), even thoughthe parser only records endpoints;
foo[...](both endpoints omitted) is rejected;foo[t2...t1]witht2 > t1is rejected;foo[t1][t2...t3]is rejected, mirroring the existingtestF7_DoubleBracketOnLeafRejectedflat double-bracket case;at/on/duringkeyword is asingle-instant concept and does not combine with
...);assertRoundTriphelper coverage so eachnew form survives
tokenizethentoStringthen re-parse to an equalAST. This is the gate that proves the canonical
toString()isre-parseable.
src/test/java/com/cinchapi/ccl/grammar/TemporalKeySymbolTest.java for the
new
TemporalRangeKeySymbol: null-key and null-endpoint rejection,rejection of a fully-open (both-null) range, rejection of wrapping a
NavigationKeySymbolor an already-temporal key, accessor behavior,equals/hashCode, and the threetoString()shapes.src/test/java/com/cinchapi/ccl/BracketTimestampCommandTest.java:
range binding accepted on the read/find commands chosen for the first
cut, and rejected on writes and on
audit/chronicle/diffwith thecontext-naming
SyntaxExceptionthe existingassertCommandRejectedhelper checks for.
detail) to document the three range forms and the half-open semantics,
and confirm section 8.2 (where brackets are accepted) still describes the
accept/reject split correctly for the range forms.
Scope and dependencies
In scope (first cut):
...range separator token and the three range forms on leafkeys, in both the parser path and the lexer path.
TemporalRangeKeySymbolAST node with a canonical,re-parseable
toString().reject double bindings.
requireNotParameterizedguard, andthe matrix, node, command, and reference-doc updates above.
Out of scope:
a[t1...t2].b) and on scopeprefixes (
a[t1...].(...)). These multiply the work acrossNavigationKeyStopandNavigationKeySymboland are a naturalfollow-up.
between the present-during and asserted-during interpretations
documented above. That is a separate concourse change.
Cross-repo dependency. Concourse pins ccl 4.0.0
(concourse-driver-java/build.gradle); the local ccl base version is 4.1.0
(.version). This grammar change must ship in a ccl release, after which
concourse bumps its ccl dependency before the matching concourse
range-evaluation change can land. That concourse change is where
Keys.parsemust be taught to re-parse the canonical range
toString()produced here,making this node's canonical string form a cross-repo contract. I am
naming the related repo and feature in prose deliberately and am not
inventing an issue number for it.
Related
Part of the connector data-sync locking initiative (cinchapi-server).