Skip to content

feat: support time travel#201

Merged
jackye1995 merged 14 commits into
lance-format:mainfrom
zhangyue19921010:support-spark-lance-timetravel-v3
Feb 5, 2026
Merged

feat: support time travel#201
jackye1995 merged 14 commits into
lance-format:mainfrom
zhangyue19921010:support-spark-lance-timetravel-v3

Conversation

@zhangyue19921010
Copy link
Copy Markdown
Contributor

Support TIMESTAMP AS OF and VERSION AS OF

spark.sql("SELECT * FROM " + fullName + " TIMESTAMP AS OF " + ts1);

spark.sql("SELECT * FROM " + fullName + " VERSION AS OF " + "2");

@github-actions github-actions Bot added the enhancement New feature or request label Feb 4, 2026
Copy link
Copy Markdown
Collaborator

@hamersaw hamersaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that the reading with options example from the docs currently does not work?

I have PR out to add tag support to VERSION AS OF. If this gets merged first I can rebase and build on top, seems like a pretty eash update to the correct loadTable function.

public static long findVersion(List<Version> versions, long timestamp) {
long versionID = -1;
Instant instant = instantFromTimestamp(timestamp);
for (Version version : versions) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are versions guaranteed to be ordered here? If not, I think we could return an incorrect version.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +105 to +126
public static LanceSparkReadOptions createReadOptions(
String location,
Long versionId,
LanceSparkCatalogConfig catalogConfig,
LanceNamespace namespace,
List<String> tableId) {
LanceSparkReadOptions.Builder builder =
LanceSparkReadOptions.builder().datasetUri(location).withCatalogDefaults(catalogConfig);

if (tableId != null) {
builder.tableId(tableId);
}

if (namespace != null) {
builder.namespace(namespace);
}

if (versionId != null) {
builder.version(versionId.intValue());
}
return builder.build();
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're just checking for null on all of the inputs here do we need to have the other two overloaded instances?

return builder.build();
}

// Determine if the timestamp is in microseconds or nanoseconds and convert to Instant
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which cases is the timestamp given in either nanos or micros?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed. only need to take care of microseconds is good enough based on spark

  /**
   * Load table metadata at a specific time by {@link Identifier identifier} from the catalog.
   * <p>
   * If the catalog supports views and contains a view for the identifier and not a table, this
   * must throw {@link NoSuchTableException}.
   *
   * @param ident a table identifier
   * @param timestamp timestamp of the table, which is microseconds since 1970-01-01 00:00:00 UTC
   * @return the table's metadata
   * @throws NoSuchTableException If the table doesn't exist or is a view
   */
  default Table loadTable(Identifier ident, long timestamp) throws NoSuchTableException {
    throw QueryCompilationErrors.noSuchTableError(ident);
  }

String location = describeResponse.getLocation();
Map<String, String> initialStorageOptions = describeResponse.getStorageOptions();
@Override
public Table loadTable(Identifier ident, String version) throws NoSuchTableException {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like all of the loadTable instances are basically duplicated. Would it make sense to consolidate some of the code? maybe provide a "version" either from string parsing or timestamp lookup as an argument and if it doesn't exist than use latest?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@zhangyue19921010
Copy link
Copy Markdown
Contributor Author

zhangyue19921010 commented Feb 5, 2026

Does this mean that the reading with options example from the docs currently does not work?

Hi @hamersaw Thanks a lot for your review. As I known current lance-spark does not support VERSION AS OF SQL expression, but it can take effect by setting .option("version", "5")

Copy link
Copy Markdown
Collaborator

@hamersaw hamersaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I'm excited to build on this for TAG support!

Copy link
Copy Markdown
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work!

@jackye1995 jackye1995 merged commit 40fd68c into lance-format:main Feb 5, 2026
9 checks passed
jackye1995 pushed a commit that referenced this pull request Feb 11, 2026
This PR fixes load table when using a Lance namespace implementation, by
setting the correct read options. This issue was introduced in
#201.

This fixes #223

To reproduce the problem, attempt to query a table when using a
namespace catalog that vends credentials. The credentials will not be
used and the query will fail.

Note that throughout the code there are a different of ways that
datasets are opened but I left those as-is for now as I don't have the
full context on if that is intentional or not.

Co-authored-by: Bryan Keller <bkeller@netflix.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants