feat: Add Polaris SQLshell prototype #229
snazy left a comment
Thanks for putting this together. I think this is a useful proof of concept, and I like the general direction of having a lightweight Polaris CLI/shell for catalog exploration. Also, I’m fine with a big draft PR that contains the whole idea end-to-end, as long as we treat it as a design/prototype discussion and later split it into smaller, actually reviewable PRs.
I do have a few bigger design concerns though, mostly around where this would go if we turned it into a real user-facing CLI.
First, I wonder whether we should look more closely at existing tools, for example the Nessie CLI, before going too far down this path. It already has quite a bit of REPL/script execution/terminal/completion infrastructure, and the CongoCC-based grammar is pretty well suited for completion. I’m a bit hesitant about adding ANTLR here, partly because of the extra runtime jar and possible dependency conflicts, and partly because completion seems to be an important part of the UX for this kind of tool.
Second, I’m not sure generic SELECT support is the right starting point. It can very quickly turn this from a catalog/admin shell into a small query engine, with all the semantics and expectations that come with that. I’d feel more comfortable starting with catalog-oriented commands like listing namespaces/tables, describing schemas/properties/snapshots/locations, diagnostics, etc., and being very explicit that this is not a SQL execution engine.
Related to that, I’m also not sure about EXPLAIN SELECT. The current implementation seems more like an Iceberg/table scan diagnostic than a query-engine explain plan. Maybe that’s still useful, but I’d probably frame it as a table diagnostic command instead of tying it to SQL EXPLAIN.
The other big one for me is credentials. If this is intended to become a real CLI for users, I’m strongly against documenting a new plaintext properties file with any client or object storage secrets. For local demos that’s one thing, but for actual usage we should have a better config story from the beginning. SmallRye Config might be worth considering here: it gives us typed Java config mapping, environment variable support, and mechanisms for encrypted values / secrets managers. At minimum I’d want env-var support, clear guidance around file permissions, and examples that don’t encourage putting long-lived secrets in a checked-in or casually copied properties file.
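As a rough illustration of the env-var minimum described above (not taken from the PR), the lookup could prefer environment variables over file values, so that secrets never have to be written to disk. This sketch uses only the Java standard library rather than SmallRye Config, and the `POLARIS_SHELL_` prefix, the class name, and the property keys are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Sketch only: resolve shell settings from the environment first, then from file properties. */
final class ShellConfigSketch {

  // Hypothetical prefix; the PR does not define any environment variable names.
  static final String ENV_PREFIX = "POLARIS_SHELL_";

  /** Maps a property key such as "client.secret" to "POLARIS_SHELL_CLIENT_SECRET". */
  static String envName(String key) {
    return ENV_PREFIX + key.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
  }

  /** A non-empty environment value wins; otherwise fall back to the file value, if any. */
  static Optional<String> resolve(
      String key, Map<String, String> env, Map<String, String> fileProps) {
    String fromEnv = env.get(envName(key));
    if (fromEnv != null && !fromEnv.isEmpty()) {
      return Optional.of(fromEnv);
    }
    return Optional.ofNullable(fileProps.get(key));
  }

  public static void main(String[] args) {
    // Non-secret settings can live in a file; the secret is expected from the environment.
    Map<String, String> fileProps = new LinkedHashMap<>();
    fileProps.put("uri", "http://localhost:8181/api/catalog"); // assumed local demo URI

    Map<String, String> env = System.getenv();
    System.out.println("uri = " + resolve("uri", env, fileProps).orElse("<unset>"));
    // Never print the secret itself, only whether it was found.
    System.out.println("client.secret set: " + resolve("client.secret", env, fileProps).isPresent());
  }
}
```

Passing the environment in as a `Map` keeps the lookup easy to test and leaves room to swap in a real config library later.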
So overall: I think this is a good PoC and useful for discussing the shape of the tool, but before merging something like this I’d like us to agree on the CLI scope and architecture first, especially around parser/completion, whether we want any data-querying at all, and credential handling.
Thanks @snazy for the detailed response! I'm taking a look now.
I realize this is unrealistically large at 3K lines. I started working on this and was looking at it from the perspective of the entire idea, not adding this over multiple PRs. So I'll ask to take a look at different parts, and if we agree to pursue this idea and direction, I'll break it into smaller PRs, targeting at most 1K lines per PR.
This PR adds polaris-shell, an interactive SQL shell for exploring Iceberg tables and catalog metadata through Polaris via its REST catalog API.
Motivation
Getting quick answers about your catalog — table counts, snapshot stats, storage location, small-file diagnostics — currently requires switching to Trino, Spark, or pyiceberg. Polaris Shell provides a lightweight SQL interface for these tasks without spinning up a heavy query engine.
How it works
Connects to Polaris using the Iceberg RESTCatalog with OAuth2 client credentials. SQL statements are parsed with an ANTLR 4 grammar, converted to a query plan, and executed directly through the Iceberg Java library, with no JDBC driver and no query engine.
SQL input → ANTLR parser → QueryPlan → Iceberg REST catalog API → results
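The flow from statement to plan to catalog call can be sketched roughly as follows. This is not the PR's implementation: a regular expression stands in for the ANTLR grammar, and `CatalogOps`, `ShowTablesPlan`, and `PipelineSketch` are hypothetical names. In the real shell, the `listTables` call would presumably be backed by Iceberg's `Catalog.listTables(Namespace)`.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: statement text -> plan object -> catalog call -> result rows. */
final class PipelineSketch {

  /** Hypothetical stand-in for the small part of the catalog API this example needs. */
  interface CatalogOps {
    List<String> listTables(String namespace);
  }

  /** Hypothetical plan node for SHOW TABLES IN <namespace>. */
  static final class ShowTablesPlan {
    private final String namespace;

    ShowTablesPlan(String namespace) {
      this.namespace = namespace;
    }

    String namespace() {
      return namespace;
    }
  }

  // A regex replaces the ANTLR parser here purely to keep the example self-contained.
  private static final Pattern SHOW_TABLES =
      Pattern.compile(
          "^\\s*SHOW\\s+TABLES\\s+IN\\s+([A-Za-z_][\\w.]*)\\s*;?\\s*$",
          Pattern.CASE_INSENSITIVE);

  /** Returns a plan for a recognized statement, or empty for anything else. */
  static Optional<ShowTablesPlan> parse(String sql) {
    Matcher m = SHOW_TABLES.matcher(sql);
    return m.matches() ? Optional.of(new ShowTablesPlan(m.group(1))) : Optional.empty();
  }

  /** Executes the plan directly against the catalog, with no query engine in between. */
  static List<String> execute(ShowTablesPlan plan, CatalogOps catalog) {
    return catalog.listTables(plan.namespace());
  }

  public static void main(String[] args) {
    // In-memory fake catalog so the sketch runs without a Polaris server.
    CatalogOps fake = ns -> List.of(ns + ".orders", ns + ".customers");
    parse("show tables in sales;")
        .map(plan -> execute(plan, fake))
        .ifPresentOrElse(
            tables -> tables.forEach(System.out::println),
            () -> System.out.println("unsupported statement"));
  }
}
```

Keeping the plan as a plain object between parsing and execution means the parser could be replaced without touching the catalog-facing code.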
Supported commands
- SELECT, with ORDER BY and LIMIT
- SHOW TABLES IN <namespace>
- DESCRIBE STATS <table>
- SHOW TABLE LOCATION <table>
- SHOW TABLE POLICIES <table>
- DIAGNOSE TABLE <table>
- EXPLAIN SELECT ...
Demo
A fully self-contained demo runs locally via Docker Compose + MinIO — no AWS account or external Polaris server required.
See polaris-shell/README.md for full documentation, sample output, configuration reference, and demo instructions.