feat: Add Polaris SQLshell prototype by bbejeck · Pull Request #229 · apache/polaris-tools

bbejeck · 2026-05-16T22:49:39Z

I realize that 3K lines is too large to review in one pass. I built this as a complete end-to-end idea rather than as a series of incremental PRs. Please take a look at the different parts; if we agree to pursue this idea and direction, I'll break it into smaller PRs of at most 1K lines each.

This PR adds polaris-shell, an interactive SQL shell for exploring Iceberg tables and catalog metadata through Polaris via its REST catalog API.

Motivation

Getting quick answers about your catalog — table counts, snapshot stats, storage location, small-file diagnostics — currently requires switching to Trino, Spark, or pyiceberg. Polaris Shell provides a lightweight SQL interface for these tasks without spinning up a heavy query engine.

How it works

The shell connects to Polaris using the Iceberg RESTCatalog with OAuth2 client credentials. SQL statements are parsed with an ANTLR 4 grammar, converted to a query plan, and executed directly through the Iceberg Java library, with no JDBC driver and no query engine.

SQL input → ANTLR parser → QueryPlan → Iceberg REST catalog API → results
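
To make the parse-then-plan step concrete, here is a minimal sketch in plain Java. It is not the PR's code: a hand-written regex stands in for the ANTLR 4 grammar, and `PipelineSketch`, `QueryPlan`, and the command names are hypothetical. Only two of the listed commands are covered, and the final step (calling the Iceberg REST catalog) is left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a regex stands in for the ANTLR grammar used in the PR. */
final class PipelineSketch {

    /** Hypothetical plan shape: which command to run and against which target. */
    record QueryPlan(String command, String target) {}

    // Matches "SHOW TABLES IN <namespace>" with an optional trailing semicolon.
    private static final Pattern SHOW_TABLES = Pattern.compile(
        "^SHOW\\s+TABLES\\s+IN\\s+([\\w.]+)\\s*;?$", Pattern.CASE_INSENSITIVE);

    // Matches "SHOW TABLE LOCATION <table>" with an optional trailing semicolon.
    private static final Pattern SHOW_LOCATION = Pattern.compile(
        "^SHOW\\s+TABLE\\s+LOCATION\\s+([\\w.]+)\\s*;?$", Pattern.CASE_INSENSITIVE);

    /** Turns one statement into a plan, or rejects it as unsupported. */
    static QueryPlan parse(String sql) {
        String trimmed = sql.trim();
        Matcher m = SHOW_TABLES.matcher(trimmed);
        if (m.matches()) {
            return new QueryPlan("SHOW_TABLES", m.group(1));
        }
        m = SHOW_LOCATION.matcher(trimmed);
        if (m.matches()) {
            return new QueryPlan("SHOW_TABLE_LOCATION", m.group(1));
        }
        throw new IllegalArgumentException("Unsupported statement: " + sql);
    }
}
```

An executor would then switch on `QueryPlan.command()` and issue the matching catalog call, which keeps parsing independent of how the catalog is reached.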

Supported commands

| Command | Purpose |
| --- | --- |
| `SELECT` | Sample table data with predicate pushdown, column projection, `ORDER BY`, `LIMIT` |
| `SHOW TABLES IN <namespace>` | List tables and count under a namespace |
| `DESCRIBE STATS <table>` | Snapshot count, current snapshot ID, partition spec, schema |
| `SHOW TABLE LOCATION <table>` | Storage location |
| `SHOW TABLE POLICIES <table>` | Effective Polaris policies |
| `DIAGNOSE TABLE <table>` | Small-file count vs. 128 MiB threshold |
| `EXPLAIN SELECT ...` | Scan plan: manifest pruning, files eliminated, estimated bytes, warnings |

SELECT queries are intended for sampling and exploration, not production workloads.
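
As an illustration of the small-file check behind `DIAGNOSE TABLE`, the sketch below counts data files under the 128 MiB threshold. It is a simplified stand-in rather than the PR's implementation: `SmallFileDiagnosticSketch` and `Report` are hypothetical names, file sizes are passed in as plain numbers instead of being read from Iceberg metadata, and treating a file of exactly 128 MiB as "not small" is an assumption.

```java
import java.util.List;

/** Illustrative only: counts files below a fixed size threshold. */
final class SmallFileDiagnosticSketch {

    // 128 MiB, the threshold named in the command table.
    static final long THRESHOLD_BYTES = 128L * 1024 * 1024;

    /** Hypothetical report shape: total files and how many are small. */
    record Report(int totalFiles, int smallFiles) {
        /** Fraction of files that are small; 0.0 for an empty table. */
        double smallFileRatio() {
            return totalFiles == 0 ? 0.0 : (double) smallFiles / totalFiles;
        }
    }

    /** Assumption: "small" means strictly below the threshold. */
    static Report diagnose(List<Long> fileSizesInBytes) {
        int small = 0;
        for (long size : fileSizesInBytes) {
            if (size < THRESHOLD_BYTES) {
                small++;
            }
        }
        return new Report(fileSizesInBytes.size(), small);
    }
}
```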

Demo

A fully self-contained demo runs locally via Docker Compose + MinIO — no AWS account or external Polaris server required.

See polaris-shell/README.md for full documentation, sample output, configuration reference, and demo instructions.

snazy

Thanks for putting this together. I think this is a useful proof of concept, and I like the general direction of having a lightweight Polaris CLI/shell for catalog exploration. Also, I’m fine with a big draft PR that contains the whole idea end-to-end, as long as we treat it as a design/prototype discussion and later split it into smaller, actually reviewable PRs.

I do have a few bigger design concerns though, mostly around where this would go if we turned it into a real user-facing CLI.

First, I wonder whether we should look more closely at existing tools, for example the Nessie CLI, before going too far down this path. It already has quite a bit of REPL/script execution/terminal/completion infrastructure, and the CongoCC-based grammar is pretty well suited for completion. I’m a bit hesitant about adding ANTLR here, partly because of the extra runtime jar and possible dependency conflicts, and partly because completion seems to be an important part of the UX for this kind of tool.

Second, I’m not sure generic SELECT support is the right starting point. It can very quickly turn this from a catalog/admin shell into a small query engine, with all the semantics and expectations that come with that. I’d feel more comfortable starting with catalog-oriented commands like listing namespaces/tables, describing schemas/properties/snapshots/locations, diagnostics, etc., and being very explicit that this is not a SQL execution engine.

Related to that, I’m also not sure about EXPLAIN SELECT. The current implementation seems more like an Iceberg/table scan diagnostic than a query-engine explain plan. Maybe that’s still useful, but I’d probably frame it as a table diagnostic command instead of tying it to SQL EXPLAIN.
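
To show what a scan diagnostic of this kind computes, here is a rough sketch that eliminates files using per-file min/max statistics for a single `>=` predicate and sums the bytes of the files that remain. It is a deliberate simplification and not the PR's code: Iceberg prunes at the manifest level as well as the file level, and `ScanDiagnosticSketch`, `FileStats`, and `ScanSummary` are hypothetical names.

```java
import java.util.List;

/** Illustrative only: min/max pruning for a predicate of the form col >= lowerBound. */
final class ScanDiagnosticSketch {

    /** Hypothetical per-file stats: size plus min/max of one long-valued column. */
    record FileStats(String path, long sizeBytes, long minValue, long maxValue) {}

    /** Hypothetical summary: files seen, files skipped, bytes left to read. */
    record ScanSummary(int filesTotal, int filesEliminated, long estimatedBytes) {}

    /** A file is skipped when its maximum value cannot satisfy col >= lowerBound. */
    static ScanSummary planGreaterOrEqual(List<FileStats> files, long lowerBound) {
        int eliminated = 0;
        long bytes = 0;
        for (FileStats f : files) {
            if (f.maxValue() < lowerBound) {
                eliminated++;            // no row in this file can match
            } else {
                bytes += f.sizeBytes();  // file may contain matching rows
            }
        }
        return new ScanSummary(files.size(), eliminated, bytes);
    }
}
```

Nothing here depends on SQL syntax, which is consistent with framing the feature as a table diagnostic command rather than as `EXPLAIN`.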

The other big one for me is credentials. If this is intended to become a real CLI for users, I’m strongly against documenting a new plaintext properties file with any client or object storage secrets. For local demos that’s one thing, but for actual usage we should have a better config story from the beginning. SmallRye Config might be worth considering here: it gives us typed Java config mapping, environment variable support, and mechanisms for encrypted values / secrets managers. At minimum I’d want env-var support, clear guidance around file permissions, and examples that don’t encourage putting long-lived secrets in a checked-in or casually copied properties file.
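
As a minimal illustration of the env-var baseline, the sketch below reads client credentials from an environment map and fails fast when one is missing. It uses only the JDK rather than SmallRye Config, and the variable names `POLARIS_CLIENT_ID` and `POLARIS_CLIENT_SECRET`, along with the class and record names, are hypothetical and not taken from the PR.

```java
import java.util.Map;

/** Illustrative only: credentials come from the environment, never from a file. */
final class CredentialConfigSketch {

    // Hypothetical variable names; the PR does not define these.
    static final String ID_VAR = "POLARIS_CLIENT_ID";
    static final String SECRET_VAR = "POLARIS_CLIENT_SECRET";

    /** Holds the credentials and masks the secret when printed or logged. */
    record ClientCredentials(String clientId, String clientSecret) {
        @Override
        public String toString() {
            return "ClientCredentials[clientId=" + clientId + ", clientSecret=***]";
        }
    }

    /** Takes the environment as a map so it can be tested without real env vars. */
    static ClientCredentials fromEnvironment(Map<String, String> env) {
        return new ClientCredentials(require(env, ID_VAR), require(env, SECRET_VAR));
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }
}
```

In the shell this would be called as `CredentialConfigSketch.fromEnvironment(System.getenv())`. Passing the map in explicitly keeps the lookup testable, and the overridden `toString()` keeps the secret out of logs.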

So overall: I think this is a good PoC and useful for discussing the shape of the tool, but before merging something like this I’d like us to agree on the CLI scope and architecture first, especially around parser/completion, whether we want any data-querying at all, and credential handling.

bbejeck · 2026-05-23T19:34:36Z

Thanks @snazy for the detailed response! I'm taking a look now.

Add Polaris SQL shell prototype with ANTLR-based parser and query pla…

bce6e72

…nner

snazy reviewed May 19, 2026

bbejeck wants to merge 1 commit into apache:main from bbejeck:add-polaris-shell
