1 change: 0 additions & 1 deletion .github/workflows/build.yaml
@@ -4,7 +4,6 @@ on:
push:
branches: [ main, master, develop ]
pull_request:
branches: [ main, master, develop ]
workflow_dispatch:

concurrency:
1 change: 0 additions & 1 deletion .github/workflows/pr-checks.yaml
@@ -2,7 +2,6 @@ name: PR Checks

on:
pull_request:
branches: [ main, master, develop ]

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
1 change: 0 additions & 1 deletion .github/workflows/syfon-backend-e2e.yaml
@@ -2,7 +2,6 @@ name: Syfon Backend E2E

on:
pull_request:
branches: [ main, master, develop ]
workflow_dispatch:

concurrency:
1 change: 0 additions & 1 deletion .github/workflows/test.yaml
@@ -4,7 +4,6 @@ on:
push:
branches: [ main, master, develop ]
pull_request:
branches: [ main, master, develop ]
workflow_dispatch:

concurrency:
184 changes: 78 additions & 106 deletions README.md
@@ -3,143 +3,115 @@
---
# NOTICE

git-drs is not yet fully compliant with DRS. It currently works against Gen3 DRS server. Full GA4GH DRS support is expected once v1.6 of the specification has been published.
`git-drs` is not a pure GA4GH DRS client. It targets Syfon/Gen3-style DRS workflows and uses extensions where repo-scale behavior requires them.

---

[![Tests](https://github.com/calypr/git-drs/actions/workflows/test.yaml/badge.svg)](https://github.com/calypr/git-drs/actions/workflows/test.yaml)

**Git/DRS orchestration with optional Git LFS compatibility**
**Git/DRS orchestration with Git-compatible pointer workflows**

Git DRS manages Git-facing DRS workflows: local metadata, Git hooks, filter behavior, lookup/register/push/pull orchestration, and optional Git LFS compatibility. Provider-specific transfer, signed URL behavior, and direct cloud inspection live in client code outside this repo.
`git-drs` manages:

- remote Gen3/Syfon configuration
- local DRS metadata
- pointer-aware push/pull orchestration
- bucket-scoped object reference workflows

## Key Features

- **Unified Workflow**: Manage both code and large data files using standard Git commands
- **DRS Integration**: Built-in support for Gen3 DRS servers
- **Multi-Remote Support**: Work with development, staging, and production servers in one repository
- **Automatic Processing**: Files are processed automatically during commits and pushes
- **Flexible Tracking**: Track individual files, patterns, or entire directories
- unified Git/data workflow around DRS-backed pointers
- Gen3/Syfon integration
- multiple remotes in one repository
- explicit file tracking and hydration
- metadata-only reference support for existing bucket objects

## How It Works

Git DRS works alongside Git LFS when you want LFS-compatible pointers and storage, while still supporting DRS-centric workflows:
At a high level:

1. **Initialization**: Set up repository and DRS server configuration
2. **Automatic Commits**: Create DRS objects during pre-commit hooks
3. **Automatic Pushes**: Register files with DRS servers and upload to configured storage
4. **On-Demand Downloads**: Pull specific files or patterns as needed
1. configure a remote for one `organization/project`
2. let `remote add` bootstrap repo-local `git-drs` state if needed
3. track file patterns with `git drs track`
4. add/commit/push normally
5. remove tracked pointers with `git drs rm` when you want repository deletion to reconcile with remote DRS state
6. hydrate pointer files later with `git drs pull`

## Quick Start

### Installation

```bash
# Install Git LFS first
brew install git-lfs # macOS
git lfs install --skip-smudge

# Install Git DRS
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/calypr/git-drs/refs/heads/main/install.sh)" -- $GIT_DRS_VERSION

# Install global Git filter configuration for git-drs
git drs install
```

### Basic Usage

```bash
# Initialize repository (one-time Git repo setup)
git drs init

# Add DRS remote
git drs remote add gen3 production \
--cred /path/to/credentials.json \
--url https://calypr-public.ohsu.edu \
--organization my-program \
--project my-project \
--bucket my-bucket

# Required prerequisite (usually steward/admin setup):
# create bucket credentials, then map org/project to full storage roots before users run push/pull
git drs bucket add production \
--bucket my-bucket \
--region us-east-1 \
--access-key "$AWS_ACCESS_KEY_ID" \
--secret-key "$AWS_SECRET_ACCESS_KEY" \
--s3-endpoint https://s3.amazonaws.com
git drs bucket add-organization production \
--organization my-program \
--path s3://my-bucket/my-program
git drs bucket add-project production \
--organization my-program \
--project my-project \
--path s3://my-bucket/my-program/my-project

# Track files
git lfs track "*.bam"
git drs remote add gen3 production HTAN_INT/BForePC --cred /path/to/credentials.json
git drs track "*.bam"
git add .gitattributes

# Add and commit files
git add my-file.bam
git commit -m "Add data file"
git push

# Download files
git lfs pull -I "*.bam"
git add sample.bam
git commit -m "Add sample"
git drs push
git drs ls-files
git drs pull -I "*.bam"
```

## Documentation

For detailed setup and usage information:
## Current CLI Shape

- **[Getting Started](docs/getting-started.md)** - Repository setup and basic workflows
- **[Commands Reference](docs/commands.md)** - Complete command documentation
- **[Installation Guide](docs/installation.md)** - Platform-specific installation
- **[Troubleshooting](docs/troubleshooting.md)** - Common issues and solutions
- **[E2E Modes + Local Setup](docs/e2e-modes-and-local-setup.md)** - Local vs remote mode, server config, and end-to-end runbooks
- **[Cloud/Object Integration](docs/adding-s3-files.md)** - Adding files from provider URLs or configured bucket object keys
- **[Developer Guide](docs/developer-guide.md)** - Internals and development
The cleaned CLI intentionally removed legacy commands:

## Supported Servers
- removed:
- `git drs fetch`
- `git drs list`
- `git drs upload`
- `git drs download`
- `git drs pull` is hydration-only
- `git drs ls-files` is the local file inventory command
- `git drs remote add gen3` takes scope as `organization/project`

- **Gen3 Data Commons** (e.g., CALYPR)
Example:

## Supported Environments

- **Local Development** environments
- **HPC Systems** (e.g., ARC)
```bash
git drs remote add gen3 production HTAN_INT/BForePC --cred /path/to/credentials.json
```

## Commands Overview
## Bucket Mapping Model

End users should not need to know the bucket name.

Push and pull depend on server-side bucket mapping for the requested scope. That mapping is normally provisioned once by a steward/admin using the bucket commands.

## Common Commands

| Command | Description |
| --- | --- |
| `git drs install` | Install global `git-drs` filter config |
| `git drs init` | Explicitly initialize or repair repository-local `git-drs` state |
| `git drs remote add gen3 [remote] <org/project>` | Add or refresh a Gen3/Syfon remote |
| `git drs remote list` | List configured remotes |
| `git drs remote remove <name>` | Remove a configured DRS remote |
| `git drs remote set <name>` | Set the default remote |
| `git drs track <pattern>` | Track files or globs |
| `git drs untrack <pattern>` | Stop tracking files or globs |
| `git drs rm <path>...` | Remove tracked DRS/LFS files from Git |
| `git drs ls-files` | List tracked files and localization state |
| `git drs pull` | Hydrate pointer files in the current checkout |
| `git drs push` | Register/upload objects, reconcile committed deletes, and push refs |
| `git drs add-url` | Add an existing provider object by URL or scoped key |
| `git drs add-ref` | Add a local reference to an existing DRS object |
| `git drs query` | Query a DRS object by ID |
| `git drs copy-records` | Copy Syfon records between remotes for one scope |

| Command | Description |
| ---------------------- | ------------------------------------- |
| `git drs install` | Install global git-drs filter config |
| `git drs init` | Initialize repository |
| `git drs remote add` | Add a DRS remote server |
| `git drs remote list` | List configured remotes |
| `git drs remote set` | Set default remote |
| `git drs add-url` | Add files via provider URLs or configured bucket object keys |
| `git lfs track` | Track file patterns with LFS |
| `git lfs ls-files` | List tracked files |
| `git lfs pull` | Download tracked files |
| `git drs fetch` | Fetch metadata from DRS server |
| `git drs push` | Push objects to DRS server |
## Documentation

Use `--help` with any command for details. See [Commands Reference](docs/commands.md) for complete documentation.
- [Getting Started](docs/getting-started.md)
- [Commands Reference](docs/commands.md)
- [Troubleshooting](docs/troubleshooting.md)
- [Developer Guide](docs/developer-guide.md)
- [GA4GH DRS Scalability Gaps](docs/ga4gh-drs-scalability-gaps.md)

## Requirements

- Git LFS installed and configured
- Access credentials for your DRS server
- Go 1.24+ (for building from source)
- Git
- access credentials for the target Gen3/Syfon deployment
- Go 1.26.2+ for local builds

## Support

- **Issues**: [GitHub Issues](https://github.com/calypr/git-drs/issues)
- **Releases**: [GitHub Releases](https://github.com/calypr/git-drs/releases)
- **Documentation**: See `docs/` folder for detailed guides

## License

This project is part of the CALYPR data commons ecosystem.
- [GitHub Issues](https://github.com/calypr/git-drs/issues)
- [GitHub Releases](https://github.com/calypr/git-drs/releases)
51 changes: 51 additions & 0 deletions attic/issue-add-include-pattern-to-git-drs-pull.md
@@ -0,0 +1,51 @@
# Add `-I "pattern"` include filter support to `git drs pull`

## Summary
Add include-pattern filtering to `git drs pull`, similar to legacy `git lfs pull -I "pattern"` workflows.

## Motivation
`git drs pull` currently selects objects through repository resolution alone, with no user-facing path pattern filter. Users migrating from `git lfs pull -I` expect to hydrate files selectively by glob or path.

## Proposed UX
Support:

```bash
git drs pull -I "results/*.txt"
git drs pull -I "*.bam" -I "data/**"
git drs pull --include "path/to/file"
```

Optional:
- `--exclude` parity (if desired in same change or follow-up)

## Proposed behavior
1. Parse one or more include patterns (`-I`, `--include`).
2. Resolve candidate pointers as usual.
3. Filter by repo-relative path match before download.
4. Download only matched objects; skip others with clear logging.
5. If no pattern supplied, preserve current default behavior.
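
The selection steps above could be sketched roughly as follows. This is a minimal illustration, not the `git-drs` implementation: `matchInclude` and `filterPaths` are hypothetical names, paths are assumed to be slash-separated and repo-relative, and the matching rules (a pattern without `/` matches the base name, a trailing `/**` matches everything under a directory) are assumptions about what "pattern syntax" might mean here.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchInclude reports whether a repo-relative, slash-separated path
// matches one include pattern. Hypothetical helper; the rules below are
// assumptions for illustration only.
func matchInclude(pattern, repoPath string) (bool, error) {
	// Assumption: a trailing "/**" means "everything under this directory".
	// path.Match has no "**" support, so this case is handled by prefix.
	if strings.HasSuffix(pattern, "/**") {
		prefix := strings.TrimSuffix(pattern, "**") // keeps the trailing "/"
		return strings.HasPrefix(repoPath, prefix), nil
	}
	// Assumption: a pattern without "/" is matched against the base name,
	// so "*.bam" selects BAM files in any directory.
	if !strings.Contains(pattern, "/") {
		return path.Match(pattern, path.Base(repoPath))
	}
	// Otherwise match the whole repo-relative path.
	return path.Match(pattern, repoPath)
}

// filterPaths keeps the paths that match at least one include pattern.
// With no patterns it returns the input unchanged (default behavior).
func filterPaths(paths, includes []string) ([]string, error) {
	if len(includes) == 0 {
		return paths, nil
	}
	var kept []string
	for _, p := range paths {
		for _, pat := range includes {
			ok, err := matchInclude(pat, p)
			if err != nil {
				return nil, fmt.Errorf("bad include pattern %q: %w", pat, err)
			}
			if ok {
				kept = append(kept, p)
				break // one match is enough; avoid duplicates
			}
		}
	}
	return kept, nil
}

func main() {
	candidates := []string{"results/a.txt", "data/x/y.bam", "notes.md"}
	kept, err := filterPaths(candidates, []string{"*.bam", "data/**"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(kept)
}
```

Filtering before download keeps the default path untouched: when no `-I` value is given, the candidate list passes through unchanged.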

## Scope
- `cmd/pull/main.go` CLI flags and pull selection pipeline
- pointer/path inventory layer (where path<->OID candidates are produced)
- docs: `docs/commands.md`, `docs/getting-started.md`, `docs/troubleshooting.md`
- tests for include filtering semantics

## Acceptance criteria
- [ ] `git drs pull -I "<pattern>"` works for a single pattern.
- [ ] Repeated `-I` flags are supported.
- [ ] Include matching is against repo-relative paths.
- [ ] Default `git drs pull` behavior unchanged when no `-I` is passed.
- [ ] Help text documents pattern syntax and examples.
- [ ] Unit/integration tests cover positive and negative matches.

## Testing matrix
- Single file exact path include.
- Wildcard include (`*.bam`, `data/**`).
- Multiple `-I` values.
- No matches (should no-op cleanly and return success unless policy says otherwise).
- Mixed matched/unmatched objects in same pull run.

## Notes
This closes a usability gap for users transitioning from `git lfs` CLI habits to `git drs` commands while keeping pull behavior explicit and predictable.

36 changes: 24 additions & 12 deletions cmd/addurl/main_test.go
@@ -19,7 +19,6 @@ import (
"github.com/calypr/git-drs/internal/config"
"github.com/calypr/git-drs/internal/drsobject"
"github.com/calypr/git-drs/internal/gitrepo"
"github.com/calypr/git-drs/internal/lfs"
"github.com/calypr/git-drs/internal/precommit_cache"
sycloud "github.com/calypr/syfon/client/cloud"
)
@@ -100,9 +99,9 @@ func TestRunAddURL_WritesPointerAndLFSObject(t *testing.T) {
t.Fatalf("service.Run error: %v", err)
}

oid, err := lfs.SyntheticOIDFromETag("abcd1234")
oid, err := placeholderOIDForUnknownSHA("abcd1234", "s3://bucket/path/to/file.bin")
if err != nil {
t.Fatalf("SyntheticOIDFromETag: %v", err)
t.Fatalf("placeholderOIDForUnknownSHA: %v", err)
}

pointerPath := filepath.Join(tempDir, "path/to/file.bin")
@@ -120,15 +119,8 @@ func TestRunAddURL_WritesPointerAndLFSObject(t *testing.T) {
}

lfsObject := filepath.Join(lfsRoot, "objects", oid[0:2], oid[2:4], oid)
if _, err := os.Stat(lfsObject); err != nil {
t.Fatalf("expected LFS object at %s: %v", lfsObject, err)
}
sentinel, err := os.ReadFile(lfsObject)
if err != nil {
t.Fatalf("read sentinel: %v", err)
}
if !lfs.IsAddURLSentinelBytes(sentinel) {
t.Fatalf("expected add-url sentinel payload, got: %q", string(sentinel))
if _, err := os.Stat(lfsObject); !os.IsNotExist(err) {
t.Fatalf("expected no local LFS object payload at %s, got err=%v", lfsObject, err)
}

drsObject, err := drsobject.ReadObject(common.DRS_OBJS_PATH, oid)
@@ -143,6 +135,26 @@ func TestRunAddURL_WritesPointerAndLFSObject(t *testing.T) {
}
}

func TestPlaceholderOIDForUnknownSHA(t *testing.T) {
oid1, err := placeholderOIDForUnknownSHA("etag-abc", "s3://bucket/key")
if err != nil {
t.Fatalf("placeholderOIDForUnknownSHA: %v", err)
}
oid2, err := placeholderOIDForUnknownSHA(`"etag-abc"`, "s3://bucket/key")
if err != nil {
t.Fatalf("placeholderOIDForUnknownSHA quoted: %v", err)
}
if oid1 != oid2 {
t.Fatalf("expected trimmed etag handling to be stable: %s vs %s", oid1, oid2)
}
if len(oid1) != 64 {
t.Fatalf("expected 64-char oid, got %q", oid1)
}
if _, err := placeholderOIDForUnknownSHA("", "s3://bucket/key"); err == nil {
t.Fatal("expected empty etag error")
}
}
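
The test above pins down three properties of `placeholderOIDForUnknownSHA`: quoted and unquoted ETags yield the same OID, the OID is 64 characters long, and an empty ETag is an error. The function body is not part of this diff, so the following is only a sketch of one way to satisfy those properties. `sketchPlaceholderOID` is a hypothetical name, and hashing the trimmed ETag together with the source URL using SHA-256 is an assumption, not the repository's actual derivation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// sketchPlaceholderOID derives a stable 64-character hex identifier from an
// ETag and a source URL. Hypothetical stand-in for illustration; the real
// placeholderOIDForUnknownSHA may derive its value differently.
func sketchPlaceholderOID(etag, sourceURL string) (string, error) {
	// S3-style ETags often arrive wrapped in double quotes; strip them so
	// quoted and unquoted forms map to the same identifier.
	trimmed := strings.Trim(strings.TrimSpace(etag), `"`)
	if trimmed == "" {
		return "", errors.New("empty etag")
	}
	// Assumption: mix in the source URL so identical ETags on different
	// objects do not collide. The newline is just a field separator.
	sum := sha256.Sum256([]byte(trimmed + "\n" + sourceURL))
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	oid, err := sketchPlaceholderOID(`"etag-abc"`, "s3://bucket/key")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(oid), oid)
}
```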

func TestParseAddURLInput_DoesNotRequireAWSFlags(t *testing.T) {
cmd := NewCommand()
in, err := parseAddURLInput(cmd, []string{"gs://bucket/path/to/file.bin"})