Add BDF Toolbox metadata (bdf.yaml + codemeta.json + CITATION.cff) #1403

Open
allaway wants to merge 4 commits into Sage-Bionetworks:develop from allaway:bdf-toolbox-metadata

allaway commented Jun 11, 2026 (Contributor)

Summary

Adds machine-readable project metadata to satisfy several ARPA-H BDF ENHANCE Scorecard checks. All three files are new and live at the repo root; no code, dependencies, or workflows are touched.

Files added

| File | Purpose | Scorecard check resolved |
| --- | --- | --- |
| bdf.yaml | SystemMetadata instance conforming to ARPA-H-BDF/bdfkb-schema (src/bdfkb_schema/schema/bdfkb_schema.yaml, root class SystemMetadata). | BDF metadata presence and TRL maturity (current_maturity: 8, ≥5) |
| codemeta.json | CodeMeta 2.0 / schema.org SoftwareSourceCode descriptor. | CodeMeta presence |
| CITATION.cff | Citation File Format 1.2.0 metadata. | Citation file presence |

Validation performed

  • bdf.yaml: linkml-validate -s bdfkb_schema.yaml -C SystemMetadata bdf.yaml → No issues found.
  • codemeta.json: parses as valid JSON.
  • CITATION.cff: parses as valid YAML.
  • All repo pre-commit hooks (check-json, check-yaml, trailing-whitespace, end-of-file-fixer, etc.) pass on the new files.
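
The JSON check in the list above can be reproduced locally with the standard library alone. The sketch below is only an assumption about what "parses as valid JSON" amounts to in practice. The helper name `check_codemeta` and the `EXPECTED_KEYS` tuple are illustrative; the keys come from the codemeta.json excerpt visible in this PR, not from the CodeMeta specification.

```python
import json

# Keys visible in the codemeta.json excerpt in this PR. Illustrative only,
# not a complete CodeMeta 2.0 requirement list.
EXPECTED_KEYS = ("@type", "name", "description", "version")


def check_codemeta(text: str) -> list[str]:
    """Return a list of problems found in a codemeta.json document."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    return [f"missing key: {key}" for key in EXPECTED_KEYS if key not in data]


# In-memory sample mirroring the fields shown in the review thread.
sample = json.dumps(
    {
        "@type": "SoftwareSourceCode",
        "name": "Synapse Python Client (synapseclient)",
        "description": "A Python client for Synapse.",
        "version": "4.13.0",
    }
)
print(check_codemeta(sample))  # []
```

An empty list means the document parsed and carries the expected keys; it says nothing about schema conformance, which would still need a dedicated validator.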

Real values for the client were filled in where known (version 4.13.0, repo URL, https://www.synapse.org, https://python-docs.synapse.org, PyPI URL, REST endpoint repo-prod.prod.sagebase.org/repo/v1, Apache-2.0 license confirmed against LICENSE).

⚠️ TODO fields for maintainers to confirm

These are placeholders (some are dummy-but-valid values needed to pass schema validation) and must be reviewed before this is considered authoritative:

  • bdf.yaml → maturity.current_maturity: 8 — confirm official BDF TRL with Sage program lead (must be ≥5 to pass the scorecard).
  • bdf.yaml → funding.source, funding.agreement (award number), funding.link — currently TODO placeholders (the link is a dummy valid URL so the file validates).
  • bdf.yaml → credit[0].email (platform@sagebase.org) — confirm preferred BDF contact email.
  • CITATION.cff → individual authors (currently only the "Sage Bionetworks" org entity) — see inline # TODO.

Out of scope (intentionally not done)

  • GitHub repo topics — cannot be set via a PR.
  • .github/workflows secrets handling — that scorecard finding is a false positive; left untouched.

🤖 Generated with Claude Code

Adds machine-readable project metadata to satisfy ARPA-H BDF ENHANCE
Scorecard checks:
- bdf.yaml: SystemMetadata instance for the ARPA-H-BDF/bdfkb-schema
  (validated with linkml-validate). Resolves BDF metadata + TRL maturity.
- codemeta.json: CodeMeta/schema.org SoftwareSourceCode descriptor.
- CITATION.cff: Citation File Format 1.2.0 metadata.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
allaway requested a review from a team as a code owner June 11, 2026 15:17

allaway commented Jun 11, 2026 (Contributor, Author)

Tagging @milen-sage for viz

allaway commented Jun 11, 2026 (Contributor, Author)

@milen-sage, can you please comment on these aspects of the PR:

bdf.yaml → maturity.current_maturity: 8 — confirm official BDF TRL with Sage program lead (must be ≥5 to pass the scorecard).
bdf.yaml → funding.source, funding.agreement (award number), funding.link — currently TODO placeholders (the link is a dummy valid URL so the file validates).
bdf.yaml → credit[0].email (platform@sagebase.org) — confirm preferred BDF contact email.
CITATION.cff → individual authors (currently only the "Sage Bionetworks" org entity) — see inline # TODO.

Comment thread CITATION.cff
title: "Synapse Python Client (synapseclient)"
type: software
authors:
# TODO: add individual authors (names / ORCIDs) in addition to the organization below.

Member

Which individual authors should go in this section? Really, it would be all of @Sage-Bionetworks/dpe.

Comment thread codemeta.json
"@type": "SoftwareSourceCode",
"name": "Synapse Python Client (synapseclient)",
"description": "A Python client for Sage Bionetworks' Synapse, a collaborative, open-source research platform that allows teams to share data, track analyses, and collaborate. The client can be used as a library for software that communicates with Synapse or as a command-line utility.",
"version": "4.13.0",

Member

This is going to be another annoying spot to maintain a version that can easily diverge from the actual deployed version. Is there a way to dynamically bring this version in from https://github.com/Sage-Bionetworks/synapsePythonClient/blob/develop/synapseclient/synapsePythonClient so we only maintain it in a single place?

Comment thread codemeta.json Outdated
Comment thread bdf.yaml
# (src/bdfkb_schema/schema/bdfkb_schema.yaml, tree_root: SystemMetadata).
# Validate with:
# linkml-validate -s <path>/bdfkb_schema.yaml -C SystemMetadata bdf.yaml
version: "4.13.0"

Member

Comment thread bdf.yaml
credit:
  - name: "Sage Bionetworks"
    email:
      - "platform@sagebase.org" # TODO: confirm preferred BDF contact email

Contributor

Should this be dpe@sagebase.org?

Comment thread bdf.yaml
funding:
  source: "TODO: confirm funding source"
  agreement: "TODO: confirm agreement / award number"
  link: "https://example.org/TODO-confirm-funding-link"

Contributor

Do these need to be fixed?

allaway and others added 2 commits June 11, 2026 11:32
Addresses review feedback that the version in bdf.yaml and codemeta.json
(and CITATION.cff) duplicates the canonical client version and can drift.

Adds:
- .github/scripts/sync_version_metadata.py: rewrites the version field in
  bdf.yaml, codemeta.json, and CITATION.cff to match latestVersion in
  synapseclient/synapsePythonClient (the single source of truth). Supports
  --write and --check.
- .github/workflows/sync-version-metadata.yml: on PRs that touch the version
  file or metadata, runs the sync and commits the fix back to the PR branch
  (same-repo), or fails with guidance for fork PRs so drift can't merge.

Now the version is maintained in one place; releases that bump
synapseclient/synapsePythonClient propagate automatically.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
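
The rewrite step that this commit message describes can be sketched as follows. This is not the script from the PR: the function names, the regular expressions, and the assumption that synapseclient/synapsePythonClient is a JSON object with a `latestVersion` key are all illustrative. The sketch works on in-memory strings and leaves file I/O and the `--write` / `--check` flags out.

```python
import json
import re

# Matches a top-level `version:` line in YAML (bdf.yaml, CITATION.cff).
# Anchoring at the start of the line leaves `cff-version:` untouched.
YAML_VERSION = re.compile(r'''^(version:[ \t]*)["']?[^"'\s#]+["']?''', re.MULTILINE)
# Matches the first `"version": "..."` pair in JSON (codemeta.json).
JSON_VERSION = re.compile(r'("version"\s*:\s*")[^"]*"')


def read_latest_version(version_file_text: str) -> str:
    """Assumes the version file is a JSON object with a latestVersion key."""
    return json.loads(version_file_text)["latestVersion"]


def sync_yaml(text: str, version: str) -> str:
    """Replace the first top-level version value, keeping trailing comments."""
    return YAML_VERSION.sub(lambda m: f'{m.group(1)}"{version}"', text, count=1)


def sync_json(text: str, version: str) -> str:
    """Replace the first "version" string value without reformatting the file."""
    return JSON_VERSION.sub(lambda m: f'{m.group(1)}{version}"', text, count=1)


# Hypothetical inputs standing in for the real files.
version = read_latest_version('{"client": "synapseclient", "latestVersion": "4.14.0"}')
print(sync_yaml('cff-version: 1.2.0\nversion: "4.13.0"\n', version))
print(sync_json('{\n  "version": "4.13.0"\n}', version))
```

Editing the text with a narrow pattern, rather than loading and re-serializing each file, keeps comments such as the inline `# TODO` markers and the existing key order intact. The trade-off is that the JSON pattern updates the first `"version"` key it finds, which is only safe while no nested object with its own version precedes it.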
Comment on lines +11 to +19
on:
  pull_request:
    paths:
      - "synapseclient/synapsePythonClient"
      - "bdf.yaml"
      - "codemeta.json"
      - "CITATION.cff"
      - ".github/scripts/sync_version_metadata.py"
      - ".github/workflows/sync-version-metadata.yml"

Member

This is good in theory, but it doesn't match the typical workflow that we follow for updating the version and going through the release process.

https://sagebionetworks.jira.com/wiki/spaces/SYNPY/pages/643498030/Synapse+Python+Client+Staging+Validation+Production

Some AI suggestions I was seeing:

  1. Add a local pre-commit hook (most promising, but it sits outside the typical CI/CD)
  2. Update the release process to modify the version in 3 places (not the worst, but prone to error)
  3. Update our release process to remove direct commit/pushing and follow a PR-style approach (I'd rather not take this, so that we can follow the same process)
  4. Update the build process in this area

         - name: update-version
           shell: bash
           run: |
             if [[ -n "$VERSION" ]]; then
               sed "s|\"latestVersion\":.*$|\"latestVersion\":\"$VERSION\",|g" synapseclient/synapsePythonClient > temp
               rm synapseclient/synapsePythonClient
               mv temp synapseclient/synapsePythonClient
             fi

     to also call this sync_version_metadata.py script. This would be good automation, but it would not affect source control, only the artifact released to PyPI.

@allaway I see in https://github.com/ARPA-H-BDF/bdfkb-schema/blob/6afe39430894512131b7f86f75ac165e8db65551/src/bdfkb_schema/schema/bdfkb_schema.yaml#L101-L104 that version is a required field. However, I am not certain whether the version in source control has to be correct, or just the version in the packaged artifact at https://pypi.org/project/synapseclient/ .

If source control has to be correct, then I think the most straightforward way is to just update our release process to do this manually in 3 spots.

If source control doesn't have to be correct then packaging the artifact with the correct version/files would be nice for maintainability.
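
If source control does have to be correct, option 1 (a local pre-commit hook) could be backed by a small drift check along these lines. This is a sketch under stated assumptions, not an existing hook: `find_drift` and `main` are hypothetical names, the file paths are the ones named in this PR, and synapseclient/synapsePythonClient is assumed to be a JSON object with a `latestVersion` key.

```python
import json
import re
import sys
from pathlib import Path

# Paths relative to the repository root, as named in this PR.
VERSION_FILE = "synapseclient/synapsePythonClient"
YAML_FILES = ("bdf.yaml", "CITATION.cff")
JSON_FILES = ("codemeta.json",)

# Captures the value of a top-level `version:` line; `cff-version:` does not
# match because the pattern is anchored at the start of the line.
YAML_VERSION = re.compile(r'''^version:[ \t]*["']?([^"'\s#]+)''', re.MULTILINE)


def find_drift(root: Path) -> list[str]:
    """Return the metadata files whose version differs from latestVersion."""
    expected = json.loads((root / VERSION_FILE).read_text())["latestVersion"]
    drifted = []
    for name in YAML_FILES:
        match = YAML_VERSION.search((root / name).read_text())
        if match is None or match.group(1) != expected:
            drifted.append(name)
    for name in JSON_FILES:
        if json.loads((root / name).read_text()).get("version") != expected:
            drifted.append(name)
    return drifted


def main(root: Path = Path(".")) -> int:
    """Report drifted files on stderr and return a process exit code."""
    drifted = find_drift(root)
    for name in drifted:
        print(f"{name}: version does not match {VERSION_FILE}", file=sys.stderr)
    return 1 if drifted else 0
```

A hook entry point would call `sys.exit(main())` from the repository root, so a commit that leaves any of the three files behind the canonical version fails locally. Because the check only reads files, the same function could also run in the release process before the version bump is pushed.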


3 participants