Notes from a first-time contributor going through the tutorial #480

@Jeebjean

Context

I just went through the full contributing tutorial (Steps 0–7) implementing the Local Neighborhood algorithm as part of my SROP internship with @agitter. The tutorial explains the concepts clearly. I'm opening this issue to share the spots where I had to guess or trial-and-error my way through, in case any of it is useful for future contributors. @agitter suggested I post these as an issue first so the team can decide what's worth acting on.

These are observations from one person on macOS, so a lot of the setup notes are macOS-specific and may not generalize.

Where I got stuck

Environment setup

The prerequisites section lists topics and links but assumes a working dev environment. On my machine I had to figure out:

  • Installing Docker on Mac. @ntalluri recommended OrbStack over Docker Desktop and it worked really well for me, so it might be worth mentioning as an option in the docs. One small gotcha with OrbStack: I had to set DOCKER_HOST so the Python docker library could find the socket.
  • Installing the right Miniconda installer for Apple Silicon
  • Running conda init zsh (zsh is the default shell on recent macOS)
  • Quoting python -m pip install -e '.[dev]' in zsh
  • Generating a GitHub PAT with both repo and workflow scopes
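For the OrbStack gotcha, a minimal sketch of setting DOCKER_HOST from Python before anything tries to connect to Docker. The socket path below is an assumption about where OrbStack puts its socket, and the helper names are made up for illustration; adjust both to match your machine.

```python
import os
from pathlib import Path
from typing import Optional

# Assumption: OrbStack exposes its Docker socket at this path; verify on your install.
ORBSTACK_SOCKET = Path.home() / ".orbstack" / "run" / "docker.sock"


def docker_host_for(socket_path: Path) -> str:
    """Build the unix:// URL form that DOCKER_HOST uses for a local socket."""
    return f"unix://{socket_path}"


def ensure_docker_host(socket_path: Path = ORBSTACK_SOCKET) -> Optional[str]:
    """Set DOCKER_HOST only if it is unset and the socket exists.

    Returns the value in effect afterwards, or None if nothing is set.
    """
    if "DOCKER_HOST" not in os.environ and socket_path.exists():
        os.environ["DOCKER_HOST"] = docker_host_for(socket_path)
    return os.environ.get("DOCKER_HOST")


if __name__ == "__main__":
    print("DOCKER_HOST =", ensure_docker_host())
```

Exporting the same value in the shell profile has the same effect; doing it in Python just keeps the workaround next to the code that needs it.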

A short pre-flight checklist might help (docker --version, docker run hello-world, docker login, conda --version).
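One way such a checklist could be automated is sketched below. It runs the non-interactive commands from the list above and reports which ones fail; docker login is left out because it prompts for credentials. The function names and the timeout value are illustrative choices, not anything from the tutorial.

```python
import shutil
import subprocess

# Commands from the checklist; each is an argument list, so no shell quoting is involved.
CHECKS = [
    ["docker", "--version"],
    ["docker", "run", "hello-world"],
    ["conda", "--version"],
]


def run_check(cmd, timeout=120):
    """Return (ok, detail); ok is False if the tool is missing, times out, or exits non-zero."""
    if shutil.which(cmd[0]) is None:
        return False, f"{cmd[0]} not found on PATH"
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, "timed out"
    lines = (result.stdout or result.stderr).strip().splitlines()
    first_line = lines[0] if lines else ""
    return result.returncode == 0, first_line


def preflight(checks=CHECKS):
    """Run every check and map the command string to its (ok, detail) result."""
    return {" ".join(cmd): run_check(cmd) for cmd in checks}


if __name__ == "__main__":
    for name, (ok, detail) in preflight().items():
        print(f"[{'ok' if ok else 'FAIL'}] {name}: {detail}")
```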

Step 1: running local_neighborhood_alg.py

The exact cp commands and the full python invocation took me a while to assemble from the repo. I'm not sure if showing them outright would defeat the learning purpose; maybe a hidden "if you get stuck" block would work.

Step 2: Dockerfile

PathLinker is referenced as the example, but its Dockerfile uses wget, apk add, envsubst, etc., which is more than Local Neighborhood needs. A simpler existing wrapper might be a better reference for a first contribution.

Step 3: wrapper functions

Two small things:

  1. The interactive Python block uses Dataset(dataset_dict) with a plain dict, but the current API expects a DatasetSchema object, so the example fails as written.
  2. A skeleton with the imports and the three PRM method stubs would have saved me time figuring out the structure. @agitter mentioned the idea of a stub template file, which would cover this.
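A rough sketch of what such a skeleton could look like. The method names, signatures, and input names are assumptions about the three stubs, not copied from the repo, and the class deliberately does not inherit from the real PRM base class so that the snippet runs on its own.

```python
from pathlib import Path


class LocalNeighborhood:
    # In the repo this class would subclass the PRM base class;
    # the inheritance is omitted so the sketch has no dependencies.

    # Assumed names of the input files the algorithm needs.
    required_inputs = ["network", "nodes"]

    @staticmethod
    def generate_inputs(data, filename_map):
        """Write the algorithm's input files from a dataset object (stub)."""
        raise NotImplementedError("generate_inputs is not implemented yet")

    @staticmethod
    def run(network=None, nodes=None, output_file=None):
        """Run the algorithm, for example inside its container (stub)."""
        raise NotImplementedError("run is not implemented yet")

    @staticmethod
    def parse_output(raw_pathway_file: Path, standardized_pathway_file: Path):
        """Convert raw algorithm output into the standardized pathway format (stub)."""
        raise NotImplementedError("parse_output is not implemented yet")
```

Starting from stubs that raise NotImplementedError makes it obvious which parts are still missing when the tests are first run.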

Step 4: registering the algorithm

The exact edits to spras/runner.py (import + dictionary entry) and the config/config.yaml entry were straightforward once I figured them out, but explicit examples would have been quicker. Side note: the Pydantic config schema wants include: directly under name: and runs:, not under params:, which tripped me up briefly.
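To make the include: placement concrete, here is the shape I understood the schema to expect, written as the Python dict the YAML entry would load into. This is my reading of the note above (include: as a sibling of name: and runs:), not the actual schema; the algorithm name, the run label, and the helper function are hypothetical.

```python
# Hypothetical config entries, shown as the dicts the YAML would load into.
entry_expected = {
    "name": "localneighborhood",  # assumed algorithm name
    "include": True,              # sibling of name and runs
    "runs": {"run1": {}},         # assumed run label, no parameters
}

entry_tripped_me_up = {
    "name": "localneighborhood",
    "params": {"include": True},  # include nested under params
}


def misplaced_include(entry):
    """Return True if include is nested under params instead of sitting at the entry's top level."""
    return "include" not in entry and "include" in entry.get("params", {})


if __name__ == "__main__":
    for entry in (entry_expected, entry_tripped_me_up):
        print(entry["name"], "misplaced include:", misplaced_include(entry))
```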

Step 5: tests

Instructions like "include a key-value pair in the algo_exp_file dictionary" are clear in intent, but finding the exact key name and indentation required some digging through the existing tests.

Step 7: pull request

The push at Step 5 failed for me with a non-obvious error. From what I observed on my own fork (github.com/Jeebjean/spras):

| Scenario | Result |
| --- | --- |
| HTTPS push, no PAT | Auth error |
| HTTPS push, PAT with repo scope only, normal commit | Works |
| HTTPS push, PAT with repo scope only, commit touches .github/workflows/ | Rejected (workflow scope required) |
| HTTPS push, PAT with repo + workflow scopes | Works |
| SSH push | Works |

PAT = Personal Access Token

Step 5 modifies a workflow file, which is why this shows up there. A one-line note in the tutorial might save HTTPS users some debugging.
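The observations above can be summarized as a small lookup. This only encodes what I saw on my fork, not GitHub's actual authorization logic, and the function names are made up for illustration. Combinations I did not try return "not observed".

```python
WORKFLOW_DIR = ".github/workflows/"


def touches_workflows(changed_paths):
    """Return True if any changed file path lies under .github/workflows/."""
    return any(path.startswith(WORKFLOW_DIR) for path in changed_paths)


def push_outcome(transport, scopes=None, changed_paths=()):
    """Predict the push result from the observed scenarios.

    transport: "ssh" or "https"
    scopes: set of PAT scopes, or None when no PAT is used
    changed_paths: repo-relative paths touched by the commits being pushed
    """
    if transport == "ssh":
        return "works"
    if transport != "https":
        raise ValueError(f"unknown transport: {transport!r}")
    if not scopes:
        return "auth error"
    if "repo" not in scopes:
        return "not observed"
    if touches_workflows(changed_paths) and "workflow" not in scopes:
        return "rejected (workflow scope required)"
    return "works"


if __name__ == "__main__":
    print(push_outcome("https", {"repo"}, [".github/workflows/test.yml"]))
```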

Open questions

A few things @agitter raised that I don't have a strong opinion on:

  1. How to handle OS-specific instructions (one default OS + alt docs, or parallel sections)
  2. Where to draw the line between explicit commands and exploration
  3. Whether a simpler existing wrapper should replace PathLinker as the Dockerfile reference
  4. Whether a stub template directory for new algorithms would be useful, and where it should live

cc @agitter
cc @ntalluri
