UIUCLibrary · alexdryden · Feb 26, 2026 · Feb 23, 2026 · Feb 23, 2026 · Feb 23, 2026
diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
@@ -1,49 +1,41 @@
-name: Test
+name: Tests
 
 on:
   push:
-    branches: [ main, develop ]
+    branches: [ main, develop, copilot/** ]
   pull_request:
     branches: [ main, develop ]
-  workflow_dispatch:
 
 jobs:
   test:
     runs-on: ubuntu-latest
-    strategy:
-      fail-fast: false
-      matrix:
-        python-version: ['3.9', '3.10', '3.11']
-
+    permissions:
+      contents: read
+
     steps:
-    - uses: actions/checkout@v3
+    - uses: actions/checkout@v4
 
-    - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v4
+    - name: Set up Python
+      uses: actions/setup-python@v5
       with:
-        python-version: ${{ matrix.python-version }}
+        python-version: '3.12'
+        cache: 'pip'
 
-    - name: Cache pip dependencies
-      uses: actions/cache@v3
+    - name: Setup Ruby
+      uses: ruby/setup-ruby@v1
       with:
-        path: ~/.cache/pip
-        key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
-        restore-keys: |
-          ${{ runner.os }}-pip-
+        ruby-version: '3.1'
+        bundler-cache: true
 
-    - name: Install dependencies
+    - name: Install Python dependencies
       run: |
-        python -m pip install --upgrade pip
         pip install -r requirements.txt
+        pip install pytest
 
-    - name: Run tests with pytest
+    - name: Install Ruby dependencies
       run: |
-        pytest --cov=arcflow --cov-report=xml --cov-report=term
+        bundle install
 
-    - name: Upload coverage to Codecov
-      uses: codecov/codecov-action@v3
-      with:
-        files: ./coverage.xml
-        flags: unittests
-        name: codecov-${{ matrix.python-version }}
-        fail_ci_if_error: false
+    - name: Run tests
+      run: |
+        pytest tests/unit -v
diff --git a/.gitignore b/.gitignore
@@ -60,4 +60,9 @@ MANIFEST
 .arcflow.yml
 
 # PID files
-*.pid
+*.pid
+
+# Ruby/Bundler files
+Gemfile.lock
+.bundle/
+vendor/bundle/
diff --git a/Gemfile b/Gemfile
@@ -0,0 +1,9 @@
+source 'https://rubygems.org'
+
+gem 'traject', '~> 3.0'
+gem 'traject_plus'
+
+# Optional: for testing
+group :test do
+  gem 'rspec', '~> 3.0'
+end
diff --git a/tests/README.md b/tests/README.md
@@ -1,233 +1,54 @@
-# ArcFlow Tests
+# ArcFlow Test Suite
 
-Comprehensive test suite for the arcflow repository to accelerate AI agent development workflows.
+This directory contains tests for the ArcFlow project.
 
-## Running Tests
-
-### Run All Tests
+## Test Structure
 
-```bash
-pytest
-```
+- `unit/` - Fast unit tests for individual components
+- `conftest.py` - Shared test fixtures and configuration
 
-### Run Specific Test Files
+## Running Tests
 
 ```bash
-pytest tests/test_file_operations.py
-pytest tests/test_ead_operations.py
-```
-
-### Run Tests by Marker
+# Run all tests
+pytest
 
-```bash
 # Run only unit tests
-pytest -m unit
-
-# Run only integration tests
-pytest -m integration
-
-# Skip slow tests
-pytest -m "not slow"
-```
-
-### Run with Coverage Report
-
-```bash
-# Terminal report
-pytest --cov=arcflow --cov-report=term-missing
-
-# HTML report (opens in browser)
-pytest --cov=arcflow --cov-report=html
-open htmlcov/index.html
-```
+pytest tests/unit
 
-### Run Specific Test Functions
+# Run with verbose output
+pytest -v
 
-```bash
-pytest tests/test_utilities.py::test_get_repo_id
-pytest tests/test_file_operations.py::TestSaveFile::test_save_file_success
-```
-
-## Test Organization
-
-### Infrastructure Files
-
-- **`conftest.py`** - Shared pytest fixtures (mock clients, temp directories, sample data)
-- **`README.md`** - This file
-- **`pytest.ini`** - Test configuration (in repository root)
-
-### Test Files
-
-1. **`test_file_operations.py`** - File I/O operations
-   - `save_file()` - Writing files with error handling
-   - `create_symlink()` - Creating symbolic links
-   - `get_ead_from_symlink()` - Extracting EAD IDs from symlinks
-
-2. **`test_subprocess_fixes.py`** - Subprocess and shell operations
-   - `glob.glob()` wildcard expansion in batch file processing
-
-3. **`test_ead_operations.py`** - EAD XML operations
-   - `get_ead_id_from_file()` - Extracting EAD IDs from XML
-   - Dots-to-dashes sanitization in EAD IDs
-
-4. **`test_batching.py`** - Batch processing logic
-   - Batch size calculations
-   - Edge cases (empty lists, single items, exact multiples)
-
-5. **`test_config_discovery.py`** - Configuration file discovery
-   - `find_traject_config()` - Multi-path search logic
-   - Priority order: arcuit_dir → bundle show → fallback
-
-6. **`test_xml_manipulation.py`** - XML content handling
-   - `xml_escape()` for plain text labels
-   - `get_creator_bioghist()` - Biographical note extraction
-   - Proper handling of structured XML vs plain text
-
-7. **`test_utilities.py`** - Simple helper functions
-   - `get_repo_id()` - Repository ID extraction
-   - Path construction utilities
-
-8. **`test_agent_filtering.py`** - **STUB ONLY**
-   - All tests intentionally skipped
-   - Documents need for refactoring before testing
-   - See file for details on complexity issues
-
-## Writing New Tests
-
-### Use Shared Fixtures
-
-```python
-def test_example(temp_dir, mock_asnake_client, sample_agent):
-    """Use fixtures from conftest.py."""
-    # temp_dir: Temporary directory for file operations
-    # mock_asnake_client: Mock ArchivesSpace client
-    # sample_agent: Sample agent data structure
-    pass
+# Run specific test file
+pytest tests/unit/test_traject_smoke.py
 ```
 
-### Mark Your Tests
-
-```python
-import pytest
-
-@pytest.mark.unit
-def test_simple_function():
-    """Unit test that doesn't need external dependencies."""
-    pass
-
-@pytest.mark.integration
-def test_with_mocked_api():
-    """Integration test with mocked external services."""
-    pass
-
-@pytest.mark.slow
-def test_long_running():
-    """Test that takes significant time."""
-    pass
-```
-
-### Test Structure
-
-Follow the Arrange-Act-Assert pattern:
-
-```python
-def test_example():
-    # Arrange: Set up test data
-    input_data = "test"
-
-    # Act: Execute the function under test
-    result = function_to_test(input_data)
-
-    # Assert: Verify the results
-    assert result == expected_value
-```
-
-### Mocking External Dependencies
-
-```python
-from unittest.mock import Mock, patch
-
-def test_with_mock():
-    # Mock ArchivesSpace API calls
-    with patch('arcflow.main.ASnakeClient') as mock_client:
-        mock_client.return_value.get.return_value.json.return_value = {}
-        # Test code here
-```
-
-## Test Coverage Goals
-
-- **Target**: 80%+ code coverage for new features
-- **Focus**: Test critical paths and edge cases
-- **Skip**: Complex filtering logic that needs refactoring (see `test_agent_filtering.py`)
-
-## Continuous Integration
-
-Tests run automatically on:
-- Push to `main` or `develop` branches
-- Pull requests
-- Python versions: 3.9, 3.10, 3.11
-
-See `.github/workflows/test.yml` for CI configuration.
-
-## Dependencies
-
-Testing requires:
-- `pytest>=7.0.0` - Test framework
-- `pytest-cov>=4.0.0` - Coverage reporting
-- `pytest-mock>=3.10.0` - Mocking utilities
-
-Install with:
-```bash
-pip install -r requirements.txt
-```
-
-## Troubleshooting
-
-### Tests Fail to Import arcflow
-
-Make sure you're running from the repository root:
-```bash
-cd /path/to/arcflow
-pytest
-```
-
-### Coverage Report Not Generated
-
-Ensure pytest-cov is installed:
-```bash
-pip install pytest-cov
-```
-
-### Mock Client Issues
-
-If tests fail with authentication errors, ensure you're using the `mock_asnake_client` fixture:
-```python
-def test_example(mock_asnake_client):
-    # Use mock_asnake_client instead of real client
-    pass
-```
+## Traject Smoke Tests
 
-## Contributing Tests
+Tests in `tests/unit/test_traject_smoke.py` verify traject configuration without requiring Solr.
 
-When adding new functionality:
+### What They Test
+- Ruby syntax validity of traject configs
+- Traject can load and parse configs
+- XML transformation logic (without indexing)
 
-1. Write tests first (TDD approach recommended)
-2. Use existing fixtures from `conftest.py`
-3. Add new fixtures if needed (keep them reusable)
-4. Mark tests appropriately (`@pytest.mark.unit`, etc.)
-5. Run tests locally before committing
-6. Ensure coverage doesn't decrease
+### Setup Requirements
+- Ruby 3.1+
+- Bundler
+- Run `bundle install` to install traject gem
 
-## Notes on Test Philosophy
+### Performance
+- First run: ~60 seconds (includes gem install)
+- Cached runs: ~2 seconds (gems cached)
+- Fast enough for CI/agent iteration
 
-- **Minimal mocking**: Only mock external dependencies (API calls, file system when appropriate)
-- **Real logic testing**: Test actual business logic, not mocks
-- **Edge cases matter**: Test boundary conditions, empty inputs, error paths
-- **Fast feedback**: Most tests should run in milliseconds
-- **Clear failures**: Test names and assertions should make failures obvious
+### Skipping
+These tests skip gracefully if traject config doesn't exist yet.
 
-## Known Limitations
+## Writing Tests
 
-- **Agent filtering**: Logic too complex to test effectively in current state (see `test_agent_filtering.py`)
-- **Subprocess tests**: May not work on non-Unix systems
-- **Traject integration**: Requires Ruby/bundler setup (mocked in tests)
+When adding new tests:
+- Use pytest fixtures from `conftest.py`
+- Keep unit tests fast (< 1 second each)
+- Add integration tests to appropriate subdirectories
+- Use `pytest.skip()` for tests that require optional dependencies