feat(repository): add search and filtering functionality #79

@F88

Summary

Add search and filtering capabilities to ProtopediaInMemoryRepository to enable efficient data querying without requiring users to implement filtering logic themselves.

Motivation

Currently, users must call getAllFromSnapshot() and implement their own filtering logic. This leads to:

  • Code duplication across applications
  • Inconsistent filtering behavior
  • Performance issues when filtering large datasets repeatedly
  • Difficulty implementing complex queries

Proposed Features

Search Functionality

  • searchByKeyword(query) - Search across name, summary, freeComment fields
  • Full-text search capability
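As a starting point for discussion, searchByKeyword could be as small as the sketch below. It assumes a simplified record shape (`PrototypeLike` is hypothetical, not the real NormalizedPrototype type), takes the items as an argument instead of reading repository state, and does plain case-insensitive substring matching only. Returning an empty array for an empty query is one possible choice, not a settled one.

```typescript
// Hypothetical, simplified record shape; the real NormalizedPrototype may differ.
interface PrototypeLike {
  id: number;
  name: string;
  summary: string;
  freeComment: string;
}

// Case-insensitive substring search over the three text fields named above.
function searchByKeyword(
  items: readonly PrototypeLike[],
  query: string,
): PrototypeLike[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return []; // assumption: an empty query matches nothing
  return items.filter((item) =>
    [item.name, item.summary, item.freeComment].some(
      (field) => field.toLowerCase().indexOf(needle) !== -1,
    ),
  );
}

const sample: PrototypeLike[] = [
  { id: 1, name: "LED Cube", summary: "A 3D display", freeComment: "" },
  { id: 2, name: "Robot Arm", summary: "Servo driven", freeComment: "Uses an led indicator" },
  { id: 3, name: "Weather Station", summary: "Temperature logger", freeComment: "" },
];

// Matches id 1 (name) and id 2 (freeComment).
const hits = searchByKeyword(sample, "LED");
```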

Filtering Functionality

  • filterByTags(tags, mode?) - Filter by tags (AND/OR mode)
  • filterByMaterials(materials, mode?) - Filter by materials
  • filterByUsers(users) - Filter by user names
  • filterByEvents(events) - Filter by events
  • filterByDateRange(start, end, field?) - Filter by createDate/updateDate/releaseDate
  • filterByStatus(status) - Filter by status value
  • filterByViewCountRange(min, max) - Filter by view count
  • filterByGoodCountRange(min, max) - Filter by good count
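To make the AND/OR mode and the optional date field concrete, here is a rough sketch of two of the filters listed above. The record shape, the `MatchMode` and `DateField` types, the default values, and the assumption that dates are stored as ISO 8601 strings are all illustrative guesses, not the existing PROMIDAS types. Tag comparison here is exact; matching quality is a separate question discussed under Design Considerations.

```typescript
// Hypothetical record shape for illustration; the real NormalizedPrototype may differ.
interface PrototypeLike {
  id: number;
  tags: string[];
  createDate: string; // assumed to be an ISO 8601 timestamp
  updateDate: string;
}

type MatchMode = "AND" | "OR";
type DateField = "createDate" | "updateDate";

// AND: the item must carry every requested tag. OR: at least one of them.
function filterByTags(
  items: readonly PrototypeLike[],
  tags: readonly string[],
  mode: MatchMode = "OR",
): PrototypeLike[] {
  if (tags.length === 0) return items.slice();
  return items.filter((item) => {
    const has = (tag: string): boolean => item.tags.indexOf(tag) !== -1;
    return mode === "AND" ? tags.every(has) : tags.some(has);
  });
}

// Inclusive range check on the chosen date field; unparsable dates are excluded.
function filterByDateRange(
  items: readonly PrototypeLike[],
  start: Date,
  end: Date,
  field: DateField = "createDate",
): PrototypeLike[] {
  const lo = start.getTime();
  const hi = end.getTime();
  return items.filter((item) => {
    const t = Date.parse(item[field]);
    return !isNaN(t) && t >= lo && t <= hi;
  });
}

const records: PrototypeLike[] = [
  { id: 1, tags: ["IoT", "Arduino"], createDate: "2024-01-10T00:00:00Z", updateDate: "2024-06-01T00:00:00Z" },
  { id: 2, tags: ["IoT"], createDate: "2024-03-05T00:00:00Z", updateDate: "2024-03-06T00:00:00Z" },
  { id: 3, tags: ["AI"], createDate: "2023-12-31T00:00:00Z", updateDate: "2024-02-01T00:00:00Z" },
];

const rangeStart = new Date("2024-01-01T00:00:00Z");
const rangeEnd = new Date("2024-02-29T23:59:59Z");

const anyTag = filterByTags(records, ["IoT", "AI"]);              // ids 1, 2, 3
const allTags = filterByTags(records, ["IoT", "Arduino"], "AND"); // id 1
const createdEarly = filterByDateRange(records, rangeStart, rangeEnd);               // id 1
const updatedEarly = filterByDateRange(records, rangeStart, rangeEnd, "updateDate"); // id 3
```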

Chaining Support

  • Enable method chaining for complex queries
  • Return filtered results as an array for further processing
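One possible shape for chaining is a small immutable query object whose filter methods each return a new query and whose terminal method returns a plain array. The `PrototypeQuery` class, its method set, and the numeric `status` field below are assumptions made for illustration only.

```typescript
// Hypothetical record shape; field types (e.g. numeric status) are assumptions.
interface PrototypeLike {
  id: number;
  tags: string[];
  viewCount: number;
  status: number;
}

class PrototypeQuery {
  private readonly items: readonly PrototypeLike[];

  constructor(items: readonly PrototypeLike[]) {
    this.items = items;
  }

  // Each step returns a new query, so the source array is never mutated.
  private where(predicate: (item: PrototypeLike) => boolean): PrototypeQuery {
    return new PrototypeQuery(this.items.filter(predicate));
  }

  filterByTags(tags: readonly string[]): PrototypeQuery {
    return this.where((item) => tags.some((tag) => item.tags.indexOf(tag) !== -1));
  }

  filterByStatus(wanted: number): PrototypeQuery {
    return this.where((item) => item.status === wanted);
  }

  filterByViewCountRange(min: number, max: number): PrototypeQuery {
    return this.where((item) => item.viewCount >= min && item.viewCount <= max);
  }

  // Terminal step: hand back a plain array for further processing.
  toArray(): PrototypeLike[] {
    return this.items.slice();
  }
}

const snapshot: PrototypeLike[] = [
  { id: 1, tags: ["IoT"], viewCount: 120, status: 1 },
  { id: 2, tags: ["IoT"], viewCount: 15, status: 1 },
  { id: 3, tags: ["AI"], viewCount: 300, status: 1 },
  { id: 4, tags: ["IoT"], viewCount: 500, status: 2 },
];

// Only id 1 is tagged IoT, has status 1, and has 100 to 1000 views.
const popularIot = new PrototypeQuery(snapshot)
  .filterByTags(["IoT"])
  .filterByStatus(1)
  .filterByViewCountRange(100, 1000)
  .toArray();
```

Because every step copies via `filter`, a long chain walks the data several times; a lazy variant that composes predicates and filters once in `toArray()` would avoid that at the cost of a little more code.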

Design Considerations

This feature requires careful consideration of multiple aspects:

1. String Matching Complexity

Issue: ProtoPedia places few constraints on data entry, so field values can contain:

  • Mixed case (uppercase, lowercase)
  • Japanese characters (Hiragana, Katakana, Kanji)
  • Half-width and full-width Katakana
  • Spaces and special characters
  • Inconsistent formatting

Impact: Simple exact string matching would provide poor user experience and limited utility.

Options to consider:

  • Exact match only (simple but limited)
  • Case-insensitive matching
  • Normalize Unicode characters (NFKC normalization)
  • Convert Katakana variants (half-width ↔ full-width)
  • Fuzzy matching / edit distance
  • Japanese-specific normalization, Hiragana ↔ Katakana (ひらがな ↔ カタカナ)
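Several of these options can be combined using only built-in string methods. The sketch below applies NFKC normalization (which folds full-width Latin letters and half-width Katakana), lowercases, trims, and then maps Katakana to Hiragana by code point offset. The function names are hypothetical, and the Katakana mapping covers only U+30A1 to U+30F6; it leaves the prolonged sound mark ー untouched.

```typescript
// Shift Katakana (U+30A1..U+30F6) down to the corresponding Hiragana code points.
function katakanaToHiragana(text: string): string {
  return text.replace(/[\u30A1-\u30F6]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x60),
  );
}

// Build a comparison key: NFKC, lowercase, trim, then fold Katakana to Hiragana.
function normalizeForMatch(text: string): string {
  return katakanaToHiragana(text.normalize("NFKC").toLowerCase().trim());
}

// Two values match when their comparison keys are identical.
function looselyEquals(a: string, b: string): boolean {
  return normalizeForMatch(a) === normalizeForMatch(b);
}

const latinKey = normalizeForMatch("Ａｒｄｕｉｎｏ"); // full-width Latin becomes "arduino"
const kanaKey = normalizeForMatch("ｱｲﾃﾞｱ");          // half-width Katakana becomes "あいであ"
const sameWord = looselyEquals("ラズベリーパイ", "らずべりーぱい"); // true
```

This handles the mixed-case and width variants listed above but not fuzzy matching or Kanji readings, which would need either an edit-distance routine or an external library.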

2. Performance vs. Flexibility Trade-offs

Options:

  • Pre-build search indexes (faster, more memory)
  • On-demand filtering (slower, less memory)
  • Hybrid approach with configurable indexing
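The first two options can be compared with a small sketch: an on-demand scan that walks every item per query, and a pre-built tag index that costs extra memory but turns each lookup into a single Map access. The record shape and function names are hypothetical, and the index would need rebuilding whenever the snapshot is refreshed.

```typescript
// Hypothetical minimal record shape for illustration.
interface PrototypeLike {
  id: number;
  tags: string[];
}

// On-demand: O(n) per query, no additional memory.
function scanByTag(items: readonly PrototypeLike[], tag: string): number[] {
  return items
    .filter((item) => item.tags.indexOf(tag) !== -1)
    .map((item) => item.id);
}

// Pre-built: O(n) once, then a single Map lookup per query.
function buildTagIndex(items: readonly PrototypeLike[]): Map<string, number[]> {
  const index = new Map<string, number[]>();
  items.forEach((item) => {
    item.tags.forEach((tag) => {
      const ids = index.get(tag);
      if (ids === undefined) {
        index.set(tag, [item.id]);
      } else if (ids[ids.length - 1] !== item.id) {
        ids.push(item.id); // skip duplicate tags on the same item
      }
    });
  });
  return index;
}

function lookupByTag(index: Map<string, number[]>, tag: string): number[] {
  return index.get(tag) || [];
}

const dataset: PrototypeLike[] = [
  { id: 1, tags: ["IoT", "Arduino"] },
  { id: 2, tags: ["IoT", "IoT"] },
  { id: 3, tags: ["AI"] },
];

const tagIndex = buildTagIndex(dataset);
const viaScan = scanByTag(dataset, "IoT");    // [1, 2]
const viaIndex = lookupByTag(tagIndex, "IoT"); // [1, 2]
```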

3. API Design

Questions:

  • Should filters mutate repository state or return new arrays?
  • Should we support SQL-like query builders?
  • Should we provide both simple and advanced APIs?
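One middle ground between single-purpose methods and a SQL-like builder is a plain criteria object passed to one function that always returns a new array. The sketch below is only meant to make that option tangible; `FilterCriteria`, `findPrototypes`, and the record fields are hypothetical names, not an existing PROMIDAS API.

```typescript
// Hypothetical record shape; field types are assumptions.
interface PrototypeLike {
  id: number;
  tags: string[];
  status: number;
  viewCount: number;
}

// Every field is optional; omitted fields place no constraint on the result.
interface FilterCriteria {
  tags?: string[];
  status?: number;
  minViewCount?: number;
}

// Returns a new array and never touches the input.
function findPrototypes(
  items: readonly PrototypeLike[],
  criteria: FilterCriteria,
): PrototypeLike[] {
  const wantedTags = criteria.tags;
  const wantedStatus = criteria.status;
  const minViews = criteria.minViewCount;
  return items.filter((item) => {
    if (wantedTags !== undefined && !wantedTags.some((t) => item.tags.indexOf(t) !== -1)) {
      return false;
    }
    if (wantedStatus !== undefined && item.status !== wantedStatus) return false;
    if (minViews !== undefined && item.viewCount < minViews) return false;
    return true;
  });
}

// Freezing the source makes accidental mutation fail loudly in strict mode.
const frozenSnapshot: readonly PrototypeLike[] = Object.freeze([
  { id: 1, tags: ["IoT"], status: 1, viewCount: 120 },
  { id: 2, tags: ["AI"], status: 1, viewCount: 40 },
  { id: 3, tags: ["IoT"], status: 2, viewCount: 900 },
]);

const activeIot = findPrototypes(frozenSnapshot, { tags: ["IoT"], status: 1 }); // id 1
const everything = findPrototypes(frozenSnapshot, {});                          // ids 1, 2, 3
```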

4. Localization and I18N

Considerations:

  • Japanese text normalization requirements
  • Unicode normalization strategies (NFC, NFD, NFKC, NFKD)
  • Collation rules for sorting
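The practical difference between the normalization forms shows up in a few lines of standard JavaScript. The sketch below uses only built-in `String.prototype.normalize` and `Intl.Collator`: NFC and NFD reconcile composed and decomposed kana, only the compatibility forms (NFKC, NFKD) fold half-width Katakana, and a locale-aware collator is one way to sort Japanese text. The choice of the "ja" locale is an assumption.

```typescript
const composed = "\u304C";         // が as a single code point
const decomposed = "\u304B\u3099"; // か followed by a combining voiced sound mark

// The two spellings look identical but are different strings until normalized.
const rawEqual = composed === decomposed;                                   // false
const nfcEqual = composed.normalize("NFC") === decomposed.normalize("NFC"); // true
const nfdEqual = composed.normalize("NFD") === decomposed;                  // true

// Canonical forms keep width variants; compatibility forms fold them.
const halfWidthKa = "\uFF76"; // half-width ｶ
const keptByNfc = halfWidthKa.normalize("NFC") === halfWidthKa; // true
const foldedByNfkc = halfWidthKa.normalize("NFKC") === "\u30AB"; // true, full-width カ

// Locale-aware ordering instead of raw code unit comparison.
const collator = new Intl.Collator("ja");
const sortedKana = ["う", "あ", "い"].sort(collator.compare);
```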

5. Dependencies

Question: Should we introduce additional dependencies?

  • String normalization libraries
  • Japanese text processing libraries (e.g., kuroshiro, wanakana)
  • Search libraries (e.g., fuse.js for fuzzy search)

6. Scope Definition

Critical decision: What level of sophistication should v1 provide?

  • Minimal: Exact match only (quick to implement, limited value)
  • Moderate: Case-insensitive + Unicode normalization
  • Advanced: Full fuzzy search with Japanese text handling

7. Library Separation

Note: PROMIDAS provides core functionality as a library. Features that don't belong in the core should be implemented in promidas-utils.

Consider for promidas-utils:

  • Advanced fuzzy matching algorithms
  • Japanese-specific text normalization utilities
  • Complex query builders
  • Domain-specific filtering helpers

Keep in PROMIDAS core:

  • Basic filtering by exact match
  • Simple case-insensitive search
  • Core data access patterns

Related Code

  • lib/repository/protopedia-in-memory-repository.ts
  • lib/types/normalized-prototype.ts
  • lib/repository/types/repository.types.ts

Related Projects

  • promidas-utils - Utility library for features that should not be in PROMIDAS core

Open Questions

  1. What is the minimum acceptable matching quality for tags/materials?
  2. Should we prioritize implementation speed or feature completeness?
  3. Are there specific use cases we should optimize for?
  4. What performance targets should we aim for (e.g., filter 10,000 items in <100ms)?
  5. Should filtering be case-sensitive by default?
  6. Which features belong in PROMIDAS core vs promidas-utils?

Suggested Approach

  1. Define a clear boundary between PROMIDAS core and promidas-utils
  2. Start with basic implementation in core (case-insensitive, NFKC normalization)
  3. Implement advanced features in promidas-utils
  4. Gather user feedback on matching quality
  5. Iterate based on real-world usage patterns

Acceptance Criteria

  • Define and document matching strategy for v1
  • Decide which features belong in core vs promidas-utils
  • Implement core filtering methods in PROMIDAS
  • Add comprehensive tests covering edge cases (mixed case, Japanese text, special characters)
  • Run performance benchmarks with 10,000+ items
  • Write documentation with examples
  • Consider backward compatibility if the API changes
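A benchmark for the 10,000-item criterion could start from something like the following sketch. The synthetic data generator, the `timeFilter` helper, and the choice of a simple tag filter as the workload are all assumptions; `Date.now()` has only millisecond resolution, so the filter is run many times and averaged.

```typescript
// Hypothetical minimal record shape for the benchmark.
interface PrototypeLike {
  id: number;
  tags: string[];
}

// Generate synthetic items; every tenth item shares the same tag.
function makeItems(count: number): PrototypeLike[] {
  const out: PrototypeLike[] = [];
  for (let i = 0; i < count; i++) {
    out.push({ id: i, tags: ["tag" + (i % 10)] });
  }
  return out;
}

// Run the filter repeatedly and report the match count and average duration.
function timeFilter(
  items: readonly PrototypeLike[],
  tag: string,
  runs: number,
): { matched: number; avgMs: number } {
  let matched = 0;
  const startedAt = Date.now();
  for (let r = 0; r < runs; r++) {
    matched = items.filter((item) => item.tags.indexOf(tag) !== -1).length;
  }
  return { matched, avgMs: (Date.now() - startedAt) / runs };
}

const benchItems = makeItems(10000);
const benchResult = timeFilter(benchItems, "tag3", 50);
// benchResult.matched is 1000; benchResult.avgMs depends on the machine.
```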

Notes

This issue intentionally leaves many decisions open for discussion. Please provide feedback on priority and scope before implementation begins.

Metadata

  • Assignees: No one assigned
  • Labels: enhancement (New feature or request)
  • Status: Todo
  • Milestone: No milestone
  • Relationships: None yet
  • Development: No branches or pull requests