Skip to content

Add event indexing for faster query performance #10

@melvincarvalho

Description

@melvincarvalho

Problem

The relay currently uses O(n) linear search through all events for every filter query. This becomes slow as the event count approaches the MAX_EVENTS limit (1000+), especially on mobile devices.

Current Performance Issues:

  • REQ queries: Must scan all events for each filter
  • Tag filtering: Iterates through all events to find matches
  • Author filtering: Linear search through entire event array
  • Kind filtering: No optimization for common kinds (0, 1, 3)

Mobile Impact:

  • CPU usage increases linearly with event count
  • Battery drain from repeated searches
  • Slower response times as relay stores more events
  • Poor UX for clients querying the relay

Proposed Indexing Solutions

Option 1: Simple in-memory indexes

class EventIndex {
  constructor() {
    this.byAuthor = new Map()    // pubkey -> [events]
    this.byKind = new Map()      // kind -> [events] 
    this.byId = new Map()        // id -> event
    this.byTag = new Map()       // tag -> [events]
  }
  
  addEvent(event) {
    // Index by author
    if (!this.byAuthor.has(event.pubkey)) {
      this.byAuthor.set(event.pubkey, [])
    }
    this.byAuthor.get(event.pubkey).push(event)
    
    // Index by kind
    if (!this.byKind.has(event.kind)) {
      this.byKind.set(event.kind, [])
    }
    this.byKind.get(event.kind).push(event)
    
    // Index by tags
    event.tags?.forEach(([tagType, value]) => {
      const key = \`#\${tagType}:\${value}\`
      if (!this.byTag.has(key)) {
        this.byTag.set(key, [])
      }
      this.byTag.get(key).push(event)
    })
  }
}

Option 2: Time-based indexing

class TimeIndex {
  constructor() {
    this.sortedEvents = []  // Keep events sorted by created_at
  }
  
  findByTimeRange(since, until) {
    // Binary search for time range
    const startIdx = this.binarySearchGTE(since || 0)
    const endIdx = this.binarySearchLTE(until || Infinity)
    return this.sortedEvents.slice(startIdx, endIdx + 1)
  }
}

Option 3: Hybrid approach

  • Fast paths: Direct index lookups for common queries
  • Fallback: Linear search for complex filters
  • Memory efficient: Only index frequently queried fields

Implementation Strategy

Phase 1: Basic indexes

  1. Index by `pubkey` (authors)
  2. Index by `kind`
  3. Index by `id`
  4. Maintain sorted order by `created_at`

Phase 2: Tag indexing

  1. Index common tags (`#e`, `#p`)
  2. Generic tag indexing for other tag types
  3. Composite queries using multiple indexes

Phase 3: Query optimization

  1. Analyze filter patterns to choose optimal index
  2. Combine multiple indexes for complex filters
  3. Add query result caching for repeated requests

Performance Expectations

Current (O(n)):

  • 1000 events: ~1ms per query
  • Filter query time grows linearly

With indexing (O(log n) to O(1)):

  • Author queries: ~0.01ms (direct Map lookup)
  • Kind queries: ~0.01ms (direct Map lookup)
  • Time range: ~0.1ms (binary search)
  • Tag queries: ~0.01ms (direct Map lookup)

Memory overhead:

  • Estimated ~2x memory usage for indexes
  • Still within mobile device constraints
  • Much better than current O(n) CPU usage

Implementation Considerations

Memory management:

  • Indexes must be updated when events are evicted (FIFO)
  • Consider weak references for large datasets
  • Lazy index building for less common queries

Backwards compatibility:

  • Keep existing linear search as fallback
  • No changes to external API
  • Environment flag for enabling/disabling indexes

Configuration

const USE_INDEXING = process.env.USE_INDEXING !== 'false' // Default: enabled

Benefits

  • 10-100x faster queries for indexed fields
  • Better mobile performance: Less CPU usage, better battery life
  • Scalability: Handles MAX_EVENTS limit efficiently
  • User experience: Faster response times for clients

Testing

  1. Benchmark with 1000 events (current MAX_EVENTS)
  2. Test memory usage on mobile devices
  3. Validate index accuracy against linear search
  4. Performance test with various filter combinations

This optimization would significantly improve the relay's performance profile on mobile devices.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions