LumoSearch

Fast, browser-first search engine for local datasets. Fuzzy matching, BM25F ranking, and candidate pruning — no server required.

LumoSearch replaces Fuse-style client-side search with a proper retrieval pipeline: inverted indexes narrow candidates fast, trigrams recover typo-tolerant matches, and only bounded candidates are rescored. The result is accurate, weighted multi-field search that stays fast as your dataset grows.

Features

  • BM25F ranking — field-weighted scoring with token rarity and length normalization
  • Fuzzy matching — trigram-based typo tolerance (handles javscrippt -> javascript)
  • Candidate pruning — token, trigram, and prefix indexes narrow candidates before scoring
  • Multi-field search — configure per-field weights and search across nested keys
  • Highlighting — character-level match ranges for each result field
  • Autocomplete — prefix-aware suggestions with the same scoring pipeline
  • Synonyms — token-level alias expansion (js -> javascript)
  • Filters & predicates — exact-match filters and custom predicate functions
  • Incremental mutations — add, remove, and replace documents without rebuilding
  • Persistence — snapshot export/import with InMemoryStorage and IndexedDbStorage adapters
  • Worker support — off-main-thread search via LumoSearchWorker
  • Hybrid reranking — async searchAsync() with pluggable reranker for semantic/custom scoring
  • Zero dependencies — ships as pure ES modules
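
To make the fuzzy-matching feature more concrete, here is a minimal sketch of trigram similarity in general. It is not LumoSearch's actual implementation: the helper names `trigrams` and `trigramSimilarity`, the padding scheme, and the choice of the Dice coefficient are all assumptions made for illustration.

```javascript
// Illustrative sketch only; not LumoSearch's internal code.
// Split a term into overlapping three-character grams. The term is padded
// with spaces so its first and last characters also contribute grams.
function trigrams(term) {
  const padded = `  ${term.toLowerCase()} `;
  const grams = new Set();
  for (let i = 0; i + 3 <= padded.length; i++) {
    grams.add(padded.slice(i, i + 3));
  }
  return grams;
}

// Dice coefficient over the two trigram sets:
// 1 means identical sets, 0 means no grams in common.
function trigramSimilarity(a, b) {
  const ga = trigrams(a);
  const gb = trigrams(b);
  let shared = 0;
  for (const gram of ga) {
    if (gb.has(gram)) shared++;
  }
  return (2 * shared) / (ga.size + gb.size);
}

// A misspelled query still shares most of its grams with the intended term,
// so it scores higher against 'javascript' than against 'typescript'.
const toJavaScript = trigramSimilarity('javscrippt', 'javascript');
const toTypeScript = trigramSimilarity('javscrippt', 'typescript');
console.log(toJavaScript > toTypeScript);
```

Because a typo only disturbs the few grams that overlap it, the rest of the term still matches, which is why a trigram index can recover candidates that an exact token lookup would miss.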

Install

npm install @lumosearch/search

Quick Start

import { LumoSearch } from '@lumosearch/search'

const docs = [
  { title: 'JavaScript Patterns', body: 'Reusable design patterns for JavaScript.', category: 'books' },
  { title: 'TypeScript Handbook', body: 'Core TypeScript syntax and type system.', category: 'docs' },
  { title: 'Node.js in Action', body: 'Server-side JavaScript with Express.', category: 'books' }
]

const search = new LumoSearch(docs, {
  keys: [{ name: 'title', weight: 3 }, { name: 'body', weight: 1 }],
  candidateLimit: 250,
  limit: 10
})

const results = search.search('javscrippt paterns')
// => [{ item: { title: 'JavaScript Patterns', ... }, score: 0.94, highlights: [...] }]

API

Constructor

new LumoSearch(docs, {
  keys: ['title', { name: 'body', weight: 0.8 }],
  limit: 10,
  candidateLimit: 250,
  synonyms: { js: ['javascript'], auth: ['authentication'] }
})

Search

// Basic search
const results = search.search('patterns')

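// With options: cap the result count, require an exact category match,
// and apply a custom predicate to each document
const filtered = search.search('patterns', {
  limit: 5,
  filters: { category: 'books' },
  predicate: (doc) => doc.title.length > 10
})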

Autocomplete

const suggestions = search.autocomplete('jav', { limit: 5 })

Async Hybrid Reranking

const results = await search.searchAsync('ui architecture', {
  limit: 5,
  rerankLimit: 10,
  reranker: {
    async rerank({ query, candidates }) {
      // plug in your own semantic/ML reranker here
      return candidates.map((c) => ({ refIndex: c.refIndex, score: c.lexicalScore }))
    }
  }
})
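
The reranker above returns the lexical score unchanged. As a hedged sketch of what a custom reranker might look like, the factory below blends `lexicalScore` with a second score. It relies only on the `rerank({ query, candidates })` shape shown above. Everything else is an assumption: `createBlendedReranker`, `wordOverlap`, and `alpha` are hypothetical names, `refIndex` is assumed to be the document's position in the original collection, and both scores are assumed to be on a comparable scale.

```javascript
// Hypothetical helper; not part of the LumoSearch API.
// Builds a reranker object that mixes the lexical score with a custom score.
// Assumes c.refIndex is the document's position in `docs`.
function createBlendedReranker(docs, scoreFn, alpha = 0.5) {
  return {
    async rerank({ query, candidates }) {
      return candidates.map((c) => {
        const custom = scoreFn(query, docs[c.refIndex]);
        return {
          refIndex: c.refIndex,
          // alpha = 1 keeps the lexical ranking, alpha = 0 uses only scoreFn.
          score: alpha * c.lexicalScore + (1 - alpha) * custom
        };
      });
    }
  };
}

// Toy stand-in for a semantic model: the fraction of query words that
// appear anywhere in the document's title or body.
function wordOverlap(query, doc) {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length === 0) return 0;
  const text = `${doc.title} ${doc.body}`.toLowerCase();
  const hits = words.filter((word) => text.includes(word)).length;
  return hits / words.length;
}
```

Under those assumptions, the result would be passed as `reranker: createBlendedReranker(docs, wordOverlap, 0.7)`. In practice `wordOverlap` would be replaced by an embedding or model call.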

Mutations

search.add({ title: 'New Doc', body: '...', category: 'docs' })
search.removeAt(0)
search.remove((doc) => doc.category === 'archived')
search.setCollection(newDocs)

Persistence

import { InMemoryStorage } from '@lumosearch/search'

// Save and restore
const storage = new InMemoryStorage()
await search.save(storage)
const restored = await LumoSearch.load(storage)

// Snapshot export/import (synchronous)
const snapshot = search.exportSnapshot()
const fromSnap = LumoSearch.fromSnapshot(snapshot)

For browsers, use IndexedDbStorage:

import { IndexedDbStorage } from '@lumosearch/search'

const storage = new IndexedDbStorage({ dbName: 'my-app-search', key: 'docs-index' })
await search.save(storage)

Worker Mode

import { LumoSearchWorker } from '@lumosearch/search/worker'

const worker = new LumoSearchWorker(docs, {
  keys: [{ name: 'title', weight: 3 }, { name: 'body', weight: 1 }]
})

const results = await worker.search('javscrippt paterns')
worker.terminate()

Result Shape

interface SearchResult<T> {
  item: T
  refIndex: number
  score: number
  matchedFields: string[]
  highlights: SearchHighlight[]
  lexicalScore?: number   // present when reranked
  rerankScore?: number    // present when reranked
}

interface SearchHighlight {
  field: string
  value: string
  indices: [number, number][]  // character ranges
}
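
As an illustration of how the `indices` ranges might be consumed, the sketch below wraps matched characters in `<mark>` tags. The names `renderHighlight` and `escapeHtml` are hypothetical. It also assumes each range is `[start, end]` with an inclusive end. The interface above does not state this, so if the ranges turn out to be end-exclusive, replace `end + 1` with `end`.

```javascript
// Hypothetical helpers; not part of the LumoSearch API.
// Escape text before inserting it into HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Wrap each matched range of highlight.value in open/close markers.
// ASSUMPTION: ranges are [start, end] with an inclusive end index.
function renderHighlight(highlight, open = '<mark>', close = '</mark>') {
  const { value, indices } = highlight;
  const sorted = [...indices].sort((a, b) => a[0] - b[0]);
  let out = '';
  let cursor = 0;
  for (const [start, end] of sorted) {
    if (start < cursor) continue; // skip ranges that overlap one already emitted
    out += escapeHtml(value.slice(cursor, start));
    out += open + escapeHtml(value.slice(start, end + 1)) + close;
    cursor = end + 1;
  }
  out += escapeHtml(value.slice(cursor));
  return out;
}

const html = renderHighlight({
  field: 'title',
  value: 'JavaScript Patterns',
  indices: [[0, 9]]
});
console.log(html); // <mark>JavaScript</mark> Patterns
```

Escaping each slice separately keeps the markers intact while still protecting against markup in the document text.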

How It Works

  1. Index — Normalize and tokenize configured fields. Build token, trigram, and prefix inverted indexes.
  2. Retrieve — For each query, gather candidates from postings (token 4x, prefix 3x, trigram 1x weight).
  3. Prune — Keep only the top candidateLimit candidates.
  4. Score — Rank with BM25F + exact/prefix/phrase/proximity boosts.
  5. Return — Top-k results with highlights.
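
The retrieve and prune steps can be sketched roughly as follows. This is a simplified illustration, not the library's internal code. The 4x/3x/1x weights come from the list above, but the `postings` input shape, the `gatherCandidates` name, and the tie-breaking rule are assumptions.

```javascript
// Illustrative sketch only; not LumoSearch's internal code.
// Posting weights as described in the retrieve step.
const POSTING_WEIGHTS = { token: 4, prefix: 3, trigram: 1 };

// `postings` is a hypothetical shape: for each index kind, an array of
// posting lists (arrays of document ids) hit by the query.
function gatherCandidates(postings, candidateLimit) {
  const votes = new Map();
  for (const [kind, weight] of Object.entries(POSTING_WEIGHTS)) {
    for (const list of postings[kind] ?? []) {
      for (const docId of list) {
        votes.set(docId, (votes.get(docId) ?? 0) + weight);
      }
    }
  }
  // Prune: keep only the strongest candidates so the more expensive
  // scoring step runs over a bounded set. Ties break by lower doc id.
  return [...votes.entries()]
    .sort((a, b) => b[1] - a[1] || a[0] - b[0])
    .slice(0, candidateLimit)
    .map(([docId, vote]) => ({ docId, vote }));
}

const candidates = gatherCandidates(
  { token: [[0, 2]], prefix: [[0, 1]], trigram: [[1, 2, 3]] },
  2
);
console.log(candidates); // [ { docId: 0, vote: 7 }, { docId: 2, vote: 5 } ]
```

Bounding the candidate set this way is what keeps scoring cost tied to `candidateLimit` rather than to the size of the dataset.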

Package Exports

Export                              Description
@lumosearch/search                  Main entry: LumoSearch, types, persistence adapters
@lumosearch/search/worker           LumoSearchWorker for off-main-thread search
@lumosearch/search/worker-script    Raw worker script entry for custom bundler setups

Browser Demo

A static demo is included in examples/browser-demo.

npm run build
npm run demo:serve
# Open http://localhost:4173/examples/browser-demo/

Contributing

Contributions are welcome. Please open an issue first to discuss what you'd like to change.

git clone https://github.com/lumosearch/search.git
cd search
npm install
npm test

License

MIT
