
Architecture

InspoSearch is a client-first application with a minimal Cloudflare Workers backend for CORS bypass, AI, and shared state.

  • The browser does the search — 6 ES modules, bundled by esbuild into a single app.js
  • A Cloudflare Worker handles jobs the browser can't do alone: CORS-walled APIs, Workers AI vision, shared boards
  • Cloudflare KV backs 30-day shared boards
  • No traditional backend, no framework, no runtime npm deps in the app

The whole system, at a glance

┌─────────────────────── Browser ───────────────────────┐
│                                                       │
│   src/state.js     constants, registry, classifier    │
│   src/core.js      health, cache, safeFetch, scoring  │
│   src/fetchers.js  adapters (ADAPTERS map) + expand   │
│   src/app.js       orchestration, render, AI, UI      │
│   src/i18n.js      6 base + 95 generated locales      │
│   src/main.js      esbuild entry (imports the above)  │
│                                                       │
│   → bundled to insposearch/app.js (IIFE)              │
│                                                       │
└──────────┬──────────────────┬─────────────────────────┘
           │                  │
           │ direct           │ via Worker (when needed)
           │ (CORS-open APIs) │
           ▼                  ▼
   ┌────────────────┐  ┌──────────────────────────────┐
   │ 2,400+ source  │  │ api/worker.js                │
   │ APIs           │  │  /search /sources /random    │
   │ (Met, Rijks,   │  │  /proxy  /semantic /caption  │
   │  Europeana…)   │  │  /tags   /contribute /board  │
   └────────────────┘  │  /health                     │
                       └──────┬───────────────────────┘

                     ┌────────┴──────────┐
                     ▼                   ▼
              Workers AI           KV: BOARDS
              (LLaVA 1.5)          (30-day TTL)

Module dependency order

No circular imports. main.js loads in this order:

state.js     → core.js      → fetchers.js  → app.js
                                           → i18n.js
| Module | Responsibility |
| --- | --- |
| src/state.js | Constants, ALL_SOURCES registry, BADGE_META / SOURCE_META / SOURCE_GROUPS / SOURCE_DOMAINS, classifyQueryExtended / classifyQueryV2 |
| src/core.js | safeFetch, health tracking, session cache, selectDynamicSources, scoring utilities |
| src/fetchers.js | 100+ fetch* adapter functions, Datamuse / Wikidata keyword expansion, populates the ADAPTERS map |
| src/app.js | Search orchestration (3-lane dispatcher, RRF, MMR), grid rendering, DOM events, detail view, board view, 3D constellation, AI features |
| src/i18n.js | 101-language strings, DOM attribute binding (data-i18n, data-i18n-placeholder, data-i18n-title, data-i18n-aria) |
| src/main.js | esbuild entry point: imports the others in order |

Edit src/*.js, never the bundled insposearch/app.js output.

Search data flow

1. Query classification
   classifyQueryExtended(q)  →  intent ∈ {nature, space, art, history,
                                          design, science, photo, general}

2. Dynamic source selection
   selectDynamicSources(intent)  →  scored top-N sources

3. Keyword expansion (explore mode + Lane B)
   Datamuse synonyms  +  Wikidata SPARQL translations
                      +  art-period / species aliases
                      +  Datamuse qualifier pool (Lane B rotation)

4. Parallel fetch
   promisePool(sources, ADAPTERS[id], concurrency: 40)
   timeout ceiling: 12s for slow sources
   safeFetch wraps every call — no throw escapes

5. Health tracking
   Misses recorded per query intent.
   Sources exceeding threshold: disabled for current session.
   Auto-recovery on each new query.

6. Dedup → fuse → diversify
   URL-level dedup  →  Reciprocal Rank Fusion across sources
                    →  Maximal Marginal Relevance diversification

7. Render
   Lazy-loaded grid, preload-then-append (no black squares).
   Colour extraction per tile.
   Infinite scroll via 3-lane load-more.
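
Step 4 can be sketched as follows. This is a minimal illustration, not the project's code: the names `promisePool` and `safeFetch` come from the flow above, but their bodies, the `{ ok, value | error }` result shape, and the toy `ADAPTERS` entries are assumptions.

```javascript
// Hypothetical sketch: function names mirror the docs, bodies are assumptions.

// Run `task` so that it always resolves to { ok, value } or { ok, error }.
// A timeout resolves (never rejects), so no throw escapes to the caller.
async function safeFetch(task, timeoutMs = 12000) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve({ ok: false, error: 'timeout' }), timeoutMs);
  });
  try {
    const work = Promise.resolve()
      .then(task)
      .then((value) => ({ ok: true, value }))
      .catch((err) => ({ ok: false, error: String((err && err.message) || err) }));
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Run `worker(item)` over `items` with at most `concurrency` calls in flight.
// Results keep the order of `items`.
async function promisePool(items, worker, concurrency = 40) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const i = next++;
      results[i] = await safeFetch(() => worker(items[i]));
    }
  }
  const runners = Array.from({ length: Math.min(concurrency, items.length) }, runner);
  await Promise.all(runners);
  return results;
}

// Toy adapters standing in for the real ADAPTERS map (assumed shape).
const ADAPTERS = {
  met: async (query) => [{ source: 'met', title: `${query} (Met)` }],
  broken: async () => { throw new Error('HTTP 500'); },
};

promisePool(['met', 'broken'], (id) => ADAPTERS[id]('heron'), 40)
  .then((results) => console.log(results.map((r) => r.ok)));
```

Because every slot in `results` is filled by `safeFetch`, a failing or slow source shows up as `{ ok: false }` rather than as a rejected promise, which is what lets health tracking count misses without a try/catch at each call site.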

The 3-lane load-more dispatcher

Load-more is the single hardest part of the system: naive pagination runs out of fresh results by page 2. We run three lanes in parallel:

| Lane | Strategy | What it provides |
| --- | --- | --- |
| A | Fetch next page of already-hit paginated sources | Fresh results from the same set |
| B | Re-run the query with linguistic variations (Datamuse qualifiers) | New angle on the same intent |
| C | Widen the source pool: include sources that didn't make the initial cut | New provenance |

Each lane contributes candidates; dedup + RRF + MMR do the final merge.
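
The merge step can be sketched like this. It is an illustration under assumptions, not the app's implementation: the function names, the conventional RRF constant `k = 60`, the `lambda` weight, and the "same source means similar" similarity function are all hypothetical choices made for the example.

```javascript
// Reciprocal Rank Fusion: each list votes 1 / (k + rank) for its items.
// Items are keyed by URL, so URL-level dedup happens in the same pass.
function fuseRRF(rankedLists, k = 60) {
  const byUrl = new Map();
  for (const list of rankedLists) {
    const seenInList = new Set(); // count a URL once per list
    list.forEach((item, index) => {
      if (seenInList.has(item.url)) return;
      seenInList.add(item.url);
      const entry = byUrl.get(item.url) || { ...item, score: 0 };
      entry.score += 1 / (k + index + 1); // rank is 1-based
      byUrl.set(item.url, entry);
    });
  }
  return [...byUrl.values()].sort((a, b) => b.score - a.score);
}

// Maximal Marginal Relevance: greedily pick the item that maximises
// lambda * relevance - (1 - lambda) * (max similarity to items already picked).
function diversifyMMR(items, similarity, lambda = 0.7, limit = items.length) {
  const pool = [...items];
  const picked = [];
  const maxScore = Math.max(...pool.map((item) => item.score), 1e-9);
  while (pool.length > 0 && picked.length < limit) {
    let bestIndex = 0;
    let bestValue = -Infinity;
    pool.forEach((candidate, index) => {
      const redundancy = picked.reduce(
        (max, chosen) => Math.max(max, similarity(candidate, chosen)), 0);
      const value = lambda * (candidate.score / maxScore) - (1 - lambda) * redundancy;
      if (value > bestValue) { bestValue = value; bestIndex = index; }
    });
    picked.push(pool.splice(bestIndex, 1)[0]);
  }
  return picked;
}

// Toy lane outputs (hypothetical URLs and sources).
const a = { url: 'https://example.org/a', source: 'met' };
const b = { url: 'https://example.org/b', source: 'met' };
const c = { url: 'https://example.org/c', source: 'rijks' };
const d = { url: 'https://example.org/d', source: 'europeana' };
const e = { url: 'https://example.org/e', source: 'rijks' };

const fused = fuseRRF([[a, b, c], [b, d], [c, e]]); // lanes A, B, C
const sameSource = (x, y) => (x.source === y.source ? 1 : 0); // assumed similarity
const diversified = diversifyMMR(fused, sameSource);

console.log(fused.map((item) => item.source));
console.log(diversified.map((item) => item.source));
```

With these toy inputs, plain RRF places two `met` items in the top three, while the MMR pass moves a third source up. A real similarity function would more likely compare titles, colours, or embeddings than source ids.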

Source registry

Sources live in two places:

  • The fetch logic: a fetch* adapter in fetchers.js, registered in the ADAPTERS map.
  • Display metadata for every source: BADGE_META, SOURCE_META, SOURCE_GROUPS, SOURCE_DOMAINS in state.js. Adding a source touches all four (plus the adapter).

CORS-blocked sources

Some APIs do not allow cross-origin requests, so the browser cannot call them directly. scripts/fetch-cors-blocked.js runs nightly via GitHub Actions and writes pre-fetched JSON to insposearch/data/. The adapters for these sources read from /data/ instead of the live API: the response shape is identical, but the data can be up to 24 hours stale.

The nightly push is race-safe (rebase-and-retry on non-fast-forward).
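
An adapter for one of these sources might look like the sketch below. The `/data/<id>.json` file naming, the `title` field, and the `makeSnapshotAdapter` helper are assumptions for illustration; only the idea of reading the pre-fetched JSON instead of the live API comes from the text above.

```javascript
// Hypothetical sketch: file naming and item fields are assumptions.
// `fetchImpl` is injectable so the adapter can be exercised without a network.
function makeSnapshotAdapter(sourceId, fetchImpl = fetch) {
  return async function fetchFromSnapshot(query) {
    const response = await fetchImpl(`/data/${sourceId}.json`);
    if (!response.ok) return []; // missing snapshot: behave like an empty source
    const items = await response.json();
    const needle = query.trim().toLowerCase();
    // The snapshot is static, so filtering happens client-side.
    return items.filter((item) => (item.title || '').toLowerCase().includes(needle));
  };
}

// Stand-in for the browser's fetch, serving one fake snapshot file.
const fakeFetch = async (url) => ({
  ok: url === '/data/example-museum.json',
  json: async () => [
    { title: 'Grey Heron', url: 'https://example.org/1' },
    { title: 'Still Life', url: 'https://example.org/2' },
  ],
});

const fetchExampleMuseum = makeSnapshotAdapter('example-museum', fakeFetch);
fetchExampleMuseum('heron').then((items) => console.log(items.length));
```

Keeping the return shape the same as a live adapter means the rest of the pipeline does not need to know whether a result came from a snapshot.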

Cloudflare infrastructure

| Component | Role |
| --- | --- |
| Pages | Hosts insposearch/ (static site). The docs you're reading are a subpath at /docs/ |
| Worker (api/worker.js) | REST API: /search, /sources, /random, /health, /proxy, /semantic, /caption, /tags, /contribute, /board |
| Worker (api/image-proxy.js) | Thumbnail CORS proxy for sources that don't set permissive headers |
| Workers AI | Default vision provider: LLaVA 1.5 powers analyse when no BYOK is set |
| KV: BOARDS | Shared board state, 30-day TTL |

See API Contracts for endpoint-level detail.
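
As a rough sketch of how the 30-day TTL could be applied, the helpers below write and read a board through a KV namespace. `put` with `expirationTtl` (in seconds) and `get` with `{ type: 'json' }` are the Workers KV calls; the `board:` key prefix, the helper names, and the in-memory stand-in are assumptions, and the real /board handler may differ.

```javascript
const THIRTY_DAYS_IN_SECONDS = 30 * 24 * 60 * 60;

// `kv` is a Workers KV namespace binding (env.BOARDS inside the Worker).
async function saveBoard(kv, boardId, board) {
  await kv.put(`board:${boardId}`, JSON.stringify(board), {
    expirationTtl: THIRTY_DAYS_IN_SECONDS, // KV deletes the key after this many seconds
  });
}

// Resolves to the parsed board, or null once the key is missing or expired.
async function loadBoard(kv, boardId) {
  return kv.get(`board:${boardId}`, { type: 'json' });
}

// In-memory stand-in with the same two methods, for local experiments only.
// It records the options but does not actually expire anything.
function makeMemoryKV() {
  const entries = new Map();
  return {
    entries,
    async put(key, value, options) { entries.set(key, { value, options }); },
    async get(key, options) {
      const entry = entries.get(key);
      if (!entry) return null;
      return options && options.type === 'json' ? JSON.parse(entry.value) : entry.value;
    },
  };
}

const memoryKV = makeMemoryKV();
saveBoard(memoryKV, 'demo', { title: 'Herons', pins: ['https://example.org/1'] })
  .then(() => loadBoard(memoryKV, 'demo'))
  .then((board) => console.log(board.title));
```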

State management

Minimal custom store: no Redux, no signals library. State is one plain object, and UI modules subscribe to changes via a small event emitter:

```js
const state = {
  query: '',
  mode: 'exact',               // or 'explore'
  results: [],
  sources: [],
  enabledSources: [],
  apiKeys: {},                 // from localStorage
  loading: false,
  selectedImage: null,
  board: [],
  constellation: null,
  aiProvider: null,            // null → Workers AI default
}
```

Key transitions:

| Action | Trigger | Effect |
| --- | --- | --- |
| SEARCH_START | Enter pressed | loading: true, clear results |
| RESULTS_APPEND | Source responds | Append + re-render, rescore |
| SEARCH_COMPLETE | All sources settled | loading: false |
| LOAD_MORE | Scroll / button | Dispatch 3 lanes in parallel |
| SELECT_IMAGE | Click or keyboard | Open detail view |
| TOGGLE_SOURCE | Sidebar checkbox | Update enabledSources |
| SET_API_KEY | Keys panel input | Save to localStorage, re-run current query |
| PIN_TO_BOARD | b + drag or click | Append to board, persist to localStorage |
| SET_MODE | m | Toggle exact ↔ explore, re-run current query |
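
A store of this kind can be sketched in a few lines. The action names come from the table above; `createStore`, `reduce`, the action payload fields (`query`, `items`), and the choice to replace the state object rather than mutate it are assumptions made for the example, and only four of the actions are handled.

```javascript
// Hypothetical sketch of a tiny store with subscribe/dispatch.
function reduce(state, action) {
  switch (action.type) {
    case 'SEARCH_START':
      return { ...state, query: action.query, loading: true, results: [] };
    case 'RESULTS_APPEND':
      return { ...state, results: [...state.results, ...action.items] };
    case 'SEARCH_COMPLETE':
      return { ...state, loading: false };
    case 'SET_MODE':
      return { ...state, mode: state.mode === 'exact' ? 'explore' : 'exact' };
    default:
      return state; // unknown actions leave the state untouched
  }
}

function createStore(initialState) {
  let state = { ...initialState };
  const listeners = new Set();
  return {
    getState: () => state,
    // Returns an unsubscribe function.
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
    dispatch(action) {
      state = reduce(state, action);
      listeners.forEach((listener) => listener(state, action));
    },
  };
}

const store = createStore({ query: '', mode: 'exact', results: [], loading: false });
const seenActions = [];
const unsubscribe = store.subscribe((nextState, action) => seenActions.push(action.type));

store.dispatch({ type: 'SEARCH_START', query: 'heron' });
store.dispatch({ type: 'RESULTS_APPEND', items: [{ url: 'https://example.org/1' }] });
store.dispatch({ type: 'SEARCH_COMPLETE' });
unsubscribe();
store.dispatch({ type: 'SET_MODE' }); // no longer observed by the listener

console.log(store.getState().mode, seenActions.length);
```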

Design decisions worth knowing

  1. Vanilla JS, no framework. Removes ~100kb of bundle, ~100ms of hydration, a lifetime of upgrade churn
  2. No backend for search. Search requests go browser → source wherever the API allows it; the Worker is only a fallback for CORS-walled APIs. That means no server cost, no rate-limit bottleneck, no privacy leak
  3. Parallel via Promise.allSettled. One slow source never blocks a fast one
  4. Progressive render. Results appear per-source, not in a final batch
  5. Local-only persistence for keys, boards, preferences. No account to leak
  6. Optional AI. The app is fully usable without it. Default AI uses Workers AI (no user keys) — BYOK is an override, not a requirement
  7. IIIF-native. Any IIIF source gets deep zoom automatically via OpenSeadragon
  8. No circular imports. Module order is explicit in main.js and enforced by review
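
Decisions 3 and 4 combine naturally, as in the sketch below: each source renders as soon as it answers, and `Promise.allSettled` only decides when the search as a whole is finished. The `searchProgressively` name, the `onResults` callback, and the toy adapters are hypothetical.

```javascript
// Hypothetical sketch of progressive, failure-tolerant fan-out.
async function searchProgressively(adapters, query, onResults) {
  const pending = Object.entries(adapters).map(async ([id, adapter]) => {
    const items = await adapter(query);
    onResults(id, items); // fires per source, as soon as that source answers
    return items.length;
  });
  // allSettled never rejects, so one failing source cannot abort the others.
  const settled = await Promise.allSettled(pending);
  return {
    fulfilled: settled.filter((s) => s.status === 'fulfilled').length,
    rejected: settled.filter((s) => s.status === 'rejected').length,
  };
}

const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Toy adapters: one slow, one fast, one failing.
const demoAdapters = {
  slow: async (q) => { await wait(30); return [{ title: `${q} (slow)` }]; },
  fast: async (q) => { await wait(5); return [{ title: `${q} (fast)` }]; },
  broken: async () => { throw new Error('HTTP 503'); },
};

const renderOrder = [];
const demoSearch = searchProgressively(demoAdapters, 'heron', (id) => renderOrder.push(id));
demoSearch.then((summary) => console.log(renderOrder, summary));
```

The fast source renders before the slow one even though it is listed second, and the failing source only shows up in the final summary.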

· AGPL-3.0 · app · github