Architecture
InspoSearch is a client-first application with a minimal Cloudflare Workers backend for CORS bypass, AI, and shared state.
- The browser does the search: 6 ES modules, bundled by esbuild into a single app.js
- A Cloudflare Worker handles jobs the browser can't do alone: CORS-walled APIs, Workers AI vision, shared boards
- Cloudflare KV backs 30-day shared boards
- No traditional backend, no framework, no runtime npm deps in the app
The whole system, at a glance
┌─────────────────────── Browser ───────────────────────┐
│ │
│ src/state.js constants, registry, classifier │
│ src/core.js health, cache, safeFetch, scoring │
│ src/fetchers.js adapters (ADAPTERS map) + expand │
│ src/app.js orchestration, render, AI, UI │
│ src/i18n.js 6 base + 95 generated locales │
│ src/main.js esbuild entry (imports the above) │
│ │
│ → bundled to insposearch/app.js (IIFE) │
│ │
└──────────┬──────────────────┬─────────────────────────┘
│ │
│ direct │ via Worker (when needed)
│ (CORS-open APIs) │
▼ ▼
┌────────────────┐ ┌──────────────────────────────┐
│ 2,400+ source │ │ api/worker.js │
│ APIs │ │ /search /sources /random │
│ (Met, Rijks, │ │ /proxy /semantic /caption │
│ Europeana…) │ │ /tags /contribute /board │
└────────────────┘ │ /health │
└──────┬───────────────────────┘
│
┌────────┴──────────┐
▼ ▼
Workers AI KV: BOARDS
(LLaVA 1.5) (30-day TTL)
Module dependency order
No circular imports. main.js loads in this order:
state.js → core.js → fetchers.js → app.js
→ i18n.js
| Module | Responsibility |
|---|---|
| src/state.js | Constants, ALL_SOURCES registry, BADGE_META / SOURCE_META / SOURCE_GROUPS / SOURCE_DOMAINS, classifyQueryExtended / classifyQueryV2 |
| src/core.js | safeFetch, health tracking, session cache, selectDynamicSources, scoring utilities |
| src/fetchers.js | 100+ fetch* adapter functions, Datamuse / Wikidata keyword expansion, populates the ADAPTERS map |
| src/app.js | Search orchestration (3-lane dispatcher, RRF, MMR), grid rendering, DOM events, detail view, board view, 3D constellation, AI features |
| src/i18n.js | 101-language strings, DOM attribute binding (data-i18n, data-i18n-placeholder, data-i18n-title, data-i18n-aria) |
| src/main.js | esbuild entry point; imports the others in order |
Edit src/*.js, never the bundled insposearch/app.js output.
Search data flow
1. Query classification
classifyQueryExtended(q) → intent ∈ {nature, space, art, history,
design, science, photo, general}
2. Dynamic source selection
selectDynamicSources(intent) → scored top-N sources
3. Keyword expansion (explore mode + Lane B)
Datamuse synonyms + Wikidata SPARQL translations
+ art-period / species aliases
+ Datamuse qualifier pool (Lane B rotation)
4. Parallel fetch
promisePool(sources, ADAPTERS[id], concurrency: 40)
timeout ceiling: 12s for slow sources
safeFetch wraps every call — no throw escapes
5. Health tracking
Misses recorded per query intent.
Sources exceeding threshold: disabled for current session.
Auto-recovery on each new query.
6. Dedup → fuse → diversify
URL-level dedup → Reciprocal Rank Fusion across sources
→ Maximal Marginal Relevance diversification
7. Render
Lazy-loaded grid, preload-then-append (no black squares).
Colour extraction per tile.
Infinite scroll via 3-lane load-more.
The 3-lane load-more dispatcher
Load-more is the single hardest part of the system: naive pagination runs out of fresh results by page 2. We run three lanes in parallel:
| Lane | Strategy | What it provides |
|---|---|---|
| A | Fetch next page of already-hit paginated sources | Fresh results from the same set |
| B | Re-run the query with linguistic variations (Datamuse qualifiers) | New angle on the same intent |
| C | Widen the source pool — include sources that didn't make the initial cut | New provenance |
Each lane contributes candidates; dedup + RRF + MMR do the final merge.
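The final merge can be sketched roughly as follows. This is a minimal illustration of URL-level dedup followed by Reciprocal Rank Fusion, not the code in src/app.js: the function name `fuseLanes`, the `url` field on each result, and the constant k = 60 (a value commonly used for RRF) are all assumptions. MMR diversification would run on the fused list and is omitted here.

```javascript
// Hypothetical sketch: fuse the ranked lists coming back from lanes A, B and C.
// Assumes each result is an object with a `url` field; k = 60 is a commonly
// used RRF constant, not a value taken from InspoSearch.
function fuseLanes(rankedLists, k = 60) {
  const fused = new Map(); // url -> { item, score }

  for (const list of rankedLists) {
    const seenInList = new Set();
    let rank = 0;
    for (const item of list) {
      if (seenInList.has(item.url)) continue; // dedup within a single list
      seenInList.add(item.url);
      rank += 1;
      const entry = fused.get(item.url) ?? { item, score: 0 };
      entry.score += 1 / (k + rank); // RRF contribution of this list
      fused.set(item.url, entry); // same URL across lists shares one entry
    }
  }

  return [...fused.values()]
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.item);
}

// Usage: a URL ranked well by several lanes rises to the top.
const laneA = [{ url: 'a' }, { url: 'b' }];
const laneB = [{ url: 'b' }, { url: 'c' }];
const laneC = [{ url: 'b' }, { url: 'a' }];
const merged = fuseLanes([laneA, laneB, laneC]);
```

Because RRF only looks at ranks, lanes with incomparable relevance scores can still be merged without normalising anything.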
Source registry
Sources live in two places:
- Hardcoded adapters in src/fetchers.js: one fetch* per source, registered in ADAPTERS
- Community manifests in insposearch/sources/*.json: validated by scripts/validate-sources.js in CI
Display metadata for every source lives in BADGE_META, SOURCE_META, SOURCE_GROUPS, and SOURCE_DOMAINS in state.js. Adding a source touches all four (plus the adapter).
CORS-blocked sources
Some APIs block cross-origin browser requests (CORS). scripts/fetch-cors-blocked.js runs nightly via GitHub Actions and writes pre-fetched JSON to insposearch/data/. The adapters for these sources read from /data/ instead of the live API: identical shape, stale by up to 24h.
The nightly push is race-safe (rebase-and-retry on non-fast-forward).
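An adapter that reads a nightly snapshot instead of the live API might look roughly like this. It is a sketch under stated assumptions: the helper name `makePrefetchedAdapter`, the file layout `/data/<sourceId>.json`, and the `title` field used for matching are hypothetical and not taken from the repository.

```javascript
// Hypothetical sketch of an adapter for a CORS-blocked source.
// Assumptions: the nightly job writes an array of items to
// /data/<sourceId>.json and each item carries a `title` string.
function makePrefetchedAdapter(sourceId, fetchImpl = globalThis.fetch) {
  return async function fetchPrefetched(query) {
    try {
      const res = await fetchImpl(`/data/${sourceId}.json`);
      if (!res.ok) return []; // missing snapshot: behave like an empty source
      const items = await res.json();
      const needle = query.trim().toLowerCase();
      // Filter locally, since the snapshot cannot be queried server-side.
      return items.filter((item) =>
        String(item.title ?? '').toLowerCase().includes(needle));
    } catch {
      return []; // network or parse failure never throws to the caller
    }
  };
}
```

Passing `fetchImpl` as a parameter keeps the adapter testable without a network, and returning an empty array on failure matches the "no throw escapes" rule of the fetch pipeline.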
Cloudflare infrastructure
| Component | Role |
|---|---|
| Pages | Hosts insposearch/ — static site. The docs you're reading are a subpath at /docs/ |
| Worker (api/worker.js) | REST API: /search, /sources, /random, /health, /proxy, /semantic, /caption, /tags, /contribute, /board |
| Worker (api/image-proxy.js) | Thumbnail CORS proxy for sources that don't set permissive headers |
| Workers AI | Default vision provider — LLaVA 1.5 powers analyse when no BYOK is set |
| KV: BOARDS | Shared board state, 30-day TTL |
See API Contracts for endpoint-level detail.
State management
Minimal custom store: no Redux, no signals library. State is one plain object, and UI modules subscribe via a small event emitter:
const state = {
query: '',
mode: 'exact', // or 'explore'
results: [],
sources: [],
enabledSources: [],
apiKeys: {}, // from localStorage
loading: false,
selectedImage: null,
board: [],
constellation: null,
aiProvider: null, // null → Workers AI default
}
Key transitions:
| Action | Trigger | Effect |
|---|---|---|
| SEARCH_START | Enter pressed | loading: true, clear results |
| RESULTS_APPEND | Source responds | Append + re-render, rescore |
| SEARCH_COMPLETE | All sources settled | loading: false |
| LOAD_MORE | Scroll / button | Dispatch 3 lanes in parallel |
| SELECT_IMAGE | Click or keyboard | Open detail view |
| TOGGLE_SOURCE | Sidebar checkbox | Update enabledSources |
| SET_API_KEY | Keys panel input | Save to localStorage, re-run current query |
| PIN_TO_BOARD | b + drag or click | Append to board, persist to localStorage |
| SET_MODE | m | Toggle exact ↔ explore, re-run current query |
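A store of this kind can be sketched in a few lines. The names `createStore`, `subscribe`, `dispatch` and the handler table are illustrative assumptions, not the actual API in src/app.js; only a handful of the transitions above are modelled, and side effects such as rescoring or re-running the query are left out.

```javascript
// Hypothetical sketch of a plain-object store with a tiny event emitter.
// `createStore`, `subscribe` and `dispatch` are illustrative names.
function createStore(initialState) {
  const state = { ...initialState };
  const listeners = new Set();

  // Each handler mutates `state` for one action type.
  const handlers = {
    SEARCH_START(s) { s.loading = true; s.results = []; },
    RESULTS_APPEND(s, items) { s.results = s.results.concat(items); },
    SEARCH_COMPLETE(s) { s.loading = false; },
    SET_MODE(s) { s.mode = s.mode === 'exact' ? 'explore' : 'exact'; },
  };

  return {
    getState: () => state,
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener); // call to unsubscribe
    },
    dispatch(type, payload) {
      const handler = handlers[type];
      if (!handler) return; // unknown actions are ignored
      handler(state, payload);
      for (const listener of listeners) listener(type, state);
    },
  };
}

// Usage: a UI module subscribes once and re-renders on every action.
const store = createStore({ query: '', mode: 'exact', results: [], loading: false });
const seen = [];
store.subscribe((type) => seen.push(type));
store.dispatch('SEARCH_START');
store.dispatch('RESULTS_APPEND', [{ url: 'https://example.org/1.jpg' }]);
store.dispatch('SEARCH_COMPLETE');
```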
Design decisions worth knowing
- Vanilla JS, no framework. Removes ~100kb of bundle, ~100ms of hydration, a lifetime of upgrade churn
- No backend for search. Calls to CORS-open APIs go browser → source, so the common path has no server cost, no rate-limit bottleneck, no privacy leak
- Parallel via Promise.allSettled. One slow source never blocks a fast one
- Progressive render. Results appear per-source, not in a final batch
- Local-only persistence for keys, boards, preferences. No account to leak
- Optional AI. The app is fully usable without it. Default AI uses Workers AI (no user keys) — BYOK is an override, not a requirement
- IIIF-native. Any IIIF source gets deep zoom automatically via OpenSeadragon
- No circular imports. Module order is explicit in main.js and enforced by review
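The parallel-fetch and progressive-render decisions can be illustrated together. The sketch below is one possible shape, not the promisePool or safeFetch in src/core.js: `withTimeout`, `poolSketch`, the options object and the `onResults` callback are hypothetical names, and the defaults simply reuse the concurrency (40) and timeout ceiling (12s) quoted in the search data flow.

```javascript
// Hypothetical sketch: run adapters with bounded concurrency, a per-call
// timeout, and one callback per settled source (progressive render).
// Names and signatures are illustrative, not the ones in src/core.js.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

async function poolSketch(sourceIds, adapters, query, options = {}) {
  const { concurrency = 40, timeoutMs = 12000, onResults = () => {} } = options;
  const queue = [...sourceIds];
  const failed = [];

  async function worker() {
    while (queue.length > 0) {
      const id = queue.shift();
      try {
        // Promise.resolve().then(...) also turns synchronous throws into rejections.
        const call = Promise.resolve().then(() => adapters[id](query));
        const items = await withTimeout(call, timeoutMs);
        onResults(id, items); // hand this source's results to the UI right away
      } catch {
        failed.push(id); // swallow the error: one bad source never blocks others
      }
    }
  }

  const workerCount = Math.min(concurrency, queue.length);
  await Promise.allSettled(Array.from({ length: workerCount }, () => worker()));
  return failed; // ids a health tracker could record as misses
}
```

A caller would pass something like `onResults: (id, items) => store.dispatch('RESULTS_APPEND', items)` (again an assumed wiring), so tiles appear as each source settles instead of after the slowest one.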