The live wire for Tampa Bay.
A unified guide to live music, festivals, food, and family fun across the Tampa Bay area — Tampa, St. Petersburg, Clearwater, Brandon, Bradenton, Safety Harbor, Dunedin, and an Other catch-all for edge cases. Listings are aggregated daily from multiple sources, deduplicated, and ranked for readability. Curated Places (beaches, venues, food, and similar) ship on /places via a separate discovery pipeline. Lives at baywire.app.
Baywire runs 18 event adapters (src/lib/scrapers/index.ts). The daily matrix includes only sources that are enabled in the database and have a matching adapter (scripts/ci-scrape-matrix.ts). Each job tries structured data first (JSON-LD on detail pages, the WordPress Tribe Events JSON API, the Ticketmaster Discovery API, iCal exports) and falls back to OpenAI extraction when nothing structured is available; venue rows are upserted into Places as events are processed. Vercel hosts the read-only Next.js app — scrapes run on GitHub Actions, not on Vercel.
```
GHA cron (daily 12:00 UTC)
   │  matrix: one job per source
   ▼
adapter.listEvents
   │
   ├─ tryStructured() ── parse JSON-LD / ICS / vendor JSON ──┐
   │                                                         ▼
   └─ fetchAndReduce() ── HTML reducer ── OpenAI extractor ──▶ Prisma Postgres + Accelerate
                                                               │
                                                               │  (Places upserted from venues)
                                                               ▼
                                                    Vercel Next.js (read-only)
```
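
A minimal sketch of what each adapter's structured-first flow looks like, assuming a hypothetical `SourceAdapter` interface and stubbed helpers — only `listEvents`, `tryStructured`, and `fetchAndReduce` come from the diagram above; everything else is illustrative (the real shapes live in `src/lib/scrapers/` and `src/lib/extract/`):

```ts
// Hypothetical shapes — field names and helper signatures are assumptions.
type RawEvent = { title: string; startsAt: string; venueName?: string; url: string };

interface SourceAdapter {
  slug: string;
  listEvents(): Promise<RawEvent[]>;
}

// Stubs so the sketch type-checks; the real helpers are shared across adapters.
declare function tryStructured(url: string): Promise<RawEvent | null>;
declare function fetchAndReduce(url: string): Promise<string | null>;
declare function extractWithOpenAI(reducedHtml: string): Promise<RawEvent | null>;

async function extractDetail(detailUrl: string): Promise<RawEvent | null> {
  // 1. Cheap path: JSON-LD / ICS / vendor JSON on the detail page — no LLM call.
  const structured = await tryStructured(detailUrl);
  if (structured) return structured;

  // 2. Fallback: reduce the HTML, then ask the LLM for a typed event.
  const reduced = await fetchAndReduce(detailUrl);
  return reduced ? extractWithOpenAI(reduced) : null;
}
```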
- Next.js 16 (App Router, React 19, RSC by default) + Serwist (`@serwist/turbopack`) for the PWA shell
- `proxy.ts` (Next.js proxy) — anonymous guest profile cookie bootstrap
- Tailwind CSS v4 + custom coastal palette
- Prisma ORM + Prisma Postgres + Prisma Accelerate (managed Postgres + edge cache / connection pool in one URL); the URL also lives in `prisma.config.ts` for the Prisma 7 CLI
- OpenAI `gpt-4.1-mini` with Zod-typed structured outputs (also works with any OpenAI-compatible proxy via `OPENAI_BASE_URL`, e.g. Poe / Groq / Together) — see the sketch after this list
- Google Places API (New) + Vercel Blob for place discovery hero images (`npm run places:discover`, scheduled workflow)
- Stytch for SMS sign-in / sessions (optional for local dev depending on feature work)
- `cheerio` for HTML reduction, `p-limit` for per-host pacing, Playwright where adapters need a real browser
- GitHub Actions — daily event scrape matrix, weekly places discovery, daily expired-row cleanup
- Vercel hosts the read-only Next.js app (event and place pages load from the DB on the server)
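
A hedged sketch of the Zod-typed structured-output call behind the HTML+LLM fallback; the schema and prompt here are illustrative, not the ones in `src/lib/extract/`:

```ts
import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";

// Illustrative schema — the real event schema is richer.
const EventSchema = z.object({
  title: z.string(),
  startsAt: z.string().describe("ISO 8601 start time, America/New_York"),
  venueName: z.string().nullable(),
  url: z.string(),
});

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: process.env.OPENAI_BASE_URL, // optional OpenAI-compatible proxy (Poe / Groq / Together)
});

export async function extractEvent(reducedHtml: string) {
  const completion = await client.beta.chat.completions.parse({
    model: process.env.OPENAI_EXTRACT_MODEL ?? "gpt-4.1-mini",
    messages: [
      { role: "system", content: "Extract the single event described by this page." },
      { role: "user", content: reducedHtml },
    ],
    response_format: zodResponseFormat(EventSchema, "event"),
  });
  return completion.choices[0].message.parsed; // typed event, or null if the model refused
}
```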
| Slug | Site | Strategy | Notes |
|---|---|---|---|
| `eventbrite` | eventbrite.com | JSON-LD | Geo-search across all metro cities (excluding other), 2 pages each |
| `ticketmaster` | ticketmaster.com/discover/tampa | Discovery API | Official Discovery API, DMA 635 (Tampa-St. Pete-Sarasota) |
| `visit_tampa_bay` | visittampabay.com/events | JSON-LD | Official tourism site, curated |
| `visit_st_pete_clearwater` | visitstpeteclearwater.com | JSON-LD | Both /events and /events-festivals listings |
| `tampa_gov` | tampa.gov/calendar | JSON-LD + ICS | City of Tampa public events calendar |
| `ilovetheburg` | ilovetheburg.com | Tribe REST API | St. Pete blog |
| `thats_so_tampa` | thatssotampa.com | Tribe REST API | Tampa-side blog |
| `tampa_bay_times` | tampabay.com/things-to-do | HTML + LLM | Editorial weekend picks |
| `tampa_bay_markets` | tampabaymarkets.com | HTML + LLM | Recurring farmers' markets across the bay |
| `safety_harbor` | cityofsafetyharbor.com | RSS hint + LLM | CivicPlus RSS feed → SSR detail pages |
| `side_splitters` | sidesplitterscomedy.com | HTML + LLM | Comedy club; listings link out to OvationTix |
| `dont_tell_comedy` | donttellcomedy.com | HTML + LLM | Pop-up “secret” comedy shows |
| `funny_bone_tampa` | tampa.funnybone.com | HTML + LLM | Funny Bone Tampa show listings |
| `straz_center` | strazcenter.org | HTML + LLM | Playwright solves Incapsula challenge automatically |
| `tampa_theatre` | tampatheatre.org | HTML + LLM | Live events page + detail pages |

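The slugs above map 1:1 to code adapters. A hypothetical sketch of the registry in the spirit of `src/lib/scrapers/index.ts` (the real export shape may differ):

```ts
// Illustrative registry — the actual index.ts may export a different shape.
import type { SourceAdapter } from "./types"; // hypothetical module

declare const eventbriteAdapter: SourceAdapter;
declare const ticketmasterAdapter: SourceAdapter;

export const adapters: Record<string, SourceAdapter> = {
  eventbrite: eventbriteAdapter,
  ticketmaster: ticketmasterAdapter,
  // …one entry per slug in the tables above
};

// The daily matrix only includes DB-enabled sources whose slug has an entry here.
export const hasAdapter = (slug: string): boolean => slug in adapters;
```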
Browser-powered sources: The following adapters use Playwright (headless Chromium) to bypass WAF challenges or render JS-heavy pages:
| Slug | Site | Strategy | Notes |
|---|---|---|---|
| `dunedin_gov` | dunedinfl.net | Browser render + AI listings | Akamai; JS-rendered calendar |
| `unation` | unation.com | Browser render | Cloudflare bot challenge |
| `feverup` | feverup.com | Browser render + AI listings | Full SPA |
| `straz_center` | strazcenter.org | Browser cookies | Incapsula WAF |
| `funny_bone_tampa` | tampa.funnybone.com | Browser cookies | DataDome WAF |
| `visit_tampa_bay` | visittampabay.com | Browser render (listing only) | JS-rendered calendar |

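A minimal Playwright sketch of the "browser render" / "browser cookies" strategies, under an assumed helper name (the shared browser code lives alongside the adapters in `src/lib/scrapers/`):

```ts
import { chromium } from "playwright";

// Render a JS-heavy or WAF-protected page, returning its HTML plus the
// challenge cookies so later plain fetches against the same host can reuse them.
export async function browserFetch(url: string) {
  const browser = await chromium.launch({ headless: true });
  try {
    const context = await browser.newContext();
    const page = await context.newPage();
    await page.goto(url, { waitUntil: "networkidle", timeout: 60_000 });
    const html = await page.content();
    const cookies = await context.cookies(); // e.g. Incapsula / DataDome tokens
    return { html, cookies };
  } finally {
    await browser.close();
  }
}
```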
```bash
# 1. Install dependencies (postinstall runs `prisma generate`)
npm install

# 2. Configure environment
cp .env.example .env.local
# Edit .env.local — required for scrapes:
#   DATABASE_URL, OPENAI_API_KEY, CRON_SECRET
# Optional / feature-specific (see .env.example comments):
#   TICKETMASTER_API_KEY, GOOGLE_MAPS_API_KEY, BLOB_READ_WRITE_TOKEN,
#   STYTCH_PROJECT_ID, STYTCH_SECRET, STYTCH_ENVIRONMENT

# 3. Push the schema to the database
npm run db:push

# 4. Seed it with one full scrape
npm run scrape
# or scrape a single source:
npm run scrape eventbrite

# 5. Start the dev server
npm run dev
```

Open http://localhost:3000.
- Sign in at console.prisma.io and create a Prisma Postgres database.
- Copy the connection string — it looks like `prisma+postgres://accelerate.prisma-data.net/?api_key=...` — and paste it into `DATABASE_URL` in `.env.local`.
- Run `npm run db:push` to materialize the schema. (Use `npm run db:migrate:dev` once you want versioned migrations.)

The same URL handles both the live query path (via Accelerate, with edge caching) and migrations, so you don't need a separate "direct URL".
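
In practice that means a single client setup along these lines, assuming the standard Accelerate extension (the real module lives under `src/lib/db/`):

```ts
// One prisma+postgres:// URL: Accelerate provides pooling and edge caching for
// runtime queries, while the Prisma CLI uses the same URL for db push / migrate.
import { PrismaClient } from "@prisma/client";
import { withAccelerate } from "@prisma/extension-accelerate";

export const prisma = new PrismaClient().$extends(withAccelerate());
```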
| Command | What it does |
|---|---|
| `npm run dev` | Next.js dev server (clears `.next` first) |
| `npm run build` | Production `next build` (run `npm install` first so postinstall generates Prisma Client) |
| `npm run typecheck` | `tsc --noEmit` |
| `npm run lint` | ESLint (`eslint .`) |
| `npm run db:push` | Push schema to the database (dev only) |
| `npm run db:migrate:dev` | Generate + apply a development migration |
| `npm run db:migrate` | Apply existing migrations (production) |
| `npm run db:studio` | Open Prisma Studio |
| `npm run scrape [slug]` | Run the event scrape pipeline once, locally |
| `npm run places:discover` | Run the Google Places + editorial discovery pipeline (`--help` for flags) |
| `npm run cleanup:expired` | Delete stale events (and old orphan places unless `--skip-places`) |
| `npm run ci:scrape-matrix` | Emit the GitHub Actions scrape matrix JSON from the DB (CI only) |

Production splits Vercel (HTTP app) from GitHub Actions (scheduled writes).
- Push to GitHub and import the repo into Vercel.
- In Project Settings → Environment Variables, set at minimum `DATABASE_URL`. Add `CRON_SECRET` if you want the manual `POST /api/cron/scrape` route bearer-gated. For profiles, places imagery, and SMS auth, add `BLOB_READ_WRITE_TOKEN`, `GOOGLE_MAPS_API_KEY`, and the `STYTCH_*` variables from `.env.example` as needed.
- Deploy. There are no Vercel cron schedules — `vercel.json` is intentionally empty.
The workflow at `.github/workflows/scrape.yml` runs every day at 12:00 UTC. It builds a matrix from enabled `sources` rows that match a code adapter (`npm run ci:scrape-matrix`), then runs `npm run scrape -- <slug>` per cell. Each job parses the `[scrape:result]` line into the step summary and uploads `scripts/.last-html/` plus `scrape.log` when it fails. Playwright browsers are cached and Chromium is installed per job.
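
Conceptually the matrix builder is a query-and-intersect script; a hedged sketch of `scripts/ci-scrape-matrix.ts` (import paths and schema field names are assumptions):

```ts
// Illustrative sketch — real import paths and field names may differ.
import { prisma } from "../src/lib/db/client";
import { adapters } from "../src/lib/scrapers"; // slug → adapter registry

async function main() {
  const sources = await prisma.source.findMany({ where: { enabled: true } });

  // A source gets a matrix cell only if it is enabled in the DB *and* has a code adapter.
  const include = sources
    .filter((source) => source.slug in adapters)
    .map((source) => ({ slug: source.slug }));

  // GitHub Actions consumes this JSON as strategy.matrix.
  console.log(JSON.stringify({ include }));
}

main().finally(() => prisma.$disconnect());
```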
Secrets for the scrape workflow:

| Secret | Required | Notes |
|---|---|---|
| `DATABASE_URL` | yes | Same Prisma Postgres URL Vercel uses |
| `OPENAI_API_KEY` | yes | Used only by HTML+LLM fallbacks |
| `OPENAI_BASE_URL` | optional | OpenAI-compatible proxy (Poe / Groq / …) |
| `OPENAI_EXTRACT_MODEL` | optional | Override the default `gpt-4.1-mini` |
| `TICKETMASTER_API_KEY` | optional | Free Discovery API key. Without it the `ticketmaster` adapter errors and the rest of the matrix continues |

`.github/workflows/discover-places.yml` runs Sundays at 08:00 UTC (and supports `workflow_dispatch` with optional city and type inputs). It executes `npm run places:discover`. Additional secrets: `GOOGLE_MAPS_API_KEY`, `BLOB_READ_WRITE_TOKEN`, plus the same OpenAI variables as the scrape workflow for the editorial pass.

`.github/workflows/cleanup.yml` runs daily at 06:00 UTC and executes `npm run cleanup:expired` with `DATABASE_URL` only.
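
The cleanup job is conceptually a dated `deleteMany`; a sketch under assumed model and field names (the real rules live in `scripts/cleanup-expired.ts`):

```ts
// Illustrative only — Event/Place field and relation names are assumptions.
import { prisma } from "../src/lib/db/client";

export async function cleanupExpired(skipPlaces = false) {
  // Remove events whose end time has already passed.
  const events = await prisma.event.deleteMany({
    where: { endsAt: { lt: new Date() } },
  });

  // Optionally remove old places that no longer back any event.
  const places = skipPlaces
    ? { count: 0 }
    : await prisma.place.deleteMany({
        where: { curated: false, events: { none: {} } },
      });

  console.log(`[cleanup] removed ${events.count} events, ${places.count} orphan places`);
}
```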
- `workflow_dispatch` on the scrape workflow accepts an optional `source` input. Leave it blank to run every enabled adapter in parallel.
- Scheduled GitHub runs may start 5–30 minutes late under platform load; use `workflow_dispatch` or the Vercel cron lever below when you need tighter timing.
- `POST /api/cron/scrape` on Vercel is preserved as a manual lever:

  ```bash
  curl -H "Authorization: Bearer $CRON_SECRET" \
    https://your-app.vercel.app/api/cron/scrape
  ```

  It returns immediately and continues the work in the background via `next/after`. GitHub Actions remains the source of truth for daily scrapes.
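
A sketch of that handler pattern using `after` from `next/server` (the real `src/app/api/cron/scrape/route.ts` may differ in detail):

```ts
// src/app/api/cron/scrape/route.ts — illustrative sketch, not the actual handler.
import { NextResponse, after } from "next/server";

// Hypothetical entry point into the event pipeline.
declare function runScrapePipeline(): Promise<void>;

export async function POST(request: Request) {
  if (request.headers.get("authorization") !== `Bearer ${process.env.CRON_SECRET}`) {
    return NextResponse.json({ error: "unauthorized" }, { status: 401 });
  }

  // Respond immediately; the scrape keeps running after the response is flushed.
  after(() => runScrapePipeline());

  return NextResponse.json({ started: true });
}
```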
```
proxy.ts                 Next.js proxy — guest profile cookie bootstrap
src/
  app/
    events/              Event list + `/events/[id]` detail
    places/              Curated places index + `/places/[slug]` detail
    api/cron/scrape/     Bearer-gated manual scrape trigger
    serwist/             Serwist worker route (PWA)
  components/            RSC + client UI (home, places, design-system, …)
  lib/
    cities.ts            City constants (shared DB enum + UI)
    auth/                Stytch wiring + session helpers
    db/                  Prisma client (Accelerate) + query helpers
    extract/             OpenAI structured-output extraction
    pipeline/            Event orchestrator; `discoverPlaces` for places
    places/              Google Places client + discovery types
    scrapers/            Per-source adapters + shared fetch/reduce/browser
    time/                America/New_York-aware window helpers
    utils.ts             Shared helpers (`cn`, formatting, …)
prisma/
  schema.prisma          Source, ScrapeRun, Event, Place, profiles, …
  migrations/            SQL migrations (after `db:migrate:dev`)
prisma.config.ts         Prisma 7 datasource URL for CLI
scripts/
  scrape.ts              `npm run scrape`
  places-discover.ts     `npm run places:discover`
  cleanup-expired.ts     `npm run cleanup:expired`
  ci-scrape-matrix.ts    Matrix builder for GitHub Actions
```
- Per-host pacing is 1 request / 1.1 seconds, with concurrent extraction at 4 in flight.
- Structured-first: adapters with JSON-LD / ICS / vendor JSON skip the LLM entirely via `tryStructured`. The HTML+LLM fallback runs only when no structured surface exists for a given event.
- Content-hash short-circuit: events whose structured payload (or reduced HTML) hasn't changed never re-hit the LLM.
- Reduced HTML is capped at 16k characters (well under 4k tokens) before being sent to `gpt-4.1-mini`.
- A daily run currently touches ~100–200 unique events. With Phase 2 structured-first enabled, most adapters do zero LLM calls per event; only `tampa_bay_times`, `tampa_bay_markets`, and `safety_harbor` consistently use the LLM, plus any detail page where structured extraction returned `null`.
- Accelerate-backed read helpers use `cacheStrategy: { ttl: 60, swr: 300 }` where caching applies (`src/lib/db/queries.ts`, `queriesPlaces.ts`), so repeated reads can hit the edge cache — see the sketch below.
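
A hedged example of such a read helper (query shape and field names are assumptions; see `src/lib/db/queries.ts` for the real ones):

```ts
import { prisma } from "./client"; // hypothetical import path

// Repeat reads within 60s are served from Accelerate's edge cache; stale results
// may be returned for up to 5 more minutes while the cache revalidates.
export function upcomingEvents(city: string) {
  return prisma.event.findMany({
    where: { city, startsAt: { gte: new Date() } }, // assumed field names
    orderBy: { startsAt: "asc" },
    take: 50,
    cacheStrategy: { ttl: 60, swr: 300 },
  });
}
```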
This project respects each source's robots.txt and only fetches public listing and detail pages. Event cards link back to the original URL, and the site footer lists enabled sources from the database. If a publisher wants their listings removed, contact us or open an issue and we'll disable the relevant adapter.