
tira · Capture operator UI for the harvester PaaS

The operator pane for an ephemeral capture fleet.

tira is what we built so that capturing content from the open web — for a Phoenix rebuild, a content migration, a competitive sweep — is a thing you can see. Recipes you can read, workers that spin up per run and tear down at terminal state, cost events priced in euros.

Own scraping infra. Own egress strategy. Own capture artifacts. Lives at tira.eleven11.pro, behind a password and an HMAC cookie.

Surface signal

Status: PARTNER
Tenancy: Internal
Workspace: tira.eleven11.pro

Why this exists

Scraping at scale, without renting somebody else's fleet.

Most teams that need to capture content from the open web end up paying a black-box scraping vendor, watching jobs land in someone else's dashboard, and trusting that the artifacts won't be re-sold under a different brand. The work is fragile, the bill compounds, and the moment a target turns hostile, you're a support ticket away from blocked.

tira is the other shape. It's the operator pane for an ephemeral capture fleet that lives inside Eleven11 — recipes you can read, workers that spin up on a Hetzner box per run and tear down at terminal state, and cost events priced in euros to the minute. You see every dispatch, you own every artifact, and the substrate is something a senior engineer signed off on.

Self-sustained by design

Owned, not rented.

Capture is the seam where most teams accidentally hand operating leverage to a vendor. tira is built so the leverage stays inside the engagement.

01

Your scraping infra, not a SaaS plan.

tira is the front of an in-house harvester. Workers are cpx42 boxes provisioned on demand from a Hetzner project we control, on a unique Celery queue per submission, reaped by a task-state-aware watchdog. No shared pool, no noisy neighbours, no per-request billing surprise.
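The one-worker-per-run shape can be sketched in a few lines. This is a hypothetical model of the dispatch, not the harvester API's real contract: the field names, the queue-naming scheme, and dispatch_plan itself are illustrative.

```python
import uuid

def dispatch_plan(recipe: str, params: dict) -> dict:
    """Illustrative sketch of one-worker-per-run dispatch.

    The real harvester API owns the actual field names; the point is
    the shape: one submission, one unique queue, one run-scoped box.
    """
    job_id = uuid.uuid4().hex[:12]
    return {
        "job_id": job_id,
        # One unique Celery queue per submission: no shared pool,
        # so cross-run contention is impossible by construction.
        "queue": f"run-{recipe}-{job_id}",
        # One ephemeral cpx42 per run, provisioned for this queue only
        # and reaped by the watchdog at terminal state.
        "worker": {"server_type": "cpx42", "lifetime": "run-scoped"},
        "task": {"recipe": recipe, "params": params},
    }

plan = dispatch_plan("compras_mx", {"shards": 8})
```

Because the queue name embeds the job id, no other worker can ever consume this run's tasks, which is the "no noisy neighbours" guarantee in one line.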

02

Capture artifacts in your bucket.

Every run writes a zip bundle to Hetzner Object Storage with per-URL JSONL plus screenshots, HTML, and a manifest. The download URL is a 7-day presign against storage you administer. Bundles never transit a third-party scraping cloud.
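As a sketch, the presign request tira would make against an S3-compatible endpoint (for instance via boto3's generate_presigned_url) looks like this. The bucket name, key layout, and bundle_download_request helper are illustrative assumptions, not the real contract; only the 7-day expiry is from the doc.

```python
from datetime import timedelta

SEVEN_DAYS = int(timedelta(days=7).total_seconds())  # 604800 seconds

def bundle_download_request(bucket: str, run_id: str) -> dict:
    """Hypothetical shape of the presign call for a run's bundle.

    The link expires after seven days; the object and the bucket
    stay under your administration the whole time.
    """
    return {
        "method": "get_object",
        "params": {"Bucket": bucket, "Key": f"runs/{run_id}/items.zip"},
        "expires_in": SEVEN_DAYS,
    }

req = bundle_download_request("e11-captures", "a1b2c3")
```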

03

Egress modes you can name.

direct for customer-authorised scans, vpn for geo-sensitive or rate-limited targets, privileged for WordPress rebuild captures. Six EU Mullvad WireGuard configs in a swap-on-error pool, kill-switch enforced via iptables before the tunnel comes up. Egress is a deliberate choice, not a vendor default.
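The swap-on-error pool is simple to model. A minimal sketch, assuming round-robin rotation over the six configs (the VpnPool class and config filenames are hypothetical; the real fleet also enforces the iptables kill-switch before any tunnel comes up, which is not modelled here):

```python
from itertools import cycle

class VpnPool:
    """Illustrative swap-on-error egress pool over WireGuard configs."""

    def __init__(self, configs: list) -> None:
        self._cycle = cycle(configs)
        self.active = next(self._cycle)

    def swap_on_error(self) -> str:
        # A blocked or failing exit rotates to the next EU config;
        # the worker retries the target from a fresh address.
        self.active = next(self._cycle)
        return self.active

pool = VpnPool([f"mullvad-eu-{i}.conf" for i in range(1, 7)])
first = pool.active
second = pool.swap_on_error()
```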

04

Recipes are code, not config in a UI.

generic, compras_mx, phoenix_capture, url_scout — each is a Celery task with an input schema and a typed manifest. Adding a recipe means writing a Python module and registering it. There is no central template engine to placate.
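A minimal sketch of what "recipes are code" means in practice. The decorator and registry shape here are hypothetical (the real recipes are Celery tasks registered with the harvester); the point is that a recipe is a schema plus a task body in an ordinary Python module.

```python
RECIPES: dict = {}

def recipe(name: str, schema: dict):
    """Illustrative registration decorator, not the real harvester API.

    tira renders the registered schema as a typed form; submitting the
    form dispatches the task body onto the run's unique queue.
    """
    def register(fn):
        RECIPES[name] = {"schema": schema, "task": fn}
        return fn
    return register

@recipe("url_scout", schema={"discovery": "json", "pagination": "json"})
def url_scout(params: dict) -> dict:
    # Body elided: adding a recipe is writing a module like this one
    # and registering it, with no template engine in between.
    return {"recipe": "url_scout", "params": params}

form_fields = RECIPES["url_scout"]["schema"]
```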

05

Honest about hostile targets.

Datacenter-IP blocking by WAFs is real. Hetzner's ASN and Mullvad exits both get 403'd by some hosts (Indian-government Akamai and Imperva, Cloudflare-fronted login walls). url_scout detects the abort, writes the reason to the manifest, and the operator falls back to a residential browser. We don't pretend the open web is uniformly reachable.

The primitive

Three things you can name. Ephemeral.

tira is built on a small, opinionated trio — a recipe to plan with, a run to dispatch into, and a worker that exists only as long as its run does. Everything else in the UI is a server action against one of those three.

01 · Plan

Recipe

/recipes/[name]

A typed input schema plus a Celery task. tira renders the schema as a form — shard count for compras_mx, privileged toggle and WP creds for phoenix_capture, discovery and pagination JSON for url_scout. Submit fires a server action that calls POST /jobs on the harvester API.

02 · Dispatch

Run

/runs/[id]

One submission, one ephemeral worker, one Celery queue. tira's run page tails progress (done plus skipped, with a +N resumed chip on MX restarts), shows live cost in euros, and surfaces revoked or watchdog-abandoned tasks instead of leaving them as ghost-queued rows.

03 · Reap

Worker

/workers

Read-only fleet view of the cpx42 ephemerals currently alive — task id, queue, ipv4, heartbeat freshness. Every box is a thirty-cents-an-hour rental that exists only as long as its task is in-flight. The watchdog reaps on terminal state inside seconds.
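The watchdog's decision is a small state machine. A sketch under stated assumptions: the watchdog_action function is hypothetical, but the terminal states are Celery's, and the 1200s boot ceiling matches the MAX_BOOT_WAIT budget described later (its companion knob, BOOT_GRACE=600s, shields workers that are still installing).

```python
TERMINAL = {"SUCCESS", "FAILURE", "REVOKED"}  # Celery terminal states

def watchdog_action(task_state: str, age_s: float,
                    max_boot_wait_s: float = 1200) -> str:
    """Illustrative task-state-aware reap decision for one worker."""
    if task_state in TERMINAL:
        return "reap"  # run finished: delete the cpx42 within seconds
    if task_state == "PENDING" and age_s > max_boot_wait_s:
        return "reap"  # never left the broker: stop paying for the box
    return "wait"      # in-flight, or still inside the boot budget

decision = watchdog_action("SUCCESS", age_s=42)
```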

How it fits the fleet

The capture seam every other tool reads from.

tira is rarely the surface a customer thinks they bought — but it's the seam where their corpus enters the fleet. What lands in a bundle here grounds what Phoenix rebuilds, what Discovery scores, what Architect remembers.

Phoenix

WordPress rebuild captures fire phoenix_capture in privileged mode. Admin creds transit memory only — never Redis, never logs, shredded on shutdown via a systemd unit. The bundle feeds punah's content normalizer; the tech fingerprint feeds Dhara's intelligence graph.

Discovery

Every privileged capture writes a tech fingerprint (WordPress version, active theme, plugins) into the top level of the manifest. Discovery consumes it through a planned hook, so a single capture compounds into CVE surface for the same host.

Outreach

Competitive intelligence sweeps and prospect scrapes run as generic batch or url_scout jobs. Outreach reads the resulting items.zip without owning capture infra of its own.

Alerts

Workers post run-level events to tira's /api/v1/harvester/events endpoint, which forwards them into e11-alerts. Bundle-ready emails carry a clickable presigned link. Operator stays out of the inbox until something needs a decision.

Architect

Capture bundles destined for a workspace memory get ingested through Architect's import flow. The matter that owns the work knows where its source corpus came from and which run produced it.

Operator

tira is operator-pattern by design — a Next.js shell that server-renders from harvester-api over HTTPS with a bearer token. No DB in-app. Same two-layer client shape as the rest of the admin plane, so a senior engineer reads it once and knows the wiring.

Surfaces & contracts

Six things you actually call.

Five routes a person opens, one ingress a worker posts to. Phase 0 is deliberately small; Postgres, lab mode, and admin mutation all come after it.

/dashboard

Fleet pulse.

Tiles for API health, active runs, active workers, and runs in the last 24h, plus a recent-runs table and a workers strip. The first surface a human opens to answer one question: is anything on fire right now?

/recipes

What you can fire.

List of registered recipes with schema previews. Row click opens the recipe page with a typed form — shard count, resume-from job id for MX, discovery and pagination JSON for scout. Submit is a server action, not a client fetch.

/runs

Sortable history.

Filterable table over Redis-backed run state plus Celery's authoritative result backend. Status is inferred from celery-task-meta when a recipe doesn't write progress hashes itself, so generic runs no longer linger as queued forever.
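The inference can be sketched as a two-source merge. This is an illustrative model with hypothetical key names; the real record is Celery's celery-task-meta-&lt;id&gt; entry in Redis DB 1, which the result backend writes regardless of what the recipe does.

```python
import json
from typing import Optional

def infer_status(run_hash: dict, celery_meta_raw: Optional[str]) -> str:
    """Illustrative status merge for the /runs table.

    A recipe that writes its own progress hash wins; otherwise the
    Celery result backend is authoritative, so generic runs resolve
    to SUCCESS / FAILURE / REVOKED instead of lingering as queued.
    """
    if run_hash.get("status"):
        return run_hash["status"]
    if celery_meta_raw:
        meta = json.loads(celery_meta_raw)  # celery-task-meta-<id> payload
        return meta.get("status", "QUEUED")
    return "QUEUED"

status = infer_status({}, '{"status": "SUCCESS", "task_id": "abc"}')
```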

/runs/[id]

Live progress and bundle.

Tailed log lines, per-URL counts (done, skipped, failed), live cost chip in euros against server type, and a download button that resolves to a presigned S3 URL when the bundle lands.

/workers

What is alive right now.

Read-only view of the ephemeral fleet — task id, queue, ipv4, heartbeat age, mode (direct, vpn, privileged). Provision and drain controls are deliberately absent in Phase 0; the operator API is the path to mutation.

/api/v1/harvester/events

Worker telemetry ingress.

POST-only with x-api-key. Accepts events in the harvester namespace plus an idempotency key, forwards into alerts-api with a 3s timeout, returns 202 even if the forward failed. Alerts liveness does not gate the caller.
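The contract is small enough to sketch whole. The handler shape, header value, and in-memory dedup set below are illustrative assumptions; the behaviour they model (key check, idempotent drop, forward that may fail, unconditional 202) is the documented one.

```python
_seen: set = set()  # illustrative stand-in for real idempotency storage

def ingest_event(headers: dict, body: dict, forward) -> int:
    """Illustrative model of the /api/v1/harvester/events contract."""
    if headers.get("x-api-key") != "expected-key":  # placeholder secret
        return 401
    key = body.get("idempotency_key")
    if key in _seen:
        return 202           # duplicate delivery: ack, do not re-forward
    _seen.add(key)
    try:
        forward(body)        # real code forwards to alerts-api, 3s timeout
    except Exception:
        pass                 # alerts liveness never gates the caller
    return 202

def failing_forward(event: dict) -> None:
    raise RuntimeError("alerts-api down")

code = ingest_event({"x-api-key": "expected-key"},
                    {"idempotency_key": "run-1:bundle_ready"},
                    failing_forward)
```

Note that the worker gets 202 even though the forward raised: the event is accepted at the seam, and alerts delivery is someone else's problem.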

Senior engineering, visible

The proofs are in the substrate.

Five decisions visible in the provisioner, the watchdog, the entrypoint, and the deploy shape — not adjectives, design choices.

One worker per run, no pool.

Every POST /jobs and POST /phoenix/captures provisions its own cpx42 on a unique Celery queue, executes the task, and is reaped on terminal state. Cross-run contention is impossible by construction. A 22k-URL Mexico tender scrape costs roughly one euro of fleet time.
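The cost claim is plain arithmetic. A sketch assuming the thirty-cents-an-hour figure cited for cpx42 ephemerals (exact Hetzner pricing varies; run_cost_eur is an illustrative helper, priced to the minute as the cost events are):

```python
def run_cost_eur(elapsed_s: float, hourly_rate_eur: float = 0.30) -> float:
    """Illustrative per-run fleet cost, billed by the minute."""
    minutes = elapsed_s / 60
    return round(minutes * (hourly_rate_eur / 60), 4)

# A large tender scrape that holds its worker for ~3.5 hours:
cost = run_cost_eur(3.5 * 3600)
```

At that rate a worker held for three and a half hours costs about EUR 1.05, which is the order of magnitude behind "a 22k-URL scrape costs roughly one euro of fleet time".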

Cold-boot is budgeted, not optimised away.

Six minutes of apt, Docker install, and a 2GB image pull from the private registry. The watchdog runs with BOOT_GRACE=600s and MAX_BOOT_WAIT=1200s; shorter values would reap workers mid-boot. The substrate is honest about what physics costs.

Two-layer client, server-only by import.

lib/harvester.ts holds the low-level fetch and reads HARVESTER_API_BASE_URL plus the bearer at module load. server/tira/*.ts wraps every call behind requireTiraSession() and starts each file with import "server-only". The browser cannot construct a harvester request even by mistake.

Authoritative terminal state from the broker.

Recipes that don't write started_at or finished_at are still surfaced correctly because list_runs cross-references celery-task-meta in Redis DB 1. SUCCESS, FAILURE, and REVOKED are read from one place. Ghost queued-forever rows are a fixed bug, not a known limitation.

Build local, deploy boring.

Phase 0 builds the image on the eleven11 box itself with pull_policy: never, because the alternative — adding insecure-registries to dockerd — would restart the daemon and disturb 46 production containers. Phase 1 moves to GHCR with the standard docker-publish and deploy-ssh workflows. The trade-off is documented, not hidden.

Who this is for

Teams whose capture is load-bearing.

tira earns its keep when the cost of a black-box scraping vendor — bills, brittleness, attribution risk — starts to exceed the cost of running an ephemeral fleet you administer.

Phoenix-rebuild engagements where a WordPress site has to be captured into a typed bundle before any rewriting starts.
Content-migration projects whose source CMS is alive but whose vendor's export is missing custom post types, taxonomies, or media.
Competitive-intelligence sweeps that need recurring captures with a recipe you can read, not a SaaS scraper that changes selectors without telling you.
Tender-registry and procurement-data programmes where resume-safety matters and a 22k-URL run is a normal Tuesday.
Security and threat-research workflows where IP attribution is part of the threat model and a shared scraping cloud is a non-starter.

FAQ

Final friction, reduced.

Is tira a SaaS we can sign up to?

No. tira is the operator UI for our internal harvester PaaS, partner-deployed alongside an Eleven11 engagement. If you have a capture programme that warrants its own substrate, we discuss it as a build, not a subscription. Talk to us about scope.

How do you handle hostile targets?

url_scout detects abort signals — Cloudflare challenges, Turnstile, reCAPTCHA, hCaptcha, 403/404/5xx — before extraction wastes time. For broad-spectrum WAF blocking we don't pretend a residential pool is free; the operator captures URLs from a residential browser and fires the detail recipe directly. Honest beats clever.

What happens to credentials in a privileged Phoenix capture?

Creds enter through the tira form over TLS, transit ctrl process memory only, are baked into hcloud user-data for the worker, land at /etc/e11/creds.env (0600), and are shredded by a systemd unit before shutdown. Never in Redis, never in logs, never on ctrl disk.
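The worker-side end of that flow can be sketched with stdlib calls. The helper names and the temp path are hypothetical (the real file lives at /etc/e11/creds.env and the shred runs from a systemd unit at shutdown); the 0600 create and overwrite-before-unlink are the mechanics being illustrated.

```python
import os
import stat
import tempfile

def write_creds(env_text: str, path: str) -> None:
    """Create the creds file 0600 from the start, never chmod after."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(env_text)

def shred(path: str) -> None:
    """Overwrite with zeroes before unlink, so creds never linger."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        f.write(b"\x00" * size)
    os.remove(path)

tmp = os.path.join(tempfile.mkdtemp(), "creds.env")
write_creds("WP_USER=admin\nWP_PASS=secret\n", tmp)
mode = stat.S_IMODE(os.stat(tmp).st_mode)
shred(tmp)
```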

What's on the roadmap that isn't in Phase 0?

A persistent recipe-authoring lab — an always-up worker with a noVNC headful browser and a recipe-editor pane in tira, so iterating selectors takes seconds instead of cold-booting a fresh VM each cycle. Sketched as L1 through L4 chunks. After that: Postgres-backed run history, a proper cost rollup for fan-out parents, and GHCR CI.

Discuss tira

Bring your capture work home.

tira is partner-deployed today — internal to Eleven11 engagements that need their own capture substrate. Talk to us about a Phoenix rebuild, a tender programme, or a recurring intelligence sweep.

Direct line

Consultation requests stay owned. We reply from e11 after reviewing fit and timing.