Base 42 — Run Together

Launch day is just Day 1. Base 42 is our managed-ops layer that keeps your product fast, secure, and always shipping. It blends light Site-Reliability Engineering (SRE) with product-ops so adoption and uptime rise in tandem.

1 · Why Base 42?

Production Pain	How Base 42 Solves It
“Who's on pager duty?”	24 × 7 on-call rotation, first-line triage in < 15 min.
Bug found → no one can roll back	Git tag + blue/green release flow with instant revert.
Metrics in ten dashboards, none linked to features	Unified Grafana board tying KPIs, errors, logs, and cost to each epic.
Roadmap stalls under unplanned maintenance	Monthly optimisation sprint keeps feature work at ≥ 80 % of dev budget.

2 · Managed-Hosting Options

Tier	Stack	Ideal For	SLA¹	Typical Cost²
BYOC (Bring your own cloud)	AWS, Azure, GCP, on-prem Docker/K8s	Enterprises with internal DevOps	Advisory only	5–10 % of dev spend
Edge Cloud (default)	Fly.io + Postgres	SaaS & fintech MVPs needing global latency	99.9 %	12–16 %
Hybrid	Fly.io front, AWS data plane, Salesforce or SAP integrations	Regulated workloads	99.95 % core, 99.9 % edge	15–20 %
Full-Service	Edge Cloud + CI/CD + DaaS squad	Scale-ups wanting "no-ops"	99.95 % + hot-fix < 1 h	18–22 %

¹ Monthly uptime.

² Percentage of monthly engineering run-rate (DaaS or Pod).

All tiers include Datadog, Sentry, Statuspage and automated SSL rotation.
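For a sense of scale, the monthly uptime figures in the SLA column can be converted into a downtime allowance. The sketch below assumes a 30-day month, and the function name is illustrative rather than part of any Base 42 tooling.

```python
# Sketch: convert a monthly uptime SLA into an allowed-downtime budget.
# Assumption: a 30-day month; the helper name is hypothetical.

def allowed_downtime_minutes(sla_percent: float, days_in_month: int = 30) -> float:
    """Return the minutes of downtime a monthly uptime SLA permits."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - sla_percent / 100)

for tier, sla in [("Edge Cloud", 99.9), ("Full-Service", 99.95)]:
    print(f"{tier}: {allowed_downtime_minutes(sla):.1f} min/month")
# Edge Cloud: 43.2 min/month
# Full-Service: 21.6 min/month
```

Under that assumption, 99.9 % leaves roughly 43 minutes of downtime per month and 99.95 % roughly 22.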

3 · Ops Lifecycle

Phase	What Happens	Artefacts
Product-Ops Sprint (monthly)	Prioritise bugs, A/B tests, infra chores	Ranked "Run Backlog"
Blue/Green Release	Traffic shift with auto-rollback if health < 99 %	Release notes
Telemetry & Alerts	Datadog SLOs, Sentry error triage	Real-time dashboard
Incident Response	PagerDuty rotation, Slack war-room	RCA doc in 24 h
Post-Mortem & Tech-Debt Ticket	Blameless review, ticket into Forge backlog	Jira ticket linked to RCA
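The auto-rollback rule in the Blue/Green Release row can be pictured as a stepped traffic shift guarded by a health check. This is a minimal sketch, not the actual release tooling: the step sizes, the `probe` hook, and the definition of health as the share of passing health checks are all assumptions.

```python
from typing import Callable, Sequence, Tuple

HEALTH_THRESHOLD = 0.99  # from the table: roll back if health < 99 %

def health_ratio(ok: int, total: int) -> float:
    """Share of successful health checks; no samples counts as unhealthy."""
    return ok / total if total else 0.0

def blue_green_shift(
    probe: Callable[[int], Tuple[int, int]],
    steps: Sequence[int] = (10, 50, 100),
) -> str:
    """Shift traffic to green step by step; revert to blue on poor health.

    `probe(percent)` is a hypothetical hook returning (ok, total)
    health-check counts observed while `percent` of traffic is on green.
    """
    for percent in steps:
        ok, total = probe(percent)
        if health_ratio(ok, total) < HEALTH_THRESHOLD:
            return f"rolled back at {percent}%"
    return "green is live"

# A healthy release passes every step.
print(blue_green_shift(lambda p: (1000, 1000)))  # green is live
# Health drops to 98 % once half the traffic is on green.
print(blue_green_shift(lambda p: (980, 1000) if p >= 50 else (1000, 1000)))
# rolled back at 50%
```

Treating an empty sample as unhealthy is a deliberately conservative choice: a release that produces no health data is reverted rather than promoted.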

4 · Reliability & Performance Budgets

Error Budget: p95 latency target (default 300 ms API, 3 s PWA).
Change Failure Rate: goal ≤ 5 %.
Mean Time to Restore (MTTR): P1 ≤ 2 h, P2 ≤ 8 h.
Traffic Surge Policy: auto-scale 3× in < 60 s on Fly.io; static warm pool for AWS.
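The latency, change-failure, and MTTR targets above could be checked against a reporting window along these lines. It is a sketch under stated assumptions: the function names and sample data are illustrative, p95 uses the nearest-rank method, and a "failed" deploy is taken to mean one that needed a rollback or hot-fix.

```python
# Sketch only: function names and sample data are illustrative assumptions.

P95_TARGET_MS = {"api": 300, "pwa": 3000}   # defaults quoted above
MAX_CHANGE_FAILURE_RATE = 0.05              # goal: at most 5 %
MTTR_LIMIT_HOURS = {"P1": 2.0, "P2": 8.0}

def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]

def change_failure_rate(failed_deploys, total_deploys):
    """Share of deploys that needed a rollback or hot-fix (assumed definition)."""
    return failed_deploys / total_deploys if total_deploys else 0.0

def mttr_hours(restore_hours):
    """Mean time to restore over a non-empty list of incident durations."""
    return sum(restore_hours) / len(restore_hours)

def within_budgets(surface, latencies_ms, failed, total, priority, restore_hours):
    """True when all three budgets hold for the given window."""
    return (
        p95(latencies_ms) <= P95_TARGET_MS[surface]
        and change_failure_rate(failed, total) <= MAX_CHANGE_FAILURE_RATE
        and mttr_hours(restore_hours) <= MTTR_LIMIT_HOURS[priority]
    )

latencies = [120] * 95 + [450] * 5   # 95 fast calls, 5 slow ones
print(p95(latencies))                # 120
print(within_budgets("api", latencies, 1, 40, "P1", [1.5, 2.0]))  # True
```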

5 · Security & Compliance

Control	Implementation
Secrets Management	Doppler or AWS Secrets Manager — no keys in env files.
Vulnerability Scans	Nightly Snyk scans + GitHub Dependabot auto-PRs for dependency bumps.
Data Encryption	TLS 1.3 in transit; AES-256 at rest.
Audit Log	Immutable CloudTrail / Fly Log-Drains; 30-day hot, 1-year cold.
Compliance Support	SOC-2 / ISO 27001 questionnaire pack; HIPAA BAA addendum.

6 · Optimisation & Cost Control

Usage-to-Cost Dashboard: unit cost per 1 k requests, per tenant.
Weekly Anomaly Alerts: spend spike > 15 % triggers Slack ping.
Quarterly Infra Review: right-size instances, prune dead feature flags.
AI-assisted Index Tuning: pgvector and Postgres auto-suggested indexes applied after tests.

Scale-ups on the Full-Service tier cut infra cost per MAU by an average of 22 % in year 1.
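The per-1 k-request unit cost and the 15 % spike rule in this section reduce to two small calculations. The sketch below assumes the spike is measured against the previous week's spend; the function names and figures are made up for illustration, and the Slack notification itself is left out.

```python
# Sketch: unit cost and weekly spend-spike check.
# Assumptions: baseline = previous week's spend; names and figures are illustrative.

SPIKE_THRESHOLD = 0.15  # alert when spend rises by more than 15 %

def unit_cost_per_1k(spend: float, requests: int) -> float:
    """Cost per 1,000 requests for one tenant over a period."""
    return spend / requests * 1000

def spend_spike(current: float, baseline: float,
                threshold: float = SPIKE_THRESHOLD) -> bool:
    """True when spend grew by more than `threshold` over the baseline."""
    if baseline <= 0:
        return current > 0  # any spend on a zero baseline counts as a spike
    return (current - baseline) / baseline > threshold

print(spend_spike(1160.0, 1000.0))  # True  (16 % increase)
print(spend_spike(1150.0, 1000.0))  # False (exactly 15 %, not above it)
```

The comparison is strict, so a rise of exactly 15 % does not alert; whether the boundary should trigger is a policy choice the source does not specify.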

7 · Exit & Portability

14-day data export + Terraform scripts.
Handoff meeting, doc stack, and credentials.
Optional "Shadow Month" at 50 % fee for knowledge overlap.
No vendor lock-in: you own cloud accounts (Edge Cloud uses your Fly.io org).

8 · Mini Case

Retail Flash-Sale Platform: Black-Friday traffic spiked 18×; autoscale kept p95 < 280 ms with zero downtime. Infra spend rose only 31 % over baseline thanks to right-sizing.

9 · FAQs

10 · Call to Action

Want 99.9 % uptime without hiring an SRE team?

Book a Base 42 readiness call—get a tailored hosting plan and cost in 48 h.

Secure My Ops