Using Supabase service_role correctly: the production runbook.

2026-05-09 · vibecheck team · 12 min read · Database · Supabase

Quick answer: service_role bypasses Row-Level Security, so it's the right tool for legitimate server-side admin work: running migrations, processing webhooks, executing cron jobs, supporting "admin dashboard" features that need to read across users. It's the wrong tool for anything a user-scoped JWT could handle. The four rules: (1) service_role never enters the browser bundle or any file the bundler reads; it lives in server runtime config only; (2) every endpoint that uses service_role performs its own authorization check before doing anything; (3) per request, you use the service_role client only for the operations that actually need to bypass RLS; (4) every privileged operation gets logged. Skipping any of these recreates the lack-of-RLS bug at the application layer.

There are two common posts about Supabase service_role: "your service_role leaked, here's the rotation plan" and "never use service_role." Both are useful. Both miss the middle case — you genuinely need it, you're using it correctly server-side, and you want to make sure the next person on the codebase doesn't unwind that correctness.

This post is the middle case. When to use service_role, where to put it, the authorization shape that goes in front of it, and the logging discipline that makes audit possible.

When you actually need service_role

Four scenarios where the user's JWT can't do the work and service_role is the right escape hatch:

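Schema migrations. Changing tables, adding policies, granting permissions. None of this can be done with a user's JWT. Note that the DDL itself typically runs over a direct Postgres connection (for example, the Supabase CLI authenticating with the database password) rather than through the service_role API key; service_role is what the accompanying backfill and seed scripts use when they go through the API. Either way, migrations run from CI or your local machine, not from production code.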
Cron / scheduled jobs. Operations that run on behalf of "the system" — sending daily digest emails, expiring abandoned subscriptions, recomputing materialized views, reconciling external billing data. There's no end user; there's no JWT to forward.
Webhook handlers. A Stripe charge.succeeded arrives. You need to upgrade the user's account regardless of their session state. Stripe doesn't carry a Supabase JWT; you only have the user identifier in the webhook payload. The handler authenticates the webhook (HMAC signature) and then needs server-side privileges to make the upgrade — see the webhook secrets post for the verification side.
Cross-user admin features. An "admin dashboard" that lets a real human admin see all customer data, or a moderation interface that needs to read across users. RLS policies tied to auth.uid() can't help here — the operator's identity isn't the data owner.

If the operation can be done with the user's JWT and properly written RLS policies, do that instead. The RLS patterns post covers the policies that handle most "user reads/writes their own stuff" cases without needing service_role at all.

Where service_role lives (and where it doesn't)

The single most-violated rule. service_role belongs in runtime config — environment variables read by your server runtime, not values bundled into source code or browser-shipped JS. Specifically:

Yes:

Cloudflare Workers / Pages secrets via wrangler secret put SUPABASE_SERVICE_ROLE_KEY.
Vercel env vars marked "Sensitive" (not exposed via NEXT_PUBLIC_*).
Supabase Edge Functions secrets via supabase secrets set SUPABASE_SERVICE_ROLE_KEY=....
Your own server's process environment, populated at deploy time from a secrets manager (Doppler, 1Password Secrets, AWS Secrets Manager, GCP Secret Manager).
CI environment variables for migrations and one-off jobs, scoped per-environment (separate values for staging and production).

No:

NEXT_PUBLIC_*, VITE_*, EXPO_PUBLIC_*, REACT_APP_* — every framework has a public-prefix convention that ships the value to the browser. Names matter; double-check your convention.
.env files committed to source. .env.local in .gitignore is fine; .env.production in the repo is a leak waiting to happen. git log -S "eyJ" finds historical commits that touched JWT-shaped strings.
CI logs, where someone echoed the value during a debug session and forgot to delete the workflow run.
Slack messages between developers troubleshooting a deploy.
Browser localStorage, even in admin tools (you'll lose admin access whenever site data is cleared, and any XSS exfiltrates the key).

If you find service_role in a place from the second list, treat it as compromised — see the incident-response post. Rotate first, audit logs second, fix the deployment pattern third.
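
The public-prefix rule can also be checked mechanically before a deploy. The following is a minimal sketch, not part of any Supabase tooling: it assumes the key is a JWT-format Supabase key whose payload carries a role claim, and the function name findExposedServiceKeys is hypothetical. It decodes the payload without verifying the signature, because the goal is only to spot a service_role key sitting under a browser-shipped variable name.

```typescript
// Public-prefix conventions from the "No" list above; extend for your framework.
const PUBLIC_PREFIXES = ["NEXT_PUBLIC_", "VITE_", "EXPO_PUBLIC_", "REACT_APP_"];

// Read the `role` claim from a JWT-shaped string without verifying it.
// Returns null for anything that is not a decodable three-part JWT.
function jwtRole(value: string): string | null {
  const parts = value.split(".");
  if (parts.length !== 3) return null;
  try {
    // JWT segments are base64url; convert to base64 and restore padding.
    const b64 = parts[1].replace(/-/g, "+").replace(/_/g, "/");
    const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
    const payload: unknown = JSON.parse(atob(padded));
    if (typeof payload === "object" && payload !== null && "role" in payload) {
      const role = (payload as { role: unknown }).role;
      return typeof role === "string" ? role : null;
    }
    return null;
  } catch {
    return null;
  }
}

// Hypothetical helper: names of public-prefixed variables whose value
// decodes to a service_role JWT.
export function findExposedServiceKeys(
  env: Record<string, string | undefined>,
): string[] {
  return Object.entries(env)
    .filter(
      ([name, value]) =>
        PUBLIC_PREFIXES.some((prefix) => name.startsWith(prefix)) &&
        typeof value === "string" &&
        jwtRole(value) === "service_role",
    )
    .map(([name]) => name);
}
```

One way to use it is as a build step that fails when the returned list is non-empty. It only catches the naming mistake; keys pasted into source files or logs need a separate secret scan.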

The authorization layer that goes in front

The mistake that produces application-layer "no RLS" bugs: the developer correctly puts service_role server-only, then writes a route handler that uses the service_role client to fulfill any user request that comes in. The route handler validates that the user is logged in (or sometimes not even that) and trusts whatever they ask for.

An authenticated user hits POST /api/admin/list-all-users. The handler runs a query with service_role. RLS doesn't apply (because service_role). The handler returns every user. The endpoint exists because some legitimate flow needs admin to see all users. The bug: every authenticated user, not just admins, can hit it.

Fix: every endpoint that uses service_role does its own authorization check before doing anything else.

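// Next.js-style route handler. On Supabase Edge Functions (Deno) the same
// steps apply, with secrets read via Deno.env.get(...) instead of process.env.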
import { createClient } from "@supabase/supabase-js";

// Two clients on the same project: anon for "act as the caller",
// service for "bypass RLS to do admin work."
const anon = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!,
);
const service = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!,
);

export async function POST(req: Request) {
  // 1. Get the caller's JWT from the Authorization header.
  const auth = req.headers.get("authorization");
  if (!auth?.startsWith("Bearer ")) {
    return new Response("unauthorized", { status: 401 });
  }
  const token = auth.slice(7);

  // 2. Validate the JWT. getUser(token) checks the token with Supabase Auth
  //    but does NOT attach it to the shared anon client, so later queries
  //    on `anon` would still run as the anon role.
  const { data: { user }, error } = await anon.auth.getUser(token);
  if (error || !user) {
    return new Response("invalid token", { status: 401 });
  }

  // 3. Authorization check. This is the part most apps skip.
  //    Read the caller's role through a client that forwards their JWT,
  //    so RLS on profiles sees their auth.uid().
  const asCaller = createClient(
    process.env.SUPABASE_URL!,
    process.env.SUPABASE_ANON_KEY!,
    { global: { headers: { Authorization: `Bearer ${token}` } } },
  );
  const { data: profile } = await asCaller
    .from("profiles")
    .select("role")
    .eq("id", user.id)
    .single();
  if (profile?.role !== "admin") {
    return new Response("forbidden", { status: 403 });
  }

  // 4. ONLY NOW use service_role. The anon-key check confirmed the
  //    caller exists and is logged in; the role check confirmed they
  //    have admin privileges; service_role lets us bypass RLS to do
  //    the cross-user query.
  const { data, error: dbErr } = await service
    .from("users")
    .select("*");

  if (dbErr) return new Response(dbErr.message, { status: 500 });

  // 5. Audit log (covered below). Logged after the error check so the
  //    recorded count reflects rows actually returned.
  await logPrivilegedOp(user.id, "list-all-users", { count: data?.length ?? 0 });

  return Response.json(data);
}

Three things in that pattern matter:

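Two privilege levels side by side. The anon key validates the JWT, and the role lookup runs under the caller's RLS context (cleaner errors, RLS-correct). That only works if the caller's JWT is forwarded on the client doing the lookup; validating a token with getUser does not attach it to later queries on the shared anon client. The service client only runs after authorization passed.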
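The role check is against the user's own record. Don't trust client-supplied claims about the user's role. Read it from the database (or, if you mint custom JWT claims at login, validate the claim signature; see the JWT mistakes post). This also means users must not be able to update the role column on their own row, or the check can be bypassed with a single write.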
Service-role usage is scoped to the actual cross-user operation. If the same handler also fetches the caller's own profile, that read goes through the anon client with the user's JWT, not service_role. Only the parts that genuinely need to bypass RLS use service.
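
One way to keep the next person from writing a service_role handler without the check is to make the check the only way in. The following is a sketch under assumptions: requireRole, Caller, and ResolveCaller are hypothetical names, and resolveCaller stands in for the getUser call plus the profiles lookup from the handler above.

```typescript
type Caller = { id: string; role: string };

// Hypothetical dependency: turns a bearer token into a verified caller,
// or null when the token is invalid.
type ResolveCaller = (token: string) => Promise<Caller | null>;

type PrivilegedHandler = (req: Request, caller: Caller) => Promise<Response>;

// Wraps a handler so it runs only for callers holding the required role.
export function requireRole(
  role: string,
  resolveCaller: ResolveCaller,
  handler: PrivilegedHandler,
): (req: Request) => Promise<Response> {
  return async (req: Request): Promise<Response> => {
    // The auth scheme name is case-insensitive, so match it that way.
    const header = req.headers.get("authorization") ?? "";
    const match = /^Bearer\s+(.+)$/i.exec(header);
    if (!match) return new Response("unauthorized", { status: 401 });

    const caller = await resolveCaller(match[1]);
    if (!caller) return new Response("invalid token", { status: 401 });
    if (caller.role !== role) return new Response("forbidden", { status: 403 });

    // Only code reached past this point should touch the service client.
    return handler(req, caller);
  };
}
```

With a wrapper like this, a route would export something like requireRole("admin", resolveCaller, listAllUsers), and a review can flag any file that imports the service client without also importing the wrapper.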

Per-request client construction (when in doubt, scope it)

The pattern above caches the anon and service clients at module scope. That's fine for stateless Edge Functions — there's no shared state between concurrent requests. For long-running Node servers, prefer per-request client construction when the request needs the user's JWT injected into the auth context:

function userClient(token: string) {
  return createClient(
    process.env.SUPABASE_URL!,
    process.env.SUPABASE_ANON_KEY!,
    {
      global: { headers: { Authorization: `Bearer ${token}` } },
    },
  );
}

// Inside the handler:
const supabase = userClient(token);
const { data } = await supabase.from("notes").select("*");
// RLS sees the user's auth.uid() automatically; no manual scoping needed.

This is the "act as the user" pattern. RLS policies that filter on auth.uid() behave correctly because the JWT is in the request context. Use this for the 90% of endpoints that don't need service_role at all.

Webhooks: signature first, service_role second

Webhook handlers are the cleanest legitimate use of service_role — there's no user JWT to forward, but the work is still authenticated (by the webhook's HMAC signature). The pattern:

// app/api/stripe/webhook/route.ts
import Stripe from "stripe";
import { createClient } from "@supabase/supabase-js";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!, { apiVersion: "2024-04-10" });
const service = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!,
);

export async function POST(req: Request) {
  const sig = req.headers.get("stripe-signature");
  if (!sig) return new Response("missing signature", { status: 401 });
  const body = await req.text();

  // 1. Verify the webhook came from Stripe.
  let event: Stripe.Event;
  try {
    event = stripe.webhooks.constructEvent(body, sig, process.env.STRIPE_WEBHOOK_SECRET!);
  } catch {
    return new Response("invalid signature", { status: 401 });
  }

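  // 2. Now we know the request is real. Use service_role to update DB.
  if (event.type === "customer.subscription.created") {
    const sub = event.data.object;
    // Stripe retries and can deliver an event more than once, so upsert
    // instead of insert. Assumes a unique constraint on
    // stripe_subscription_id.
    const { error } = await service
      .from("subscriptions")
      .upsert(
        {
          stripe_subscription_id: sub.id,
          user_id: sub.metadata.user_id,
          status: sub.status,
          plan_id: sub.items.data[0].price.id,
        },
        { onConflict: "stripe_subscription_id" },
      );
    // supabase-js returns errors rather than throwing. A non-2xx response
    // tells Stripe to retry the delivery later.
    if (error) return new Response("db error", { status: 500 });
    await logPrivilegedOp(null, "stripe-subscription-created", { sub_id: sub.id });
  }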

  return new Response("ok");
}

The pattern: verify the inbound credential first; only after verification, use service_role to do work. Skipping the verification step recreates the "anyone can call this endpoint" bug — see the webhook secrets post for the depth on signature handling.
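
To make "verify the inbound credential" concrete, here is a sketch of the kind of check that sits behind a Stripe-style signature header: a timestamp plus an HMAC-SHA256 over the timestamp, a dot, and the raw body. The function name verifyStripeStyleSignature and its parameters are hypothetical, and this is an illustration of the scheme, not a replacement for stripe.webhooks.constructEvent, which is what production code should call.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Illustrative only. Header format assumed: "t=<unix seconds>,v1=<hex hmac>".
export function verifyStripeStyleSignature(
  rawBody: string,
  header: string,
  secret: string,
  toleranceSec = 300,
  nowSec = Math.floor(Date.now() / 1000),
): boolean {
  const parts = header.split(",").map((p) => p.trim().split("="));
  const t = parts.find(([k]) => k === "t")?.[1];
  const candidates = parts
    .filter(([k]) => k === "v1")
    .map(([, v]) => v)
    .filter((v): v is string => typeof v === "string");
  if (!t || candidates.length === 0) return false;

  // Reject stale or future-dated timestamps to limit replay.
  const ts = Number(t);
  if (!Number.isFinite(ts) || Math.abs(nowSec - ts) > toleranceSec) return false;

  // The signed payload is the timestamp, a dot, and the unparsed body.
  const expected = createHmac("sha256", secret)
    .update(`${t}.${rawBody}`)
    .digest();

  // Constant-time comparison; lengths must match before timingSafeEqual.
  return candidates.some((hex) => {
    const given = Buffer.from(hex, "hex");
    return given.length === expected.length && timingSafeEqual(given, expected);
  });
}
```

The detail that matters for the handler above is that the check runs over the raw body string. Parsing the JSON first and re-serializing it changes the bytes and makes a valid signature fail, which is why the handler reads req.text() before doing anything else.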

Audit logging

Every privileged operation should produce an audit row. The audit table itself uses the append-only pattern — INSERT allowed, UPDATE/DELETE refused at the trigger level even via service_role.

async function logPrivilegedOp(
  userId: string | null,
  operation: string,
  context: Record<string, unknown>,
) {
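  // supabase-js returns errors instead of throwing, so check explicitly;
  // otherwise a failed audit write passes silently.
  const { error } = await service.from("audit_events").insert({
    user_id: userId,
    operation,
    payload: context,
    created_at: new Date().toISOString(),
  });
  if (error) console.error("audit log write failed:", operation, error.message);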
}

The information you want in audit:

Who. The caller's user_id (or null for system / cron / webhook).
What. A short operation name (list-all-users, stripe-subscription-created, delete-account).
When. The timestamp.
Context. Enough metadata to reconstruct the operation. Counts (how many rows affected). Identifiers (the subscription ID, the user being modified). Don't log full request bodies — those have PII.

The point of audit isn't real-time alerting (use observability for that). It's post-incident: when something goes wrong six months from now, you can answer "what privileged operations ran in this window? Did anything unusual happen?" Without an audit log, that question is unanswerable.
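
The "don't log full request bodies" rule is easier to keep when the audit context is built from an allowlist rather than by deleting fields. A minimal sketch follows; auditContext is a hypothetical helper, not something the logging function above requires.

```typescript
type AuditValue = string | number | boolean | null;

// Keep only allowlisted keys, and only when the value is a scalar.
// Nested objects (where request bodies and PII tend to hide) are dropped.
export function auditContext(
  input: Record<string, unknown>,
  allowedKeys: readonly string[],
): Record<string, AuditValue> {
  const out: Record<string, AuditValue> = {};
  for (const key of allowedKeys) {
    const value = input[key];
    if (
      value === null ||
      typeof value === "string" ||
      typeof value === "number" ||
      typeof value === "boolean"
    ) {
      out[key] = value;
    }
  }
  return out;
}
```

For example, auditContext({ sub_id: "sub_123", email: "a@example.test", card: { last4: "4242" } }, ["sub_id", "count"]) keeps only sub_id. Adding a field to the audit trail then requires a deliberate change to the allowlist.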

Rotation strategy

Even with everything above correct, plan for rotation. Two reasons: a key might leak via a path you didn't anticipate, and quarterly/annual rotation is a hygiene practice that catches deployment-process bugs (you find out which env-var consumer didn't get updated before a real incident forces it).

1. Generate the new key. In the Supabase Dashboard, Project Settings → API lets you "Reveal" service_role, but there's no separate "new key" button on Supabase Cloud; you have to use the JWT secret rotation flow under Settings → API → JWT Secret. Rotating the JWT secret invalidates every existing key (anon + service_role) and issues new ones.
2. Roll out the new keys. Update SUPABASE_ANON_KEY and SUPABASE_SERVICE_ROLE_KEY in every environment (production, staging, local dev). Redeploy.
3. Verify. Smoke-test the deploy. Anon-key clients should still work and service-role-using endpoints should still work; existing user sessions are invalidated, so anyone who was logged in has to log in again.
4. Monitor. The first 24 hours after rotation are when you'll discover env-var consumers you forgot about.

If the rotation is being done because of a leak (vs. routine hygiene), the audit-log review during steps 3–4 is critical. Look for unauthorized usage during the exposure window.
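
For the verify and monitor steps, one option is to have each service report whether the key it is running with predates the rotation. The sketch below rests on an assumption worth checking against your own project: that the keys are JWT-format and carry an iat (issued-at) claim set when they were generated. The names keyIssuedAt and isStaleKey are hypothetical.

```typescript
// Read the `iat` claim (seconds since epoch) from a JWT-format key without
// verifying the signature. Returns null if the key cannot be decoded.
export function keyIssuedAt(key: string): number | null {
  const parts = key.split(".");
  if (parts.length !== 3) return null;
  try {
    // JWT segments are base64url; convert to base64 and restore padding.
    const b64 = parts[1].replace(/-/g, "+").replace(/_/g, "/");
    const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
    const payload: unknown = JSON.parse(atob(padded));
    if (typeof payload === "object" && payload !== null && "iat" in payload) {
      const iat = (payload as { iat: unknown }).iat;
      return typeof iat === "number" ? iat : null;
    }
    return null;
  } catch {
    return null;
  }
}

// A key counts as stale when it was issued before the rotation time.
// Undecodable keys are also reported as stale so a human looks at them.
export function isStaleKey(key: string, rotatedAtSec: number): boolean {
  const iat = keyIssuedAt(key);
  return iat === null || iat < rotatedAtSec;
}
```

A startup log line or health check built on isStaleKey should report only the boolean result, never the key itself.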

What vibecheck can verify

vibecheck's secrets detector (supabase_service_role_in_client) checks rule #1 — that service_role isn't in the browser bundle. It can't verify rule #2 (your endpoints actually authorize before bypassing RLS) without seeing your server-side code, but it can verify the symptoms: tables that respond with rows to anonymous requests get flagged via the Supabase RLS probe. If the underlying issue is "no RLS at all," it shows up there. If the underlying issue is "we use service_role on a public endpoint," it shows up as "the endpoint returned data to vibecheck despite RLS" — same finding, different cause.

For the rest, the only tool is human review. Static analysis tools can help (any SAST that flags service_role usage in route handlers without a preceding role check); even a self-review pass against this checklist before merging is high-leverage.
