Every millisecond of latency costs you. In AI applications, where users expect instant responses and real-time interactions, the difference between a server in Virginia and one in your user's city can mean the difference between a seamless experience and an abandoned session.

This isn't theoretical. We run asabove.tech on Cloudflare Pages. Our voice agents process audio through Workers. Email routing happens at the edge. API endpoints respond in under 50ms globally. The infrastructure that would have required a team of DevOps engineers five years ago now deploys with a single command.

This guide is the practical walkthrough I wish I'd had, covering everything from your first "hello world" Worker to production deployments handling real traffic. We'll use actual code from our infrastructure, not sanitized examples. By the end, you'll understand not just how to use Workers, but when they're the right choice (and when they're not).

1. What Is Edge Computing and Why It Matters

Traditional cloud computing centralizes processing in a few massive data centers. Your server might be in us-east-1 (Virginia), and a user in Tokyo experiences 150ms+ of latency just from the physics of light traveling through fiber optic cables. That's before your application even starts processing.

Edge computing flips this model. Instead of users coming to your server, your code runs on servers distributed across the globe, as close to users as possible. Cloudflare operates 300+ data centers in 100+ countries. When a user in Tokyo makes a request, it's handled by a server in Tokyo. When someone in São Paulo requests the same endpoint, a server in São Paulo handles it.

Traditional Architecture
─────────────────────────
Tokyo User ─────────────────┐                                150ms
London User ────────────────┼─────────> [ Virginia Server ]  100ms
São Paulo User ─────────────┘                                200ms

Edge Architecture
─────────────────
Tokyo User ────────> [ Tokyo Edge ]      ~10ms
London User ───────> [ London Edge ]     ~10ms
São Paulo User ────> [ São Paulo Edge ]  ~10ms

Why This Matters for AI Applications

AI applications have latency requirements that make edge computing particularly valuable. Conversational and voice interfaces make many round trips per session, so overhead in connection setup, authentication, and routing is paid repeatedly and felt directly.

💡 The Real-World Impact

Our voice agent worker handles initial WebSocket connections at the edge. The connection setup (authentication, session creation, initial state) happens in the nearest data center. Only the actual AI inference calls go to centralized GPU servers. The result: voice interactions feel instantaneous even for users on the other side of the world.

Edge Computing Trade-offs

Edge isn't magic. Understanding the trade-offs helps you architect correctly:

| Advantage | Trade-off |
| --- | --- |
| Low latency globally | Limited CPU time per request (varies by plan) |
| Automatic scaling | No persistent connections to databases (must use HTTP/REST) |
| Zero cold starts | Execution environment is V8 isolates, not full containers |
| Built-in DDoS protection | Storage options are eventually consistent (for global distribution) |
| Simple deployment | Not all npm packages work (no Node.js APIs like fs, net) |

The key insight: edge is for request handling, not heavy computation. Use edge workers to route, validate, transform, cache, and respond quickly. Use traditional servers (or serverless functions with longer timeouts) for heavy lifting like AI inference.

2. Cloudflare Workers vs Traditional Hosting

Before diving into code, let's compare Workers to alternatives you might consider. Each has its placeβ€”the goal is choosing the right tool.

Cloudflare Workers
Edge Functions

V8 isolates running in 300+ locations. Zero cold starts, millisecond spin-up. Best for request handling, API routing, and lightweight processing.

  • Cold start: 0ms
  • CPU limit: 10-30ms (free), 30s (paid)
  • Memory: 128MB
  • Locations: 300+

Best For

  • API gateways and routing
  • Authentication and authorization
  • Request/response transformation
  • Static site enhancement (A/B testing, personalization)
  • Webhook handlers
  • Simple AI inference with Workers AI

AWS Lambda / Google Cloud Functions
Traditional Serverless

Container-based serverless with full runtime support. Higher limits but cold starts and regional deployment.

  • Cold start: 100-500ms typical
  • Timeout: 15 min (Lambda)
  • Memory: up to 10GB
  • Locations: ~30 regions

Best For

  • Long-running processes
  • Heavy computation
  • Complex dependencies (native modules)
  • Direct database connections
  • Jobs requiring more than 128MB memory

Traditional VPS / Containers
Always-On Servers

EC2, DigitalOcean Droplets, or containers on Kubernetes. Full control, persistent connections, but you manage scaling and availability.

  • Cold start: N/A (always running)
  • Timeout: unlimited
  • Memory: whatever you pay for
  • Locations: as many as you deploy

Best For

  • Persistent WebSocket connections
  • Stateful applications
  • GPU-based AI inference
  • Database servers
  • Applications requiring specific OS features

Decision Framework

Use this mental model when choosing:

🤔 Where Should This Code Run?

  • Request: Is it handling HTTP requests that need fast, global responses? → Yes: Workers are ideal. Fast, cheap, scales automatically.
  • Duration: Does the operation take more than 30 seconds? → Yes: Use traditional serverless or containers.
  • State: Does it need persistent connections or in-memory state? → Yes: Use Durable Objects (edge) or containers (heavy).
  • Compute: Does it need GPUs, heavy CPU, or lots of memory? → Yes: Run on dedicated GPU servers, call from an edge Worker.

✅ The Hybrid Pattern

The most effective architectures combine edge and centralized compute. Workers handle the request layer (authentication, routing, caching, response formatting) while heavier operations happen on specialized servers. Your Worker becomes a smart orchestrator, not just a dumb proxy.
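A minimal sketch of that orchestrator shape, with the GPU backend call injected as a plain function (`runInference`, the in-memory Map, and the URL scheme here are illustrative stand-ins, not Cloudflare APIs):

```typescript
// Hypothetical hybrid Worker: the edge handles auth and caching, a separate
// GPU tier does inference. `runInference` stands in for your backend call.
type Backend = (prompt: string) => Promise<string>;

export function createOrchestrator(apiKey: string, runInference: Backend) {
  const cache = new Map<string, string>(); // stand-in for a KV binding

  return {
    async fetch(request: Request): Promise<Response> {
      // 1. Authenticate at the edge: bad requests never reach the backend
      if (request.headers.get("Authorization") !== `Bearer ${apiKey}`) {
        return new Response("Unauthorized", { status: 401 });
      }
      const prompt = new URL(request.url).searchParams.get("prompt") ?? "";

      // 2. Serve cached answers without touching the GPU tier
      const hit = cache.get(prompt);
      if (hit !== undefined) return Response.json({ answer: hit, cached: true });

      // 3. Only now pay for the heavy, centralized call
      const answer = await runInference(prompt);
      cache.set(prompt, answer);
      return Response.json({ answer, cached: false });
    },
  };
}
```

In a real Worker the Map would be a KV or cache binding and `runInference` a fetch to your inference service; the split, cheap checks first, expensive call last, is the point.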

3. Setting Up Wrangler CLI

Wrangler is Cloudflare's CLI for managing Workers. It handles local development, deployment, secrets, and configuration. Let's get it running.

Installation

Terminal
# Install globally with npm
npm install -g wrangler

# Or use npx (no global install needed)
npx wrangler --version

# Authenticate with your Cloudflare account
wrangler login

The wrangler login command opens a browser window for OAuth authentication. After approving, Wrangler stores credentials locally and you're ready to deploy.

Project Structure

A typical Workers project looks like this:

my-worker/
├── src/
│   └── index.ts      # Your worker code
├── wrangler.toml     # Configuration
├── package.json      # Dependencies
└── tsconfig.json     # TypeScript config (optional)

Configuration: wrangler.toml

The wrangler.toml file is the heart of your Worker configuration:

wrangler.toml
# Basic configuration
name = "my-worker"
main = "src/index.ts"
compatibility_date = "2024-01-01"

# Account info (optional - defaults to authenticated account)
account_id = "your-account-id"

# Environment-specific settings
[env.production]
name = "my-worker-production"
route = "api.example.com/*"

[env.staging]
name = "my-worker-staging"
route = "staging-api.example.com/*"

# KV namespace bindings
[[kv_namespaces]]
binding = "CACHE"
id = "abc123..."

# Environment variables (non-secret)
[vars]
ENVIRONMENT = "development"
API_VERSION = "v1"

# Durable Objects
[[durable_objects.bindings]]
name = "SESSIONS"
class_name = "SessionManager"

# Workers AI
[ai]
binding = "AI"

Essential Commands

| Command | Purpose |
| --- | --- |
| wrangler dev | Start local development server with hot reload |
| wrangler deploy | Deploy to production |
| wrangler deploy --env staging | Deploy to a specific environment |
| wrangler tail | Live-stream logs from production |
| wrangler secret put API_KEY | Add an encrypted secret |
| wrangler kv:namespace create CACHE | Create a KV namespace |
| wrangler pages deploy ./dist | Deploy a static site to Pages |

💡 Local Development

wrangler dev runs a local server that closely mimics the production environment, including access to KV, Durable Objects, and other bindings. It's fast, supports hot reload, and catches most issues before deployment. Use wrangler dev --remote to test against actual production bindings (useful for debugging data issues).

4. Your First Worker: Hello World to Production

Let's build a Worker from scratch, understanding each part along the way. We'll start simple and add complexity until we have something production-ready.

The Simplest Worker

src/index.ts
export default {
  async fetch(request: Request): Promise<Response> {
    return new Response("Hello, World!");
  }
};

That's it. A Worker is an object with a fetch handler that receives a Request and returns a Response. The interface mirrors the web-standard Fetch API: if you know how to use fetch() in the browser, you understand the basics.
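Because the handler is a plain function over the web-standard Request and Response types, you can call it directly, for instance in Node 18+, which ships those globals, with no Cloudflare tooling involved:

```typescript
// The same shape as the Worker above: an object with an async fetch handler.
const worker = {
  async fetch(request: Request): Promise<Response> {
    return new Response("Hello, World!");
  },
};

// Invoke it the way the runtime would:
const response = await worker.fetch(new Request("https://example.com/"));
console.log(await response.text()); // "Hello, World!"
```

This is also the easiest way to unit-test handlers: construct a Request, call fetch, assert on the Response.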

Adding Request Handling

Let's make it actually do something useful:

src/index.ts
export default {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);
    
    // Route based on path
    switch (url.pathname) {
      case "/":
        return new Response("Welcome to the API");
      
      case "/health":
        return Response.json({ status: "healthy", timestamp: Date.now() });
      
      case "/echo": {
        if (request.method !== "POST") {
          return new Response("Method not allowed", { status: 405 });
        }
        const body = await request.json();
        return Response.json({ received: body });
      }
      
      default:
        return new Response("Not found", { status: 404 });
    }
  }
};
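A switch reads fine for a handful of paths. Past that, a lookup table keeps routing flat; a sketch of the same routes expressed as data (same semantics, just restructured):

```typescript
type Handler = (request: Request) => Promise<Response> | Response;

// Same routes as the switch above, as a lookup table instead of control flow.
const routes: Record<string, Handler> = {
  "/": () => new Response("Welcome to the API"),
  "/health": () => Response.json({ status: "healthy", timestamp: Date.now() }),
  "/echo": async (request) => {
    if (request.method !== "POST") {
      return new Response("Method not allowed", { status: 405 });
    }
    return Response.json({ received: await request.json() });
  },
};

const router = {
  async fetch(request: Request): Promise<Response> {
    const handler = routes[new URL(request.url).pathname];
    return handler ? handler(request) : new Response("Not found", { status: 404 });
  },
};
```

Adding a route becomes a one-line change, and the table can be split across modules as the API grows.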

Adding Type Safety and Environment

Production Workers need type-safe access to bindings (KV, secrets, etc.):

src/index.ts
// Define your environment bindings
interface Env {
  // Secrets (set via wrangler secret put)
  API_KEY: string;
  
  // KV namespaces
  CACHE: KVNamespace;
  
  // Environment variables (from wrangler.toml [vars])
  ENVIRONMENT: string;
  
  // Workers AI binding
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    // Now you have type-safe access to all bindings
    const apiKey = env.API_KEY;
    const cached = await env.CACHE.get("some-key");
    
    return new Response(`Environment: ${env.ENVIRONMENT}`);
  }
};

A Production-Ready API Worker

Here's a more complete example with error handling, CORS, logging, and proper structure:

src/index.ts
interface Env {
  API_KEY: string;
  CACHE: KVNamespace;
  ALLOWED_ORIGINS: string;
}

// CORS headers for browser requests
function corsHeaders(origin: string | null, allowedOrigins: string): HeadersInit {
  const allowed = allowedOrigins.split(",");
  const allowOrigin = origin && allowed.includes(origin) ? origin : allowed[0];
  
  return {
    "Access-Control-Allow-Origin": allowOrigin,
    "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type, Authorization",
  };
}

// Error response helper
function errorResponse(message: string, status: number, headers: HeadersInit = {}): Response {
  return Response.json(
    { error: message, status },
    { status, headers }
  );
}

// Main handler
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const url = new URL(request.url);
    const origin = request.headers.get("Origin");
    const cors = corsHeaders(origin, env.ALLOWED_ORIGINS);
    
    // Handle CORS preflight
    if (request.method === "OPTIONS") {
      return new Response(null, { status: 204, headers: cors });
    }
    
    try {
      // Route to handlers
      let response: Response;
      
      if (url.pathname === "/api/data") {
        response = await handleData(request, env);
      } else if (url.pathname.startsWith("/api/cache")) {
        response = await handleCache(request, env, url);
      } else {
        response = errorResponse("Not found", 404);
      }
      
      // Add CORS headers to all responses
      const newHeaders = new Headers(response.headers);
      Object.entries(cors).forEach(([key, value]) => newHeaders.set(key, value));
      
      return new Response(response.body, {
        status: response.status,
        headers: newHeaders
      });
      
    } catch (error) {
      console.error("Worker error:", error);
      return errorResponse("Internal server error", 500, cors);
    }
  }
};

// Handler functions
async function handleData(request: Request, env: Env): Promise<Response> {
  // Validate API key
  const authHeader = request.headers.get("Authorization");
  if (authHeader !== `Bearer ${env.API_KEY}`) {
    return errorResponse("Unauthorized", 401);
  }
  
  // Process request
  return Response.json({
    message: "Authenticated successfully",
    timestamp: new Date().toISOString()
  });
}

async function handleCache(request: Request, env: Env, url: URL): Promise<Response> {
  const key = url.pathname.replace("/api/cache/", "");
  
  if (request.method === "GET") {
    const value = await env.CACHE.get(key);
    if (!value) {
      return errorResponse("Not found", 404);
    }
    return Response.json({ key, value: JSON.parse(value) });
  }
  
  if (request.method === "POST") {
    const body = await request.json();
    await env.CACHE.put(key, JSON.stringify(body), { expirationTtl: 3600 });
    return Response.json({ success: true, key });
  }
  
  return errorResponse("Method not allowed", 405);
}

Deploy to Production

Terminal
# Set your secret
wrangler secret put API_KEY
# Enter: your-secret-api-key

# Deploy
wrangler deploy

# Watch logs
wrangler tail
✅ You Just Deployed Globally

That wrangler deploy command pushed your code to 300+ data centers worldwide. Users in Tokyo, London, and São Paulo all get sub-50ms response times. No load balancers to configure, no auto-scaling rules to tune, no multi-region replication to manage. It just works.

5. Cloudflare Pages for Static Sites

While Workers handle dynamic requests, Cloudflare Pages is optimized for static sites and frontend applications. It's what we use for asabove.tech, and it's remarkably simple for what it delivers.

🌐 Our Deployment: asabove.tech
https://asabove.tech

Static site built with vanilla HTML/CSS/JS, deployed via Git integration. Automatic deployments on push, preview URLs for branches, global CDN distribution. Total monthly cost: $0 (free tier).

Pages vs Workers: When to Use Which

| Use Case | Cloudflare Pages | Cloudflare Workers |
| --- | --- | --- |
| Static websites | Best choice | Overkill |
| SPAs (React, Vue) | Best choice | Can work, but why? |
| API endpoints | Pages Functions | Best choice |
| Full-stack apps | Pages + Functions | Workers + KV/D1 |
| Real-time features | Not supported | Best choice |

Setting Up Pages

Option 1: Git Integration (Recommended)

  1. Push your site to GitHub or GitLab
  2. Go to Cloudflare Dashboard → Pages → Create a project
  3. Connect your repository
  4. Configure build settings (if using a framework)
  5. Deploy

Every push to your main branch triggers a deployment. Pull requests get preview URLs. It's the simplest CI/CD setup possible.

Option 2: Direct Upload with Wrangler

Terminal
# Deploy a directory directly
wrangler pages deploy ./dist

# Create a new project
wrangler pages project create my-site

# Deploy to specific project
wrangler pages deploy ./dist --project-name my-site

Build Configuration for Frameworks

| Framework | Build Command | Output Directory |
| --- | --- | --- |
| React (CRA) | npm run build | build |
| Next.js (static) | next build && next export | out |
| Vue | npm run build | dist |
| Astro | npm run build | dist |
| SvelteKit (static) | npm run build | build |
| Plain HTML | None | / (root) |

Pages Functions: Adding Server-Side Logic

Pages supports "Functions": Workers that run alongside your static site. Create a functions directory, and files become API endpoints:

my-site/
├── public/
│   ├── index.html
│   └── styles.css
├── functions/
│   ├── api/
│   │   ├── hello.ts        → /api/hello
│   │   └── users/
│   │       └── [id].ts     → /api/users/:id
│   └── _middleware.ts      → Runs on all requests
└── package.json
functions/api/hello.ts
export async function onRequest(context) {
  return new Response(JSON.stringify({
    message: "Hello from Pages Functions!",
    timestamp: new Date().toISOString()
  }), {
    headers: { "Content-Type": "application/json" }
  });
}
functions/api/users/[id].ts
// Dynamic route: /api/users/123 → params.id = "123"
export async function onRequest(context) {
  const { params } = context;
  
  return new Response(JSON.stringify({
    userId: params.id,
    // In reality, fetch from database
    name: `User ${params.id}`
  }), {
    headers: { "Content-Type": "application/json" }
  });
}
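The tree above also shows a `_middleware.ts`, which runs before every Function. A minimal sketch that adds security headers to each response via `context.next()` (the trimmed `MiddlewareContext` interface here is a stand-in for the richer object Pages actually passes, which also carries params, env, and waitUntil):

```typescript
// functions/_middleware.ts (sketch): wrap every response from the site.
interface MiddlewareContext {
  request: Request;
  next: () => Promise<Response>; // invokes the matched Function or static asset
}

export async function onRequest(context: MiddlewareContext): Promise<Response> {
  const response = await context.next();

  // Copy the response so its headers are mutable, then attach security headers.
  const wrapped = new Response(response.body, response);
  wrapped.headers.set("X-Frame-Options", "DENY");
  wrapped.headers.set("X-Content-Type-Options", "nosniff");
  return wrapped;
}
```

The same pattern works for auth checks or logging: inspect `context.request`, decide whether to short-circuit with your own Response or pass through to `next()`.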

Custom Domain Setup

  1. In Pages project settings, go to "Custom domains"
  2. Add your domain (e.g., asabove.tech)
  3. If using Cloudflare DNS: automatic configuration
  4. If using external DNS: add the provided CNAME record
💡 Free SSL, Automatic Renewal

Cloudflare automatically provisions and renews SSL certificates for custom domains. No configuration needed, no certificate management, no renewal reminders. It's included in the free tier.

6. Email Workers and Routing

One of Cloudflare's hidden gems: Email Routing with Workers. You can receive emails at your domain and process them with codeβ€”parsing, forwarding, triggering workflows, storing data.

📧 Our Deployment: Email Routing Worker
*@asabove.tech → Processing → Forwarding

All emails to our domain route through a Worker. Auto-replies, spam filtering, intelligent forwarding based on content, webhook triggers for support tickets. Zero external email service dependencies.

Setting Up Email Routing

  1. Enable Email Routing: In Cloudflare Dashboard, go to your domain → Email → Email Routing → Enable
  2. Add DNS records: Cloudflare will prompt you to add MX records (automatic if using Cloudflare DNS)
  3. Create catch-all rule: Route all addresses to a Worker

Basic Email Worker

wrangler.toml
name = "email-handler"
main = "src/index.ts"
compatibility_date = "2024-01-01"

# Receiving email needs no extra wrangler.toml section: deploy the Worker,
# then point an Email Routing rule (or the catch-all) at it in the dashboard.
src/index.ts
src/index.ts
// Inbound messages are ForwardableEmailMessage (from @cloudflare/workers-types),
// aliased here so the handler signature below reads naturally.
import type { ForwardableEmailMessage as EmailMessage } from "@cloudflare/workers-types";

interface Env {
  FORWARD_TO: string;
  KV_EMAILS: KVNamespace;
}

export default {
  async email(message: EmailMessage, env: Env, ctx: ExecutionContext) {
    // Parse email details
    const from = message.from;
    const to = message.to;
    const subject = message.headers.get("subject") || "(no subject)";
    
    console.log(`Email received: ${from} → ${to}: ${subject}`);
    
    // Get the raw MIME body (unused here, but handy for parsing content or attachments)
    const rawEmail = await new Response(message.raw).text();
    
    // Store in KV for logging
    await env.KV_EMAILS.put(
      `email:${Date.now()}`,
      JSON.stringify({ from, to, subject, receivedAt: new Date().toISOString() }),
      { expirationTtl: 86400 * 30 } // 30 days
    );
    
    // Forward to destination
    await message.forward(env.FORWARD_TO);
  }
};

Advanced Email Handling

Here's a more sophisticated example with routing logic, auto-replies, and webhook triggers:

src/index.ts
// Inbound messages are ForwardableEmailMessage (from @cloudflare/workers-types);
// mimetext would only be needed if this Worker composed outbound replies.
import type { ForwardableEmailMessage as EmailMessage } from "@cloudflare/workers-types";

interface Env {
  // Forwarding destinations
  SUPPORT_EMAIL: string;
  SALES_EMAIL: string;
  DEFAULT_EMAIL: string;
  
  // Webhook for notifications
  WEBHOOK_URL: string;
  
  // Storage
  KV_EMAILS: KVNamespace;
}

// Route emails based on recipient address
function getDestination(toAddress: string, env: Env): string {
  const localPart = toAddress.split("@")[0].toLowerCase();
  
  const routes: Record<string, string> = {
    "support": env.SUPPORT_EMAIL,
    "help": env.SUPPORT_EMAIL,
    "sales": env.SALES_EMAIL,
    "contact": env.SALES_EMAIL,
    "info": env.DEFAULT_EMAIL,
  };
  
  return routes[localPart] || env.DEFAULT_EMAIL;
}

// Check if email looks like spam
function isLikelySpam(from: string, subject: string): boolean {
  const spamIndicators = [
    /\b(viagra|cialis|lottery|winner|prince|inheritance)\b/i,
    /^.{0,5}$/, // Very short subject
    /URGENT.*REPLY/i,
  ];
  
  return spamIndicators.some(pattern => pattern.test(subject));
}

// Send webhook notification
async function notifyWebhook(data: object, webhookUrl: string): Promise<void> {
  await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(data)
  });
}

export default {
  async email(message: EmailMessage, env: Env, ctx: ExecutionContext) {
    const from = message.from;
    const to = message.to;
    const subject = message.headers.get("subject") || "(no subject)";
    
    console.log(`Processing: ${from} → ${to}: ${subject}`);
    
    // Spam check
    if (isLikelySpam(from, subject)) {
      console.log("Rejected as spam");
      message.setReject("Message rejected");
      return;
    }
    
    // Determine destination
    const destination = getDestination(to, env);
    
    // Log to KV
    const emailId = `email:${Date.now()}:${Math.random().toString(36).slice(2)}`;
    await env.KV_EMAILS.put(emailId, JSON.stringify({
      from,
      to,
      subject,
      destination,
      receivedAt: new Date().toISOString()
    }));
    
    // Send webhook notification (non-blocking)
    ctx.waitUntil(
      notifyWebhook({
        type: "email_received",
        from,
        to,
        subject,
        destination,
        id: emailId
      }, env.WEBHOOK_URL)
    );
    
    // Forward the email
    await message.forward(destination);
    
    console.log(`Forwarded to ${destination}`);
  }
};

Email Routing Patterns

📬 Common Email Routing Patterns

  • Support: support@, help@, bugs@ → ticketing-system webhook + forward to the support team. Auto-create tickets, assign priority based on content.
  • Sales: sales@, contact@, info@ → CRM webhook + forward to sales. Log leads, auto-respond with availability.
  • Transactional: noreply@, notifications@ → parse, extract data, store in database. Process receipts, shipping notifications, alerts.
  • Catch-all: *@ → log unknown addresses, forward to admin. Discover typos, track attempted deliveries.

⚠️ Email Worker Limitations
  • Can receive and forward emails, but not send arbitrary outbound emails
  • Size limit: 25MB per email
  • Rate limit: 100,000 emails/day on free tier
  • For sending emails, pair with an outbound service (SendGrid, SES, etc.)
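For that outbound side, a Worker can call a provider's HTTP API directly with fetch. A sketch of assembling a SendGrid-style v3 mail/send payload (the field names follow SendGrid's public API; adjust for your provider, and note the endpoint call below is illustrative):

```typescript
// Hypothetical helper: assemble the JSON body for a SendGrid v3 /mail/send call.
interface OutboundMail {
  to: string;
  from: string;
  subject: string;
  text: string;
}

export function buildSendGridPayload(mail: OutboundMail) {
  return {
    personalizations: [{ to: [{ email: mail.to }] }],
    from: { email: mail.from },
    subject: mail.subject,
    content: [{ type: "text/plain", value: mail.text }],
  };
}

// In a Worker you would then POST it, e.g.:
// await fetch("https://api.sendgrid.com/v3/mail/send", {
//   method: "POST",
//   headers: {
//     "Authorization": `Bearer ${env.SENDGRID_KEY}`, // secret via wrangler secret put
//     "Content-Type": "application/json",
//   },
//   body: JSON.stringify(buildSendGridPayload(mail)),
// });
```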

7. KV Storage and Durable Objects

Edge workers are stateless by default; each request is independent. For persistence, Cloudflare provides two primary options: KV for simple key-value storage and Durable Objects for stateful, coordinated workloads.

Workers KV: Global Key-Value Storage

KV is eventually consistent, globally distributed key-value storage. Think of it as a global Redis with ~60-second eventual consistency.

| Characteristic | Value |
| --- | --- |
| Max key size | 512 bytes |
| Max value size | 25MB |
| Reads | Eventually consistent (~60s), cached at edge |
| Writes | Propagate globally within 60 seconds |
| Free tier | 100,000 reads/day, 1,000 writes/day |

Setting Up KV

Terminal
# Create a namespace
wrangler kv:namespace create CACHE

# Create a preview namespace (for wrangler dev)
wrangler kv:namespace create CACHE --preview

# List namespaces
wrangler kv:namespace list
wrangler.toml
[[kv_namespaces]]
binding = "CACHE"
id = "abc123..."  # From create command output
preview_id = "def456..."  # From preview create output

Using KV in Your Worker

src/index.ts
interface Env {
  CACHE: KVNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    
    // Simple get
    const value = await env.CACHE.get("my-key");
    
    // Get with type
    const jsonValue = await env.CACHE.get("my-json", "json");
    
    // Get with metadata
    const { value: val, metadata } = await env.CACHE.getWithMetadata("my-key");
    
    // Put with expiration
    await env.CACHE.put("temp-key", "temp-value", {
      expirationTtl: 3600 // 1 hour in seconds
    });
    
    // Put with metadata
    await env.CACHE.put("user:123", JSON.stringify({ name: "Alice" }), {
      metadata: { createdAt: Date.now(), version: 1 }
    });
    
    // Delete
    await env.CACHE.delete("old-key");
    
    // List keys (with prefix)
    const keys = await env.CACHE.list({ prefix: "user:", limit: 100 });
    
    return Response.json({ keys: keys.keys });
  }
};

KV Best Practices

  • Treat KV as a cache, not a source of truth: a given key can only be written about once per second, and reads may be up to ~60 seconds stale.
  • Namespace keys with prefixes (user:, session:, cache:) so list({ prefix }) can scan related entries.
  • Set expirationTtl on anything that can go stale so dead data cleans itself up.
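The workhorse KV pattern is cache-aside: read the cache, fall back to the origin on a miss, write back with a TTL. A sketch against a minimal KV-shaped interface (`KVLike` is a stand-in so the helper runs anywhere; in a Worker, `env.CACHE` already satisfies it):

```typescript
// Trimmed-down KV surface: just what cache-aside needs.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

export async function cacheAside<T>(
  kv: KVLike,
  key: string,
  loader: () => Promise<T>, // hits the slow origin only on a miss
  ttlSeconds = 3600
): Promise<T> {
  const hit = await kv.get(key);
  if (hit !== null) return JSON.parse(hit) as T;

  const fresh = await loader();
  await kv.put(key, JSON.stringify(fresh), { expirationTtl: ttlSeconds });
  return fresh;
}
```

Because KV is eventually consistent, treat the cached value as possibly ~60 seconds stale; don't use this for counters or anything needing read-after-write.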

Durable Objects: Stateful Edge Computing

Durable Objects are the answer to "but what if I need real state?" Each Durable Object is a single-threaded JavaScript instance with its own strongly consistent, transactional storage. An object has a unique ID and lives in one location; all requests to it route there, ensuring consistency.

When to Use Durable Objects

| Use Case | Why Durable Objects |
| --- | --- |
| Real-time collaboration | Single source of truth for document state |
| Game servers | Consistent game state, no race conditions |
| Rate limiting | Accurate counters per user/IP |
| Session management | WebSocket connections + persistent state |
| Distributed locks | Coordination without external databases |
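The rate-limiting row comes down to a counter in a time window, which a Durable Object keeps accurate because every request for a given key reaches the same instance. The core logic, pulled out of the Durable Object wrapper and with the clock injected so it can run (and be tested) anywhere:

```typescript
// Fixed-window limiter: allow up to `limit` hits per `windowMs` window.
export class FixedWindowLimiter {
  private windowStart = 0;
  private count = 0;

  constructor(
    private limit: number,
    private windowMs: number,
    private now: () => number = Date.now
  ) {}

  allow(): boolean {
    const t = this.now();
    if (t - this.windowStart >= this.windowMs) {
      // New window: reset the counter
      this.windowStart = t;
      this.count = 0;
    }
    return ++this.count <= this.limit;
  }
}
```

In a Durable Object you would hold one limiter per object (e.g. idFromName(clientIp)) and, if limits must survive eviction, persist windowStart and count with state.storage.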

Durable Object Example: Session Manager

src/session.ts
export class SessionManager {
  state: DurableObjectState;
  sessions: Map<string, WebSocket> = new Map();
  
  constructor(state: DurableObjectState) {
    this.state = state;
  }
  
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);
    
    if (url.pathname === "/websocket") {
      return this.handleWebSocket(request);
    }
    
    if (url.pathname === "/broadcast") {
      const message = await request.text();
      this.broadcast(message);
      return new Response("Broadcasted");
    }
    
    return new Response("Not found", { status: 404 });
  }
  
  async handleWebSocket(request: Request): Promise<Response> {
    const upgradeHeader = request.headers.get("Upgrade");
    if (upgradeHeader !== "websocket") {
      return new Response("Expected WebSocket", { status: 400 });
    }
    
    const [client, server] = Object.values(new WebSocketPair());
    const sessionId = crypto.randomUUID();
    
    server.accept();
    this.sessions.set(sessionId, server);
    
    server.addEventListener("message", (event) => {
      // Handle incoming messages
      console.log(`Message from ${sessionId}: ${event.data}`);
      
      // Echo back or process
      this.broadcast(`[${sessionId}]: ${event.data}`);
    });
    
    server.addEventListener("close", () => {
      this.sessions.delete(sessionId);
      console.log(`Session ${sessionId} closed`);
    });
    
    // Store session count persistently
    await this.state.storage.put("sessionCount", this.sessions.size);
    
    return new Response(null, {
      status: 101,
      webSocket: client
    });
  }
  
  broadcast(message: string) {
    for (const [id, socket] of this.sessions) {
      try {
        socket.send(message);
      } catch (e) {
        this.sessions.delete(id);
      }
    }
  }
}
src/index.ts
export { SessionManager } from "./session";

interface Env {
  SESSIONS: DurableObjectNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    
    // Get room ID from path or query
    const roomId = url.searchParams.get("room") || "default";
    
    // Get the Durable Object for this room
    const id = env.SESSIONS.idFromName(roomId);
    const stub = env.SESSIONS.get(id);
    
    // Forward request to the Durable Object
    return stub.fetch(request);
  }
};
wrangler.toml
name = "my-app"
main = "src/index.ts"

[[durable_objects.bindings]]
name = "SESSIONS"
class_name = "SessionManager"

[[migrations]]
tag = "v1"
new_classes = ["SessionManager"]
💡 When to Use KV vs Durable Objects

KV: Read-heavy, eventually consistent, simple key-value patterns. Use for caching, configuration, session storage where slight staleness is OK.

Durable Objects: Strong consistency, real-time coordination, WebSockets. Use when you need "exactly once" semantics or real-time features.

8. Workers AI: Running Models at the Edge

Workers AI brings machine learning inference to the edge. Instead of calling external APIs with added latency, run models directly in Cloudflare's network: text generation, embeddings, image classification, speech recognition, and more.

🤖 Our Deployment: Voice Agent Worker
wss://voice.asabove.tech/agent

WebSocket-based voice agent using Workers AI for speech-to-text and embeddings. Audio processing at the edge means sub-200ms response times globally. Heavy inference calls route to our GPU servers.

Available Models

Cloudflare hosts various model categories:

| Category | Models | Use Cases |
| --- | --- | --- |
| Text Generation | Llama 3, Mistral, Gemma | Chatbots, content generation, summarization |
| Embeddings | BGE, E5 | Semantic search, RAG, similarity |
| Image | Stable Diffusion, ResNet | Generation, classification, description |
| Speech | Whisper | Transcription, voice interfaces |
| Translation | M2M-100 | Multi-language translation |

Setting Up Workers AI

wrangler.toml
name = "ai-worker"
main = "src/index.ts"
compatibility_date = "2024-01-01"

[ai]
binding = "AI"

Text Generation

src/index.ts
interface Env {
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { prompt } = await request.json();
    
    const response = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: prompt }
      ],
      max_tokens: 500
    });
    
    return Response.json(response);
  }
};

Streaming Responses

src/index.ts
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { prompt } = await request.json();
    
    // Stream the response
    const stream = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
      messages: [
        { role: "user", content: prompt }
      ],
      stream: true
    });
    
    return new Response(stream, {
      headers: { "Content-Type": "text/event-stream" }
    });
  }
};
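On the client, that text/event-stream body arrives as `data:` lines separated by blank lines, ending with a `[DONE]` sentinel in the OpenAI-style framing. A minimal parser sketch for already-decoded text (check your model's actual stream format; the `response` field used in the test is the common Workers AI token shape):

```typescript
// Extract the JSON payloads from a server-sent-events body.
// Assumes OpenAI-style framing: "data: <json>\n\n" ... "data: [DONE]\n\n".
export function parseSSE(body: string): unknown[] {
  const events: unknown[] = [];
  for (const line of body.split("\n")) {
    if (!line.startsWith("data: ")) continue; // skip blanks and comments
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break;          // end-of-stream sentinel
    events.push(JSON.parse(payload));
  }
  return events;
}
```

In a browser you would feed this from `response.body` via a TextDecoder, buffering until each blank-line boundary before parsing.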

Embeddings for Semantic Search

src/index.ts
interface Env {
  AI: Ai;
  VECTOR_INDEX: VectorizeIndex; // Cloudflare Vectorize
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { query } = await request.json();
    
    // Generate embedding for the query
    const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
      text: query
    });
    
    // Search vector database
    const results = await env.VECTOR_INDEX.query(embedding.data[0], {
      topK: 5,
      returnMetadata: true
    });
    
    return Response.json({ results: results.matches });
  }
};
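Vectorize does the nearest-neighbor search for you, but it helps to know what "similarity" means here: embedding vectors are typically compared by cosine similarity, the dot product normalized by both magnitudes. A small reference implementation:

```typescript
// Cosine similarity: 1 = same direction, 0 = orthogonal, -1 = opposite.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Because it measures direction rather than length, two texts phrased at different lengths can still score high if they mean similar things.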

Speech-to-Text with Whisper

src/index.ts
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Get audio data from request
    const audioData = await request.arrayBuffer();
    
    // Transcribe
    const result = await env.AI.run("@cf/openai/whisper", {
      audio: [...new Uint8Array(audioData)]
    });
    
    return Response.json({
      text: result.text,
      language: result.language
    });
  }
};

AI Gateway: Logging and Caching

Cloudflare AI Gateway sits in front of AI providers (OpenAI, Anthropic, Workers AI) and adds caching, rate limiting, retries, and per-request logging and analytics. Pointing at it is just a URL change:

src/index.ts
// Using AI Gateway with external providers
const response = await fetch(
  "https://gateway.ai.cloudflare.com/v1/YOUR_ACCOUNT/YOUR_GATEWAY/openai/chat/completions",
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${env.OPENAI_KEY}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [{ role: "user", content: "Hello!" }]
    })
  }
);
⚠️ Workers AI Limitations
  • Smaller models than hosted services (8B params vs 70B+)
  • No fine-tuning (use pre-trained models only)
  • Longer inference times for complex models
  • Best for: embeddings, small generation tasks, preprocessing
  • Not ideal for: complex reasoning, long-form generation

9. Environment Variables and Secrets

Workers need configuration: API keys, feature flags, environment-specific settings. Cloudflare provides two mechanisms: environment variables (plain text, in wrangler.toml) and secrets (encrypted, never in your codebase).

Environment Variables

Non-sensitive configuration goes in wrangler.toml:

wrangler.toml
[vars]
ENVIRONMENT = "development"
API_VERSION = "v2"
MAX_RESULTS = "100"
FEATURE_NEW_UI = "true"

[env.production.vars]
ENVIRONMENT = "production"
MAX_RESULTS = "50"

[env.staging.vars]
ENVIRONMENT = "staging"

Secrets

Sensitive values (API keys, passwords, tokens) should never be in your codebase:

Terminal
# Add a secret
wrangler secret put API_KEY
# Prompts for value (not echoed)

# Add secret to specific environment
wrangler secret put API_KEY --env production

# List secrets (shows names, not values)
wrangler secret list

# Delete a secret
wrangler secret delete API_KEY

Using in Code

src/index.ts
interface Env {
  // Environment variables (from wrangler.toml [vars])
  ENVIRONMENT: string;
  API_VERSION: string;
  MAX_RESULTS: string;  // Note: always strings!
  FEATURE_NEW_UI: string;
  
  // Secrets (from wrangler secret put)
  API_KEY: string;
  DATABASE_URL: string;
  JWT_SECRET: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Access like any other property
    const isProduction = env.ENVIRONMENT === "production";
    const maxResults = parseInt(env.MAX_RESULTS, 10);
    const showNewUI = env.FEATURE_NEW_UI === "true";
    
    // Use secrets for auth
    const response = await fetch("https://api.external.com/data", {
      headers: { "Authorization": `Bearer ${env.API_KEY}` }
    });
    
    return new Response(`Environment: ${env.ENVIRONMENT}`);
  }
};
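
Since vars always arrive as strings, one useful pattern is to parse them once into a typed config object at the top of the handler. A small sketch reusing the fields from the Env interface above; the Config shape and the fallback default are my own, not part of the Workers API:

```typescript
// Parse string-typed env vars into a typed config once per request.
interface EnvVars {
  ENVIRONMENT: string;
  MAX_RESULTS: string;
  FEATURE_NEW_UI: string;
}

interface Config {
  isProduction: boolean;
  maxResults: number;
  showNewUI: boolean;
}

export function parseConfig(env: EnvVars): Config {
  const maxResults = parseInt(env.MAX_RESULTS, 10);
  return {
    isProduction: env.ENVIRONMENT === "production",
    // Fall back to a safe default if the var is missing or malformed.
    maxResults: Number.isNaN(maxResults) ? 10 : maxResults,
    showNewUI: env.FEATURE_NEW_UI === "true",
  };
}
```

Centralizing the parsing means a typo in wrangler.toml degrades to a sane default instead of NaN propagating through your handler.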

Environment-Specific Deployments

wrangler.toml
name = "my-api"
main = "src/index.ts"

# Default (development)
[vars]
ENVIRONMENT = "development"
LOG_LEVEL = "debug"

# Production environment
[env.production]
name = "my-api-production"
routes = [{ pattern = "api.example.com/*", zone_name = "example.com" }]

[env.production.vars]
ENVIRONMENT = "production"
LOG_LEVEL = "error"

# Staging environment
[env.staging]
name = "my-api-staging"
routes = [{ pattern = "staging.example.com/*", zone_name = "example.com" }]

[env.staging.vars]
ENVIRONMENT = "staging"
LOG_LEVEL = "info"
Terminal
# Deploy to development (default)
wrangler deploy

# Deploy to staging
wrangler deploy --env staging

# Deploy to production
wrangler deploy --env production

# Secrets are environment-specific
wrangler secret put API_KEY --env production
wrangler secret put API_KEY --env staging
πŸ’‘ Local Development with Secrets

For local development, create a .dev.vars file (add to .gitignore!):

API_KEY=dev-key-for-testing
DATABASE_URL=postgres://localhost:5432/dev

Wrangler automatically loads this file during wrangler dev.

10. Custom Domains and DNS

Connecting Workers and Pages to custom domains requires DNS configuration. Cloudflare makes this trivial if you're using their DNSβ€”slightly more involved with external DNS.

Option 1: Cloudflare-Managed DNS (Recommended)

If your domain's nameservers point to Cloudflare:

  1. Add your domain to Cloudflare (free plan works)
  2. Update nameservers at your registrar
  3. Wait for propagation (minutes to hours)
  4. Workers and Pages can now route to your domain automatically

Adding Routes to Workers

wrangler.toml
# Simple route
route = "api.example.com/*"

# Multiple routes
routes = [
  { pattern = "api.example.com/*", zone_name = "example.com" },
  { pattern = "api.example.org/*", zone_name = "example.org" }
]

# Custom domain (automatic SSL)
[[routes]]
pattern = "api.example.com/*"
custom_domain = true

Via Dashboard

  1. Go to Workers & Pages β†’ Your Worker β†’ Triggers
  2. Add route or custom domain
  3. Cloudflare creates necessary DNS records automatically

Option 2: Custom Domain on Pages

  1. In Pages project β†’ Custom Domains
  2. Add domain (e.g., asabove.tech)
  3. Cloudflare handles SSL certificate provisioning
  4. If using Cloudflare DNS: automatic
  5. If external DNS: add the CNAME record shown

Option 3: External DNS

If you can't move DNS to Cloudflare, you can still use Workers:

External DNS Records
# For Workers
Type: CNAME
Name: api
Value: your-worker.your-subdomain.workers.dev

# For Pages
Type: CNAME
Name: www
Value: your-project.pages.dev
⚠️ External DNS Limitations

With external DNS, you lose some Cloudflare benefits: no automatic SSL on apex domains (only subdomains), no full DDoS protection, no analytics on the DNS level. For production, Cloudflare DNS is strongly recommended.

SSL/TLS Configuration

Cloudflare automatically provisions SSL certificates for custom domains. Configure encryption mode in your domain settings:

Mode Description Use When
Off No encryption Never (don't use)
Flexible HTTPS to Cloudflare, HTTP to origin Origin doesn't support HTTPS
Full HTTPS end-to-end (any cert) Self-signed cert at origin
Full (Strict) HTTPS end-to-end (valid cert) Production (recommended)

For Workers and Pages with no external origin, "Full (Strict)" works automaticallyβ€”there's no origin to worry about.

11. Real Examples: Our Deployments

Theory is nice, but let's look at how we actually use Workers in production. These are real deployments running right now.

🌐 asabove.tech - Static Site on Pages
https://asabove.tech

What it does:

  • Static content platform with articles, guides, and documentation
  • Zero JavaScript frameworksβ€”vanilla HTML/CSS for performance
  • Automatic deployments from Git
  • Preview URLs for every branch

Configuration:

  • Build command: None (static files)
  • Output directory: public/
  • Custom domain via Cloudflare DNS
  • Monthly cost: $0 (free tier)

Why Pages: Pure static content with no server-side logic needed. Git integration means content updates deploy automatically on push.

πŸŽ™οΈ Voice Agent Worker
wss://voice.asabove.tech/agent

What it does:

  • WebSocket endpoint for real-time voice interactions
  • Audio preprocessing and validation at the edge
  • Session management with Durable Objects
  • Routes transcription requests to Whisper (Workers AI)
  • Forwards complex inference to GPU backend

Architecture:

Client ──WebSocket──> [Edge Worker] ────────> [Durable Object: Session]
                           β”‚                           β”‚
                           β”œβ”€β”€ Audio Validation        β”œβ”€β”€ State Management
                           β”œβ”€β”€ Workers AI Whisper      β”œβ”€β”€ Context Window
                           └── Auth/Rate Limit         └── Response Buffering
                                                               β”‚
                                                               β–Ό
                                                    [GPU Backend Server]
                                                   (Complex LLM inference)

Why Workers + Durable Objects: WebSocket connections need persistent state. Durable Objects provide single-threaded, consistent session management. Edge preprocessing reduces latency for the voice-critical path.

πŸ“§ Email Routing Worker
*@asabove.tech

What it does:

  • Catch-all email routing for the domain
  • Intelligent forwarding based on address and content
  • Spam filtering with basic heuristics
  • Webhook notifications for specific patterns
  • Logging to KV for analytics

Routing Logic:

Incoming Email
      β”‚
      β–Ό
[Spam Check] ──No──> [Address Router]
      β”‚                     β”‚
     Yes          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
      β”‚           β”‚                   β”‚
   Reject     support@             sales@
              help@                contact@
                  β”‚                   β”‚
                  β–Ό                   β–Ό
             Forward to          Forward to
              Support            Sales Team
                  β”‚                   β”‚
                  β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
                      Log to KV
                    Webhook Notify

Why Email Workers: No external email service needed for receiving. Programmable routing beats static forwarding rules. Webhooks enable integration with ticketing systems, CRMs, etc.
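
A stripped-down version of this pattern, with the routing decision pulled into a pure function. The email(message, env, ctx) handler shape is the Email Workers entry point; the message is typed loosely here (ForwardableEmailMessage in @cloudflare/workers-types) to stay self-contained, and all addresses are placeholders, not our real routing table:

```typescript
// Hypothetical routing table; forwarding destinations are placeholders.
const ROUTES: Record<string, string> = {
  "support@asabove.tech": "support-team@example.com",
  "help@asabove.tech": "support-team@example.com",
  "sales@asabove.tech": "sales-team@example.com",
  "contact@asabove.tech": "sales-team@example.com",
};

// Pure routing decision: a forwarding address, or null to reject.
export function resolveRoute(to: string): string | null {
  return ROUTES[to.toLowerCase().trim()] ?? null;
}

// The slice of the inbound message API this sketch relies on.
interface EmailMessageLike {
  to: string;
  forward(address: string): Promise<void>;
  setReject(reason: string): void;
}

export default {
  async email(message: EmailMessageLike): Promise<void> {
    const dest = resolveRoute(message.to);
    if (dest === null) {
      message.setReject("Address not routed");
      return;
    }
    await message.forward(dest);
  },
};
```

Keeping resolveRoute pure means the routing logic can be unit-tested without the Workers runtime; spam checks and KV logging would layer on inside the handler.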

πŸ”Œ API Gateway Worker
https://api.asabove.tech/v1/*

What it does:

  • Authentication and API key validation
  • Rate limiting per user/IP
  • Request transformation and validation
  • Response caching in KV
  • Routing to various backend services
  • CORS handling

Pattern:

Simplified gateway logic
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    // 1. CORS preflight
    if (request.method === "OPTIONS") {
      return handleCors(request, env);
    }
    
    // 2. Authentication
    const auth = await validateAuth(request, env);
    if (!auth.valid) {
      return errorResponse("Unauthorized", 401);
    }
    
    // 3. Rate limiting
    const rateLimitOk = await checkRateLimit(auth.userId, env);
    if (!rateLimitOk) {
      return errorResponse("Rate limit exceeded", 429);
    }
    
    // 4. Check cache
    const cached = await getFromCache(request, env);
    if (cached) return cached;
    
    // 5. Route to backend
    const response = await routeToBackend(request, env);
    
    // 6. Cache response
    ctx.waitUntil(cacheResponse(request, response.clone(), env));
    
    return response;
  }
};
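
The helpers in the gateway are application-specific, but as an illustration, here is one way checkRateLimit could be built on KV. A fixed-window sketch, with the KV binding abstracted behind a minimal interface; the interface, limit, and window values are my assumptions, not our production configuration:

```typescript
// The small slice of the KV binding API this limiter needs, abstracted
// so the logic can also run outside the Workers runtime.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(
    key: string,
    value: string,
    opts?: { expirationTtl?: number }
  ): Promise<void>;
}

// Fixed-window limiter: allow `limit` requests per `windowSeconds`.
export async function checkRateLimit(
  userId: string,
  kv: KVLike,
  limit = 100,
  windowSeconds = 60
): Promise<boolean> {
  // One counter key per user per time window.
  const window = Math.floor(Date.now() / 1000 / windowSeconds);
  const key = `rl:${userId}:${window}`;
  const count = parseInt((await kv.get(key)) ?? "0", 10);
  if (count >= limit) return false;
  // Expire the counter after two windows so stale keys clean themselves up.
  await kv.put(key, String(count + 1), { expirationTtl: windowSeconds * 2 });
  return true;
}
```

Note that KV is eventually consistent across locations, so this enforces a soft limit; a strict global limit is better served by a Durable Object holding the counter.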

Why Workers for API Gateway: Authentication and rate limiting at the edge protects backend servers. Caching reduces origin load. CORS handling is consistent across all endpoints. All with sub-50ms overhead globally.

Deployment Workflow

Our deployment process for Workers:

πŸš€ Deployment Pipeline
Local Dev
wrangler dev with .dev.vars for secrets
Hot reload, local KV, test against production bindings with --remote
Staging
wrangler deploy --env staging
Separate Worker name, staging secrets, staging routes
Test
Manual testing + automated smoke tests against staging
Verify functionality before production push
Production
wrangler deploy --env production
Atomic deployment, instant rollback if needed
Monitor
wrangler tail --env production
Live logs, error tracking, performance monitoring in dashboard

12. Pricing and Free Tier Limits

Cloudflare's free tier is genuinely usable for productionβ€”not a trial that expires or degrades. Understanding the limits helps you plan.

Workers Pricing

Resource Free Tier Paid ($5/mo base)
Requests 100,000/day 10M included, $0.30/M after
CPU Time 10ms/request 30s/request (50ms included)
KV Reads 100,000/day 10M included, $0.50/M after
KV Writes 1,000/day 1M included, $5/M after
KV Storage 1GB 1GB included, $0.50/GB/mo after
Durable Objects Not available Requests: $0.15/M, Storage: $0.20/GB

Pages Pricing

Resource Free Tier Pro ($20/mo)
Bandwidth Unlimited Unlimited
Builds 500/month 5,000/month
Concurrent builds 1 5
Sites Unlimited Unlimited
Functions requests 100,000/day 10M/month included

Workers AI Pricing

Model Category Free Tier Pricing
Text Generation (small) 10,000 neurons/day $0.011/1K neurons
Text Generation (large) 10,000 neurons/day $0.022/1K neurons
Embeddings Included $0.00002/1K tokens
Speech-to-Text Included $0.003/minute
Image Generation Included $0.02/image
βœ… Real Cost Example

Our asabove.tech deployment (Pages + Workers + KV + Email Routing):

Monthly traffic: ~50,000 page views, ~10,000 API calls, ~500 emails
Monthly cost: $0 (well within free tier)

The free tier is substantial enough for many production workloads. You only pay when you scale significantly.

When Costs Grow

Scenarios where you'll exceed the free tier: sustained traffic above 100,000 requests/day, write-heavy KV workloads (the 1,000 writes/day cap is the tightest limit), or anything built on Durable Objects, which require the paid plan.

Even at scale, Workers is often cheaper than alternatives. Compare to Lambda ($0.20/M requests + compute time) or always-on servers ($5-50/mo minimum).

13. When NOT to Use Workers

Workers are powerful, but they're not the right choice for everything. Knowing when not to use them saves you from painful refactors.

Don't Use Workers For:

❌ Long-Running Tasks

The problem: Workers have a 30-second CPU time limit (paid plan). Long-running data processing, report generation, or batch jobs will time out.

Use instead: AWS Lambda (15min), Google Cloud Functions (9min), or dedicated servers for batch processing. Workers can trigger these jobs and handle the callback.

❌ GPU-Intensive AI

The problem: Workers AI offers smaller models. Complex reasoning, long-form generation, or fine-tuned models need more power.

Use instead: Dedicated GPU servers (Replicate, Modal, RunPod), or major provider APIs (OpenAI, Anthropic). Use Workers as a gateway for auth, caching, and response handling.

❌ Traditional Databases

The problem: Workers can't maintain persistent TCP connections to databases. Every request would open a new connectionβ€”catastrophic for connection limits.

Use instead: HTTP-based databases (PlanetScale, Neon's serverless driver, Cloudflare D1, Supabase). Or use Workers as an API layer in front of a traditional backend that manages database connections.
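
For instance, Cloudflare D1 is exposed to a Worker as a binding with a prepared-statement APIβ€”no connection to open or pool. A sketch, with the D1 surface reduced to a minimal structural type and a hypothetical users table (in a real Worker the binding comes from a [[d1_databases]] entry in wrangler.toml and arrives as env.DB):

```typescript
// Minimal structural type for the slice of the D1 API used below.
interface D1Like {
  prepare(query: string): {
    bind(...values: unknown[]): { all(): Promise<{ results: unknown[] }> };
  };
}

// Hypothetical lookup: the users table and its columns are assumptions.
export async function getUser(db: D1Like, id: string): Promise<unknown[]> {
  const { results } = await db
    .prepare("SELECT id, name FROM users WHERE id = ?")
    .bind(id)
    .all();
  return results;
}

// Inside a fetch handler: const rows = await getUser(env.DB, userId);
```

Because every statement is a self-contained HTTP-style call, there's no connection to leak and no pool to exhaustβ€”the failure mode that makes traditional database drivers a poor fit for Workers.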

❌ WebSocket Broadcasting

The problem: While Durable Objects support WebSockets, broadcasting to millions of connections isn't what they're designed for.

Use instead: Cloudflare Pub/Sub, dedicated WebSocket services (Pusher, Ably), or self-hosted solutions (Socket.io on servers). Workers can handle connection setup and message validation at the edge.

❌ Native Modules / Binary Dependencies

The problem: Workers run V8 isolates, not full Node.js. Native npm modules (compiled C/C++) won't work. No fs, net, child_process.

Use instead: Full serverless (Lambda, Cloud Functions) or containers. Many popular packages have web-compatible versions; check before assuming it won't work.

❌ Stateful Long-Running Processes

The problem: Even Durable Objects are designed for request-response patterns, not always-on background processes.

Use instead: Actual servers (EC2, DigitalOcean, fly.io). Workers can coordinate with these servers, but can't replace them for persistent background work.

The Hybrid Architecture

The most effective architectures combine Workers with traditional backends:

        Internet Traffic
              β”‚
              β–Ό
   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚ Cloudflare Workers  β”‚ ← Auth, validation, routing
   β”‚    (Edge Layer)     β”‚ ← Caching, rate limiting
   β”‚                     β”‚ ← Simple transformations
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
              β”‚
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚         β”‚             β”‚
    β–Ό         β–Ό             β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Static  β”‚ β”‚   API    β”‚ β”‚   GPU    β”‚
β”‚ (Pages)  β”‚ β”‚ Servers  β”‚ β”‚ Servers  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                  β”‚              β”‚
                  β–Ό              β–Ό
             β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
             β”‚ Database β”‚  β”‚ AI APIs  β”‚
             β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Workers excel at the request layer: fast, global, cheap at scale. Traditional infrastructure handles what Workers can't: stateful processing, heavy computation, persistent connections. The edge layer becomes a smart, distributed API gateway.

πŸ’‘ The Right Mental Model

Think of Workers as your application's "front desk." They handle everyone who walks in the door: checking credentials, answering common questions, routing complex issues to specialists. The front desk doesn't do surgeryβ€”that's what specialists (your backend servers) are for. But a good front desk makes everything run smoother.

Summary: Workers Are Right When...

βœ… Use Workers ❌ Use Something Else
Request/response handling Long-running batch jobs
API gateways and routing GPU-intensive AI inference
Authentication/authorization Traditional database connections
Simple data transformations Native module dependencies
Caching and CDN logic Large-scale WebSocket broadcasting
Email processing Always-on background processes
Edge AI (embeddings, small models) Complex reasoning tasks
Real-time features (with Durable Objects) Heavy stateful computation

Conclusion: Start Simple, Scale Globally

Cloudflare Workers represent a fundamental shift in how we build web applications. The ability to run code in 300+ locations worldwide, with zero cold starts and minimal configuration, removes barriers that previously required entire DevOps teams to overcome.

For AI applications specifically, edge computing isn't optionalβ€”it's essential. Voice interfaces demand sub-200ms latency. Real-time features need instant responses. API gateways must protect expensive AI backends from abuse. Workers provide the infrastructure to make all of this possible without managing servers.

Our deployments at asabove.tech demonstrate the practical reality: static sites on Pages, voice agents processing audio through Workers, email routing without external services, API endpoints responding in milliseconds globally. The combined monthly cost? Zero, on the free tier.

🎯 Your Next Steps
Today
Install Wrangler, run wrangler login, deploy a hello world Worker
15 minutes to your first global deployment
This Week
Add a real endpoint: authentication, KV caching, or request transformation
Learn the patterns that matter for your use case
This Month
Build something production-worthy: an API gateway, email handler, or AI endpoint
Connect to your existing infrastructure as the edge layer
Ongoing
Expand edge capabilities: more endpoints, Durable Objects for state, Workers AI
The edge layer grows with your application

The edge is no longer a luxury for companies with massive infrastructure budgets. It's accessible to anyone who can write JavaScript. The tools are mature, the documentation is solid, and the free tier is generous enough for real production use.

Every millisecond of latency you eliminate is a user experience improved. Every request handled at the edge is a backend server protected. Every global deployment is infrastructure you didn't have to manage.

Deploy to the edge. Your users are there.

Ready to explore more infrastructure and AI topics?

Explore Techne
