Task Lifecycle

The SerpWatch API uses an asynchronous task-based architecture for most operations. Understanding the task lifecycle helps you build robust integrations.

Overview

When you submit a request to an async endpoint (like SERP crawling or keyword research), the API creates a task and returns immediately with a task ID. The task is then processed in the background, and you can retrieve results by polling or receiving a webhook callback.
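Concretely, the exchange has two halves linked by the task ID (illustrative payload shapes only; the exact field names can differ per endpoint, so check the response schema for the endpoint you call):

```python
# 1. Immediate response when the task is created
create_response = {"id": "task_abc123", "status": "pending"}

# 2. Later, the status endpoint (or a webhook) delivers the finished task;
#    the task ID from step 1 is what links the two responses
status_response = {
    "id": "task_abc123",
    "status": "completed",
    "result": [{"position": 1, "url": "https://example.com/"}],
}

assert status_response["id"] == create_response["id"]
```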

Sync vs Async

Some endpoints offer both sync (/live) and async versions. Sync endpoints wait for results before responding, while async endpoints return immediately with a task ID. Use async for batch processing or when results aren't needed immediately.
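A small helper can make that choice explicit in client code (a sketch assuming the sync variant is reached by appending /live to the async path, as the naming above suggests):

```python
def endpoint_for(path: str, live: bool = False) -> str:
    """Return the async path, or its sync /live variant when live=True."""
    return f"{path}/live" if live else path

# Async: returns a task ID immediately; poll or use a webhook for results
endpoint_for("serp/crawl")             # -> "serp/crawl"
# Sync: blocks until results are ready; good for one-off interactive lookups
endpoint_for("serp/crawl", live=True)  # -> "serp/crawl/live"
```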

Status Flow

Tasks progress through the following statuses:

Pending

Task has been created and is queued for processing. This is the initial state returned when you submit a request.

Processing

Task is actively being processed. A worker has picked up the task and is executing the crawl or data retrieval.

Completed / Error

Task has finished. If successful, status is completed and results are available. If failed, status is error with an error message.

Status Values

Status     | Terminal? | Description
pending    | No        | Task is queued, waiting for a worker
processing | No        | Task is being actively processed
completed  | Yes       | Task finished successfully, results available
success    | Yes       | Alias for completed (used by some endpoints)
error      | Yes       | Task failed, error_message field explains why
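The terminal states in the table above can be captured in a small helper, which the polling examples later in this guide effectively inline (a sketch; the status strings come straight from the table):

```python
TERMINAL_STATUSES = {"completed", "success", "error"}

def is_terminal(status: str) -> bool:
    """True once a task has finished and its status will not change again."""
    return status in TERMINAL_STATUSES

def is_success(status: str) -> bool:
    """True for either success alias; results are available for retrieval."""
    return status in {"completed", "success"}

assert is_terminal("error") and not is_terminal("processing")
```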

Processing Times

Task processing times vary by endpoint type and current system load. Here are typical ranges:

Task Type              | Typical Time  | Max Time
SERP Crawl (single)    | 15-30 seconds | 2 minutes
SERP Crawl (depth 100) | 30-60 seconds | 3 minutes
Keyword Data           | 10-20 seconds | 1 minute
Keyword Ideas          | 20-40 seconds | 2 minutes
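When polling, it is reasonable to derive your client-side timeout from the Max Time column plus some queueing headroom (a sketch; the figures mirror the table above, and the 1.5x headroom factor is an arbitrary illustrative choice):

```python
# Maximum processing time in seconds, from the table above
MAX_TIME = {
    "serp_crawl_single": 120,
    "serp_crawl_depth_100": 180,
    "keyword_data": 60,
    "keyword_ideas": 120,
}

def poll_timeout(task_type: str, headroom: float = 1.5) -> float:
    """Client-side timeout: max processing time plus queueing headroom."""
    return MAX_TIME[task_type] * headroom

poll_timeout("keyword_data")  # 90.0 seconds
```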

Peak Times

During peak usage, tasks may take longer to start processing. If you're building a user-facing application, set appropriate timeout expectations and consider using webhooks rather than polling.

Retrieving Results

There are two ways to get task results: polling and webhooks.

Polling

Call the task status endpoint periodically until the task reaches a terminal state (completed, success, or error).

import time
import requests

def poll_for_result(task_id, endpoint, timeout=120, interval=3):
    """Poll for task completion with exponential backoff."""
    start = time.time()
    attempts = 0

    while time.time() - start < timeout:
        response = requests.get(
            f"{BASE_URL}/{endpoint}/{task_id}",
            headers={"Authorization": f"Bearer {API_KEY}"}
        )
        result = response.json()
        status = result.get("status")

        if status in ["completed", "success"]:
            return result
        elif status == "error":
            raise Exception(f"Task failed: {result.get('error_message')}")

        # Exponential backoff: 3s, 4.5s, 6.75s... capped at 15s
        wait_time = min(interval * (1.5 ** attempts), 15)
        attempts += 1
        time.sleep(wait_time)

    raise TimeoutError(f"Task {task_id} did not complete in {timeout}s")

The same pattern in JavaScript:

async function pollForResult(taskId, endpoint, timeout = 120000, interval = 3000) {
  const start = Date.now();
  let attempts = 0;

  while (Date.now() - start < timeout) {
    const response = await fetch(
      `${BASE_URL}/${endpoint}/${taskId}`,
      { headers: { "Authorization": `Bearer ${API_KEY}` } }
    );
    const result = await response.json();
    const status = result.status;

    if (status === "completed" || status === "success") {
      return result;
    } else if (status === "error") {
      throw new Error(`Task failed: ${result.error_message}`);
    }

    // Exponential backoff: 3s, 4.5s, 6.75s... capped at 15s
    const waitTime = Math.min(interval * Math.pow(1.5, attempts), 15000);
    attempts++;
    await new Promise(r => setTimeout(r, waitTime));
  }

  throw new Error(`Task ${taskId} did not complete in ${timeout}ms`);
}

Webhooks (Recommended)

For production systems, use webhooks instead of polling. Provide a postback_url when creating the task, and we'll POST results to your endpoint when the task completes.

# Create task with webhook
curl -X POST "https://engine.v2.serpwatch.io/api/v2/serp/crawl" \
  -H "Authorization: Bearer $SERPWATCH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "keyword": "project management software",
    "postback_url": "https://yourapp.com/webhook/serp"
  }'

See the Webhooks guide for complete setup instructions.
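On the receiving side, the core handler logic is a status dispatch, independent of your web framework (a sketch; the payload field names status, error_message, and result match the polling examples above, but verify them against the Webhooks guide):

```python
def handle_webhook(payload: dict) -> dict:
    """Process a task-completion callback and return what to persist."""
    status = payload.get("status")
    if status in {"completed", "success"}:
        return {"ok": True, "result": payload.get("result")}
    if status == "error":
        # Log the failure and decide whether to resubmit the task
        return {"ok": False, "error": payload.get("error_message")}
    # Unexpected status: acknowledge the delivery but store nothing
    return {"ok": False, "error": f"unexpected status: {status}"}
```

Return a 2xx response quickly from your endpoint and do any heavy processing of the extracted result asynchronously.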

Task Retention

Completed tasks and their results are retained for a limited time. Plan your integration accordingly.

Item                                   | Retention Period
Task metadata (ID, status, parameters) | 30 days
Task results (SERP data, keywords)     | 7 days
Error details                          | 30 days

Store Your Data

Always store important results in your own database. The API is not designed for long-term data storage. Retrieve and persist results as soon as tasks complete.
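A minimal persistence step using only the standard library, run as soon as a task completes (a sketch; the table name and schema are assumptions for illustration):

```python
import json
import sqlite3

def persist_result(db: sqlite3.Connection, task_id: str, result: dict) -> None:
    """Store a completed task's payload before the 7-day retention lapses."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS serp_results ("
        "task_id TEXT PRIMARY KEY, payload TEXT)"
    )
    db.execute(
        "INSERT OR REPLACE INTO serp_results VALUES (?, ?)",
        (task_id, json.dumps(result)),
    )
    db.commit()

db = sqlite3.connect(":memory:")
persist_result(db, "task_abc123", {"status": "completed", "items": []})
```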

Error Handling

When a task fails, the status becomes error and an error_message field explains what went wrong.

Common Error Types

Error            | Cause                              | Resolution
Crawl failed     | Unable to fetch search results     | Retry the task after a few minutes
Invalid location | Unrecognized location code or name | Check location code reference
Rate limited     | Too many requests in time window   | Wait and retry with backoff
Timeout          | Task took too long to process      | Retry; reduce depth if persistent

Retry Strategy

import time
from functools import wraps

def retry_on_error(max_retries=3, backoff_base=5):
    """Decorator to retry failed tasks with exponential backoff."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    last_error = e
                    if attempt < max_retries - 1:
                        wait = backoff_base * (2 ** attempt)
                        print(f"Attempt {attempt + 1} failed, retrying in {wait}s")
                        time.sleep(wait)
            raise last_error
        return wrapper
    return decorator

@retry_on_error(max_retries=3)
def crawl_serp(keyword):
    # Submit task and poll for result
    pass