Task Lifecycle
The SerpWatch API uses an asynchronous task-based architecture for most operations. Understanding the task lifecycle helps you build robust integrations.
Overview
When you submit a request to an async endpoint (like SERP crawling or keyword research), the API creates a task and returns immediately with a task ID. The task is then processed in the background, and you can retrieve results by polling or receiving a webhook callback.
Sync vs Async
Some endpoints offer both sync (`/live`) and async versions. Sync endpoints wait for results before responding, while async endpoints return immediately with a task ID. Use async for batch processing or when results aren't needed immediately.
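If it helps to keep the two variants straight in code, the distinction is just a path suffix. A minimal sketch, assuming the sync variant appends `/live` to the async crawl path (verify the exact paths in the endpoint reference):

```python
BASE_URL = "https://engine.v2.serpwatch.io/api/v2"

def crawl_endpoint(live: bool = False) -> str:
    """Return the async SERP crawl path, or its sync (/live) variant.

    The /live suffix placement is an assumption; check the endpoint
    reference for the exact paths.
    """
    path = f"{BASE_URL}/serp/crawl"
    return f"{path}/live" if live else path
```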
Status Flow
Tasks progress through the following statuses:
Pending
Task has been created and is queued for processing. This is the initial state returned when you submit a request.
Processing
Task is actively being processed. A worker has picked up the task and is executing the crawl or data retrieval.
Completed / Error
Task has finished. If successful, status is `completed` and results are available. If failed, status is `error` with an error message.
Status Values
| Status | Terminal? | Description |
|---|---|---|
| `pending` | No | Task is queued, waiting for a worker |
| `processing` | No | Task is being actively processed |
| `completed` | Yes | Task finished successfully, results available |
| `success` | Yes | Alias for `completed` (used by some endpoints) |
| `error` | Yes | Task failed, `error_message` field explains why |
Processing Times
Task processing times vary by endpoint type and current system load. Here are typical ranges:
| Task Type | Typical Time | Max Time |
|---|---|---|
| SERP Crawl (single) | 15-30 seconds | 2 minutes |
| SERP Crawl (depth 100) | 30-60 seconds | 3 minutes |
| Keyword Data | 10-20 seconds | 1 minute |
| Keyword Ideas | 20-40 seconds | 2 minutes |
Peak Times
During peak usage, tasks may take longer to start processing. If you're building a user-facing application, set appropriate timeout expectations and consider using webhooks rather than polling.
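One way to set polling timeouts is to start from the max times in the table above and add headroom for peak-load queueing. The keys and the 50% headroom here are illustrative choices, not API values:

```python
# Max processing times (seconds) from the table above
MAX_SECONDS = {
    "serp_crawl": 120,        # single SERP crawl
    "serp_crawl_deep": 180,   # depth-100 crawl
    "keyword_data": 60,
    "keyword_ideas": 120,
}

def polling_timeout(task_type: str) -> int:
    """Max processing time plus 50% headroom for queueing delays at peak."""
    return int(MAX_SECONDS[task_type] * 1.5)
```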
Retrieving Results
There are two ways to get task results: polling and webhooks.
Polling
Call the task status endpoint periodically until the task reaches a terminal state (completed or error).
```python
import time

import requests

BASE_URL = "https://engine.v2.serpwatch.io/api/v2"
API_KEY = "your-api-key"  # replace with your key

def poll_for_result(task_id, endpoint, timeout=120, interval=3):
    """Poll for task completion with exponential backoff."""
    start = time.time()
    attempts = 0
    while time.time() - start < timeout:
        response = requests.get(
            f"{BASE_URL}/{endpoint}/{task_id}",
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=10,
        )
        result = response.json()
        status = result.get("status")
        if status in ("completed", "success"):
            return result
        elif status == "error":
            raise Exception(f"Task failed: {result.get('error_message')}")
        # Exponential backoff: 3s, 4.5s, 6.75s... capped at 15s
        wait_time = min(interval * (1.5 ** attempts), 15)
        attempts += 1
        time.sleep(wait_time)
    raise TimeoutError(f"Task {task_id} did not complete in {timeout}s")
```
```javascript
async function pollForResult(taskId, endpoint, timeout = 120000, interval = 3000) {
  const start = Date.now();
  let attempts = 0;
  while (Date.now() - start < timeout) {
    const response = await fetch(
      `${BASE_URL}/${endpoint}/${taskId}`,
      { headers: { "Authorization": `Bearer ${API_KEY}` } }
    );
    const result = await response.json();
    const status = result.status;
    if (status === "completed" || status === "success") {
      return result;
    } else if (status === "error") {
      throw new Error(`Task failed: ${result.error_message}`);
    }
    // Exponential backoff: 3s, 4.5s, 6.75s... capped at 15s
    const waitTime = Math.min(interval * Math.pow(1.5, attempts), 15000);
    attempts++;
    await new Promise(r => setTimeout(r, waitTime));
  }
  throw new Error(`Task ${taskId} did not complete in ${timeout}ms`);
}
```
Webhooks (Recommended)
For production systems, use webhooks instead of polling. Provide a `postback_url` when creating the task, and we'll POST results to your endpoint when the task completes.
```bash
# Create task with webhook
curl -X POST "https://engine.v2.serpwatch.io/api/v2/serp/crawl" \
  -H "Authorization: Bearer $SERPWATCH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "keyword": "project management software",
    "postback_url": "https://yourapp.com/webhook/serp"
  }'
```
See the Webhooks guide for complete setup instructions.
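Independent of the web framework you use, the webhook body can be handled as a plain function. A sketch, assuming the POSTed payload carries the same `status` and `error_message` fields as the polling responses (confirm the exact shape in the Webhooks guide):

```python
import json

def handle_serp_webhook(body: bytes) -> dict:
    """Parse a webhook POST body and route on task status.

    The payload fields assumed here (status, error_message) mirror the
    polling responses; verify against the Webhooks guide.
    """
    payload = json.loads(body)
    status = payload.get("status")
    if status in ("completed", "success"):
        return {"ok": True, "results": payload}
    return {"ok": False, "error": payload.get("error_message")}
```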
Task Retention
Completed tasks and their results are retained for a limited time. Plan your integration accordingly.
| Item | Retention Period |
|---|---|
| Task metadata (ID, status, parameters) | 30 days |
| Task results (SERP data, keywords) | 7 days |
| Error details | 30 days |
Store Your Data
Always store important results in your own database. The API is not designed for long-term data storage. Retrieve and persist results as soon as tasks complete.
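A minimal persistence sketch using SQLite (the table name and schema are illustrative; any durable store works):

```python
import json
import sqlite3

def persist_result(db_path: str, task_id: str, result: dict) -> None:
    """Save a completed task's payload before the 7-day retention window expires."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS serp_results "
            "(task_id TEXT PRIMARY KEY, payload TEXT)"
        )
        conn.execute(
            "INSERT OR REPLACE INTO serp_results VALUES (?, ?)",
            (task_id, json.dumps(result)),
        )
        conn.commit()
    finally:
        conn.close()
```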
Error Handling
When a task fails, the status becomes `error` and an `error_message` field explains what went wrong.
Common Error Types
| Error | Cause | Resolution |
|---|---|---|
| Crawl failed | Unable to fetch search results | Retry the task after a few minutes |
| Invalid location | Unrecognized location code or name | Check location code reference |
| Rate limited | Too many requests in time window | Wait and retry with backoff |
| Timeout | Task took too long to process | Retry; reduce depth if persistent |
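Note that only some of these are worth retrying: transient failures (crawl failures, rate limits, timeouts) may succeed on a second attempt, while an invalid location will fail the same way until the request is corrected. A sketch that matches on the error categories above (real `error_message` strings may differ; treat these labels as placeholders):

```python
# Transient error categories from the table above; an invalid
# location fails identically on every retry.
RETRYABLE = {"Crawl failed", "Rate limited", "Timeout"}

def should_retry(error: str) -> bool:
    """True if resubmitting the task has a chance of succeeding."""
    return error in RETRYABLE
```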
Retry Strategy
```python
import time
from functools import wraps

def retry_on_error(max_retries=3, backoff_base=5):
    """Decorator to retry failed tasks with exponential backoff."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    last_error = e
                    if attempt < max_retries - 1:
                        wait = backoff_base * (2 ** attempt)
                        print(f"Attempt {attempt + 1} failed, retrying in {wait}s")
                        time.sleep(wait)
            raise last_error
        return wrapper
    return decorator

@retry_on_error(max_retries=3)
def crawl_serp(keyword):
    # Submit task and poll for result
    pass
```