Batch Processing

Process hundreds or thousands of keywords efficiently using the batch API endpoints. Batch requests reduce API overhead and are the recommended approach for large-scale operations.

Overview

The batch endpoint allows you to submit multiple crawl requests in a single API call. Each request in the batch is processed independently, and results are delivered via webhook or available for retrieval using individual task IDs.

When to Use Batch

Use batch processing when you have more than five to ten keywords to process. For one to five keywords, individual requests may be simpler to implement.

Batch Endpoint

POST /api/v2/serp/crawl/google/batch

Submit multiple Google SERP crawl requests in a single API call.

Request Body

Array of crawl request objects. Each object supports the following parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| keyword | string | Required | The search query |
| location_name | string | Required | Geographic location (e.g., "United States", "London,United Kingdom") |
| iso_code | string | Required | Country ISO code (e.g., "US", "GB") |
| depth | integer | Optional | Number of results (1-100, default: 10) |
| device | string | Optional | desktop or mobile (default: desktop) |
| language_code | string | Optional | Language code (e.g., "en") |
| domain | string | Optional | Your domain to track in results |
| business_name | string | Optional | Business name to track in local pack results |
| competitors | array | Optional | List of competitor domains to track |
| postback_url | string | Optional | Webhook URL for result delivery |
| frequency | integer | Optional | Cache duration in hours (default: 24) |
| se_id | integer | Optional | Search engine configuration ID from the Locations API |
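
Putting these together, a single batch entry that uses the optional tracking parameters might look like the following sketch. The domain, business name, competitor, and webhook values are illustrative placeholders, not real recommendations:

```python
# A single batch entry using the optional tracking parameters.
# The domain, competitor, and webhook values below are placeholders.
request_item = {
    "keyword": "project management software",
    "location_name": "United States",
    "iso_code": "US",
    "depth": 20,
    "device": "mobile",
    "language_code": "en",
    "domain": "example.com",                                # your domain to track
    "competitors": ["competitor-a.com", "competitor-b.com"],
    "postback_url": "https://example.com/webhooks/serp",
    "frequency": 24,                                        # cache results for 24 hours
}
```

The batch request body is simply a JSON array of such objects.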

Basic Example

Submit a batch of keywords with shared settings:

cURL

curl -X POST "https://engine.v2.serpwatch.io/api/v2/serp/crawl/google/batch" \
  -H "Authorization: Bearer $SERPWATCH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '[
    {
      "keyword": "project management software",
      "location_name": "United States",
      "iso_code": "US",
      "depth": 10,
      "device": "desktop"
    },
    {
      "keyword": "task management tools",
      "location_name": "United States",
      "iso_code": "US",
      "depth": 10,
      "device": "desktop"
    },
    {
      "keyword": "team collaboration software",
      "location_name": "United States",
      "iso_code": "US",
      "depth": 10,
      "device": "desktop"
    }
  ]'

Python

import requests
import os

API_KEY = os.environ.get("SERPWATCH_API_KEY")
BASE_URL = "https://engine.v2.serpwatch.io"

# Define keywords
keywords = [
    "project management software",
    "task management tools",
    "team collaboration software"
]

# Build batch request with shared settings
batch = [{
    "keyword": kw,
    "location_name": "United States",
    "iso_code": "US",
    "depth": 10,
    "device": "desktop",
    "language_code": "en"
} for kw in keywords]

# Submit batch
response = requests.post(
    f"{BASE_URL}/api/v2/serp/crawl/google/batch",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    },
    json=batch
)

tasks = response.json()
print(f"Created {len(tasks)} tasks")

for task in tasks:
    print(f"  {task['id']}: {task['keyword']} - {task['status']}")

JavaScript

const API_KEY = process.env.SERPWATCH_API_KEY;
const BASE_URL = "https://engine.v2.serpwatch.io";

// Define keywords
const keywords = [
  "project management software",
  "task management tools",
  "team collaboration software"
];

// Build batch request with shared settings
const batch = keywords.map(keyword => ({
  keyword,
  location_name: "United States",
  iso_code: "US",
  depth: 10,
  device: "desktop",
  language_code: "en"
}));

// Submit batch
const response = await fetch(
  `${BASE_URL}/api/v2/serp/crawl/google/batch`,
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${API_KEY}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify(batch)
  }
);

const tasks = await response.json();
console.log(`Created ${tasks.length} tasks`);

for (const task of tasks) {
  console.log(`  ${task.id}: ${task.keyword} - ${task.status}`);
}

The response is an array of task objects, one per submitted request:

[
  {
    "id": 1166085028196491264,
    "status": "awaiting",
    "keyword": "project management software",
    "location_name": "United States",
    "iso_code": "US",
    "device": "desktop",
    "depth": 10
  },
  {
    "id": 1166085028196491265,
    "status": "awaiting",
    "keyword": "task management tools",
    "location_name": "United States",
    "iso_code": "US",
    "device": "desktop",
    "depth": 10
  },
  {
    "id": 1166085028196491266,
    "status": "awaiting",
    "keyword": "team collaboration software",
    "location_name": "United States",
    "iso_code": "US",
    "device": "desktop",
    "depth": 10
  }
]
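
Because each task object echoes back its keyword, you can index the response for later lookups. A small sketch using an abridged copy of the sample response above:

```python
# Abridged task objects as returned by the batch endpoint (taken
# from the sample response above).
tasks = [
    {"id": 1166085028196491264, "status": "awaiting", "keyword": "project management software"},
    {"id": 1166085028196491265, "status": "awaiting", "keyword": "task management tools"},
    {"id": 1166085028196491266, "status": "awaiting", "keyword": "team collaboration software"},
]

# Map each keyword to its task ID so results fetched later can be
# matched back to the keyword that produced them.
task_ids_by_keyword = {task["keyword"]: task["id"] for task in tasks}
```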

Processing Large Batches

For very large keyword lists (1000+), split your keywords into smaller batches and submit them sequentially with a small delay between batches.

Python

import time

def chunk_list(lst, chunk_size):
    """Split a list into chunks."""
    for i in range(0, len(lst), chunk_size):
        yield lst[i:i + chunk_size]

def submit_large_batch(keywords, batch_size=100, delay=1.0):
    """
    Submit a large list of keywords in smaller batches.

    Args:
        keywords: List of keyword strings
        batch_size: Keywords per batch (default: 100)
        delay: Seconds between batches (default: 1.0)

    Returns:
        List of all task objects
    """
    all_tasks = []
    chunks = list(chunk_list(keywords, batch_size))

    for i, chunk in enumerate(chunks):
        print(f"Submitting batch {i + 1} ({len(chunk)} keywords)...")

        batch = [{
            "keyword": kw,
            "location_name": "United States",
            "iso_code": "US",
            "depth": 10,
            "device": "desktop"
        } for kw in chunk]

        response = requests.post(
            f"{BASE_URL}/api/v2/serp/crawl/google/batch",
            headers={
                "Authorization": f"Bearer {API_KEY}",
                "Content-Type": "application/json"
            },
            json=batch
        )

        if response.status_code == 200:
            tasks = response.json()
            all_tasks.extend(tasks)
            print(f"  Created {len(tasks)} tasks")
        else:
            print(f"  Error: {response.status_code} - {response.text}")

        # Delay between batches to avoid rate limiting; skip after the last one
        if delay > 0 and i < len(chunks) - 1:
            time.sleep(delay)

    return all_tasks

# Example: Submit 500 keywords
large_keyword_list = [f"keyword {i}" for i in range(500)]
all_tasks = submit_large_batch(large_keyword_list, batch_size=100)
print(f"\nTotal tasks created: {len(all_tasks)}")

JavaScript

function chunkArray(arr, chunkSize) {
  const chunks = [];
  for (let i = 0; i < arr.length; i += chunkSize) {
    chunks.push(arr.slice(i, i + chunkSize));
  }
  return chunks;
}

const sleep = (ms) => new Promise(resolve => setTimeout(resolve, ms));

async function submitLargeBatch(keywords, batchSize = 100, delayMs = 1000) {
  /**
   * Submit a large list of keywords in smaller batches.
   */
  const allTasks = [];
  const chunks = chunkArray(keywords, batchSize);

  for (let i = 0; i < chunks.length; i++) {
    const chunk = chunks[i];
    console.log(`Submitting batch ${i + 1} (${chunk.length} keywords)...`);

    const batch = chunk.map(keyword => ({
      keyword,
      location_name: "United States",
      iso_code: "US",
      depth: 10,
      device: "desktop"
    }));

    const response = await fetch(
      `${BASE_URL}/api/v2/serp/crawl/google/batch`,
      {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${API_KEY}`,
          "Content-Type": "application/json"
        },
        body: JSON.stringify(batch)
      }
    );

    if (response.ok) {
      const tasks = await response.json();
      allTasks.push(...tasks);
      console.log(`  Created ${tasks.length} tasks`);
    } else {
      console.error(`  Error: ${response.status}`);
    }

    // Delay between batches
    if (delayMs > 0 && i < chunks.length - 1) {
      await sleep(delayMs);
    }
  }

  return allTasks;
}

// Example: Submit 500 keywords
const largeKeywordList = Array.from({ length: 500 }, (_, i) => `keyword ${i}`);
const allTasks = await submitLargeBatch(largeKeywordList, 100);
console.log(`\nTotal tasks created: ${allTasks.length}`);

Collecting Results

For batch operations, the recommended approach is to use webhooks. If polling, collect results efficiently by tracking pending tasks.

Python

import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def get_task_result(task_id):
    """Fetch a single task result."""
    response = requests.get(
        f"{BASE_URL}/api/v2/serp/crawl/{task_id}",
        headers={"Authorization": f"Bearer {API_KEY}"}
    )
    return response.json()

def collect_results(task_ids, max_workers=10, timeout=300):
    """
    Collect results for multiple tasks using concurrent requests.

    Args:
        task_ids: List of task IDs to collect
        max_workers: Concurrent request limit
        timeout: Maximum wait time in seconds

    Returns:
        Dict mapping task_id to result
    """
    results = {}
    pending = set(task_ids)
    start_time = time.time()

    while pending and (time.time() - start_time) < timeout:
        # Check pending tasks concurrently
        with ThreadPoolExecutor(max_workers=max_workers) as executor:
            futures = {
                executor.submit(get_task_result, tid): tid
                for tid in list(pending)[:max_workers * 2]  # Limit concurrent checks
            }

            for future in as_completed(futures):
                task_id = futures[future]
                try:
                    result = future.result()
                    if result.get("status") in ["success", "completed", "error"]:
                        results[task_id] = result
                        pending.discard(task_id)
                except Exception as e:
                    print(f"Error checking {task_id}: {e}")

        if pending:
            completed = len(results)
            total = len(task_ids)
            print(f"Progress: {completed}/{total} ({len(pending)} pending)")
            time.sleep(3)

    return results

# Usage
task_ids = [task["id"] for task in all_tasks]
results = collect_results(task_ids)
print(f"Collected {len(results)} results")

JavaScript

async function getTaskResult(taskId) {
  const response = await fetch(
    `${BASE_URL}/api/v2/serp/crawl/${taskId}`,
    { headers: { "Authorization": `Bearer ${API_KEY}` } }
  );
  return response.json();
}

async function collectResults(taskIds, concurrency = 10, timeoutMs = 300000) {
  /**
   * Collect results for multiple tasks with concurrency control.
   */
  const results = new Map();
  const pending = new Set(taskIds);
  const startTime = Date.now();

  while (pending.size > 0 && (Date.now() - startTime) < timeoutMs) {
    // Process in batches with concurrency limit
    const batch = Array.from(pending).slice(0, concurrency);

    const batchResults = await Promise.all(
      batch.map(async (taskId) => {
        try {
          const result = await getTaskResult(taskId);
          return { taskId, result };
        } catch (e) {
          console.error(`Error checking ${taskId}:`, e.message);
          return { taskId, result: null };
        }
      })
    );

    for (const { taskId, result } of batchResults) {
      if (result && ["success", "completed", "error"].includes(result.status)) {
        results.set(taskId, result);
        pending.delete(taskId);
      }
    }

    if (pending.size > 0) {
      console.log(`Progress: ${results.size}/${taskIds.length} (${pending.size} pending)`);
      await sleep(3000);
    }
  }

  return results;
}

// Usage
const taskIds = allTasks.map(task => task.id);
const results = await collectResults(taskIds);
console.log(`Collected ${results.size} results`);

Best Practices

Batch Size Recommendations

| Scenario | Batch Size | Notes |
|---|---|---|
| Regular rank tracking | 50-100 | Good balance of efficiency and manageability |
| Large keyword research | 100-200 | Higher throughput, more complex result handling |
| With webhooks | 200-500 | No polling overhead, can handle larger batches |

Optimization Tips

  • Group by settings - Batch keywords with the same location/device together for cleaner code
  • Use webhooks - Eliminates polling overhead for large batches
  • Set appropriate depth - Use depth: 10 unless you need more results
  • Enable caching - Set frequency to reuse results for repeated queries
  • Handle partial failures - Some tasks may fail; implement retry logic for errors
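
The first tip can be sketched as grouping keyword configurations by (location, device) so that each group becomes its own batch with consistent settings. The sample configurations here are illustrative:

```python
from collections import defaultdict

# Illustrative keyword configurations with mixed settings.
configs = [
    {"keyword": "plumber near me", "location_name": "London,United Kingdom", "iso_code": "GB", "device": "mobile"},
    {"keyword": "emergency plumber", "location_name": "London,United Kingdom", "iso_code": "GB", "device": "mobile"},
    {"keyword": "project management software", "location_name": "United States", "iso_code": "US", "device": "desktop"},
]

# Group by (location, device); each group can then be submitted
# as one batch to the batch endpoint.
groups = defaultdict(list)
for cfg in configs:
    groups[(cfg["location_name"], cfg["device"])].append(cfg)

for (location, device), items in groups.items():
    print(f"{location} / {device}: {len(items)} keywords")
```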

Error Handling

Batch requests may have partial failures where some tasks succeed and others fail. Always check the status of each task in the response.

# Check for failed tasks
retry_keywords = []
for result in results.values():
    if result["status"] == "error":
        print(f"Failed: {result['keyword']}")
        print(f"  Error: {result.get('error_message', 'Unknown error')}")

        # Collect failed keywords so they can be retried
        retry_keywords.append(result["keyword"])
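
Continuing the snippet above, the failed tasks can be turned back into a batch payload for resubmission. This is a sketch; the field values mirror the earlier examples, and the helper name is illustrative:

```python
def build_retry_batch(results, location_name="United States", iso_code="US"):
    """Build a batch payload from failed task results.

    `results` maps task_id -> result, as produced by collect_results()
    above; only tasks with status "error" are included.
    """
    return [
        {
            "keyword": result["keyword"],
            "location_name": location_name,
            "iso_code": iso_code,
            "depth": 10,
            "device": "desktop",
        }
        for result in results.values()
        if result.get("status") == "error"
    ]

# Illustrative results: one failed task, one successful.
sample_results = {
    1: {"status": "error", "keyword": "task management tools"},
    2: {"status": "success", "keyword": "team collaboration software"},
}
retry_batch = build_retry_batch(sample_results)
```

The resulting list can be POSTed to the batch endpoint exactly like the original submission.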