Working with Pagination


Learn how to efficiently paginate through large datasets

When dealing with large datasets, pagination is essential for performance and usability.

Pagination Types

1. Offset-Based Pagination

The most common and straightforward approach:

curl "https://api.example.com/v1/resources?page=2&per_page=25" \
  -H "Authorization: Bearer YOUR_TOKEN"

Parameters:

  • page: Current page number (default: 1)
  • per_page: Items per page (default: 20, max: 100)

Response:

{
  "data": [...],
  "meta": {
    "page": 2,
    "per_page": 25,
    "total": 250,
    "total_pages": 10,
    "links": {
      "first": "/v1/resources?page=1&per_page=25",
      "prev": "/v1/resources?page=1&per_page=25",
      "next": "/v1/resources?page=3&per_page=25",
      "last": "/v1/resources?page=10&per_page=25"
    }
  }
}
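
Because the response embeds ready-made navigation links, a client can also simply follow meta.links.next until it runs out rather than computing page numbers itself. A minimal sketch, assuming the API omits (or nulls) the next link on the last page:

from urllib.parse import urljoin

import requests

def follow_links(start_url, token):
    """Collect every page by following meta.links.next."""
    headers = {"Authorization": f"Bearer {token}"}
    url = start_url
    results = []

    while url:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        data = response.json()
        results.extend(data["data"])
        next_path = data["meta"]["links"].get("next")
        # The sample links are root-relative paths, so resolve them
        # against the current request URL.
        url = urljoin(url, next_path) if next_path else None

    return results

# resources = follow_links("https://api.example.com/v1/resources?per_page=25", "YOUR_TOKEN")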

2. Cursor-Based Pagination

More efficient for large datasets and real-time data:

curl "https://api.example.com/v1/resources?cursor=eyJpZCI6MTAwfQ&limit=25" \
  -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "data": [...],
  "meta": {
    "cursor": {
      "next": "eyJpZCI6MTI1fQ",
      "prev": "eyJpZCI6NzV9",
      "has_more": true
    },
    "limit": 25
  }
}
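
Cursors should be treated as opaque tokens. For intuition only: the example cursors above happen to be base64-encoded JSON, so a server might mint them roughly as sketched below. These helpers are illustrative, not part of the API, and clients should never rely on a cursor's internal format:

import base64
import json

def encode_cursor(last_id):
    """Illustrative only: pack the last-seen id into an opaque token."""
    raw = json.dumps({"id": last_id}, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode().rstrip("=")

def decode_cursor(cursor):
    """Reverse of encode_cursor, restoring the stripped base64 padding."""
    padded = cursor + "=" * (-len(cursor) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

print(encode_cursor(100))               # eyJpZCI6MTAwfQ
print(decode_cursor("eyJpZCI6MTI1fQ"))  # {'id': 125}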

Implementing Pagination

Python Example

import requests

def fetch_all_resources(base_url, token):
    """Fetch all resources using pagination"""
    all_resources = []
    page = 1
    
    headers = {"Authorization": f"Bearer {token}"}
    
    while True:
        response = requests.get(
            f"{base_url}/resources",
            params={"page": page, "per_page": 100},  # Max page size cuts the request count
            headers=headers
        )
        response.raise_for_status()  # Raise on HTTP errors before parsing the body

        data = response.json()
        all_resources.extend(data["data"])
        
        # Check if there are more pages
        if page >= data["meta"]["total_pages"]:
            break
            
        page += 1
    
    return all_resources
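
Usage, with the base URL and token from the earlier curl examples:

resources = fetch_all_resources("https://api.example.com/v1", "YOUR_TOKEN")
print(f"Fetched {len(resources)} resources")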

JavaScript Example

async function* paginateResources(baseUrl, token) {
  let page = 1;
  let hasMore = true;
  
  while (hasMore) {
    const response = await fetch(
      `${baseUrl}/resources?page=${page}&per_page=50`,
      {
        headers: {
          'Authorization': `Bearer ${token}`
        }
      }
    );
    
    const data = await response.json();
    
    yield data.data;
    
    hasMore = page < data.meta.total_pages;
    page++;
  }
}

// Usage
for await (const resources of paginateResources(baseUrl, token)) {
  console.log(`Fetched ${resources.length} resources`);
  // Process resources
}
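
The same streaming pattern works in Python with a plain generator, so callers can process each page as it arrives instead of holding the whole dataset in memory. A sketch mirroring fetch_all_resources above:

import requests

def paginate_resources(base_url, token, per_page=50):
    """Yield one page of resources at a time."""
    headers = {"Authorization": f"Bearer {token}"}
    page = 1
    while True:
        response = requests.get(
            f"{base_url}/resources",
            params={"page": page, "per_page": per_page},
            headers=headers
        )
        response.raise_for_status()
        data = response.json()
        yield data["data"]
        if page >= data["meta"]["total_pages"]:
            break
        page += 1

# Usage
for resources in paginate_resources("https://api.example.com/v1", "YOUR_TOKEN"):
    print(f"Fetched {len(resources)} resources")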

Cursor Pagination Implementation

Python Example

def fetch_with_cursor(base_url, token):
    """Fetch resources using cursor pagination"""
    all_resources = []
    cursor = None
    
    headers = {"Authorization": f"Bearer {token}"}
    
    while True:
        params = {"limit": 100}
        if cursor:
            params["cursor"] = cursor
        
        response = requests.get(
            f"{base_url}/resources",
            params=params,
            headers=headers
        )
        response.raise_for_status()  # Raise on HTTP errors before parsing the body

        data = response.json()
        all_resources.extend(data["data"])
        
        # Check if there are more results
        cursor_meta = data["meta"]["cursor"]
        if not cursor_meta.get("has_more"):
            break
            
        cursor = cursor_meta["next"]
    
    return all_resources
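
A practical benefit of cursors is resumability: if a long-running job persists the most recent cursor, it can pick up where it stopped after a crash. A minimal sketch; the checkpoint file and helper names are illustrative, not part of the API:

import json
import os

CHECKPOINT_FILE = "cursor_checkpoint.json"

def load_checkpoint():
    """Return the last saved cursor, or None on a fresh run."""
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE) as f:
            return json.load(f)["cursor"]
    return None

def save_checkpoint(cursor):
    """Persist the cursor after each successfully processed page."""
    with open(CHECKPOINT_FILE, "w") as f:
        json.dump({"cursor": cursor}, f)

To wire this into fetch_with_cursor, seed cursor = load_checkpoint() before the loop and call save_checkpoint(cursor) each time a new cursor is assigned.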

Best Practices

1. Choose Appropriate Page Size

# Too small - many requests
page_size = 10  # ❌

# Good balance
page_size = 50  # ✅

# Too large - slow responses
page_size = 1000  # ❌

2. Cache Pages

from functools import lru_cache

@lru_cache(maxsize=100)  # Keep the 100 most recently requested pages in memory
def get_page(page_number):
    # Note: lru_cache entries never expire, so cached pages can go stale
    # if the underlying data changes.
    return fetch_page(page_number)

3. Handle Errors Gracefully

import time

import requests

def fetch_page_with_retry(page, max_retries=3):
    for attempt in range(max_retries):
        try:
            return fetch_page(page)
        except requests.RequestException:
            if attempt == max_retries - 1:
                raise  # Out of retries; let the caller handle it
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s

4. Show Progress to Users

from tqdm import tqdm

def fetch_all_with_progress(total_pages):
    all_data = []
    
    for page in tqdm(range(1, total_pages + 1), desc="Fetching"):
        data = fetch_page(page)
        all_data.extend(data)
    
    return all_data

Performance Considerations

  1. Offset Pagination:

    • Pros: Simple; allows jumping to specific pages
    • Cons: Slower for large offsets; inconsistent with real-time data

  2. Cursor Pagination:

    • Pros: Faster; consistent results; better for real-time data
    • Cons: Can’t jump to a specific page; more complex

  3. When to Use Each:

    • Use offset for user-facing paginated lists with page numbers
    • Use cursor for API integrations and background processing

Pagination with Filtering and Sorting

Combine pagination with other query parameters:

curl "https://api.example.com/v1/resources?page=2&per_page=50&sort=-created_at&filter[status]=active" \
  -H "Authorization: Bearer YOUR_TOKEN"

Keep filter and sort parameters identical on every page request; changing them mid-pagination can cause items to be skipped or duplicated.
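
An easy way to guarantee this is to build the filter and sort parameters once and reuse them for every request, varying only the page number. A sketch along the lines of the earlier offset helper:

import requests

def fetch_filtered(base_url, token):
    headers = {"Authorization": f"Bearer {token}"}
    # Build filter and sort parameters once; reuse them on every page.
    base_params = {
        "per_page": 50,
        "sort": "-created_at",
        "filter[status]": "active",
    }

    all_resources = []
    page = 1
    while True:
        response = requests.get(
            f"{base_url}/resources",
            params={**base_params, "page": page},
            headers=headers
        )
        response.raise_for_status()
        data = response.json()
        all_resources.extend(data["data"])
        if page >= data["meta"]["total_pages"]:
            break
        page += 1
    return all_resources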