
# Rate Limits

Learn how Pullbay API rate limits work, how to monitor your usage, and proven strategies to handle rate limits without disrupting your application. This guide covers rate limit headers, best practices, and production-ready code examples.

## How Rate Limits Work

Pullbay implements request rate limiting to ensure fair usage and maintain service stability for all users.

### Rate Limit Basics

* **Scope:** Rate limits are applied **per API key**, not per IP address
* **Time Window:** Rate limits reset on a **per-minute basis** (60-second rolling window)
* **Enforcement:** When you exceed your plan's limit, the API returns a `429 Too Many Requests` status code
* **Charges:** Requests rejected due to rate limiting (`429` responses) do not consume API credits
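
To make the rolling-window behavior concrete, here is a minimal client-side sketch that counts requests made in the trailing 60 seconds. The class `RollingWindowCounter` and its methods are illustrative assumptions, not part of the Pullbay API, and the server's own accounting may differ in detail.

```python
import time
from collections import deque


class RollingWindowCounter:
    """Client-side estimate of requests made in the trailing window (hypothetical helper)."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._timestamps = deque()

    def _evict_expired(self, now):
        # Drop timestamps that have aged out of the rolling window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()

    def try_acquire(self):
        """Record a request and return True if the local estimate is under the limit."""
        now = self.clock()
        self._evict_expired(now)
        if len(self._timestamps) >= self.limit:
            return False
        self._timestamps.append(now)
        return True
```

Calling `try_acquire()` before each request gives a local estimate of whether the request is likely to be accepted. The response headers remain the authoritative source.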

### What Happens When You Exceed Rate Limits

When you exceed your plan's rate limit:

1. The API returns HTTP status `429 Too Many Requests`
2. The response body contains an error object with `code` set to `"rate_limit_exceeded"`
3. Subsequent requests are rejected until the time window resets
4. The `X-RateLimit-Reset` header tells you exactly when limits reset (Unix timestamp)

**Error Response Example:**

```json
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. You have used 60 requests this minute. Limit: 60. Resets at: 1712973600",
    "request_id": "req_abc123xyz789"
  }
}
```
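
A client can detect this error shape before deciding to retry. The following is a minimal sketch, assuming the body always follows the structure shown above; the function name `parse_rate_limit_error` is hypothetical.

```python
import json


def parse_rate_limit_error(body_text):
    """Return (code, message, request_id), or None if the body is not a rate limit error."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        return None  # Body was not valid JSON
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict) or error.get("code") != "rate_limit_exceeded":
        return None
    return error.get("code"), error.get("message"), error.get("request_id")


# Sample body modeled on the example response above (message shortened)
sample = (
    '{"error": {"code": "rate_limit_exceeded", '
    '"message": "Rate limit exceeded.", "request_id": "req_abc123xyz789"}}'
)
print(parse_rate_limit_error(sample))
```

Logging the `request_id` alongside the error is useful if you later need to reference a specific request.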

## Plan-Based Rate Limits

Your rate limit depends on your Pullbay plan. Upgrade your plan to increase your request capacity.

| Plan        | Requests/Min | Best For                                  |
| ----------- | ------------ | ----------------------------------------- |
| **Free**    | 10           | Development, testing, small projects      |
| **Starter** | 60           | Small to medium apps, production use      |
| **Growth**  | 300          | High-traffic apps, frequent updates       |
| **Scale**   | 1,000+       | Enterprise, mission-critical applications |

See the [Pullbay dashboard](https://app.pullbay.com/) or the Plans page for current pricing.

**Example:** With the Growth plan (300 req/min), you can make 5 requests per second sustainably.
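
The same arithmetic applies to any plan: dividing 60 seconds by the per-minute limit gives the smallest average gap between requests. This is a small sketch with a hypothetical helper name, using the limits from the table above.

```python
def min_interval_seconds(requests_per_minute):
    """Smallest average gap between requests that stays within a per-minute limit."""
    if requests_per_minute <= 0:
        raise ValueError("requests_per_minute must be positive")
    return 60.0 / requests_per_minute


for plan, limit in [("Free", 10), ("Starter", 60), ("Growth", 300)]:
    print(f"{plan}: one request every {min_interval_seconds(limit):.2f}s")
```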

### Checking Your Current Rate Limit

Your API key's rate limit is shown:

* In the Pullbay dashboard under Account Settings
* In the `X-RateLimit-Limit` response header from any API call

```bash
curl -I "https://api.pullbay.com/v1/app-store/reviews/all?app_id=test" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Look for:

```
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1712973600
```

## Rate Limit Response Headers

Every Pullbay API response includes rate limit information in the response headers:

| Header                  | Type           | Description                                                           |
| ----------------------- | -------------- | --------------------------------------------------------------------- |
| `X-RateLimit-Limit`     | Integer        | Your plan's maximum requests per minute                               |
| `X-RateLimit-Remaining` | Integer        | Requests remaining in this minute                                     |
| `X-RateLimit-Reset`     | Unix Timestamp | When the current rate limit window resets (seconds since Jan 1, 1970) |

**Example Response Headers:**

```
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 43
X-RateLimit-Reset: 1712973602
```
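
When reading these headers in code, it can help to distinguish a missing header from a real value of `0`. The sketch below parses the three headers into a small structure and returns `None` for anything absent or malformed. The names `RateLimitInfo` and `parse_rate_limit_headers` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: Optional[int]
    remaining: Optional[int]
    reset: Optional[int]  # Unix timestamp in seconds


def _to_int(value):
    """Convert a header value to int, returning None when missing or malformed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    """Extract rate limit fields from a mapping of response headers."""
    return RateLimitInfo(
        limit=_to_int(headers.get("X-RateLimit-Limit")),
        remaining=_to_int(headers.get("X-RateLimit-Remaining")),
        reset=_to_int(headers.get("X-RateLimit-Reset")),
    )
```

With the `requests` library, `response.headers` is case-insensitive and can be passed in directly. A plain `dict` must use the exact header casing shown here.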

**Converting the Unix Timestamp:**

```python
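from datetime import datetime, timezone

# Pass an explicit timezone; without one, the result depends on the machine's local time zone.
reset_time = datetime.fromtimestamp(1712973602, tz=timezone.utc)
print(f"Rate limit resets at: {reset_time}")
# Output: Rate limit resets at: 2024-04-13 02:00:02+00:00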
```

### Monitoring Rate Limit Status

Always inspect rate limit headers after each request:

```python
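import requests
from datetime import datetime

API_KEY = "test_abc123xyz789"  # Replace with your API key

response = requests.get(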
    "https://api.pullbay.com/v1/app-store/reviews/all",
    params={"app_id": "com.example.app"},
    headers={"Authorization": f"Bearer {API_KEY}"}
)

remaining = int(response.headers.get("X-RateLimit-Remaining", 0))
limit = int(response.headers.get("X-RateLimit-Limit", 0))
reset = int(response.headers.get("X-RateLimit-Reset", 0))

print(f"Requests remaining: {remaining}/{limit}")
print(f"Resets at: {datetime.fromtimestamp(reset)}")
```

## Rate Limit Handling Strategies

### Strategy 1: Proactive Throttling

**Best for:** Predictable, steady request patterns such as sequential pagination

Monitor the `X-RateLimit-Remaining` header and proactively slow down requests when approaching the limit. This prevents rate limit errors entirely.

```python
import time
import requests
from datetime import datetime

API_KEY = "test_abc123xyz789"
BASE_URL = "https://api.pullbay.com/v1"

class ProactiveThrottler:
    def __init__(self, api_key, throttle_threshold=5):
        """
        api_key: Your Pullbay API key
        throttle_threshold: Start throttling when remaining requests drop below this number
        """
        self.api_key = api_key
        self.throttle_threshold = throttle_threshold
        self.last_remaining = None
        self.last_reset = None

    def get_with_throttle(self, endpoint, params):
        """Fetch data with proactive rate limit throttling."""
        headers = {"Authorization": f"Bearer {self.api_key}"}

        response = requests.get(
            f"{BASE_URL}{endpoint}",
            params=params,
            headers=headers
        )

        # Extract rate limit info from headers
        remaining = int(response.headers.get("X-RateLimit-Remaining", 0))
        limit = int(response.headers.get("X-RateLimit-Limit", 0))
        reset = int(response.headers.get("X-RateLimit-Reset", 0))

        # Log rate limit status
        print(f"[{datetime.now().isoformat()}] Remaining: {remaining}/{limit}")

        # Proactive throttling: if approaching limit, wait
        if remaining <= self.throttle_threshold:
            time_until_reset = reset - time.time()
            if time_until_reset > 0:
                print(f"⚠️  Approaching rate limit. Waiting {time_until_reset:.1f}s for reset...")
                time.sleep(time_until_reset + 1)

        return response.json()

# Usage
throttler = ProactiveThrottler(api_key=API_KEY, throttle_threshold=10)

reviews = throttler.get_with_throttle(
    "/app-store/reviews/all",
    params={"app_id": "com.example.app", "limit": 100}
)
```

**Advantages:**

* Prevents rate limit errors
* Smooth, predictable request flow
* Works well with steady pagination

**Disadvantages:**

* May add unnecessary delays
* Not ideal for bursty traffic patterns

***

### Strategy 2: Reactive Backoff

**Best for:** Simple implementations, occasional rate limit hits

Wait when you encounter a `429` error. Extract the reset time from the `X-RateLimit-Reset` header and retry after the window resets.

```python
import time
import requests
from datetime import datetime

API_KEY = "test_abc123xyz789"
BASE_URL = "https://api.pullbay.com/v1"

def get_with_reactive_backoff(endpoint, params, max_retries=3):
    """Fetch data with reactive backoff on 429 errors."""
    headers = {"Authorization": f"Bearer {API_KEY}"}

    for attempt in range(max_retries):
        response = requests.get(
            f"{BASE_URL}{endpoint}",
            params=params,
            headers=headers
        )

        if response.status_code == 200:
            return response.json()

        elif response.status_code == 429:
            # Rate limit hit - extract reset time and wait
            reset_timestamp = int(response.headers.get("X-RateLimit-Reset", time.time()))
            wait_seconds = max(reset_timestamp - time.time(), 1)

            print(f"⚠️  Rate limited (429). Waiting {wait_seconds:.1f}s...")
            print(f"Resets at: {datetime.fromtimestamp(reset_timestamp)}")

            time.sleep(wait_seconds + 1)  # Add 1 second buffer
            continue

        else:
            # Non-rate-limit error
            error = response.json().get("error", {})
            raise Exception(f"API Error: {error.get('code')} - {error.get('message')}")

    raise Exception(f"Max retries ({max_retries}) exceeded")

# Usage
reviews = get_with_reactive_backoff(
    "/app-store/reviews/all",
    params={"app_id": "com.example.app"}
)
```

**Advantages:**

* Simple to implement
* No overhead if rate limits aren't hit
* Works for occasional bursts

**Disadvantages:**

* Still results in some 429 errors
* Less efficient than proactive throttling
* Cumulative delay when frequently hitting limits

***

### Strategy 3: Exponential Backoff with Jitter

**Best for:** Robust production systems, highly resilient applications

Implement exponential backoff with randomized jitter. When a rate limit (or other transient error) occurs, wait an exponentially increasing duration with random variance to avoid thundering herd problems.

```python
import time
import random
import requests
from datetime import datetime

API_KEY = "test_abc123xyz789"
BASE_URL = "https://api.pullbay.com/v1"

class ExponentialBackoffRetry:
    def __init__(self, api_key, max_retries=5, base_delay=1):
        """
        api_key: Your Pullbay API key
        max_retries: Maximum retry attempts
        base_delay: Initial delay in seconds (will be multiplied exponentially)
        """
        self.api_key = api_key
        self.max_retries = max_retries
        self.base_delay = base_delay

    def get_with_exponential_backoff(self, endpoint, params):
        """Fetch data with exponential backoff and jitter."""
        headers = {"Authorization": f"Bearer {self.api_key}"}

        for attempt in range(self.max_retries + 1):
            try:
                response = requests.get(
                    f"{BASE_URL}{endpoint}",
                    params=params,
                    headers=headers,
                    timeout=30
                )

                if response.status_code == 200:
                    print(f"✓ Success on attempt {attempt + 1}")
                    return response.json()

                elif response.status_code == 429:
                    # Rate limit - use X-RateLimit-Reset if available
                    if attempt < self.max_retries:
                        reset_timestamp = int(response.headers.get("X-RateLimit-Reset", 0))
                        if reset_timestamp > 0:
                            wait_seconds = max(reset_timestamp - time.time(), 1)
                            print(f"⚠️  Rate limited (429) on attempt {attempt + 1}. Waiting {wait_seconds:.1f}s...")
                            time.sleep(wait_seconds + 1)
                        else:
                            # Fallback to exponential backoff if reset header missing
                            wait_seconds = self._exponential_backoff_delay(attempt)
                            print(f"⚠️  Rate limited (429) on attempt {attempt + 1}. Exponential backoff {wait_seconds:.1f}s...")
                            time.sleep(wait_seconds)
                        continue
                    else:
                        raise Exception("Max retries exceeded after rate limit")

                elif response.status_code in [500, 502, 503]:
                    # Transient server errors - retry with backoff
                    if attempt < self.max_retries:
                        wait_seconds = self._exponential_backoff_delay(attempt)
                        print(f"⚠️  Server error ({response.status_code}) on attempt {attempt + 1}. Retrying in {wait_seconds:.1f}s...")
                        time.sleep(wait_seconds)
                        continue
                    else:
                        raise Exception(f"Server error {response.status_code} after {self.max_retries} retries")

                else:
                    # Non-retryable error (4xx except 429)
                    error = response.json().get("error", {})
                    raise Exception(f"Non-retryable error: {response.status_code} - {error.get('message')}")

            except requests.Timeout:
                if attempt < self.max_retries:
                    wait_seconds = self._exponential_backoff_delay(attempt)
                    print(f"⚠️  Timeout on attempt {attempt + 1}. Retrying in {wait_seconds:.1f}s...")
                    time.sleep(wait_seconds)
                    continue
                else:
                    raise

        raise Exception("All retry attempts exhausted")

    def _exponential_backoff_delay(self, attempt):
        """Calculate exponential backoff with jitter."""
        # Base formula: 2^attempt * base_delay, with random jitter
        exponential_delay = (2 ** attempt) * self.base_delay
        jitter = random.uniform(0, exponential_delay * 0.1)  # 10% jitter
        return exponential_delay + jitter

# Usage
client = ExponentialBackoffRetry(api_key=API_KEY, max_retries=5, base_delay=1)

reviews = client.get_with_exponential_backoff(
    "/app-store/reviews/all",
    params={"app_id": "com.example.app", "limit": 100}
)

print(f"Successfully fetched {len(reviews['data'])} reviews")
```

**Advantages:**

* Most robust approach for production
* Handles transient errors (429, 5xx) gracefully
* Jitter prevents thundering herd
* Exponential backoff allows service recovery time

**Disadvantages:**

* More complex implementation
* Can introduce significant delays during outages

**Backoff Sequence Example:**

```
Attempt 1: Wait 1 + jitter seconds
Attempt 2: Wait 2 + jitter seconds
Attempt 3: Wait 4 + jitter seconds
Attempt 4: Wait 8 + jitter seconds
Attempt 5: Wait 16 + jitter seconds
```
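
The sequence above follows from the formula `2^attempt * base_delay` with up to 10% added jitter. The sketch below computes the lower and upper bound of each wait and draws a sample delay. The function names and the `jitter_fraction` parameter are illustrative assumptions that mirror `_exponential_backoff_delay`.

```python
import random


def backoff_bounds(max_retries=5, base_delay=1, jitter_fraction=0.1):
    """Return (min_wait, max_wait) per retry for delay = 2^attempt * base_delay plus jitter."""
    bounds = []
    for attempt in range(max_retries):
        base = (2 ** attempt) * base_delay
        bounds.append((base, base * (1 + jitter_fraction)))
    return bounds


def sample_delay(attempt, base_delay=1, jitter_fraction=0.1, rng=random):
    """Draw one concrete delay for a given zero-based attempt index."""
    base = (2 ** attempt) * base_delay
    return base + rng.uniform(0, base * jitter_fraction)


for number, (low, high) in enumerate(backoff_bounds(), start=1):
    print(f"Attempt {number}: wait between {low:.1f}s and {high:.1f}s")
```

With the defaults, the five waits add up to between 31 and roughly 34 seconds, which is worth keeping in mind when choosing `max_retries` for latency-sensitive code.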

***

## Best Practices for Rate Limit Management

### 1. Monitor Rate Limit Headers After Every Request

Always extract and log rate limit information:

```python
def log_rate_limit_status(response):
    """Log rate limit status from response headers."""
    remaining = response.headers.get("X-RateLimit-Remaining")
    limit = response.headers.get("X-RateLimit-Limit")
    reset = response.headers.get("X-RateLimit-Reset")

    print(f"Rate Limit: {remaining}/{limit} (resets at {reset})")
```

### 2. Batch Requests Efficiently

Combine multiple logical requests into fewer API calls when possible:

```python
# get_reviews() is a placeholder for your own wrapper around the Pullbay API.
# ❌ Inefficient: 3 API calls
reviews_us = get_reviews(app_id="com.example", country="us")
reviews_ca = get_reviews(app_id="com.example", country="ca")
reviews_uk = get_reviews(app_id="com.example", country="uk")

# ✓ More efficient: 1 API call with managed pagination
all_reviews = get_reviews(app_id="com.example", managed_pagination=True)
```

### 3. Use Managed Pagination Endpoints

Pullbay's managed pagination endpoints (e.g., `/reviews/all`) handle pagination internally and consume fewer API credits:

```python
# ✓ Better: Uses managed pagination (1 API call for all data)
response = requests.get(
    "https://api.pullbay.com/v1/app-store/reviews/all",
    params={"app_id": "com.example.app"},
    headers={"Authorization": f"Bearer {API_KEY}"}
)
```

### 4. Cache API Responses

Cache responses to avoid redundant API calls:

```python
from datetime import datetime, timedelta

class APICache:
    def __init__(self, ttl_minutes=60):
        self.cache = {}
        self.ttl = timedelta(minutes=ttl_minutes)

    def get(self, cache_key, api_call_func, *args, **kwargs):
        """Get cached result or fetch fresh data."""
        if cache_key in self.cache:
            cached_data, cached_time = self.cache[cache_key]
            if datetime.now() - cached_time < self.ttl:
                print(f"✓ Cache hit for {cache_key}")
                return cached_data

        print(f"⚠️  Cache miss for {cache_key}. Fetching fresh data...")
        fresh_data = api_call_func(*args, **kwargs)
        self.cache[cache_key] = (fresh_data, datetime.now())
        return fresh_data

# Usage
cache = APICache(ttl_minutes=120)

reviews = cache.get(
    "reviews_com.example.app",
    get_reviews,
    app_id="com.example.app"
)
```
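
Hand-written cache keys are easy to get wrong once requests carry several parameters. A deterministic key can be derived from the endpoint and its parameters, as in this sketch; `make_cache_key` is a hypothetical helper, not part of any Pullbay client.

```python
from urllib.parse import urlencode


def make_cache_key(endpoint, params):
    """Build a deterministic cache key from an endpoint and its query parameters."""
    # Sort parameters so that dict ordering does not produce different keys.
    query = urlencode(sorted(params.items()))
    return f"{endpoint}?{query}" if query else endpoint


print(make_cache_key("/app-store/reviews/all", {"limit": 50, "app_id": "com.example.app"}))
```

Two requests with the same parameters in a different order then share one cache entry.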

### 5. Request Only the Data You Need

Use parameters to limit the amount of data returned:

```python
# ✓ Better: Limit to 50 results
response = requests.get(
    "https://api.pullbay.com/v1/app-store/reviews/all",
    params={
        "app_id": "com.example.app",
        "limit": 50,  # Get 50 results instead of default 100
        "sort": "date_desc"  # Get newest first
    },
    headers={"Authorization": f"Bearer {API_KEY}"}
)
```

### 6. Implement Request Queuing

For applications that need to make many requests, queue requests and process them at a controlled rate:

```python
import queue
import threading
import time

import requests

class RequestQueue:
    def __init__(self, api_key, requests_per_second=5):
        self.api_key = api_key
        self.request_queue = queue.Queue()
        self.results = []
        self.delay = 1 / requests_per_second
        self.running = True

    def enqueue(self, endpoint, params):
        """Add a request to the queue."""
        self.request_queue.put((endpoint, params))

    def worker(self):
        """Process queued requests at controlled rate."""
        while self.running or not self.request_queue.empty():
            try:
                endpoint, params = self.request_queue.get(timeout=1)

                response = requests.get(
                    f"https://api.pullbay.com/v1{endpoint}",
                    params=params,
                    headers={"Authorization": f"Bearer {self.api_key}"}
                )

                self.results.append(response.json())
                self.request_queue.task_done()

                time.sleep(self.delay)  # Rate limiting
            except queue.Empty:
                continue

    def start(self, num_workers=1):
        """Start worker threads."""
        threads = []
        for _ in range(num_workers):
            t = threading.Thread(target=self.worker, daemon=True)
            t.start()
            threads.append(t)
        return threads

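# Usage
queue_worker = RequestQueue(api_key=API_KEY, requests_per_second=5)
queue_worker.enqueue("/app-store/reviews/all", {"app_id": "com.example.app"})
queue_worker.enqueue("/app-store/reviews/all", {"app_id": "com.other.app"})
# Each worker sleeps independently, so N workers send up to N times the configured rate.
threads = queue_worker.start(num_workers=1)
queue_worker.running = False  # Workers exit once the queue is drained
for t in threads:
    t.join()  # Wait for pending requests before reading queue_worker.results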
```
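
When several workers share one queue, a per-worker sleep lets the combined request rate grow with the number of workers. One way to keep a single global rate is a shared limiter that hands out time slots under a lock. The class below is a sketch under that assumption; `SharedRateLimiter` is a hypothetical name, and the injectable `clock` and `sleep` arguments exist only to make it testable.

```python
import threading
import time


class SharedRateLimiter:
    """Enforce a minimum interval between requests across all threads (hypothetical helper)."""

    def __init__(self, requests_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / requests_per_second
        self.clock = clock
        self.sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def acquire(self):
        """Block until this caller's reserved time slot arrives."""
        with self._lock:
            now = self.clock()
            # Reserve the next free slot, then release the lock before sleeping.
            slot = max(now, self._next_allowed)
            self._next_allowed = slot + self.min_interval
        wait = slot - now
        if wait > 0:
            self.sleep(wait)
```

Each worker would call `limiter.acquire()` immediately before sending a request, in place of the fixed `time.sleep(self.delay)`.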

### 7. Upgrade Your Plan When Needed

If you consistently hit rate limits:

1. **Add capacity:** Upgrade to a higher plan for more requests/minute
2. **Monitor trends:** Track rate limit hits in your logs
3. **Optimize queries:** Reduce API calls through caching and batching
4. **Contact sales:** For enterprise scale needs, contact Pullbay sales for custom limits

### 8. Set Up Rate Limit Alerts

Log rate limit status and alert when approaching the limit:

```python
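def check_rate_limit_health(remaining, limit):
    """Check rate limit health and alert if necessary."""
    if limit <= 0:
        # Headers missing or unparsed; avoid dividing by zero
        print("Rate limit unknown: no limit header available")
        return
    percentage_remaining = (remaining / limit) * 100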

    if percentage_remaining < 10:
        print(f"🔴 CRITICAL: {percentage_remaining:.1f}% of requests remaining")
        # Send alert (email, Slack, etc.)
    elif percentage_remaining < 25:
        print(f"🟡 WARNING: {percentage_remaining:.1f}% of requests remaining")
    else:
        print(f"🟢 OK: {percentage_remaining:.1f}% of requests remaining")
```

## Production-Ready Rate Limit Tracking

Here's a complete example tracking rate limit usage over time:

```python
import csv
from datetime import datetime
import requests

class RateLimitTracker:
    def __init__(self, api_key, log_file="rate_limit_log.csv"):
        self.api_key = api_key
        self.log_file = log_file
        self._init_log_file()

    def _init_log_file(self):
        """Create the CSV log file and write a header row if it is empty."""
        # Append mode creates the file when it does not exist yet.
        with open(self.log_file, 'a', newline='') as f:
            f.seek(0, 2)  # Move to the end so tell() reports the file size
            if f.tell() == 0:
                csv.writer(f).writerow([
                    "timestamp", "endpoint", "status", "remaining",
                    "limit", "reset_time", "credits_used"
                ])

    def make_request(self, endpoint, params):
        """Make API request and log rate limit status."""
        response = requests.get(
            f"https://api.pullbay.com/v1{endpoint}",
            params=params,
            headers={"Authorization": f"Bearer {self.api_key}"}
        )

        # Extract metadata
        remaining = response.headers.get("X-RateLimit-Remaining")
        limit = response.headers.get("X-RateLimit-Limit")
        reset = response.headers.get("X-RateLimit-Reset")

        data = response.json()
        credits_used = data.get("meta", {}).get("credits_used", 0)

        # Log to file
        with open(self.log_file, 'a', newline='') as f:
            writer = csv.writer(f)
            writer.writerow([
                datetime.now().isoformat(),
                endpoint,
                response.status_code,
                remaining,
                limit,
                reset,
                credits_used
            ])

        return data

# Usage
tracker = RateLimitTracker(api_key=API_KEY)
reviews = tracker.make_request(
    "/app-store/reviews/all",
    {"app_id": "com.example.app"}
)
```
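
Once the log has accumulated some rows, it can be summarized to see how often requests were throttled and how close usage came to the limit. The sketch below assumes the CSV columns written by `RateLimitTracker`; the function name and the sample rows are made up for illustration.

```python
import csv
import io


def summarize_rate_limit_log(rows):
    """Summarize rows produced by csv.DictReader over the tracker's log file."""
    total = 0
    throttled = 0
    lowest_remaining = None
    for row in rows:
        total += 1
        if row.get("status") == "429":
            throttled += 1
        remaining = row.get("remaining")
        if remaining not in (None, ""):  # Missing headers are logged as empty cells
            value = int(remaining)
            if lowest_remaining is None or value < lowest_remaining:
                lowest_remaining = value
    return {"requests": total, "throttled": throttled, "lowest_remaining": lowest_remaining}


# Made-up sample rows in the same column layout as the log file
sample_log = io.StringIO(
    "timestamp,endpoint,status,remaining,limit,reset_time,credits_used\n"
    "2024-04-13T02:00:00,/app-store/reviews/all,200,12,60,1712973660,1\n"
    "2024-04-13T02:00:01,/app-store/reviews/all,429,0,60,1712973660,0\n"
)
print(summarize_rate_limit_log(csv.DictReader(sample_log)))
```

To run it against the real log, open `rate_limit_log.csv` with `newline=""` and pass `csv.DictReader(f)` instead of the in-memory sample.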

## Frequently Asked Questions

<details>

<summary>What happens when I hit the rate limit?</summary>

You receive an HTTP `429 Too Many Requests` response with the error code `rate_limit_exceeded`. The `X-RateLimit-Reset` header tells you when the rate limit window resets (Unix timestamp). Implement backoff logic to retry after the reset time.

</details>

<details>

<summary>Does the rate limit reset every minute?</summary>

Yes. Pullbay uses a **rolling 60-second time window**. The oldest requests from the window drop off as time advances, making room for new requests. The `X-RateLimit-Reset` header shows the exact Unix timestamp when your current window resets.

</details>

<details>

<summary>Can I get a higher rate limit?</summary>

Yes:

1. **Upgrade your plan** in the Pullbay dashboard (self-service)
2. **Contact sales** for enterprise plans with custom limits beyond Scale tier

Upgrading takes effect immediately.

</details>

<details>

<summary>Are credits charged for rate-limited requests?</summary>

No. Requests rejected due to rate limiting (`429` status) do not consume API credits. Successful requests consume credits as usual.

</details>

<details>

<summary>How do I know my current rate limit?</summary>

Check the `X-RateLimit-Limit` header from any API response, or view your plan details in the Pullbay dashboard.

</details>

<details>

<summary>Should I implement proactive throttling or reactive backoff?</summary>

* **Proactive throttling:** Best for predictable, steady workloads where you want to avoid `429` responses entirely
* **Reactive backoff:** Simpler, works well for occasional bursts
* **Exponential backoff:** Best practice for production, since it handles both rate limits and transient server errors

</details>

<details>

<summary>Can I make burst requests above my rate limit?</summary>

No. Rate limits are strictly enforced per API key. Requests exceeding the limit receive `429` responses until the time window resets. Plan your request patterns to stay within your limit.

</details>

<details>

<summary>What's the relationship between rate limits and API credits?</summary>

They're separate systems:

* **Rate limits** control request frequency (requests/minute)
* **API credits** control total API usage (consumed per request, varies by endpoint)

Both constraints must be satisfied. You might hit rate limits first, or run out of credits first, depending on your usage.

</details>

