Rate limits

Scan to Pay does not currently enforce per-second or per-minute request limits at the API layer. There's no 429 Too Many Requests status to plan around. That doesn't mean "send as fast as possible" — there are several behavioural limits worth knowing about, and aggressive callers will see degraded performance and may trigger fraud blocks.

This page covers what's actually rate-controlled, the cadence guidelines you should follow, and how to be a good citizen of the platform.

What's rate-controlled today

Limit	Value	Where it's enforced
`queryRef` polling cadence	No faster than once per 5 seconds per transaction	Recommended in docs, not technically enforced. Aggressive polling will degrade your experience and waste resources
Webhook acknowledgement window	45 seconds to return HTTP 200	Enforced by the reversal scheduler in Webhooks. Missing the window reverses the transaction
Transaction velocity / fraud rules	Per-card and per-MSISDN limits, configurable per merchant	Enforced by the Sky risk engine. Surfaces as `END_BLACKLISTED`, `END_FRAUD_DETECTION`, or `END_TX_RATE_BLOCKED` — see Transaction states
Daily / monthly basket limits	DAILY AIRTIME: R1000, DAILY AIRTIME_BUNDLE: R2000, MONTHLY AIRTIME: R2000, MONTHLY AIRTIME_BUNDLE: R5000 (defaults)	Enforced when `cartItems` contain `AIRTIME` or `AIRTIME_BUNDLE`. Failure surfaces as `LIMIT_FAILED`
JWT token lifetime	Set by the auth service — see the `expires` field returned with the token	Cache the token; refresh before its `expires` timestamp
Challenge / login window	600 seconds (sandbox), 60 seconds (production)	Enforced by the auth service for the PKI challenge–response flow
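
The JWT row above says to cache the token and refresh it before its `expires` timestamp. A minimal sketch of that idea follows. It assumes a hypothetical `fetchToken` function that you write yourself: it calls the auth service and resolves to `{ token, expiresAtMs }`, where `expiresAtMs` is the returned `expires` value converted to epoch milliseconds. The 30-second safety margin (`skewMs`) is also an assumption, not a platform requirement.

```javascript
// Sketch only: fetchToken, expiresAtMs and skewMs are illustrative names,
// not part of the Scan to Pay API.
function makeTokenCache(fetchToken, { skewMs = 30000, now = Date.now } = {}) {
  let cached = null;   // last token we fetched
  let inFlight = null; // shared promise while a refresh is running

  return async function getToken() {
    // Reuse the cached token until we are within skewMs of its expiry.
    if (cached && now() < cached.expiresAtMs - skewMs) {
      return cached.token;
    }
    // Concurrent callers share a single refresh request.
    if (!inFlight) {
      inFlight = fetchToken()
        .then((fresh) => {
          cached = fresh;
          return fresh.token;
        })
        .finally(() => {
          inFlight = null;
        });
    }
    return inFlight;
  };
}
```

Call `getToken()` before each request rather than storing the token string yourself, so the refresh happens in one place.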

Cadence guidelines

Most of these aren't technically enforced (the webhook acknowledgement window is the exception), but they represent the right way to use the API.

Polling

Don't poll faster than every 5 seconds for the same transaction. The state machine doesn't change faster than that in any common flow.
Stop polling once you reach a terminal state. Any END_* state is final.
Stop polling once you've received a webhook. Polling and webhooks are mutually exclusive per merchant in any case — see Webhooks.
Cap your total polling duration. Most transactions reach a terminal state within 30 seconds. After ~2 minutes with no terminal state, log and stop; ping support if it's persistent.
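
The polling rules above can be sketched as a single loop. This is an illustration under assumptions, not a reference client: `queryState` is a hypothetical function you supply that calls `queryRef` for one transaction and resolves to its state string, and `shouldStop` is a hypothetical hook you can flip when a webhook arrives.

```javascript
// Sketch only: queryState and shouldStop are illustrative, caller-supplied functions.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function pollUntilTerminal(queryState, options = {}) {
  const {
    intervalMs = 5000,        // no faster than once per 5 seconds
    maxDurationMs = 120000,   // stop after roughly 2 minutes
    shouldStop = () => false, // e.g. returns true once a webhook has arrived
    sleep = defaultSleep,     // injectable so tests do not have to wait
    now = Date.now,           // injectable clock
  } = options;

  const startedAt = now();
  let state;
  while (!shouldStop()) {
    state = await queryState();
    // Any END_* state is final, so stop immediately.
    if (typeof state === 'string' && state.startsWith('END_')) {
      return { state, timedOut: false };
    }
    // Stop if the next poll would land past the overall cap.
    if (now() - startedAt + intervalMs > maxDurationMs) {
      return { state, timedOut: true };
    }
    await sleep(intervalMs);
  }
  return { state, timedOut: false };
}
```

When `timedOut` is true, log the transaction and stop rather than restarting the loop.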

Bulk operations

Don't send 1,000 createCode requests in parallel. Throttle batch operations to a few concurrent requests at a time. The platform handles concurrency fine, but your bank-side acquiring relationships may have separate per-merchant throttles.
Spread out non-urgent batch work (reconciliation queries, bulk QR generation) into off-peak hours where possible.
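
One way to keep a batch to "a few concurrent requests" is a small worker-lane helper like the sketch below. The helper name `mapWithConcurrency` and the `worker` callback are illustrative; `worker` would wrap whatever call you are batching, such as a createCode request.

```javascript
// Sketch only: runs worker(item, index) over items with at most `limit` calls in flight.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  // Each lane pulls the next unclaimed item until none are left.
  async function runLane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results; // same order as items
}
```

With `limit` set to 3, a batch of 1,000 items never has more than three requests outstanding. If any `worker` call rejects, the returned promise rejects too, so wrap the call inside `worker` if you would rather collect failures and continue.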

Retries on failure

Use exponential backoff for 5xx errors: 1s, 2s, 4s, 8s, 16s, then give up.
Don't retry 4xx errors. Validation errors won't change on retry. 516 INTERRUPTED is retryable; as a 5xx-range code it falls under the backoff rule above.
Always retry with the same merchantReference so the platform recognises the retry as idempotent. See Idempotency.

Webhook handlers

Acknowledge first, process later. Return HTTP 200 inside 45 seconds, then do your business logic on a background worker. See Webhooks for the pattern.
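
As a rough illustration of "acknowledge first, process later", the sketch below builds a handler with the `(req, res)` shape used by Node's built-in `http` server. The `enqueue` function is hypothetical: it stands for whatever hands the payload to your own background worker or queue. The Webhooks page remains the reference for the actual pattern.

```javascript
// Sketch only: enqueue is an illustrative, caller-supplied function.
function makeWebhookHandler(enqueue) {
  return function handleWebhook(req, res) {
    const chunks = [];
    req.on('data', (chunk) => chunks.push(chunk));
    req.on('end', () => {
      const rawBody = Buffer.concat(chunks).toString('utf8');

      // Acknowledge immediately, well inside the 45-second window.
      res.writeHead(200);
      res.end();

      // Hand off afterwards so slow business logic cannot delay the 200.
      setImmediate(() => {
        try {
          enqueue(rawBody);
        } catch (err) {
          console.error('Failed to enqueue webhook payload', err);
        }
      });
    });
  };
}
```

A handler built this way can be passed to `http.createServer`. One design trade-off to be aware of: if the process dies between the 200 and the hand-off, the payload is lost, so some teams prefer to write the raw body to a durable queue first, provided that write is reliably fast.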

What happens if you're abusive

Scan to Pay doesn't return 429s today, but consistently abusive behaviour gets noticed and acted on:

Fraud rules will flag your traffic. The Sky risk engine takes patterns into account. Sustained high-volume retries against a single card or MSISDN will trigger END_FRAUD_DETECTION or END_BLACKLISTED.
Support will reach out. If your traffic pattern is causing operational problems, the on-call team will contact you to ask you to throttle.
Your merchant can be temporarily suspended in extreme cases.

If you have a legitimate need to send high-volume traffic (e.g. a bulk QR generation job, batch onboarding), email [email protected] ahead of time so it doesn't look like abuse.

If your application needs to throttle itself

You probably don't need to today, but here is a defensible pattern for the future:

Pre-allocate a token bucket with a generous rate (say, 50 requests/sec per merchant).
Apply backpressure at your application layer rather than relying on the platform to reject.
Surface latency metrics in your monitoring so you'd notice if the platform's response times start climbing.
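
A token bucket of the kind described above can be sketched in a few lines. The class name and method names are illustrative, and the 50 requests/sec figure is simply the example rate from this page, not a platform limit.

```javascript
// Sketch only: a simple in-process token bucket for client-side throttling.
class TokenBucket {
  constructor(ratePerSec, capacity = ratePerSec, now = Date.now) {
    this.ratePerSec = ratePerSec; // tokens added per second
    this.capacity = capacity;     // maximum burst size
    this.now = now;               // injectable clock for testing
    this.tokens = capacity;       // start full
    this.lastRefill = now();
  }

  // Add tokens for the time elapsed since the last refill, up to capacity.
  refill() {
    const current = this.now();
    const elapsedSec = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefill = current;
  }

  // Consume one token if available; returns whether the call may proceed.
  tryRemove() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }

  // Milliseconds to wait before a token should be available (0 if one is ready).
  msUntilNextToken() {
    this.refill();
    if (this.tokens >= 1) return 0;
    return Math.ceil(((1 - this.tokens) / this.ratePerSec) * 1000);
  }
}
```

A caller would check `tryRemove()` before each request and, when it returns false, wait `msUntilNextToken()` milliseconds before trying again. That wait is the application-layer backpressure.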

If Scan to Pay introduces formal rate limits (returning 429 with Retry-After), the response will follow the standard form, so a handler written for it now will work without changes. The right defensive pattern is to assume the limits might exist and handle 429 gracefully even though you don't see it today.

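// Minimal promise-based sleep helper (the waits below depend on it).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Default of 6 attempts = one initial call plus five retries (1s, 2s, 4s, 8s, 16s).
async function callWithRetry(fn, maxAttempts = 6) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = err?.status;
      const retryable = status === 429 || (status >= 500 && status < 600);
      if (!retryable) throw err; // 4xx other than 429, and non-HTTP errors: don't retry
      if (attempt === maxAttempts - 1) break; // no point waiting after the final attempt
      if (status === 429) {
        // Retry-After in seconds; fall back to 5s if it is missing or not a number
        const seconds = parseInt(err.headers?.['retry-after'] ?? '5', 10);
        await sleep((Number.isNaN(seconds) ? 5 : seconds) * 1000);
      } else {
        await sleep(2 ** attempt * 1000); // exponential backoff
      }
    }
  }
  throw new Error('Max retries exceeded');
}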

What's next

Idempotent retry pattern → Idempotency
Webhook handler timing requirements → Webhooks
Error codes for risk and velocity blocks → Transaction states, Errors
Limit configuration for AIRTIME and AIRTIME_BUNDLE baskets → Merchant onboarding