Skip to content

fix: bounded retry with exponential backoff for 403 throttle responses#1872

Open
jlima8900 wants to merge 5 commits intoKeeper-Security:masterfrom
jlima8900:fix/throttle-retry-backoff
Open

fix: bounded retry with exponential backoff for 403 throttle responses#1872
jlima8900 wants to merge 5 commits intoKeeper-Security:masterfrom
jlima8900:fix/throttle-retry-backoff

Conversation

@jlima8900
Copy link

Summary

execute_rest() retries 403 throttled responses every 10 seconds with no maximum retry count. This causes Commander to hang indefinitely when throttled, and the fixed 10-second interval prevents the server's cooldown from expiring — creating a permanent throttle lock-in.

Changes

  • Add max retry count (3 attempts) before raising KeeperApiError
  • Parse the server's "try again in X minutes/seconds" message to determine wait time
  • Use exponential backoff (30s, 60s, 120s) capped at the server's suggested cooldown
  • Log throttle attempts as WARNING instead of DEBUG for visibility
  • After max retries, raise KeeperApiError so callers can handle it programmatically

Behavior

Before After
Retries every 10s forever 3 retries with exponential backoff, then error
DEBUG log only WARNING log with attempt count
10s interval resets server's 1-min cooldown Respects server's indicated wait time
No way for callers to handle throttle Raises KeeperApiError after max retries

The --fail-on-throttle flag continues to work as before (immediate error, no retries).

Test plan

  • Verify normal (non-throttled) requests are unaffected
  • Verify throttled requests retry 3 times with increasing delay
  • Verify KeeperApiError is raised after max retries
  • Verify --fail-on-throttle still skips retries entirely
  • Verify server's "try again in X" message is parsed correctly

The execute_rest() function previously retried throttled (403) responses
every 10 seconds with no maximum retry count. This caused Commander to
hang indefinitely when throttled, and the 10-second retry interval
prevented the server's cooldown timer from expiring.

Changes:
- Add max retry count (3 attempts) before raising KeeperApiError
- Parse the server's "try again in X minutes/seconds" message
- Use exponential backoff (30s, 60s, 120s) capped at server's suggestion
- Log throttle attempts as warnings instead of debug
- After max retries, raise KeeperApiError so callers can handle it

The --fail-on-throttle flag continues to work as before (immediate error).
…ates

Validated test suite scans 445 Commander Python files across 10 categories.
Two-pass validation eliminates 96% false positives.

Confirmed findings:
- KC-001: AWS creds written without chmod 600 (awskey plugin)
- KC-002: SSH plugin logs passwd command output (5 instances)
- KC-003: AD plugin logs LDAP connection results (4 instances)
- KC-004: Record edit logs password operation details
- KC-005: Example credentials in help text

Not confirmed: SQL injection (parameterized), subprocess injection
(internal input), unbounded retry (fixed by PR Keeper-Security#1872).
min() would wait LESS than server's suggested cooldown, defeating
the purpose. max() ensures we wait at least the server's directive
or the exponential backoff, whichever is longer.

Also moves 'import re' to module level instead of inside the
throttle handler.
Without a cap, 'try again in 49 minutes' causes 49 min per retry
(2.5 hours total). Cap at 300s limits worst case to 15 minutes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant