🔥Let’s Do DevOps: Building an API Token Expired Circuit Breaker
This blog series focuses on presenting complex DevOps projects as simple and approachable via plain language and lots of pictures. You can do it!
Hey all!
I recently had to create 60k auto-link references in Jira (link to the story), and I immediately ran into an issue: a GitHub PAT (Personal Access Token) is given 5,000 “tokens” per hour. Each “token” pays for one API call that the server will honor. Calls beyond that budget will fail.
API budgets are a concept established to help avoid DoS (Denial of Service) attacks where tens or hundreds of thousands of calls are sent to a service in order to destabilize it.
Well, sending 60k requests to GitHub means I’m spending 60k tokens, 12 times the budget I get per hour. My local script runs quite a bit faster than that, uh oh. And many of the other requests I’m sending, like opening PRs and issuing comments on them, also consume tokens. So how can I tell when my API token budget is consumed? How can I tell when my token budget has been refilled and I can continue?
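To put a number on that, here is a quick sketch of the arithmetic using the figures above. The variable names are just for illustration.

```bash
#!/usr/bin/env bash
# Figures from the paragraph above
TOTAL_REQUESTS=60000
BUDGET_PER_HOUR=5000

# Ceiling division: how many hourly budgets are needed to send every request
HOURS_NEEDED=$(( (TOTAL_REQUESTS + BUDGET_PER_HOUR - 1) / BUDGET_PER_HOUR ))
echo "Need at least ${HOURS_NEEDED} hourly budgets to send ${TOTAL_REQUESTS} requests"
```

So even if nothing else were spending tokens, the job has to be spread over many hours, and the script needs a way to wait.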
That’s where a “circuit breaker” comes in. That concept is borrowed from electrical engineering, where a circuit breaker detects when more current than is safe passes through it and immediately disconnects the circuit so nothing downstream of it can be fried. In this context, it means we’ll monitor our API token budget, and wait on a timer until our budget is refilled.
Let’s do it!
Establish the Circuit Breaker
A circuit breaker in this context is a check that won’t continue until a condition is satisfied. We might want to call this circuit breaker lots of times, so let’s put it in a function called hold_until_rate_limit_success.
And then we build one of the “you shouldn’t do this” loops, a while true, which means our loop will continue forever until a command issues break. These are generally not advised because a misconfiguration could lead to a loop that goes on forever. We’ll keep our function concise and simple in order to be as safe as possible.
# Check if hitting API rate-limiting
hold_until_rate_limit_success() {
  # Loop forever
  while true; do
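If the forever loop makes you nervous, one option is to cap the number of attempts. This is only a sketch of that idea, not what the function in this post does: wait_with_limit and its arguments are hypothetical names I made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script):
# run a check command up to max_attempts times, sleeping between tries.
# Usage: wait_with_limit <max_attempts> <sleep_seconds> <check command...>
wait_with_limit() {
  local max_attempts="$1"
  local sleep_seconds="$2"
  shift 2
  local attempt=1

  while (( attempt <= max_attempts )); do
    # "$@" is the check command; exit status 0 means "safe to continue"
    if "$@"; then
      return 0
    fi
    sleep "$sleep_seconds"
    attempt=$(( attempt + 1 ))
  done

  # Gave up: the check never succeeded within the allowed attempts
  return 1
}
```

The trade-off is that the caller now has to decide what to do when the helper gives up, which the simple while true version never has to think about.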
Then we need a way to check our API budget.
Checking Our API Token Budget
Our API token budget is included in the headers GitHub sends back to us on almost every API request. The header names follow the pattern x-ratelimit-xxxx. In the following picture you can see an example response. It looks like I have a limit of 5000 API tokens, which makes sense: that’s the default for a PAT, which I’m using to make this call.
Notably, GitHub is unable to increase the limit of API tokens on PATs, even for enterprise accounts.
The one we’re interested in is the next line, x-ratelimit-remaining, which is the number of API tokens remaining. Interestingly, submitting a regular API request like this one consumes an API token, so simply asking how many tokens remain uses 1 token.
Because I was already working on repo autolinks, I used that as the call to find my API tokens. But again, it’s provided for most any API call. On line 5 at the end, you can see I’m putting the stderr (the curl verbose output) to stdout, and throwing away the existing stdout (the actual JSON response). I don’t care about the response at all, I only care about the HTTP headers.
Then on line 6, we filter for just the header line we care about, and on line 7, cut on the space character and grab the third field, which is just the number of tokens remaining. Then we have some syntax cleanup: on line 8 we use the xargs command to strip any whitespace from the string, and on line 9 we remove any carriage return characters using the tr command.
API_RATE_LIMIT_UNITS_REMAINING=$(curl -sv \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  "https://api.github.com/repos/$GH_ORG/$GH_REPO/autolinks" 2>&1 1>/dev/null \
  | grep -E '< x-ratelimit-remaining' \
  | cut -d ' ' -f 3 \
  | xargs \
  | tr -d '\r')
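If you’d rather not depend on the exact layout of curl’s verbose output, the same extraction can be sketched as a small helper that reads plain response headers on stdin. This is an alternative take, not the pipeline above: get_header_value is a hypothetical name, and the sample headers are made up so the snippet runs without touching the network.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read HTTP response headers on stdin and print one header's value
get_header_value() {
  local header_name="$1"
  # tr strips carriage returns; awk splits on ": " and matches the name case-insensitively
  tr -d '\r' | awk -F': ' -v name="$header_name" \
    'tolower($1) == tolower(name) { print $2 }'
}

# Made-up sample headers, for illustration only
SAMPLE_HEADERS=$'HTTP/2 200\r\nx-ratelimit-limit: 5000\r\nx-ratelimit-remaining: 4987\r\nx-ratelimit-reset: 1700000000\r\n'

REMAINING=$(printf '%s' "$SAMPLE_HEADERS" | get_header_value 'x-ratelimit-remaining')
echo "Remaining: $REMAINING"
```

With a real call, I’d expect something like curl -s -o /dev/null -D - piped into the helper to work, since -D - dumps the response headers to stdout, but treat that as an assumption to verify on your own setup. The same helper can also read x-ratelimit-reset if you want to know when the budget refills.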
Break the Circuit
Okay, we’ve checked our API token wallet, let’s see if we should keep the circuit broken (loop, keep checking token wallet), or if we should connect the circuit (break the loop, let the program continue).
To do that, we’ll use the arithmetic context in bash, indicated by (()) brackets, instead of square brackets like this [[]]. I decided, for entirely arbitrary reasons, that if we have fewer than 100 API tokens remaining, we should pause and wait until our token budget is refreshed. This avoids leaving us with an entirely empty wallet and other processes unable to run. On line 3, if our token wallet has fewer than 100 tokens in it, we sleep for 60 seconds and the loop continues, which means we’ll check our token wallet contents and come right back here again.
The sleep is very important! If you don’t sleep, it’ll immediately loop and check the wallet, which means it could check the wallet’s contents several times per second, which is wasteful and potentially DoS-ing.
On line 7, we have the else branch of the if: if we have 100 or more tokens in our wallet, we can break out of our loop and continue the program. We issue a break on line 9 to break the while true forever loop, and we’re off to the races.
# If API rate-limiting is hit, sleep for 1 minute
# Double parentheses trigger arithmetic evaluation, so the values are compared as whole numbers instead of as strings (bash is weird)
if (( "$API_RATE_LIMIT_UNITS_REMAINING" < 100 )); then
  echo "ℹ️ We have less than 100 GitHub API rate-limit tokens left, sleeping for 1 minute"
  sleep 60
# If API rate-limiting shows remaining units, break out of loop and function
else
  echo "ℹ️ Rate limit checked, we have $API_RATE_LIMIT_UNITS_REMAINING core tokens remaining so we are continuing"
  break
fi
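One thing to watch: if the curl call fails and the variable ends up empty, my understanding is that the arithmetic test errors out, which lands us in the else branch and breaks the loop. Here’s a hedged sketch of a stricter check that treats anything non-numeric as “keep waiting”. has_enough_budget is a hypothetical helper, not part of the function in this post, and the 100 default mirrors the arbitrary threshold above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: exit 0 only when the remaining budget is a number at or above the threshold.
# An empty or non-numeric value counts as "not enough", so a failed request keeps us waiting.
has_enough_budget() {
  local remaining="$1"
  local threshold="${2:-100}"

  # Reject anything that is not a plain non-negative integer
  if ! [[ "$remaining" =~ ^[0-9]+$ ]]; then
    return 1
  fi

  # 10# forces base 10 so a value with leading zeros is not read as octal
  (( 10#$remaining >= threshold ))
}
```

In the loop, that would look like: if has_enough_budget "$API_RATE_LIMIT_UNITS_REMAINING"; then break; else sleep 60; fi.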
Whole Function
And that’s kind of it! That function can be called wherever your processing loops. For instance, if you’re iterating over every repo in your Org, you could check your token wallet on every iteration, which is how I’m using it. If I don’t have enough tokens to continue, we pause until the token wallet is refilled.
Here’s the whole function for easy copy and pasting:
hold_until_rate_limit_success() {
  # Loop forever
  while true; do
    # Any call to GitHub returns rate limits in the response headers
    API_RATE_LIMIT_UNITS_REMAINING=$(curl -sv \
      -H "Accept: application/vnd.github+json" \
      -H "Authorization: Bearer $GITHUB_TOKEN" \
      -H "X-GitHub-Api-Version: 2022-11-28" \
      "https://api.github.com/repos/$GH_ORG/$GH_REPO/autolinks" 2>&1 1>/dev/null \
      | grep -E '< x-ratelimit-remaining' \
      | cut -d ' ' -f 3 \
      | xargs \
      | tr -d '\r')
    # If API rate-limiting is hit, sleep for 1 minute
    # Double parentheses trigger arithmetic evaluation, so the values are compared as whole numbers instead of as strings (bash is weird)
    if (( "$API_RATE_LIMIT_UNITS_REMAINING" < 100 )); then
      echo "ℹ️ We have less than 100 GitHub API rate-limit tokens left, sleeping for 1 minute"
      sleep 60
    # If API rate-limiting shows remaining units, break out of loop and function
    else
      echo "ℹ️ Rate limit checked, we have $API_RATE_LIMIT_UNITS_REMAINING core tokens remaining so we are continuing"
      break
    fi
  done
}
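The per-repo usage described above can be sketched like this. To keep the snippet runnable without a token or network access, it swaps in a stand-in for hold_until_rate_limit_success. The repo names and the “processing” step are placeholders I made up.

```bash
#!/usr/bin/env bash
# Stand-in so this sketch runs offline; in a real script,
# use the hold_until_rate_limit_success function defined above instead
hold_until_rate_limit_success() {
  echo "checked rate limit"
}

# Hypothetical repo list; a real script might fetch this from the GitHub API
REPOS=("repo-one" "repo-two" "repo-three")
PROCESSED=0

for GH_REPO in "${REPOS[@]}"; do
  # Pause here whenever the token budget is running low
  hold_until_rate_limit_success

  # Placeholder for the real work, e.g. creating autolinks for this repo
  echo "processing $GH_REPO"
  PROCESSED=$(( PROCESSED + 1 ))
done

echo "Processed $PROCESSED repos"
```

Checking once per repo is a judgment call: if a single repo can burn more than the 100-token cushion, you may want to call the function inside the inner work as well.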
Summary
In this write-up, we figured out how to check our API token wallet, and how to isolate that value in a single variable wrapping a curl call. Then we put that call into a bash function that’ll loop forever using while true until the token wallet has enough tokens to continue.
That’ll let us process huge workloads without hitting rate-limiting! Heck yeah.
Good luck out there!
kyler