Some developers saw their Twitter bots begin replying multiple times to every message yesterday. We apologize for that and are working to prevent it from happening again in the future.
For those interested in the technical details of the issue, here’s what happened.
Our application has an exemption from Twitter’s normal API rate limits. Where users are normally limited to 150 requests per hour, each of our servers is able to make 20,000 requests per hour, and requests made from our servers to a Twitter ID do not count against that Twitter ID’s normal 150-per-hour limit. This exemption lets us respond to messages within moments of their being sent to your Twitter account, and lets us manage a large number of Twitter accounts.
Because the number of requests we need to make to process your messages varies depending on how many messages are sent to your bot and how many messages you send from your bot to other people, we constantly re-evaluate the API rate limit status of each server. This ensures that we always have enough remaining connections to reply to messages. Our system checks with Twitter to see how many API calls are remaining and how long before the rate limit count is reset. We then increase or decrease the pace of our API calls to fit within the remaining limit.
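As a rough illustration of that pacing calculation, here is a minimal sketch. The function and field names (`remaining_calls`, `seconds_until_reset`) are hypothetical stand-ins for what the Twitter rate limit status reports, not our actual code:

```python
def pace_for(remaining_calls, seconds_until_reset, min_interval=0.05):
    """Spread the remaining API calls evenly over the time left in the
    rate-limit window, so we never exhaust the window early.
    Hypothetical sketch; names are illustrative."""
    if remaining_calls <= 0:
        # No calls left: wait out the rest of the window before retrying.
        return seconds_until_reset
    # Seconds to pause between calls to fit within the remaining limit,
    # floored so we never spin arbitrarily fast.
    return max(seconds_until_reset / remaining_calls, min_interval)
```

For example, with 3,600 calls left and an hour until reset, this paces us at one call per second; as the remaining count shrinks faster than the clock, the delay grows to compensate.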
Our normal workflow is to check the rate limit, adjust our API pace, then connect to every Twitter account we’re monitoring, download new messages, and post replies. Then we start the process again.
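That workflow, sketched as a single loop pass. The `api` and `accounts` objects here are hypothetical stand-ins for our real components, not an actual Twitter client:

```python
import time

def process_cycle(api, accounts):
    """One pass of the normal workflow (illustrative sketch; `api` and
    `accounts` are hypothetical stand-ins, not our real objects)."""
    status = api.rate_limit_status()                     # check the rate limit
    # Adjust our pace to fit the remaining calls into the window.
    delay = status.reset_in / max(status.remaining, 1)
    for account in accounts:                             # every monitored account
        for message in api.fetch_new_messages(account):  # download new messages
            api.post_reply(account, message)             # post replies
        time.sleep(delay)
    # ...then the caller starts the process again.
```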
Unfortunately an issue with Twitter’s API exposed a race condition that we hadn’t anticipated. For those unfamiliar with race conditions, they occur when two processes attempt to do the same thing at the same time or when they both attempt to modify the same data. Each process is unaware that there’s another process doing the same thing, resulting in inconsistent and unpredictable results.
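In our case the shared data was the set of messages already answered. The failure is a classic check-then-act race: both workers check "has this message been replied to yet?" before either one records its reply. A deterministic, single-threaded simulation of that interleaving (all names here are illustrative):

```python
# Check-then-act race, simulated deterministically: two workers both run
# their "check" step before either runs its "act" step, so both see the
# message as unanswered. (Names are illustrative, not our real code.)
handled = set()
replies = []

worker_a_sees_new = "msg-1" not in handled  # worker A: check
worker_b_sees_new = "msg-1" not in handled  # worker B: check (A hasn't acted yet)

if worker_a_sees_new:                       # worker A: act
    handled.add("msg-1")
    replies.append("msg-1")
if worker_b_sees_new:                       # worker B: act, on its stale check
    handled.add("msg-1")
    replies.append("msg-1")

print(replies)  # ['msg-1', 'msg-1']: the same message replied to twice
```

With real threads the interleaving is unpredictable, which is why the results are inconsistent; the standard fix is to make the check and the act a single atomic step, for example by holding a lock across both.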
Yesterday, Twitter unintentionally increased our rate limit. Instead of each server making 20,000 requests per hour, we were suddenly allowed to make 20,000 requests per second. Our code that constantly adjusts the pace of our API calls performed its magic and sped up our system. Instead of processing your messages every few seconds, we started processing them several dozen times per second. In the rush to keep up, our system would often process the same message multiple times simultaneously, triggering a race condition that posted your replies several times. A second race condition sometimes caused us to re-process messages we’d already downloaded earlier in the day. These two issues combined to post your replies several times, sometimes over the course of several hours.
Some users saw this compounded by an unrelated issue. One of our servers went offline and when it came back, the Twitter process did not properly restart. This prevented us from processing messages for some Twitter accounts. When we restarted it, some of these accounts had a backlog of messages to process. The restart happened after the race condition issue appeared, resulting in this backlog of messages often getting processed multiple times. So instead of just a couple of messages getting multiple replies, many messages were replied to multiple times.
We’re taking several steps to prevent this from recurring. We’ve added sanity checks to our rate limit logic to ensure that we don’t hit the Twitter API at an unreasonable pace, even if Twitter tells us it’s okay to do so. We’re in the process of adding safeguards that will guarantee that multiple processes don’t attempt to fetch and reply to the same account at the same time. To prevent the race condition from happening again while we implement these safeguards, we’ve placed an artificial cap on the volume of our API requests. This may cause your Twitter bots to respond more slowly than normal until we finish fixing the issue. We anticipate releasing the fix and lifting our self-imposed cap within a few days.
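The sanity check amounts to never trusting a reported limit above a ceiling we control, no matter what the API tells us. A minimal sketch, assuming a hypothetical helper and using the 20,000-per-hour figure from our normal allowance:

```python
# Ceiling on what we will believe, based on our agreed per-server allowance.
MAX_REQUESTS_PER_HOUR = 20_000

def sane_limit(reported_limit):
    """Clamp the rate limit Twitter reports to our known ceiling, so an
    erroneously inflated limit can't speed us up to an unreasonable pace.
    Hypothetical sketch, not our actual code."""
    return min(reported_limit, MAX_REQUESTS_PER_HOUR)
```

If Twitter again reported 20,000 requests per second (72,000,000 per hour), this clamp would hold us at 20,000 per hour instead of letting the pacing code speed up.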