Have your HTTP calls ever been throttled by AWS? If not, congratulations! And if you are curious about what it looks like, read on.
Below is what AWS says about the EC2 request rate limits. The same principle applies to other AWS services, e.g. Auto Scaling groups, which is where I ran into the rate exceeded problem.
We throttle Amazon EC2 API requests for each AWS account on a per-region basis to help the performance of the service. We ensure that all calls to the Amazon EC2 API (whether they originate from an application, calls to a command line interface, or the Amazon EC2 console) don’t exceed the maximum allowed API request rate. The maximum API request rate may vary across regions. Note that API requests made by IAM users are attributed to the underlying AWS account.
The Amazon EC2 API actions are divided into the following categories:
- Describe actions, such as DescribeVolumes. These requests simply retrieve cached data, so they have the highest request limit.
- Modify actions, such as CreateVolumes. These requests create or modify resources, so they have a lower request limit than describe calls.
- RevokeSecurityGroupIngress actions. These requests take the most time and resource to complete, so they have the lowest request limit.
If an API request exceeds the API request rate for its category, the request returns the RequestLimitExceeded error code. To prevent this error, ensure that your application doesn’t retry API requests at a high rate. You can do this by using care when polling and by using exponential backoff retries.
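To make the "exponential backoff retries" advice concrete, here is a minimal sketch in Python. It is not taken from AWS or from my script: `ThrottledError` and `call_with_backoff` are hypothetical names, and in real code you would catch whatever exception your SDK raises for a throttling error such as RequestLimitExceeded.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error such as RequestLimitExceeded."""


def call_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call func(), retrying with exponential backoff and jitter when throttled."""
    for attempt in range(max_attempts):
        try:
            return func()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # The delay ceiling doubles on every attempt, up to max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Sleep a random time in [0, ceiling] so that concurrent
            # clients do not all retry at the same moment.
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter is only there so the waiting can be swapped out in tests. The random jitter is a design choice, assuming several clients may be throttled at once and should not retry in lockstep.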
I wrote a script that queries the activity logs of each Auto Scaling group that uses spot instances, finds the sir (spot instance request) IDs, and then checks the status of each sir ID to make sure it has been fulfilled. It only works partially, because the remaining DescribeScalingActivities requests are denied once the calls exceed the rate limit.
To raise the rate limit, you have to give AWS a very good reason; otherwise, they will turn you down. In a way, that is a good thing, as it forces you to think about how to improve your approach. At least that is what happened to me.
I added logic to the script so that it only checks the logs of Auto Scaling groups whose desired capacity is greater than their current number of instances, since that gap is most likely caused by a spot instance request that cannot be fulfilled. It works like a charm!
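The filter can be sketched roughly like this. It is an illustration rather than my actual script: the function name and the sample group names are made up, and the dictionaries are assumed to have the shape of the entries that boto3's `describe_auto_scaling_groups` returns under `AutoScalingGroups` (`AutoScalingGroupName`, `DesiredCapacity`, `Instances`).

```python
def groups_missing_instances(groups):
    """Return names of groups whose desired capacity exceeds their instance count."""
    return [
        group["AutoScalingGroupName"]
        for group in groups
        if group["DesiredCapacity"] > len(group.get("Instances", []))
    ]


# Hypothetical sample data for illustration only.
sample_groups = [
    {"AutoScalingGroupName": "web-spot", "DesiredCapacity": 3,
     "Instances": [{"InstanceId": "i-1"}, {"InstanceId": "i-2"}, {"InstanceId": "i-3"}]},
    {"AutoScalingGroupName": "batch-spot", "DesiredCapacity": 5,
     "Instances": [{"InstanceId": "i-4"}, {"InstanceId": "i-5"}]},
]

print(groups_missing_instances(sample_groups))  # prints ['batch-spot']
```

Only the groups returned by this filter need a DescribeScalingActivities call, which keeps the number of requests well below what a check of every group would generate.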