Akamai-Bot



I received good feedback after sharing SSSG-Ninja with the Akamai community, so I decided to share another useful tool that I built earlier.

Akamai-Bot is a Hubot-based automation bot that lets users perform some daily Akamai tasks simply by chatting.

Here are some examples. If you are interested, here is the git repo. The Docker image is available too.

error.png

purge.png

 

SSSG Ninja



SSSG Ninja is my new open source project. It is an all-in-one management tool for SSSG (Site Shield Security Group): it not only makes recommendations but can also do the job for you. If you are interested in trying it, it can be found in my GitHub repo.

Here are the currently supported features:

  • Make recommendations based on a health check
  • Add missing Site Shield CIDRs to security groups
  • Add new Site Shield CIDRs to security groups
  • Remove obsolete Site Shield CIDRs from security groups
  • Check the security group limits
  • Check Site Shield map information
  • Search for a CIDR in security groups
  • Acknowledge a Site Shield change
  • Debug mode logging
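
To illustrate the add/remove features in the list above, here is a minimal sketch of the underlying set comparison. It is not taken from the SSSG Ninja source: the function name `plan_sssg_changes` is hypothetical, and the CIDRs are documentation placeholder ranges rather than real Site Shield addresses.

```python
import ipaddress


def plan_sssg_changes(site_shield_cidrs, group_cidrs):
    """Compare the Site Shield CIDR list with the CIDRs in a security group.

    Returns (to_add, to_remove) as sorted lists of CIDR strings.
    Hypothetical helper for illustration only.
    """
    # Normalise every entry so that equivalent notations compare equal.
    wanted = {str(ipaddress.ip_network(c, strict=False)) for c in site_shield_cidrs}
    present = {str(ipaddress.ip_network(c, strict=False)) for c in group_cidrs}

    to_add = sorted(wanted - present)      # missing or new CIDRs
    to_remove = sorted(present - wanted)   # obsolete CIDRs
    return to_add, to_remove


if __name__ == "__main__":
    # Placeholder data, not real Site Shield ranges.
    site_shield = ["192.0.2.0/24", "198.51.100.0/24"]
    security_group = ["198.51.100.0/24", "203.0.113.0/24"]
    to_add, to_remove = plan_sssg_changes(site_shield, security_group)
    print("add:", to_add)
    print("remove:", to_remove)
```

A real tool would fetch both lists from the Akamai and AWS APIs; the sketch only covers the comparison step.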

AWS security group limits Q&A



Here are a few questions that I asked AWS about security group limits, along with their answers. I'd like to share them with more people here:

Screen Shot 2016-11-24 at 8.59.19 AM.png

1) Q: By default, the limit is 50 rules each for inbound and outbound (100 rules in total). Is it possible to set different limits for inbound and outbound? For example, 80 for inbound and 20 for outbound (still 100 rules combined).

A: Unfortunately not. Inbound and outbound traffic are processed separately, so the limit is applied to each of them separately. The inbound and outbound limits are always the same; when you update the limit, it applies to both.

2) Q: Is the limit a global setting, or can it be set on a particular security group?

A: The limit is a regional setting; it applies to all security groups in the same region.

3) Q: If it is a global setting, will it impact existing security groups? For example, say I decrease the inbound limit to 30, but there is already a security group with 40 inbound rules. What will happen?

A: Before applying to decrease the inbound limit to 30, you need to make sure that none of your current security groups has more than 30 rules. If you have security groups with 40 rules, the change cannot be made; you’ll be asked to delete or modify the security groups that do not meet the requirement.

4) Q: Each NIC has a maximum of 250 rules (the product of the limit on security groups per NIC and the limit on rules per security group). Is that a global setting as well? If so, will a change impact existing groups that violate the limits?

A: The maximum of 250 rules is a global hard limit. Exceeding 250 rules per interface can have a negative impact on performance, not only for your instances but also for other customers’ instances running on the same underlying hardware. As discussed in question 3, before applying for a change to the limits you have to make sure that none of your existing security groups would violate the new limit. I hope this helps. Please feel free to come back to us at any time if you need further assistance.
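
The arithmetic behind questions 3 and 4 can be sketched as below. The figures (250 rules per interface, 50 rules per group) are the ones quoted in the answers above and may differ from current AWS quotas; the function names are hypothetical and nothing here calls an AWS API.

```python
# Hard limit per network interface, as quoted in the answer to question 4.
MAX_RULES_PER_INTERFACE = 250


def fits_interface_limit(rules_per_group, groups_per_interface):
    """True if (rules per group) x (groups per interface) stays within 250."""
    return rules_per_group * groups_per_interface <= MAX_RULES_PER_INTERFACE


def can_lower_limit(new_limit, existing_rule_counts):
    """True if no existing security group has more rules than the new limit.

    Mirrors the answer to question 3: the change is refused while any
    group still exceeds the requested limit.
    """
    return all(count <= new_limit for count in existing_rule_counts)


if __name__ == "__main__":
    print(fits_interface_limit(50, 5))    # 50 x 5 = 250
    print(fits_interface_limit(100, 3))   # 100 x 3 = 300
    print(can_lower_limit(30, [40, 12]))  # one group has 40 rules
```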

 

 

ElastiCache Redis Unreachable Issue



We have an ElastiCache Redis replication group with two nodes: one primary and one replica. Last week, we noticed that the primary Redis node suddenly stopped working: every connection to the primary node eventually timed out.

According to the log, there was a load burst, after which Redis rebooted itself.

1.png

2.png

Unfortunately, the Redis node stopped responding after that. The weird thing was that replication between the nodes still worked. So I promoted the original replica to primary and logged in to it. The ‘role’ and ‘info’ commands told me that replication was working fine and that the slave IP was 10.0.x.x. Ah, that was interesting, as my VPC network is 172.31.x.x. It meant that something was wrong with the instance’s 172.31.x.x NIC. I contacted AWS, their service team restarted that NIC, and things went back to normal.

The ironic thing is that the AWS console still showed everything as green while the 172.31.x.x NIC was not functional. It looks to me like AWS only monitors its internal network (in this case the 10.0.x.x network). I have submitted a feature request suggesting that they improve the monitoring.
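
The clue in this diagnosis was a replication peer address that lies outside the VPC range. A minimal sketch of that check follows; the VPC CIDR `172.31.0.0/16` is an assumption inferred from the 172.31.x.x addresses above, the sample addresses are placeholders, and `in_vpc` is a hypothetical helper.

```python
import ipaddress


def in_vpc(address, vpc_cidr):
    """True if the address belongs to the given VPC CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(vpc_cidr)


if __name__ == "__main__":
    vpc_cidr = "172.31.0.0/16"  # assumed from the 172.31.x.x addresses
    # Placeholder peer addresses, e.g. as reported by the Redis 'role' command.
    for peer in ["10.0.1.5", "172.31.4.20"]:
        status = "inside" if in_vpc(peer, vpc_cidr) else "OUTSIDE"
        print(f"{peer} is {status} {vpc_cidr}")
```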