I worked with a project team to help them improve their VPN infrastructure on AWS. They have three VPN EC2 instances; let’s call them VPN01, VPN02 and VPN03. All three run OpenVPN Access Server. VPN01 and VPN02 each have a 10-concurrent-session license and sit in availability zones a and b respectively. VPN03 has only the 2 complimentary concurrent sessions and sits in availability zone c (it is mostly for emergency use, e.g. when both AZ-a and AZ-b go down). DNS round robin is configured and all three instances share the same configuration, so end users can dial in to any of them. Here are the configuration files:
They had just renewed the license, so I had to stick with the current license-based AMI; otherwise I would have used the hourly-billed OpenVPN AMI with an ELB and an Auto Scaling group. Because VPN01 and VPN02 have more licensed sessions, the solution needs to send most users to those two instances. And if the VPN service is not working properly on one instance, the solution needs to divert users to a healthy instance.
With the requirements in mind, here is my design:
I think the architecture diagram is self-explanatory. Below is a brief description of how I implemented it:
- Set up weighted DNS CNAME records for vpn.mydomain.com pointing to vpn01.mydomain.local (weight 45), vpn02.mydomain.local (weight 45) and vpn03.mydomain.local (weight 10). With all three healthy, about 45% of lookups resolve to vpn01, 45% to vpn02 and only 10% to vpn03.
- Set up a DNS health check for each of vpn(01|02|03).mydomain.local. As OpenVPN AS is an SSL VPN, we only need to monitor port 443.
- Create a new SNS topic. Let’s name it vpn_healthcheck.
- Set the alarm notification target of each health check to the new SNS topic, so a notification is sent to the topic whenever a health check fails.
- Now for the Lambda function. First, set up a role that allows the function to perform the start and reboot operations. Here is some sample code. Second, create a Lambda function with an SNS trigger. I use Python, and here is the source code.
- Go back to the SNS topic created in step 3 and subscribe your email address to it. Subscribe the Lambda function as well.
- Testing time: stop the openvpnas service on one of the VPN instances and wait 1-2 minutes; the Lambda function will reboot the instance. Then check the Lambda function log.
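To make the weighted DNS step from the list above more concrete, here is a minimal sketch of how the three records and their expected traffic shares could be expressed in Python. It assumes Route 53 is the DNS service; the helper names `traffic_share` and `build_change_batch`, the 60-second TTL and the `SetIdentifier` values are my own illustrative choices, not taken from the original setup.

```python
# Weights from the list above: CNAME target -> weight.
WEIGHTS = {
    "vpn01.mydomain.local": 45,
    "vpn02.mydomain.local": 45,
    "vpn03.mydomain.local": 10,
}


def traffic_share(weights):
    """Return the fraction of lookups each target should receive."""
    total = sum(weights.values())
    return {target: weight / total for target, weight in weights.items()}


def build_change_batch(record_name, weights, ttl=60):
    """Build a Route 53 ChangeBatch with one weighted CNAME per target.

    The TTL default and the SetIdentifier scheme are assumptions.
    """
    changes = []
    for target, weight in weights.items():
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                # Weighted records need a unique identifier; use the host part.
                "SetIdentifier": target.split(".")[0],
                "Weight": weight,
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        })
    return {"Changes": changes}


if __name__ == "__main__":
    print(traffic_share(WEIGHTS))
    print(build_change_batch("vpn.mydomain.com", WEIGHTS))
```

The resulting dictionary could be passed to `boto3.client("route53").change_resource_record_sets(HostedZoneId=..., ChangeBatch=batch)`, with your own hosted zone ID filled in. Calling `traffic_share` on only the healthy targets shows how the split shifts when one instance drops out.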
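As a rough sketch of the Lambda step described in the list above (not the code from my repo), an SNS-triggered handler could look like the following. It assumes the SNS message is a CloudWatch alarm notification carrying `AlarmName` and `NewStateValue`, and that each alarm name maps to one instance; the `ALARM_TO_INSTANCE` table, the alarm names and the instance IDs are hypothetical placeholders.

```python
import json

# Hypothetical mapping from alarm name to EC2 instance ID (placeholders).
ALARM_TO_INSTANCE = {
    "vpn01-healthcheck": "i-0123456789abcdef0",
    "vpn02-healthcheck": "i-0123456789abcdef1",
    "vpn03-healthcheck": "i-0123456789abcdef2",
}


def parse_alarm(event):
    """Extract (alarm_name, new_state) from an SNS-wrapped alarm message."""
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    return message["AlarmName"], message["NewStateValue"]


def choose_action(instance_state):
    """Start a stopped instance, reboot a running one, otherwise do nothing."""
    if instance_state == "stopped":
        return "start"
    if instance_state == "running":
        return "reboot"
    return None


def lambda_handler(event, context):
    alarm_name, new_state = parse_alarm(event)
    instance_id = ALARM_TO_INSTANCE.get(alarm_name)
    # Ignore recoveries (OK state) and alarms we do not know about.
    if new_state != "ALARM" or instance_id is None:
        return {"action": None}

    import boto3  # imported here so the helpers above work without AWS access

    ec2 = boto3.client("ec2")
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    state = reservations[0]["Instances"][0]["State"]["Name"]

    action = choose_action(state)
    if action == "start":
        ec2.start_instances(InstanceIds=[instance_id])
    elif action == "reboot":
        ec2.reboot_instances(InstanceIds=[instance_id])

    # This line ends up in the function's CloudWatch log.
    print(f"{alarm_name}: instance {instance_id} is {state}, action={action}")
    return {"instance": instance_id, "action": action}
```

The function's role needs permission for `ec2:DescribeInstances`, `ec2:StartInstances` and `ec2:RebootInstances` for this sketch to work. Keeping the parsing and decision logic in small helpers makes it possible to test them locally with a hand-built event before wiring up the SNS trigger.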
I hope you find this useful. All sample code can be found in my GitHub repo.