With the recent and ongoing DDoS attacks against GitHub, many sites hosted on GitHub Pages have been scrambling to find alternative hosting. I began writing this tutorial long before the attacks, but the configuration you find here is exactly what you need to serve static content from multiple providers like GitHub Pages, Divshot, Amazon S3, etc.
In my previous article, I introduced the major components of Cedexis and how they fit together to build great multi-provider solutions. If you haven’t read it, I suggest taking a quick look to get familiar with the concepts.
In this article, I’ll show you, step-by-step, how to build a robust multi-provider, multi-platform website using the following Cedexis components:
- Radar: provides real-time user measurements of provider performance and availability.
- Sonar: provides lightweight, efficient service availability health checks.
- OpenMix: DNS traffic routing based on the data from Radar and Sonar.
- Portal: UI for configuration and reporting.
I’ll also briefly cover some of the reports available in the portal.
OpenMix applications all have the same basic architecture and the same basic building blocks. Here is a diagram for reference:
For the purpose of the tutorial, I’ve built a demo site using the Amazon S3 and Divshot static web hosting services. I’ve already uploaded my content to both providers and made sure that they are responding.
Both of these services provide us a DNS hostname with which to access the sites.
Amazon S3 is part of the standard community Radar measurements, but as a recently launched service, Divshot hasn’t made the list yet. By adding test objects, we can eventually enable private Radar measurements instead.
Download the test objects from here:
I’ve uploaded them to a cedexis directory inside my web root on Divshot.
Configuring the Platforms
Platforms are, essentially, named data sets. Inside the OpenMix Application, we assign them to a service endpoint and the data they contain influences how Cedexis routes traffic. In addition, the platforms become dimensions by which we slice and dice data in the reports.
We need to define platforms for S3 and Divshot in Cedexis and connect each platform to its relevant data sources (Radar and Sonar).
Log in to the Cedexis portal here and click on the Add Platform button in the Platforms section.
We’ll find the community platform for Amazon S3 under the Cloud Storage category. It means that S3 performance will be monitored automatically by Cedexis’ Community Radar. You can leave the defaults on this screen:
After clicking next, we’ll get the opportunity to set up custom Radar and Sonar settings for this platform. We want to enable Sonar to make sure there are no problems with our specific S3 bucket which community Radar might not catch.
We’ll enable Sonar polls every 60 seconds (default) and for the test URL, I’ve put the homepage of the site.
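Conceptually, a Sonar-style check boils down to requesting the test URL on a schedule and treating anything other than a timely, successful response as a failure. The Python sketch below illustrates that idea only. It is not how Cedexis implements Sonar, and the function name, the timeout, and the injectable `opener` parameter are assumptions made for the example.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx status within the timeout.

    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failures, refused connections, timeouts, HTTP error statuses
        # and malformed URLs all count as "down".
        return False
```

Calling `is_healthy()` on the site's homepage once every 60 seconds would mimic the polling interval chosen above.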
We’ll save the platform and create another:
Divshot is somewhere in between Cloud Storage and Cloud Compute. It’s really only hosting static content so I’ve chosen the Cloud Storage category, but there is no real difference from Cedexis’ perspective. If they eventually add Divshot to their community metrics, it might end up in a different category.
Since Divshot isn’t one of the pre-configured Cloud Storage platforms, choose the platform “Other”.
The report name is what will show up in Cedexis charts when you want to analyze the data from this platform.
The OpenMix alias is how OpenMix applications will refer to this platform. Notice that I’ve called it divshot_production. That is because Divshot provides multiple environments for development, staging, and QA. In the future, we may define platforms for other environments as well.
Since there are no community Radar measurements for Divshot, we prepare private measurements of our own in the next step.
We are going to add three types of probes using the test objects which we downloaded above.
Click Add probe at the bottom left of the dialog to add the next probes.
In addition to the response time probe, we will add the Cold Start and Throughput probes to cover all our bases.
The Cold Start probe should also use the small test object.
The Throughput probe needs the large test object.
Make sure the test objects are correct in the summary page before setting up Sonar.
Configure the Sonar settings for Divshot similarly to those for S3, except that the test URL should be the homepage on Divshot. Then click ‘Complete’ to save the platform and we are done setting up platforms for now.
A little bit of information about platforms. A nice thing about platforms is that they are decoupled from the OpenMix applications. That means that you can re-use a platform across multiple OpenMix applications with completely different logic. It also means you can slice and dice your data using application and platform as separate dimensions.
For example, if we had applications running in multiple datacenters, we would be interested to know about the performance and availability of each data center across all our applications. Conversely, we would also want to know if a specific application performs better in one data center than another. Cedexis hands us this data on a silver platter.
Our First OpenMix Application
Open the Application Configuration option under the OpenMix tab and click on the plus in the upper right corner to add a new application.
We’re going to select the Optimal RTT quick app for the application type. This app will send traffic to the platform with the best response time according to the information Cedexis has on the user making the request.
Define the fallback host. Note that this does not have to be one of the defined platforms. This host will be used in case the application logic fails or there is a system issue within Cedexis. In this case, I trust S3 slightly more than Divshot so I’ll configure it as the fallback host.
In the second configuration step, I’ve left the default TTL of 20 seconds. This means that DNS resolvers will cache each answer for 20 seconds before asking again, so users are re-evaluated roughly every 20 seconds to see whether a different provider should serve their requests. Once Cedexis detects a problem, the maximum time for users to be directed to a different provider should be approximately the same as this value.
In my experience, 20 seconds is a good value to use. It is long enough that users can browse one or two pages of a site before doing any additional DNS lookups and it is short enough to react to shorter downtimes.
Increasing this value will result in fewer requests to Cedexis. To save money, consider automatically changing TTLs via RESTful Web Services. Use lower TTLs during peaks, where performance could be more erratic, and use longer TTLs during low traffic periods to save request costs.
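As a rough illustration of the trade-off, the Python sketch below picks a TTL by hour of day and estimates the worst-case number of lookups a single caching resolver would send per day. The peak hours and the 60-second off-peak TTL are assumptions chosen for the example, and the sketch does not call any Cedexis API.

```python
PEAK_HOURS = range(9, 21)   # assumed busy period, 09:00-20:59
PEAK_TTL = 20               # seconds; the default used in this tutorial
OFF_PEAK_TTL = 60           # seconds; assumed longer TTL for quiet hours


def ttl_for_hour(hour):
    """Return the TTL (seconds) to publish for a given hour of the day."""
    return PEAK_TTL if hour in PEAK_HOURS else OFF_PEAK_TTL


def max_lookups_per_resolver(ttl_by_hour):
    """Upper bound on daily lookups from one resolver that re-queries
    the moment its cached answer expires."""
    return sum(3600 // ttl for ttl in ttl_by_hour)


fixed = max_lookups_per_resolver([PEAK_TTL] * 24)
scheduled = max_lookups_per_resolver([ttl_for_hour(h) for h in range(24)])
print(fixed, scheduled)
```

Under these assumptions, the scheduled TTLs lower the upper bound from 4,320 to 2,880 lookups per resolver per day.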
On the third configuration step, I’ve left all the defaults. The Optimal RTT quick app will filter out any platforms which have less than 80% availability before considering them as choices for sending traffic.
Depending on the quality of your services, you may decide to lower this number, but you probably do not want it any higher. Why not eliminate any platform that isn’t reporting 100% availability? The answer is that RUM measurements rely on the sometimes poor-quality home networks of your users and, as a result, they can be extremely finicky. Expecting 100% availability from a high-traffic service is unrealistic, and leaving a threshold of 80% will help reduce thrashing and unwanted use of the fallback host.
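The decision described above can be sketched in a few lines of Python. This is an illustration of the logic, not the quick app's actual code. The data structure, the function name, the `amazon_s3` alias, and the sample numbers are all hypothetical.

```python
def choose_platform(measurements, fallback, min_availability=80.0):
    """Pick the platform with the lowest RTT among those that meet the
    availability threshold; return the fallback if none qualify.

    `measurements` maps a platform alias to a dict with
    'availability' (percent) and 'rtt_ms' (milliseconds).
    """
    candidates = {
        alias: data
        for alias, data in measurements.items()
        if data["availability"] >= min_availability
    }
    if not candidates:
        return fallback
    return min(candidates, key=lambda alias: candidates[alias]["rtt_ms"])


# Made-up sample values for one user's network.
sample = {
    "amazon_s3": {"availability": 97.5, "rtt_ms": 140},
    "divshot_production": {"availability": 99.0, "rtt_ms": 95},
}
print(choose_platform(sample, fallback="amazon_s3"))
```

With the same sample data, raising `min_availability` to 100 leaves no candidates, so every request goes to the fallback host regardless of response time.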
Regarding eDNS, you pretty much always want this enabled since many people have switched to using public DNS resolvers like Google DNS instead of the resolvers provided by their ISPs.
Shared resolvers break assumptions made by traffic optimization solutions and the eDNS standard works around this problem, passing information about the request origin to the authoritative DNS servers. Cedexis has supported eDNS from the beginning but many services still don’t.
In the final step, we will configure the service endpoints for each of the platforms we defined.
In our case, we are just associating the hostname aliases that Amazon and Divshot gave us with the correct platform and its Radar/Sonar data.
In a more complicated setup, you might have a platform per region of a cloud and service endpoints with different aliases or CNAMEs across each region.
Note that each platform in the application has an “Enabled” checkbox. This makes it easy to go into an application and temporarily stop sending traffic to a specific platform. It is very useful for avoiding downtime during maintenance windows, migrations, or intermittent problems with a provider.
Choose the Add Platform button on the bottom left corner of the dialog to add the second platform, not the Complete button on the bottom right.
Define the Divshot platform like we did for S3 and click Complete.
You should get a success message with the CNAME for the Cedexis application alias. Click “Publish” to activate the OpenMix application right away.
Alternatively, clicking “Done” would leave the application configured but inactive. When editing applications, you will get a similar dialog. Leaving changes saved but unpublished can be a useful way to stage changes to be activated later with the push of a button.
Building a Custom OpenMix Application
The application we just created will work, but it doesn’t take advantage of the Sonar data that we configured. To consider the Sonar metrics, we will create a _custom_ OpenMix app, and by custom, I mean copy and paste the code from Cedexis’ GitHub repository. If you’re squeamish about code, talk to your account manager and I’m sure they’ll be able to help you.
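Conceptually, taking Sonar into account adds one more filter before response times are compared: a platform whose health check is failing is never a candidate, no matter how good its Radar numbers look. The Python sketch below illustrates that idea only. It is not the script from the Cedexis repository, and every name in it is hypothetical.

```python
def choose_with_sonar(radar_rtt_ms, sonar_up, fallback):
    """Return the fastest platform that Sonar currently reports as up.

    radar_rtt_ms: platform alias -> response time in milliseconds
    sonar_up:     platform alias -> True if the last health check passed
    """
    # Platforms with no Sonar result are treated as down.
    healthy = [alias for alias in radar_rtt_ms if sonar_up.get(alias, False)]
    if not healthy:
        return fallback
    return min(healthy, key=radar_rtt_ms.get)
```

For example, if Divshot has the better response time but its health check is failing, the function returns the slower S3 platform instead of sending users to a site that is down.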
The specific code we are going to adapt can be found here. (Note: I’ve fixed the link to a specific revision of the repo to make sure the instructions match, but you might choose to take the latest revision.)
We only need to modify definitions at the beginning of the script:
Let’s create a new application using the new script. Then we can switch back and forth between them if we want. We’ll start by duplicating the original app. Expand the app’s row and click Duplicate.
Leave the fallback and TTL as is on the next screen.
In the third configuration step, we’ll be asked to upload our custom application.
Choose the edited version of the script and click through to complete the process.
As before, publish the application to activate it.
Adding Radar Support to our Service
At this point, Cedexis is already gathering availability data on both our platforms via Sonar. Since we used the community platform for S3, we also have performance data for that. To finish implementing private Radar for Divshot, we must include the Radar tag in our pages so our users start reporting on their performance.
Copy the tag to your clipboard and add it in your pages on all platforms.
Before we go live, we should really test out the application with some manual DNS requests, disabling and enabling platforms, to see that the responses change, etc.
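One way to script these manual checks is to resolve the application's hostname repeatedly and tally the answers. The Python sketch below uses only the standard library. The function name is hypothetical, and the `resolve` parameter is an assumption added so the tallying can be tried without network access. Keep in mind that your local resolver caches answers for the TTL, so lookups repeated within 20 seconds may return the same result.

```python
import socket
from collections import Counter


def tally_answers(hostname, attempts=5, resolve=socket.gethostbyname_ex):
    """Resolve `hostname` several times and count the primary names seen.

    socket.gethostbyname_ex returns (primary_name, aliases, addresses).
    Failed lookups are counted under the key "<error>".
    """
    seen = Counter()
    for _ in range(attempts):
        try:
            primary_name, _aliases, _addresses = resolve(hostname)
            seen[primary_name] += 1
        except OSError:
            # socket.gaierror and socket.herror are subclasses of OSError.
            seen["<error>"] += 1
    return seen
```

Run it against the application's CNAME before and after disabling a platform in the portal, pausing between attempts for at least the TTL, and the tally should shift toward the platform that is still enabled.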
Once we’re satisfied, the last step is to change the DNS records to point at the CNAME of the OpenMix application that we want to use. I’ll set the DNS to point at our Sonar enabled application.
A useful service to check how our applications are working is https://www.whatsmydns.net/. This will show how our application CNAMEs resolve from locations around the world. For example, if I check the CNAME resolution for the application we just created, I get the following results:
By and large, the Western Hemisphere prefers Divshot while the Eastern Hemisphere prefers Amazon S3 in Europe. This is completely understandable. Interestingly, there are exceptions in both directions. For example, in this test, OpenMix refers the TEANET resolver from Italy to Divshot while the Level 3 resolver in New York is referred to Amazon S3 in Europe. If you refresh the test every so often, you will see that the routings change.
Since this demo site isn’t getting any live traffic, I’ve generated traffic to show off the reports. First, the dashboard, which gives you a quick overview of your Cedexis traffic on login. Here we see that the majority of the traffic came from North America, and a fair amount came from Europe as well. We also see that, for the most part, the traffic was split evenly between the two platforms.
To examine how Cedexis is routing our traffic, we look at the OpenMix Decision Report. I’ve added a secondary dimension of ‘Platform’ to see how the decisions have been distributed. You see that sometimes Amazon S3 is preferred and other times Divshot.
To figure out why requests were routed one way or the other, we can drill down into the data using the other reports. First, let’s check the availability stats in the Radar Performance Report. For the sake of demonstration, I’ve drilled down into availability per continent. In Asia, we see shoddy availability from Divshot but Amazon S3 isn’t perfect either. Since we didn’t see much traffic from Asia, this probably didn’t affect the traffic distribution. Theoretically, a burst of Asian traffic would result in more traffic going to Amazon.
In Europe, Divshot generally showed better availability than Amazon, reporting 100% except for a brief outage.
In North America, we see a similar graph. As expected, the availability of Amazon S3 in Europe is lower and less stable when measured from North America. Divshot shows 100% availability, which is also expected.
It’s important to note that the statistics here are skewed because we are comparing our private platform, measured only by our users, to the community data from S3. The community platform collects many more data points than our private platform, and it’s almost impossible for it to show 100% availability. This is also why we chose an 80% availability threshold when we built the OpenMix Application.
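A quick calculation shows why sample size matters. Assume, purely for illustration, that every measurement succeeds independently with probability 0.999. The chance that a reporting window contains no failures at all shrinks rapidly as the number of measurements grows. The success rate and sample counts below are made up.

```python
def chance_of_perfect_window(success_rate, measurements):
    """Probability that every one of `measurements` independent checks
    succeeds, given a per-check success rate between 0 and 1."""
    return success_rate ** measurements


for n in (50, 5000):
    print(n, round(chance_of_perfect_window(0.999, n), 3))
```

With 50 measurements, a flawless window is still very likely (about 95%). With 5,000 measurements it is under 1%, even though the underlying service quality is identical.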
Next let’s look at the performance reports for response times of each platform. With very little traffic from Asia, the private performance measurements for Divshot are pretty erratic. With more traffic, the graph should stabilize into something more meaningful.
The graph for Europe behaves as expected showing Amazon S3 outperforming Divshot consistently.
The graph for North America also behaves as expected, with Divshot consistently outperforming Amazon S3.
So we’ve seen some basic data on how our traffic performs. Cedexis doesn’t stop there. We can also take a look at how our traffic could perform if we add a new provider. Let’s see how we could improve performance in North America by adding other community platforms to our graph.
I’ve added Amazon S3 in US-East, which shows almost a 30ms advantage over Divshot, though our private measurements still need to be taken lightly with so little traffic behind them. Even better performance comes from Joyent US-East. Using Joyent will require us to do more server management, but if we really care about performance, Cedexis shows that it will provide a major improvement.
To recap, in this tutorial, I’ve demonstrated how to set up a basic OpenMix application to balance traffic between two providers. Balancing between multiple CDNs or implementing a Hybrid Datacenter+CDN architecture is just as easy.
Cedexis is a great solution for setting up a multi-provider infrastructure. With the ever-growing Radar community generating performance metrics from around the globe, Cedexis provides both real-time situational awareness for your services and operational intelligence so you can make well-informed decisions for your business.