Practical DevOps

Using AWS to cut your monthly data transfer costs in less than 10 minutes

Using AWS CloudFront to reduce the cost of your other expensive SaaS services

Niclas Gustafsson
Published in ITNEXT
5 min read, Jun 20, 2020

Are you managing your traffic flows efficiently? (Photo by Talen de St. Croix on Unsplash)

It can be expensive to scale out. Here’s how you can save a bit of money by reducing your third-party transfer costs.

TL;DR

Reduce SaaS transfer costs using AWS CloudFront with an origin request Lambda function. With just a little code you can create a proxy that takes load off your third-party services. If there’s a large enough gap between the transfer pricing of the two services, this might save you some money.

Some background…

OK, you might not be saving thousands of dollars unless you are running a lot of traffic. We did, however, manage to save a couple of hundred dollars per month, and we have a modest amount of traffic. We’re using a service that we really enjoy for image manipulation. This particular service has a pricing model that is based on a few things. Some of them make sense and are easy to understand. Others, well, are maybe a bit harder.

Transfer cost was one of the items we wanted to optimize, since we had recently seen a surge due to increased use from our users. My colleague figured out an easy yet brilliant solution that I implemented in just a couple of hours.

Of course, service providers like the one we use need to cover their costs for growing traffic. But if their price for transfer is substantially higher than that of the alternative below, it might just make sense to try this out.

Disclaimer: I urge you to do your own calculation and not just assume you’ll save; some parameters are unique to each scenario. And I don’t know your application, so don’t send me your AWS bill in case this doesn’t go as expected. 😁
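
To make that calculation concrete, here is a rough sketch of the arithmetic. All names and numbers are made-up placeholders, not real price points from AWS or any provider, and the model only looks at per-GB transfer. It ignores request fees and Lambda invocation costs, which you would need to add for a real estimate.

```javascript
'use strict';

// Rough monthly estimate. Every input is an assumption you must replace
// with figures from your own bills and price lists.
function estimateMonthlySavings({ transferGb, saasPricePerGb, cdnPricePerGb, cacheHitRatio }) {
  // Today: every GB is served (and billed) by the SaaS provider.
  const before = transferGb * saasPricePerGb;
  // With the proxy: every GB leaves through the CDN...
  const cdnCost = transferGb * cdnPricePerGb;
  // ...and only cache misses are still fetched from the provider.
  const saasCostAfter = transferGb * (1 - cacheHitRatio) * saasPricePerGb;
  const after = cdnCost + saasCostAfter;
  return { before, after, savings: before - after };
}

// Placeholder numbers purely for illustration.
const estimate = estimateMonthlySavings({
  transferGb: 1000,
  saasPricePerGb: 0.25,
  cdnPricePerGb: 0.10,
  cacheHitRatio: 0.9,
});
console.log(estimate);
```

Note that the savings turn negative when the cache hit ratio is low, because you then pay for the transfer twice.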

Caching

I have spent a fair amount of time working in this problem domain, optimising traffic flows with different caching techniques. Most people who manage web sites that handle any kind of serious load sooner (preferably) or later end up needing to implement different kinds of caching.

ID 85153447 © Tracy Hebden | Dreamstime.com

Caching is implemented in layers, from the innermost, application-aware parts to the outermost layer that, in our case, is the CDN that caches the full response from the application stack. In this post we focus on using the AWS CloudFront CDN to cache responses delivered by a third-party SaaS service.

AWS CloudFront has a powerful feature: you can define functions that control the logic of the service. There are four points in the CloudFront request life cycle where you can manipulate the request/response flow: viewer request, origin request, origin response and viewer response.

Source: AWS Documentation
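
To illustrate the four trigger points, here is a small sketch of a pass-through handler that works out which point it was invoked at. The event type strings are the ones CloudFront puts in the event; the function names are my own, and the handler does nothing useful beyond handing the untouched object back.

```javascript
'use strict';

// The four trigger points, in the order a cache miss passes through them.
const STAGES = ['viewer-request', 'origin-request', 'origin-response', 'viewer-response'];

// Hypothetical helper: returns the object CloudFront expects back for the stage.
function passThrough(cf) {
  const stage = cf.config.eventType;
  if (!STAGES.includes(stage)) {
    throw new Error(`unexpected event type: ${stage}`);
  }
  // Request triggers hand the request on; response triggers hand the response on.
  return stage.endsWith('-request') ? cf.request : cf.response;
}

const handler = async (event) => passThrough(event.Records[0].cf);
exports.handler = handler;
```

Only the origin request trigger matters for the rest of this post, since it runs on cache misses only, right before CloudFront contacts the origin.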

In this post I’ll show how to write an origin request Lambda function in Node.js that lets you transparently proxy requests to your external SaaS service.

Setting up AWS CloudFront CDN

For brevity we just use an out-of-the-box CloudFront distribution. For any kind of real use, you’d probably want to use your own domain and your own SSL certificate.

Spinning up a new CloudFront distribution is easy; one caveat is the price class to use. The price changes a bit depending on whether you want local presence in all regions or not. We’ll choose the cheapest (price class 100) since our user base is mainly in the US and Europe. That doesn’t mean that we won’t serve users from other regions, only that they will have a slightly longer round trip, since they will be served from an edge server in one of the regions included in this price class.

Default settings should be good for a proof of concept. If you send query parameters to your service provider, you may need to change:
Query String Forwarding and Caching: Forward all, cache based on all

For this test distribution we don’t use any other origin, but since CloudFront insists that you supply a default origin, I’d recommend that you point it at an empty S3 bucket for the default content. When configuring this for production or in an existing staging/test environment, you probably already have a CloudFront distribution that you’ll use.

CloudFront origin request Lambda function

With the CloudFront distribution set up, we head over to the AWS Lambda section.

All CloudFront Lambda functions need to reside in the us-east-1 region, so switch over to this region and create a new Node.js 10 Lambda function.
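
As a minimal sketch (not the exact function we run), an origin request handler for this setup could look like the following. It assumes a hypothetical provider hostname, images.example-saas.com, and the /pfix prefix used later in this post. The timeout and TLS values are placeholders to adjust for your provider.

```javascript
'use strict';

// Assumptions: placeholder provider hostname and the example prefix from this post.
const SAAS_HOST = 'images.example-saas.com';
const PREFIX = '/pfix';

// Rewrites a CloudFront origin request so it is fetched from the SaaS provider.
function rewriteRequest(request) {
  // Cut the prefix so the provider sees the path it expects.
  if (request.uri.startsWith(PREFIX + '/')) {
    request.uri = request.uri.slice(PREFIX.length);
  }

  // Swap the default origin (the empty S3 bucket) for the provider.
  request.origin = {
    custom: {
      domainName: SAAS_HOST,
      port: 443,
      protocol: 'https',
      path: '',
      sslProtocols: ['TLSv1.2'],
      readTimeout: 30,
      keepaliveTimeout: 5,
      customHeaders: {},
    },
  };

  // The Host header has to match the new origin.
  request.headers.host = [{ key: 'Host', value: SAAS_HOST }];
  return request;
}

const handler = async (event) => rewriteRequest(event.Records[0].cf.request);
exports.handler = handler;
```

The query string is left untouched, so any parameters forwarded by CloudFront still reach the provider.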

Save and publish a new version. This step is important because a CloudFront distribution can only reference published versions of Lambda functions ($LATEST does not work).

Copy the published ARN, e.g.:
arn:aws:lambda:us-east-1:371199999999:function:cdn-proxy:1

Associating Lambda with CloudFront Behaviour

🔙 Back to the CloudFront settings: now we need to create a new behaviour that will execute our Lambda function. You need to decide on a prefix that makes sense to you and your existing application if you share the distribution. For this example let’s go with /pfix/. This needs to match the logic in your Lambda function, since the function cuts the prefix out before forwarding the request to the origin server (the SaaS provider).

Make sure that the IAM role for your Lambda function also includes the rights needed to execute at the edge. (Add edgelambda.amazonaws.com to the role’s trust relationship.)
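
For reference, the trust relationship could look roughly like this. The snippet just builds the policy document in Node.js and prints it as JSON, which you can paste into the role’s trust relationship editor. Treat it as a sketch and check it against the role you already have.

```javascript
'use strict';

// Trust policy that lets both regular Lambda and Lambda@Edge assume the role.
const trustPolicy = {
  Version: '2012-10-17',
  Statement: [
    {
      Effect: 'Allow',
      Principal: {
        Service: ['lambda.amazonaws.com', 'edgelambda.amazonaws.com'],
      },
      Action: 'sts:AssumeRole',
    },
  ],
};

console.log(JSON.stringify(trustPolicy, null, 2));
```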

There you have it. You should now be able to cache the content from your provider in CloudFront and only re-request it when the cache needs to refresh or the object gets evicted from CloudFront for some reason. Just replace all your links to your supplier with links to CloudFront, and prefix the path with the prefix you chose above:

A request to <yourdistributionid>.cloudfront.net/pfix/ResourcesFromSaaS/ will request the same URL (except for the /pfix/ part) from your SaaS provider and then save it in CloudFront for the next request (which will not hit your provider), and therefore save on the billing for transfer costs.
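
If your links are generated in code, the swap can be done in one place. Here is a small sketch of such a helper; the function name and both hostnames are placeholders I made up for the example.

```javascript
'use strict';

// Hypothetical helper: points a provider URL at the CloudFront distribution
// and adds the prefix that the behaviour (and the Lambda function) expects.
function toCdnUrl(saasUrl, cdnHost, prefix) {
  const url = new URL(saasUrl);
  url.hostname = cdnHost;
  url.pathname = prefix + url.pathname;
  return url.toString();
}

// Placeholder hostnames purely for illustration.
const cdnUrl = toCdnUrl(
  'https://images.example-saas.com/ResourcesFromSaaS/a.jpg?w=100',
  'd111111abcdef8.cloudfront.net',
  '/pfix'
);
console.log(cdnUrl);
```

The path and query string are kept as they are, so the provider receives the same request it would have received directly.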
