Forecasting your AWS costs and workloads 🌦

Photo by NOAA on Unsplash

So you’ve landed that first seed round and now you need to scale. How do you forecast your costs?

AWS Activate — avoid that big bill!

If you’re eligible for Activate you can save yourself a lot of money. If you’ve already raised seed funding you are likely to qualify, although the exact criteria are set by AWS. The credits are worth up to $100,000 USD. It’s often a shock for founders when the credits run out and a big bill from Amazon arrives a couple of years later! See my other post on “Scaling your startup” https://link.medium.com/7rsCHkNaJkb for an approach to this transition.

For more information on AWS Activate read 👀 here:

https://aws.amazon.com/activate/

AWS Free Tier

The reason to check this out is to see at what point you start getting charged, and for what. Always refer back to it when budgeting. Sometimes the charge is per hour, and sometimes it is per transaction or by data volume.
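To make the "at what point do I start getting charged" question concrete, here is a minimal sketch of the arithmetic. The function name and every figure in it are placeholders I have made up for illustration, so take the real allowances and unit prices from the Free Tier and pricing pages for the services you actually use.

```python
def billable_cost(usage: float, free_allowance: float, unit_price: float) -> float:
    """Charge only for the usage that goes above the free allowance."""
    billable_units = max(0.0, usage - free_allowance)
    return billable_units * unit_price


# Placeholder figures: 1,000 units used, 750 of them free, 0.01 USD per unit.
# A "unit" could be an hour, a request or a GB, depending on the service.
over_allowance = billable_cost(usage=1000, free_allowance=750, unit_price=0.01)
under_allowance = billable_cost(usage=500, free_allowance=750, unit_price=0.01)

print(f"Over the allowance: ${over_allowance:.2f}")
print(f"Under the allowance: ${under_allowance:.2f}")
```

Running the same calculation per service, with the right unit for each, gives you a rough idea of which line items will start costing you first.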

https://aws.amazon.com/free/free/?all-free-tier.sort-by=item.additionalFields.SortRank&all-free-tier.sort-order=asc&awsf.Free%20Tier%20Types=*all&awsf.Free%20Tier%20Categories=*all&c=nhp&z=2&awswt=203b

Budgeting tool — AWS calculator

https://calculator.aws/#/

This is fairly simple to use, but it assumes that you have already defined your future architecture: how many users you expect, and therefore how many transactions of each type, what data volumes, etc. you are likely to need to support.

- Select My Estimate, then select the products that will be in your architecture and configure the transactions, regions, etc. for each one. Click add, and once you have added all the products, VMs, etc. you should have an overall estimate.

Once you have done that, you can set up budget limits (see below). An estimate is exactly that, and your bills might turn out differently, so I would add a cost tolerance. I would normally say 10%, but it depends on your confidence level.

I would draw up the architecture first. If you’re not starting from scratch and already have workloads running, use https://aws.amazon.com/solutions/implementations/aws-perspective/ to generate a diagram of them, which can be handy once you start running hundreds of containers or other complex architectural ecosystems.
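The cost tolerance idea is simple enough to sketch in a few lines. This is only an illustration: the line items and dollar amounts are invented, and the 10% default is the rule of thumb from this post, not an AWS recommendation.

```python
def estimate_with_tolerance(
    line_items: dict[str, float], tolerance: float = 0.10
) -> tuple[float, float]:
    """Return the base monthly estimate and the ceiling after adding tolerance."""
    base = sum(line_items.values())
    return base, base * (1 + tolerance)


# Hypothetical monthly figures, as you might copy them out of the calculator.
estimate = {
    "compute": 420.0,
    "database": 310.0,
    "storage": 45.0,
    "data_transfer": 25.0,
}

base, ceiling = estimate_with_tolerance(estimate)
print(f"Estimate: ${base:,.2f}/month, budget ceiling: ${ceiling:,.2f}/month")
```

If you are less confident in your traffic assumptions, pass a bigger `tolerance` and use the ceiling, not the base figure, as your budget limit.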

You can set up both alerts and budget limits from within the root account.

Click on Budgets and then Cost Explorer; you will find these under My Billing Dashboard. There you can set alerts as well as cost thresholds, emailed billing, etc. Make sure you have also ticked any checkboxes for automated notifications via email.
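If you would rather script this than click through the console, the same kind of budget can be created with boto3's Budgets client. Treat this as a rough sketch under assumptions: the account ID, budget name, email address and the 80% threshold are all placeholders, and the two helper functions are my own naming, not part of boto3. The actual AWS call is left commented out because it needs credentials with permission to manage budgets.

```python
def build_budget_request(
    account_id: str,
    name: str,
    monthly_limit_usd: float,
    alert_email: str,
    threshold_pct: float = 80.0,
) -> dict:
    """Build the arguments for the Budgets create_budget call."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": name,
            # The API expects the amount as a string.
            "BudgetLimit": {"Amount": f"{monthly_limit_usd:.2f}", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                # Email when actual spend passes threshold_pct of the limit.
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": threshold_pct,
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [
                    {"SubscriptionType": "EMAIL", "Address": alert_email}
                ],
            }
        ],
    }


def create_budget(request: dict) -> None:
    """Send the request to AWS. Needs boto3 installed and valid credentials."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("budgets").create_budget(**request)


# Placeholder values throughout; swap in your own account ID and email.
request = build_budget_request(
    account_id="123456789012",
    name="monthly-cost-budget",
    monthly_limit_usd=880.0,
    alert_email="billing@example.com",
)
# create_budget(request)  # uncomment to actually create the budget
```

Keeping the payload builder separate from the call means you can review exactly what will be sent before anything touches your account.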

Highly Available or Fault-Tolerant?

Well, now you need to think a bit 🤔….💭

People often confuse these two things, but they are not the same, and it’s important to define which one you need at an early scaling stage. A highly available system aims to keep downtime to a minimum and recover quickly; a fault-tolerant system is designed to keep running through a failure with no interruption at all. If you’re running financial services or critical-care medical services there is NO WAY you can be anything other than fault-tolerant. The implications of anything else are serious. But if you’re just selling ad services or something similar, is it really the end of the world if your services are interrupted? How much will that cost you? Loss of business is clearly a big deal, but litigation and possible serious harm are something else, of course.

It is important to do at least some rough calculations for business interruption: how much lost business you are prepared to tolerate, and what your insurer may or may not cover. People so often assume things like this are covered when unfortunately they are not.
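Here is one way to do that rough calculation. It is a back-of-the-envelope sketch, not an actuarial model: the revenue per hour is an invented figure, the availability levels are just common examples, and `insured_fraction` is a hypothetical stand-in for whatever share your policy really covers.

```python
HOURS_PER_YEAR = 24 * 365  # ignoring leap years


def downtime_hours_per_year(availability_pct: float) -> float:
    """Expected hours of downtime per year at a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def interruption_cost(
    downtime_hours: float, revenue_per_hour: float, insured_fraction: float = 0.0
) -> float:
    """Lost revenue you carry yourself after any share the insurer covers."""
    return downtime_hours * revenue_per_hour * (1 - insured_fraction)


# Hypothetical business earning 500 USD per hour, with nothing insured.
for availability in (99.0, 99.9, 99.99):
    hours = downtime_hours_per_year(availability)
    cost = interruption_cost(hours, revenue_per_hour=500.0)
    print(f"{availability}% available: {hours:.2f} h/year down, about ${cost:,.0f} lost")
```

Comparing the cost at each availability level with what it would cost to reach that level is a quick way to see how much resilience is worth paying for.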

Oh My God! It's a Disaster…😱

Effective Disaster Recovery and Business Continuity cost money regardless of your process for handling them. You will have some costs on a monthly or yearly basis.

“But it’s so unlikely something like that would happen.” Unlikely things can and do happen, and more often than you want or think. One SaaS business I worked for used to maintain their own hardware, and one day the server boxes caught fire. The staff were mostly happy with the extra day off whilst it got fixed, but the angry customers were fairly unimpressed, and let’s just say it was an expensive couple of days, never to be repeated. It shouldn’t take an event like that for you to consider your DR seriously. But if you don’t have any background knowledge of IT infrastructure (many startups and certainly many developers don’t), it may not be obvious to you.

Make sure you have a plan, cost in your staff time to execute continuity plans, and make sure you practice this.

Approaches vary depending on whether or not you need to keep operating through a failure.

Backup and Restore

In AWS the approaches start with Backup and Restore, the cheapest solution. Here we are assuming that you would, for example, back up your snapshots to tape storage or the cloud. Unless you automate the restore process, there is an overhead in doing it manually. However, it is a cheap solution, as you only have to consider storage costs for the backups.

What you need to consider….

Other than the business interruption cost… 🤔

What data and services do you need to back up? How often does this need to happen? The Backup and Restore approach assumes you can afford to lose some hours whilst you perform your data and service recovery.

If you want a faster approach, you need to consider Pilot Light or Warm Standby. Or if you are delivering critical services, or services which simply cannot be interrupted, then you need to consider an Active-Active configuration. This is realistically the most expensive option.

A good guide from AWS here:

https://aws.amazon.com/blogs/architecture/disaster-recovery-dr-architecture-on-aws-part-i-strategies-for-recovery-in-the-cloud/

Pilot Light and Warm Standby

Pilot Light and Warm Standby both assume that you have set up a second set of infrastructure, whether physical or in the cloud. So if you’re running a couple of application servers and a DB, under pilot light you would have the data replicating to another DB in the secondary site, synced and ready to be used. The servers in this case would need some time to spin up. This is a good option if you want to save money but still need to recover quickly.

With warm standby you replicate your whole architecture in the secondary site, as you would with pilot light, but everything is kept running at a minimized size. You might run the backup site on medium rather than large server instances, for example. Because everything is already running on minimal infrastructure, it can be rapidly scaled up. Recovery is a lot faster than with the pilot light option.

A handy detailed look from AWS + architecture diagrams.

Active-Active configuration — pricey but…

Active-Active, or multisite: you simply have multiple sites with the same infrastructure, so your costs are roughly 200%, as you are running two full sets that are syncing all the time.

Best practice with multisite is essentially an identical copy of your infrastructure, which means your secondary environment is ready to go at all times. Because the two environments are always in sync you can load-balance between them and optimize for high availability. This is the best option in terms of performance.
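To compare the four approaches side by side, here is a rough sketch of what the secondary site adds to your monthly bill under each one. The cost model is my own simplification of the descriptions above, and all the dollar figures are invented: it assumes backups cost only storage, pilot light adds a full-size replica DB, warm standby adds servers at a reduced fraction of their normal size, and Active-Active adds a full second copy. Data transfer and staff time are left out.

```python
def standby_costs(
    compute: float,
    database: float,
    backup_storage: float,
    warm_fraction: float = 0.5,
) -> dict[str, float]:
    """Extra monthly cost of the secondary site under each DR approach."""
    return {
        # Only the stored backups are paid for.
        "backup_and_restore": backup_storage,
        # Replica DB kept in sync; application servers switched off.
        "pilot_light": backup_storage + database,
        # Servers running too, but scaled down (assumed: DB stays full size).
        "warm_standby": backup_storage + database + compute * warm_fraction,
        # A full second copy of everything, always on.
        "active_active": backup_storage + database + compute,
    }


# Hypothetical monthly costs for the primary site's components.
costs = standby_costs(compute=420.0, database=310.0, backup_storage=45.0)
for approach, extra in costs.items():
    print(f"{approach}: about ${extra:,.2f}/month on top of the primary site")
```

Put these numbers next to your business interruption figures and the right approach for your stage usually becomes a lot clearer.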

Final Forecasts

As ever, there are a lot of considerations with costs, and however you estimate them, the estimates are never entirely accurate. An estimate means exactly that. It's only experience and practice with the tools and services over time that will teach you how to forecast accurately. Fortunately, AWS and other cloud providers offer products and services that make this easier for you.

This was, as ever, just a starter for ten. I would be busy writing all day if I covered all the possibilities here but if you just do what I cover here you’ll be ahead of a lot of startup teams.

I hope this was useful and insightful for you.

You can trust me 😎 I’m an Architect.

👩🏻‍🔧👩‍💻👩🏽‍💻

Gemma

Business Developer, programmer, solution architect, runner, swimmer, a culture and tech nerd. Busy building new solutions in emerging technologies.