Monitoring your cloud… the when, what, and the how?

Sep 8, 2021

So you have moved to the cloud. Your infrastructure is all set up, you can deploy anything within seconds, auto-scaling handles your load, and you are paying less. How cool is that?

You are as happy as you can be…

But… while humming those happy tunes, how do you make sure your cloud stays the way you want it to be? How do you know you are doing the right things in your cloud, and not making silly mistakes, or mistakes you are not even aware of, that would cost you all the benefits you are so happy about?

This is why continuous monitoring is necessary in the cloud. I say CONTINUOUS because, unlike in your traditional infrastructure, things in the cloud can change in minutes, if not seconds!

How you monitor depends on the operational model you have designed your cloud around. You can adopt various operational models when you set up your cloud. Some of the well-known ones are:

Centrally managed, with a centralised DevOps team, where tenants own only the applications they host
Decentralised, where tenants own everything, from the cloud infrastructure to the applications
Hybrid, a shared responsibility model where a centralised team owns some infrastructure layers and tenant teams own the rest

Whichever operational model you choose, there is one thing that cannot be overlooked: monitoring. That means ensuring you have the capacity to monitor and govern your cloud so that it abides by your organisation’s policies and standards, and then, based on your operational model, empowering either the tenants or the central cloud infrastructure team to remediate any non-compliance.
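To make the link between the operational model and remediation concrete, here is a minimal sketch of how a non-compliance finding might be routed to the team that should fix it. Everything in it is an assumption made for illustration: the `Finding` fields, the two layer names, and the choice that the central team owns the "infrastructure" layer in the hybrid model. Your own split will differ.

```python
from dataclasses import dataclass
from enum import Enum


class OperatingModel(Enum):
    CENTRAL = "central"
    DECENTRALISED = "decentralised"
    HYBRID = "hybrid"


@dataclass(frozen=True)
class Finding:
    account: str  # cloud account where the non-compliance was detected
    layer: str    # "infrastructure" or "application" (hypothetical layer names)
    policy: str   # name of the organisational policy that was violated


# Assumption: in the hybrid model the central team owns these layers.
CENTRAL_LAYERS_IN_HYBRID = {"infrastructure"}


def remediation_owner(finding: Finding, model: OperatingModel) -> str:
    """Return which team should remediate a finding under the given model."""
    if model is OperatingModel.DECENTRALISED:
        # Tenants own everything, so they remediate everything.
        return "tenant"
    if model is OperatingModel.CENTRAL:
        # Tenants own only the applications they host.
        return "tenant" if finding.layer == "application" else "central"
    # Hybrid: ownership depends on which layer the finding belongs to.
    return "central" if finding.layer in CENTRAL_LAYERS_IN_HYBRID else "tenant"


finding = Finding(account="acct-1", layer="infrastructure", policy="owner-tag")
owner_if_central = remediation_owner(finding, OperatingModel.CENTRAL)
owner_if_decentralised = remediation_owner(finding, OperatingModel.DECENTRALISED)
```

The same finding lands with a different team depending on the model, which is why the model needs to be settled before the alerts start flowing.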

When you implement this monitoring capability, it is important to understand what you should monitor and what you should not. This understanding helps you create the least possible noise for your infrastructure teams while making sure you handle what is important to you. In my experience, this is the most important thing to figure out when you set up your monitoring capability.

What are your important assets, and where do they live?
How do you identify your important assets?
What controls do you want across all assets, and what controls do you want only on the important assets? And of course, what is the justification for this decision?
What is the risk of not monitoring certain assets, and who owns this risk?
How do you ensure your monitoring capacity sufficiently covers your cloud space, so that you know what is there? (How many accounts do you have? How many tiers do you have, and which tier has what? What services are enabled across these tiers?)
How do you provide visibility into the metrics you get from your monitoring capability? Why might a holistic view be more important than a singled-out view of each non-compliance event (dashboard-based vs ticket-based)?
How do you encourage your tenants and infrastructure teams to understand the risk of non-compliance and better prioritise remediation?
How often should you monitor? Yes, you can have continuous monitoring, but considering its cost, how continuous do you want to be? Per second? Per hour? Per day? You could gather information continuously, but remember that the effectiveness of your monitoring capability depends on how often alerts are generated and how often they are tuned and validated.
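A few of these questions can be sketched in code: which controls apply to all assets versus only the important ones, and how a dashboard-style view differs from one ticket per event. The sketch below is only an illustration. The control names, tier names, asset names, and the `Asset` structure are all hypothetical, and a real inventory would come from your cloud provider's tooling rather than a hand-written list.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    account: str
    tier: str           # e.g. "prod" or "dev" (assumed tier names)
    important: bool     # result of your own asset classification
    passed: frozenset   # controls this asset currently satisfies


# Assumed control sets: a baseline for every asset, extras for important ones.
BASELINE_CONTROLS = frozenset({"encryption-at-rest", "owner-tag"})
IMPORTANT_ONLY_CONTROLS = frozenset({"no-public-access", "audit-logging"})


def required_controls(asset: Asset) -> frozenset:
    """Controls that apply to this asset, given its importance."""
    if asset.important:
        return BASELINE_CONTROLS | IMPORTANT_ONLY_CONTROLS
    return BASELINE_CONTROLS


def failed_controls(asset: Asset) -> frozenset:
    """Required controls the asset does not currently satisfy."""
    return required_controls(asset) - asset.passed


def dashboard_summary(assets) -> Counter:
    """Count failures per (tier, control) instead of raising one ticket each."""
    counts = Counter()
    for asset in assets:
        for control in failed_controls(asset):
            counts[(asset.tier, control)] += 1
    return counts


assets = [
    Asset("billing-db", "acct-1", "prod", True,
          frozenset({"encryption-at-rest", "owner-tag", "audit-logging"})),
    Asset("scratch-vm", "acct-2", "dev", False, frozenset({"owner-tag"})),
    Asset("build-cache", "acct-2", "dev", False, frozenset({"owner-tag"})),
]
summary = dashboard_summary(assets)
```

Here the two dev assets missing encryption show up as a single aggregated count rather than two separate alerts, while the one gap on the important production asset stays visible on its own line. That is the kind of noise reduction the questions above are meant to drive.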

These are good questions to ask when you design a monitoring capability. But the most important thing is to build the monitoring capability together with the infrastructure. If you design your cloud without monitoring and think you can add it later, then before you know it your cloud will have grown to 100 accounts with thousands of instances, and you will have no way of knowing what is where or how to monitor it.

Security and infrastructure in the cloud are tightly coupled from day zero. The cloud comes with attractive benefits, like on-demand services and setting up infrastructure with a single template and the click of a button. While these do make your life easy, they come at a cost: a cost to security. Just as you can spin up an instance with the click of a button, you could just as easily spin one down accidentally. Thus, you need a clear understanding of what to enable, what not to enable, and how to enable it; then what to monitor, what not to monitor, and how to monitor it. Whether you gain from the cloud or lose with it depends on how smartly you make these decisions.

You could have all the technology you want, but if you do not use it well from the get-go, your life in the cloud will be a miserable one.


Written by Awanthika Senarath
