Exploring Cloud Observability vs Cloud Monitoring
If you have anything to do with cloud computing, you have probably heard the word observability. It sounds a lot like monitoring, but the two are different. Monitoring deals with capturing generated information for presentation or alerting, whereas observability focuses primarily on the content of that data and the ability to make decisions based on it. Let’s dive into what is different about them and what approach you should take to observability in the cloud.
Cloud Monitoring
From a cloud computing perspective, monitoring is the information we receive from our applications, collected with the goal of reducing downtime and better allocating our resources. We get alerts and data about failures before they harm our business or processes. Monitoring is mainly done with tools, and the main point is to achieve high availability by reducing the time it takes to detect and mitigate failures.
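As a minimal sketch of the detect-and-alert idea, the Python below fires an alert once an error rate stays above a threshold for several consecutive samples. The `Sample` type, the 5% threshold, and the three-sample rule are assumptions chosen for illustration, not the behavior of any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    timestamp: int     # seconds since the start of the window (hypothetical unit)
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


def first_alert_time(samples: List[Sample],
                     threshold: float = 0.05,
                     consecutive: int = 3) -> Optional[int]:
    """Return the timestamp at which an alert would fire, or None if it never does."""
    streak = 0
    for sample in samples:
        # Count consecutive samples above the threshold; reset on a healthy sample.
        streak = streak + 1 if sample.error_rate > threshold else 0
        if streak >= consecutive:
            return sample.timestamp
    return None


# Hypothetical one-minute samples: the error rate rises at t=60.
samples = [
    Sample(0, 0.01),
    Sample(60, 0.08),
    Sample(120, 0.09),
    Sample(180, 0.12),
    Sample(240, 0.02),
]
print(first_alert_time(samples))  # prints 180
```

Requiring fewer consecutive samples shortens the time to detect but makes false alarms more likely, which is the kind of trade-off monitoring tools let you tune.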
Cloud Observability
With observability, it’s not necessarily about taking all the logs that are generated. It is the information contained within the logs that defines observability: the capability to monitor and analyze the logs generated by our systems, specifically those that give us actionable insights. Observability in the cloud can be enhanced with three main types of generated data:
- Event Logs
- Metrics
- Tracing
It comes down to this: if a system can’t identify its own state, it lacks the observability needed to notify us of issues or failures. Observability in the cloud means understanding the systems and services in operation and having the capability to ask new questions and generate novel, relevant data, with a focus on the metadata that connects the systems and services as data flows through the platform.
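To make the idea of connecting metadata concrete, here is a small Python sketch that joins the three kinds of data on a shared identifier. It assumes every record carries a `trace_id` field, which is a simplification (real metrics often do not), and all record shapes and service names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical records; field names and values are illustrative only.
logs = [
    {"trace_id": "t-1", "service": "checkout", "message": "payment declined"},
    {"trace_id": "t-2", "service": "checkout", "message": "order placed"},
]
metrics = [
    {"trace_id": "t-1", "service": "payments", "name": "latency_ms", "value": 2300},
]
spans = [
    {"trace_id": "t-1", "service": "checkout"},
    {"trace_id": "t-1", "service": "payments"},
    {"trace_id": "t-2", "service": "checkout"},
]


def correlate(logs: List[dict], metrics: List[dict], spans: List[dict]) -> Dict[str, dict]:
    """Group logs, metrics, and the service path of each request by trace_id."""
    view: Dict[str, dict] = defaultdict(lambda: {"logs": [], "metrics": [], "path": []})
    for record in logs:
        view[record["trace_id"]]["logs"].append(record["message"])
    for record in metrics:
        view[record["trace_id"]]["metrics"].append((record["name"], record["value"]))
    for record in spans:
        # Spans are assumed to be listed in the order the request visited services.
        view[record["trace_id"]]["path"].append(record["service"])
    return dict(view)


by_trace = correlate(logs, metrics, spans)
print(by_trace["t-1"])
```

With the records joined this way, a question nobody planned for, such as "which service was slow on the request that logged a declined payment?", can be answered from data that already exists.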
Applying Observability To Our Cloud Environments
The approach to observability is focused on answering the following questions:
- What actions are being taken by whom?
- What’s in transit on my network?
- What information is necessary to answer not only the questions asked now, but also those that might be asked later?
- Is all of this information centralized and accessible to those who make changes in the environment?
Beginning with the basic foundation layer, our question becomes, “What is happening in my cloud?”
Does the organization have the configurations and services in place to answer this question at any time, beyond just “What actions are being taken by whom?”
These are important questions to start with, and they must be answered in order to gain observability. We need to know whether we can get this information out of the systems we have in place and what is contained within those messages.
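As an illustration, the question of what actions are being taken by whom can be answered with a simple aggregation once audit events are centralized. The Python sketch below uses hypothetical event records; the `actor`, `action`, and `resource` field names are assumptions and do not match any specific cloud provider's audit log schema.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical, already-centralized audit events.
audit_events = [
    {"actor": "alice", "action": "CreateBucket", "resource": "bucket/reports"},
    {"actor": "bob", "action": "DeleteInstance", "resource": "vm/web-01"},
    {"actor": "alice", "action": "UpdatePolicy", "resource": "policy/readonly"},
    {"actor": "alice", "action": "CreateBucket", "resource": "bucket/archive"},
]


def actions_by_actor(events: List[dict]) -> Dict[str, Counter]:
    """Count how many times each actor performed each action."""
    summary: Dict[str, Counter] = {}
    for event in events:
        summary.setdefault(event["actor"], Counter())[event["action"]] += 1
    return summary


summary = actions_by_actor(audit_events)
for actor, counts in sorted(summary.items()):
    print(actor, dict(counts))
```

If an event lacks a field such as the actor, the lookup fails with a `KeyError`, which is itself a useful signal that the messages do not yet contain what is needed to answer the question.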
The next layer brings the following questions:
- Does the organization understand what those actions and activities represent?
- Does the security team have insight into the design process?
- Better yet, is the security team involved in the design process?
Security observability should extend into all aspects of the business and teams. Information traversing the cloud environment must have a steward: an owner who understands the whole path. The Cloud Center of Excellence (CCE) should maintain a holistic view of the proposed solutions and implement them within the environment.
The CCE should produce the architectures and frameworks necessary to meet business needs, as driven by requests from the business owners. These efforts should be focused on cloud-native, expandable frameworks that leverage cloud providers’ well-architected scaffolds.
We should be leveraging the tools and services already in the portfolio to the fullest extent before introducing new tools. This way, we start our observability journey right and avoid building on top of a broken architecture. When the groundwork is laid correctly, implementation of the other steps becomes much more manageable. With observability, thinking more holistically becomes key and sets us up to be more successful.
Romke de Haan
Romke de Haan has over 22 years of experience as a technical & business leader and technology strategist. Romke has worked with commercial corporations such as Microsoft, Razorfish, & Kohl’s as well as federal agencies including the General Services Administration, Environmental Protection Agency, and Transportation Security Administration.
Romke has provided technology leadership in digital transformation and innovation through the design of data-driven and UI-focused systems hosted both in the cloud and on-premises. In working with federal agencies such as the TSA, Romke helped lead cloud migration initiatives by transforming organizational practices from siloed structures and waterfall methodologies to Agile delivery methods such as DevSecOps through CI/CD pipelines.
Romke’s skillset includes not only technology but also UI design and business strategy, allowing him to better align digital transformation initiatives with the needs of the business. Romke has served in various roles, including application architect, developer, and mentor to startups across the US and South America, and has taken part in civic initiatives such as being a founding member of Milwaukee’s Code for America chapter.