Blue Team Diaries: Becoming Data Smart
“I can’t afford to not be data-smart.”
- Doug Clendening, Principal Services Consultant at Open Raven (Previously Principal Cyber Incident Commander at Splunk)
Blue teams aren’t quite the cape-wearing heroes featured in comics, but they aren't far off when it comes to protecting organizations from cybersecurity threats and swooping in to respond to attacks. These teams operate on information. They must know the potential threats to the business, and ultimately what they need to protect. An effective blue team will understand the operational needs of the organization, and be able to design efficient workflows to ensure timely identification and remediation of issues. For a team constantly racing against the clock, time is everything.
To do this, blue teams typically pull massive quantities of activity logs from all parts of their environment into a SIEM for centralized analysis. To avoid drowning in a sea of events, teams must prioritize by level of risk. In general, they know the criticality of the asset involved, the severity of the vulnerability, and the permissions of the user, but often lack visibility into the data involved. To prioritize effectively, teams need this additional context, which is not so easily found. For example, an open S3 bucket hosting data for a public website is no cause for alarm, but the same bucket ending up with unencrypted customer financial data is a different story.
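As a rough illustration of why data context changes the priority, the sketch below scores two otherwise identical open buckets. Everything in it is hypothetical: the `Finding` fields, the bucket names, the sensitivity labels and the weights are made up for the example, and it assumes a data classification label is already available for each resource.

```python
from dataclasses import dataclass

# Hypothetical weights; a real team would tune these to its own risk model.
SENSITIVITY_WEIGHT = {"public": 0, "internal": 2, "customer_financial": 5}


@dataclass
class Finding:
    resource: str            # illustrative resource name
    publicly_readable: bool  # exposure, as a SIEM alert might already report
    encrypted: bool
    data_class: str          # assumed output of a data classification step


def risk_score(finding: Finding) -> int:
    """Scale exposure by data sensitivity so harmless open buckets score zero."""
    weight = SENSITIVITY_WEIGHT[finding.data_class]
    score = weight
    if finding.publicly_readable:
        score *= 2
    if not finding.encrypted:
        score += weight  # missing encryption matters in proportion to sensitivity
    return score


findings = [
    Finding("s3://website-assets", True, False, "public"),
    Finding("s3://billing-exports", True, False, "customer_financial"),
]

# Highest risk first: the bucket holding financial data rises to the top.
ranked = sorted(findings, key=risk_score, reverse=True)
for f in ranked:
    print(f.resource, risk_score(f))
```

Without the `data_class` field, the two findings are indistinguishable, which is the visibility gap described above.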
With data sprawl in the cloud being a new fact of life, such gaps in data visibility become even more of a challenge for blue teams. To explore why becoming data-smart deserves priority, we talked to one of our recent hires, Doug Clendening. Doug comes from a background as a blue team member, most recently having helped build Splunk’s cyber incident response team. Before that, he handled large-scale incident response (IR) efforts for CrowdStrike’s services customers. In the following interview, Doug talks about his experience and the impact that becoming data-smart has on blue team operations.
Thank you for taking the time to talk with us. We’re happy to have you aboard. For the uninitiated, what is a blue team?
Blue teams are responsible for the operational approach to cybersecurity. We prepare for threats and are responsible for detecting and responding to incidents. The core foundational disciplines are typically considered to be cyber intelligence, detection/prevention and response.
What does being ‘data smart’ mean to you and for blue teams, in general?
Although it might seem like it, there’s no magic in what we do. Being ‘data smart’ means having the right context about the sensitive data in your environment for threat planning, detection and response. The fewer steps and people involved, the ‘smarter’ you are.
Blue teams need to make business-critical decisions, quickly. Are we being attacked? By whom? What are they doing? What have they compromised? What is the risk to our business? To do that effectively, we need to know as much as possible in the shortest time possible. When there’s no time for interviews or manual investigations to gain more clarity (e.g., ransomware is extremely time-sensitive), we’re left with only assumptions and hope at a time when leaders are looking to us for solid answers and solutions.
What about the changes in data is so significant to security teams? Is it how we use it, where it ends up, how it’s structured, or a mix?
Data is being decoupled and abstracted away from the traditional assets to which it was once attached. Think standalone S3 buckets instead of disks on a file server. At the same time, the ease of setting up new data storage and the rate of development are creating data sprawl in the cloud. While this decoupling adds speed and mobility that help in growing a business, it can also hide context that is valuable in protecting it.
What this means from a security perspective is that your attack surface is ever increasing. Not only that, but it’s increasing at a rate that is outpacing traditional processes and tools meant to keep teams aware of their environment to make critical decisions quickly. So I guess the short answer is that it’s all of those things and becoming data-smart is the only way to keep pace with the business you’re tasked to protect. Data is the new ‘endpoint.’
Let’s talk about some of the challenges teams face today: ransomware, supply chain attacks, vast data lakes and warehouses, structured vs unstructured data, data mobility (or sprawl), remote working, existing and new regulations, shared responsibility, shifting left, etc. Why is now the time to prioritize becoming ‘data-smart’ if teams haven’t been for so long?
Being data-smart would always have been advantageous, but the changes you mention have turned it from a nice-to-have into a critical requirement. As they say, “Hindsight is 20-20”, but a lot changed and it snuck up on us, on the whole [security] industry. And the reminders are everywhere--every day, there’s a story about security teams getting caught in a serious (and often preventable) incident. Many are unable to confirm quickly, if ever, what data was breached or taken.
Data is more mobile than ever and that’s not going to slow down. There are vast data lakes and warehouses that house petabytes of unstructured data, some of which is sensitive and some is not. The reduction in data visibility has created a serious amount of nervous energy as we prepare for and respond to incidents--data is a primary target. Most teams get the importance of such visibility, but not everyone can build their own tools and most end up doing the best they can with what they have--usually some patchwork of tools, scripts and services. If teams can’t keep up with their data in the cloud, they’re going to struggle to protect it.
How does a blue team respond if you can’t confirm whether or not regulated data types are involved? For example, you know that EU personal data is being handled by the organization, but are not sure if such data has been breached. What are your options?
That’s a big issue and the answer is that we have to assume the worst. Again, time is critical because we can be fined if we’re late to produce a completed breach report in cases like those involving EU personal data (GDPR). If we were to respond and hope that such data was not involved, it would be less work, but the organization could face major penalties and fines that would far outweigh the time saved. On the other hand, if we unnecessarily respond according to a “GDPR playbook,” it can cause issues (bad PR) on top of the additional time wasted across multiple teams. Knowing the data involved in a breach is a key requirement for proper incident response, and any “bug” in those requirements can prove costly.
I won’t ask you to name names, but what is a common or real-life incident you can think of that would’ve been or can be better handled with a data-smart platform?
Where to start? I think most blue teams, myself included, have had incidents wherein a developer accidentally leaked their AWS credentials in a public repo, allowing access to their account to anyone who grabbed those credentials. We almost always had a tough time determining what information was at risk in a timely manner. The first step is to gather context we lack so we can evaluate and prioritize next steps: Who owns the account? What is it used for? Who accessed the account? What actions did they take? What data is involved? Blue teams need these answers quickly, as the risk increases with every passing second. Obviously, back then, I had no such quick answers.
Even with AWS CloudTrail logs in our SIEM that show potential malicious activity in the account, I still didn’t have visibility into the data to understand the risk to the business. Instead, I had to interview the developer and manually poke around the AWS account, which required first gaining access. Being data-smart would’ve saved a good amount of time between receiving the alert, assessing the risk, and knowing what action to take. It just makes the whole process easier, faster and more accurate.
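To make the gap concrete, here is a minimal sketch of the part the logs can answer: which S3 objects a leaked access key touched. It assumes CloudTrail records have already been exported as Python dictionaries and that S3 data events were being logged. The helper name `objects_touched_by_key`, the bucket names and the sample records are all hypothetical; the access key is the placeholder value used in AWS documentation.

```python
from collections import defaultdict


def objects_touched_by_key(records, access_key_id):
    """Group S3 activity for one access key by bucket.

    Returns {bucket: {(event_name, object_key), ...}}. The object key is None
    for bucket-level calls such as ListObjects.
    """
    touched = defaultdict(set)
    for rec in records:
        identity = rec.get("userIdentity") or {}
        if identity.get("accessKeyId") != access_key_id:
            continue
        if rec.get("eventSource") != "s3.amazonaws.com":
            continue
        params = rec.get("requestParameters") or {}
        bucket = params.get("bucketName")
        if bucket:
            touched[bucket].add((rec.get("eventName"), params.get("key")))
    return dict(touched)


# Hypothetical, heavily trimmed records for illustration only.
sample_records = [
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"},
     "requestParameters": {"bucketName": "billing-exports", "key": "2021/q1.csv"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "ListObjects",
     "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"},
     "requestParameters": {"bucketName": "billing-exports"}},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances",
     "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"},
     "requestParameters": None},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "userIdentity": {"accessKeyId": "AKIAI44QH8DHBEXAMPLE"},
     "requestParameters": {"bucketName": "website-assets", "key": "logo.png"}},
]

result = objects_touched_by_key(sample_records, "AKIAIOSFODNN7EXAMPLE")
print(result)
```

A query like this can say that `2021/q1.csv` was read, but not whether that file holds customer financial data. Answering that still takes the data context the interview describes.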
To close, if we had interviewed you five years ago, do you think you would’ve been asking for an automated data classification engine to become data-smart? What is your recommendation to blue teams?
Cybersecurity is a practice of prioritizing action based on risk in a world where everything changes, all the time, quickly. Being data-smart is one of those things that we probably talked about and wished for, but there simply weren’t any viable options for us and the priority wasn’t high enough. With today’s advancement of data in the cloud, I can’t imagine life any other way.
When it comes down to it, knowing that there is finally a viable option out there, I can’t afford to not be data-smart.
My recommendation to blue teams is to look seriously at your current workflows, consider whether being data-smart would make your lives easier, and then come take a look for yourselves.