DSPM depth or breadth? Why not both?
“Gartner wrote a research report on your space— nice. There’s quite a few companies listed…”
Last Thursday, that was how the conversation started with a partner who had read the Gartner Innovation Insight note on DSPM. While it may seem deflating or a little confusing to be in a young category with a crowd of companies, it’s actually... completely normal. And quite good. Let me explain.
We started building a data security platform in 2019, guided by the feedback that there needed to be a “CSPM-like product for data”. In spite of the global mayhem that followed (pandemic, civil unrest, war, SVB collapse, etc.), our roadmap has stayed incredibly consistent throughout, and we’ve welcomed a number of other companies to the neighborhood. Some transitioned in from privacy, some came with roots in CSPM, and others were startups focused solely on DSPM. We see it as fantastic validation and the sign of a major new category of product being born. It’s similar to what we saw happen with “next generation endpoint” around 2013, when no fewer than 40 companies seemed to appear overnight. And it’s not terribly different from the crowded vulnerability management market of the early 2000s. If you find yourself in a product category by yourself, you’re either a once-in-a-generation genius or your idea isn’t very good. Smart money bets on the latter.
In spite of confusing claims that even the young companies “do all the things”, there are real differences between the DSPM offerings in the market today. Quite a bit of it comes down to who you’re building your product for and what you think they need. I’ll explain some of the things that we believe make us different while highlighting our recent progress in each area.
Depth and Breadth
We started our journey by going deep into AWS, making sure we could tackle the highest-risk data at massive scale: the enormous piles of data in S3. It’s the grandfather of all public cloud services, a continual source of leaks and breaches, and so steadfast it might even be an eternal service. We then expanded out to structured data (e.g., AWS RDS) and today announced expansion into Google Cloud Platform and Snowflake. But why not “all the things”? And more specifically, why not Microsoft Azure?
If we cover a data service, our approach is to cover it in depth. So each new example of “breadth” has to offer a reasonable measure of “depth” along with it. This means that if we’re offering data classification, you can choose the exact sampling rate you need, from 1% to 100%. You can also constrain analysis by time or budget. Scans are smart enough to avoid touching any data that hasn’t changed. Similarly, if we classify unstructured data in RDS, we handle all the different types. It also means we’re going to allow you to connect at the Org level to automatically discover and map resources, rather than making you connect account by account. And so on.
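To make those three controls concrete, here is a minimal Python sketch of how a scan planner could combine them: skip unchanged data, sample at a chosen rate, and stop at a budget. It is an illustration only, not Open Raven’s implementation. The names `StoredObject`, `plan_scan`, `previous_etags`, `sample_rate` and `max_bytes` are all hypothetical, and it assumes each object carries a fingerprint (such as an ETag) that changes when its content changes.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredObject:
    key: str
    size_bytes: int
    etag: str  # assumed content fingerprint; changes when the object changes


def plan_scan(objects, previous_etags, sample_rate, max_bytes, seed=0):
    """Pick the objects to classify in this scan (hypothetical planner).

    objects:        iterable of StoredObject currently in the data store
    previous_etags: dict of key -> etag recorded by the last scan
    sample_rate:    fraction of changed objects to scan, 0.01 to 1.0
    max_bytes:      budget cap on the total bytes read
    """
    if not 0.01 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.01 and 1.0")

    # Skip anything whose fingerprint matches the previous scan.
    changed = [o for o in objects if previous_etags.get(o.key) != o.etag]

    # Sample the changed objects at the requested rate (seeded for repeatability).
    rng = random.Random(seed)
    sampled = [o for o in changed if rng.random() < sample_rate]

    # Stop adding objects once the byte budget would be exceeded.
    selected, used = [], 0
    for obj in sampled:
        if used + obj.size_bytes > max_bytes:
            break
        selected.append(obj)
        used += obj.size_bytes
    return selected
```

A time limit could be enforced the same way as the byte budget, by checking a deadline inside the final loop instead of a running byte count.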
It’s easy to explain our focus. Our customers tend to push data services to their limits and favor offerings from AWS, GCP and Snowflake. We’re rapidly expanding our coverage to other data services and clouds, including Azure, over the course of this spring and throughout the year. And when we do, you can count on us to bring not only more breadth, but the same depth we already offer for existing services.
0 Touch, 0 Headaches, 100% Control
Today we announced a partnership with In-Q-Tel, as well as an associated work program for their sponsors, who are among the most privacy-sensitive organizations in the world. They chose Open Raven as their partner given our platform’s unique model, which centers on the three key areas explained below.
Touchless scanning - Data discovery and classification are performed using serverless functions that belong to the customer, not Open Raven. The only data ever seen by the Open Raven platform is metadata that the customer can dial up or down based on their needs. More on this below.
Zero headaches - Private design is often achieved by making the customer support and maintain a Kubernetes cluster for the platform. This is a little like renting a house on Airbnb only to find out they’ve left all their pets for you to take care of. Exactly no one wants to take care of your pet ferret… or your Kubernetes cluster. Open Raven is true SaaS, so there is no maintenance toil, while the touchless method of analysis, with only metadata being returned to the platform, ensures full data privacy before and after analysis. We also offer a choice for structured data analysis: you can give us permissions to sidescan data stores without providing credentials, or you can restrict permissions to our platform and give explicit permission to scan data stores.
Full control - From the CSPM core of the platform being available on GitHub (Project Magpie) to the open model for our rules and policies (SQL-based and also open source), we’ve designed Open Raven in the spirit of the cloud infrastructure products we admire. A clear example of the level of flexibility we provide customers is our data classes. Yes, you can create custom data classes, but you can now also create a “super” data class with conditional logic that brings together several classifiers into a company-specific definition of PII, PHI or anything else. And data classes are not just for what’s inside an object, table, etc. but can now be written for a broad variety of metadata as well, from a file’s author to the geo coordinates of an image or video.
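As a rough illustration of what a “super” data class with conditional logic means, here is a small Python sketch that combines three simple classifiers into one company-specific definition of PII. It is not Open Raven’s rule syntax or API. The regular expressions are deliberately simplistic, and `CompositeDataClass`, `count_matches` and `COMPANY_PII` are hypothetical names invented for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Deliberately simple stand-in classifiers (assumptions, not production patterns).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def count_matches(text: str) -> dict:
    """Run every base classifier and return the hit count for each."""
    return {
        "email": len(EMAIL.findall(text)),
        "ssn": len(US_SSN.findall(text)),
        "date": len(ISO_DATE.findall(text)),
    }


@dataclass(frozen=True)
class CompositeDataClass:
    """A data class defined by a condition over other classifiers' hits."""
    name: str
    condition: Callable[[dict], bool]

    def matches(self, text: str) -> bool:
        return self.condition(count_matches(text))


# Hypothetical company definition of PII: any SSN on its own,
# or an email address that appears together with a date.
COMPANY_PII = CompositeDataClass(
    name="company-pii",
    condition=lambda hits: hits["ssn"] > 0
    or (hits["email"] > 0 and hits["date"] > 0),
)
```

With this definition, `COMPANY_PII.matches("jane@example.com born 1990-04-12")` is true, while an email address on its own is not flagged. The same pattern would extend to metadata by adding classifiers that look at fields such as a file’s author instead of its contents.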
Whether it’s the intelligence community or a sophisticated fintech company, Open Raven is designed to meet each customer’s specific needs at scale, without the painful privacy trade-offs that have become far too common. Don’t take our word for it. Here’s what Henric Andersson, Chief Information Security Officer at Deserve, has to say:
{{henric-quote="/drafts/style-guide"}}
Beyond DSPM
Now that a clear definition of DSPM is forming, the future of data security itself is more visible than ever. DSPM is just the beginning of a new genre of data security platforms built to defend the data economy itself, from protecting customer data inside SaaS platforms against breaches to securing the data sets that serve as fuel for generative AI.