Introduction and Initial Setup
Silas Glines: Good afternoon everybody. We are gonna go ahead and get started here in just a minute. Um, as we get started, if everyone could kind of start entering some messages in the chat, let me know. You can hear me loud and clear. Let me know if there's any audio issues. Um, general feedback. Also, feel free to throw in where you are, uh, based out of right now where you're watching this from.
I always like to kind of get a feel for where everyone is, uh, location wise, New York City, love New York City, Indianapolis, Colorado. Nice. I'm coming at you right here from Phoenix, Arizona. Normally though, I am, uh, residing in Boston. Alvin, good to see you. Hope you're doing well. Yeah, yeah, of course I remember.
Awesome. Cool. Cleveland, Detroit. Awesome. Good to see everybody.
Overview of Cyber Haven
Silas Glines: Um, what we're gonna do today is, is do kind of a, an introduction to Cyber Haven at a high level. Um, I'm gonna give you all a, a quick overview of kind of why we exist, um, what, what makes us special in the market, some of the use cases that we handle.
Um, and then what I'll be following up with is actually a sneak preview into our new DSPM offering that's going EA, early access, next month. So we've got a lot of exciting stuff to talk about today. Um, got a few early access features to show you, a few beta features. We've got our entire production product to show, so we've got a lot to go through.
I think we've given everyone a minute or two to join, seeing a lot of people from Dallas too. Awesome to have you all, uh, all on board here. Awesome. Let me go ahead and, uh, gonna share up my screen. We'll go ahead and get started. Um, and then kick things off. Share my screen out and here we go. Awesome. As before we get started, I'm gonna go ahead and move all this over to the side for you.
Awesome. So what I'm gonna be talking about today is Cyber Haven's data protection platform. Quick introduction about myself: my name is Silas Glines. Uh, I currently lead our sales engineering team over here at Cyber Haven. I've been with the company now for almost four years, which in this space feels like a lifetime.
Um, prior to this though, I've worked at other legacy data protection providers. Um, I've deployed a lot of the legacy tools that are out there. So think of, you know, Symantec DLP, or Vontu, as it was known, uh, McAfee DLP, kind of a lot of the products that you've seen out there, like Proofpoint, stuff like that.
Uh, Digital Guardian as well. I came over to Cyber Haven four years ago because I saw something very unique in their vision, which I'm about to share with you today, and we're gonna be sharing how it's expanded over time. Uh, you're more than welcome to throw messages in the chat here. You're gonna see me look off to the left.
That's where I have your chats pinned. So don't think I'm looking off to the side at anything else. Uh, just wanna make sure I'm answering any questions that come in. I'm also going to keep our Q&A tab open, so if you have access to the chat and access to the Q&A section, throw your questions about the presentation in the Q&A section.
Um, if for some reason that function isn't working, sometimes it's a little bit temperamental, feel free to throw them in the chat and I'll just keep a light monitoring on there as well.
Challenges in Data Protection
Silas Glines: So why, why does Cyber Haven even exist? Right. I think that when we review the history of data protection and data loss prevention and insider risk management, we all are familiar with a lot of the, the known issues, right?
Anyone on this call that's tried to deploy a data protection product, um, has run into similar issues, where they're usually blind out of the box. Now, thankfully, in recent years, a lot of the vendors in this space have caught up and realized that out-of-the-box visibility is incredibly important.
But historically, these products actually were blind when you first deployed them, and you needed a policy to get visibility into activity. Second, there is a reliance on content classification to identify what data is or isn't sensitive. We'll talk a little bit more about the limitations of content inspection, but if you have ever looked at a false positive in your data protection platform as a result of bad matches on content inspection patterns, like regular expressions or out-of-the-box dictionaries, this is the place for you.
This is absolutely the discussion for you to be in on. And then lastly, we load these kernel-level agents on our user systems. We do process injection into every running application. Um, we, you know, use proxies to get our network visibility, we duplicate every website certificate, and then we scratch our heads and wonder why our users complain about performance.
It's really not a mystery, right? When you use invasive legacy methods to get your visibility, you can expect a level of performance that reflects that approach. Um, the results of these are, of course, a lot of false negatives in an environment. Why? Well, a false negative is when there's a risk in your environment that you were not aware of, and when that activity occurred, because you didn't write a policy to see it, you weren't notified that it occurred.
This happens all the time in legacy tools and even the ones that do provide you visibility, don't do a good job of surfacing the risky behavior because frankly, they don't know the difference between risky behavior and acceptable behavior. We also have a real issue with false positives. Uh, like I said earlier, when you're only looking at content inspection, this is very prone to false positives.
My favorite example here is, if I give any of you on this, this webinar, two files that both have PII, and I'll even give you a shortcut, human to human, I'll tell you, Hey, one of these files is actually sensitive, but one of these is a false positive. You as a human can only look at the content in these two files to figure out which is which.
All of you are gonna come back to me and say, Silas, you only let me look at one thing content. And in that content, both these files had PII. Therefore, they're both sensitive. And this is exactly what we are asking most of our data protection tools to do. We're giving them a small piece of information and having them make large deterministic decisions based on that one piece of information.
If I give you a little bit more context into the data, we could probably figure out really easily which one is which. If I say document A with PII came from your SharePoint repository where you store all of your customer data, and document B with PII is actually a pay stub that was downloaded from your HR platform, from your employee portal.
Now, instantly, as a human, you know that document A is actually sensitive. That's customer data, because it's PII and it came from your SharePoint tenant where you store your customer data. But document B is a false positive. That's not customer data. That is someone's own pay stub, and they should be allowed to AirDrop that to their iPhone or upload that to their personal Google Drive.
You don't wanna look at all of the policy violations that they're gonna do with that data because you don't care about that data despite it having PII, and this is of course a very high level example, but I think that this represents really well why we exist and, and kind of some of the frustrations that we're solving over here.
This same concept of context and lineage is honestly what I saw in Cyber Haven four years ago when I was evaluating the product to even decide if I wanted to come to work here. Right? And, and we have been building on this ever since. And then lastly, right, none of us want our users to complain about performance.
If any of you on this call have developers, you will hear about it if their build times take a hundred milliseconds longer. You will hear about it if they have application crashes, performance issues, stability problems. Um, how many of us have PTSD from BSODs on Windows devices? These are things that we want to avoid.
I'll talk a little bit later on about how Cyber Haven's architecture avoids a lot of these pitfalls.
So where do we sit in the security stack? Because I get a lot of people asking me, are you guys a DLP product? Are you guys an insider risk product? I always say that we're a data protection platform that encompasses use cases from data loss prevention. We also have insider risk management. We're releasing DSPM, which is in EA, and I'll show you a preview of that today.
I actually believe that, other than our launch announcement, this is the first time that we're showing DSPM in a webinar format. I'm gonna keep it really high level, to be honest with all of you, but this is just kind of a preview of what's to come. Now, all of that is built on this backbone of data lineage and out-of-the-box visibility.
Data Lineage Explained
Silas Glines: So how does data lineage work and what is it? Well, data lineage works by us collecting all sorts of event details from various event sources. So what that means is that if you have a user that downloads a file from a website, uh, we don't just ignore that event because the file isn't egressing the environment.
We keep track of that. We notate that that file was downloaded from that website. We can see the size of the file, the contents in the file. We can see the name of the file, all sorts of metadata, like the hash of the file. If it's from an Office environment, we'll give you the Office 365 file ID. A ton of metadata, before there's even a risk that that data is sensitive or a risk that it's going to egress.
Why do we collect that? We collect that because over that file's lifecycle, it's gonna have derivatives, it's gonna have copies, it's gonna have information pasted into it from other documents, and it's gonna go places that maybe you don't want it to go. So if we wait till the very last second to start recording, you've missed so much information that could be valuable in your investigation, your determination if it's sensitive, and your ability to enforce blocks or controls on that data when it egresses accurately.
So we're always collecting all of these disparate events with a ton of metadata. We store those events in a graph, and then we start to process events to identify links between any two given events. So when I download a file and then I email that file later on, we're gonna look for an attach event that shows that that file was attached to the email that then I sent.
And we can only do that because of the tremendous amount of metadata that we're collecting on all of these events out of the box. So think of this concept of us just kind of blasting out a request for all metadata, for all events, for all files. Storing those in a graph database and then starting to make correlative links between those events to show you a timeline of that data and its lifecycle.
At its core, this is what data lineage is.
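To make the graph-and-linking idea above concrete, here is a minimal Python sketch. The event fields (`ts`, `action`, `where`, `sha256`) and the choice of a file hash as the linking key are illustrative assumptions on my part, not Cyber Haven's actual schema, which correlates on much richer metadata.

```python
# Toy lineage reconstruction: collect events with metadata, then link any
# events that share an identifier (here, a file hash) into a timeline.
from collections import defaultdict

def build_lineage(events):
    """Group events by file hash and return each file's time-ordered timeline."""
    timelines = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        timelines[event["sha256"]].append((event["ts"], event["action"], event["where"]))
    return dict(timelines)

events = [
    {"ts": 3, "action": "attach_to_email", "where": "Outlook", "sha256": "abc"},
    {"ts": 1, "action": "download", "where": "sharepoint.example.com", "sha256": "abc"},
    {"ts": 2, "action": "rename", "where": "C:\\Users\\alice", "sha256": "abc"},
]
print(build_lineage(events)["abc"])
```

The download, rename, and attach events arrive out of order from different sensors, but the shared hash lets us replay them as the "breadcrumb trail" for that one file.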
Legacy vs. Modern Data Protection
Silas Glines: So right here what we have is kind of this legacy approach where, like I said, a lot of the legacy tools can't see, um, out of the box. They miss a lot of important information. They don't really pay that much attention to ingress events or to small file level changes.
Copy and paste activity between two documents. They only show you when data goes out at the egress vector. And one of the challenges here that I like to call out is encrypted or compressed data. Quite simply, if I upload a file that's encrypted to my personal Google Drive account, well, at that point in the file's lifecycle, we don't have any content to trigger off of, because we can't content-inspect an encrypted document.
However, if we watched me download that file from a sensitive repository, any sensitive repository, and then we watched me open that file with 7-Zip, we watched me encrypt that document, and then we watched me upload that to Google Drive, now what we have is a much more useful picture. We have a series of breadcrumbs that we can follow to determine that this encrypted archive that we can no longer scan came from this sensitive repository.
And when it was downloaded, before it was encrypted, we found a ton of PII inside of this document. Encrypted or not, this file should never go to this user's personal Google Drive account. Just as one example. So we're always tracking this data out of the box as it moves between files, applications, your clipboard on your endpoint, all sorts of stuff.
We layer in broad controls to be able to choose exactly what users can and can't do with this data. Now, uh, don't think about this as just being, you know, broad coverage and granular controls in the sense that we can just block things, right? There's a number of responses that we can take anytime that a user is interacting with this data.
We don't have to just be an enforcer to block egress. We can be a digital fence to guide users in the right direction, where we actually use our prompts to educate the user on the best way to handle that data. My favorite example of this is ChatGPT. Maybe your organization has a Copilot instance, and maybe you sent out a memo about it months back, and users maybe saw the email, maybe they didn't.
Maybe they just clicked it and marked it as read. They never actually requested a Copilot account, but two months later, you now see that they're putting data into their personal ChatGPT. It's great that you can block them with a tool to prevent them from putting that data into ChatGPT, but you're actually reinforcing some negative behaviors if that's all you do.
If I just tell my user, you can't do that, you can't do that, you can't do this, what are they going to do? They're gonna get real creative and they're gonna start looking for workarounds. So with us, you can put in either passive warnings, you can do a block with an override option for the end user. You can do a hard block.
And you can put in instructions. One of my favorites that one of my customers recently implemented was just linking their acceptable use policy and their generative AI policy right within the prompt. So when the user gets blocked, they get educated on what they're supposed to do instead. I think this is really important when we talk about incentivizing good behavior in our workforce and implementing controls in a way that doesn't hold them back from doing important and critical functions.
And don't just think about us as being the solution that protects browsers, right? We also protect end-to-end encrypted applications, even local versions. We protect local protocols like AirDrop and Bluetooth transfer. We can do printers and scanners, USB devices, uh, the MTP and PTP protocols. I'm holding up my phone.
It's what I call the USB device in everyone's pocket: your phone. Plug it into your computer, and data can now egress to that location. We need to be able to cover all these legacy destinations as well as the new-age threats, like those generative AI products that are being embedded into various applications and operating systems.
Um, and so, uh, I got a great question right here in the chat: do you have tenant control for AI tools, so allowing enterprise ChatGPT or Claude but not personal versions of the tools? Yes, we do. This is extremely important. We're gonna talk about how we do this in a second with our browser extensions and what we call our cloud app account detection feature.
What this does is it doesn't just tell you the data's going to a domain like SharePoint or ChatGPT. This tells us the account that the user is logged into within the SaaS application itself. So you can actually create custom policies that say if they're using an account with ChatGPT that ends [email protected], block them, block the data from being pasted, block it from being uploaded.
But if they're logged into their corporate account, allow it, because we have a privacy addendum with ChatGPT or Claude that prevents them from reusing our prompts for model training. And that's very important when we're talking about sensitive data that's covered by all sorts of regulatory compliance frameworks.
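The account-based decision just described could be sketched roughly like this. The destination domains, the corporate domain `acme.com`, and the example address `alice@acme.com` are all assumptions for illustration, not real policy syntax:

```python
# Sketch of "cloud app account detection": the verdict keys off the account
# the user is signed into, not just the destination domain.
def decide(destination: str, signed_in_account: str, corp_domain: str = "acme.com") -> str:
    ai_tools = {"chat.openai.com", "claude.ai"}
    if destination in ai_tools:
        if signed_in_account.endswith("@" + corp_domain):
            return "allow"   # corporate tenant covered by a privacy addendum
        return "block"       # personal account: block the paste/upload
    return "log"             # everything else: passive out-of-the-box logging

print(decide("chat.openai.com", "alice@acme.com"))  # allow
print(decide("chat.openai.com", "alice@gmail.com"))  # block
```

The point of the sketch is that the same domain yields opposite verdicts depending on the logged-in identity, which a domain-only blocklist cannot express.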
The ultimate goal here is to reduce the overhead, reduce your cost of having to look through a bunch of false positives, not understanding the threats inside of your environment, having to create policies just to get basic visibility. All of that is out the door with this approach, especially as we start talking about our Linea AI product that can be layered in to help make those decisions for you.
So in short, patterns are great. Content inspection is great, but it does give you an incomplete picture into how sensitive your data actually is. We wanna combine those content inspection pattern matches with the context of the data, and we can even ingest Microsoft labels as a third source of truth when we're determining what data is or isn't sensitive.
This approach is much more holistic, reduces the false positives, and increases your confidence, when you do enforce a block, that you're not gonna block someone's approved workflow. We've also added in this concept of cloud lineage to combine with our endpoint lineage. So what we have today is the ability to see files that are coming from your various repositories, making their way into various SaaS applications, using the endpoint here to show the users moving that data around.
And then we have cloud sensors to pick up the lineage where we leave off. And so when we're in, maybe, your corporate Google Drive, OneDrive, or SharePoint, we can continue that lineage using our cloud lineage sensors. And then if a user accesses that data later on, back on another managed device, we then again have that endpoint lineage ability to continue that trace and monitoring throughout the file's lifecycle.
Like I said earlier, we're also adding in that DSPM portion: those data-at-rest scans, the ability to view the permissions and access to a document, and even take note of when they access that data, or change it in any way, on a personal or unmanaged device like someone's cell phone or their personal computer at home, using their corporate accounts.
All of this is coming out with our DSPM product as well.
And this right here is just one example where we wanna be able to show you access to data, changes to the data, regardless of where it's occurring. So when they're on a managed device, that's great. We'll show you all those file-level access logs. We'll show you the file's lifecycle, the copies, the derivatives; if they take text out of that file and paste it in another document, where does that file go?
All that'll occur on the managed device. When you're using an unmanaged device and accessing that data in your cloud repository, we have the ability to surface that activity as well. So we kind of have you covered all the way through, combining that with the capabilities I just spoke about in response to the last question I was asked here.
With the ability to block that data from going to someone's, say personal Google Drive account, which for those of you who have had to implement a legacy DLP tool, you understand that that can be such a challenge because you're only gonna see the same sub-domain for Google Drive, regardless of if it's personal or corporate.
So here what we've done is set up digital fences to prevent it from going somewhere that we don't have visibility, but allow it to go to the corporate destinations where we have that visibility to show you what occurs on personal devices when they're interacting with the SaaS application. This is really just an example of the extent of what our lineage covers.
I often get asked, you know, what applications do you support for your lineage here? Where can you detect data coming from? And the short answer is, really, it doesn't matter. We can detect it out of the box anywhere. The reason for that is because, of course, we're always monitoring every endpoint application that's generating data.
We're monitoring every website that data is copied out of or downloaded from, and so that's out of the box functionality. You don't really need to even tell us to look for that. What you can then do is take a report of all that activity and start generalizing the type of data that might be in each repository.
So you can be as infinitely flexible and customizable here as you want, or be as granular as you want, getting down to the exact folder level or opportunity level in something like a Salesforce or a SharePoint.
And then lastly, the ability to actually respond in a way that's effective for your workforce.
Flexible Response Mechanisms
Silas Glines: Uh, we always want to be careful about just layering in blanket blocks on functionality when we're not a hundred percent confident it isn't approved by the workforce or by someone's manager.
So we have flexible responses here. We have five different levels of response for when a user is handling sensitive data and they send it somewhere that you don't want them to send it. So the first is gonna be just a passive log of that activity, which we do out of the box all the time. The next is going to be a silent alert to the security team.
The third is going to be a warning that's completely passive. It allows them to move that data where you identified they're moving it to, but it warns them with a prompt that takes over their screen to message them why what they're doing might be risky. I find this is really helpful for chats, where someone might paste text into a chat but you wanna warn them before they've had a chance to click send, for example.
The next layer is gonna be a block with an override. So this blocks the action, it prompts them, asks them if they're sure that they want to do that, and informs them that their activity is being logged. And then the last one is a block with no override. So no matter what the user does, they cannot bypass that.
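The five escalation levels just walked through can be summarized in a small sketch. The level names and the decision function are mine, not product terminology:

```python
# The five response levels, ordered from least to most restrictive.
from enum import IntEnum

class Response(IntEnum):
    LOG = 1             # passive log, always on out of the box
    SILENT_ALERT = 2    # notify the security team only
    WARN = 3            # full-screen warning, action still allowed
    BLOCK_OVERRIDE = 4  # blocked, but the user may acknowledge and proceed
    BLOCK = 5           # hard block, no bypass

def user_may_proceed(level: Response, user_overrides: bool) -> bool:
    """Does the user's action go through under a given response level?"""
    if level <= Response.WARN:
        return True                 # log/alert/warn never stop the action
    if level == Response.BLOCK_OVERRIDE:
        return user_overrides       # proceed only with an explicit override
    return False                    # hard block

print(user_may_proceed(Response.WARN, False))  # True
print(user_may_proceed(Response.BLOCK, True))  # False
```

Modeling the levels as an ordered enum makes the key property explicit: the first three levels never interrupt the user, and only the last one is unconditional.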
Um, right here, uh, Jeff, great question: how do you manage whitelisting across multiple policies? We have a list feature, with lists that can be shared between any policies. We actually also have a scripting feature that allows you to grab snippets of policies and reuse those as objects in any other policy in your environment.
So let's say you have a list of your internal domains. You might wanna add that to two policies: one that identifies your emails going to external email domains, and another that maybe identifies data going to external websites, if any of those websites don't contain your internal domains in the domain structure.
Uh, so with us, you can just create one list of those and then reference that list in both policies, just as an example. We also have user-level exemptions, for example, where you might wanna allow a user to override a policy. That can also be done with something called insider risk groups. It's very easy for you to apply and unapply those to various policies to allow those exemptions to go through.
I hope I answered your question, Jeff, but if you need any clarifications, just throw it in the chat. I'll be monitoring that for you.
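The shared-list idea from that answer can be sketched like this. The domain names and the two "policies" are placeholders of mine, just to show one list feeding two independent checks:

```python
# One shared list, referenced by two separate policy checks.
INTERNAL_DOMAINS = {"acme.com", "corp.acme.com"}  # defined once, reused everywhere

def external_email(recipient: str) -> bool:
    """Policy 1: flag emails sent to domains outside the shared list."""
    return recipient.split("@")[-1] not in INTERNAL_DOMAINS

def external_website(url_host: str) -> bool:
    """Policy 2: flag uploads to hosts not under any internal domain."""
    return not any(url_host == d or url_host.endswith("." + d) for d in INTERNAL_DOMAINS)

print(external_email("bob@acme.com"))       # internal recipient, not flagged
print(external_website("drive.google.com")) # external host, flagged
```

Because both policies reference the same list object, adding a newly acquired domain to `INTERNAL_DOMAINS` updates both checks at once, which is the maintenance win being described.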
Technical Architecture
Silas Glines: This right here is our architecture. I'm gonna go through this pretty quickly so we can dive into the product. Um, but at our core right, we have our managed endpoint devices.
We have an endpoint sensor. This is a user-space sensor for all of our baseline visibility. Um, this right here doesn't do any process injection into any running applications. Uh, we don't use any proxies to get our browser visibility, for example. So we're not at the whims of, you know, security certificates that are gonna break our proxy approach.
Um, just to answer one question that just came in: do we use existing RBAC or conditional policy? So our entire console does have RBAC built into it, as well as scopes. Um, now, as far as our ability to ingest those RBAC roles from another solution, that gets a little bit trickier, but you can customize every single login and their role to make sure that they can only access parts of the product that you want them to.
Or use scopes to determine whose data they can access within the product. So once again, just throw any additional follow-ups in the chat if you need me to follow up on that. But that's kind of how we're structured today. So our endpoint agents aren't doing any of that heavyweight lifting.
The truth is that we can get away with this because we were developed a lot more recently than a lot of the other tools on the market, right? We weren't founded in, you know, 2002, so we didn't have to use kernel-level hooks. We didn't have to use process injection. We have the benefit of using lightweight user-space streams like the ETW stream on Windows or the Endpoint Security framework on macOS.
We also have eBPF on Linux. So all of these allow us to do either kernel-less visibility or use universal kernel calls that are compatible across every distro of Linux that's using a compatible kernel version. For those of you who deploy on Linux, you probably know how big of a deal that is, not having to constantly manage the kernel version and the OS version.
We also have browser extensions that we support and deploy. So out of the box, we support Chrome, Edge, Firefox, and Safari. We also have a universal Chromium extension that you can deploy in any Chromium-based browser to get blanket visibility. And then lastly, we have plugins for Microsoft Office and Outlook to track things like emails that are being sent from certain accounts, differentiating between personal and corporate.
We'll show you where those emails are being sent on the endpoint, with the ability to add in our cloud connector visibility to see emails being sent on any device, even unmanaged devices, from the corporate email domain. These connectors also include, uh, functionality for our upcoming DSPM product. We also have the cloud sensor lineage that's occurring using these same connectors.
So this entire product is using all of these sensors and sending up that massive amount of metadata to the backend instance here, hosted in GCP. So this is a GCP tenant that Cyber Haven manages and owns. It's a single tenant dedicated instance for every customer. No shared processes or databases between any two.
This is where we do our content inspection. One of the benefits here is that instead of using our endpoint's CPU, disk I/O, memory, and network utilization for the content inspection, we just use some network utilization to send that data up to the content inspection backend, uh, where we actually have an encrypted container that all of the content is scanned inside of. It's encrypted in transit.
And it's encrypted in the container while it's being scanned, in memory only. So the file itself is never written to disk. It's only stored in RAM to process for content matches. When the scan is done, we either send a copy of that file to the customer's storage bucket for evidence capture, or, if you don't have that configured, we delete the file directly from RAM and only store in our database the number of pattern matches that was found inside of that document.
For all of your deployed content patterns. Now, this may sound a little bit like over-engineering to some extent, but the benefits here are pretty broad. First off, having a shared content inspection engine between your DSPM product and your endpoint product means that we don't have to do a million rescans on every file.
Whereas in a legacy DLP model, you would have a different content inspection engine deployed to each endpoint in your environment, and if you had one file on a hundred systems, you would have to scan it a hundred times. Whereas with us, you only have to scan that file once, and then we do a hash check on every subsequent system that sees that file.
So we don't have to re-upload it and re-scan it, and we don't use any resources on the endpoint, which I think is really significant when we talk about end-user experience and performance. Uh, and then on the backend tenant, we have a bunch of different dashboards you can use. So we have Linea AI, we have policy management, your analytics and reporting, your incident management.
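Stepping back to the scan-once, hash-check model just described, the economics are easy to demonstrate in a sketch. The `b"SSN"` match is a stand-in for real content inspection, and the cache layout is my assumption:

```python
# First sighting of a file triggers a real content scan; later sightings of
# the same bytes are answered from a hash cache instead of re-scanning.
import hashlib

scan_cache: dict[str, int] = {}  # sha256 -> number of pattern matches found
scan_count = 0                   # how many real scans we actually performed

def classify(file_bytes: bytes) -> int:
    global scan_count
    digest = hashlib.sha256(file_bytes).hexdigest()
    if digest in scan_cache:
        return scan_cache[digest]       # hash hit: no re-upload, no re-scan
    scan_count += 1                     # cache miss: one real scan
    matches = file_bytes.count(b"SSN")  # stand-in for real content inspection
    scan_cache[digest] = matches
    return matches

doc = b"name, SSN, address"
for _ in range(100):   # the same file observed on 100 endpoints
    classify(doc)
print(scan_count)      # 1
```

One scan and ninety-nine cheap hash lookups, versus one hundred full scans in the per-endpoint-engine model.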
All of this is also occurring in that backend. Uh, is there a bring-your-own-key option for encryption? Uh, Chase, I'm inclined to say no right now, though that has been somewhat of a discussion that we've had. Um, I'll follow up after this webinar in any way that I can with you. I believe I'll have your email address that I can follow up with you on.
Um, and I'll let you know if that answer is different. Um, I know that it is something that we've had some discussions about, uh, you know, about whether or not we wanna be able to do bring-your-own-key for encryption. Um, and I have another question here, uh, from Jonas: is there an API that can be used for triggered events, or can SIEM logs be used to trigger other tools, such as ITM, or run playbooks? Can it be part of our SOAR process?
Exploring API Functionalities
Silas Glines: There absolutely is. Uh, so we have a number of APIs here. You're gonna see this one on the top for SIEM functionality. So rather than have to create a custom integration with every single, you know, SIEM or SOAR product, we've opted to create a universal series of REST APIs.
We have a pull and a push, and you can export all those to your SIEM or SOAR solution. We also have callback functionality where you can then update elements inside the console, all via API. So you don't ever have to even log into the console to make critical changes to things like your policies, for example.
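As a rough illustration of the pull-style export, here is the shape of the request a SIEM collector might issue on each poll. The endpoint path, query parameters, and token handling are all hypothetical placeholders, not Cyber Haven's documented API:

```python
# Assemble a cursor-based polling request a SIEM collector could issue.
import json
from urllib.parse import urlencode

def build_pull_request(base_url: str, api_key: str, since_cursor: str) -> dict:
    """Return the method/URL/headers for one poll of an events endpoint."""
    query = urlencode({"cursor": since_cursor, "limit": 500})
    return {
        "method": "GET",
        "url": f"{base_url}/api/v1/events?{query}",   # hypothetical path
        "headers": {"Authorization": f"Bearer {api_key}"},
    }

req = build_pull_request("https://tenant.example.com", "REDACTED", "0")
print(json.dumps(req, indent=2))
```

The cursor the server returns with each batch would be fed into the next call, so the collector never re-fetches events it has already shipped to the SIEM.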
Introduction to Directory Services
Silas Glines: We also have directory services here. So you might want to link someone's cloud events with their endpoint events and their email events into one risk score for that user entity. That's where we would use directory services. We support basically every directory under the sun, even custom directories that some of our customers have.
So it's very compatible, kind of a broad range of coverage there.
Enhancing File Coverage with Purview Labels
Silas Glines: We also have the ability to enrich your file coverage with Purview labels, actually ingesting your Purview labels from Microsoft and then looking for any files with those labels down the line. And then lastly, that customer storage bucket I mentioned earlier, uh, that allows you to, you know, store actual files when they're a part of a policy violation.
You can also store, uh, copy-and-pasted text snippets. We can even grab screenshots off the endpoint, uh, when you'd like, to kind of show you what was going on on someone's screen at the exact time that they violated a policy. So there's a lot of functionality here that all can be either turned off or turned on depending on what your organization's needs are, what your privacy regulations are, your policies, all of that stuff.
We wanna appease the security, you know, junkies here in the room, as well as the legal and privacy teams that might push back a little bit. Everything here is modular.
Upcoming DSPM Offering
Silas Glines: I do wanna talk a little bit about our DSPM offering that's coming up though, because this is really a pretty critical departure from where we've played in the past.
It's really important to note that we have so much more information because of this concept of already being on the endpoint, already having lineage, already having AI categorization, which we released a year ago, and now we're able to bring that into the DSPM offering. So while a lot of other solutions are gonna be able to tell you, hey, this document has a sensitive pattern match and you might wanna place a label on it in your data repository, we're gonna understand the type of, you know, categorization of that data. Is it an employee record? Is it a financial audit document? Not only does it have sensitive content, but who's accessing it? So this right here just highlights that we can go beyond just the content inspection and also layer in things like categories for that type of data.
We also have the ability to understand the identity of the users that are accessing it. Um, are these external users to your organization? Are they internal? Are they accessing the data on a managed device? Are they accessing it on an unmanaged device? What are they doing with the data inside the repository?
So that data in motion as well, and then layer in all of these labels together. What you end up having is a way more complete picture. Whereas a legacy DSPM might just have the content inspection results: 2,294,303 is a number, and that number might mean that there's something sensitive in the content. With the AI piece, the categorization, it will look at the whole document and tell you that 2,294,303 is actually a financial metric within this document. And then layering in the context, we can see that $2,294,303 is actually the company's unreleased Q3 revenue number from one of your documents
denoting your success throughout this last quarter. That right there is combining all of these together so that you really understand what's in that file, not just that there is a random pattern match that you might be concerned about.
Live Demo and Q&A
Silas Glines: We're gonna go into a live demo now. Before we do that, I'm gonna answer some questions here.
So, Dev, you asked: will Cyber Haven auto-create incident tickets in ServiceNow? There's two different ways that I've helped customers do this. While we don't have a direct integration with ServiceNow, it's pretty easy. We can obviously use the REST API to take the event metadata from the incidents and then generate an incident with certain fields in ServiceNow.
This is probably the more common integration that we help our customers set up, and our professional services team is always on standby, ready to help people out with those use cases. But we also have the automated email alert system. I have a few customers that just wanted a simple ticket to be made in ServiceNow, so what they actually did was set up an automated email alert when the policy was violated, to forward that incident into ServiceNow and auto-create a ticket.
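The REST-based route described above could be sketched roughly as follows; the instance URL, field mapping, and event shape here are all hypothetical (a real ServiceNow Table API call would also need authentication headers):

```python
import json
import urllib.request

SN_INSTANCE = "https://example.service-now.com"  # hypothetical instance URL


def build_incident_payload(event):
    """Map DLP-style event metadata to ServiceNow incident fields (illustrative)."""
    return {
        "short_description": f"DLP policy violation: {event['policy']}",
        "description": json.dumps(event, indent=2),
        "urgency": {"critical": "1", "high": "2"}.get(event.get("severity"), "3"),
        "caller_id": event.get("user", "unknown"),
    }


def create_incident(event):
    """POST the mapped payload to ServiceNow's incident table."""
    payload = build_incident_payload(event)
    req = urllib.request.Request(
        f"{SN_INSTANCE}/api/now/table/incident",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Basic-auth header omitted for brevity; real calls require credentials.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


payload = build_incident_payload(
    {"policy": "USB exfiltration", "severity": "critical", "user": "tavia"}
)
print(payload["urgency"])  # "1"
```

The email-to-ticket variant needs no code at all on the sender's side; it relies on ServiceNow's inbound email actions to parse the alert.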
And then, Chase, you asked a great question: can Purview labels be applied as an action if content triggers a policy, and if so, can you override manually applied labels? So today we only have the ability to read the labels that are being placed on the documents and let you know if they match any of your Purview-assigned labels.
So we use those more as an ingestion method to determine whether or not a file is sensitive. When we release our DSPM offering, which is gonna be in Q1 of this year, and probably into Q2 as well as we work more on enriching that data, it is very firm that we are going to be able to place labels on that document.
And then also our intention is to bring that to the endpoint and then, yes, be able to override the existing labels placed by a user. This is critical functionality, because of course we don't want to just let our users place labels on documents and trust that they're gonna do the right thing. That sounds good
in theory; we always assume our users are gonna know which files are and aren't sensitive. But that is not really the case. Users always want to do the most that they can with that data, and they're always gonna place the label that allows 'em to do what they wanna get done, which is usually gonna be the lowest sensitivity label.
And then you said right here: it seems you're using regex for numbers, applied to AI to make decisions by capturing the context as well. Yeah, so there are actually three components here, Dev, and it's a good question. I'm glad you asked this. We support out-of-the-box content patterns for state and government regulations across a hundred-plus countries.
We also support custom regular expressions. We also support custom dictionaries. We also support exact data matching, and we support OCR, so we can do all of that on image files as well as text files. That is just the content inspection piece, where we'll just spit out raw values denoting how many patterns we found in each piece of data.
We also have AI categorization that's not leaning on the regex; that is separate. That's letting you know the category of document that it is. And then we also have the lineage and the provenance of the data, denoting: is this an internal document? Did it come from our internal repository? Is this an external document?
Did someone download this from their personal Google Drive rather than our corporate Google Drive, for example? You can use all three of those to denote the sensitivity of the data, which I'll demo here in a second.
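As a toy illustration of how those three signals (content patterns, AI category, provenance) could be combined into a single sensitivity verdict; the rule logic, category names, and provenance values here are invented for the sketch, not the product's actual scheme:

```python
import re

# Example content pattern: US Social Security numbers.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def sensitivity(text, category, provenance):
    """Combine content inspection, AI category, and provenance into one verdict.

    `category` and `provenance` are assumed outputs of an AI classifier and a
    lineage lookup, respectively.
    """
    pattern_hits = len(SSN_RE.findall(text))
    if provenance == "personal":
        # Downloaded from a personal account: treat as not corporate-sensitive.
        return "unrestricted"
    if category in {"employee_record", "financial_audit"} or pattern_hits > 0:
        return "high"
    return "low"


print(sensitivity("SSN 123-45-6789", "memo", "internal"))  # high
```

The point of the sketch is that no single signal decides alone: the same pattern hit is downgraded when provenance says the file never belonged to the company.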
Incident Management Tool Overview
Silas Glines: So first off, what I wanted to show is just an example of our incident management tool.
This is gonna be primarily using the data-in-motion information that we have. So this right here is an incident generated by one of my amazing SEs, Tavia. Shout out to Tavia if she ever listens to this. She's an amazing SE. She went ahead and exfiltrated some corporate health records. One thing that you're gonna notice here is that this data set denoting the sensitivity of the data
and this policy denoting that that was a risky destination are actually both AI generated. This is an entirely AI-generated incident. This is not something that we even had a policy in place for. This is part of our Linea AI product; it's called Linea Analyst. What Linea is always gonna do is take all that out-of-the-box visibility, all that out-of-the-box content awareness, and make decisions on how risky behavior is.
It's really important to note that Linea is not designed to find just anomalies. Now, anomalies are great, but anomalies aren't risk. I talk to a lot of tax firms, and every April 15th they have a policy that their CPAs have to go ahead and delete various tax documents.
And if Linea was just looking for anomalies, every April 16th you would see all this mass deletion of data, all these anomalous events, and it would send you a bunch of alerts. This is where we really cannot rely on anomaly detection alone. Now, the first thing the AI is looking for is anomalous activity.
But then what it's gonna do is it's gonna actually evaluate that activity. What is the data? Where did it come from? Who's accessing it? What kind of AI categorization is it? What's the content in the data and where's it going? What did the user do with it? What account are they using to upload it to X, Y, or Z SaaS application?
And it's gonna take all that information into account and then spit out a policy violation without you even having a policy in place. This is an example of that. So we have a summary here that's provided by the AI tool. It's not like your analysts are gonna need this summary, you know, they're gonna be able to look at the lineage and understand obviously what's going on.
This is provided so you can share this with other people in the organization that might not be as technical, right? So you can just kind of quickly copy and paste this and say, Hey, here's the activity for this employee. Maybe your HR team, maybe someone's manager. And then this is kind of a plain text way to communicate to them what happened.
We can see then that as we scroll down, we get a kind of data trace for this. We can see that this file was uploaded here to Google Cloud, to the Cyber Haven account. We can see later on it was downloaded from that same Google account. We see then that it was compressed into a zip archive.
We can then see that zip archive was moved and renamed as funny-meme.gif. Every single event here has a bunch of metadata fields that we can use, and then we see later that, after we did scanning of that file, Tavia uploaded it to drive.google.com, but this time it was her personal Gmail, not her corporate account.
This is an example of a pure AI incident where the AI detected that this was probably something that should not have been done. And even though there aren't sensitive pattern matches on this document, it knows the type of document that it is and it knows to flag this as an alert. So now not only does your analyst team have this lineage to work off of, they can toggle our legacy lineage view and see everything that Tavia did with the file, including the events that you might not have otherwise thought were very relevant.
So this right here is just an example of why analysts really enjoy using the tool. I think it's partially because they no longer have to go to 12 different tools to see what the user did. They can see the entire lineage for that file and any copy or derivative of that file just by clicking through what happened, even down to them deleting the file and sending it to the recycle bin.
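A file-lineage trace like the one just walked through could be modeled as a tiny event graph. This sketch uses made-up event tuples, not Cyber Haven's real schema, and walks from an origin file to every copy or derivative:

```python
# Each tuple links a destination back to its source: (source, destination, action).
events = [
    ("report.xlsx", "report.zip", "compress"),
    ("report.zip", "funny-meme.gif", "rename"),
    ("funny-meme.gif", "drive.google.com/upload", "upload"),
    ("report.xlsx", "recycle-bin/report.xlsx", "delete"),
]


def trace(origin, events):
    """Breadth-first walk over the lineage graph, collecting every derivative."""
    seen, frontier, chain = {origin}, [origin], []
    while frontier:
        src = frontier.pop(0)
        for s, dst, action in events:
            if s == src and dst not in seen:
                seen.add(dst)
                chain.append((s, dst, action))
                frontier.append(dst)
    return chain


for step in trace("report.xlsx", events):
    print(step)
```

Starting the trace from any intermediate artifact (say, the zip archive) yields just that branch of the chain, which is essentially what clicking into a single event in the lineage view does.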
Now, of course, we're doing this for all users all the time, so what do we do with that?
Insider Risk Monitoring
Silas Glines: Well, we have an insider risk portion of our product that's always monitoring how much access to sensitive data users have. How sensitive is that data? What are they doing with it? Are they renaming file extensions? Are they archiving and encrypting data?
Uh, where are they sending it externally to the organization? And then it's giving every user a risk score. So you can see here that Tavia is a critical risk. We can also see our insider risk groups that I was referring to earlier. These are groups of users that you can create for really any purpose.
They can either be manually assigned or automatically generated based on directory elements. So here I'll go into departing employees, and you can see that any employee that has a termination date in the directory is a part of the Departing employees group. And we also have a few custom additions here.
Now, one thing you'll notice is that you can add a risk multiplier to their behavior. This is really just to surface their risk quicker, right? Say your top performer moves a hundred files to USB. That is cause for concern, but they're a top performer. Now, what happens if they move a hundred files to USB and then the next day they put in their two weeks' notice?
Suddenly you care about that activity a lot differently. And so this right here is to surface that, to say, hey, let's not only increase their risk score and change their policies moving forward; any DLP tool worth its salt can do that. What's really impressive is being able to then apply those policies and that risk score multiplier to behavior that occurred in the past.
Right. Keep in mind we're storing all this lineage all the time for every file. We don't need to wait until they do an action to tell you that they violated a brand-new policy. We can take that brand-new policy, overlay it onto all of their past lineage, and let you know if they already violated that policy,
looking backwards. This is really valuable for departing employees, because I think we all know that when a user puts in their two weeks' notice, most of your users are gonna assume that they're being monitored at a more granular level. What this does is kind of get ahead of that. When they put in their two weeks' notice, you now know what they did the week, the two weeks, the 90 days before they ever told you they were going to leave.
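The retroactive overlay described above amounts to replaying a new rule over stored history. Here is a minimal sketch, assuming a simplified event shape and a made-up policy function (not the product's actual policy engine):

```python
from datetime import date

# Stored lineage events for one user (hypothetical shape).
events = [
    {"day": date(2024, 1, 5), "action": "usb_copy", "files": 100},
    {"day": date(2024, 2, 1), "action": "cloud_upload", "files": 3},
]


def replay_policy(events, policy, multiplier=1.0):
    """Overlay a brand-new policy onto past lineage and rescore it.

    `policy` is a predicate over a single event; `multiplier` models the
    risk multiplier applied to a group such as departing employees.
    """
    violations = [e for e in events if policy(e)]
    score = sum(e["files"] for e in violations) * multiplier
    return violations, score


# New policy created the day the user resigns: any USB copy is a violation.
usb_policy = lambda e: e["action"] == "usb_copy"
violations, score = replay_policy(events, usb_policy, multiplier=2.0)
print(len(violations), score)  # 1 200.0
```

The key design point is that the events were recorded before the policy existed; only the scoring, not the collection, happens after the fact.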
And then of course you can create department groups for applying different policies. Maybe your IT team gets USB access; maybe your sales department doesn't. Now what you can do is make it so that those policies automatically trigger and get assigned on deployment, just because of the user directory they're a part of.
When you click in on a user, I'll use Cole as an example, we get this daily risk score trend chart overviewing their risk score over the last period. I'm using 90 days, so it's loading a ton of data right now, literally millions of events and millions of metadata elements. And then down below I see their risky data flows: here, he took top secret business data and put it on a USB device.
We can see that action was blocked. Just high-level metrics, and I can click in and investigate any one of these flows here. So let's say I wanna see: when did he take corporate strategy data and put it into unsanctioned cloud storage? I just need to click on this, and now it's actually gonna show me the lineage, the traces, and the events where he took that corporate strategy data and put it on a personal Google Drive.
Here is exactly what that looks like. He downloaded this copy of his accounts under his Cyber Haven account in our Salesforce tenant. I can see all sorts of information. Then I can see later that he uploaded this to Google Drive and was in fact blocked. And we even have a nice little AI summary of that one event as well, including a content summary showing us from the AI what type of data this was.
So very easy to get from high level numbers to very granular events. And like I said, you can always look at the full lineage of the file to see everything else he did with it.
DSPM Product Preview
Silas Glines: Now lastly, I did want to give you guys a quick preview into our DSPM product. Um, so this right here is just an example of some of the ways that we can label data, categorize it, and surface where that data resides.
So I'm gonna start here under data type. This is really cool because we have a bunch of out of the box data types, so billing and invoicing for example, and we actually show you the prompts that we use to determine if a file matches this data type, where you can then even like test and tune this exact category, upload a file, see if it matches that.
The cool thing here is that you can also create a new label. So as you go ahead and add an inclusion rule, for example, you can just type in a prompt, and we'll use AI to look through all the documents that we're scanning to find files matching that description. This is pretty cool, because now you're not only getting visibility into the out-of-the-box prompts, which is not common, but you're also able to make custom prompts and find data types that would otherwise be very difficult to define using traditional methods.
We also understand data provenance, so we understand: did that data come from an internal location, one of our internal SaaS applications? Is this maybe a personal file that someone downloaded from their personal email account that we don't necessarily need to categorize? Is this public information that we see people sharing all the time with public resources, websites, and users? Or do we just not know what the provenance is yet?
We do this all using the lineage that we have from the endpoints to enhance the detections and understanding of the data in our DSPM product. We of course have legacy data patterns. This is an example of a bunch of our out-of-the-box content policies. So there's a ton here when we talk about various countries, various regulations.
Um, my favorite is how, if you go down to the US section, we literally have regulations for each individual state. So it's showing you California regulations, Texas regulations, Colorado regulations, state by state, country by country, all sorts of stuff. Then we have data sensitivity, which allows you to combine any characteristics from the top three to determine if the data is critical
sensitivity, high, moderate, low, or unrestricted. It lets you categorize that data to make it more actionable and more granular. So here you can consume things from the data type, that is, the AI categorization, the provenance of the data, and the content inspection results, and then create a sensitivity based on rules that match those three categories.
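A rule-based sensitivity mapping of that kind could look something like the following first-match-wins table; the field names and tiers here are illustrative, not the product's actual configuration format:

```python
# First matching rule wins; fields are assumed outputs of the three classifiers
# (data type, provenance, content inspection).
RULES = [
    {"data_type": "financial_audit", "sensitivity": "critical"},
    {"provenance": "public", "sensitivity": "unrestricted"},
    {"pattern": "ssn", "sensitivity": "high"},
]


def classify(file_attrs):
    """Return the sensitivity tier for a file's attribute dict."""
    for rule in RULES:
        conditions = {k: v for k, v in rule.items() if k != "sensitivity"}
        if all(file_attrs.get(k) == v for k, v in conditions.items()):
            return rule["sensitivity"]
    return "moderate"  # default tier when no rule matches


print(classify({"data_type": "financial_audit"}))  # critical
```

Keeping the rules declarative like this is what makes the sensitivity layer "consume" the other three layers rather than duplicate their logic.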
Now, of course, from here we have a whole explorer where you can take a look at every repository where this data's coming from. Um, so here, for example, I might wanna look at any sales and marketing documents, and immediately I can go ahead and see every location that we've seen sales and marketing documents: on Google Drive and on someone's endpoint.
I might wanna investigate why someone has sales and marketing documents on their endpoint when that's something that should strictly stay in the cloud. I can now dive in, view those entities, and actually see those files residing on that endpoint. We also have the ability to just view the data at large and show you all of the data in your environment across all of your repositories, and then let you scroll down and actually see where that data is residing, whether it's on someone's endpoint, which we cover with the endpoint data-at-risk scanning already, or in a cloud environment that we're actively scanning.
All of this is possible here by combining all of that telemetry. And then lastly, obviously, there's the ability to configure your data stores, whether that's installing the sensor on an endpoint to get visibility into the data users are interacting with, or hooking it up to your Google Drive, OneDrive, or SharePoint.
We're gonna have a bunch of connectors available on day one for this functionality. But that's just a quick preview into how we're really not building a DSPM product by, you know, acquiring another product. We're building this from the ground up. This is the first truly unified DLP and DSPM platform that is built, not acquired. And we're using the existing lineage technology that made us so well known in the endpoint space.
People love the lineage. It gets them more accurate classifications, better and quicker investigations, and better insights into where their data's moving from and flowing to. And we're using that to really enhance our vision when we talk about the data that we're discovering with the DSPM product.
Concluding Remarks and Gift Card Instructions
Silas Glines: I know we're coming up on the end here, so I'll leave that there.
I know it's a little bit of a teaser, but I'm gonna go through those questions that were asked here, and then we'll show you all how to get your gift card. Or, if you chose the donation, you don't need to go through that part of the process. Uh, so David, you asked: how does Linea AI evolve?
Does it learn from customer data in any way? Yes, it does. So we have an out-of-the-box Linea configuration that will deploy to your environment on day one, and then over the next 30, 60, 90 days we'll retune that and have the AI actually learn from not just your organization, but taking into account your industry, your directory structure, and the type of data users are typically moving.
And really what that's doing is getting us accurate in both directions that we need to get more accurate in. The first is gonna be the anomaly detection: is this actually an anomaly that we need to investigate? And then the second is the risk detection, right? Not only was this an anomaly, but do we care about it? Does it denote any risk?
I'll tell you all a quick story before I end this. I was not planning on telling this, but I think it's a fun story. Um, I actually was working with a prospect who was testing Linea, and they had a firmware file for a printer get downloaded by one of their IT members, but it came out of one of their more sensitive repositories, 'cause they just so happened to create a subfolder in that sensitive repository for storing their printer firmware updates.
The IT member had grabbed that firmware file correctly and then went to go put it onto the printer, which was a removable device as detected by the agent; it was transferring it externally from the computer. By policy, that was not allowed. So by policy, that should have actually triggered a policy violation, 'cause they were taking something from a sensitive code repository and putting it on an external device.
What was really cool was watching the Linea product go in there and analyze each step of the process. It went in there and said: I know that this came from your source code repository, but this is actually a printer firmware file that I was able to find on the internet just by Googling, so this probably isn't proprietary source code.
And then it analyzed the user and said: this would make sense, because this user is part of IT, so it makes sense that they're grabbing printer firmware files. And then it analyzed the device ID, the product ID, and the vendor ID of the device it was going to,
and said: this is clearly not a USB storage device; this is a printer that they're trying to install this on. And it actually went in and reduced the false positive. It said: I know that this violates your policy, so I'm still gonna create the incident. But let me tell you, as someone that's already looked at this, you probably don't need to investigate this.
This is an informational-severity policy violation rather than the critical severity that the policy implied. So it really helped their security team lower its priority when they were going through their incidents, and also just piggyback off of what the AI already learned and confirm that they actually agreed with the AI's assessment, without a human having to look at it first.
So it's just one of those cool stories that I think these are popping up more and more as Linea has been released and tested and deployed. We're only using each customer's individual data to train their own AI model, right? We're not sharing that data across the, the environments. Um, but I, I think that, that, that does a good job of speaking to some of, you know, really what we're trying to accomplish here.
Um, next question: can Linea run without an internet connection, like in a closed, air-gapped setup? There could be different configurations that we could do there, Dev. To be honest with you, Linea is running in the cloud environment, in the cloud tenant. So I'm not fully aware if we're able to, like, shut off internet access for Linea, to be completely honest with you.
I believe we can. Um, but this is just kind of what the configuration for that customer was. You know, it had learned about these applications, and it was able to give that result. That's a good question. So I just wanna thank everyone for letting me go over. First of all, I know you all are busy.
I know that you don't have time to join webinars that go over, so I really do appreciate it. Uh, this right here is the link that you will use to kind of claim your gift card. Uh, so what I'm gonna do is I'm actually gonna stop sharing my screen. I'm gonna paste this link inside the chat. And for those of you that selected a gift card for attending this webinar, you can go into that form and request it.
For those of you that requested the donation, I don't believe that this is necessary. Um, I don't believe that you have to do this in order to get your donation made, or any gift cards or anything like that. Uh, so lemme go ahead and grab this link. I'm gonna put it in both the
messages section right here, and I'll throw it in the Q&A as well in case you guys are there. Um, thank you very much, Deb, I really appreciate it. Uh, Jeff as well. I hope this was useful for you all. I hope you all learned something. If you didn't, my apologies for wasting your time.
There's a hundred dollars to Uber Eats as a thank-you for bearing with me. Um, if any of you have any questions after today, I'm always available. Like I said, I've been here for a while, and I'm very lucky to get to lead an awesome sales engineering team. I'm gonna throw my email here right in the chat for all of you, so you all have direct access to me
if you have any questions about anything that I went over today, especially any follow-ups that I might need to get back to you on. So, thank you everyone for all the kind words. I really do appreciate it. Um, have a great rest of your day. Don't forget to access that link and get your gift card. I'm gonna post it one more time 'cause I would hate for you guys to miss that.
Um, and hopefully we'll see you guys soon. Talk to you soon. And I know there's already a few of you in the chat that I see at conferences on a regular basis, so come find me, come say hi. I'm always down to have a friendly conversation.





.avif)
.avif)
