Mean time to find, fix
Two very useful metrics
Do you use others?
Finding the right terms to communicate to executives and boards is more art than science. How do you measure success?
Every single tool employed by our Security Operations Center comes with a litany of metrics: events per second (EPS); endpoints protected; files and processes blocked or quarantined; alerts triggered; files and processes permitted; emails captured with potential malware, malicious URLs, phishing, spoofing, or domain spoofing; the percentage of people caught by ethical phishing. All of these metrics are useful for analyzing the effectiveness of the tool itself, but they prove far less useful when communicating with a group of people who want to know why one $200,000 investment should be given higher priority than another in the budgetary process.
As non-revenue-producing cost centers, information systems departments and cybersecurity functions must find ways to communicate several concepts while stripping out industry jargon and product-specific technical terms. One method of conveying this information is in terms of risk. What's our exposure factor? What's the likelihood that an event this tool could prevent will happen in our environment? If that event happened, how much could it cost our organization? What's the cost of implementation (including professional services, for the most accurate figure)? What's the difference between that potential loss and that cost? Are there ways to mitigate the problem without this solution? What do those methods cost?
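These questions line up with the standard quantitative risk formulas: single loss expectancy (SLE = asset value × exposure factor) and annualized loss expectancy (ALE = SLE × annual rate of occurrence). The sketch below walks through that arithmetic; every figure, including the 80% reduction the tool is assumed to deliver, is a hypothetical placeholder, not a real number from our environment.

```python
# Classic quantitative risk math behind the questions above.
# All inputs are hypothetical placeholders, not real figures.

asset_value = 2_000_000          # value of the asset at risk ($)
exposure_factor = 0.40           # fraction of asset value lost per incident
annual_rate_of_occurrence = 0.5  # expected incidents per year

# Single loss expectancy: cost of one incident
sle = asset_value * exposure_factor

# Annualized loss expectancy: expected loss per year without the tool
ale_before = sle * annual_rate_of_occurrence

# Suppose the tool cuts the occurrence rate by 80% (assumption)
ale_after = ale_before * (1 - 0.80)
tool_cost_per_year = 200_000     # license plus professional services, annualized

# "What's the difference between the two?" -- loss avoided versus spend
net_benefit = (ale_before - ale_after) - tool_cost_per_year
print(f"SLE: ${sle:,.0f}")
print(f"ALE before: ${ale_before:,.0f}, after: ${ale_after:,.0f}")
print(f"Net annual benefit of the tool: ${net_benefit:,.0f}")
```

Framed this way, the $200,000 line item stops being an abstract tool purchase and becomes an expected annual loss avoided, which is a comparison a board can weigh against any other request.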
Another valuable model for clarifying what a tool could accomplish, or has accomplished, is the event timeline. Boards and executives are increasingly aware of how long criminals operated undetected inside another company before they were found. In my research, a common term for this is mean time to find, and we use it when communicating about the budget. We come up with current figures determined by proofs of concept, compare those performance numbers against past performance in the same area (presumably worse without the tool), and express the difference as a percentage of improvement. Leadership is also alert to how long an organization knew about a security event before action was taken. We refer to this internally as mean time to fix. These numbers get the same analytical rigor and are presented at budget time as percentages of improvement.
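A back-of-the-napkin version of that calculation, using invented incident timestamps and an invented baseline, might look like this:

```python
from datetime import datetime
from statistics import mean

# Hypothetical incidents: (intrusion start, detection, remediation)
incidents = [
    (datetime(2023, 1, 3), datetime(2023, 1, 20), datetime(2023, 1, 25)),
    (datetime(2023, 4, 10), datetime(2023, 4, 18), datetime(2023, 4, 20)),
    (datetime(2023, 7, 2), datetime(2023, 7, 6), datetime(2023, 7, 7)),
]

# Mean time to find: average days from intrusion to detection
mttf = mean((found - start).days for start, found, _ in incidents)

# Mean time to fix: average days from detection to remediation
mttfix = mean((fixed - found).days for _, found, fixed in incidents)

# Percentage of improvement against a prior-period baseline (hypothetical)
baseline_mttf = 30.0  # days, before the tool
improvement = (baseline_mttf - mttf) / baseline_mttf * 100
print(f"Mean time to find: {mttf:.1f} days ({improvement:.0f}% better than baseline)")
print(f"Mean time to fix: {mttfix:.1f} days")
```

The percentage of improvement is the number that travels well in a budget deck; the underlying timestamps stay with the analysts who can defend them.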
Not all tools can be measured in quite the same manner, because the problems they solve aren't so cut and dried. When that happens, we start looking at how we can map those tools to regulatory compliance or to a maturity model. As a healthcare organization, we use the HIPAA regulations. We also look to frameworks like the CIS Top 20 and NIST, to attestations made to health insurance companies (payers), and to the questionnaire for Most Wired Hospital. All of these measures are somewhat less concrete, but they help paint the picture that our team uses recognized benchmarks to self-evaluate and to keep pace with industry leaders, identifying and employing the best measures possible to secure our community within our given means.
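One lightweight way to keep that mapping honest is a simple coverage table of tools against framework controls. In the sketch below, the tool names and their control assignments are invented for illustration; the point is the gap list it produces for budget discussions.

```python
# Hypothetical mapping of SOC tools to CIS Top 20 controls.
# Tool names and control assignments are illustrative only.
tool_controls = {
    "email-gateway": {7, 9},    # e.g., email and browser protections
    "edr-platform": {8, 13},    # malware defenses, data protection
    "siem": {6, 16, 19},        # log management, monitoring, incident response
}

required_controls = set(range(1, 21))  # CIS Top 20

covered = set().union(*tool_controls.values())
gaps = sorted(required_controls - covered)

print(f"Controls covered: {len(covered)}/20")
print(f"Gaps to discuss at budget time: {gaps}")
```

A table like this won't produce a percentage of improvement, but it shows leadership exactly which recognized benchmarks a proposed tool would move, and which gaps remain unfunded.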