There’s a lot of hype over machine learning and data science these days. It’s time to go beyond the buzzwords and find out how we, as a security community, can actually reap the benefits of machine learning: fine-tuning staffing needs, making smarter budget decisions, automating and adapting security incident response, and reporting and assessing threats more accurately.
Machine learning isn’t the be-all and end-all of security operations and program management. We aren’t going to stop a ransomware attack just because we have more data on how the last one occurred. It doesn’t matter how much data we have if we don’t understand it or know what to do with it. Data won’t get us anywhere without someone on staff (or a full staff, if you’re really serious) who can understand it, analyze it, and create a meaningful execution strategy using the insights it yields.
The Joint Chiefs of Staff have defined intelligence as, “Information and knowledge about an adversary obtained through observation, investigation, analysis, or understanding.” We need to apply that principle to today’s threat landscape. Here’s how.
Plan your work, and work your plan
The idea behind data science is to take the huge amount of information available today from all of our devices and networks and enrich it to make it actionable. Data science naturally relies on data, and the more the better, but it isn’t just about quantity; it’s about quality. Data science, like all analysis methodologies and techniques, is subject to “garbage in, garbage out.” The result of our analysis (intelligence) is only as good as the information the system or systems ingest. Sure, we’re going to enrich a baseline data set with additional sources of information to improve its quality, its accuracy, and the actions that can be taken on it, but if all of our data sources are garbage, our result will ultimately be garbage.
Machine learning and data science strategies need to be well thought out and planned. What is the objective of implementing machine learning? What do we want to get out of all the data we are collecting? Define your data plan with objectives, criteria for success, and how you plan to evolve as a result. One example of a methodology that can help is the OODA Loop: Observe, Orient, Decide, Act (courtesy of Col. John Boyd).
Machine learning actually can help stop the next attack – if you do it right
Machine learning can help you move past a generic authentication and access security analysis model, toward a user and entity behavior analytics (UEBA) model. This can help distinguish normal from abnormal behavior for users and their devices. Does Alice normally log in at 2:47 a.m. on a Wednesday, from halfway across the world, and transfer gigabytes of data? If the answer is “no,” flag this event for immediate investigation, or perhaps block access outright and then investigate.
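To make the Alice example concrete, here is a minimal sketch of the idea in Python. It is not how any particular UEBA product works: the `LoginEvent` fields, the sample history, and the z-score threshold of 3.0 are all assumptions chosen for illustration, and a simple statistical baseline stands in for a trained model.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class LoginEvent:
    # Hypothetical event shape; real logs will carry different fields.
    user: str
    hour: int        # hour of day, 0-23
    country: str
    bytes_out: int   # data transferred during the session


def build_baseline(history):
    """Summarize a user's past sessions into a simple behavioral profile."""
    volumes = [e.bytes_out for e in history]
    return {
        "hours": {e.hour for e in history},
        "countries": {e.country for e in history},
        "mean_bytes": mean(volumes),
        "std_bytes": pstdev(volumes),
    }


def anomaly_reasons(event, baseline, z_threshold=3.0):
    """Return the ways an event deviates from the baseline (empty if none)."""
    reasons = []
    if event.hour not in baseline["hours"]:
        reasons.append("unusual hour")
    if event.country not in baseline["countries"]:
        reasons.append("unusual location")
    std = baseline["std_bytes"] or 1.0  # avoid dividing by zero
    if (event.bytes_out - baseline["mean_bytes"]) / std > z_threshold:
        reasons.append("unusual transfer volume")
    return reasons


# Made-up history: Alice works business hours from one country.
history = [
    LoginEvent("alice", 9, "US", 20_000_000),
    LoginEvent("alice", 10, "US", 25_000_000),
    LoginEvent("alice", 9, "US", 22_000_000),
    LoginEvent("alice", 11, "US", 18_000_000),
    LoginEvent("alice", 10, "US", 24_000_000),
]
baseline = build_baseline(history)

suspicious = LoginEvent("alice", 2, "SG", 8_000_000_000)
routine = LoginEvent("alice", 9, "US", 22_000_000)

suspicious_reasons = anomaly_reasons(suspicious, baseline)
routine_reasons = anomaly_reasons(routine, baseline)
```

In this sketch the 2 a.m. login trips all three checks, while the routine session trips none. A production system would learn far richer baselines and tolerate legitimate change such as travel, but the decision it supports is the same: flag the event, or block and then investigate.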
Machine learning can help organizations in every sector – tech, financial, healthcare, etc. – move from a reactive to a proactive security model. And it can help control costs by providing executive and product management the information they need to make informed, strategic business decisions.
Some essential questions to ask when you’re going through the machine learning-based security vendor selection process:
- Does the vendor understand the data they’re analyzing and technology they’re peddling?
- Do they actually have security experts on staff who can verify the efficacy of the protection provided by the solution?
- Can they detect only the poorly formed, fuzzing-based attacks generated by automated tools, or can they truly detect targeted, unique anomalous behavior?
Asking these questions will help uncover the real experts who can actually help distill data to meet business goals, like reducing our attack surface.
What are the most valuable business insights CSOs can gain from data science?
- Use your budget intelligently. Based on analysis, are we using hundreds of SaaS apps? Maybe it’s an appropriate time to research cloud access security broker (CASB) solutions.
- Fine-tune staffing needs. Cross-train staff in other activities or disciplines, increasing their contributions and the diversity of their skills. Get increased ROI from existing staff.
- Accurately report on the types of attacks we’re seeing to quantitatively determine risk, and develop prioritized plans of action.
- Data is our most valuable asset. Are we protecting it properly? Who has access? How are they using the data? Can we follow the chain of custody for confidential data?
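As a rough illustration of the attack-reporting point in the list above, the sketch below counts incidents by type, weights each type by an impact score, and ranks the results. The incident labels and the impact weights are invented for the example; any organization would substitute its own data and scoring scheme.

```python
from collections import Counter

# Hypothetical incident log: one attack-type label per recorded incident.
incidents = [
    "phishing", "phishing", "ransomware", "credential stuffing",
    "phishing", "credential stuffing", "phishing",
]

# Hypothetical impact weights (1 = low, 5 = severe).
impact = {"phishing": 2, "ransomware": 5, "credential stuffing": 3}


def prioritize(incidents, impact):
    """Rank attack types by frequency multiplied by impact, highest first."""
    counts = Counter(incidents)
    scores = {kind: n * impact.get(kind, 1) for kind, n in counts.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = prioritize(incidents, impact)
for kind, score in ranking:
    print(f"{kind}: {score}")
```

Frequency times impact is only one simple way to score risk, but even a crude ranking like this gives a board-level conversation something quantitative to start from.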
Having and understanding that data helps with board-level conversations, resource allocation, funding and staffing – all of which are at the top of the CISO/CSO priority list today, and will remain top priorities for at least the next five to 10 years.