Incident response: preparing for the unknown
Nobody is immune to security breaches. Preparing for incident response is critical, so let's look at the best practices.
There is no such thing as 100% security. Even if a system was developed to be as free of vulnerabilities as possible, unknown vulnerabilities may still be present and exploitable. The weakness may not even be in your code: third-party software used in your DevOps pipelines can be vulnerable too. Worse yet, the system may be exposed for non-technical reasons such as social engineering or employee negligence. This necessitates the practice of incident response: preparing and executing guidance for identifying, investigating, handling, and recovering from incidents. The OWASP Top Ten 2017 also covers this issue under A10: Insufficient Logging & Monitoring.
Best practices in this area come from secure software development standards and recommendations: the Incident Response practice within OWASP SAMM, Configuration Management & Vulnerability Management within BSIMM, the Incident Response Reference Guide referenced within SDL, and NIST SP 800-61 along with NIST SP 800-184. Each of these documents looks at the problem from a different angle, and may use slightly different terminology at times, but their conclusions are very similar.
As this is a massive topic, this article condenses the most important takeaways on incident response. More detailed practices can be found in the documents named above.
Being proactive to react better
The main rule of thumb in incident response is to focus on preparation. Having personnel, systems, and clearly documented processes in place can make dealing with incidents significantly less painful. Another important rule is to stay calm and follow the plan. Dealing with an incident is a high-stress situation for everyone involved, possibly compounded by external pressure from customers or the press, and overreacting can cause significant damage or even indirectly help the attacker. Simulating an incident before it happens can help with this.
Before anything else, audit the system and do threat modeling to understand which components are likely to be the biggest targets. Have a threat intelligence program in place to stay vigilant about new attack vectors and threats. Apply monitoring, intrusion detection, and data protection measures to the system accordingly. This step also defines what is considered an incident in the first place.
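To make "what is considered an incident" concrete, here is a minimal Python sketch of one possible detection rule. Everything in it is an assumption for illustration: the Event type, the "login_failed" event kind, and the threshold of 5 are hypothetical, and a real definition would come out of your own threat model and monitoring data.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical threshold: this many failed logins from one source counts as an incident.
FAILED_LOGIN_THRESHOLD = 5


@dataclass(frozen=True)
class Event:
    """A simplified monitoring event (illustrative shape, not a real log format)."""
    source_ip: str
    kind: str  # e.g. "login_failed" or "login_ok"


def find_incidents(events, threshold=FAILED_LOGIN_THRESHOLD):
    """Return the sorted source IPs whose failed-login count reaches the threshold."""
    failures = Counter(e.source_ip for e in events if e.kind == "login_failed")
    return sorted(ip for ip, count in failures.items() if count >= threshold)


# Example data using documentation-reserved IP addresses.
events = [Event("203.0.113.7", "login_failed")] * 6 + [
    Event("198.51.100.2", "login_failed"),
    Event("198.51.100.2", "login_ok"),
]
print(find_incidents(events))  # prints ['203.0.113.7']
```

Writing the rule down, even in this simplified form, forces the team to agree on a threshold in advance instead of debating it in the middle of an incident.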
Then it is time to prepare an incident response team along with an incident response plan. This plan should be part of standard risk management processes. It should cover how to investigate a problem, how to triage and mitigate it, how to recover from it, and finally how to document it. The incident response team should have a direct channel to the engineering team (unless it is part of the engineering team to begin with) for hotfixing the vulnerability.
It’s not just about the tech
Incident response is always a team effort, and that doesn’t just mean the literal incident response team, or even the IT team. Handling an incident can involve PR and legal concerns as well as significant operational changes, such as shutting down several components. Any public response should be coordinated between all of these units rather than handled by each one on its own. The Incident Command System approach can be used here as well.
Disclosure needs to be centralized, transparent and clear to customers. For key customers, direct non-public disclosure of specifics may also be necessary.
Finally, incident response is an evolving process. The lessons learned – and metrics gathered – from an incident or breach should be used to update the incident response plan accordingly. Of particular importance is root cause analysis – finding what caused the incident in the first place. If it was due to a bug in the code (which is not always the case), it is worthwhile to look for similar bugs in the codebase and fix them as well.
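As an illustration of gathering metrics from past incidents, the Python sketch below computes mean time to detect and mean time to resolve. These two metrics are assumptions chosen because they are commonly tracked; they are not prescribed by this article, and the IncidentRecord fields and sample timestamps are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IncidentRecord:
    """Timestamps for one past incident (hypothetical record shape)."""
    started: datetime   # when the incident actually began
    detected: datetime  # when the team noticed it
    resolved: datetime  # when recovery was complete


def mean_delta(deltas):
    """Average a non-empty collection of timedelta values."""
    deltas = list(deltas)
    if not deltas:
        raise ValueError("at least one incident record is required")
    return sum(deltas, timedelta()) / len(deltas)


def summarize(records):
    """Compute mean time to detect and mean time to resolve."""
    return {
        "mean_time_to_detect": mean_delta(r.detected - r.started for r in records),
        "mean_time_to_resolve": mean_delta(r.resolved - r.detected for r in records),
    }


# Made-up sample records, for illustration only.
records = [
    IncidentRecord(datetime(2020, 1, 1, 0), datetime(2020, 1, 1, 2), datetime(2020, 1, 1, 6)),
    IncidentRecord(datetime(2020, 2, 1, 0), datetime(2020, 2, 1, 4), datetime(2020, 2, 1, 6)),
]
print(summarize(records))
```

Tracking these numbers across incidents gives a simple way to check whether updates to the incident response plan are actually improving detection and recovery.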
The do’s and don’ts of incident response
Prevention is always the best cure, but incident response can mean the difference between a one-time data loss and a business-ending catastrophe.
Take the Equifax breach of May-July 2017 as a clear example of bad incident response. The tools released by Equifax to handle credit freezes had multiple vulnerabilities. Their official social media accounts inadvertently redirected victims to fake sites. Worse yet, the company appeared to be trying to avoid legal consequences. The incident also spawned lawsuits that continued well into 2020.
Because successful responses attract less publicity, positive examples of incident response are harder to find. Nuance’s response to the Petya ransomware in 2017 was well-regarded in the healthcare domain. A 2019 publication analyzing data breach notifications in Maryland shows a few positive examples of how to notify users, such as explicitly confirming the incident and the data that was compromised.
It is also a good idea to create a bug bounty program and use ethical hacking to your advantage. This way you can downgrade many potential breaches and incidents to vulnerability reports, which can be handled without requiring incident response in the first place!
Of course, nobody knows your software better than your developers – it’s a good idea to have them act as the first line of defense by teaching them the mindset and skills required to implement code that is resilient against attacks and limits the attacker’s capabilities. Check out our course catalog to find the training that best fits your product’s needs!