Friday, November 8, 2013

"Stop the Madness!!!" - Mr. Wonderful, Shark Tank

For those who attended, it was a great week in Rockville at the recent ICSJWG Fall Meeting. I was very much hoping that many of the "alarmists" who continue to "cry wolf" about these DNP3 vulnerabilities would attend so that we could once and for all resolve some of the issues around how this is being communicated to the broader ICS security community.  Unfortunately, none seemed to show their faces except for Adam Crain, which gave me the opportunity to have a very detailed discussion with him about these vulnerabilities (unfortunately, the contents of these discussions will remain private).

What continues to be disappointing when reading these articles is the complete omission of how these vulnerabilities are seen in actual deployed, commissioned systems. If you look at the DHS ICS-CERT summary advisory ICSA-13-291-01, you will note that there are in fact 8 systems identified to date (I expect many more to come). However, of those listed with details publicly available, only the Alstom e-Terracontrol and Schweitzer RTAC act not only as the DNP3 Master (responsible primarily for data communication) but can also function as a form of SCADA Master (combining not just data comms, but visualization, historization, and applications). Put more graphically, this is the difference between "A master" and "THE master".  This means that if you can compromise the DNP3 Master, you will most likely impact the overall SCADA Master Server as well. The other identified systems are acting more like "gateways" within the overall ICS architecture.

So ... what exactly does this mean?

To start, Security 101 teaches you that if someone gains physical access to a remote site (a DNP3 outstation or substation in this case), you have to concede that it is compromised, and there is in fact very little you can do other than respond to and remediate these threats via "physical" means - a baseball bat works great in these situations! This location is pwned - game over! The security objective is now to prevent this initial compromise from affecting either the Central Site (such as a central control facility) or a peer site (such as another substation). In the case of the Alstom system vulnerability, we did in fact have a serious risk in that the substation could provide the means to initiate a successful remote attack against the Central Site.

However, the misrepresentation that surfaces in so many of these documents is that many of the systems identified as vulnerable were in fact acting as an application gateway that takes the vulnerable DNP3 traffic and converts it to OPC-DA RPC communications (which, if you don't know, already possess secure authentication mechanisms).  What this means is that if you are in fact successful in remotely launching the attack against the OPC Server which is also acting as the DNP3 Master Station, you have done nothing more than DoS/DoV/DoC the remote site which has already been compromised. In other words, the DNP3 vulnerability does not add any additional risk to the situation!

I submitted a diagram to many of those who like to argue this, and as expected, no one was willing to talk about the problem against a real architecture! In this case, your system has performed its security function and your corrective action is more in line with physical security improvements than cyber security ones.

What is probably most disturbing about all of this is that the mere deployment of the DNP-OPC application gateways is in fact a security countermeasure in itself, because the gateway is performing its main job of protecting the SCADA Master from compromise due to physical attacks against the DNP3 Master. So, unless these guys have figured out how to remotely take DNP3 commands, "pivot" through the host, and spawn malicious RPC commands (against an authenticated protocol), we have effectively stopped the attack (the good thing is that OPC-UA will make this even harder!).  Again, these "alarmists" will argue that the OPC Server is being used to aggregate multiple DNP3 outstations. Poor engineering design needs to be addressed through other security measures. This is no different than someone implementing non-redundant components in a high-availability architecture, and shows that the original system designer (which could in fact be some of these "alarmists") failed to perform the necessary FMEA on the design architecture before commissioning! This is something those of us who design against functional safety standards like IEC-61511 are very familiar with.

In closing, I have to tell a story.  I recently found a very significant and similar vulnerability in one of the world's leading DCS controllers that would allow me to completely shut down all controllers on a given controller network, provided I had PHYSICAL ACCESS to the controller network to initiate the attack. From a plant perspective, this could mean taking down a refinery's catalytic cracking unit or a power plant's boilers (very common applications for this device). I didn't cry wolf and say that nn% of this country's refined products were now at risk because of this vulnerability, because (a) the mere disclosure of this number is both irresponsible and probably inaccurate, and (b) it would not reflect the reality of how these devices are actually deployed.

I am very impressed with Adam and his work, and look forward to seeing great things from him in the future. I know that he is not to blame for this misrepresentation of data. I am only hoping that those who read this blog and are new to the ICS world realize the distinction between component vulnerabilities and system vulnerabilities. It is sad that my course is still the only one that really focuses on identifying and mitigating system vulnerabilities.

Mike Ahmadi and Billy Rios blew the crowd away (myself included!) yesterday with their talk about a project that evaluated medical devices in a deployed system environment. Everyone was energized and hopeful that this approach may soon become more common in industrial manufacturing environments.  Until then, I can only continue to teach the importance of taking a given vulnerability and its Base CVSS score, and then considering the Environmental and Temporal factors that reflect the net risk of that vulnerability to your organization AS DEPLOYED.
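To make that last point concrete, here is a minimal sketch in Python of how a Base score changes once Temporal and Environmental metrics are applied. It assumes the CVSS v2 equations and metric weights (the post does not say which CVSS version is meant, so check the values against the FIRST specification before relying on them). The function names and the example vector are illustrative only and are not the score of any actual advisory.

```python
# Sketch of CVSS v2 scoring. ASSUMPTION: v2 equations and weights;
# all function names here are hypothetical, chosen for illustration.
AV = {"L": 0.395, "A": 0.646, "N": 1.0}            # Access Vector
AC = {"H": 0.35, "M": 0.61, "L": 0.71}             # Access Complexity
AU = {"M": 0.45, "S": 0.56, "N": 0.704}            # Authentication
CIA = {"N": 0.0, "P": 0.275, "C": 0.660}           # Conf/Integ/Avail impact
E = {"U": 0.85, "POC": 0.90, "F": 0.95, "H": 1.0, "ND": 1.0}    # Exploitability
RL = {"OF": 0.87, "TF": 0.90, "W": 0.95, "U": 1.0, "ND": 1.0}   # Remediation Level
RC = {"UC": 0.90, "UR": 0.95, "C": 1.0, "ND": 1.0}              # Report Confidence
CDP = {"N": 0.0, "L": 0.1, "LM": 0.3, "MH": 0.4, "H": 0.5, "ND": 0.0}
TD = {"N": 0.0, "L": 0.25, "M": 0.75, "H": 1.0, "ND": 1.0}      # Target Distribution
REQ = {"L": 0.5, "M": 1.0, "H": 1.51, "ND": 1.0}                # Security Requirements


def _base_from_impact(av, ac, au, impact):
    """Base equation, shared by the plain and the adjusted base score."""
    exploitability = 20 * AV[av] * AC[ac] * AU[au]
    f = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f, 1)


def base_score(av, ac, au, c, i, a):
    impact = 10.41 * (1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a]))
    return _base_from_impact(av, ac, au, impact)


def temporal_score(base, e="ND", rl="ND", rc="ND"):
    return round(base * E[e] * RL[rl] * RC[rc], 1)


def environmental_score(av, ac, au, c, i, a, e="ND", rl="ND", rc="ND",
                        cdp="ND", td="ND", cr="ND", ir="ND", ar="ND"):
    # Impact is re-weighted by how much the organization cares about C, I, A.
    adj_impact = min(10.0, 10.41 * (1 - (1 - CIA[c] * REQ[cr])
                                      * (1 - CIA[i] * REQ[ir])
                                      * (1 - CIA[a] * REQ[ar])))
    adj_base = _base_from_impact(av, ac, au, adj_impact)
    adj_temporal = temporal_score(adj_base, e, rl, rc)
    return round((adj_temporal + (10 - adj_temporal) * CDP[cdp]) * TD[td], 1)


# Hypothetical vector: network-reachable denial of service (AV:N/AC:L/Au:N/C:N/I:N/A:C).
vector = ("N", "L", "N", "N", "N", "C")
print("base:", base_score(*vector))
# Same component, but only a small share of the deployed systems is exposed (TD:L).
print("as deployed:", environmental_score(*vector, td="L"))
```

The component is the same in both calls; only the deployment assumptions differ, and that difference is the gap between a component vulnerability and a system vulnerability.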


Stay safe and secure ....

3 comments:

  1. I think we already agree on a lot of the gateway vs. "big M" Master differences, but there's a subtle point here that I think you're missing.

    Authentication and software security are not the same thing.

    https://automatak.net/wordpress/?p=39

    I'm not familiar with OPC-DA, but if its authentication is high up in the stack like the new DNP3-SA spec, it won't protect you from this class of vulnerability.

    We broke 5/6 systems we tested with secure authentication enabled, because we crashed the stack's parser before it even had a chance to authenticate.

    Once again, auth in OPC-DA may be in a better place, but auth != software security/robustness.

    I was actually trying to make this point for months with the DNP3 standards committee. They're ICS protocol experts, and they didn't get it. Many security experts that specialize in broad system security don't get this nuance either.

    Also, bigger protocols will tend to have MORE vulnerabilities than smaller ones:

    https://automatak.net/wordpress/?p=332

    The only ways to mitigate these issues are to put the code inside trust boundaries (i.e. put your auth "low" in the stack, e.g. wrap with TLS) or to do a lot of testing.

    A good way to think about this is the following:

    I don't care if you're running TLS if I can pop your TCP/IP stack in your OS with a buffer overflow.

  2. Joel, I want to first thank you for the kind words. I know you are a major skeptic of all things security, so blowing you away is indeed a proud moment.

    Regarding the work Adam is doing, perhaps what he is exposing may not have a real "practical" application (today) in the threat actor world...but we simply do not know. His work is the first of its kind, and hats off to him for making it happen. Time will tell what this leads to.

    I think the more salient point is that there are some new players in the ICS testing space that are likely to discover some new and interesting vulnerabilities, and that is a very good thing. I believe that the lack of diversity in the fuzz testing arena has limited the ICS world until now.

    That has now changed, as you know.

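The first comment's point, that authentication placed "high" in the stack does not protect a parser that runs before it, can be sketched in Python. This is a toy under stated assumptions, not DNP3 or OPC-DA code: the length-prefixed frame format, the shared key, and every function name are hypothetical, and the unchecked index stands in for the kind of parser bug a fuzzer finds.

```python
import hashlib
import hmac
import struct

KEY = b"shared-secret"   # hypothetical pre-shared key
TAG_LEN = 32             # size of an HMAC-SHA256 tag


def fragile_parse(frame):
    """Toy parser that trusts the 2-byte length prefix (the deliberate bug)."""
    (length,) = struct.unpack(">H", frame[:2])
    body = frame[2:]
    out = bytearray()
    for i in range(length):
        out.append(body[i])   # IndexError when the length field lies
    return bytes(out)


def tag_ok(message, tag):
    expected = hmac.new(KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


def handle_auth_high(packet):
    """Auth 'high' in the stack: attacker bytes reach the parser first."""
    payload = fragile_parse(packet)
    message, tag = payload[:-TAG_LEN], payload[-TAG_LEN:]
    return "accepted" if tag_ok(message, tag) else "rejected"


def handle_auth_low(packet):
    """Auth 'low' in the stack: raw bytes are verified before any parsing."""
    if len(packet) < TAG_LEN:
        return "rejected"
    raw, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    if not tag_ok(raw, tag):
        return "rejected"
    fragile_parse(raw)
    return "accepted"


# An unauthenticated packet whose length field claims far more data than it carries.
evil = b"\xff\xff" + b"x" * 40
```

With these definitions, `handle_auth_low(evil)` returns `"rejected"` without ever touching the parser, while `handle_auth_high(evil)` raises `IndexError` inside `fragile_parse` before the tag is checked. The parser bug is identical in both; only its position relative to the trust boundary changes.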