INITIAL POST BY RAPHAEL PEREIRA:
"I am writing a paper about the benefits and problems OS use of cloud technology in Control Systems. Does anyone have experience with this use? I am looking for use of virtualization technology, syncronous data tranfers, DR sites and other applications that we could use to help to be control systems more realible."
Jake Brodsky • I have several things to point out:
First: The primary purpose of virtualization in this arena is not to save on the number of servers, but to make existing systems more resilient and to reduce recovery time from software crashes.
Second, if you're using these servers to consolidate all processing to one place, you're doing it wrong. The point of this exercise is resiliency. Make sure you have at least one backup server room on the other side of the plant with backup servers capable of running the whole show from a separate power source.
Third, network design is key. Make certain that the switches (and routers, if any) are capable of handling the traffic of moving images from server to server without affecting plant operations. In other words, don't buy a cheap layer 3 switch and then get surprised when the backplane saturates.
Fourth, make sure the network infrastructure is also distributed. In other words, even if you diversify the servers, it won't do you much good if everything is focused through one great big switch of everything.
Fifth, virtualizing is great --as long as the applications support it. Make certain that the HMI vendor, driver vendors, historian vendors, and any other application vendors all support this.
Finally, in case you haven't already figured this out: Just because you've virtualized the control systems apps doesn't mean you can put office applications on the same server. Security issues aside, office application performance management is very different from control systems application performance management As with networks, while it is theoretically possible to mix them, in practice it is more trouble and more risky than any potential savings one might see.
Andrew West • I know of a utility that is using virtualization in much the manner that Jake describes: The benefit of virtualizaiton is to manage version control (it is easy to roll back to the previous configuration) and facilitiate failover to a backup system if a fault takes one of the servers down: The process image is checkpointed to a backup machine and is made active with almost no downtime. This system also keeps the disaster recovery image at a second control centre current in a similar way.Ron Southworth • I know people are using virtualised environments for testing. Some vendors are starting to certify and support virtualised environments. Two vendors in the power systes space spring to mind. I know of a few (less than the fingers on one hand) owners and operators that are perhaps bravely (hapily) using a virtualised environment. Early days for control systems lots of advantages and challanges to sort out as Jake has mentioned.
Perry Pederson • Perhaps I am going to the same place as the dinosaurs, but I would never EVER put my safety systems in the cloud. Take whatever threats and vulnerabilities there may be and multiply them by some huge unknown number and then try to sleep at night. Fugetaboutit!
Joel (the SCADAhacker) Langill • I have lots of experience with virtualization, and am completely behind this movement in order to help isolate the dependence on hardware from the functionality of the control system software. Coming from a vendor, too much time was spent on compatibility issues with hardware, where it should have been spent on validation of software functionality and its inherent security!
I have been using virtualization for several years in certain aspects of the project lifecycle and system architecture. Let's expand on this, starting with the high value, low risk areas first.
One of the best most obvious locations for virtualization is within the test and development environment of a facility. With the risk presented by installing untested or "lightly" tested patches and updates, virtualization eliminates most of the hurdles that were common in the past relating to building and maintaining a separate test bed for such a purpose. Patches need to be stressed more against the software installed, than the hardware platform upon which they are installed.
Next, virtualization is perfectly aligned with high-fidelity training simulation facilities that are becoming more common as manufacturing facilities are required to demonstrate that operating personal are regularly tested on their ability to operate and control the facility under a variety of planned and unplanned events.
We also are beginning to see more virtualization show up on the application level within the control system architecture, including historization and advanced application platforms. This are commonly applications that are not critical to maintaining production levels, and the vendors that are entering the cloud are beginning at this level.
I have also begin to work with virtualization on several of the hosts that would typically reside within 1 or more DMZs, such as web servers and jump hosts. Virtualization effectively allows us to create an architecture with multiple functional DMZs all directly connected to dedicated virtual platforms. This all but removes many of the common concerns sites had with "too many" DMZs and how this is managed with a traditional firewall. What some of the more progressive designs have also considered is the use of virtual firewall appliances as well that are used "downstream" of a traditional dedicated appliance, further allowing these sites to be built with very restricted and dedicated functional DMZs.
The one contraint that is going to continue to hinder the deployment of virtual technologies within the lower levels of the control system architecture will be the lack of flexibility in terms of peripheral support. It would be next to impossible, for example, to create a Profibus interface adapter that would be certified by both the vendor and the virtual software platform. As long as we have "non-standard" or proprietary technologies, it will be difficult to completely migrate the traditional level 1 and level 2 applications to the cloud.
Hope this provides some insight ... it certainly has given me some thoughts for a blog entry of my own!!! (blog.SCADAhacker.com).
Ron Southworth • G'day Perry & Joel.
I don't think you are going the way of the dinosaurs.
(If you are then call me a Muttaburrasaurus )
Perry you are reflecting what I would say is my present risk appetite, especially when I look at many industrial process control systems designs in the face of modern targeted malware.
Joel I am yet to see an implimented cloud that isn't more flawed than a legacy control system to be completely frank. Most of the cloud offerings I have seen are really more to do with outsourcing. I hope that folks will learn that this actually costs an enterprise up to 2.5 times the cost of operating and maintaining your own non core buisness enterprise technologies.
There is a big difference between running systtems in a virtualised environment and what all this cloud computing is all about. Please don't misunderstand me it isnt a question of if it is more a question of when I guess, and I think that folks are racing too fast towards embracing and merging technologies. Profi- Bus is about providing deteriministic operation something that is an issue not being given enough attention. We have to be careful about what we standardise on or how we impliment technologies.
Hopefully many other folks will see or have a better understanding or aprecaiation of this aversion to operational risk that I am speaking about. Many folk when they talk about mitigation techniques to permit connectivity from the board room to the plant floor I don't think really have a firm operational understanding of Protection and Safety Systems fragility.
Ralph Langner prior to all this malware hyperbole was developing a great paper on the subject - well on it's way to being a book actually and he was using fragility as a means to close and explain this gap of understanding.
I still stand by what I said at a conference a few years ago now. Sometimes the only effective mitigation we have to cyber threats with all it's limitations is physical segregation. It is by no means perfect but it removes a whole heap of "cyber" problems off the table providing you have your house in oder on your human elements.
As you say Joel there are some levels of an enterprise that might be capable of being supported in the cloud however I will need a lot of convincing as to when this should happen.
Joel (the SCADAhacker) Langill • Ron ... excellent points ... but let me be clear ... my implementations and examples provided in my comment above are not for the "enterprise", but for the control system domain. These have been implemented, and when implemented by individuals who understand how to effectively implement and secure virtual environments, they are quite reliable in practice. This means that those individuals have actual experience in implementing a true virtual environment based on a hypervisor, and not simply taking some casual backoffice experience with a product like VMware Workstation or Server and trying to move this into a production environment.
Maybe we need to talk further about what you have observed in your implementations versus mine. Since I prefer to make security a base requirement, these systems are significantly more secure than any legacy control system I have personally used. However, since my first love is control systems, and I have grown into security over the years, my design approach is very different than most!
I too have seen very poor cloud implementations, however, these should not discredit the solution, but rather discredit the individuals responsible for its implementation. For starters, most fail in their virtual implementations with the poor "built-in" and "default" configurations relating to virtual networks. This then is further exaggerated with less-than-optimal designs around virtual management.
I read the ISA article on HMI in the cloud, and personally, think this really misses the true value proposition of virtualization to both the owner-operator (end-user) and vendor. With nearly 18 years experience as a vendor, the real value lies not in the HMI nodes, but the server nodes. There may be some performance gains with the HMI nodes, and these tend to be easier due to their lack of non-standard hardware, however, the nodes that cause use the greatest headache continue to the those that are based on a server operating system (primary/backup system servers, historians, applications, etc.).
A solid virtualization platform, like vSphere from VMware, allow vendors to implement hardware redundancy on nodes that they have not been successful in the past in providing high-availability solutions. The features provided by products like vMotion offer an opportunity to proper vendors into a new domain of reliability without really investing much from a product development perspective. Take common application nodes like those used for batch management, multivariable control, optimization, and web services ... these are non-existent in a cost effective redundant configuration (barring something like a Marathon product which defeats the benefits of COTS hardware), but are fairly straightforward in a virtual world.
I only used Profibus as an example of what virtualization cannot be used everywhere, however, please focus on the main point of my comment which shows significant benefit of virtualization relating to two components of any control system architecture that impacts the overall security posture of the system: patch management; application development, testing and migration; and DMZ applications.
I paper on this topic is a great idea, and is something I will definitely add to my 2011 goals and objectives! Any suggestions would be greatly appreciated and respected.
Jake Brodsky • Following our design, we are currently testing a virtualized HMI and Historian system for a water filtration plant. As you say Joel, there are many pitfalls. Too many are selling office oriented systems, treating this application as if it were just another web server. I have no patience for such idiots.
By doing this we are treading a very find line between complexity and usability with this technology. Remember, people will have to use this system during times of stress and fatigue. It is difficult for some IT experts who live and breath this stuff to understand that on a plant, in the wee hours of the morning, with the superintendent breathing down your neck and the plant radio system squawking away, most of the HMI or historian systems dead in the water, and potential hazards ready to engulf, explode, or burn someone --that someone (a 24 hour duty engineer like me) has to remember how to bring this stuff online!
Yeah, when things are routine, when you're managing stuff on a planned schedule, it is a wonderful technology. However, Murphy's law says that things will fail in the worst possible way at the worst possible time. We're trying to find simple instructions to diagnose and repair these systems so that a tired duty engineer can talk an operator through this problem in a matter of minutes.
Sometimes, even though a solution may take longer to recover, the simplicity and predictability may make a simpler system more desirable. Ultimately, the goal is to get back up and running in minimal time. The fewer opportunities to make mistakes, the more likely it is that that recovery from an outage will happen sooner.
That said, we see a value in this technology and we are implementing it with a look toward using our experience to push solutions of this sort elsewhere. However, support from our vendors has been tepid; the tools, particularly the licenses, are confusing; and the costs, while reasonable, aren't exactly small change. There are lots hazards to navigate on this still poorly traveled road. Those who casually wave their hands about this while glossing over the details clearly haven't done this before or do not have to service this creature after it has been built.
Virtualization has future for control systems design. I think it will be a bright one. But as with many early adopters, there are still many lessons to be learned.
Joel Langill • Excellent points, Jake. Again, I think this group dialogue will result in
excellent material for a paper, and I hope no one objects to its use. (I
wonder is the original author of the post from 7 months is still
I agree completely, but am somewhat disappointed that the bad reputation of
virtualization is tied more to the quality of the implementation than the
actual technology of virtualization and what it offers.
Good luck with your project. I could see virtualizing an HMI for a SCADA
type workstation like Wonderware, but this would not be my first choice
with a system like Centum, DeltaV or Experion. I will stick to level 3
nodes and development / patch management / training systems for now. I
would be interested in feedback in the future.
Jake Brodsky • And as if on cue here is a Dilbert Cartoon to illustrate my point: