Rich Crowther, Head of the Defence Digital Service (DDS), explains why we think that - even in Defence - we can secure our OFFICIAL workloads better in the public cloud than we can on-premises.
In Defence we’re starting to make more use of the public cloud for handling our OFFICIAL information. As set out in the government classification policy, this includes routine business operations and services that are not subject to a heightened threat profile. However, some people find it counter-intuitive that cloud services shared with other organisations or individuals can be considered to be as secure as those in data centres on military bases. A few years ago I’d have argued that the cloud can be ‘just as secure’ as our on-premises data centres for hosting OFFICIAL workloads, but today I’d say that in most circumstances we can do a better job of security in the cloud than we can do on-premises.
Below I set out the three main reasons why I believe this.
#1: Security patches can be applied faster
Let's face it, most organisations, whether public or private sector, struggle to keep on top of maintaining their infrastructure. Whether it's operated in-house or outsourced to a supplier, on-premises infrastructure is rarely well maintained. Consider the full technology stack underpinning a single website in an on-premises environment: even in a simplified example we have the hardware, the firmware, the Basic Input/Output System (BIOS) that operates the low-level hardware, the operating system, the web server software and then the web application itself. If all of these layers were fully patched at the start of the system’s life, will they ever be again? Do you even notice when a new version of the BIOS or firmware is released for your hardware? Perhaps you do, but in my experience, most people don’t.
And why does patching matter? Well, I don’t like talking about security in absolutes, but I’ll make an exception for this one: if you fall behind with patching, your system is not secure. All threat actors, from bored and mischievous teenagers through to nation states, have the capability to attack an unpatched system. It can take just days for a newly released security patch for an operating system component to be reverse engineered, the vulnerability it fixes identified, and a reliable working exploit developed. Sometimes these exploits will be kept secret by a threat actor; sometimes they’ll be added to popular hacking tools for the world to use and explained in YouTube tutorials. If you’re an organisation that measures time-to-patch in a small number of days, then you’re probably going to be OK, most of the time. But if you measure it in weeks or months, then you’re probably not moving fast enough.
Case study: Speculative execution (aka Spectre)
Contrast how we measure speed of patching in on-premises environments with how rapidly the major public cloud vendors responded to the speculative execution vulnerabilities in early 2018. Spectre was remarkable in that it represented an entirely new class of vulnerability that existed down at the hardware level in processors. The vulnerability was publicly disclosed on 3rd January 2018. Some of the major cloud providers had advance notice, but possibly less than they’d expected. Here in the Ministry of Defence – like most of the world - we found out about these vulnerabilities as the news broke.
By comparison, look at the response of Amazon Web Services (AWS): they presumably had a head start on addressing the vulnerabilities, and they made an updated virtual machine image for Amazon Linux available the same day that the vulnerabilities went public. This meant that if you were a concerned customer, you could update your virtual machines (‘EC2 instances’ in AWS lingo) right away. By the following day they had ensured the vulnerabilities could not be exploited on any EC2 instance globally, regardless of whether you’d applied the patch yourself. The same was true for AWS Lambda, the flagship serverless compute capability in AWS - you didn’t need to lift a finger to make sure your function ran in a secure environment. And the day after that, all database instances had been patched too. That’s astonishing when you consider how much development, testing and monitoring would have been required to roll out such low-level changes with high confidence. Over the months that followed, there was plenty more patching to do, as the chip manufacturers rolled out microcode updates that needed to be applied to the lowest level of software that runs on a computer (below even the BIOS). If you were using AWS or similar services, all of this was handled for you without you needing to take any action.
In my experience, few organisations in the UK are likely to have the level of engineering scale and expertise to be able to apply security patches as rapidly as a hyperscale cloud provider, and if we don’t patch as quickly as they do at all levels of the stack, our systems are easier to attack.
What can you take from this? Use cloud services at the higher levels of abstraction - ‘functions’ or ‘containers’ rather than virtual machines - because the provider then patches more of the stack for you, and your systems spend less time vulnerable to publicly known exploits.
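To make the abstraction point concrete, here is a minimal, hypothetical AWS Lambda-style handler. The only code we own and patch is the function itself; the operating system, language runtime, firmware and hardware underneath are the provider’s to maintain:

```python
# A minimal serverless function in the AWS Lambda Python handler style.
# Everything below this code - OS, runtime, microcode - is patched by the
# cloud provider, not by us. The event contents here are hypothetical.

def handler(event, context=None):
    """Respond to an event with a simple greeting."""
    name = (event or {}).get("name", "world")
    return {"statusCode": 200, "body": f"hello, {name}"}
```

Compare that with running the same logic on a virtual machine, where the guest operating system, web server and language runtime would all be ours to keep patched.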
#2: It’s easier to deploy security controls at scale
The second reason I champion security of public cloud is the simplicity of rolling out security controls across a huge estate in moments. Do you need a network monitoring tap inserted into every egress point in your system right away? No problem. Need to check all of your internet-exposed servers don’t have console access open to the world? Easy. Need to ensure all of your administrators’ access is recorded in an immutable log and stored indefinitely? You got it. All of these things can be done in on-premises environments too, but some could represent hours, days or even weeks of effort, whereas they are trivial to achieve in the cloud.
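As an illustration of how little code such a check needs, here is a hedged sketch of the ‘are any admin ports open to the world?’ question. The rule dictionaries mirror the general shape of the response from EC2’s `describe_security_groups` API, but the data and group names are hypothetical:

```python
# Illustrative sketch: flag security-group rules that expose admin ports
# (SSH, RDP) to the whole internet. The input mirrors the rough shape of
# EC2 describe_security_groups output; the sample data is made up.

ADMIN_PORTS = {22, 3389}  # SSH and RDP console access

def world_open_admin_rules(security_groups):
    """Return (group id, port) pairs where an admin port is open to 0.0.0.0/0."""
    findings = []
    for group in security_groups:
        for rule in group.get("IpPermissions", []):
            from_port, to_port = rule.get("FromPort"), rule.get("ToPort")
            if from_port is None or to_port is None:
                continue
            open_to_world = any(
                ip_range.get("CidrIp") == "0.0.0.0/0"
                for ip_range in rule.get("IpRanges", [])
            )
            if not open_to_world:
                continue
            for port in ADMIN_PORTS:
                if from_port <= port <= to_port:
                    findings.append((group["GroupId"], port))
    return findings

sample = [
    {"GroupId": "sg-web", "IpPermissions": [
        {"FromPort": 443, "ToPort": 443, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ]},
    {"GroupId": "sg-bastion", "IpPermissions": [
        {"FromPort": 22, "ToPort": 22, "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ]},
]

print(world_open_admin_rules(sample))  # -> [('sg-bastion', 22)]
```

In a real cloud estate you wouldn’t hand-roll this at all - the provider’s configuration-audit tooling can run checks like this continuously across every account - but the point stands: the whole estate is queryable as data.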
Be careful though: The downside of being able to deploy security controls at scale is that you can also scale any mistakes you make very rapidly too! It’s therefore important to ensure your templates or configuration code is well reviewed. And a well-designed deployment workflow with ‘separation of duties’ baked in will make that easier, which brings me to...
#3: You can authorise everything, and implement ‘separation of duties’ more easily
The strong focus on identity and authorisation within the major cloud services will be evident to anyone who’s tried to deploy infrastructure in them. It’s possible to authorise almost any action and keep an audit trail of the authorisation decisions that were made. The decision logic for authorisation can take into account a range of parameters, not just who you are or where you’re connecting from, but whether the resource you’re trying to access has specific attributes attached to it via metadata. You get the idea; you have a lot of control over who can access what.
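The flavour of that decision logic can be sketched in a few lines. This is a hypothetical, simplified model - real cloud IAM engines evaluate policy documents, not Python - but it shows how a decision can combine the caller’s network, the caller’s attributes and the resource’s attributes:

```python
# Hypothetical sketch of attribute-based authorisation, loosely modelled on
# how cloud IAM policies can match tags on both the caller and the resource.

def allow_access(principal_tags, resource_tags, source_network_trusted):
    """Permit access only from a trusted network, and only when the caller's
    project tag matches the resource's project tag."""
    if not source_network_trusted:
        return False
    project = principal_tags.get("project")
    return project is not None and project == resource_tags.get("project")
```

Every evaluation of a rule like this can also be written to an audit log, so you get a record of the authorisation decisions as well as the decisions themselves.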
I take that level of authorisation control as a given, though. Where I think the cloud has helped us make more fundamental strides regarding authorisation is in some of the architectural controls it makes much more achievable. First, there are things like ‘just in time’ administration, whereby access is only granted just before it is needed, which companies like Microsoft have made easy to set up within Azure. Second, and more subtle but equally important, is that we can more easily design systems that implement controls to require multiple user accounts to be compromised to cause a catastrophic breach. These ‘separation of duties’ controls have been technically achievable in on-premises environments for years, but they’ve often been tricky to set up and expensive to operate. Now, using public cloud, we’re able to easily build systems which require multiple people to collaborate to gain privileged access or carry out risky activities - this is a big step forward and means this sort of control can be used more widely.
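The essence of a ‘separation of duties’ control is a two-person rule. As a hypothetical sketch (the approver names and policy are invented for illustration), privileged access is only granted once two distinct, authorised people - neither of them the requester - have signed off:

```python
# Illustrative two-person rule: grant privileged access only when at least
# two distinct authorised approvers, neither of whom is the requester,
# have approved. All names here are hypothetical.

AUTHORISED_APPROVERS = {"alice", "bob", "carol"}

def privileged_access_granted(requester, approvals):
    """Return True only if two distinct authorised approvers (excluding the
    requester) have signed off."""
    approvers = {a for a in approvals
                 if a in AUTHORISED_APPROVERS and a != requester}
    return len(approvers) >= 2
```

Compromising any single account - even an approver’s - is not enough on its own, which is exactly the property that makes this control worth having.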
“But what about…?”
Some of you will probably be thinking that I’ve ignored some of the special controls that Defence and other public sector organisations can put in place that commercial entities can’t match - physical security and personnel security vetting being the obvious examples. Don’t get me wrong, these are super important controls to protect some of our systems, like the more classified systems we depend on to run military operations. But don’t underestimate the level of work that the hyperscale cloud providers have put into physical and personnel security either. For example, the separation of duties controls they are able to deploy due to their scale often mean that the staff who could access a data centre to replace a broken disk are not the same people who could identify which of the disks contain data from a specific customer. This is a powerful control that is infeasible to achieve for most OFFICIAL workloads operated on-premises.
There will, however, always be a need for very strict personnel and physical controls around our more classified systems, typically those which handle SECRET and TOP SECRET information. It’s important that these systems are extremely well patched too as part of their defensive posture, and we can’t get away from having to do that patching work ourselves or with close industry partners. But making greater use of public cloud services for our OFFICIAL workloads, and letting the cloud provider do most of the heavy lifting, means that we can focus much more of our engineering effort on increasing the speed at which we patch our more classified systems.
Applying this in Defence
If you work in digital and technology in the Ministry of Defence, my colleagues in Defence Digital’s MODCloud team have made some of the leading hyperscale cloud services available for all of us to easily consume. The MODCloud team provides various guard rails and templates to help ensure some consistent security controls are in place across all accounts. My team are successfully using these services to process a wide variety of datasets, and we’re using all of the techniques described above too: We use higher-level abstraction services to ensure patching is largely done for us by the cloud provider, we deploy all of our infrastructure from code, and we’re architecting the systems to achieve separation of duties as a major design objective. All of this is in close partnership with our friends in Cyber Defence & Risk and the MODCloud team - we’re all learning and continually improving together.
I hope this encourages other teams in Defence to adopt MODCloud’s public cloud offerings for their OFFICIAL workloads (including with the SENSITIVE caveat), allowing us to focus more of our collective security operations and engineering effort on improving and securing our more classified systems.
And finally, just as we were putting the finishing touches to this blog, NCSC published a detailed whitepaper on the security benefits of good cloud services, which sets out many other arguments for where cloud offers improved security, if you need more.