Sam's life in the clouds

Earlier this year Sam Jones re-joined DWP Digital as a site reliability engineer for hybrid cloud services, after a stint in the NHS.

“My role is to look after the infrastructure that underpins many of our cloud services hosted within Amazon Web Services (AWS),” says Sam. “We try to ensure high performance and uptime of all our applications through automation, such as responsive alerting, while at the same time ensuring cloud costs are kept to a minimum.”

How do hybrid clouds work?

Hybrid cloud hosting at its most basic level provides private and public compute and storage infrastructure. This enables DWP Digital to develop, build, test and operate all its business and supporting IT systems.

“It may sound a bit boring as a soundbite, but it couldn’t be further from the truth. The services we provide are integral to DWP’s day-to-day operations, and the 20 million customers we serve,” he says.

DWP Digital is investing heavily in engineering and technology to improve our services, making them more resilient and secure.

“We make the consumption of infrastructure quicker and easier for operators of our services and development teams through automation improvements,” says Sam, “and we support DWP’s wider engineering community by providing common shared services, tools, patterns, platforms and pipelines.”

DWP’s journey to cloud hosting

“We began our cloud and digital transformation journey 5 years ago when the Carer’s Allowance digital service went live.”

He adds, “From there we quickly accelerated our digital transformation for customers and colleagues, by focusing on the new capabilities and possibilities that public cloud hosting provided.”

“There’s still plenty to do though,” says Sam. In DWP Digital, our engineering capability is matched with a strong architectural vision.

“We want to run and operate our private cloud as we do our public cloud. This is the stage of the journey we’re at now,” he explains. “We’re looking at how we can align technologies, build capabilities, focus on automation, and keep our infrastructure modern.

“We’re seeing great benefits with containerisation in cloud, which we’re experimenting with in private cloud. There are some exciting times ahead.”

No day’s the same

What’s great about working in hybrid cloud engineering is that there’s no typical day.

“I’m lucky to work on many diverse projects. One day I may be trying to optimise our cloud spend and working on new approaches to save money, another I may be helping to deploy new infrastructure and applications. And another I may be all hands to the pump trying to get services back online (yes, systems will always fail occasionally!).”

DWP is a huge organisation with mammoth IT systems – old and new – on a scale unlike anything else in the technology sector.

“It’s amazing to see the development of technology in this area,” says Sam.

“When I first started working on DWP systems in 2011, I was working on an age-old IBM mainframe. The department has moved on massively since then, and although there’s now a heavy cloud-first focus, there’s still a place for these older systems. Imagine building a system with a lifespan of 20+ years now – you’d get laughed out of the room!”

DWP Digital has a huge and ever-sprawling cloud platform consisting of hundreds of AWS accounts and subscriptions.

“We try to give our product development teams as much autonomy as possible to try and breed innovative solutions, using the most cutting edge of technologies,” he explains. “Unfortunately, one of the side effects of this autonomy and cloud sprawl is the lack of control and oversight of what is going on in each account. This can result in our cloud spend ballooning if not kept in check.”

Understanding the different ways of using AWS has already saved DWP many thousands, and those savings will keep growing as we roll these approaches out across our cloud estate.
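To make the idea of keeping spend "in check" concrete, here is a minimal Python sketch of one possible check: flag any account whose monthly spend has grown faster than an agreed threshold. It is an illustration only, not DWP's actual tooling. The account names, the figures, the 25% threshold and the `flag_ballooning` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccountSpend:
    account: str       # hypothetical account label, not a real AWS account
    last_month: float  # spend in the previous month
    this_month: float  # spend in the current month


def flag_ballooning(spends, max_growth=0.25):
    """Return the accounts whose spend grew by more than max_growth.

    max_growth is a fraction (0.25 means 25%) and is an assumed threshold.
    """
    flagged = []
    for s in spends:
        if s.last_month <= 0:
            # New or previously idle account: flag any spend for review.
            if s.this_month > 0:
                flagged.append(s.account)
            continue
        growth = (s.this_month - s.last_month) / s.last_month
        if growth > max_growth:
            flagged.append(s.account)
    return flagged


if __name__ == "__main__":
    # Made-up sample data for illustration.
    sample = [
        AccountSpend("team-a-dev", 1000.0, 1100.0),  # grew by 10%
        AccountSpend("team-b-test", 400.0, 900.0),   # grew by 125%
        AccountSpend("team-c-new", 0.0, 50.0),       # newly active account
    ]
    print(flag_ballooning(sample))  # ['team-b-test', 'team-c-new']
```

In practice the spend figures would come from a billing export or cost-reporting service rather than a hard-coded list, and the threshold would be agreed with each team so that autonomy is preserved while unexpected growth still gets noticed.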

“The experience and career development you get from being able to work with such diverse technology is second to none.”

Cloud computing skills

Being a site reliability engineer requires a broad range of skills, but we don’t expect people to have mastered all of them.

“Having an interest in, and deep knowledge of, a particular area is an advantage,” says Sam. “A must-have is proven experience or certified knowledge in a cloud provider (in DWP Digital we currently work with AWS and Azure).”

He adds, “Alongside that, other core skills may include infrastructure as code (such as Terraform), configuration management tools (such as Ansible), server administration and knowledge of scripting languages (for example Bash or Python).”

We’re looking to create a diverse, multi-skilled organisation, so if you have a passion for, and knowledge of, a niche skill then you’re exactly what we’re after. We’re currently hiring for a number of engineering roles.