IT adaptation to COVID-19 accelerated demand for the elasticity that just-in-time hardware and software provide, and now for smart enterprises there’s no turning back.
A big lesson brought home early in the COVID-19 pandemic is that IT requirements can suddenly change at an explosive rate, and the only way to prepare for such events is to build as much flexibility as possible into corporate networks.
Many large enterprises had already embraced this concept, but smaller ones with fewer financial resources had not. The pandemic moved the needle for many of them from viewing flexibility as a luxury to seeing it as a core functionality they can’t afford to be without.
So what goes into achieving the level of flexibility that lets businesses adjust on the fly to 100% of their employees working remotely, or to situations where critical staff can’t come into the office?
Data-center elasticity, portability, fast spin-up
One factor is scalability, which in technical terms can have two meanings: scaling out (spreading the workload across multiple servers) and scaling up (adding resources to a single server). A better term might be elasticity, which has a connotation of being able to grow on demand with minimal effort, but also being able to return to a baseline once conditions allow.
For example, elasticity in a data center has benefits beyond simply being able to scale up to support additional load. It allows scaling back down again to reduce hardware burn and increase energy efficiency, and it affords individual workloads room to grow as their needs increase.
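To make the distinction concrete, here is a minimal sketch of the elastic behavior described above: grow under load, then return to a baseline once conditions allow. The provision_server(), decommission_server(), and get_avg_cpu() helpers are hypothetical stand-ins for whatever API a given platform exposes.

```python
# Minimal sketch of an elastic scale-out/scale-in loop.
# provision_server(), decommission_server(), and get_avg_cpu() are
# hypothetical helpers standing in for a real platform API.

import time

MIN_SERVERS = 2           # baseline capacity to return to
MAX_SERVERS = 20          # hard ceiling to cap spend
SCALE_OUT_THRESHOLD = 75  # % CPU that triggers adding a server
SCALE_IN_THRESHOLD = 30   # % CPU that allows removing a server

def autoscale(servers: list) -> None:
    while True:
        avg_cpu = get_avg_cpu(servers)          # hypothetical metric source
        if avg_cpu > SCALE_OUT_THRESHOLD and len(servers) < MAX_SERVERS:
            servers.append(provision_server())  # scale out under load
        elif avg_cpu < SCALE_IN_THRESHOLD and len(servers) > MIN_SERVERS:
            decommission_server(servers.pop())  # scale back in to baseline
        time.sleep(60)                          # re-evaluate every minute
```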
Another key component of flexibility is portability. Environments that require three months to migrate an application from one server to another just don’t cut it in the modern data center. The infrastructure should have enough abstraction built in that migrating a workload from one data center to another, or even temporarily to the cloud, is a simple matter of triggering the migration with a script or an import wizard.
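What that kind of script-triggered migration might look like is sketched below. It assumes a management plane that exposes a REST endpoint for workload migration; the URL, payload fields, and token are illustrative placeholders, not any particular vendor’s API.

```python
# Illustrative only: assumes a management plane with a REST endpoint for
# workload migration. The endpoint, payload fields, and token are
# placeholders, not any specific vendor's API.

import requests

def migrate_workload(workload_id: str, target_site: str, api_token: str) -> str:
    resp = requests.post(
        "https://mgmt.example.internal/api/v1/migrations",  # placeholder endpoint
        headers={"Authorization": f"Bearer {api_token}"},
        json={"workload": workload_id, "destination": target_site},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["migration_id"]  # track progress with this ID

# e.g. migrate_workload("erp-db-01", "dc-west", token) to shift a workload
# to another data center, or a cloud region during a temporary burst.
```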
One obvious constraint to flexibility is having enough hardware to support workloads as they grow. Shifting or expanding these workloads to the cloud is an easy fix that can reduce the amount of underutilized hardware within on-premises data centers. For workloads that must remain on-premises, it’s critical to adopt virtualization and protocols that can abstract applications without sacrificing performance. This can make more efficient use of the hardware and make it easier to add new workloads dynamically.
A potential strategy for achieving cloud-like functionality while keeping workloads on-premises is using services that extend cloud services into enterprise data centers. Both Amazon Web Services and Microsoft Azure have offerings that extend their cloud footprint into on-prem hardware in the form of AWS Outposts and Azure Stack. As with any solution there are pros and cons; they make it possible to leverage existing tools and management strategies with minimal effort, but at the same time they run potentially sensitive business-critical workloads on platforms supported by another company.
Selecting server CPUs based on their support for virtualization enhancements is critical, making features like core count and modern hypervisor support a higher priority than clock speed in many cases. Hardware- and protocol-level support for network-based storage and I/O-intensive workloads, such as the NVMe family of standards, is also a key factor when sourcing server hardware.
Software makes just-in-time infrastructure work
Hardware is the underpinning of just-in-time infrastructure, but software is where the magic happens. None is more important than the hypervisors that form the foundation of virtual-machine deployments and in some cases the underlying runtime for container-based applications.
Hypervisors like VMware’s vSphere and Microsoft’s Hyper-V are mature at this point, with support for a variety of configurations including clustering, nested virtualization, and multi-tenant deployment. Choosing one may be a matter of personal preference for its management tools or of the existing enterprise environment. For example, if a business is invested heavily in Microsoft System Center, then Hyper-V might be a good choice because staff will already be familiar with many of the Microsoft management tools and concepts. Similarly, VMware offers many popular features that could make it more attractive to other businesses.
Another major software category for enhancing elasticity is container technology such as Docker and Kubernetes, along with related platforms like Red Hat OpenShift. They provide the means to quickly deploy new workloads or make changes to existing applications through common development processes like Continuous Integration and Continuous Delivery (CI/CD). They can unlock the ability to scale on demand or transport workloads between a huge range of host platforms.
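As a small illustration of scaling on demand, the sketch below uses the official Kubernetes Python client to change the replica count of an existing Deployment. It assumes a Deployment named "web" already exists in the "default" namespace and that a kubeconfig is available.

```python
# Minimal sketch using the official Kubernetes Python client (pip install kubernetes).
# Assumes a Deployment named "web" already exists in the "default" namespace.

from kubernetes import client, config

def scale_deployment(name: str, namespace: str, replicas: int) -> None:
    config.load_kube_config()                   # or load_incluster_config() inside a pod
    apps = client.AppsV1Api()
    apps.patch_namespaced_deployment_scale(
        name=name,
        namespace=namespace,
        body={"spec": {"replicas": replicas}},  # desired replica count
    )

scale_deployment("web", "default", replicas=10)  # scale out on demand
scale_deployment("web", "default", replicas=2)   # return to baseline
```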
Automation holds just-in-time infrastructure together
With a foundation of server and storage hardware that supports virtualization, and an application platform built for elasticity and portability in place, it’s feasible to move from manual management toward more holistic automation through orchestration. Automation can ease the burden of repetitive manual tasks but can be hampered by limits to its scaling. When dealing with servers or applications numbering in the hundreds or thousands, handling the details of the infrastructure becomes too much to manage efficiently from the command line. What’s needed then is an orchestration platform that provides a broad view of network assets and the tooling to configure overall automation plans.
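The pattern an orchestration platform automates is essentially fan-out: push the same change to hundreds of hosts in parallel and collect the results, rather than working host by host from a shell. A minimal sketch of that pattern is below; apply_config() is a hypothetical helper representing whatever per-host action (SSH command, API call) is being orchestrated.

```python
# Sketch of the fan-out pattern an orchestration platform automates:
# apply the same change across many hosts in parallel and collect results.
# apply_config() is a hypothetical per-host action (e.g., SSH or API call).

from concurrent.futures import ThreadPoolExecutor, as_completed

def run_everywhere(hosts: list[str], max_workers: int = 50) -> dict:
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(apply_config, h): h for h in hosts}
        for fut in as_completed(futures):
            host = futures[fut]
            try:
                results[host] = fut.result()      # record success per host
            except Exception as exc:
                results[host] = f"failed: {exc}"  # surface failures for review
    return results
```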
Security
Massive scale introduces or exacerbates security challenges such as: consuming, analyzing, and acting on log events; ensuring applications and the platforms they reside on are hardened; maintaining patch levels without negatively impacting systems; and managing user access to resources that may now dynamically expand, contract, or even change location. It remains important to have tooling in place to enable threat detection from a variety of sensors, perform deep threat analysis, and help automate a response to those threats.
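As a toy illustration of the "consume, analyze, act" loop, the sketch below counts failed logins per source address in a stream of parsed log events and triggers an automated response once a threshold is crossed. The event format and the block_ip() response hook are hypothetical; a real deployment would hand this off to SIEM/XDR tooling.

```python
# Toy illustration of detect-and-respond logic: count failed logins per
# source IP in a stream of parsed log events and quarantine noisy sources.
# The event format and block_ip() response hook are hypothetical.

from collections import Counter

FAILED_LOGIN_LIMIT = 10  # threshold before the automated response fires

def watch_events(events) -> None:
    failures = Counter()
    for event in events:                       # events: iterable of parsed log dicts
        if event.get("type") == "auth_failure":
            ip = event["source_ip"]
            failures[ip] += 1
            if failures[ip] == FAILED_LOGIN_LIMIT:
                block_ip(ip)                   # hypothetical automated response
```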
One possible option is an extended detection and response (XDR) platform, which in some ways bleeds into the automation and orchestration category but keeps a clear focus on securing workloads. XDR providers include Broadcom, CrowdStrike, Cybereason, Cynet, Microsoft, Palo Alto Networks, and SentinelOne.
In addition to XDR, identity management (IDM) is a key element in scaling, as it provides a single gateway for authentication while also supporting modern security protocols and flexible multi-factor authentication. Funneling user and API authentication through an IDM provides an effective mechanism for auditing user activity and stopping threats before they reach applications.
IDM suites can also streamline the management of permissions and resource entitlements, automatically assigning and revoking access to applications based on group membership, and even automatically disabling user access to corporate resources when their employment status changes.
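In simplified form, group-driven entitlement management looks like the reconciliation sketch below: derive the set of applications a user should reach from group membership, grant what is missing, and revoke what no longer applies. The grant() and revoke() calls are hypothetical stand-ins for a real IDM suite’s API.

```python
# Simplified sketch of group-driven entitlements: desired access is derived
# from group membership, and anything outside that set is revoked.
# grant() and revoke() are hypothetical IDM API calls.

GROUP_ENTITLEMENTS = {
    "finance":     {"erp", "expenses"},
    "engineering": {"git", "ci", "wiki"},
    "everyone":    {"email", "chat"},
}

def reconcile_access(user: str, groups: set[str], current_apps: set[str]) -> None:
    desired = set()
    for group in groups & GROUP_ENTITLEMENTS.keys():
        desired |= GROUP_ENTITLEMENTS[group]
    for app in desired - current_apps:
        grant(user, app)     # add missing access
    for app in current_apps - desired:
        revoke(user, app)    # drop stale access

# A user whose employment status changes simply ends up with no groups,
# so every remaining entitlement is revoked on the next reconciliation pass.
```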
While it may have taken a pandemic to expose the benefits of just-in-time infrastructure, it’s now clear that the efficiency and nimbleness it enables create a strategic advantage that enterprise architects should pursue.