Why Out-of-Band Management Is Essential for Minimizing Costly Downtime

The high cost of downtime is well documented. According to research by the Ponemon Institute, the average cost of a data center outage is nearly $9,000 per minute. The average total cost is nearly $750,000. These figures include the cost to detect, contain and recover from the outage, along with productivity losses, lost revenue, opportunity costs, customer churn and reputational damage.

Clearly, rapid response to an incident is the key to minimizing the cost of downtime. That’s why data centers develop emergency operating procedures (EOPs) that data center personnel follow to prevent an incident from becoming a full-scale outage. In practice, however, rapid response is often hampered by an inability to access IT equipment remotely.

If an outage occurs in a distant facility or in the middle of the night, the engineer on call must get to the data center in order to troubleshoot the issue. If the organization decides to use a remote “smart hands” resource, that person may have very little to go on. The wrong equipment may be switched on or off, adding to the pain. Delays and human error result in a longer outage and higher costs.

Redundancy vs. Resilience

In order to mitigate the risk of an outage, organizations typically develop business continuity plans that focus on the physical data center infrastructure. They implement redundant power sources and uninterruptible power supplies, along with redundant systems and networking gear, as a hedge against equipment failure. They may also have the ability to fail over to a redundant facility or cloud service in the event of a catastrophic outage.

But redundancy isn’t enough to create resilience. What if there’s a cable cut in the “last mile” Internet connection? What if a security breach takes down one or more network segments? What if there’s a problem with a firmware update across the entire fleet of network switches? No amount of redundant equipment will protect against these kinds of problems.

Many organizations have invested in advanced infrastructure management tools to streamline their data center operations, but these tools require a network connection. With the primary network down, data center engineers have no way of troubleshooting the issue without going onsite.

Why Out-of-Band Management

One of Rahi’s European customers recently decided to make its infrastructure more resilient by implementing an out-of-band management solution. With out-of-band management, data center engineers can access and control IT equipment remotely, even when the primary network is unavailable. A separate network is set up to support serial console servers that allow administrators to remotely manage network gear through the device’s serial port. Service processors provide low-level access for remote monitoring and management.
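
As a simple illustration of what out-of-band access looks like in practice, the sketch below checks whether serial console servers on a dedicated management network are reachable. The host names, addresses and port are hypothetical placeholders, not details of any specific deployment.

```python
import socket

# Hypothetical console servers on the dedicated out-of-band management subnet.
# Substitute your own inventory; addresses and ports here are placeholders.
CONSOLE_SERVERS = {
    "rack-a-console": ("10.255.0.11", 22),  # SSH to the serial console server
    "rack-b-console": ("10.255.0.12", 22),
}

def reachable(host, port, timeout=3.0):
    """Return True if a TCP connection can be opened to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for name, (host, port) in CONSOLE_SERVERS.items():
        status = "up" if reachable(host, port) else "DOWN"
        print(f"{name} ({host}:{port}): {status}")
```

Because this check rides the management network rather than the production network, it still works when the primary network is down.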

Setting up an out-of-band management network is not cheap — but neither is buying redundant equipment that sits idle unless there’s a failure. And the management network delivers ongoing value by accelerating problem resolution and minimizing downtime. Out-of-band management can also help streamline upgrades and facilitate the maintenance of equipment in offsite data centers, co-location facilities and remote offices.

If you’re looking to create a truly resilient infrastructure, an out-of-band management network can provide complete visibility and control without costly and time-consuming “truck rolls” to remote facilities. Rahi Systems has proven expertise in out-of-band management and can help you implement a solution that will minimize the cost and risk of a data center outage.

Pandemic-Related Telework Requires a New Approach to Cybersecurity

Cybercriminals are preying on fears of the COVID-19 coronavirus pandemic to spread malware, perpetrate scams, and compromise systems and networks. Many of these attacks are targeting employees who are working from home under “social distancing” policies. Organizations should shore up their cyber defenses to ensure that attacks on remote workers don’t result in a security breach.

The Department of Homeland Security’s Cybersecurity and Infrastructure Security Agency (CISA) issued an alert on March 13th about phishing emails attempting to steal user credentials from teleworkers. CISA also warned that cybercriminals are targeting vulnerabilities in the virtual private networks (VPNs) remote workers use to connect to corporate IT resources.

According to the Check Point Global Threat Index, 8 percent of the more than 4,000 coronavirus-related domains are malicious or suspicious, creating a significant threat of malware infection if an employee visits one of these sites. Security experts have also noted attacks on company executives who are working outside their organization’s secure network perimeter.

That perimeter has all but disappeared in recent years due to increasing numbers of remote and mobile workers. Approximately 4.7 million Americans now work from home at least half the time, according to the U.S. Census Bureau. That’s a 159 percent increase since 2005. The rise in telework due to the COVID-19 pandemic will likely result in more employees taking advantage of this option long term.

Organizations should prioritize security policies, procedures and technologies to protect remote workers. It starts with a “zero trust” security model in which every user and device attempting to access the network is presumed to be a threat. User identities and the security posture of the devices they use must be authenticated, whether they are inside or outside the network perimeter. User behavior analytics tools can help to detect deviations from normal activity that could signal a cyberattack.

Other steps organizations can take include:

  • Limit the risk associated with stolen credentials. Implement multifactor authentication for remote access (a brief illustration follows this list). Adopt the principle of least privilege, which limits access to only those resources employees need to do their jobs. Utilize network segmentation to prevent hackers from moving laterally through the network should they gain access.
  • Keep systems and devices up to date. Implement the latest software patches and updates in VPNs, firewalls and devices remote workers use to access the corporate network. Require that the devices employees use maintain minimum security standards.
  • Implement IT operational procedures. Ensure that IT personnel are prepared to monitor remote access, detect attacks and respond to security incidents. Prepare for mass usage of VPN connections and use rate limiting and other techniques to prioritize users requiring access.
  • Educate remote workers. Alert users about the increase in phishing attacks related to the pandemic and warn them to be suspicious of links and attachments that purport to provide information on COVID-19. Provide ongoing training so that users learn to recognize phishing and other social engineering techniques. Ensure remote workers know who to call for support or to report a security incident.
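
To make the multifactor authentication point above concrete, here is a minimal sketch using the open-source pyotp library to layer time-based one-time passwords on top of passwords for remote access. The issuer and account names are illustrative only; this is not a complete access-control system.

```python
import pyotp  # pip install pyotp

# Hypothetical enrollment: generate a per-user secret and a provisioning URI
# that the employee loads into an authenticator app.
secret = pyotp.random_base32()
totp = pyotp.TOTP(secret)
uri = totp.provisioning_uri(name="remote.worker@example.com", issuer_name="Example VPN")
print("Scan this URI into an authenticator app:", uri)

# At login, verify the one-time code the user submits along with their password.
user_code = input("Enter the code from your authenticator app: ")
print("MFA passed" if totp.verify(user_code) else "MFA failed")
```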

Remote workers should also take steps to prevent a cyberattack, including:

  • Ensuring their router and Wi-Fi connection are secure
  • Keeping all operating systems, security tools and antivirus software up to date
  • Regularly backing up files to protect against loss and ransomware
  • Using only company-approved software and collaboration tools

Rahi Systems can help you implement the policies and tools you need for telework security. Give us a call to schedule a confidential consultation.

Why the Next-Gen Data Center Must Be ‘Green’

The data center has undergone a number of major changes over the years, from mainframes to client/server computing to web-scale computing. This evolution occurred over decades, but in recent years the changes have been fast and furious. Key trends include the rapid rise of cloud computing, ever-increasing demand for storage, and the growing adoption of artificial intelligence, data analytics and other resource-intensive applications.

One consequence of the accelerated evolution of the data center has been a dramatic increase in density. In order to achieve operational efficiencies and meet growing business demand, organizations are packing in more kilowatts per rack than ever before. Densities of 8kW to 12kW per rack have become common, with more than 20kW per rack for compute-intensive applications. In addition, high-density data centers generate more heat and thus require more cooling.

That all adds up to an enormous amount of power. In 2017, global data centers consumed more than 400 terawatt-hours of electricity — approximately 3 percent of the electricity generated worldwide. In the U.S. alone, data centers consumed more than 90 billion kilowatt-hours (kWh) of electricity, an almost 50 percent increase over the 61 billion kWh consumed in 2006. Year-over-year increases in data center energy consumption have slowed significantly to about 4 percent, but the amount of power involved is still mind-boggling.

Yet organizations have been slow to reduce energy consumption in their IT environments. Power usage effectiveness (PUE), calculated by dividing the total energy a data center uses by the amount consumed by computing equipment, is one metric used to determine energy efficiency. If the two numbers are equivalent — meaning that all energy goes to the IT equipment — the data center has achieved the ideal ratio of 1.0. 

According to the Uptime Institute, the world’s largest data centers have a PUE of 1.67. However, a survey by Digital Realty found that the average data center has a PUE of 2.9. At that ratio, nearly twice as much power goes to cooling and other overhead as is consumed by the IT equipment. That’s because smaller data centers are less efficient — a Ponemon Institute study found that energy costs per kWh in the smallest data centers are 180 percent higher than in the largest data centers.
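
For readers who want to make the metric concrete, here is a minimal sketch of the PUE calculation using illustrative numbers chosen to match the 2.9 survey average cited above.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy divided by IT equipment energy."""
    return total_facility_kwh / it_equipment_kwh

# Illustrative figures only: a facility drawing 2,900 kWh to deliver
# 1,000 kWh to IT equipment has a PUE of 2.9.
total_kwh, it_kwh = 2_900.0, 1_000.0
print(f"PUE: {pue(total_kwh, it_kwh):.2f}")                       # 2.90
print(f"Overhead share: {(total_kwh - it_kwh) / total_kwh:.0%}")  # 66% goes to cooling and other overhead
```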

Older, inefficient equipment and underutilized resources can contribute to high energy costs, but data center design is a key factor. Organizations need to implement green design practices, such as energy-efficient cooling, aisle containment, and racks and cabinets that optimize airflow. Environmental monitoring, cooling assessments and operational audits can help identify inefficiencies that increase power consumption.

Organizations should also consider the use of colocation facilities. Colocation services not only minimize the capital investments associated with data center buildouts, but can also reduce power, cooling and operational costs. Most colocation providers have adopted a green data center model and provide the economies of scale needed to maximize data center efficiency.

The multidisciplinary team at Rahi Systems can help you optimize power consumption in your IT environment. We have specialists in data center infrastructure, compute, storage and networking who can assist you with strategic upgrades. Our team can also help you utilize colocation facilities and identify workloads that can migrate to the cloud.

These initiatives can also create a more efficient data center that has the speed and agility to support today’s requirements. Let Rahi help you incorporate green data center practices to reduce costs and meet business demands.

IT Complexity Drives the Need for AI-Powered Performance Monitoring

When we think about IT complexity, we often visualize stacks of hardware appliances. But there’s growing complexity in what you don’t see — the systems, services, and tools that underlie modern applications. According to a recent study conducted by Vanson Bourne, the typical mobile or web application transaction crosses 37 different technology systems or components. This increase in complexity makes it harder for organizations to manage application performance.

The study found that IT teams now spend one-third of their time dealing with application performance problems, costing organizations an average of $3.3 million annually. That’s an increase of 34 percent over the $2.5 million reported in 2018. Despite these investments in time and resources, IT teams are struggling to meet business requirements. On average, organizations reported six IT outages in the preceding 12 months in which revenue, operations or the user experience were impacted.

Challenges in the Cloud
Much of this complexity is being driven by multi-cloud adoption. The cloud enables greater speed and agility, but it also represents a fundamental shift in how applications are built, deployed and operated. More than three-quarters (76 percent) of CIOs say they lack complete visibility into application performance in cloud-native architectures.

Traditional performance monitoring solutions drown IT teams in alerts that offer more questions than answers. On an average day, IT and cloud operations teams receive 2,973 alerts from their monitoring and management tools, a 19 percent increase in the past 12 months. Just 26 percent of these alerts require action, but the excessive volume of alerts causes 70 percent of IT teams to experience problems that should have been prevented.

Simply put, traditional monitoring tools were not designed to handle the volume, velocity and variety of data generated by applications running in multi-cloud environments. What’s more, many organizations have multiple, siloed tools that lack the broader context of events taking place across the entire technology stack. IT teams must manually integrate and correlate alerts to filter out duplicates and false positives before identifying the underlying root cause of issues.

AI to the Rescue?
Artificial intelligence (AI) will be critical to IT’s ability to master increasing complexity, according to 88 percent of CIOs. AI-powered monitoring tools don’t just gather discrete data points and compare them to preset thresholds. They detect anomalies and analyze them in the context of multiple application components. This makes it possible to pinpoint the source of the problem automatically, saving IT teams hours if not days of effort and allowing issues to be resolved more quickly.

Achieving these benefits can be difficult. You have to identify the right sources of performance data and convert that data into a common format so that it can be analyzed effectively. The latest AI-powered performance monitoring tools do some of the work by automatically discovering application components and providing a graphical display of data flows. It’s highly likely, however, that customization will be required to get the analytics you need.
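
As a simplified illustration of what anomaly detection means in practice (a toy sketch, not a representation of any vendor’s product), the code below flags metric samples that deviate sharply from a rolling baseline rather than comparing them to a fixed threshold.

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(samples, window=30, threshold=3.0):
    """Yield (index, value) for samples deviating more than `threshold`
    standard deviations from the rolling baseline of the previous `window` points."""
    history = deque(maxlen=window)
    for i, value in enumerate(samples):
        if len(history) >= window:
            baseline, spread = mean(history), stdev(history)
            if spread > 0 and abs(value - baseline) > threshold * spread:
                yield i, value
        history.append(value)

# Example: steady response times with a single spike at sample 60.
latencies_ms = [100 + (i % 5) for i in range(60)] + [450] + [100 + (i % 5) for i in range(20)]
for idx, val in detect_anomalies(latencies_ms):
    print(f"Anomaly at sample {idx}: {val} ms")
```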

How Rahi Can Help
The networking team at Rahi Systems has expertise in performance monitoring tools and can help you select the right solution for your environment. We can then help you determine which data sources to use and prepare the data for ingestion.

Virtually every organization is becoming a software company. Applications drive interactions with customers and revenue-generating opportunities, putting pressure on IT to ensure that software is performing optimally. Let Rahi help you leverage AI-based performance monitoring to improve the efficiency and effectiveness of your operations.

Why Your Organization Should Be Taking a Close Look at SD-WAN

Any organization with multiple branch offices needs connectivity between those locations. Traditionally, organizations have used MPLS for WAN connectivity due to its reliability and performance. Those capabilities come at a steep price, however — MPLS is significantly more expensive than broadband and fiber-optic Internet connections. The costs go up for organizations with a global footprint and for those that need a lot of bandwidth.

MPLS is also complex to deploy. It takes a minimum of two to three weeks to get connectivity for any given location — more if you must provision links from service providers in different regions of the world. What’s more, WANs based upon MPLS are designed for connectivity to the enterprise data center, not directly to cloud applications and services. Individual locations must connect to the data center to reach the cloud, which can cause unpredictable application performance.

Software-defined WAN (SD-WAN) came into the picture to address these challenges. SD-WAN adds a software layer that abstracts and simplifies WAN connectivity. You can have multiple WAN connections of any type, be it MPLS, broadband or even 4G/LTE. SD-WAN creates an overlay that aggregates these physical connections to provide every user with a high-quality experience. Each location gets a high-performance, highly available connection even if there is network congestion or the link speed isn’t that high.

With a traditional WAN infrastructure, there are a lot of physical devices that must be installed at each site and managed individually. With SD-WAN, there is just one device at the edge that can be deployed in minutes and managed remotely. And SD-WAN is a pay-as-you-grow service — customers can readily add locations and upgrade their infrastructure according to their needs and budget.

Any WAN connection can be subject to outages, slowdowns, packet loss and other issues that can impact applications. This is especially problematic with latency-sensitive applications such as voice over IP, video conferencing, streaming media and virtual desktops. SD-WAN uses traffic shaping and other techniques to provide assured application performance. It’s also application-aware, which means that it can prioritize latency-sensitive applications and automatically select the best data path based upon application requirements and network conditions.

SD-WAN also improves security. It can act as a VPN gateway, making it easier to set up VPN tunnels between the locations. Best-in-class SD-WAN solutions incorporate other security features as well.
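
To illustrate the application-aware path selection described above (a simplified sketch, not any vendor’s actual algorithm), the code below scores candidate WAN links against an application’s latency and loss requirements and picks the best available path. The link measurements and thresholds are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Link:
    name: str
    latency_ms: float   # measured round-trip latency
    loss_pct: float     # measured packet loss
    bandwidth_mbps: float

# Hypothetical per-application requirements.
REQUIREMENTS = {
    "voip":   {"max_latency_ms": 150, "max_loss_pct": 1.0},
    "backup": {"max_latency_ms": 500, "max_loss_pct": 5.0},
}

def best_link(app, links):
    """Return the lowest-latency link that meets the application's requirements, if any."""
    req = REQUIREMENTS[app]
    eligible = [l for l in links
                if l.latency_ms <= req["max_latency_ms"] and l.loss_pct <= req["max_loss_pct"]]
    return min(eligible, key=lambda l: l.latency_ms, default=None)

links = [
    Link("mpls", latency_ms=40, loss_pct=0.1, bandwidth_mbps=100),
    Link("broadband", latency_ms=70, loss_pct=0.8, bandwidth_mbps=500),
    Link("lte", latency_ms=120, loss_pct=2.5, bandwidth_mbps=50),
]
print(best_link("voip", links).name)  # "mpls": lowest latency that meets the VoIP thresholds
```

In a real deployment these measurements are refreshed continuously, so traffic can shift automatically when a link degrades.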

Which brings us back to cost. With SD-WAN, organizations can connect branch locations directly to the Internet without worrying about reliability, performance, and security. And that direct Internet connection is going to be much more efficient when it comes to accessing cloud applications and services.

Gartner has predicted that by 2023 almost 90 percent of WAN edge infrastructure will be on SD-WAN. But is it right for your organization? You should consider SD-WAN if:
     – Your connectivity contracts are coming up for renewal. Many organizations find that connectivity cost savings more than pay for the SD-WAN implementation.
     – The edge devices in your branch locations are due for an upgrade, or your IT team is struggling to manage all those devices.
     – You’re moving more applications and services to the cloud.
     – You have a growing number of branch locations that require highly reliable, high-performance connectivity. Global organizations can especially benefit from SD-WAN.

Rahi Systems has experts in networking, security, virtualization, and software-defined data center solutions. We have relationships with leading SD-WAN vendors and our engineers are certified in the delivery of their products. If you’re considering SD-WAN, we can help you determine if it’s the right choice and design a solution that will meet your requirements.

Pandemic Preparedness Should Be Part of Your Business Continuity Plan

The Coronavirus has not officially reached global pandemic status, but organizations should nevertheless be prepared for the threat. In fact, experts say that pandemic preparedness is a good business practice that should be part of every business continuity plan.

It can be easy for organizations to disregard the virus until it hits their region. However, organizations may rely on business partners and contractors who are suddenly subject to quarantine. There doesn’t have to be a major increase in the outbreak — if just one person presents with symptoms, other workers who have been in contact with that person could also be quarantined for 14 days.

Even a regional outbreak can have a serious impact on today’s global supply chains. Businesses in Asia-Pacific have reported supply chain disruptions due to constraints on travel and shipments from China.

The Business Continuity Plan

What does pandemic preparedness have to do with business continuity planning? A business continuity plan should include all the information and procedures needed to minimize operational disruption in a disaster. It should address all aspects of the business — including human resources. If key personnel are unable to come to work or do their jobs, how will your business function?

A business impact analysis is an important part of business continuity planning. In the context of pandemic preparedness, organizations should determine the potential impact of a pandemic using multiple possible scenarios that affect different products, services and/or functions. The risk is particularly acute for IT operations. Many IT teams are already stretched thin and may not have the resources or duplicative skill sets needed to ensure business continuity in a pandemic.

Preparing for Pandemic

The Department of Health and Human Services (HHS) and the Centers for Disease Control and Prevention (CDC) have developed guidelines to assist organizations in planning for pandemic and other public health emergencies. Among their suggestions:

Establish policies. Consider flexible scheduling, including telecommuting and staggered shifts. Restrict travel to affected geographic areas. Establish policies for employee compensation and sick-leave absences unique to a pandemic. Select one or more pandemic coordinators for each site and establish an emergency communications plan.

Implement procedures. Identify essential employees with defined roles and tasks required to maintain business operations. Prepare an ancillary workforce and/or partner with a third-party outsourcing provider. Modify the frequency and type of face-to-face contact.  

Communicate and educate. Disseminate information about preparedness and planning to dispel fear, anxiety, rumors and misinformation. Identify community sources for accurate information and ensure availability of medical consultation and advice.

Coordinate with external organizations. Maintain communication with vendors and supply chain partners. Collaborate with insurers, public health agencies and healthcare facilities to understand their capabilities and plans.

How Rahi Can Help

The experts at Rahi Systems can help you develop a business continuity plan that addresses pandemic and other personnel-related risks as well as threats to the IT infrastructure. In addition, our offices worldwide are staffed with Rahi personnel who work in concert to deliver IT solutions and services globally. 

We can help ensure continuity of support through our network operations centers. We also have warehouse facilities around the world, and procurement and logistics teams who are adept at obtaining equipment and coordinating delivery anywhere in the world. These capabilities can help keep IT projects on track and ensure the availability of replacement components.

Coronavirus may never become a global pandemic, but the outbreak serves as an important reminder of the need for preparedness. Now’s the time to update your business continuity plan to include policies and procedures related to personnel disruptions.

How to Deploy Smart Data Centre Infrastructure Management

Over the past decade, many organisations have deployed Data Centre Infrastructure Management (DCIM) platforms. This is software that captures availability and performance metrics across the hybrid IT environment, including cloud infrastructure, servers, network devices, and power and storage systems.

These platforms monitor and collate the availability of IT infrastructure and provide utilisation metrics in real time to help the IT and data centre operations teams manage the environment efficiently. DCIM can also reduce the time spent managing assets and capacity, monitoring performance, and proactively detecting and managing the infrastructure faults that cause downtime.

Fifteen years ago, companies selling DCIM platforms sold a great story, and many organisations looked at these solutions as the silver bullet for smart, efficient management of their data centre estates. The organisations made the investment, implemented the solutions and, about six to twelve months after the implementation project was completed, the reality kicked in: DCIM required a lot of resources and specialist skillsets to administer and maintain the database and keep the systems running. This was hard to manage, and the solution quickly became known as a black art.

Many organisations went back to their vendor only to discover that more costly professional services would be needed to put things straight. It was difficult to convince the budget owner to spend more on a solution that did not deliver what was expected in the first place.

DCIM technology has taken huge steps forward in the past few years, and many vendors are now offering operational features that can help you streamline and orchestrate your daily IT operations. These new platforms are ushering in a new era of monitoring platforms known as DNIO (Data Centre Network Infrastructure and Operations). DNIO is the next generation of DCIM and extends the solution to provide a full suite of operational features. Here is a high-level overview of some of those key features.

Integration with ITSM systems is low-hanging fruit for achieving quick wins with your technology services teams. Benefits include orchestrated asset discovery, database maintenance and administration; automated incident ticket creation when the platform detects a failure; planning of adds, moves and changes; and seamless updating of the CMDB (asset database). This gives the operational teams the tools to respond quickly to business demands. Ultimately it keeps your services up and running and helps prevent that dreaded call from a customer telling you they have had an outage.
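
As a sketch of how failure detection could feed automated incident creation (the endpoint, payload fields and token below are hypothetical placeholders, not any specific ITSM product’s API), a DCIM/DNIO webhook might post a ticket like this:

```python
import json
import urllib.request

ITSM_URL = "https://itsm.example.com/api/incidents"  # hypothetical endpoint
API_TOKEN = "replace-with-a-real-token"              # placeholder credential

def create_incident(asset, alarm, severity="high"):
    """POST a new incident when the monitoring platform reports a fault."""
    payload = json.dumps({
        "short_description": f"{alarm} on {asset}",
        "severity": severity,
        "source": "dcim-webhook",
    }).encode("utf-8")
    req = urllib.request.Request(
        ITSM_URL,
        data=payload,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {API_TOKEN}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Example: called by the platform's webhook when a PDU fault is detected.
# create_incident("pdu-rack-14", "Branch circuit overload")
```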

Solutions that can integrate with existing systems such as billing, building management and enterprise resource management will help you maximise the payback and bring your teams together. Everyone will be working from the same platform and have access to the same data, keeping it consistent and error-free.

Before starting an implementation, it’s best to obtain buy-in from other departmental stakeholders and collaborate to build the scope. This should start with day one requirements and developing a roadmap for future implementation phases. Often the IT and facility teams are detached, with their own priorities and completely different regimes when it comes to the management of the environment and the control measures used to minimize disruption in the event of maintenance or failure situations. The trick here is to get your teams talking and working closely together.

Integrate the solution with all aspects of data centre IT and facility operations to give your teams a holistic view from one central place. The teams can work closely together to ensure the environment is tuned for the best performance, managing energy efficiently by monitoring IT loads and tuning cooling systems. IT infrastructure teams can be notified automatically when a critical facility outage is detected, triggering scripts that orchestrate emergency operating procedures. This removes the need to rely on humans to make calculated decisions in stressful situations, where the risk of human error is high.
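
The sketch below shows one simple way that kind of orchestration could be wired up: a mapping from facility alarms to scripted emergency operating procedures, so the response is automated rather than improvised under pressure. The alarm names and script paths are illustrative only.

```python
import subprocess

# Illustrative mapping of facility alarms to scripted emergency operating procedures.
EOP_PLAYBOOK = {
    "cooling_failure":    ["scripts/raise_setpoints.sh", "scripts/notify_oncall.sh"],
    "utility_power_loss": ["scripts/shed_noncritical_load.sh", "scripts/notify_oncall.sh"],
}

def handle_alarm(alarm):
    """Run the scripted EOP steps for a detected facility alarm, in order."""
    for step in EOP_PLAYBOOK.get(alarm, []):
        print(f"Running EOP step: {step}")
        subprocess.run([step], check=False)  # a real system would log and alert on failures

# Example: invoked by the monitoring platform when a cooling fault is detected.
# handle_alarm("cooling_failure")
```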

Infrastructure management platforms that support multi-site operations are where things get really interesting. You can start thinking outside of the data centre and extend coverage to your offices and manufacturing plants, monitoring and managing everything from IT infrastructure to facility, physical security and IoT systems. This helps your business move toward smart buildings and advance its green initiatives, reducing your carbon footprint by decreasing energy consumption and greenhouse gas emissions.

At Rahi Systems we have the ability to monitor and manage the full stack, including your metro and wide-area network connectivity, all from one platform through a single pane of glass.

Finally, a key part of the success of your implementation is training the people who are going to use it. I have heard and experienced too many stories about staff not knowing how to use DCIM or even understanding why it is there. This results in a wasted solution that is forgotten, left on the shelf and eventually replaced with another.

The biggest challenge is convincing the business to make the investment to improve your operations and mitigate risks. However, if you can gain buy-in from other departments and key stakeholders, you will be able to collaborate, build a sustainable business and focus on what you do best: improving efficiency, managing costs and maximising the agility of the business.

The next decade of DCIM has arrived. Are you ready to take your organisation to the next level of smart efficiency? 

The Distributed Cloud Model is a Gartner Top Trend. Here’s Why

Forrester Research estimates that about 20 percent of enterprise workloads now run in the public cloud. A survey by Goldman Sachs aligns with that estimate, finding that 23 percent of workloads run in the cloud with 43 percent of organizations expecting to migrate additional workloads by the end of 2022.

Many cloud initiatives have focused on the so-called low-hanging fruit — email, collaboration, file-sharing and similar applications that don’t require much customization. Now organizations are starting to migrate more mission-critical workloads, such as HR systems and other back-office applications.

The fact remains, however, that organizations will need to keep some workloads on-premises. Organizations simply aren’t in a position to refactor highly customized monolithic applications for the cloud environment. Even if that were possible, many would be hesitant to put sensitive enterprise data in the cloud due to security and compliance concerns. The downside is that organizations must manage two separate infrastructures in a hybrid environment.

This hard reality is driving an emerging trend called “distributed cloud.” Gartner defines it as “the distribution of public cloud services to different physical locations, while operation, governance, updates and the evolution of those services are the responsibility of the originating public cloud provider.” In other words, the distributed cloud enables customers to put public cloud resources in the on-premises data center.

The major cloud providers are leading the way with solutions such as AWS Outposts, Google Anthos and Microsoft Azure Stack. The key feature of these services is a single pane of glass that can be used to operate both public cloud and on-premises environments. IT infrastructure management is simplified while enabling the customer to retain control over certain applications and data.

The big push now is around containerization. For example, Google allows you to initiate and manage your containers from the Cloud Console but run them on validated on-premises equipment as well as within the Google cloud. You can even use Anthos to build, deploy and manage your environment across Amazon and Azure. With containers your applications and data are more portable, so you can consume cloud services securely and efficiently.  

Rahi is uniquely positioned to help customers take advantage of the distributed cloud model. We have deep roots in the design, architecture, implementation and support of enterprise-class on-premises infrastructure and extensive knowledge of public cloud platforms. We understand the reference architectures and platforms that have been certified for distributed cloud technology. 

The technology is also evolving rapidly, as the hyperscale cloud providers seek to gain traction in a wide range of enterprise data centers. Rahi is in a position to work with our hardware and cloud partners to validate the server and storage architectures our customers have on-premises. 

Gartner has identified distributed cloud as one of the Top 10 trends impacting infrastructure and operations in 2020. At the same time, the research firm advises a cautious approach when adopting this new model. Ross Winser, senior research director at Gartner said, “Enthusiasm for new services like AWS Outposts, Microsoft Azure Stack or Google Anthos must be matched early on with diligence in ensuring the delivery model for these solutions is fully understood by I&O teams who will be involved in supporting them.”

Rahi is one of a handful of solution providers with the expertise to deliver distributed cloud today. If you’re looking to explore this exciting new model, we invite you to contact us for a consultation and whiteboarding session.

How Multiple Transit Gateways Can Simplify VPC Connectivity for Large Enterprises

In a previous post, we discussed the Transit Gateway managed service from Amazon Web Services (AWS). AWS Transit Gateways act as a hub for connecting multiple Virtual Private Clouds (VPCs) and virtual private network (VPN) connections within a single region. They enable a hub-and-spoke model with centralized control of traffic routed among VPCs.

As we explained previously, Transit Gateways have some limitations — for one thing, you won’t be able to use route aggregation, so your routing table is going to get bigger and bigger. That’s why larger companies often use multiple Transit Gateways to connect VPCs.

Let’s take a software development company as an example. They have a production environment, a QA environment and a development environment. If we use just one Transit Gateway, there will be a lot of routes, which will be hard to manage. Instead, you can use one Transit Gateway for production, one for QA and one for development. By using multiple Transit Gateways, you can decrease the size of the routing table so you don’t need a big team to manage your AWS cloud.
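
A minimal boto3 sketch of that layout is shown below, creating one Transit Gateway per environment and attaching the corresponding VPC. The region, VPC IDs and subnet IDs are placeholders you would replace with your own values.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Placeholder VPC and subnet IDs for each environment.
ENVIRONMENTS = {
    "production":  {"vpc_id": "vpc-0prod00000000000", "subnet_ids": ["subnet-0prod0000000000"]},
    "qa":          {"vpc_id": "vpc-0qa000000000000",  "subnet_ids": ["subnet-0qa00000000000"]},
    "development": {"vpc_id": "vpc-0dev000000000000", "subnet_ids": ["subnet-0dev0000000000"]},
}

for env, cfg in ENVIRONMENTS.items():
    # One Transit Gateway per environment keeps each routing table small and isolated.
    tgw = ec2.create_transit_gateway(Description=f"{env} transit gateway")
    tgw_id = tgw["TransitGateway"]["TransitGatewayId"]
    ec2.create_transit_gateway_vpc_attachment(
        TransitGatewayId=tgw_id,
        VpcId=cfg["vpc_id"],
        SubnetIds=cfg["subnet_ids"],
    )
    print(f"{env}: created {tgw_id} and attached {cfg['vpc_id']}")
```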

You can also segregate the environment by department, just as you would in the on-premises data center. With a single Transit Gateway, it would be very hard to segregate traffic between production, QA and development. Multiple Transit Gateways provide isolated sections within the VPC, with resources launched in a virtual network. 

You can set up a VPN connection to the Transit Gateway for remote access to the cloud instances, or you can use AWS Direct Connect or a site-to-site VPN to connect to VPCs within the same region. The administrator gains greater visibility and complete control over the routing table and the IP ranges in use. In this way, multiple Transit Gateways improve security.

You can also combine multiple Transit Gateways with the concept of transit VPCs. A transit VPC is the old way of connecting multiple VPCs with remote resources. You set up a VPC with a firewall or routing instance in the center to create a global network. In our case, you can use that as a security add-on — your firewall or edge device will be connected to multiple Transit Gateways.

Finally, the use of multiple Transit Gateways allows you to aggregate bandwidth. A single Transit Gateway supports up to 50 Gbps. In our scenario, in which we’re using three Transit Gateways, we get up to 150 Gbps of bandwidth.

The use of multiple Transit Gateways is most suitable for large companies — customers who host their data center primarily in the cloud and want to segregate their cloud by department. A large enterprise is going to have hundreds or even thousands of AWS accounts and rapid growth.

Each VPC instance is associated with a specific account, so you have to have a way to connect them. Traditionally, you could use VPC peering, but with that you need to manage access control lists (ACLs). That’s very costly. With multiple Transit Gateways it becomes easy — you don’t have to deal with as many routing entries and ACLs. Each VPC has one link to the gateway instead of a link to every other VPC. If there is a new instance or VPC, you can easily attach it without the need to update the routing table. It is cost-effective because infrastructure management overhead is reduced.
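
The scaling difference is easy to quantify: a full mesh of VPC peering connections grows quadratically with the number of VPCs, while a hub-and-spoke Transit Gateway design grows linearly. A quick back-of-the-envelope sketch:

```python
def peering_connections(n_vpcs):
    """Full-mesh VPC peering needs a connection for every pair of VPCs."""
    return n_vpcs * (n_vpcs - 1) // 2

def tgw_attachments(n_vpcs):
    """With a Transit Gateway hub, each VPC needs only one attachment."""
    return n_vpcs

for n in (4, 10, 50):
    print(f"{n} VPCs: {peering_connections(n)} peering connections vs {tgw_attachments(n)} attachments")
# 4 VPCs: 6 vs 4, 10 VPCs: 45 vs 10, 50 VPCs: 1225 vs 50
```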

Large enterprises with a substantial and growing number of VPC instances can save time and money with multiple Transit Gateways. Contact the Rahi Systems network engineering team for help in architecting a solution.

Why More Enterprises Are Adopting Network Functions Virtualization

Networking has always been a hardware-centric affair. Organizations purchase, implement and maintain a wide array of routers, switches, load balancers and other gear on a device-by-device basis. The result is a highly complex network environment that is difficult to manage and lacks the agility to meet today’s changing business demands.

That’s slowly changing as more organizations adopt network functions virtualization (NFV). As the name implies, NFV virtualizes networking capabilities, implementing them in software on industry-standard servers rather than dedicated, proprietary hardware appliances. A specialized hypervisor supports the software that enables virtualized network functions (VNFs), and a management framework handles the provisioning and orchestration of VNFs and controls the server resources that support them.

Developed by a group of telecom network operators, NFV has primarily been used by service providers. NFV enables carriers to provision services faster, make changes quickly, and deliver higher service levels at lower costs. In a 2016 IHS Markit study, 100 percent of service providers said they planned to implement NFV, although adoption has been slowed somewhat by onboarding challenges.

Today, enterprises are starting to embrace NFV due to increasing dependence on the cloud and the move toward edge computing. NFV is also a component of the larger trend toward “software-defined” network services.

However, NFV is distinct from software-defined networking (SDN) and software-defined WAN (SD-WAN). SDN separates the network control plane from the data plane to enable centralized management, greater agility and policy-based routing changes. SD-WAN provides similar capabilities at the WAN level. NFV delivers the functionality required to enable specific network services throughout the network infrastructure.

Like server virtualization, NFV helps to conserve capital, enable faster provisioning of applications and services, and create a more efficient, flexible and responsive environment.

NFV breaks the dependence on networking hardware, simplifying management and enabling anytime, anywhere control via a cloud-based orchestrator. As such, NFV helps to reduce network operational costs across the extended enterprise.

Some of the top value propositions for NFV include:

Increased network automation. Network services can be programmatically configured based upon policy-based rules, and implemented via standards-based application programming interfaces (APIs). A simple desired-state sketch follows this list of benefits.

A more agile, future-proof network. Organizations can add, remove and change services with a few clicks in the orchestrator. VNFs can be upgraded at any time and scaled up or down as needed.

Maximization of enterprise networking investments. Deploying network functions through software optimizes licensing and eliminates the need to overpay for equipment to obtain desired features or performance.

Greater consistency and visibility. By developing a validated NFV reference architecture across all sites, organizations can create a ubiquitous networking platform and gain better insight into network health and performance.

Improved branch network services. NFV allows organizations to collapse the “branch stack,” standardize branch networking hardware, and leverage cloud-based orchestration and management tools to reduce technician site visits.

Enhanced edge resources. By placing virtual machines and containers at the edge, NFV enables the integration of low-latency compute resources to enhance the value of the network and deliver faster services from the cloud.
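
To make the network automation point above concrete, here is a toy desired-state sketch (the site names, VNF names and data structures are invented for the example; no orchestrator’s real API is shown). It compares the virtualized network functions each site should run against what is currently deployed and computes the changes to apply.

```python
# Hypothetical desired state: which virtualized network functions each site should run.
desired = {
    "branch-east": {"firewall", "wan-optimizer"},
    "branch-west": {"firewall", "load-balancer"},
}

# Hypothetical current state as reported by the orchestrator.
current = {
    "branch-east": {"firewall"},
    "branch-west": {"firewall", "wan-optimizer"},
}

def plan_changes(desired, current):
    """Compute which VNFs to deploy or retire at each site to reach the desired state."""
    for site in sorted(set(desired) | set(current)):
        want, have = desired.get(site, set()), current.get(site, set())
        for vnf in sorted(want - have):
            yield ("deploy", site, vnf)
        for vnf in sorted(have - want):
            yield ("retire", site, vnf)

for action, site, vnf in plan_changes(desired, current):
    print(f"{action} {vnf} at {site}")
```

An orchestrator applies the same idea at scale, pushing the computed changes through its APIs.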

The NFV architecture includes hardware, virtualization, orchestration and application layers. The hardware, typically an x86-based server, must provide five 9s availability, extremely low latency and high performance. Several of the major vendors offer VNF software; there are also specialized solutions and open-source options.

Putting this all together is not easy. Organizations looking to implement NFV can benefit from partnering with an IT solution provider with specific expertise in enterprise networking. Rahi’s networking practice can help you develop a network modernization strategy and determine specific use cases that could benefit from NFV.