Usage Models Driving Data Center Architecture Changes

The way the world is accessing data is changing. The data center is evolving to adapt.


Data center architectures are undergoing a significant change, fueled by more data and much greater usage from remote locations.

Part of this shift involves moving some processing closer to where data resides in the memory hierarchy, from SRAM to DRAM to storage. There is more data to process, and processing it in place takes less energy and time. But workloads also are being distributed over greater distances as more data is generated and communicated. The pandemic has accelerated this trend, with mandates to work and learn at home, and more streaming video and music.

The collective impact has stressed data centers, from the edge to the cloud, and it has prompted a rethinking of what needs to be done locally versus at the edge or in the cloud. That, in turn, has affected how data centers are architected, with designs built around the movement, storage, and retrieval of data rather than forcing the data to fit existing architectures. And it has opened up new opportunities for obtaining the necessary EDA resources, including on an as-needed basis.

“We’ve been in a mode of generating more data, as streaming video has become more popular,” said Scott Durrant, marketing manager for DesignWare IP solutions at Synopsys. “At the same time, there are increasing resolutions in streaming video and social networking, with people sharing all sorts of media, video, and pictures on social networking feeds. This has been driving up the amount of data generated, exchanged, and shared. That activity, even pre-COVID, was driving demand for higher data center capacities, higher network speeds, higher storage capacities, and performance. With COVID, that is being driven exponentially.”

There is growing consensus that this is the new normal. But even if people do return to offices, they may split their time between home and commercial offices. To sustain or even improve productivity for remote workers, network resources must be at least as responsive as what employees have in the office.

“You’ve got to have the infrastructure to enable these remote workers to access the resources,” said Durrant. “In many instances, we are seeing, and will continue to see, an ever-increasing utilization of cloud services for that. You don’t want to have to have everybody access a local, on-site, corporate-type data center for a number of reasons. The cloud offers tremendous flexibility, so there will continue to be a greater migration to cloud-based services. Alongside this, the amount of network traffic through various infrastructure nodes is increasing, which means the network performance has to increase, while maintaining or reducing latency on the network. This is driving demand for faster network speeds.”

All of this has ramifications for data center architectures.

“The one-size-fits-all approach to computing, where workloads are powered by a single, legacy, general-purpose compute architecture, is becoming obsolete,” observed Dhaval Parikh, director of segment marketing, Infrastructure Line of Business at Arm. “The future hyperscale data center infrastructure is becoming increasingly heterogeneous, customizable, and purpose-built, fine-tuned to its specific wide-spectrum scale-out workloads.”

That said, the future compute architecture needs to support these emerging infrastructure requirements with flexibility of design and vendor choice, at the best possible performance and power efficiency, resulting in the lowest total cost of ownership (TCO).

“When we think of the hyperscale data center, the first thing that comes to mind is the trusty server CPU,” observed Tom Wong, director of marketing, Design IP at Cadence. “Performance and power savings come from a very predictable x86 world scaling the architecture, as well as riding the wave of Moore’s Law. We have also witnessed compute processing power migrating to FPGAs, GPUs, and more recently, custom SoCs designed in-house by major internet giants. With every subsequent development, compute processors made improvements in a very linear and predictable manner. Other important components in a hyperscale data center are wired and wireless connectivity, networking, and storage, all of which also go through the natural progression of improvement by adopting the latest Ethernet and networking standards, as well as the latest memory and storage technologies.”

Every component is evolving, but at the same time the value assigned to different components is shifting. Performance now also is tied to how fast data can move between servers within a rack, between racks, between buildings, across campuses, and ultimately to the Internet.


Fig. 1: Leaf-spine topology architecture in a data center. Source: Cadence
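To make the leaf-spine arrangement in Fig. 1 concrete, the sketch below builds a small fabric in Python. The switch counts and uplink speed are hypothetical and chosen only to illustrate the structure, not taken from the Cadence diagram. Because every leaf switch uplinks to every spine switch, any two servers on different leaves are always exactly two switch hops apart.

```python
# Illustrative leaf-spine fabric: every leaf switch uplinks to every spine switch.
# Switch counts and uplink speed are assumed values, used only to show the math.

NUM_SPINES = 4
NUM_LEAVES = 8
UPLINK_GBPS = 400  # per leaf-to-spine link

# Full mesh between the leaf layer and the spine layer.
links = [(f"leaf{l}", f"spine{s}")
         for l in range(NUM_LEAVES)
         for s in range(NUM_SPINES)]

print(f"fabric links: {len(links)}")  # 8 leaves x 4 spines = 32 links
print(f"uplink bandwidth per leaf: {NUM_SPINES * UPLINK_GBPS} Gb/s")  # 1,600 Gb/s

# Any rack-to-rack path is leaf -> spine -> leaf: two switch hops, regardless of rack.
```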

“From 2015 to 2020, network switch host speed doubled every two years, from 3.2Tb in 2015, to 12.8Tb in 2019, and to 25.6Tb in 2020/2021,” Wong said. “We’re not that far from 51.2Tb deployment, especially with advances in high-speed SerDes development resulting in single-lane 112G-LR capabilities.”

Alongside that, module bandwidth has increased from 100G in 2015 to 200/400G in 2019, with a major 400G-to-800G deployment poised to occur over the next two to three years. That will be coupled with improvements in optical components transitioning beyond 28Gbaud optics to 56Gbaud optics, which began in 2019. And all of this coincides with the transition from NRZ signaling to PAM4, which doubles the number of bits carried per symbol.
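A rough check on those figures: per-lane data rate is simply symbol (baud) rate times bits per symbol, with NRZ carrying one bit per symbol and PAM4 carrying two. A minimal sketch:

```python
# Nominal per-lane data rate = symbol (baud) rate x bits per symbol.
# Real links add FEC and line-coding overhead on top of these nominal figures.

def lane_rate_gbps(gbaud: float, bits_per_symbol: int) -> float:
    return gbaud * bits_per_symbol

print(lane_rate_gbps(28, 1))  # 28Gbaud NRZ  -> 28 Gb/s per lane
print(lane_rate_gbps(28, 2))  # 28Gbaud PAM4 -> 56 Gb/s per lane
print(lane_rate_gbps(56, 2))  # 56Gbaud PAM4 -> 112 Gb/s per lane (the 112G SerDes above)
```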

“A quick survey of what’s available in the commercial market shows that the majority of the 12.8Tb SoCs are manufactured in 16nm,” Wong said. “25.6Tb SoCs moved to 7nm beginning in late 2019 and have since gone into volume production in 2020. First-generation chips used 50G SerDes, as that was the best technology available at that time. More recent announcements have indicated that 100G SerDes has finally arrived, and the industry expects a transition from 50G to 100G SerDes, as well as migration from 7nm to 5nm. The benefits of this are pretty dramatic. Consider a 25.6Tbps switch. If it is dependent on 50G SerDes, you will have a device that needs 512 lanes. But if you have a 100G SerDes, then the number of lanes is reduced to 256. Just imagine the reduction in die area and the lower power consumption resulting from this reduction in lane counts. These chips consume a lot of power — about 350W. At the end of the day, any technology improvement that can maintain the same performance and simultaneously provide power savings will be very much appreciated.”
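Wong’s lane-count arithmetic is easy to reproduce: divide the switch SoC’s aggregate throughput by the per-lane SerDes rate. The 51.2Tb row below is an extrapolation to the next switch generation mentioned earlier, not a figure from the quote.

```python
# Lane count for a switch SoC: aggregate throughput divided by per-lane SerDes rate.

def serdes_lanes(switch_tbps: float, lane_gbps: float) -> int:
    return int(switch_tbps * 1000 / lane_gbps)  # Tb/s -> Gb/s, then divide per lane

print(serdes_lanes(25.6, 50))   # 512 lanes with 50G SerDes
print(serdes_lanes(25.6, 100))  # 256 lanes with 100G SerDes
print(serdes_lanes(51.2, 100))  # 512 lanes again at the next switch generation
```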

Speed and capacity both need to grow to keep up with the rising flood of data, but that’s only part of the picture. “You must be able to store that data somewhere, so there will be continued demand for increasing storage capacity,” said Synopsys’ Durrant. “But once you’ve captured the data, you’ve got to be able to process it somehow. Processing these huge data sets becomes immensely expensive if you have to look at them a small piece at a time, and swap information back and forth between storage and processing. With computational storage, if you can put more of that information into memory all at one time, and then create correlations among larger chunks of data, that can be much more efficient and provide insights that you wouldn’t otherwise be able to capture.”
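The appeal of computational storage is easiest to see as a data-movement comparison. The sketch below is purely illustrative, with an assumed dataset size, query selectivity, and link speed rather than vendor figures: pushing a filter down to the storage device means only the matching fraction of the data ever crosses the interface.

```python
# Illustrative comparison: host-side filtering vs. pushing the filter into storage.
# Dataset size, selectivity, and link speed are assumed values for the example.

DATASET_TB = 10      # data sitting on the drive
SELECTIVITY = 0.02   # fraction of records that actually match the query
LINK_GBPS = 32       # host <-> storage interface bandwidth (a PCIe-class link)

def transfer_seconds(terabytes: float, gbps: float) -> float:
    return terabytes * 8 * 1000 / gbps  # TB -> Tb -> Gb, then divide by Gb/s

host_side = transfer_seconds(DATASET_TB, LINK_GBPS)                 # move everything
in_storage = transfer_seconds(DATASET_TB * SELECTIVITY, LINK_GBPS)  # move only matches

print(f"host-side filter: {host_side:,.0f} s of bus transfer")   # ~2,500 s
print(f"in-storage filter: {in_storage:,.0f} s of bus transfer")  # ~50 s
```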

Computational storage brings energy efficiency into the picture in places where previously it was an afterthought. So while memory capacity continues to increase, there is growing emphasis on moving data to and from memory more efficiently.

“We see DDR systems increasing in capacity, and HBM as well,” Durrant said. “But one that’s been around for a while, in the form of Intel Optane, may be on the verge of really taking off. Broadly known as persistent memory, it allows increased capacity while retaining a fairly high performance level. Also, the introduction of interface technologies like CXL, which support cache coherency with these devices, will be a boon to the expansion and broad adoption of persistent memory and compute systems in the data center.”

Tradeoffs are a delicate balance
All of this requires tradeoffs, which increasingly involve a combination of processing speed, various memory capacities, and strategies for minimizing I/O and paring down data and data movement. But when data does need to be moved, that has to happen at speeds that are in sync with other operations.

“When it comes to data center tradeoffs, it all depends on the bandwidth requirements,” explained Frank Ferro, senior director of product management for IP cores at Rambus. “Say you are designing for a network card that has no additional power input, with a very specific 75-watt limit. That is going to limit how much processing you can do on the card. The bandwidth on those cards is 300+ gigabytes per second. We’ve seen both HBM and GDDR can potentially stay within that power profile, so it’s rather limited. It’s a balance. You’ve got to run at 300 gig, but you can’t go over your power profile. So you may end up, between the processor and the memory, using HBM or GDDR to stay in that box. But then, when you go up to the next 250-watt cards, you’ve got a lot more power to play with. Depending on what problem you’re trying to solve, you’ll use either GDDR or HBM. In the lower-power cards, LPDDR looks potentially like a nice solution, but you’re limited in your bandwidth. The design team wants to know how to make the tradeoffs. ‘I’ve got this bandwidth problem. I need 500 gigabytes per second,’ or, ‘I need 300 gigabytes per second. Give me the best memory solution that’s going to stay in my power profile, that can give me the performance I want, and can keep my costs from going through the roof.’ That’s the game we play every day.”
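Ferro’s tradeoff can be framed as a simple screening step: given a bandwidth target and the share of the card’s power budget available to memory, which memory types even fit? The figures in the sketch below are rough, hypothetical planning numbers, not Rambus data; real selection depends on the controller, PHY, device counts, and process.

```python
# Rough screening for an accelerator card's memory subsystem. All figures are
# assumed, illustrative planning numbers, not measured or vendor-supplied data.

CANDIDATES = {
    # name: (deliverable bandwidth in GB/s, rough memory-subsystem power in W)
    "LPDDR5": (120, 8),
    "GDDR6":  (350, 20),
    "HBM2E":  (450, 25),
}

def screen(target_gb_per_s: float, memory_power_budget_w: float) -> None:
    for name, (bw, power) in CANDIDATES.items():
        ok = bw >= target_gb_per_s and power <= memory_power_budget_w
        print(f"{name:7s} {'fits' if ok else 'ruled out'} ({bw} GB/s, ~{power} W)")

# e.g. a 75 W card where perhaps ~25 W can go to memory, needing 300 GB/s:
screen(target_gb_per_s=300, memory_power_budget_w=25)
```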

Today this is mostly done with GPUs, Ferro said. “We see so many companies coming out with new architectures to try to chip away at that GPU market. GPU is really good at what it does. It’s a good general processor. But if I’ve got a very specific type of neural network I want to run, and somebody’s come up with an accelerator card that has a processor that’s optimized for that particular network, I’m going to use that. Users are not coalescing around one company. It really comes down to the kind of network they’re running, and what kind of memory and processor speed requirement is needed for that particular network. That’s why there are so many companies working on this. Everyone has a slightly better widget to solve that. For the short term, GPU is still going to dominate the market. But accelerator cards with custom ASICs and custom SoCs are starting to slowly make their way into different segments of the market.”

Switch fabrics are another source of tradeoffs. “First, it’s important to make sure the SerDes IP being used truly has the performance needed, backed up by the silicon data,” said Cadence’s Wong. “Do an evaluation based on actual silicon. Ask the supplier if they have taped out the device, and if they have silicon. Is there a board you can hook up to equipment so you can actually see the IP in action to validate the claims? Asking the IP supplier for validation of the silicon is probably the most critical part.”

Second, not all of these tradeoffs have precedent. “There are a lot of standards, and in the data center, sometimes the internet giant has a proprietary internal system,” Wong said. “Their server talks to another one of their servers, which then talks to their own data center. They control both ends — transmit and receive. In the insatiable desire to go faster and move data around more quickly, they may choose to adopt a newer standard that may not be ratified by the industry organization, but it gives them a lead on the competition by probably three to four years. While that might work for an internet giant, it may not interoperate with every piece of equipment on the planet. It’s almost like building a Formula 1 car versus building a car for the masses. I only have to win the Grand Prix. If the car breaks down, I’m okay because I won the race. A modern data center is almost like that. I have more speed, I move more data. I do faster computation, so I’m going to pick the latest and greatest technology, and eventually the industry is going to catch up and ratify that. Hopefully I’m large enough that they will see it my way and adopt my technology as the standard. That’s how the internet giants operate, and rightly so.”

Security concerns
Another factor creeps into this equation, as well. More data, more data movement, and more components increase the value of a successful cyber attack, and they collectively widen the attack surface.

“With increasing demands on network bandwidth caused by emerging applications such as 5G and AI, data centers require faster performance, higher density, low latency, and secure memory,” said Sam Geha, CEO of Infineon Technologies LLC. “It is becoming critically important for data center architectures to increase user privacy protection, prevent component counterfeiting, and ensure secured infrastructure. Moving forward, data center architectures will evolve to address the growing demand for billions of connected devices, which will amplify the need for a secure system. The flash memory is a critical component of the system that must be protected from attack, as it provides access to the boot code, security keys, and other system-critical data.”

Others agree. “Threats of inappropriate access and misuse of data are growing as the value of data traversing and stored in the cloud increases,” Durrant said. “In particular, the increase in the number of remote workers and their operating environments has broadened the field of attack for would-be data thieves. Protection of data is critical for cloud computing. To properly protect the confidentiality, integrity, and availability of data to authorized users, standards organizations are incorporating security requirements into data interface protocols. Implementing the requisite security algorithms in these high-speed interfaces requires high-quality cryptography IP for data encryption and decryption, security protocol accelerator IP to implement high-speed secure protocols, and trusted execution environments to provide root of trust and secure key management. To avoid creating bottlenecks in their respective data paths, the IP used to implement these functions must be able to sustain line rate operation.”
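A quick way to see what “line rate” means for the security blocks Durrant describes: the crypto engine has to keep up with the link’s bandwidth, or it becomes the bottleneck. The sketch below does that arithmetic for a few link speeds; the per-engine throughput figure is an assumed placeholder used only to size the example, not a real IP specification.

```python
# Back-of-the-envelope sizing for line-rate encryption: how much crypto throughput
# is needed to keep up with a given link, and how many parallel engine instances
# that implies. The per-engine figure is an assumed placeholder, not a real IP spec.

import math

ENGINE_GBPS = 100  # assumed sustained throughput of one crypto engine instance

for link_gbps in (100, 400, 800):
    engines = math.ceil(link_gbps / ENGINE_GBPS)
    print(f"{link_gbps} Gb/s link -> needs >= {link_gbps} Gb/s of crypto throughput, "
          f"~{engines} engine instance(s) at {ENGINE_GBPS} Gb/s each")
```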

Conclusion
There are multiple concerns in a data center, but in the end the overriding element is performance.

“You’ve got to have speed,” said Durrant. “Latency is important, and as we see more control systems come online, that criticality of low latency is going to increase in importance. Think high speed, low latency.”

Next in importance is energy efficiency, which is a significant and growing concern. “There’s a big drive right now toward a net zero carbon footprint for data centers, which is actually a huge challenge because they are large consumers of power today,” he said. “Every element in the data center is going to come into play around that. SoC designers are putting together products for deployment in data centers, especially server products, because those are multiplied by tens of thousands in a typical hyperscale data center. Even the switching infrastructure is growing. There is at least one switch in every rack, and a lot of silicon in that switch. This means the network infrastructure, the compute infrastructure, and storage are all growing. For all of these devices, we are going to have to increase energy efficiency. That’s one of the reasons we’re seeing, for the second or third time at least, a drive toward alternative architectures.”

Viewed individually, each of these changes is significant. Taken as a whole, they represent a fundamental shift inside data centers, and that shift will continue as data usage and storage patterns evolve.


