Most of us are familiar with PCIe-based GPUs, but NVIDIA also builds GPUs around a different interconnect: NVLink. Why would NVIDIA, a leader in the AI industry, introduce a new interconnect when PCIe already exists? What is the difference between NVLink and PCIe? This article answers these questions.
What is PCIe?
PCIe has long been the mainstream way to attach a GPU. After many years of use, it offers strong compatibility and has become the industry-standard interface. Whether in a home gaming PC or a regular server, PCIe handles data transmission reliably. It is also commonly used to connect other devices, such as network cards and storage drives.
PCIe is essentially a general-purpose peripheral bus architecture. It forms a tree structure with the CPU or chipset as the root, and communication goes through the host path. This provides very high compatibility, but it shows its limits as AI training demand grows: direct GPU-to-GPU communication is unsupported or very limited, so traffic between GPUs must rely on the slower PCIe path.

Although PCIe GPUs have lower interconnect bandwidth than NVLink GPUs, the computing performance of the GPU itself is not clearly different. PCIe Gen5 x16 provides about 64 GB/s of one-way bandwidth, and PCIe Gen6, with PAM4 signaling, reaches about 128 GB/s one-way. For applications that do not rely heavily on high-speed GPU interconnect, such as small to medium model training and inference deployment, interconnect bandwidth has little impact on overall performance.
PCIe key features:
• Hub-based communication via CPU/chipset
• Standard interface with strong compatibility
• Tree topology
• Bandwidth scaling via PCIe lanes, e.g., x16
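The x16 bandwidth figures quoted above follow from simple arithmetic on the per-lane transfer rate. The sketch below reproduces them in Python; the function name is hypothetical, and the result is the raw one-way rate before encoding and protocol overhead, so real throughput will be somewhat lower.

```python
# Per-lane transfer rates in GT/s for each PCIe generation (published spec values).
PCIE_GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0, 6: 64.0}


def pcie_raw_bandwidth_gb_s(gen: int, lanes: int = 16) -> float:
    """Raw one-way bandwidth in GB/s, before encoding and protocol overhead.

    GT/s counts raw bits per second on one lane, so GT/s * lanes gives
    Gbit/s for the whole link; dividing by 8 converts to GB/s.
    """
    return PCIE_GT_PER_S[gen] * lanes / 8


for gen in (5, 6):
    one_way = pcie_raw_bandwidth_gb_s(gen)
    print(f"Gen{gen} x16: {one_way:.0f} GB/s one-way, "
          f"{2 * one_way:.0f} GB/s bidirectional")
```

Halving the lane count halves the result, which is what "bandwidth scaling via PCIe lanes" means in practice.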
What is NVLink?
NVLink GPUs are typically built on the SXM form factor, in which the GPU is a socketed module rather than a plug-in card. To further improve speed in today's fast-growing AI era, NVIDIA introduced SXM as a new GPU form factor. An SXM board supports up to 8 GPUs mounted flat on the motherboard. This design is intended for AI model training, such as LLMs, and enables fast, direct communication between GPUs without going through other devices.

NVLink is a high-speed interconnect protocol designed specifically for GPU communication, built for short-distance, high-quality links. It uses NVHS SerDes with differential signaling over short signal paths, which reduces interference. NVLink scales bandwidth through multiple parallel links: each GPU has multiple NVLink links, and each link consists of several high-speed lanes. For example, in H100 (NVLink 4), each GPU provides 18 links of about 25 GB/s one-way each, for about 450 GB/s one-way (900 GB/s bidirectional). In the newer Blackwell architecture (NVLink 5), total bandwidth reaches around 1.8 TB/s bidirectional.
Unlike PCIe, NVLink does not rely on complex long-distance FEC systems. Its lightweight packet and flit mechanism gives it lower latency and higher efficiency.
NVLink key features:
- Direct GPU to GPU communication
- Multiple parallel links for each GPU
- Support for GPU mesh interconnect
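The "multiple parallel links" scaling described above can be sketched as a small calculation: total bandwidth is the link count times the per-link rate. The class name is hypothetical. The H100 numbers are the ones quoted in this article; the Blackwell per-link split is an assumption chosen to match the roughly 1.8 TB/s total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NvlinkConfig:
    name: str
    links_per_gpu: int
    gb_s_per_link_one_way: float

    @property
    def one_way_gb_s(self) -> float:
        # Links run in parallel, so their bandwidth simply adds up.
        return self.links_per_gpu * self.gb_s_per_link_one_way

    @property
    def bidirectional_gb_s(self) -> float:
        return 2 * self.one_way_gb_s


# H100 figures are quoted in the article. The Blackwell split
# (18 links at 50 GB/s each way) is an assumption that reproduces
# the ~1.8 TB/s total; check the vendor datasheet before relying on it.
H100 = NvlinkConfig("H100 / NVLink 4", 18, 25.0)
BLACKWELL = NvlinkConfig("Blackwell / NVLink 5", 18, 50.0)

for cfg in (H100, BLACKWELL):
    print(f"{cfg.name}: {cfg.one_way_gb_s:.0f} GB/s one-way, "
          f"{cfg.bidirectional_gb_s:.0f} GB/s bidirectional")
```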
Why is NVLink necessary now?
In the past, people focused more on connecting different types of devices into one system, and the demand for bandwidth was not so high. With the rise of AI training, raw computation is no longer the main bottleneck. The focus has shifted to data movement, especially data transfer between GPUs.
In model parallelism and distributed training, GPUs frequently exchange large amounts of intermediate data. If the system still relies on PCIe, each transfer goes through a single, slower host path, which limits performance. NVLink was created to solve this problem and has gradually become a key technology in AI infrastructure, which is why it is necessary now.
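To get a feel for how much the link speed matters when GPUs exchange data, here is a rough bandwidth-only estimate. It assumes a ring all-reduce, which is one common collective pattern and not something this article specifies. The payload size is hypothetical, and latency and host-path contention are ignored, so treat the output as an order-of-magnitude guide.

```python
def ring_allreduce_seconds(payload_gb: float, num_gpus: int, link_gb_s: float) -> float:
    """Bandwidth-only estimate for a ring all-reduce.

    In a ring all-reduce each GPU sends (and receives) 2 * (N - 1) / N times
    the payload size, so the time is that volume divided by the per-GPU
    one-way bandwidth. Latency and contention are not modeled.
    """
    if num_gpus < 2:
        return 0.0  # a single GPU has nothing to exchange
    volume_gb = 2 * (num_gpus - 1) / num_gpus * payload_gb
    return volume_gb / link_gb_s


payload_gb = 10.0  # hypothetical amount of data synchronized per step
for name, bw in (("PCIe Gen5 x16", 64.0), ("NVLink 4", 450.0)):
    t = ring_allreduce_seconds(payload_gb, num_gpus=8, link_gb_s=bw)
    print(f"{name}: {t * 1000:.0f} ms per all-reduce")
```

The ratio between the two results equals the ratio of the link bandwidths, since the transferred volume is the same in both cases.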
NVLink vs. PCIe
NVLink and PCIe differ in many ways, most notably in bandwidth and performance, which people care about most. Some people in the community also focus on RAS coverage, SerDes differences, and link scaling methods. The table below summarizes these differences.
| | PCIe | NVLink |
|---|---|---|
| Bandwidth | Gen5 x16: 64 GB/s one-way (128 GB/s bidirectional); Gen6 x16: 128 GB/s one-way (256 GB/s bidirectional) | NVLink 4: 450 GB/s one-way (900 GB/s bidirectional); NVLink 5: 900 GB/s one-way (1.8 TB/s bidirectional) |
| Latency | Higher, due to CPU/chipset path | Low, direct GPU-to-GPU links |
| Topology | CPU-controlled tree structure | Mesh / point-to-point GPU interconnect |
| Direct GPU-to-GPU | No or very limited, via CPU/chipset | Yes, direct GPU communication |
| Use Cases | General servers, gaming | AI training, model parallelism |
| Scaling | Add lanes (x16/x32) | Add links per GPU |
| Reliability (RAS) | Protocol + platform-dependent | CRC + retransmission |
| SerDes | Long-reach, needs FEC/retimers | Short-reach NVHS, lower latency |
| Expansion | Scale by lane width | Scale by parallel links |
NVLink Limitations
However, newer does not mean perfect. From discussions in the Reddit community, we can clearly see that NVLink still has some limitations and is not suitable for everyone.
First, a very common misunderstanding is that NVLink can achieve “memory pooling” or “shared VRAM.” NVLink does not make multiple GPUs act as a single GPU with larger memory. Each GPU still has its own independent memory space. NVLink only provides a faster path for data exchange. Data still needs to move between GPUs, and even with NVLink, this is much slower than a GPU's internal memory bandwidth (for example, A100 memory bandwidth is roughly 1.5 to 2 TB/s, while its NVLink bandwidth totals 600 GB/s bidirectional).
Second, the benefit of NVLink depends heavily on the workload. In data parallelism, each GPU processes a full copy of the model independently, and GPUs communicate comparatively little (in training, mainly to synchronize gradients), so NVLink brings little benefit. In model parallelism, computation is often sequential, and communication happens only a limited number of times during the forward and backward passes, so the overall speed gain is also limited. In many real tests, NVLink improves performance by about 30%–40%, not by an order of magnitude. Changing the connection type will not make a huge difference; it just makes the work more efficient. NVLink is also still evolving.
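One way to see why the gain is tens of percent rather than an order of magnitude is Amdahl's law: only the communication share of each step gets faster. The sketch below is an illustration under stated assumptions, not a benchmark. The communication fractions are hypothetical, and the link ratio uses the one-way figures quoted earlier in this article (450 GB/s for NVLink 4 and 64 GB/s for PCIe Gen5 x16).

```python
def overall_speedup(comm_fraction: float, comm_speedup: float) -> float:
    """Amdahl's-law estimate of end-to-end speedup.

    comm_fraction: share of step time spent on GPU-to-GPU communication (0..1).
    comm_speedup:  how many times faster that communication becomes.
    """
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be between 0 and 1")
    if comm_speedup <= 0:
        raise ValueError("comm_speedup must be positive")
    return 1.0 / ((1.0 - comm_fraction) + comm_fraction / comm_speedup)


link_ratio = 450.0 / 64.0  # NVLink 4 vs PCIe Gen5 x16, one-way
for fraction in (0.05, 0.30, 0.60):  # hypothetical communication shares
    gain = overall_speedup(fraction, link_ratio)
    print(f"communication = {fraction:.0%} of step time -> {gain:.2f}x overall")
```

With communication at 30% of step time, the estimate comes out at about 1.35x, which is in line with the 30%–40% range mentioned above. A workload that barely communicates gains almost nothing.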
Third, NVLink usually depends on SXM platforms or dedicated bridge structures rather than standard PCIe slots. This means higher hardware cost, stricter thermal design, and a more limited ecosystem than PCIe, whose ecosystem is complete and mature. Therefore, for small-scale deployment or general computing scenarios, PCIe is still the more practical and cost-effective choice.
How to choose
For large-scale model training with model parallelism, NVLink is the better choice. It speeds up communication between GPUs and makes the whole process more efficient. However, it is not a plug-and-play solution: after choosing an NVLink system, software optimization is still required to get the full benefit.
If you are training models at home or at small scale, PCIe is still a good option. Its mature ecosystem means you do not have to worry much about compatibility. Moreover, when the number of GPUs is small, the bandwidth difference has little practical effect. In this case, PCIe is powerful enough.
Besides, the GPU interconnect, whether NVLink or PCIe, is only one part of an AI deployment. The system also has to coordinate with the network layer, where optical modules and high-speed cables can add further value.
FAQs
#1 Is NVLink always better?
No. NVLink is designed for GPU communication. In many real applications, its advantage cannot be fully used, and PCIe is still more suitable for most scenarios in terms of cost and general use.
#2 Will NVLink replace PCIe?
No. PCIe is a general standard interface, while NVLink is a specialized solution for GPU interconnect. They are completely different in concept and direction. They will coexist for a long time and serve different needs.
Conclusion
PCIe's general-purpose design, including features such as device enumeration and hot plug, determines its upper limit. NVLink, a product of a new era, makes the interconnect between GPUs more efficient and direct. Both interfaces have their own merits and demerits, and NVLink still needs time to develop. What matters most is to understand the difference between them and choose the one that suits your case.