TL;DR
- Chip Launch: Alibaba launched the Zhenwu M890 on May 20 as a domestic AI chip for training and inference in China.
- System Pitch: The launch pairs processor specs with cloud and cluster details, including 144 GB of memory and 800 GB per second of inter-chip bandwidth.
- Market Test: Chinese buyers want local alternatives to constrained Nvidia supply, but independent benchmark proof and production adoption still need testing.
- Roadmap: Alibaba expects V900 and J900 follow-ons over two years, making the M890 a platform test rather than a one-cycle announcement.
Alibaba launched the Zhenwu M890 AI chip on Wednesday, adding a new domestic accelerator for AI training and inference. Alongside the chip, Alibaba outlined a broader cloud platform and cluster build-out, signaling that it wants to sell a system stack, not only a chip.
Alibaba framed the M890 as “exceptionally well-suited” for both training and inference and said it delivers three times the performance of the older Zhenwu 810E. Buyers will have to test that company-supplied claim against software support, operating cost and cluster efficiency in live deployments.
The clearest technical numbers behind the launch are 144 GB of memory and 800 GB per second of inter-chip bandwidth. More memory attached to the processor helps keep larger workloads close to compute, while faster chip-to-chip links matter once training jobs spread across multiple accelerators.
Alibaba’s New Chip Pitch
Alibaba is presenting the M890 as part of a wider cluster design. ICN Switch 1.0 extends the launch beyond one processor, and Alibaba rates that fabric at 25.6 terabits per second across clusters of 64 accelerators. For large model training jobs, that interconnect affects whether adding more chips improves throughput or only adds cost and coordination overhead.
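The scaling question can be illustrated with a toy model. The sketch below estimates data-parallel scaling efficiency from the standard bandwidth-only cost of a ring all-reduce. It is not Alibaba's method: the function names, the 64-second single-chip step time and the 140 GB gradient payload are hypothetical values chosen for illustration. Only the 800 GB/s link rate and the 64-accelerator cluster size come from Alibaba's stated figures, and treating the quoted link rate as the effective all-reduce bandwidth is an assumption.

```python
def ring_allreduce_seconds(payload_gb: float, n_chips: int, link_gb_per_s: float) -> float:
    """Bandwidth-only cost of a ring all-reduce.

    Each chip sends and receives 2 * (N - 1) / N of the payload.
    Latency terms are ignored, so this is an optimistic communication cost.
    """
    if n_chips < 2:
        return 0.0
    return 2 * (n_chips - 1) / n_chips * payload_gb / link_gb_per_s


def step_seconds(compute_s_one_chip: float, payload_gb: float,
                 n_chips: int, link_gb_per_s: float) -> float:
    """Step time with compute split evenly and no compute/communication overlap."""
    compute = compute_s_one_chip / n_chips
    return compute + ring_allreduce_seconds(payload_gb, n_chips, link_gb_per_s)


def scaling_efficiency(compute_s_one_chip: float, payload_gb: float,
                       n_chips: int, link_gb_per_s: float) -> float:
    """Speedup over one chip divided by chip count; 1.0 means perfect scaling."""
    t_n = step_seconds(compute_s_one_chip, payload_gb, n_chips, link_gb_per_s)
    return (compute_s_one_chip / t_n) / n_chips


if __name__ == "__main__":
    COMPUTE_S = 64.0     # hypothetical single-chip step time
    PAYLOAD_GB = 140.0   # hypothetical gradient payload per step
    LINK_GB_S = 800.0    # Alibaba's quoted inter-chip bandwidth
    for n in (1, 8, 64):
        eff = scaling_efficiency(COMPUTE_S, PAYLOAD_GB, n, LINK_GB_S)
        print(f"{n:>3} chips: efficiency {eff:.2f}")
```

In this model, efficiency falls as chips are added because the compute share per chip shrinks while the communication cost stays roughly constant. A faster fabric raises the curve, which is why buyers tend to look at interconnect figures alongside per-chip performance.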
T-Head’s earlier chip-unit plans give the launch some business history. In January, Alibaba’s chip ambitions were still being discussed through T-Head and a possible IPO path. Alibaba now pairs the new processor with more than 560,000 Zhenwu chips shipped to date and 400 external customers across 20 industries to argue that its silicon program already has commercial scale.
Alibaba lists precision formats from FP32 to FP4 for the processor, a range aimed at high-accuracy training and cheaper inference work on the same hardware family. Independent benchmark methodology is still missing, so customers will need to test software compatibility and operating cost before treating the M890 as a proven substitute for delayed H200 deliveries or for rival domestic chips. A strong spec sheet loses value quickly if model teams still have to spend weeks adapting frameworks or tuning inference pipelines.
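To see why the precision range matters alongside the 144 GB memory figure, the sketch below computes an upper bound on how many model parameters fit in that memory at each format width. The bit widths are standard for these formats; the function name is hypothetical, and the calculation counts weights only, ignoring activations, KV cache and optimizer state, so real capacity would be lower.

```python
# Bytes per parameter for each numeric format (standard bit widths).
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

MEMORY_GB = 144.0  # memory capacity Alibaba quotes for the M890


def max_params_billions(memory_gb: float, fmt: str) -> float:
    """Upper bound on parameters (in billions) whose weights alone fit in memory_gb.

    Assumes 1 GB = 1e9 bytes, so GB divided by bytes-per-parameter
    gives billions of parameters directly.
    """
    return memory_gb / BYTES_PER_PARAM[fmt]


if __name__ == "__main__":
    for fmt in BYTES_PER_PARAM:
        bound = max_params_billions(MEMORY_GB, fmt)
        print(f"{fmt}: up to ~{bound:.0f}B parameters (weights only)")
```

Each halving of the format width doubles the weight-only bound, which is the basic reason lower-precision formats make inference cheaper on the same hardware.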
Why the Launch Matters in China Now
China’s push for homegrown AI chips gives the launch a clear market backdrop. Chinese companies are shifting toward domestic AI chips, which gives Alibaba a wider opening to argue that its own stack can cover both model building and model deployment.
Procurement teams are not only comparing peak speeds. They also have to weigh delivery certainty, software support and the cost of moving sensitive AI work onto hardware that can be sourced inside China on a dependable schedule.
Earlier policy moves help explain that demand. On January 28, Beijing discussed domestic-chip purchase quotas alongside foreign-chip approvals, showing that import access and domestic promotion were being handled as part of the same procurement debate. Alibaba’s launch lands after months of that pressure rather than at the start of it.
Competitors, Context, and What Comes Next
Google TPU and AWS Trainium remain part of the comparison set for buyers evaluating custom accelerators instead of general-purpose GPUs. Intel’s latest AI accelerator belongs in the same field, which means Alibaba still has to compete on software maturity, memory behavior and deployment reliability, not only on headline performance claims.
Alibaba’s Zhenwu M890 looks like a serious top-tier Chinese AI accelerator, especially for Alibaba Cloud/Qwen-style inference and agent workloads, but it is not yet publicly documented at the same raw-performance level as Nvidia H200, Blackwell, GB300, or Rubin. The biggest caveat is that Alibaba has disclosed memory capacity, supported numeric formats, relative performance versus its prior chip, and rack-level networking, but not absolute FLOPS, HBM bandwidth, process node, TDP, MLPerf results, or real training-throughput benchmarks.
Alibaba says the M890 has 800 GB/s of inter-chip bandwidth. Nvidia H200’s 4.8 TB/s figure is memory bandwidth, not inter-chip bandwidth. The 4 TB/s figure for Huawei’s 950DT is also memory bandwidth, while its 2 TB/s figure refers to interconnect. Those are different measurements, so the M890’s 800 GB/s should not be read as “one-sixth of H200”; Alibaba simply has not disclosed the M890’s HBM bandwidth.
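The distinction can be made mechanical. The sketch below tags each quoted figure with the kind of bandwidth it measures and refuses to compute a ratio across kinds. The class and function names are hypothetical, and the figures are the ones quoted in this article. Even same-kind ratios are only indicative, since vendors may count link directions or ports differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bandwidth:
    chip: str
    kind: str        # "memory" or "interconnect"
    gb_per_s: float  # figure as quoted, converted to GB/s


def ratio(a: Bandwidth, b: Bandwidth) -> float:
    """Return a / b, but only when both figures measure the same kind of bandwidth."""
    if a.kind != b.kind:
        raise ValueError(
            f"cannot compare {a.chip} {a.kind} bandwidth with {b.chip} {b.kind} bandwidth"
        )
    return a.gb_per_s / b.gb_per_s


# Figures as quoted in the article (1 TB/s = 1000 GB/s).
m890_link = Bandwidth("Zhenwu M890", "interconnect", 800.0)
h200_mem = Bandwidth("Nvidia H200", "memory", 4800.0)
huawei_mem = Bandwidth("Huawei 950DT", "memory", 4000.0)
huawei_link = Bandwidth("Huawei 950DT", "interconnect", 2000.0)

if __name__ == "__main__":
    # Like-for-like: interconnect against interconnect.
    print(f"M890 vs 950DT interconnect: {ratio(m890_link, huawei_link):.2f}")
    # Mixed kinds: the misleading "one-sixth of H200" comparison is rejected.
    try:
        ratio(m890_link, h200_mem)
    except ValueError as err:
        print(f"rejected: {err}")
```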
The Zhenwu M890 is probably one of the most important Chinese AI chips announced so far, alongside Huawei Ascend 910C/950 and the best efforts from Cambricon, Baidu, Enflame, MetaX, and Moore Threads. It appears well ahead of older China-market chips such as Nvidia's H20 class and more modern than many domestic challengers, thanks to its 144 GB of memory, FP4 support, Alibaba Cloud integration, and AL128 rack networking.
Alibaba is also trying to show it can sell a larger system around the chip. Its 128-card AI supernode server links 128 AI chips with latency measured in the hundreds of nanoseconds, extending the pitch from one processor to the tightly connected training fabric customers buy in practice. Across the broader market, custom ASIC deployment outpacing GPU deployment has become a more visible pattern in 2026, which helps explain why specialized AI hardware is getting more attention from cloud providers.
Alibaba’s next checkpoint comes with the Zhenwu V900 and Zhenwu J900, which the company expects to roll out over the next two years. That longer product plan points to a continuing platform, but sustained adoption will still depend on whether Alibaba can turn launch-day specifications into repeat production deployments.

