Qualcomm AI250 Stacks DRAM Over Compute But Leaves FLOPS Undisclosed

By SendTech Times Chips & Compute Desk | Newsroom-edited, source-reviewed coverage | Source: The Register

Newsroom brief

Qualcomm is pitching high-bandwidth compute for AI inference, claiming 768 GB of memory and 133 TB/s of effective bandwidth for its AI250 cards, but the company has not disclosed peak FLOPS or named customers.




Qualcomm Moves AI250 Memory Closer To Compute

Qualcomm is using its AI250 accelerator roadmap to push a different answer to the AI inference memory bottleneck.

The company describes high-bandwidth compute, or HBC, as a 3D-stacked design that places DRAM above logic so some work can happen closer to memory.

The AI250 is due to follow the AI200 Dragonfly rack systems and is planned to begin shipping in 2027.

Qualcomm also outlined a second-generation HBC platform, the AI300, for 2028.

Qualcomm says the AI250 card will carry 768 GB of memory and up to 133 TB/s of effective memory bandwidth.

The company ties those claims to bandwidth-bound inference work, especially decode, where model weights are streamed from memory during token generation.
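A rough way to see why decode is treated as bandwidth-bound: if every model weight must be read once per generated token, the single-stream token rate cannot exceed memory bandwidth divided by the size of the weights. The sketch below is illustrative only. The 70-billion-parameter model and the 1 byte per weight are hypothetical assumptions, not Qualcomm figures, and the bandwidth input is Qualcomm's claimed "effective" number, whose definition has not been disclosed. KV-cache traffic, batching and compute limits are ignored.

```python
def decode_tokens_per_second_bound(bandwidth_bytes_per_s: float,
                                   num_params: float,
                                   bytes_per_param: float) -> float:
    """Upper bound on tokens/s for one sequence, assuming every weight
    is streamed from memory exactly once per generated token."""
    weight_bytes = num_params * bytes_per_param
    return bandwidth_bytes_per_s / weight_bytes


# Qualcomm's claimed AI250 effective bandwidth (decimal terabytes assumed).
AI250_EFFECTIVE_BW = 133e12  # bytes per second

# Hypothetical model, chosen only for illustration (not from the article).
HYPOTHETICAL_PARAMS = 70e9          # 70 billion parameters
HYPOTHETICAL_BYTES_PER_PARAM = 1.0  # e.g. 8-bit weights

bound = decode_tokens_per_second_bound(
    AI250_EFFECTIVE_BW, HYPOTHETICAL_PARAMS, HYPOTHETICAL_BYTES_PER_PARAM
)
print(f"Bandwidth-bound ceiling: about {bound:.0f} tokens/s per sequence")
```

The result is a ceiling, not a performance estimate: it only shows that, under these assumptions, the token rate scales directly with whatever bandwidth the hardware can actually sustain.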

Effective Bandwidth Claims Need More Detail

The company is presenting HBC as a way to reduce data movement between memory and compute.

Qualcomm says the architecture uses LPDDR memory in a purpose-built near-memory design and differs from HBM because HBC performs computation in the base logic die.

The bandwidth claims still depend on Qualcomm's definition of effective bandwidth.

For the AI200 generation, Qualcomm had cited 414 TB/s of effective memory bandwidth across 56 chips.

The AI250 marketing material says HBC gives 18x the AI200's effective bandwidth, while the AI300 would reach 54x.
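The figures above can be cross-checked with simple arithmetic. The sketch below assumes the 18x and 54x multipliers compare one AI200 chip against one AI250 or AI300 card; the article does not confirm that basis, and the AI300 number is an implied value, not a figure Qualcomm has stated.

```python
# Figures quoted in the article.
AI200_TOTAL_TBPS = 414.0   # effective bandwidth across the AI200 system
AI200_CHIPS = 56           # chips that total is spread across
AI250_CARD_TBPS = 133.0    # claimed effective bandwidth per AI250 card
AI250_MULTIPLIER = 18      # claimed gain over AI200
AI300_MULTIPLIER = 54      # claimed gain over AI200

# Per-chip AI200 bandwidth implied by the system total.
ai200_per_chip = AI200_TOTAL_TBPS / AI200_CHIPS

# What the multipliers imply if applied to a single AI200 chip (assumption).
implied_ai250 = ai200_per_chip * AI250_MULTIPLIER
implied_ai300 = ai200_per_chip * AI300_MULTIPLIER

print(f"AI200 per chip:  {ai200_per_chip:.2f} TB/s")
print(f"Implied AI250:   {implied_ai250:.1f} TB/s (claimed: {AI250_CARD_TBPS:.0f} TB/s)")
print(f"Implied AI300:   {implied_ai300:.1f} TB/s")
```

Under that assumption, the 18x multiplier lands on roughly the 133 TB/s headline figure, which suggests the per-card claim and the multiplier are the same number expressed two ways.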

Qualcomm says the AI250 can operate as a standalone AI accelerator.

It also says the part can sit in disaggregated inference systems, with GPUs or other Qualcomm parts handling prompt processing and AI250 accelerators handling memory-intensive decode.

The company declined to give peak FLOPS for AI250.

It also did not provide the detailed physical bandwidth calculation behind the headline effective-bandwidth figures. The disclosed figures indicate that ordinary LPDDR5x bandwidth alone would not explain the claimed totals.
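To illustrate the scale of that gap, peak interface bandwidth can be estimated as the per-pin data rate times the bus width. The sketch below assumes a per-pin rate of 8,533 MT/s, a commonly cited LPDDR5X speed grade that is not taken from the article, and a 64-bit interface as a reference width. It says nothing about how Qualcomm's design is actually wired.

```python
# Assumption: a commonly cited LPDDR5X per-pin data rate (not from the article).
ASSUMED_TRANSFERS_PER_S = 8533e6   # 8,533 MT/s per pin
REFERENCE_BUS_BITS = 64            # reference interface width (assumption)

# Qualcomm's claimed AI250 effective bandwidth (decimal terabytes assumed).
CLAIMED_BYTES_PER_S = 133e12


def peak_bandwidth_bytes_per_s(transfers_per_s: float, bus_bits: int) -> float:
    """Peak bandwidth of a conventional DRAM interface: rate x width / 8."""
    return transfers_per_s * bus_bits / 8


per_interface = peak_bandwidth_bytes_per_s(ASSUMED_TRANSFERS_PER_S, REFERENCE_BUS_BITS)

# Total bus width a conventional interface would need to hit the claim.
required_bits = CLAIMED_BYTES_PER_S * 8 / ASSUMED_TRANSFERS_PER_S
equivalent_interfaces = required_bits / REFERENCE_BUS_BITS

print(f"One 64-bit interface: {per_interface / 1e9:.1f} GB/s")
print(f"Bus width needed for 133 TB/s: about {required_bits:,.0f} bits")
print(f"Equivalent 64-bit interfaces: about {equivalent_interfaces:,.0f}")
```

Under these assumptions, the claim works out to well over a hundred thousand data bits of conventional interface, which is why the definition of "effective" bandwidth matters for interpreting the headline number.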

Modular Deal Targets The Software Gap

Qualcomm's investor-day push also included its planned acquisition of Modular, the AI software startup behind Mojo and the Max serving platform.

Mojo is positioned as a low-level programming interface that can run across different hardware, while Max targets LLM serving.

AI accelerator buyers are comparing more than silicon specifications.

They need serving tools, developer support and deployment paths that do not lock every workload to one vendor stack.

Qualcomm is using Modular to address that software gap while Nvidia and AMD remain the main comparison points for AI infrastructure buyers.

The plan also assumes Qualcomm can make a heterogeneous inference model attractive.

The Register's report describes a possible split in which other chips handle prompt processing and AI250 systems focus on memory-intensive decode, but it does not identify any production deployments using that design.

Qualcomm has not disclosed peak FLOPS for the AI250, the detailed method behind its effective-bandwidth calculation, named AI250 customers, or production deployment dates beyond the 2027 target. It also remains unclear whether regulators will clear the Modular acquisition this year.
