SoftBank Launches Sovereign AI GPU Cloud
SoftBank has introduced its AI Data Center GPU Cloud, a sovereign AI infrastructure service. The service aims to keep data within Japanese jurisdiction, filling a gap left by major cloud providers. A beta version is live now, with commercial availability set for October 2026.
The impact sits in capacity, compute costs and supply chains: one deployment or bottleneck can change how companies buy chips, cloud contracts and data-centre space. Readers should track whether the announcement turns into available infrastructure, not just a product claim.
Overview of AI Data Center GPU Cloud
SoftBank has announced its AI Data Center GPU Cloud, a sovereign AI infrastructure service that moves the company further into competition with global cloud giants.
This service is a key part of SoftBank's broader "Activate AI for Society" strategy.
A beta version went live immediately, although commercial availability is not scheduled until October 2026, initially limited to internal use across SoftBank group companies.
Unique Offerings and Technology
This initiative builds on a series of partnerships SoftBank has quietly formed over the past year, particularly with NVIDIA.
Instead of offering a generic GPU cloud, SoftBank integrates its telecom assets, edge network, and AI compute into a single service designed for customers prioritizing data sovereignty within Japan.
The core of the service is SoftBank’s proprietary software stack, the Infrinia AI Cloud OS, which consolidates AI computing infrastructure with the necessary software layers to manage modern AI workloads at scale.
The platform offers two main delivery modes: Kubernetes as a Service (KaaS) for multi-tenant environments and Inference as a Service (Inf-aaS), which provides access to large language model inference through APIs.
This setup is intended to support a wide range of workloads, from model training to general data processing.
Hardware and Networking
On the hardware side, SoftBank relies heavily on Nvidia technologies.
The cloud is built on Nvidia GB200 NVL72 systems utilizing the Grace Hopper architecture and is hosted in Japan-based data centers.
The Infrinia AI Cloud OS manages everything from BIOS configuration to Kubernetes management on the GPU platforms.
SoftBank employs Nvidia BlueField-3 DPUs to enhance both vRAN and generative AI workloads, with an integrated Nvidia Spectrum Ethernet switch facilitating the 5G timing protocol.
This AI Data Center GPU Cloud is part of SoftBank's "Telco AI Cloud" vision, which aims to connect large-scale GPU data centers with multi-access edge computing across its telecom network.
The edge component operates on AITRAS, SoftBank’s fully software-defined AI-RAN solution, currently deployed at Nvidia’s Santa Clara headquarters.
The goal is to achieve low-latency distributed inference processing at the network edge, while central data centers manage training and heavy computational tasks.
By sharing hardware between AI and telecom workloads, SoftBank claims to effectively provide "5G for free" from the same infrastructure.
This approach reportedly offers up to a fourfold improvement in ROI for vRAN workloads compared to dedicated 5G deployments.
With Japanese enterprises increasingly focused on data sovereignty, SoftBank's offering fills a significant gap in the market, especially as major global cloud providers have limited sovereign options in Japan.





