KAIST Simulator Tests LLM Infrastructure Before AI Server Buildouts

By SendTech Times Cloud & Infrastructure Desk | Newsroom-edited, source-reviewed coverage | Source: Aitimes

Newsroom brief

KAIST researchers developed LLMServingSim 2.0 to test LLM serving infrastructure before large deployments. The simulator models GPUs, NPUs, PIM devices, memory behavior, routing, power use and serving policies. The team plans to open-source the tool and validate it with real LLM serving frameworks.

Image source: KAIST / AI Times Korea

KAIST tests LLM infrastructure before deployment

KAIST researchers have developed LLMServingSim 2.0, a simulator for testing large language model serving infrastructure before operators build expensive server clusters.

AI Times Korea reported that the work by Professor Jongse Park's computer science team won a best paper award at ISPASS 2026.

The simulator works as a virtual testbed for AI infrastructure design.

Instead of deploying physical systems to compare accelerators, memory devices and serving policies, engineers can model how an LLM service may behave under different cluster configurations.

Why it matters

Large language model services can require very large server fleets.

The KAIST team said modern LLM serving is becoming more complex as operators combine GPUs with other accelerators, memory layers and software methods such as prefill-decode separation and prefix caching.
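To make prefix caching concrete, here is a minimal sketch of how a serving system might skip prefill work for prompts that share a leading run of tokens. It is an illustration only, not code from LLMServingSim 2.0: the function name `prefill_tokens_needed`, the block-based cache keyed by the full prefix, and the `block_size` parameter are all assumptions made for this example.

```python
def prefill_tokens_needed(prompts, block_size):
    """Count prompt tokens that still need prefill when blocks are cached.

    Hypothetical model: each prompt is split into fixed-size blocks, and a
    block can be reused only if the entire prefix up to the end of that
    block has been seen before.
    """
    cached = set()   # keys are full token prefixes ending at a block boundary
    computed = 0     # tokens that actually had to be prefilled
    for tokens in prompts:
        for start in range(0, len(tokens), block_size):
            block = tokens[start:start + block_size]
            key = tuple(tokens[:start + block_size])
            if key in cached:
                continue  # prefix already cached, no prefill for this block
            cached.add(key)
            computed += len(block)
    return computed


prompts = [
    [1, 2, 3, 4, 5, 6],   # first request, nothing cached yet
    [1, 2, 3, 4, 9, 9],   # shares the first four tokens
    [7, 8],               # unrelated prompt
]
total = sum(len(p) for p in prompts)
print(total, prefill_tokens_needed(prompts, block_size=2))
```

In this toy workload the second request reuses two cached blocks, so fewer tokens go through prefill than the raw prompt lengths suggest. A simulator would need to model savings of this kind alongside the memory the cache itself occupies.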

LLMServingSim 2.0 is designed to estimate throughput, latency, memory usage and power behavior.

It supports heterogeneous environments that can include GPUs, NPUs and processing-in-memory devices, giving cloud providers and semiconductor companies a way to test future AI hardware before it is widely available.

The system accepts workload, cluster configuration and hardware profile inputs.

It then builds a serving engine with request routing and model serving groups, and models compute execution, memory access, communication cost and power consumption to produce runtime outputs.
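The flow from inputs to outputs can be sketched in a few lines. The example below is a deliberately simplified toy, not the LLMServingSim 2.0 interface: the class names `Request` and `HardwareProfile`, the `simulate` function, the least-busy routing rule and the fixed tokens-per-second rates are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:                 # one entry in the (hypothetical) workload input
    arrival_s: float
    prompt_tokens: int
    output_tokens: int


@dataclass
class HardwareProfile:         # assumed per-device rates, not measured values
    prefill_tokens_per_s: float
    decode_tokens_per_s: float
    busy_power_w: float


def simulate(workload, num_replicas, hw):
    """Route each request to the replica that frees up first, one at a time."""
    free_at = [0.0] * num_replicas   # time at which each replica becomes idle
    latencies = []
    busy_s = 0.0
    for req in sorted(workload, key=lambda r: r.arrival_s):
        replica = min(range(num_replicas), key=lambda i: free_at[i])
        start = max(req.arrival_s, free_at[replica])
        service = (req.prompt_tokens / hw.prefill_tokens_per_s
                   + req.output_tokens / hw.decode_tokens_per_s)
        free_at[replica] = start + service
        latencies.append(free_at[replica] - req.arrival_s)
        busy_s += service
    makespan = max(free_at) - min(r.arrival_s for r in workload)
    return {
        "mean_latency_s": sum(latencies) / len(latencies),
        "throughput_tokens_per_s": sum(r.output_tokens for r in workload) / makespan,
        "energy_j": busy_s * hw.busy_power_w,
    }


workload = [Request(0.0, 1000, 100), Request(0.0, 1000, 100)]
hw = HardwareProfile(prefill_tokens_per_s=1000.0,
                     decode_tokens_per_s=100.0,
                     busy_power_w=500.0)
one = simulate(workload, num_replicas=1, hw=hw)
two = simulate(workload, num_replicas=2, hw=hw)
print(one)
print(two)
```

Comparing the one-replica and two-replica runs shows the kind of question such a tool answers: adding a replica cuts latency and raises throughput here, while the energy spent on useful work stays the same. A real simulator would also account for batching, memory access and communication, which this sketch ignores.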

For mixture-of-experts models, the simulator can reflect expert routing, expert placement, loading and synchronization.

It can also analyze the impact of expert parallelism and expert offloading on serving performance.
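A rough sketch shows why expert offloading affects serving performance. It assumes a hypothetical setup in which only a limited number of experts fit in accelerator memory and the least recently used expert is evicted when a new one must be loaded. The function `offload_cost`, the eviction policy and the per-step costs are illustrative assumptions, not the simulator's actual model.

```python
from collections import OrderedDict


def offload_cost(routed_experts, resident_capacity, compute_ms, load_ms):
    """Estimate expert loads and total time for a sequence of routing decisions.

    routed_experts: expert id chosen for each step (hypothetical trace)
    resident_capacity: how many experts fit in accelerator memory at once
    """
    resident = OrderedDict()   # experts currently in accelerator memory, LRU order
    loads = 0
    for expert in routed_experts:
        if expert in resident:
            resident.move_to_end(expert)      # mark as most recently used
            continue
        loads += 1                            # expert must be fetched from host memory
        if len(resident) >= resident_capacity:
            resident.popitem(last=False)      # evict the least recently used expert
        resident[expert] = True
    total_ms = len(routed_experts) * compute_ms + loads * load_ms
    return loads, total_ms


trace = [0, 1, 0, 2, 1, 3]
small = offload_cost(trace, resident_capacity=2, compute_ms=1.0, load_ms=10.0)
large = offload_cost(trace, resident_capacity=4, compute_ms=1.0, load_ms=10.0)
print(small, large)
```

With room for only two experts, the same routing trace triggers more loads than with room for four, and the loading time dominates the total. Trade-offs of this kind, between memory capacity and expert placement, are what the simulator is meant to quantify before hardware is purchased.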

Next steps

The researchers plan to release the simulator as open source, connect it with real LLM serving frameworks and keep adding hardware profiles.

Professor Park said AI service competitiveness depends not only on the model, but also on reliable and efficient infrastructure.

For Korea's AI sector, the project highlights the growing importance of infrastructure research behind generative AI.

If broadly validated, it could help cloud operators, AI chip developers and enterprise AI teams lower the cost and risk of testing new LLM serving designs.
