SendTech Times
Analysis
CAPACITY TEST:

KAIST Simulator Tests LLM Infrastructure Before AI Server Buildouts

Article summary

KAIST researchers developed LLMServingSim 2.0 to test LLM serving infrastructure before large deployments. The simulator models GPUs, NPUs, PIM devices, memory behavior, routing, power use and serving policies. The team plans to open-source the tool and validate it with real LLM serving frameworks.

Image source: KAIST / AI Times Korea

KAIST tests LLM infrastructure before deployment

KAIST researchers have developed LLMServingSim 2.0, a simulator for testing large language model serving infrastructure before operators build expensive server clusters.

AI Times Korea reported that the work by Professor Jongse Park's computer science team won a best paper award at ISPASS 2026.

The simulator works as a virtual testbed for AI infrastructure design.

Instead of deploying physical systems to compare accelerators, memory devices and serving policies, engineers can model how an LLM service may behave under different cluster configurations.

Why it matters

Large language model services can require very large server fleets, so design mistakes become costly once the hardware has been bought and installed.

The KAIST team said modern LLM serving is becoming more complex as operators combine GPUs with other accelerators, memory layers and software methods such as prefill-decode separation and prefix caching.
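As a rough illustration of why these software methods matter, the sketch below splits a request into a prefill phase and a decode phase and lets a prefix cache skip part of the prefill work. It is not taken from LLMServingSim 2.0: the `Request` fields, the `estimate_latency_ms` function and the per-token costs are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    prompt_tokens: int          # length of the input prompt
    cached_prefix_tokens: int   # prompt tokens already covered by a prefix cache
    output_tokens: int          # tokens generated during decode


def estimate_latency_ms(req, prefill_ms_per_token=0.5, decode_ms_per_token=20.0):
    """Return (prefill_ms, decode_ms) under assumed, made-up per-token costs."""
    # Only the uncached part of the prompt has to be processed in prefill.
    uncached = max(req.prompt_tokens - req.cached_prefix_tokens, 0)
    prefill_ms = uncached * prefill_ms_per_token
    # Decode produces one token at a time, so cost grows with output length.
    decode_ms = req.output_tokens * decode_ms_per_token
    return prefill_ms, decode_ms


cold = estimate_latency_ms(Request(2000, 0, 100))     # no cached prefix
warm = estimate_latency_ms(Request(2000, 1500, 100))  # 1,500 tokens cached
print(cold, warm)
```

Under these assumed costs, the cache shortens only the prefill phase while decode time is unchanged. This is one reason the two phases are often analyzed, and sometimes scheduled, separately.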

LLMServingSim 2.0 is designed to estimate throughput, latency, memory usage and power behavior.

It supports heterogeneous environments that can include GPUs, NPUs and processing-in-memory devices, giving cloud providers and semiconductor companies a way to test future AI hardware before it is widely available.

The system accepts workload, cluster configuration and hardware profile inputs.

It then builds a simulated serving engine with request routing and model serving groups, and models compute execution, memory access, communication cost and power consumption to produce runtime results.
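The flow from inputs to outputs can be sketched with a toy queueing model. This is a minimal illustration under stated assumptions, not the KAIST implementation: `HardwareProfile`, `ServingGroup`, `simulate`, the least-busy routing rule and every number below are hypothetical, and memory and communication costs are left out.

```python
from dataclasses import dataclass


@dataclass
class HardwareProfile:
    name: str
    tokens_per_second: float  # assumed sustained processing rate
    watts: float              # assumed average power draw while busy


@dataclass
class ServingGroup:
    profile: HardwareProfile
    busy_until: float = 0.0     # simulated time at which the group is free
    energy_joules: float = 0.0  # energy accumulated while serving


def simulate(workload, groups):
    """workload: list of (arrival_time_s, tokens). Returns per-request latency."""
    latencies = []
    for arrival, tokens in sorted(workload):
        # Routing policy (assumed): pick the group that frees up earliest.
        group = min(groups, key=lambda g: g.busy_until)
        start = max(arrival, group.busy_until)
        service = tokens / group.profile.tokens_per_second
        group.busy_until = start + service
        group.energy_joules += service * group.profile.watts
        latencies.append(group.busy_until - arrival)
    return latencies


# Hypothetical cluster configuration: one GPU group and one NPU group.
groups = [
    ServingGroup(HardwareProfile("gpu", 1000.0, 400.0)),
    ServingGroup(HardwareProfile("npu", 500.0, 150.0)),
]
workload = [(0.0, 1000), (0.0, 1000), (1.0, 500)]

latencies = simulate(workload, groups)
makespan = max(g.busy_until for g in groups)
throughput = sum(tokens for _, tokens in workload) / makespan
total_energy = sum(g.energy_joules for g in groups)
print(latencies, throughput, total_energy)
```

Changing the profiles or the routing rule and rerunning is the kind of comparison a simulator makes cheap, although a real tool models far more detail than this.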

For mixture-of-experts models, the simulator can reflect expert routing, expert placement, loading and synchronization.

It can also analyze the impact of expert parallelism and expert offloading on serving performance.
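A toy cost model shows why expert placement and offloading affect performance. The sketch assumes that an expert kept in accelerator memory costs only compute time, while an offloaded expert must first be loaded. The function `moe_step_cost_ms` and its cost constants are hypothetical and do not come from the simulator.

```python
def moe_step_cost_ms(routed_experts, resident_experts, compute_ms=1.0, load_ms=8.0):
    """Estimate the cost of one MoE layer step.

    routed_experts: expert ids picked by the router for the tokens in this step
    resident_experts: set of expert ids held in accelerator memory
    Returns (cost_ms, number_of_experts_that_had_to_be_loaded).
    """
    unique = set(routed_experts)          # each expert runs once per step
    misses = unique - resident_experts    # offloaded experts need a load first
    cost = len(unique) * compute_ms + len(misses) * load_ms
    return cost, len(misses)


routing = [0, 3, 3, 5]  # hypothetical router decisions for four tokens

offloaded = moe_step_cost_ms(routing, resident_experts={0, 1, 2, 3})
resident = moe_step_cost_ms(routing, resident_experts={0, 3, 5, 7})
print(offloaded, resident)
```

With these assumed costs, a single offloaded expert accounts for most of the step time. This is why a placement that matches the router's typical choices can matter as much as raw compute speed.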

Next steps

The researchers plan to release the simulator as open source, connect it with real LLM serving frameworks and keep adding hardware profiles.

Professor Park said AI service competitiveness depends not only on the model, but also on reliable and efficient infrastructure.

For Korea's AI sector, the project highlights the growing importance of infrastructure research behind generative AI.

If broadly validated, it could help cloud operators, AI chip developers and enterprise AI teams lower the cost and risk of testing new LLM serving designs.
