South Korea Opens 25 High-Value Public Datasets for AI Companies

Article summary

South Korea will release 25 AI and high-value public datasets through the public data portal by December. The datasets were selected from more than 3,280 candidate projects identified through company visits and public-demand surveys. The program targets AI use cases in energy, culture, infrastructure safety, legal-risk checking and agricultural diagnosis.

Image source: 인공지능신문

What happened

South Korea's Ministry of the Interior and Safety has finalized detailed plans to open 25 datasets this year from its AI and high-value public data Top 100 program.

The ministry said the datasets will be released sequentially through the public data portal by December, with the goal of supporting domestic AI companies and new industries.

The selection came from more than 3,280 candidate projects identified through visits to about 800 companies and an online public-demand survey.

External experts reviewed the candidates based on economic impact, links to national policy tasks and AI suitability.

The government plans to open about 100 high-value datasets by 2028.

It opened 10 datasets in 2025 and plans to open 25 this year, 30 in 2027 and 35 in 2028, for a total of 100.

Why it matters

For AI developers, access to structured, lawful and domain-specific data can be as important as model choice.

The new releases focus on four areas tied to commercial and public-service use cases: new industries, K-culture, disaster and safety, and AI training data.

Examples include renewable-energy technical potential data from the Korea Institute of Energy Research, cultural AI training data from the Korea Culture Information Service, special-bridge inspection and management data from the Korea Authority of Land and Infrastructure Safety, Fair Trade Commission decision data structured for AI learning, and crop disease and pest diagnosis data from the Rural Development Administration.

The program also shows how South Korea is trying to balance AI training demand with privacy.

Some family, youth and transport-worker qualification data will be released as synthetic data, designed to preserve useful structure and distribution without exposing personal information.
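As a rough illustration of what "preserving distribution" can mean, the sketch below builds synthetic rows by sampling each column from the value frequencies seen in a small table. It is a minimal example under stated assumptions, not the ministry's method: the article does not say how the synthetic data will be generated, and the column names and records here are hypothetical. Sampling columns independently keeps per-column frequencies but drops relationships between columns, and on its own it is not a formal privacy guarantee.

```python
import random
from collections import Counter


def fit_marginals(records, columns):
    """Count how often each value appears in each column."""
    return {col: Counter(row[col] for row in records) for col in columns}


def sample_synthetic(marginals, n, seed=0):
    """Draw n synthetic rows, sampling every column independently."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for col, counts in marginals.items():
            values = list(counts.keys())
            weights = list(counts.values())
            # Pick one value with probability proportional to its frequency.
            row[col] = rng.choices(values, weights=weights, k=1)[0]
        rows.append(row)
    return rows


# Hypothetical toy records; not taken from any real dataset.
real = [
    {"age_band": "20s", "license": "bus"},
    {"age_band": "30s", "license": "taxi"},
    {"age_band": "30s", "license": "bus"},
    {"age_band": "40s", "license": "freight"},
]

marginals = fit_marginals(real, ["age_band", "license"])
synthetic = sample_synthetic(marginals, n=1000)
```

Each synthetic row is assembled from values that occur in the source table, but a given combination need not correspond to any real record.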

Who is affected

AI startups and service developers are the main target group.

The ministry expects the data to support business-model development in energy project analysis, cultural content, infrastructure maintenance, legal-risk checking, unfair-trade queries and agricultural diagnosis.

The opening could also affect public agencies that hold data but need to make it more AI-friendly.

The ministry said it will strengthen demand surveys for training data and shift public-data management toward an AI-friendly system.

What to watch next

The near-term test is whether the 25 datasets are released on schedule by December and whether their formats are usable for model training, retrieval systems and commercial applications.

Data quality, metadata, licensing terms and update cadence will determine how useful the releases are in practice.

Readers should also watch whether Korean AI companies can turn these public datasets into deployed services rather than experiments.

If adoption follows, the program could become part of South Korea's effort to support a stronger domestic AI ecosystem without relying only on private datasets.
