Chips & Semiconductors | News | June 20, 2026, 06:54 PM

CAPACITY TEST:

Intel And AMD Push ACE To Move AI Math Back Onto x86 CPUs

By SendTech Times Desk | Newsroom-edited technology coverage | Source: Tom's Hardware

Newsroom brief

Intel and AMD have released the ACE specification for x86 processors, using AVX10 registers and dedicated matrix-multiplication silicon to make some AI tasks more power-efficient on CPUs rather than GPUs or NPUs.

Verified against source material | Edited by SendTech Times Desk

ACE Reframes The CPU Role In AI

Intel and AMD have released the full specification for the ACE CPU extensions, a move aimed at making x86 processors more useful for AI tasks that do not always belong on GPUs.

The change targets smaller models, latency-sensitive single-user work and situations where no capable GPU is available.

The standard uses existing AVX10 registers while adding silicon dedicated to matrix multiplication.

That combination is meant to preserve links to current x86 designs while giving developers a more direct path for AI math.

The practical claim is not that CPUs replace accelerators, but that some AI work can run with less overhead when it stays close to the processor already handling the system.

That matters because AI infrastructure is not only a GPU story.

CPUs still manage operating systems, memory movement, storage, networking and many edge or client-side tasks.

If ACE gives x86 chips a more efficient way to handle matrix operations, Intel and AMD gain a clearer response to AI workloads that are too small, too latency-sensitive or too scattered for dedicated accelerators.

Matrix Multiplication Gets Dedicated Silicon

Matrix multiplication sits at the center of many AI workloads.

CPUs can already run those operations, but the process can be slow and power-hungry when it relies on general vector instructions.

AVX10 multiply-accumulate instructions can help, but the source material describes that path as a workaround because AVX was not designed around 2D matrix operations.
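
To make the workaround concrete, the following is a minimal Python sketch, not AVX10 or ACE code, of a matrix product built only from one-dimensional multiply-accumulate steps. The function name `matmul_vector_fma` and the step counter are illustrative assumptions; each pass of the inner loop stands in for one vector multiply-accumulate instruction.

```python
def matmul_vector_fma(a, b):
    """Multiply a (m x k) by b (k x n) using only 1D multiply-accumulate steps.

    Illustrative only: each inner-loop pass models one vector instruction
    that computes c[i][:] += a[i][p] * b[p][:].
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    vector_steps = 0
    for i in range(m):
        for p in range(k):
            scalar = a[i][p]   # one element of A, broadcast across the lanes
            row_b = b[p]       # one row of B, standing in for a vector register
            c[i] = [acc + scalar * x for acc, x in zip(c[i], row_b)]
            vector_steps += 1
    return c, vector_steps


product, steps = matmul_vector_fma([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, steps)
```

In this sketch the number of vector steps grows as rows times inner dimension, because a 2D product has to be decomposed into many 1D operations.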

ACE changes the approach by adding hardware support for matrix multiplication while continuing to use 512-bit AVX inputs.

That design is meant to simplify integration with existing x86 processor designs because ACE does not need a separate input format.

For the same number of input vectors, the ACE design is described as capable of 16 times the operations available through AVX10.

That is not the same as a promised 16x real-world speedup, because each processor implementation will determine delivered performance.
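
As a back-of-the-envelope illustration of how a figure like 16x could arise, the sketch below counts multiply-accumulate operations for two 512-bit inputs under two models. The outer-product model is an assumption made here for illustration; the source does not say how the 16x figure is derived, and the function names are hypothetical.

```python
REGISTER_BITS = 512  # input width cited for AVX and ACE


def lanes(element_bits):
    """Number of elements that fit in one 512-bit register."""
    return REGISTER_BITS // element_bits


def vector_fma_macs(element_bits):
    """Elementwise model: one multiply-accumulate per lane."""
    return lanes(element_bits)


def outer_product_macs(element_bits):
    """ASSUMED matrix model: every lane of one input pairs with every
    lane of the other, as in an outer product."""
    n = lanes(element_bits)
    return n * n


fp32_ratio = outer_product_macs(32) // vector_fma_macs(32)
print(lanes(32), vector_fma_macs(32), outer_product_macs(32), fp32_ratio)
```

Under this assumed model the ratio equals the lane count, so it comes out at 16 only for 32-bit elements, and it describes operations per instruction rather than delivered speed.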

Still, packing more matrix work into each instruction can reduce instruction overhead and may make better use of memory bandwidth.

The specification's value also depends on software practicality, not on benchmark claims alone.

ACE is useful only if the instruction path can be exposed in ways that compiler writers, library maintainers and framework teams can adopt without fragmenting support across every x86 implementation.

A Common Target For AI Frameworks

The developer angle may be as important as the hardware change.

ACE is intended to be implementation-agnostic, so machine-learning frameworks and libraries such as PyTorch and TensorFlow can target one code path rather than building many variations around different levels of AVX support.
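
The difference between one code path and many can be sketched as a kernel-selection routine. Everything named here is hypothetical: the feature strings, the kernel names and the `pick_kernel` function are placeholders, and real frameworks detect CPU features and register kernels in their own ways.

```python
# Hypothetical table of per-AVX-level kernels, most capable first.
AVX_KERNELS = {
    "avx10": "matmul_avx10",
    "avx512": "matmul_avx512",
    "avx2": "matmul_avx2",
}


def pick_kernel(cpu_features):
    """Return the name of the matrix kernel to use for a set of CPU features.

    ASSUMPTION: an "ace" feature flag exists and a single ACE kernel
    covers every implementation that reports it.
    """
    if "ace" in cpu_features:
        return "matmul_ace"
    for feature, kernel in AVX_KERNELS.items():
        if feature in cpu_features:
            return kernel
    return "matmul_scalar"


print(pick_kernel({"ace", "avx10"}))
print(pick_kernel({"avx2"}))
```

The point of the sketch is the shape of the logic: the AVX branch needs one entry per support level, while the assumed ACE branch needs only one.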

The standard also supports data types used in machine-learning operations, including INT8, INT32, FP8, FP16, FP32 and BF16.

It can also use Open Compute Project MX block-scaled formats natively, which AVX10 does not provide.
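
For readers unfamiliar with block scaling, the idea is that a group of values shares one power-of-two scale while each value is stored in a narrow type. The sketch below is a simplified Python illustration with 8-bit integer elements; it is not bit-exact to the OCP MX specification and says nothing about how ACE encodes or executes these formats. The function names are hypothetical.

```python
import math

BLOCK_SIZE = 32  # block size used by the OCP MX formats


def quantize_block(values):
    """Return (shared_exponent, int8_elements) for one block of floats.

    Simplified: picks the smallest power-of-two scale that keeps every
    element within the signed 8-bit range [-127, 127].
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0] * len(values)
    shared_exp = math.ceil(math.log2(max_abs / 127))
    scale = 2.0 ** shared_exp
    elements = [max(-127, min(127, round(v / scale))) for v in values]
    return shared_exp, elements


def dequantize_block(shared_exp, elements):
    """Rebuild approximate floats from the shared exponent and elements."""
    scale = 2.0 ** shared_exp
    return [e * scale for e in elements]


block = [i * 0.5 for i in range(-16, 16)]  # 32 sample values
exp, elems = quantize_block(block)
print(exp, elems[:4], dequantize_block(exp, elems)[:4])
```

In this sketch a block of 32 values is stored as 32 one-byte elements plus a single shared exponent, compared with 128 bytes for the same values in FP32.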

That gives Intel and AMD a way to make x86 CPUs a more consistent fallback or primary target for selected inference work.

Developers could move some NPU-specific workloads back to CPUs when they need quick execution and do not want to handle different NPU designs.

The Watchpoint Is Real Implementation

The specification gives Intel and AMD a shared technical direction, but the commercial test will come from silicon and software adoption.

ACE needs processor implementations, compiler support and framework support before it changes how AI workloads are deployed.

The open question is where ACE fits against GPUs and NPUs.

GPUs will remain central for large-scale training and heavy inference.

NPUs will keep serving power-sensitive client workloads.

ACE is more likely to matter in the middle: small models, fallback execution, CPU-only environments and workflows where moving data to another accelerator adds more overhead than value.
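
The trade-off in that last case can be expressed as a simple break-even check. The model below is a rough sketch with hypothetical parameter names and made-up numbers; it ignores overlap between transfer and compute, and it is not drawn from the ACE specification.

```python
def offload_is_worthwhile(cpu_seconds, accel_seconds, bytes_moved,
                          link_bytes_per_second, launch_overhead_seconds=0.0):
    """Return True if sending work to an accelerator beats staying on the CPU.

    All inputs are estimates supplied by the caller; the transfer time is
    modeled as bytes moved divided by link bandwidth.
    """
    transfer_seconds = bytes_moved / link_bytes_per_second
    total_accel_seconds = accel_seconds + transfer_seconds + launch_overhead_seconds
    return total_accel_seconds < cpu_seconds


# Made-up numbers: a small job where the data transfer dominates.
small_job = offload_is_worthwhile(0.002, 0.0005, 64e6, 16e9)
# Made-up numbers: a large job where accelerator compute savings dominate.
large_job = offload_is_worthwhile(2.0, 0.1, 64e6, 16e9)
print(small_job, large_job)
```

With these illustrative inputs the small job stays on the CPU because moving 64 MB takes longer than computing locally, while the large job is worth offloading.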

If Intel and AMD execute well, ACE could make x86 CPUs a more credible part of the AI stack rather than just the host around accelerators.

If support arrives slowly, it may remain a useful specification without becoming a practical deployment target for mainstream AI frameworks.
