Analysis
AI SHIFT:

SEMQ Tests Aim To Cut AI Memory Overhead Without Named Customers

Newsroom brief

SEMQ Group is pitching symbolic embedding multi-quantization as a way to preserve retrieval and classification behavior while lowering AI semantic-state overhead, but its public evidence remains benchmark-led and customer names are undisclosed.

Verified against source material. Edited by SendTech Times AI & Enterprise Desk.

SEMQ Targets Embedding Memory Rather Than Model Weights

SEMQ Group is proposing a different way to reduce the storage and memory burden around AI systems.

The company’s approach, called Symbolic Embedding Multi-Quantization, separates the meaning captured in embeddings from the numerical representation normally used to store those embeddings.

Andrés Mac Allister, SEMQ Group’s CEO and founder, said the method focuses on the structural relationship among embedding components rather than preserving every floating-point magnitude.

The claim is vendor-led, but it is tied to a specific technical target: semantic state used in retrieval, memory and classification workflows.

Mac Allister described conventional embedding systems as sequences of high-precision numerical coordinates.

SEMQ files are described as preserving relative similarity ordering, neighborhood structure and other relational properties while separating the representation from metrics, indexing and execution semantics.
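
One way to check a claim like this is to measure how much of each vector's nearest-neighbor set survives compression. The sketch below is illustrative only: it uses synthetic vectors and a plain uniform scalar quantizer as a stand-in, not SEMQ's method, which the company has not published in this detail. The function names, the 384-dimension size and the 8-bit and 2-bit settings are assumptions chosen for the example.

```python
import numpy as np

def top_k_neighbors(vectors, k):
    """Return, for every row, the indices of its k most cosine-similar rows."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)  # a vector is not its own neighbor
    return np.argsort(-sims, axis=1)[:, :k]

def uniform_quantize(vectors, bits):
    """Stand-in quantizer (assumption): snap values to 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    lo, hi = vectors.min(), vectors.max()
    codes = np.round((vectors - lo) / (hi - lo) * levels)
    return codes / levels * (hi - lo) + lo

def neighbor_overlap(reference, candidate, k=10):
    """Average fraction of each row's top-k neighbors kept after compression."""
    ref = top_k_neighbors(reference, k)
    cand = top_k_neighbors(candidate, k)
    shared = [len(set(r) & set(c)) for r, c in zip(ref, cand)]
    return float(np.mean(shared)) / k

# Synthetic embeddings; a real test would use vectors from an embedding model.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 384)).astype(np.float32)

overlap_same = neighbor_overlap(embeddings, embeddings)
overlap_8bit = neighbor_overlap(embeddings, uniform_quantize(embeddings, 8))
overlap_2bit = neighbor_overlap(embeddings, uniform_quantize(embeddings, 2))

print(f"unchanged vectors: {overlap_same:.2f}")
print(f"8-bit stand-in:    {overlap_8bit:.2f}")
print(f"2-bit stand-in:    {overlap_2bit:.2f}")
```

A score of 1.0 means every vector kept the same ten nearest neighbors. A relational-preservation claim would need to hold that score close to 1.0 at the compressed size, measured on real embeddings rather than random ones.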

Mac Allister Cites Banking77 Benchmark Results

Mac Allister’s technical explanation gives the storage baseline for conventional quantization.

Mac Allister said FP32 requires 4 bytes per parameter, so a 7B-parameter model would need about 28 GB of disk space and memory.

He said FP16 or BF16 requires 2 bytes per parameter and would put the same model near 14 GB.

The same comparison lists smaller options including FP8, INT8, Q8, Q6, Q5, Q4, Q3 and Q2.

Those formats reduce the storage and memory footprint by lowering precision, while SEMQ is pitched as a way to preserve more of the relational structure in embedding state.
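
The arithmetic behind those figures is parameters multiplied by bytes per parameter. The sketch below reproduces the 28 GB and 14 GB numbers and extends the same calculation to the other listed formats. It rests on two assumptions that are not from Mac Allister: each format is treated as its nominal bit width, and 1 GB is counted as one billion bytes. Real block-quantized files typically carry extra scale metadata, so their actual sizes run somewhat higher than these nominal values.

```python
# Nominal bits per parameter. Only the FP32 and FP16/BF16 sizes are stated in
# the source; the remaining widths are assumed from the format names.
NOMINAL_BITS = {
    "FP32": 32,
    "FP16/BF16": 16,
    "FP8": 8,
    "INT8": 8,
    "Q8": 8,
    "Q6": 6,
    "Q5": 5,
    "Q4": 4,
    "Q3": 3,
    "Q2": 2,
}

def footprint_gb(num_params, bits_per_param):
    """Approximate storage in GB (10**9 bytes), ignoring metadata overhead."""
    return num_params * bits_per_param / 8 / 1e9

PARAMS_7B = 7_000_000_000

for name, bits in NOMINAL_BITS.items():
    print(f"{name:>10}: {footprint_gb(PARAMS_7B, bits):6.2f} GB")
```

Note that this calculation sizes model weights. Embedding state scales differently, with the number of stored vectors multiplied by their dimension and bytes per component.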

Mac Allister pointed to a company validation test built around the Banking77 dataset from MTEB and the all-MiniLM-L6-v2 embedding model.

He said the FP32 baseline achieved 92.26 percent accuracy, SEMQ reached 92.27 percent, and 4-bit quantization produced 56.05 percent accuracy.

Those figures come from the company’s cited validation work, not from named customer production systems.

Customer Evidence Remains Behind NDAs

Mac Allister said SEMQ can be applied when data is ingested or at query time.

In his description, teams could use an SDK on vectors generated by their existing embedding model and could run SEMQ beside an existing LLM, embedding model, vector database or agent framework before using it in selected retrieval or memory workloads.

He also said .semq files have been used in research to snapshot and restore transformer KV-cache state across process boundaries.

He presented that not as a pre-training workflow but as a runtime-state workflow for pausing, transferring and resuming an active model session.

The early business claim is still limited by disclosure.

Mac Allister said the company signed NDAs with organizations in a Founding Design Partnership Program, including some AI infrastructure hyperscalers and companies at the AI application layer.

SEMQ Group has not named customers, and the public record does not disclose deployment sizes, infrastructure savings or third-party benchmark validation.


