Thales Bets on Open Source Silicon for Sovereignty and Safety-Critical Systems

Executive Summary

Bernhard Quendt, CTO of Thales Group, delivered a compelling presentation at RISC-V Summit Europe 2025 (May 28th, 2025) on the strategic adoption of open-source hardware (OSH), particularly RISC-V and the CVA6 core, to build sovereign and reliable supply chains in safety- and mission-critical domains. The talk emphasized how tightening geopolitical controls—export restrictions from both U.S.-aligned and China-aligned blocs—are accelerating the need to decouple from proprietary IP.

Quendt highlighted three technical thrusts in this initiative: a spaceborne computing platform built on the open-source CVA6 core, a compact industrial-grade CVA6-based microcontroller (CVA62) for embedded systems, and a forthcoming CVI64 core with MMU support for secure general-purpose OSes.

Thales is not adopting OSH merely as a cost-cutting measure. Rather, it views open hardware as foundational—alongside AI acceleration, quantum computing, and secure communications—for enabling digital sovereignty, reducing integration costs, and maintaining complete control over high-assurance system architectures.


Three Critical Takeaways

1. CVA6-Based Spaceborne Computing Platform

Technical Overview

Thales Alenia Space has developed a modular onboard computer based on the CVA6 64-bit RISC-V core. This system incorporates secure open-source root-of-trust blocks and vector accelerators. The platform supports mixed-criticality software and is tailored for the unique reliability and certification needs of space environments.

The modularity of the platform allows faster design iteration and decoupling of hardware/software verification cycles—critical benefits in aerospace development.

Assessment

The strategy is forward-leaning but not without risk. Toolchains and verification flows for open-source processors remain less mature than those in the Arm or PowerPC ecosystem. Furthermore, CVA6 is not yet hardened against radiation effects (e.g., single event upsets or total ionizing dose), which poses challenges for LEO and deep-space applications.

Thales likely mitigates this through board-level fault tolerance and selective redundancy, though such architectural decisions were not disclosed.
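
For a flavour of what such mitigation can look like in practice, here is a minimal software-level triple modular redundancy (TMR) sketch; the three-lane voting scheme, function names, and fault model are illustrative assumptions, not disclosed Thales design details.

```python
from collections import Counter

def tmr_vote(results):
    """Majority-vote over redundant results (classic triple modular redundancy)."""
    value, count = Counter(results).most_common(1)[0]
    if count < 2:
        raise RuntimeError("TMR disagreement: no majority among lanes")
    return value

def run_redundant(task, *args):
    """Run the same task on three notionally independent lanes and vote.

    On real flight hardware the lanes would be separate cores, FPGAs, or boards.
    """
    return tmr_vote([task(*args) for _ in range(3)])

if __name__ == "__main__":
    # One lane corrupted by a simulated single-event upset still yields 42.
    print(tmr_vote([42, 42, 7]))
    print(run_redundant(lambda x: x * x, 6))
```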

Market Context

This approach diverges from legacy reliance on processors like LEON3 (SPARCv8) or PowerPC e500/e6500, which are radiation-tolerant and supported by ESA/NASA toolchains. The open RISC-V path offers increased configurability and transparency at the expense of hardened IP availability and TRL maturity.

Quantitative Support

While specific metrics were not shared, RISC-V-based radiation-tolerant designs typically target performance in the 100–500 DMIPS range. Proprietary IP licenses for space-qualified cores can run to $1–2 million or more per program, underscoring the potential cost advantage of open-source silicon.


2. CVA62: Low-Area, Safety-Ready Microcontroller

Technical Overview

Thales introduced CVA62, a 32-bit microcontroller derivative of CVA6, targeting embedded systems and industrial IoT. CVA62 is designed on TSMC 5nm and adheres to ISO 26262 safety principles, aiming for ASIL-B/D applicability. Its RTL is formally verified and publicly auditable.

It supports the RV32IMAC instruction set, features a configurable pipeline depth, and prioritizes area and power efficiency. Its release aligns with growing demand for safety-certifiable open cores.

Assessment

A formally verified open-source MCU with ISO 26262 alignment is a strong differentiator—especially for defense, automotive, and infrastructure markets. However, achieving full ASIL-D certification also depends on qualified toolchains, documented failure modes, and compliance artifacts, which the RISC-V ecosystem does not yet provide at the required level of rigor.

Still, the availability of a verified baseline—combined with collaboration-friendly licensing—could enable safety qualification through industry-specific efforts.

Competitive Context

CVA62 competes with Cortex-M7 and SiFive E31/E51 in the deterministic MCU space. While Arm cores offer rich toolchains and pre-certified software stacks, CVA62 provides transparency and configurability, with the tradeoff of less polished ecosystem support.

Feature | CVA62 | Cortex-M7
ISA | RISC-V (RV32IMAC) | Armv7E-M
Pipeline | Configurable | Fixed 6-stage
MMU Support | No | No
Open Source | Yes | No
ISO 26262 Alignment | Planned | Available (via toolchain vendors)
Target Process | TSMC 5nm | 40nm–65nm typical

Quantitative Support

Public benchmarks for RV32-class cores show CVA62-class devices achieving 1.5–2.0 CoreMark/MHz depending on configuration. Power efficiency data is pending silicon tape-out but is expected to improve over larger legacy MCUs thanks to the 5nm geometry.
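
As a quick sanity check on those figures, absolute CoreMark scores scale roughly linearly with clock frequency; the sketch below simply multiplies the quoted CoreMark/MHz band by an assumed 250 MHz operating point (an illustrative number, not a Thales specification).

```python
def coremark_estimate(coremark_per_mhz: float, freq_mhz: float) -> float:
    """Scale a CoreMark/MHz figure to an absolute score at a given clock."""
    return coremark_per_mhz * freq_mhz

# Quoted band for CVA62-class RV32 cores, at an assumed 250 MHz clock.
for cm_per_mhz in (1.5, 2.0):
    score = coremark_estimate(cm_per_mhz, 250)
    print(f"{cm_per_mhz} CoreMark/MHz @ 250 MHz ≈ {score:.0f} CoreMark")
```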


3. CVI64: MMU-Enabled RISC-V Application Core

Technical Overview

Thales is collaborating on CVI64, a 64-bit RISC-V core with memory management unit (MMU) support and a clean-slate deterministic design philosophy. The first silicon is targeted for Technology Readiness Level 5 (component validation in relevant environment) by Q3 2025.

CVI64 is intended to support real-time Linux and deterministic hypervisors, with applications in avionics, defense systems, and certified industrial platforms.

Assessment

Adding MMU support unlocks Linux-class workloads—but increases architectural complexity. Issues like page table walk determinism, cache coherence, and privilege transitions must be tightly constrained in safety contexts. Out-of-order execution, if implemented, would further complicate timing analysis.
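
To make the determinism concern concrete, the sketch below models the extra memory accesses a TLB miss can add under RISC-V Sv39 paging, which uses a three-level page table; the miss rate and memory latency are illustrative assumptions, not CVI64 measurements.

```python
SV39_LEVELS = 3  # Sv39 resolves a virtual address through a 3-level page table

def worst_case_walk_ns(levels: int, mem_latency_ns: float) -> float:
    """Worst-case page-table walk: one memory access per level, no walk cache."""
    return levels * mem_latency_ns

def effective_access_ns(tlb_hit_ns: float, miss_rate: float, walk_ns: float) -> float:
    """Average access time once TLB misses are folded in (WCET analysis would
    instead take tlb_hit_ns + walk_ns as the per-access bound)."""
    return tlb_hit_ns + miss_rate * walk_ns

walk = worst_case_walk_ns(SV39_LEVELS, mem_latency_ns=80.0)  # assumed DRAM latency
print(f"Worst-case Sv39 walk: {walk:.0f} ns")
print(f"Effective access at a 1% TLB miss rate: {effective_access_ns(2.0, 0.01, walk):.2f} ns")
```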

Early ecosystem maturity will likely lag that of SiFive U-series or Arm Cortex-A cores, but CVI64 may find niche adoption where auditability and customization trump software availability.

Competitive Context

CVI64 enters a field occupied by SiFive S7/S9, Andes AX45, and Arm Cortex-A53/A55. Unlike these, CVI64 will be fully open and verifiable. This suits users requiring full-stack trust anchors—from silicon up to operating system.

Feature | CVI64 | SiFive S7 | Cortex-A53
ISA | RV64GC | RV64GC | Armv8-A
MMU | Yes | Yes | Yes
Execution Model | In-order (planned) | In-order | In-order
Target Frequency | TBD (~1 GHz class) | 1.5–2.0 GHz | 1.2–1.5 GHz
Open Source | Yes (100%) | Partial | No

Quantitative Support

SiFive U84-based SoCs have reached 1.5 GHz on 7nm. CVI64 will likely debut at lower performance (~800–1000 MHz) due to early-phase optimizations and tighter deterministic design goals.


Final Thoughts

Thales’s adoption of open-source silicon reflects a strategic shift across defense and aerospace sectors. OSH enables sovereignty, customization, and long-term maintenance independence—critical in an era of increasingly politicized semiconductors.

Yet major challenges persist: toolchain immaturity, limited availability of safety-certifiable flows, and uncertain community governance. Organizations pursuing this path should adopt a phased integration model—deploying OSH first in non-critical components while building verification and integration expertise in parallel.

Significant investment will be required in:

  • Formal verification frameworks (e.g., SymbiYosys, Boolector, Tortuga Agilis)
  • Mixed-language simulation environments (e.g., Verilator, Cocotb; see the testbench sketch after this list)
  • Cross-industry ecosystem building and long-term funding models
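
To illustrate the Cocotb item above, here is a minimal testbench skeleton in the style such flows use; the DUT signal names (clk, rst_n, valid_o) are hypothetical placeholders rather than CVA6 ports.

```python
import cocotb
from cocotb.clock import Clock
from cocotb.triggers import RisingEdge, ClockCycles

@cocotb.test()
async def smoke_test(dut):
    """Drive a clock, release reset, and wait for the DUT to assert valid_o."""
    cocotb.start_soon(Clock(dut.clk, 10, units="ns").start())

    dut.rst_n.value = 0
    await ClockCycles(dut.clk, 5)
    dut.rst_n.value = 1

    for _ in range(100):
        await RisingEdge(dut.clk)
        if dut.valid_o.value == 1:
            break
    assert dut.valid_o.value == 1, "DUT never asserted valid_o"
```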

Thales is making a long-term bet on auditability and openness in silicon. If the RISC-V ecosystem can deliver the tooling and robustness demanded by regulated industries, it could catalyze a new wave of mission-grade open architectures. The opportunity is real—but so is the engineering burden.

AMD at COMPUTEX 2025: Pushing the Boundaries of Compute

At COMPUTEX 2025 on May 21st, 2025, AMD’s Jack Huynh—Senior VP and GM of the Computing and Graphics Group—unveiled a product vision anchored in one central idea: small is powerful. This year’s keynote revolved around the shift from centralized computing to decentralized intelligence—AI PCs, edge inference, and workstations that rival cloud performance.

AMD’s announcements spanned three domains:

  • Gaming: FSR Redstone and Radeon RX 9060 XT bring path-traced visuals and AI rendering to the mid-range.
  • AI PCs: Ryzen AI 300 Series delivers up to 34 TOPS of local inferencing power.
  • Workstations: Threadripper PRO 9000 and Radeon AI PRO R9700 target professional AI developers and compute-intensive industries.

Let’s unpack the technical and strategic highlights.


1. FSR Redstone: Machine Learning Meets Real-Time Path Tracing

The Technology

FSR Redstone is AMD’s most ambitious attempt yet to democratize path-traced rendering. It combines:

  • Neural Radiance Caching (NRC) for learned lighting estimations.
  • Ray Regeneration for efficient reuse of ray samples.
  • Machine Learning Super Resolution (MLSR) for intelligent upscaling.
  • Frame Generation to increase output FPS via temporal inference.

This hybrid ML pipeline enables real-time lighting effects—like dynamic GI, soft shadows, and volumetric fog—on GPUs without dedicated RT cores.
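
As a purely conceptual sketch of how such a hybrid pipeline composes (with toy stand-ins for each ML stage, not AMD’s actual algorithms), the snippet below chains super-resolution and frame generation over NumPy frames.

```python
import numpy as np

def ml_super_resolution(frame: np.ndarray, scale: int = 2) -> np.ndarray:
    """Toy stand-in for MLSR: nearest-neighbour upscale instead of a learned model."""
    return frame.repeat(scale, axis=0).repeat(scale, axis=1)

def frame_generation(prev: np.ndarray, curr: np.ndarray) -> np.ndarray:
    """Toy stand-in for temporal frame generation: midpoint blend of two frames."""
    return ((prev.astype(np.float32) + curr.astype(np.float32)) / 2).astype(prev.dtype)

def render_pipeline(low_res_frames):
    """Chain the stages: upscale every frame, then insert interpolated frames."""
    upscaled = [ml_super_resolution(f) for f in low_res_frames]
    output = [upscaled[0]]
    for prev, curr in zip(upscaled, upscaled[1:]):
        output.append(frame_generation(prev, curr))  # the "generated" frame
        output.append(curr)
    return output

frames = [np.random.randint(0, 255, (270, 480, 3), dtype=np.uint8) for _ in range(3)]
out = render_pipeline(frames)
print(len(out), "frames out, resolution", out[0].shape[:2])
```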

Why It Matters

By applying learned priors to ray-based reconstruction, Redstone achieves the appearance of path-traced realism while maintaining playable frame rates. This lowers the barrier for mid-range GPUs to deliver high-fidelity visuals.

Caveats

The ML approach, while efficient, is heavily scene-dependent. Generalization to procedurally generated content remains an open question. Visual artifacts can emerge in dynamic geometry, and upscaling introduces trade-offs in motion stability.

Competitive Lens

Feature | FSR Redstone | DLSS 3.5 | XeSS
Neural Rendering | ✅ | ✅ | ✅
Ray Regeneration | ✅ | ✅ | ⚠️ Partial
Open Source Availability | ✅ (via ROCm) | ❌ | ⚠️ Partial
Specialized Hardware Req. | ❌ | ✅ (Tensor Cores) | ❌

In essence: Redstone is AMD’s answer to DLSS—built on open standards, deployable without AI-specific silicon.


2. Ryzen AI 300 Series: On-Device Intelligence for the AI PC Era

The Technology

The new Ryzen AI 300 APUs feature a dedicated XDNA 2-based NPU delivering up to 34 TOPS (INT8). This enables local execution of:

  • Quantized LLMs (e.g., Llama 3 8B)
  • Real-time transcription and translation
  • Code assist and image editing
  • Visual search and contextual agents

The architecture distributes inference across CPU, GPU, and NPU with intelligent workload balancing.
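
A hedged sketch of how a developer might exploit that split from Python with ONNX Runtime follows; the execution-provider names are ONNX Runtime’s generic ones, and whether the XDNA 2 NPU surfaces behind the Vitis AI or DirectML provider on a given Ryzen AI system is an assumption to check against AMD’s current tooling.

```python
import onnxruntime as ort

def make_session(model_path: str) -> ort.InferenceSession:
    """Create an inference session, preferring an NPU/GPU provider when present."""
    available = ort.get_available_providers()
    # Preference order is an assumption for this sketch: an NPU-capable provider
    # first, then DirectML (GPU on Windows), then CPU as the universal fallback.
    preferred = ["VitisAIExecutionProvider", "DmlExecutionProvider", "CPUExecutionProvider"]
    providers = [p for p in preferred if p in available] or ["CPUExecutionProvider"]
    return ort.InferenceSession(model_path, providers=providers)

# Hypothetical usage (model path and input name are placeholders):
# session = make_session("llama3-8b-int8.onnx")
# outputs = session.run(None, {"input_ids": input_ids})
```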

Why It Matters

Local inferencing improves latency, preserves privacy, and reduces cloud dependencies. In regulated industries and latency-critical workflows, this is a step-function improvement.

Ecosystem Challenges

  • Quantized model availability is still thin.
  • ROCm integration into PyTorch/ONNX toolchains is ongoing.
  • AMD’s tooling for model optimization lacks the maturity of NVIDIA’s TensorRT or Apple’s CoreML.

Competitive Positioning

Platform | NPU TOPS (INT8) | Architecture | Ecosystem Openness | Primary OS
Ryzen AI 300 | 34 | x86 + XDNA 2 | High (ROCm, ONNX) | Windows, Linux
Apple M4 | ~38 | ARM + CoreML NPU | Low (CoreML only) | macOS, iOS
Snapdragon X | ~45 | ARM + Hexagon DSP | Medium | Windows, Android

Ryzen AI PCs position AMD as the open x86 alternative to Apple’s silicon dominance in local AI workflows.


3. Threadripper PRO 9000 & Radeon AI PRO R9700: Workstation-Class AI Development

The Technology

Threadripper PRO 9000 (“Shimada Peak”):

  • 96 Zen 5 cores / 192 threads
  • 8-channel DDR5 ECC memory, up to 4TB
  • 128 PCIe 5.0 lanes
  • AMD PRO Security (SEV-SNP, memory encryption)

Radeon AI PRO R9700:

  • 1,500+ TOPS (INT4)
  • 32GB GDDR6
  • ROCm-native backend for ONNX and PyTorch

This pairing provides a serious platform for AI fine-tuning, quantization, and even training of small LLMs.
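
For developers sizing up this pairing, ROCm builds of PyTorch reuse the torch.cuda API surface, so device selection looks like the sketch below; it assumes a ROCm-enabled PyTorch install, and the placeholder model stands in for a real fine-tuning or inference workload.

```python
import torch

# On ROCm builds of PyTorch, the torch.cuda API targets the AMD GPU (HIP backend).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Running on:", torch.cuda.get_device_name(0) if device.type == "cuda" else "CPU")

# Placeholder workload: in practice this would be a quantized LLM or a fine-tuning job.
model = torch.nn.Linear(4096, 4096).to(device)
x = torch.randn(8, 4096, device=device)
with torch.no_grad():
    y = model(x)
print(tuple(y.shape))
```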

Why It Matters

This workstation tier offers an escape hatch from expensive cloud runtimes. For developers, AI researchers, and enterprise teams, it enables:

  • Local, iterative model tuning
  • Predictable hardware costs
  • Privacy-first workflows (especially in defense, healthcare, and legal)

Trade-offs

ROCm continues to trail CUDA in terms of ecosystem depth and performance tuning. While AMD offers competitive raw throughput, software maturity—especially for frameworks like JAX or Triton—is still catching up.

Competitive Analysis

Metric | TR PRO 9000 + R9700 | NVIDIA RTX 6000 Ada
CPU Cores | 96 (Zen 5) | N/A
GPU AI Perf (INT4) | ~1,500 TOPS | ~1,700 TOPS
VRAM | 32GB GDDR6 | 48GB GDDR6 ECC
Ecosystem Support | ROCm (moderate) | CUDA (mature)
Distributed Training | ❌ (limited) | ✅ (via NVLink)
Local LLM Inference | ✅ (8B–13B) | ✅

AMD’s strength lies in performance-per-dollar and data locality. For small-to-mid-sized models, it offers near-cloud throughput on your desktop.


Final Thoughts: Decentralized Intelligence is the New Normal

COMPUTEX 2025 made one thing clear: the future of compute is not just faster—it’s closer. AMD’s platform strategy shifts the emphasis from scale to locality:

  • From cloud inferencing to on-device AI
  • From GPU farms to quantized workstations
  • From centralized render clusters to ML-accelerated game engines

With open software stacks, power-efficient inference, and maturing hardware, AMD positions itself as a viable counterweight to NVIDIA and Apple in the edge-AI era.

For engineering leaders and CTOs, this represents an inflection point. The question is no longer “When will AI arrive on the edge?” It’s already here. The next question is: What will you build with it?

Arm at COMPUTEX 2025: A Strategic Inflection Point for AI Everywhere

Executive Summary

Chris Bergey, Senior Vice President and General Manager of the Client Line of Business at ARM, delivered a keynote at COMPUTEX 2025 on May 20th, 2025 that framed the current era as a historic inflection point in computing—one where AI is no longer an idea but a force, reshaping everything from cloud infrastructure to edge devices. The presentation outlined ARM’s strategic positioning in this new landscape, emphasizing three core pillars: ubiquitous platform reach, world-leading performance-per-watt, and a powerful developer ecosystem.

Bergey argued that the exponential growth in AI workloads—both in scale and diversity—demands a fundamental rethinking of compute architecture. He positioned ARM not just as a CPU IP provider but as a full-stack platform company delivering optimized, scalable solutions from data centers to wearables. Key themes included the shift from training to inference, the rise of on-device AI, and the growing importance of power efficiency across all form factors.

The talk also featured panel discussions with Kevin Deierling (NVIDIA) and Adam King (MediaTek), offering perspectives on technical constraints, innovation vectors, and the role of partnerships in accelerating AI adoption.


Three Critical Takeaways

1. AI Inference Is Now the Economic Engine—Not Training

Technical Explanation

Bergey distinguished between the computational cost of model training vs. inference, highlighting that while training requires enormous compute (on the order of 10^25–10^26 FLOPs), inference—though far less intensive (~10^14–10^15 FLOPs per query)—scales with usage volume. For example, if each web search used a large language model, ten days’ worth of inference compute could equal one day of training compute.
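
Using the orders of magnitude quoted above, the break-even arithmetic is easy to reproduce; the daily query volume below is an assumption chosen purely to illustrate how usage scale closes the gap.

```python
TRAINING_FLOPS = 1e25              # lower end of the quoted 10^25–10^26 training budget
INFERENCE_FLOPS_PER_QUERY = 1e15   # upper end of the quoted per-query inference cost
QUERIES_PER_DAY = 1e9              # assumed daily query volume, for illustration only

daily_inference = INFERENCE_FLOPS_PER_QUERY * QUERIES_PER_DAY
days_to_match_training = TRAINING_FLOPS / daily_inference
print(f"Daily inference compute: {daily_inference:.1e} FLOPs")
print(f"Days of inference to equal one training run: {days_to_match_training:.0f}")
```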

This implies a shift in focus: monetization stems not from model creation, but from scalable deployment of efficient inference engines across mobile, wearable, and embedded platforms.

Critical Assessment

This framing aligns with current trends. While companies like NVIDIA continue optimizing training clusters, the greater opportunity lies in edge inference, where latency, power, and throughput are paramount. However, the keynote underplays the complexity of model compression, quantization, and hardware/software co-design, which are critical for deployment at scale.

ARM’s V9 architecture and Scalable Matrix Extensions (SME) are promising for accelerating AI workloads in the CPU pipeline, potentially reducing reliance on NPUs or GPUs—a differentiator in cost- and thermally-constrained environments.

Competitive/Strategic Context

  • x86 Alternatives: Intel and AMD dominate traditional markets but lag ARM in performance-per-watt. Apple’s M-series SoCs, based on ARM, demonstrate clear efficiency gains.
  • Custom Silicon: Hyperscalers like AWS (Graviton), Google (Axion), and Microsoft (Cobalt) increasingly favor ARM-based silicon, citing up to 40% efficiency improvements.
  • Edge NPU Trade-offs: RISC-V vendors and Qualcomm’s Hexagon push AI logic off-core, whereas ARM integrates it into the CPU, improving software portability but trading off peak throughput.

Quantitative Support

  • Over 50% of new AWS CPU capacity since 2023 is ARM-based (Graviton).
  • ARM-based platforms account for over 40% of 2025 PC/tablet shipments.
  • SME and NEON extensions yield up to 4x ML kernel acceleration without dedicated accelerators.

2. On-Device AI Is Now Table Stakes

Technical Explanation

Bergey emphasized that on-device AI is becoming the norm, driven by privacy, latency, and offline capability needs. Use cases include coding assistants, chatbots, and real-time inference in industrial systems.

ARM showcased its client roadmap, including:

  • Travis CPU: Next-gen core with IPC improvements and enhanced SME.
  • Draga GPU: Advanced ray tracing and sustained mobile graphics.
  • ARM Accuracy Super Resolution (AASR): AI upscaling previously limited to consoles, now on mobile.

Critical Assessment

On-device AI is architecturally sound for privacy-sensitive or latency-critical apps. Yet, memory and thermal constraints remain obstacles for large model execution on mobile SoCs. ARM’s strategy of enhancing general-purpose cores aids flexibility, though specialized NPUs still offer superior throughput for vision or speech applications.

While ARM’s developer base (22 million) is substantial, toolchain fragmentation and driver inconsistencies complicate cross-platform integration.

Competitive/Strategic Context

  • Apple ANE: Proprietary and tightly integrated but closed.
  • Qualcomm Hexagon: Strong in multimedia pipelines but hampered by software issues.
  • Google Edge TPU: Power-efficient but limited in scope.

ARM’s open licensing and platform breadth support broad AI enablement, from Chromebooks to premium devices.

Quantitative Support

  • MediaTek’s Kompanio Ultra delivers 50 TOPS AI performance on ARM V9.
  • Travis + Draga enables 1080p upscaling from 540p, achieving console-level mobile graphics.

3. Taiwan as the Nexus of AI Hardware Innovation

Technical Explanation

Bergey emphasized Taiwan’s pivotal role in AI hardware: board design, SoC packaging, and advanced fab technologies. ARM collaborates with MediaTek, ASUS, and TSMC—all crucial for AI scalability.

He highlighted the DGX Spark platform, which pairs a 20-core ARM V9 CPU with an NVIDIA Blackwell GPU in the GB10 superchip, delivering petaflop-class AI compute in a compact system.

Critical Assessment

Taiwan excels in advanced packaging (e.g., CoWoS) and silicon scaling. But geopolitical risks could impact production continuity. ARM’s integration with Taiwanese partners is a strategic strength, yet resilience planning remains essential.

DGX Spark is a compelling proof-of-concept, though mainstream adoption may be constrained by power and cost considerations, especially outside research or high-end enterprise.

Competitive/Strategic Context

  • U.S. Foundries: Lag in packaging tech; TSMC leads sub-5nm.
  • China: Investing heavily but remains tool-dependent.
  • Europe: Focused on sustainable compute but lacks vertical integration.

ARM’s neutral IP model facilitates global partnerships despite geopolitical tensions.

Quantitative Support

  • Taiwan expects 8x data center power growth, from megawatts to gigawatts.
  • DGX Spark packs 1 petaflop compute into a desktop form factor.

Conclusion

ARM’s COMPUTEX 2025 keynote presented a strategic vision for a future where AI is ubiquitous and ARM is foundational. From hyperscale to wearable, ARM aims to lead through performance-per-watt, platform coverage, and ecosystem scale.

Challenges persist: model optimization, power efficiency, and political risk. Still, ARM’s trajectory suggests it could define the next computing era—not just through CPUs, but as a full-stack enabler of AI.

For CTOs and architects planning future compute stacks, ARM’s approach offers compelling value, especially where scalability, energy efficiency, and developer reach take precedence over peak raw performance.

Microsoft Build 2025: A Platform Shift for the Agentic Web

Executive Summary

Satya Nadella’s opening keynote at Microsoft Build 2025, on May 20th, 2025, painted a comprehensive vision of the evolving developer landscape, centered around what Microsoft calls the agentic web—a system architecture where autonomous AI agents interact with digital interfaces and other agents using standardized protocols. This shift treats AI agents as first-class citizens in software development and business processes.

This is not just an incremental evolution of existing tools but a transformation that spans infrastructure, tooling, platforms, and applications. While Microsoft presents this as a full-stack transformation, practical maturity across the stack remains uneven—particularly in orchestration and security.

The central thesis was clear: Microsoft is positioning itself as the enabler of this agentic future, offering developers a unified ecosystem from edge to cloud, with open standards like MCP (Model Context Protocol) at its core.

This blog post distills three critical takeaways that represent the most impactful innovations and strategic moves presented at the event.


Critical Takeaway 1: GitHub Copilot Evolves into a Full-Stack Coding Agent

Technical Explanation

GitHub Copilot has evolved beyond code completion and chat-based assistance into a full-fledged coding agent capable of autonomous task execution. Developers can now assign issues directly to Copilot, which will generate pull requests, triage bugs, refactor code, and even modernize legacy applications (e.g., Java 8 → Java 21). These features are currently in preview.

It integrates with GitHub Actions and supports isolated branches for secure operations. While there is discussion of MCP server configurations in future integrations, public documentation remains limited.

Microsoft has also open-sourced the integration scaffolding of Copilot within VS Code, enabling community-driven extensions, though the underlying model remains proprietary.
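
The assignment workflow rides on GitHub’s existing issues API; the sketch below uses the standard REST endpoint for adding assignees to an issue, with the repository, token, and the exact assignee login used to route work to the Copilot agent all treated as assumptions (the shipping product flow may differ).

```python
import os
import requests

def assign_issue(owner: str, repo: str, issue_number: int, assignee: str) -> dict:
    """Add an assignee to an issue via GitHub's REST API."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{issue_number}/assignees"
    headers = {
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    }
    resp = requests.post(url, headers=headers, json={"assignees": [assignee]}, timeout=30)
    resp.raise_for_status()
    return resp.json()

# Hypothetical usage; the login that routes work to the Copilot agent is an assumption.
# assign_issue("my-org", "my-repo", 123, "copilot")
```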

Critical Assessment

This represents a major leap forward in developer productivity. By treating AI not as a passive assistant but as a peer programmer, Microsoft is redefining how developers interact with IDEs. However, the effectiveness of such agents depends heavily on the quality of training data, token handling capacity, and context-awareness.

Potential limitations include:

  • Context fidelity: Can the agent maintain state and intent across large codebases given current token limits?
  • Security and auditability: Transparency around sandboxing and trace logs is essential.
  • Developer trust: Adoption hinges on explainability and safe fallback mechanisms.

Competitive/Strategic Context

Competitors like Amazon CodeWhisperer and Tabnine offer similar capabilities but lack GitHub’s deep DevOps integration. Tabnine emphasizes client-side privacy, while CodeWhisperer leverages AWS IAM roles but offers limited CI/CD interaction.

Feature | GitHub Copilot Agent | Amazon CodeWhisperer | Tabnine
Autonomous PR generation | ✅ | ❌ | ❌
Integration with CI/CD | ✅ | Limited | ❌
Open-sourced in editor | Partial | ❌ | ✅ (partial)
Multi-agent orchestration | Planned | ❌ | ❌

Quantitative Support

  • GitHub Copilot has over 15 million users.
  • Over 1 million agents have been built using Microsoft 365 Copilot and Teams.
  • Autonomous SRE agents reportedly reduce incident resolution time by up to 40%.

Critical Takeaway 2: Azure AI Foundry as the App Server for the Agentic Era

Technical Explanation

Azure AI Foundry is positioned as the app server for the next generation of AI applications—analogous to how Java EE or .NET once abstracted deployment and lifecycle management of distributed applications.

Key features:

  • Multi-model support: 1,900+ models including GPT-4o, Mistral, Grok, and open-source variants.
  • Agent orchestration: Enables deterministic workflows with reasoning agents.
  • Observability: Built-in monitoring, evals, tracing, and cost tracking.
  • Hybrid deployment: Supports cloud-to-edge and sovereign deployments.

Foundry includes a model router that automatically selects models based on latency, performance, and cost, reducing operational overhead.
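
As a sketch of what calling such a routed deployment might look like from application code, the snippet below posts a chat-completions request to an OpenAI-compatible Azure endpoint; the endpoint URL, deployment name, and API version are placeholders, not documented Foundry values.

```python
import os
import requests

ENDPOINT = "https://example-foundry-resource.openai.azure.com"  # placeholder resource
DEPLOYMENT = "model-router"                                      # placeholder deployment name

def chat(prompt: str) -> str:
    """Send a chat-completions request to an OpenAI-compatible Azure deployment."""
    url = f"{ENDPOINT}/openai/deployments/{DEPLOYMENT}/chat/completions?api-version=2024-06-01"
    headers = {"api-key": os.environ["AZURE_API_KEY"], "Content-Type": "application/json"}
    body = {"messages": [{"role": "user", "content": prompt}], "max_tokens": 256}
    resp = requests.post(url, headers=headers, json=body, timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# print(chat("Summarize today's incident reports."))
```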

Critical Assessment

Foundry addresses the lack of a standardized app server for stateful, multi-agent systems. Its enterprise-grade reliability is particularly appealing to organizations already invested in Azure.

Still, complexity remains. Building distributed intelligent agents demands robust coordination logic, long-term memory handling, and fault-tolerant execution—all areas that require ongoing refinement.

Competitive/Strategic Context

AWS Bedrock and Google Vertex AI offer model hosting and inference APIs, but Azure Foundry differentiates through full lifecycle support and tighter integration with agentic paradigms. Support for open protocols like MCP also enhances portability and neutrality.

Capability | Azure AI Foundry | AWS Bedrock | Google Vertex AI
Multi-agent orchestration | ✅ | Limited | ❌
Model routing | ✅ | ❌ | ❌
Memory & RAG integration | ✅ | Limited | ❌
MCP support | ✅ | ❌ | ❌

Quantitative Support

  • Over 70,000 organizations use Foundry.
  • In Q1 2025, Foundry processed more than 100 trillion tokens (5x YoY growth).
  • Stanford Medicine reduced tumor board prep time by 60% using Foundry-based agents.

Critical Takeaway 3: The Rise of the Agentic Web with MCP and NLWeb

Technical Explanation

Microsoft is building an open agentic web anchored by:

  • MCP (Model Context Protocol): A lightweight, HTTP-style protocol for secure, interoperable agent-to-service communication. A native MCP registry is being integrated into Windows to allow secure exposure of system functionality to agents. Public availability is currently limited to early preview.
  • NLWeb: A framework that enables websites and APIs to expose structured knowledge and actions to agents, functioning like OpenAPI or HTML for agentic interaction. Implementation requires explicit markup and wrappers.

Together, these technologies support a decentralized, interoperable agent ecosystem.
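
To make the MCP piece concrete: it is a JSON-RPC 2.0-style protocol, so a tool invocation reduces to a small structured message. The sketch below hand-builds one such request; the tool name and arguments are hypothetical, and real clients would use an MCP SDK and a negotiated transport rather than raw dictionaries.

```python
import itertools
import json

_ids = itertools.count(1)

def mcp_tool_call(tool_name: str, arguments: dict) -> str:
    """Build an MCP-style 'tools/call' request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)

# Hypothetical tool exposed by a travel site via NLWeb/MCP:
print(mcp_tool_call("search_hotels", {"city": "Taipei", "nights": 2}))
```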

Critical Assessment

MCP solves the critical problem of safe, permissioned access to tools by agents. NLWeb democratizes agentic capabilities for web developers without deep ML expertise.

Challenges include:

  • Standardization: Broad adoption of MCP beyond Microsoft is still nascent.
  • Security: Risk of misuse via overly permissive interfaces.
  • Performance: Real-time agentic calls could introduce latency bottlenecks.

Competitive/Strategic Context

LangChain and MetaGPT offer agent orchestration but lack the web-scale interoperability MCP/NLWeb target. Microsoft’s emphasis on open composition is reminiscent of the REST API revolution.

Feature | MCP + NLWeb | LangChain Tooling | MetaGPT
Web composability | ✅ | ❌ | ❌
Interoperability | ✅ | Limited | Proprietary
Open source | ✅ | ✅ | ✅
Security model | OS-integrated | Manual | Manual

Quantitative Support

  • Windows MCP registry enables discovery of system-level agents (files, settings, etc.).
  • Partners like TripAdvisor and O’Reilly are early adopters of NLWeb.
  • NLWeb supports embeddings, RAG, and Azure Cognitive Search integration.

Conclusion

Microsoft Build 2025 marked a definitive pivot toward the agentic web, where AI agents are not just tools but collaborators in software, science, and operations. Microsoft is betting heavily on open standards like MCP and NLWeb while reinforcing its dominance in developer tooling with GitHub Copilot and Azure AI Foundry.

For CTOs and architects, the message is clear: the future of software is agentic, and Microsoft aims to be the platform of choice. The success of this vision depends on Microsoft’s ability to balance openness with control and to build trust across the developer ecosystem.

The tools are now in place—and the race is on.

Jensen Huang’s COMPUTEX 2025 Keynote: A Technical Deep Dive into the Future of AI Infrastructure

Executive Summary

In his keynote at COMPUTEX 2025 on May 19th, 2025, NVIDIA CEO Jensen Huang outlined a detailed roadmap for the next phase of computing, positioning artificial intelligence as a new foundational infrastructure layer—on par with electricity and the internet. Rather than focusing on individual product SKUs, Huang presented NVIDIA as the platform provider for enterprises, industries, and nations building sovereign, scalable AI systems.

Central to this vision is the replacement of traditional data centers with “AI factories”—integrated computational systems designed to generate intelligence in the form of tokens. Huang introduced key architectural advancements including the Grace Blackwell GB300 NVL72 system, next-generation NVLink and NVSwitch fabrics, and the strategic open-sourcing of Isaac GR00T, a foundational robotics agent model.

This post dissects the three most technically significant announcements from the keynote, with a focus on implications for system architects, CTOs, and principal engineers shaping next-generation AI infrastructure.


1. The GB300 NVL72 System: Scaling AI Factories with Rack-Scale Integration

Technical Overview

The Grace Blackwell GB300 NVL72 system represents a fundamental rethinking of rack-scale AI infrastructure. Each rack contains 72 B300 GPUs and 36 Grace CPUs in a liquid-cooled configuration, delivering up to 1.4 exaflops (FP4) of AI performance. Notable improvements over the H100/H200 era include:

  • ~4× increase in LLM training throughput
  • Up to 30× boost in real-time inference throughput
  • 192 GB of HBM3e per GPU (2.4× increase over H100)
  • 5th-generation NVLink with 1.8 TB/s per GPU of bidirectional bandwidth

A 4th-generation NVSwitch fabric provides 130 TB/s of all-to-all, non-blocking bandwidth across the 72 GPUs, enabling a unified memory space at rack scale. The system operates within a 120 kW power envelope, necessitating liquid cooling and modernized power distribution infrastructure.

Architectural Implications

The GB300 NVL72 exemplifies scale-up design: high-bandwidth, tightly coupled components acting as a single compute unit. This architecture excels at training and inference tasks requiring massive memory coherence and fast interconnects.

However, scale-out—distributing computation across multiple racks—remains bottlenecked by inter-rack latency and synchronization challenges. NVIDIA appears to be standardizing the NVL72 as a modular “AI factory block,” favoring depth of integration over breadth of distribution.

The thermal and electrical demands are also transformative. 120 kW per rack mandates direct-to-chip liquid cooling, challenging legacy data center design norms.

Strategic and Competitive Context

Feature / Vendor | NVIDIA GB300 NVL72 | AMD MI300X Platform | Google TPU v5p | Intel Gaudi 3
Primary Interconnect | 5th-Gen NVLink + NVSwitch (1.8 TB/s/GPU) | Infinity Fabric + PCIe 5.0 | ICI + Optical Circuit Switch | 24× 200 GbE RoCE per accelerator
Scale-Up Architecture | Unified 72-GPU coherent fabric | 8-GPU coherent node | 4096-chip homogeneous pods | Ethernet-based scale-out
Programming Ecosystem | CUDA, cuDNN, TensorRT | ROCm, HIP | JAX, XLA, PyTorch | SynapseAI, PyTorch, TensorFlow
Key Differentiator | Best-in-class scale-up performance | Open standards, cost-effective | Extreme scale-out efficiency | Ethernet-native, open integration

Quantitative Highlights

  • Performance Density: A single 120 kW GB300 NVL72 rack (1.4 EFLOPS FP4) approaches the compute capability of the 21 MW Frontier supercomputer (1.1 EFLOPS FP64), yielding over 150× higher performance-per-watt, though with different numerical precision (see the arithmetic sketch after this list).
  • Fabric Bandwidth: At 130 TB/s, NVSwitch bandwidth within a rack exceeds peak estimated global internet backbone traffic.
  • Power Efficiency: Estimated at 25–30 GFLOPS/Watt (FP8), reflecting architectural and process node advances.
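
Here is the arithmetic behind that performance-per-watt comparison, using only the figures quoted in the bullets above (and ignoring the FP4-versus-FP64 precision mismatch already flagged):

```python
GB300_FLOPS, GB300_POWER_W = 1.4e18, 120e3        # 1.4 EFLOPS FP4 in a 120 kW rack
FRONTIER_FLOPS, FRONTIER_POWER_W = 1.1e18, 21e6   # 1.1 EFLOPS FP64 at 21 MW

gb300_per_watt = GB300_FLOPS / GB300_POWER_W
frontier_per_watt = FRONTIER_FLOPS / FRONTIER_POWER_W
print(f"GB300 NVL72: {gb300_per_watt:.2e} FLOPS/W")
print(f"Frontier:    {frontier_per_watt:.2e} FLOPS/W")
print(f"Ratio: ~{gb300_per_watt / frontier_per_watt:.0f}x")  # comfortably above the 150x cited
```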

2. NVLink-C2C: Opening the Fabric to a Semi-Custom Ecosystem

Technical Overview

NVIDIA announced NVLink-C2C (Chip-to-Chip), a new initiative to allow third-party silicon to participate natively in the NVLink fabric. Three key integration paths are available:

  1. Licensed IP Blocks: Partners embed NVLink IP in their own SoCs, ASICs, or FPGAs.
  2. Bridge Chiplets: Chiplet-based bridges allow legacy designs to connect without redesigning core logic.
  3. Unified Memory Semantics: Ensures full coherence between NVIDIA GPUs and partner accelerators or I/O devices.

This enables hybrid system architectures where NVIDIA GPUs operate alongside custom silicon—such as domain-specific accelerators, DPUs, or real-time signal processors—in a shared memory space.

Strategic Assessment

NVLink-C2C is a strategic counter to open standards like CXL and UCIe. By enabling heterogeneity within its own high-performance ecosystem, NVIDIA retains control while expanding use cases.

Success depends on:

  • Partner ROI: Justifying the cost and engineering complexity of proprietary IP over CXL’s openness.
  • Tooling & Validation: Supporting cross-vendor debug, trace, and profiling tools.
  • Performance Guarantees: Ensuring third-party devices do not introduce latency or stall high-bandwidth links.

This move also repositions NVIDIA’s interconnect fabric as the system backplane, shifting the focus from CPUs and PCIe roots to GPUs and NVLink hubs.

Ecosystem Comparison

Interconnect Standard | NVLink-C2C | CXL | UCIe
Use Case | GPU-accelerated chiplet/silicon cohesion | CPU-to-device memory expansion | Die-to-die physical interface for chiplets
Coherence Model | Full hardware coherence | CXL.cache and CXL.mem | Protocol-agnostic
Governance | Proprietary (NVIDIA) | Open consortium | Open consortium
Strategic Goal | GPU-centric heterogeneous integration | Broad heterogeneity and ecosystem access | Chiplet disaggregation across vendors

Confirmed partners: MediaTek, Broadcom, Cadence, Synopsys.


3. Isaac GR00T and the Rise of Physical AI

Technical Overview

Huang identified a strategic shift toward embodied AI—autonomous agents that operate in the physical world. NVIDIA’s stack includes:

  • Isaac GR00T (Generalist Robot 00 Technology): A robotics foundation model trained on multimodal demonstrations—text, video, and simulation. Designed to be robot-agnostic.
  • Isaac Lab & Omniverse Sim: A highly parallelized simulation environment for training and validating policies via reinforcement learning and sim-to-real pipelines.
  • Generative Simulation: AI-generated synthetic data and environments, reducing dependence on real-world data collection.

Together, these components define a full-stack, simulation-first approach to training robotics agents.
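
A minimal sketch of the simulation-first training loop described above, written against the generic Gymnasium API rather than Isaac Lab itself; the environment id, random policy, and the reseeding stand-in for domain randomization are all placeholders.

```python
import gymnasium as gym

def train(env_id: str = "Pendulum-v1", episodes: int = 3) -> None:
    """Toy rollout loop; a real pipeline would run thousands of randomized
    environments in parallel and update a learned policy between rollouts."""
    env = gym.make(env_id)
    for ep in range(episodes):
        obs, info = env.reset(seed=ep)  # reseeding stands in for domain randomization
        total_reward, done = 0.0, False
        while not done:
            action = env.action_space.sample()  # placeholder for the learned policy
            obs, reward, terminated, truncated, info = env.step(action)
            total_reward += reward
            done = terminated or truncated
        print(f"episode {ep}: return {total_reward:.1f}")
    env.close()

if __name__ == "__main__":
    train()
```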

Challenges and Opportunities

While simulation fidelity continues to improve, the sim-to-real gap remains the key barrier. Discrepancies in dynamics, perception noise, and actuator behavior can derail even well-trained policies.

Other critical considerations:

  • Safety and Alignment: Embodied AI introduces physical risk; rigorous validation and fail-safe mechanisms are mandatory.
  • Fleet Orchestration: Deploying, updating, and monitoring robots in real-world environments requires industrial-grade orchestration platforms.
  • Edge Compute Requirements: Real-time control necessitates high-performance, low-latency hardware—hence NVIDIA’s positioning of Jetson Thor as the robotics edge brain.

Competitive Landscape

Company / Platform | NVIDIA Isaac | Boston Dynamics | Tesla Optimus | Open Source (ROS/ROS 2)
AI Approach | Foundation model + sim-to-real | Classical control + RL | End-to-end neural (vision-to-actuation) | Modular, limited AI integration
Simulation | Omniverse + Isaac Lab | Proprietary | Proprietary | Gazebo, Webots
Business Model | Horizontal platform + silicon | Vertically integrated hardware | In-house for vehicle automation | Community-led, vendor-neutral

Strategic Implications for Technology Leaders

1. Re-Architect the Data Center for AI Factory Workloads

  • Plan for 120 kW/rack deployments, with liquid cooling and revamped power infrastructure.
  • Network performance is system performance: fabrics like NVSwitch must be part of core architecture.
  • Talent pipeline must now blend HPC, MLOps, thermal, and hardware engineering.

2. Engage in Heterogeneous Compute—But Know the Tradeoffs

  • NVLink-C2C offers deep integration but comes at the cost of proprietary lock-in.
  • CXL and UCIe remain credible alternatives—balance performance against openness and cost.

3. Prepare for Digital-Physical AI Convergence

  • Orchestration frameworks must span cloud, edge, and robotic endpoints.
  • Edge inferencing and data pipelines need tight integration with simulation and training platforms.
  • Robotics will demand security, safety, and compliance architectures akin to automotive-grade systems.

Conclusion

Jensen Huang’s COMPUTEX 2025 keynote declared the end of general-purpose computing as the default paradigm. In its place: AI-specific infrastructure spanning silicon, system fabrics, and simulation environments. NVIDIA is building a full-stack platform to dominate this new era—from rack-scale AI factories to embodied agents operating in the physical world.

But this vision hinges on a proprietary ecosystem. The counterweights—open standards, cost-conscious buyers, and potential regulatory scrutiny—will define whether NVIDIA’s walled garden becomes the new industry blueprint, or a high-performance outlier amid a more modular and open computing future.

For CTOs, architects, and engineering leaders: the choice is not just technical—it is strategic. Infrastructure decisions made today will determine whether you’re building on granite or sand in the coming decade of generative and physical AI.