Lexar Aims to Shift Local AI Models to SSDs in Response to the RAM Shortage

Lexar’s Innovative Approach to AI Storage: Redefining Local AI Performance

As artificial intelligence becomes increasingly integrated into personal computing, hardware requirements are evolving rapidly. Lexar, a leader in storage technology, is pioneering new solutions to address the growing demand for efficient, high-performance local AI processing. By leveraging advanced NAND Flash storage, Lexar aims to reduce reliance on expensive DRAM, making powerful AI capabilities more accessible to everyday users.

Reducing DRAM Dependency with AI-Optimized SSDs

Traditionally, running large language models (LLMs) locally has required significant DRAM resources, often making such setups cost-prohibitive for most consumers. According to Lexar’s Chief Technical Officer, Daniel Guo, DRAM is approximately six times more expensive to manufacture than NAND Flash. This cost disparity has driven Lexar to develop storage solutions that offload some of the memory demand from DRAM to more affordable NAND Flash, without sacrificing performance.

The Lexar AI Storage Core SSD is at the heart of this effort. By letting AI models use SSD storage for memory-intensive operations, Lexar’s technology allows larger and more sophisticated LLMs to run on standard PC builds. According to Lexar, its internal testing shows that this approach can reduce the memory footprint by at least 40%, making advanced AI workloads feasible on consumer hardware.
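To make the general idea of storage offloading concrete, here is a minimal sketch that keeps a toy model's weights in a file and reads only one layer into RAM at a time. It illustrates the broad technique only and is not Lexar's implementation: the file layout, the function names (`write_weight_file`, `read_layer`, `forward_from_storage`), and the toy tanh layers are all assumptions made for this example.

```python
import array
import math
import random

# Size in bytes of one single-precision float as stored by the array module.
FLOAT_BYTES = array.array("f").itemsize


def write_weight_file(path, num_layers, dim, seed=0):
    """Write num_layers hypothetical dim x dim float32 matrices to one flat file."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)
    with open(path, "wb") as f:
        for _ in range(num_layers):
            layer = array.array(
                "f", (rng.gauss(0.0, scale) for _ in range(dim * dim))
            )
            layer.tofile(f)


def read_layer(f, index, dim):
    """Read a single layer from storage into RAM as a flat float32 array."""
    f.seek(index * dim * dim * FLOAT_BYTES)
    layer = array.array("f")
    layer.fromfile(f, dim * dim)
    return layer


def forward_from_storage(path, num_layers, dim, x):
    """Run a toy forward pass while holding only one layer in RAM at a time."""
    with open(path, "rb") as f:
        for i in range(num_layers):
            w = read_layer(f, i, dim)  # the previous layer becomes garbage here
            x = [
                math.tanh(sum(w[r * dim + c] * x[c] for c in range(dim)))
                for r in range(dim)
            ]
    return x
```

Calling `write_weight_file("weights.bin", 3, 4)` and then `forward_from_storage("weights.bin", 3, 4, [1.0] * 4)` returns a four-element output vector. The trade-off is the one the article goes on to describe: peak RAM use is bounded by a single layer, but every layer costs a storage read.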

Performance Benchmarks: Running Qwen 3.5 122B Locally

Lexar’s internal benchmarks highlight the potential of this technology. Running the Qwen 3.5 122B AI model on a local PC has traditionally required a system with a high-end CPU and 128 GB of DRAM, costing around $4,500. With Lexar’s AI suite and the AI Storage Core SSD, the DRAM requirement drops to just 32 GB. In this configuration, a 35-billion-parameter model runs at 15.6 tokens per second, three times the 5.2 tokens per second that traditional frameworks achieve.

When attempting to load the 122B model on 32 GB of DRAM using conventional methods, the process fails due to insufficient memory. In contrast, Lexar’s SSD offloading enables the model to run at 4.4 tokens per second. With a larger 64 GB DRAM setup, the system can handle larger context windows, but only with SSD offloading. For example, with a 256K token context, the Lexar AI suite achieves approximately 19.3 tokens per second, which is not possible with traditional configurations.
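The ratios behind these benchmark figures are easy to check. The short sketch below uses only numbers quoted above; the helper names `speedup` and `reduction_pct` are made up for this example.

```python
def speedup(new_tokens_per_s, old_tokens_per_s):
    """Ratio of the new throughput to the old one."""
    return new_tokens_per_s / old_tokens_per_s


def reduction_pct(before, after):
    """Percentage drop from 'before' to 'after'."""
    return 100.0 * (before - after) / before


# Throughput with Lexar's suite vs. traditional frameworks (tokens per second).
print(round(speedup(15.6, 5.2), 2))  # 3.0

# DRAM requirement for Qwen 3.5 122B: 128 GB conventionally vs. 32 GB with offloading.
print(reduction_pct(128, 32))  # 75.0
```

The 75% drop in required DRAM for this particular model is consistent with Lexar's broader claim of a reduction of at least 40%.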

It’s important to note that while SSD offloading enables larger models, it does introduce increased system latency. The time to first token (TTFT) is about two seconds with a 2K context window, rising to 6–8 seconds at 4K. Offloading even larger models, such as those with 400 billion parameters, is technically possible but results in slower performance and higher latency. For some users, investing in additional DRAM may still be preferable, but Lexar’s approach offers a compelling alternative for many scenarios.
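A rough way to see what this latency means in practice is to add the time to first token to the time needed to generate the rest of a reply. The sketch below assumes a constant generation rate, and it pairs the 2-second figure with the 4.4 tokens per second reported for the 122B model on 32 GB of DRAM. That pairing is an assumption, since Lexar's figures do not state which model the latency was measured on. The function name `response_time_s` and the 200-token reply length are also illustrative.

```python
def response_time_s(ttft_s, tokens_per_s, output_tokens):
    """Estimated wall-clock time for a reply: first-token wait plus generation.

    Assumes a constant generation rate, which real systems only approximate.
    """
    return ttft_s + output_tokens / tokens_per_s


# Illustrative inputs: 2 s to first token, 4.4 tokens/s, a 200-token reply.
print(round(response_time_s(2.0, 4.4, 200), 1))  # 47.5
```

Under these assumptions generation time dominates: the wait for the first token is about 2 of roughly 47 seconds, so throughput matters more than first-token latency for long replies.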

Hardware Innovation: Hot-Swappable AI SSDs for Mini-PCs and Desktops

At Computex 2026, Lexar showcased a forward-thinking concept for Mini-PCs and desktops. The design features a front-panel M.2 slot engineered for frequent insertions, allowing users to hot-swap SSDs encased in a metal jacket. These SSDs connect directly to the processor or chipset via the M.2 interface, minimizing overhead and maximizing data throughput.

This hot-swappable SSD solution is available in both PCIe Gen 4 and Gen 5 versions, with the latter offering increased bandwidth for demanding AI workloads. At the core of the device is Lexar’s custom Storage Processing Unit (SPU) DRAM-less controller, which provides precise control over data movement and further reduces dependency on DRAM.

Enabling the Next Generation of Local AI Computing

Lexar’s advancements in AI-optimized storage represent a significant step forward for local AI computing. By offloading memory-intensive tasks to high-speed SSDs, users can run larger and more complex AI models on standard hardware, reducing costs and expanding access to cutting-edge AI capabilities. While there are still trade-offs in terms of latency and model compatibility, Lexar’s innovative approach is shaping the future of AI-ready PCs and Mini-PCs.

Maya Stein
Maya is a tech journalist who specializes in PC hardware and semiconductor industry trends. When she’s not writing, she’s building custom PCs and testing the latest peripherals.