I think there may be a misunderstanding about the reported memory bandwidth on the Beelink GTR9 Pro (Ryzen AI Max+ 395 / Strix Halo) with 128GB LPDDR5-8000.
Based on the screenshots and benchmarks mentioned, the system actually appears to be operating normally rather than being stuck in a “1:4 divider” mode.
- CPU-Z DRAM Frequency (998 MHz)
CPU-Z reporting around 997–1000 MHz is expected for LPDDR5-8000.
LPDDR5 uses an internal prefetch architecture, so the DRAM frequency displayed in CPU-Z is roughly ⅛ of the effective data rate.
Example calculation:
DRAM Frequency ≈ 998 MHz
Effective data rate ≈ 998 × 8 ≈ 8000 MT/s
So the memory appears to be running exactly at the rated LPDDR5-8000 speed.
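The arithmetic above can be checked in a couple of lines (the ~1:8 clock-to-data-rate ratio is the relationship described above; the figures are the ones from the screenshots):

```python
# Relating CPU-Z's reported DRAM clock to the LPDDR5 effective data rate.
dram_freq_mhz = 998                      # clock as shown by CPU-Z
effective_mt_s = dram_freq_mhz * 8       # ~1:8 ratio for LPDDR5 prefetch
print(effective_mt_s)                    # 7984 MT/s, i.e. rated LPDDR5-8000
```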
- Theoretical vs Measured Bandwidth
Strix Halo has a 256-bit LPDDR5 memory interface, which gives a theoretical maximum bandwidth of:
8000 MT/s × 256-bit ÷ 8 = 256 GB/s
However, this figure represents the total SoC memory bandwidth available to the GPU, NPU, and CPU combined.
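As a quick sanity check, the peak-bandwidth formula works out like this (bus width and data rate are the Strix Halo figures quoted above):

```python
# Theoretical peak bandwidth for a 256-bit LPDDR5-8000 interface.
data_rate_mt_s = 8000
bus_width_bits = 256
peak_gb_s = data_rate_mt_s * bus_width_bits / 8 / 1000  # bits -> bytes, MB -> GB
print(peak_gb_s)                                        # 256.0 GB/s
```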
CPU-based benchmarks such as:
- WinSAT
- AIDA64 CPU memory tests
- CPU-Z
typically measure CPU core → memory bandwidth only, which is limited by the Infinity Fabric and the CPU's memory ports.
On modern LPDDR systems, measured CPU bandwidth is usually much lower than the theoretical peak.
Typical real-world examples:
- Meteor Lake (LPDDR5X-7500): theoretical 120 GB/s, CPU benchmarks 60–70 GB/s
- Phoenix (LPDDR5-6400): theoretical 102 GB/s, CPU benchmarks 55–65 GB/s
Given this, a WinSAT result of 72 GB/s on LPDDR5-8000 is actually within a reasonable range for CPU-side memory bandwidth.
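Putting the quoted figures side by side makes the pattern easier to see (the numbers below are the examples from this post, using the midpoints of the quoted ranges, not new measurements):

```python
# Measured CPU-side bandwidth as a fraction of each SoC's theoretical peak.
examples = {
    "Strix Halo (LPDDR5-8000, WinSAT)": (72, 256),
    "Meteor Lake (LPDDR5X-7500)":       (65, 120),  # midpoint of 60-70 GB/s
    "Phoenix (LPDDR5-6400)":            (60, 102),  # midpoint of 55-65 GB/s
}
for name, (measured, peak) in examples.items():
    print(f"{name}: {measured / peak:.0%} of theoretical peak")
```

Note that the Strix Halo ratio is lower than the others, which is what you would expect: the 256-bit bus is twice as wide, so the CPU cores alone are even less able to saturate it.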
- “UCLK 1:4 divider” assumption
The assumption that the system is stuck in a UCLK 1:4 divider mode based on CPU-Z’s “uncore frequency” is likely incorrect.
LPDDR systems do not behave the same way as desktop DDR5 platforms. Much of the memory PHY and clocking logic is integrated inside the SoC, and tools like CPU-Z do not always report those clocks accurately.
Therefore, interpreting the 998 MHz value as a divider state is probably misleading.
- LLM inference performance
For reference, running Qwen 27B Q8 on CPU alone typically produces something like 5–10 tokens/s even on high-end CPUs, so the reported inference speed is not unusual if the model is running primarily on CPU.
To fully utilize the available memory bandwidth on Strix Halo, inference should ideally use GPU offload (RDNA iGPU) rather than CPU-only execution.
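A rough way to see why bandwidth matters here: for a dense model, a common rule of thumb is that every weight must be read once per generated token, so memory bandwidth divided by model size gives an approximate tokens/s ceiling. This is a back-of-the-envelope sketch using the figures discussed above (Q8 ≈ 1 byte per weight is an assumption):

```python
# Rough bandwidth-bound upper limit on tokens/s for a dense model,
# assuming all weights are streamed once per token (rule of thumb only;
# real throughput also depends on compute, caching, and KV-cache traffic).
model_size_gb = 27          # Qwen 27B at Q8, assuming ~1 byte per weight
cpu_bandwidth_gb_s = 72     # the WinSAT CPU-side figure discussed above
soc_bandwidth_gb_s = 256    # theoretical full-SoC peak

print(cpu_bandwidth_gb_s / model_size_gb)   # ~2.7 tok/s ceiling, CPU-side
print(soc_bandwidth_gb_s / model_size_gb)   # ~9.5 tok/s ceiling at full peak
```

The gap between the two ceilings is exactly why GPU offload is needed to exploit the full 256 GB/s on this platform.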
- Conclusion
Based on the data shown:
- LPDDR5-8000 appears to be running at the correct speed
- The 72 GB/s bandwidth result is plausible for a CPU-side benchmark
- The "1:4 divider" interpretation is likely a misunderstanding of how LPDDR clocking is reported
It would still be useful to run additional benchmarks (e.g., AIDA64 memory test or y-cruncher) to confirm the full memory performance of the system.