HP has built what it calls its most powerful Windows AI PC ever — the ZGX Fury GB300. It is a deskside workstation aimed at AI engineers and enterprise teams who need data-center-level compute without renting a cloud server. On paper, the specs are extraordinary. In practice, it is a very specific machine for a very specific type of user.
What It Is
The ZGX Fury GB300 runs on NVIDIA’s GB300 Grace Blackwell Ultra Desktop Superchip. That is not a graphics card dropped into a standard tower. It is a tightly integrated CPU-GPU combo — one 72-core ARM Neoverse V2 CPU paired with a single Blackwell Ultra (B300) GPU — all sharing 748GB of coherent unified memory. The CPU side holds 496GB of LPDDR5X; the GPU side holds 252GB of HBM3e. Both pools communicate over NVLink-C2C at 900GB/s bidirectional bandwidth.
To put that number in context: most desktop GPUs top out at 24–80GB of VRAM. This machine has roughly nine times the upper end of that range, addressable as a single coherent memory space. A model that fits needs no sharding across multiple GPUs and no manual offloading to host memory.
Most Important Capability: Trillion-Parameter Inference
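The headline feature is local inference on models of up to a trillion parameters. That ceiling depends on precision: at FP16 or BF16, a trillion parameters needs about 2TB for the weights alone, well beyond 748GB, so the trillion-parameter figure is realistic only with 4-bit formats such as NVFP4. The sweet spot HP and NVIDIA target is 400B+ parameter models, the class of AI just below the absolute frontier. For most teams, running a 70B model locally is already routine. Running a 405B model on a single workstation, without cloud latency, is new territory for a box that sits on your desk.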
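The arithmetic behind these model sizes is easy to check. The sketch below multiplies parameter count by bytes per parameter and compares the result with the 748GB pool. It counts weights only, so KV cache, activations, and runtime overhead come on top, and the function names are illustrative rather than taken from any HP or NVIDIA tool.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Weights only: KV cache, activations, and runtime overhead are NOT included.

UNIFIED_MEMORY_GB = 748  # 496 GB LPDDR5X + 252 GB HBM3e, per the article

# Storage cost per parameter for common precisions (bytes).
BYTES_PER_PARAM = {"FP16/BF16": 2.0, "FP8": 1.0, "FP4": 0.5}


def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Weight footprint in GB (1 GB = 1e9 bytes)."""
    # billions of parameters * bytes per parameter = gigabytes
    return params_billions * bytes_per_param


def fits(params_billions: float, precision: str,
         budget_gb: float = UNIFIED_MEMORY_GB) -> bool:
    """True if the weights alone fit within the memory budget."""
    return weights_gb(params_billions, BYTES_PER_PARAM[precision]) <= budget_gb


for size in (70, 405, 1000):
    for precision, bpp in BYTES_PER_PARAM.items():
        verdict = "fits" if fits(size, precision) else "does not fit"
        print(f"{size}B @ {precision}: {weights_gb(size, bpp):.0f} GB -> {verdict}")
```

Under these assumptions, a 405B model at FP16 needs 810GB for weights, slightly more than the 748GB pool, while the same model at FP8 (405GB) and a trillion-parameter model at FP4 (500GB) both fit.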
AI compute peaks at 20 PFLOPS in NVFP4/FP4 precision, which covers quantized inference workloads. For training and fine-tuning at higher precision, throughput will be lower. The 8 TB/s of HBM3e bandwidth keeps data movement from becoming the bottleneck for working sets that fit in the GPU's 252GB; anything that spills into the LPDDR5X pool is reached over the slower 900GB/s NVLink-C2C link.
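Memory bandwidth matters because single-stream text generation on a dense model typically reads every weight once per generated token, so bandwidth divided by weight size gives a rough ceiling on tokens per second. The sketch below applies that rule of thumb to the 8 TB/s figure. It assumes the weights sit entirely in HBM3e and ignores KV-cache traffic and compute time, so real throughput will be lower. The function name is illustrative.

```python
# Rule-of-thumb ceiling for single-stream decoding on a dense model:
#   tokens/s <= memory bandwidth / weight bytes
# Assumption: all weights live in HBM3e and are read once per token.

HBM_BANDWIDTH_GB_S = 8000.0  # 8 TB/s, per the article
HBM_CAPACITY_GB = 252.0      # GPU-side HBM3e pool, per the article


def decode_ceiling_tokens_per_s(params_billions: float,
                                bytes_per_param: float) -> float:
    """Upper bound on tokens/s; raises if the weights exceed HBM3e capacity."""
    weights_gb = params_billions * bytes_per_param
    if weights_gb > HBM_CAPACITY_GB:
        raise ValueError(
            f"{weights_gb:.0f} GB of weights exceeds "
            f"{HBM_CAPACITY_GB:.0f} GB of HBM3e"
        )
    return HBM_BANDWIDTH_GB_S / weights_gb


for label, size_b, bpp in [("70B @ FP16", 70, 2.0), ("405B @ FP4", 405, 0.5)]:
    rate = decode_ceiling_tokens_per_s(size_b, bpp)
    print(f"{label}: at most ~{rate:.0f} tokens/s")
```

Batching several requests together amortizes each weight read across more tokens, which is why aggregate throughput can exceed this single-stream ceiling.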
Networking Is Not an Afterthought
Built into the system is a dual-port NVIDIA ConnectX-8 SuperNIC running at 400Gb/s per port via QSFP connectors. This is not a consumer Ethernet upgrade. It is designed for teams that chain multiple AI systems together or need low-latency access to shared storage and data pipelines. If you are running standalone inference on one machine, you will not use most of this bandwidth. If you are part of a larger infrastructure, it means the ZGX Fury GB300 can keep up.
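To get a feel for what 400Gb/s per port means in practice, the sketch below estimates how long it would take to pull a large checkpoint across the link. The 405GB example size is hypothetical (a 405B model at one byte per parameter), and the `efficiency` parameter is an assumed stand-in for protocol and storage overhead, not a measured value.

```python
# Best-case time to move a file over the built-in NIC.
# 400 Gb/s per port = 50 GB/s per port (8 bits per byte).

def transfer_seconds(size_gb: float,
                     port_gbit_s: float = 400.0,
                     ports: int = 1,
                     efficiency: float = 1.0) -> float:
    """Seconds to move size_gb gigabytes.

    efficiency is the assumed fraction of line rate actually achieved
    (1.0 = ideal); storage speed on either end is not modeled.
    """
    if ports < 1:
        raise ValueError("ports must be at least 1")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    usable_gb_s = port_gbit_s / 8 * ports * efficiency
    return size_gb / usable_gb_s


checkpoint_gb = 405.0  # hypothetical checkpoint size
print(f"one port:  {transfer_seconds(checkpoint_gb):.1f} s")
print(f"two ports: {transfer_seconds(checkpoint_gb, ports=2):.1f} s")
```

At ideal line rate, that hypothetical checkpoint moves in about eight seconds on one port, so in practice the storage system on the other end is more likely to be the limit than the NIC.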
Power and Form Factor
The system draws power from a single 1600W 80 PLUS Gold ATX supply, though the GPU can pull up to 1800W through 12V-2×6 connectors. That is a meaningful electrical load, so check the capacity of the circuit in your office or lab before ordering. The chassis follows NVIDIA’s DGX Station architecture and is designed for deskside placement, not rack mounting. It runs Windows, which is either practical (familiar software stack, enterprise IT compatibility) or limiting, depending on your workflow.
Who Should Actually Buy This
The target is clear: AI engineers prototyping, fine-tuning, or running inference on large models who want to do it locally rather than on cloud infrastructure. The reasons to go local vary: latency, data privacy, recurring cloud cost, or simply faster iteration cycles. For a team spending heavily on GPU cloud instances for LLM work, a single ZGX Fury GB300 could pay for itself over time.
If your models are under 70B parameters, this is overkill. A high-end workstation GPU handles that range fine, usually with some quantization at the top of the range. The GB300’s memory advantage only becomes decisive when you need to hold hundreds of billions of parameters in active memory at once without heavy compression.
What We Do Not Know Yet
HP has not published a retail price. Given the hardware — this is DGX Station-class compute on a Windows platform — expect the cost to land well above standard workstation pricing. Enterprise procurement channels are the likely path to purchase.
Quick Verdict: The HP ZGX Fury GB300 is purpose-built for one job — running massive AI models locally — and it does that job better than anything else currently available in a deskside form factor. If your workload genuinely demands 400B+ parameter models, this machine removes the main obstacle. For everyone else, it is impressive but not practical.