
Tether AI Runs Billion-Parameter Models on Smartphones with QVAC Fabric

📅 2026-03-17 21:59 · 0xSammy · Artificial Intelligence · 3 min read · 2784 words · Score: 94
Tether AI · QVAC Fabric · BitNet · LoRA · On-device AI
📌 One-line summary: Tether AI's newly released QVAC Fabric framework uses BitNet LoRA techniques to enable training and inference of billion-parameter AI models directly on consumer GPUs and smartphones, paving the way for fully private, offline personal AI.

📝 Detailed summary: This post highlights the major breakthrough in Tether AI's release of QVAC Fabric, which integrates the world's first cross-platform BitNet LoRA framework. The innovation allows billion-parameter AI models to be trained and run efficiently on consumer GPUs and smartphones, including flagship devices such as the iPhone 16. The tweet quoted by Tether's CTO explains BitNet's 1-bit architecture and LoRA's par…

Title: Tether AI Enables Billion-Parameter Models on Smartphones...

URL Source: https://www.bestblogs.dev/status/2033906190451687723

Published Time: 2026-03-17 13:59:58

Tether AI breakthrough: the Tether AI team has just released a new version of QVAC Fabric that includes the world's first cross-platform BitNet LoRA framework, enabling billion-parameter AI training and inference on consumer GPUs and smartphones.

Background

Microsoft's BitNet uses a 1-bit architecture to dramatically compress models.

Traditional LLMs operate on full-precision computation, where weights are stored as high-precision floating-point numbers. BitNet's innovation is to shrink these weights into a tiny ternary range of only -1, 0, and 1, significantly reducing memory usage and computation.
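To make the idea concrete, here is an illustrative absmean-style ternary quantization in the spirit of BitNet b1.58; this is a sketch for intuition only, not Tether's or Microsoft's actual implementation.

```python
def quantize_ternary(weights):
    # Scale by the mean absolute value, then round each weight into {-1, 0, 1}.
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale

q, s = quantize_ternary([0.8, -0.05, -1.2, 0.3])
# Each ternary weight costs ~1.58 bits to store instead of 16 or 32.
```

Large weights saturate at -1 or 1 and small ones collapse to 0; the per-tensor scale preserves the overall magnitude at matmul time.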

LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning technique that reduces the number of trainable parameters by up to ninety-nine percent.
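A back-of-the-envelope check makes the "up to ninety-nine percent" figure plausible. The layer size and rank below are illustrative numbers chosen by us, not taken from the announcement: a full d_in × d_out weight update is replaced by a rank-r pair of adapter matrices A (r × d_in) and B (d_out × r).

```python
def lora_savings(d_in, d_out, rank):
    full = d_in * d_out             # trainable params for a full weight update
    lora = rank * (d_in + d_out)    # trainable params for the LoRA adapter
    return full, lora, 1 - lora / full

full, lora, saved = lora_savings(4096, 4096, 8)
# A 4096x4096 layer at rank 8: ~16.8M -> ~65.5K trainable params, ~99.6% fewer.
```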

Together they slash memory and compute requirements. Yet BitNet has mostly been limited to CPU and CUDA (NVIDIA) backends, and lacked support for LoRA fine-tuning.

Enter QVAC Fabric: the unlock

Today, with QVAC Fabric LLM, BitNet LoRA fine-tuning and inference work cross-platform for the first time, across GPU vendors and operating systems, using Vulkan and Metal backends.

That means support for AMD, Intel, and Apple (Metal) GPUs, as well as mobile GPUs.
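The Vulkan/Metal split above could be dispatched roughly like this. The function and backend names here are hypothetical assumptions for illustration, not QVAC Fabric's actual API.

```python
import platform

def pick_gpu_backend():
    # Apple platforms expose GPUs through Metal; most other desktop and
    # Android GPUs (AMD, Intel, NVIDIA, mobile) are reachable through Vulkan.
    if platform.system() == "Darwin":
        return "metal"
    return "vulkan"

backend = pick_gpu_backend()
```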

And for the first time ever, BitNet inference runs efficiently on smartphones using mobile GPUs.

On flagship devices, GPU inference is 2 to 11 times faster than CPU inference while using up to 90% less memory than full-precision models.
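The memory figure is easy to sanity-check. The arithmetic below is ours, not the announcement's: storing 1B weights at ~1.58 bits each versus 16-bit full precision lands almost exactly at a 90% reduction.

```python
def model_weight_gb(n_params, bits_per_weight):
    # Weight storage only; activations and KV cache are not counted here.
    return n_params * bits_per_weight / 8 / 1e9

fp16 = model_weight_gb(1_000_000_000, 16)       # 2.0 GB
ternary = model_weight_gb(1_000_000_000, 1.58)  # ~0.2 GB
reduction = 1 - ternary / fp16                  # ~0.90, matching the claim
```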

The biggest unlock: QVAC Fabric LLM support for BitNet LoRA fine-tuning on heterogeneous GPUs. Our team demonstrated this by fine-tuning models of up to 3.8 billion parameters on flagship phones such as the Pixel 9, S25, and iPhone 16, and models of up to 13 billion parameters on the iPhone 16.
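Conceptually, combining the two techniques means a frozen ternary base layer plus a small trainable full-precision adapter, so fine-tuning only ever updates the tiny A and B matrices. The pure-Python sketch below illustrates that idea with hypothetical shapes; it is not QVAC Fabric's code.

```python
def bitnet_lora_forward(x, q_weight, scale, lora_a, lora_b):
    # Frozen base: ternary {-1,0,1} matmul, rescaled by the per-layer scale.
    base = [scale * sum(q * xi for q, xi in zip(row, x)) for row in q_weight]
    # Trainable adapter: x -> A (rank x d_in) -> B (d_out x rank).
    hidden = [sum(a * xi for a, xi in zip(row, x)) for row in lora_a]
    delta = [sum(b * h for b, h in zip(row, hidden)) for row in lora_b]
    return [base_i + d for base_i, d in zip(base, delta)]

y = bitnet_lora_forward(
    x=[1.0, 2.0],
    q_weight=[[1, -1], [0, 1]], scale=0.5,        # frozen ternary weights
    lora_a=[[0.1, 0.0]], lora_b=[[1.0], [0.0]],   # rank-1 adapter
)
```

Because the base weights stay quantized and frozen, on-device fine-tuning only needs gradients and optimizer state for the adapter, which is what makes training feasible on phone-class memory budgets.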

GitHub repositories:

github.com/tetherto/qvac-… : general QVAC Fabric codebase

github.com/tetherto/qvac-… : QVAC Fabric's BitNet knowledge base, architecture docs, and pre-built binaries

What does it mean?

What used to require dedicated GPUs now runs on consumer hardware.

This breakthrough is the first real-world signal of a local private AI that can truly serve the people.

And this is just the beginning.

In the coming months and years, Tether will continue to invest significant resources and capital into researching and developing open-source intelligence that can scale and evolve on local devices, providing maximum utility and privacy to its users.

The era of Stable Intelligence has just begun.

Free as in freedom.

Published: 2026-03-17 21:59:58 · Indexed: 2026-03-18 00:00:42
