Apple’s new Foundation Models explained: on-device AI, cloud AI, and everything in between - 9to5Mac

During the WWDC26 keynote, Apple announced its third generation of Apple Foundation Models (AFM). The lineup comprises five models: some run locally, some run in the cloud, and one lives on Google’s servers, running on Nvidia chips. Here’s a breakdown of how that will work.
When Apple first announced its foundation models in 2024, the lineup included an on-device language model with roughly 3 billion parameters, and “a larger server-based language model available with Private Cloud Compute and running on Apple silicon servers,” as the company put it at the time.
Private Cloud Compute was an ambitious undertaking, as it aimed to deliver cloud-based AI capabilities while preserving the same privacy guarantees users expect from on-device processing.
For this reason, keeping everything in-house was essential. Private Cloud Compute ran in Apple data centers, on servers powered by Apple silicon. Even so, its privacy guarantees could be independently verified by third-party security researchers.
However, as Apple struggled to get its AI aspirations off the ground, the company partnered with Google to use Gemini as the backbone of its new AI efforts, the results of which it announced earlier this week during the WWDC26 keynote.
The third generation of AFMs includes five models: AFM 3 Core and AFM 3 Core Advanced, which are on-device models, and AFM 3 Cloud, ADM 3 Cloud (Image), and AFM 3 Cloud Pro, which are server-based. The D in ADM 3 Cloud (Image) stands for diffusion, a technology we’ve covered in the past.
With the exception of AFM 3 Cloud Pro, all of the models were built to run on Apple silicon. AFM 3 Cloud Pro, meanwhile, runs on Nvidia GPUs hosted in Google Cloud.
This was made possible after Apple extended its Private Cloud Compute architecture to third-party infrastructure for the first time, “while maintaining Apple’s powerful security and privacy protections,” according to the company.
As for the models themselves, here’s a breakdown of each one, as explained by Apple:
The highlights here are AFM 3 Core Advanced and AFM 3 Cloud Pro.
AFM 3 Core Advanced packs 20 billion parameters into an on-device model, which is no small feat. Most on-device models aimed at the general public tend to stay in the low-single-digit billions of parameters.
To make AFM 3 Core Advanced run well, Apple used a sparse architecture that activates up to 4 billion parameters at a time, depending on the prompt, rather than a dense architecture that would need to keep all 20 billion parameters active for every request.
Although conceptually similar to the Mixture of Experts approach, this selective activation relies on a technique Apple invented and detailed in the study “Instruction-Following Pruning for Large Language Models,” released a year ago.
As for AFM 3 Cloud Pro, this is the one that runs on external infrastructure. You can read some of the technical details of this expansion in an article published on Apple’s Security blog earlier this week, but here’s the most important part:
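To make the idea of prompt-dependent sparsity more concrete, here is a toy sketch in Python. It is not Apple’s implementation, and it does not reproduce the method from the study. The layer sizes, the random weights, and the helper names `pick_active_units` and `feed_forward` are all made up for illustration. The only figures taken from the article are the 20 billion total and 4 billion active parameters, which are used to set the share of units kept per prompt.

```python
import random

TOTAL_PARAMS_B = 20   # total parameters cited in the article, in billions
ACTIVE_PARAMS_B = 4   # upper bound on active parameters per request, in billions


def pick_active_units(prompt_features, selector_weights, budget):
    """Score every hidden unit against the prompt and keep the top `budget` units.

    Hypothetical stand-in for a learned mask predictor.
    """
    scores = [
        sum(w * f for w, f in zip(row, prompt_features))
        for row in selector_weights
    ]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


def feed_forward(x, w_in, w_out, active_units=None):
    """Toy ReLU feed-forward layer with a scalar output.

    When `active_units` is given, only those hidden units are computed;
    every other unit is skipped entirely.
    """
    units = range(len(w_in)) if active_units is None else active_units
    total = 0.0
    for i in units:
        pre_activation = sum(w * v for w, v in zip(w_in[i], x))
        total += w_out[i] * max(0.0, pre_activation)
    return total


# Illustrative sizes and random weights (assumptions, not real model data).
rng = random.Random(0)
dim, hidden = 4, 10
w_in = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden)]
w_out = [rng.uniform(-1, 1) for _ in range(hidden)]
selector = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden)]

prompt = [0.5, -0.2, 0.1, 0.9]

# Keep the same share of units as the article's 4-of-20 ratio: 2 of 10 here.
budget = hidden * ACTIVE_PARAMS_B // TOTAL_PARAMS_B
active = pick_active_units(prompt, selector, budget)

dense_out = feed_forward(prompt, w_in, w_out)
sparse_out = feed_forward(prompt, w_in, w_out, active)
print(f"active units: {active} ({len(active)} of {hidden})")
print(f"dense output: {dense_out:.4f}, sparse output: {sparse_out:.4f}")
```

In this sketch, a different prompt can produce a different set of active units, while the work done per request stays capped at the same budget. That is the general trade-off the article describes: the full set of weights is stored, but only a prompt-selected slice does the work.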
On this foundation, Apple and Google collaborated to build capabilities that go far beyond a traditional confidential computing deployment:
In its Machine Learning Research blog, Apple says that all five models “shared a common initial foundation before specializing for their respective architectures and use cases, adding multimodal capabilities like audio, image understanding, long-context reasoning, and high-quality visual generation.”
The company adds that, to train these models, it used “a mixture of data that includes publicly available information, data licensed or purchased from third parties, open-sourced data, data obtained through dedicated studies, and synthetic data.” Apple also stresses that the training process did not include user data or interactions and that web publishers can opt out of foundation model training.
Apple says it conducted extensive human evaluations of its third-generation foundation models, with in-house reviewers grading responses across categories such as instruction following, truthfulness, presentation, and image understanding.
Models were evaluated against their predecessors (when applicable), and you can see some of the results below:
Fraction of preferred responses in side-by-side human evaluations of general text capabilities, comparing AFM 3 Core and AFM 3 Cloud against our previous generation of models. Results are presented across four distinct locale groups to demonstrate consistent performance across international variants. “English” represents our global English evaluation set, while “PFIGSCJK”, “DNNSTV” and “AFIHHMPRTU” represent our remaining supported global locales.
Fraction of preferred responses in side-by-side human evaluations of image understanding capabilities in English. The results compare AFM 3 Core and AFM 3 Cloud against their 2025 predecessors.
Fraction of preferred responses in side-by-side human evaluations for dictation tasks. The results compare AFM 3 Core Advanced against Apple’s existing production dictation system across seven quality dimensions. AFM 3 Core Advanced demonstrates a positive win rate in overall quality, with preference extending consistently across all individual formatting and comprehension dimensions.
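For readers unfamiliar with the metric, here is a minimal sketch of how a “fraction of preferred responses” can be computed from side-by-side judgments. The tallies below are invented purely for illustration and are not Apple’s results. The three labels (`new`, `tie`, `old`) and the function name `preference_fractions` are also assumptions, since the article does not say how ties were handled.

```python
from collections import Counter

LABELS = ("new", "tie", "old")


def preference_fractions(judgments):
    """Return the share of side-by-side judgments that fall under each label."""
    if not judgments:
        raise ValueError("at least one judgment is required")
    counts = Counter(judgments)
    total = len(judgments)
    return {label: counts[label] / total for label in LABELS}


# Made-up example: 10 graded prompts, each comparing a new model to its predecessor.
judgments = ["new"] * 6 + ["tie"] * 3 + ["old"] * 1
fractions = preference_fractions(judgments)

for label in LABELS:
    print(f"{label}: {fractions[label]:.0%}")
```

With these invented tallies, the new model is preferred in 6 of 10 comparisons, which is the kind of figure each bar in such a chart would represent.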
For an even deeper dive into the third-gen Apple Foundation Models, follow this link.
Marcus Mendes is a Brazilian tech podcaster and journalist who has been closely following Apple since the mid-2000s.
He began covering Apple news in Brazilian media in 2012 and later broadened his focus to the wider tech industry, hosting a daily podcast for seven years.
