In theory, the Rabbit R1 sounds like the next generation of computing. In his presentation at CES in January, Lyu posited a future in which AI uses apps so you don’t have to. He demonstrated what Rabbit calls the “Large Action Model,” or LAM, which learns how to navigate apps similarly to how large language models (LLMs) like ChatGPT learn English. He demoed how users would be able to “teach” the model to navigate interfaces, enabling it to turn your words into actions. For instance, he showed off how the LAM could use the Midjourney AI image creator, which would otherwise require opening Discord, navigating to the Midjourney server, and typing a prompt.
Of course, the R1 was also promised to be multimodal, meaning that it could process not only speech and text but also photos, videos, and other media. Lyu demoed how pointing the R1 at a computer screen would allow the device to give you contextual information or how you could use it to catalog items in a refrigerator. These capabilities were already available in other AIs, but combined with the LAM, they made it seem as though Rabbit had created something truly new.
All of that was great in theory, but until people could actually get their hands on a Rabbit R1, it was all vaporware. How the public would react when Rabbit’s paws hit the ground was another story, one that began with a somewhat strange launch event in New York City.