Meta unveils a new large language model that can run on a single GPU

[Illustration credit: Benj Edwards / Ars Technica]

On Friday, Meta announced a new AI-powered large language model (LLM) called LLaMA-13B that it claims can outperform OpenAI’s GPT-3 model despite being “10x smaller.” Smaller AI models could make it possible to run ChatGPT-style language assistants locally on devices such as PCs and smartphones. LLaMA-13B is part of a new family of language models called “Large Language Model Meta AI,” or LLaMA for short.

The models in the LLaMA collection range from 7 billion to 65 billion parameters. By comparison, OpenAI’s GPT-3 model—the foundational model behind ChatGPT—has 175 billion parameters.
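Parameter count matters for local use mainly because of memory: each weight must be held in GPU RAM. A rough sketch of the arithmetic (these are illustrative estimates based only on parameter counts and common numeric precisions, not figures from Meta) shows why a 13-billion-parameter model is plausible on a single high-end GPU while a 175-billion-parameter model is not:

```python
def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to store the model weights.

    Ignores activations, KV cache, and framework overhead, so real
    requirements are somewhat higher.
    """
    return params_billions * 1e9 * bytes_per_param / 1e9


# bytes_per_param: 2 for 16-bit floats, 0.5 for 4-bit quantization
for name, params in [("LLaMA-7B", 7), ("LLaMA-13B", 13), ("LLaMA-65B", 65), ("GPT-3", 175)]:
    fp16 = weight_memory_gb(params, 2)
    int4 = weight_memory_gb(params, 0.5)
    print(f"{name}: ~{fp16:.0f} GB at fp16, ~{int4:.1f} GB at 4-bit")
```

By this estimate, LLaMA-13B needs roughly 26 GB at 16-bit precision—within reach of a single data-center GPU—and under 7 GB if quantized to 4 bits, which is consumer-hardware territory. GPT-3's 175 billion parameters need about 350 GB at the same precision, forcing multi-GPU setups.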

Meta trained its LLaMA models on publicly available datasets such as Common Crawl, Wikipedia, and C4, which means the firm could potentially release both the model and its weights as open source. That’s a dramatic development in an industry where, until now, the Big Tech players in the AI race have kept their most powerful AI technology to themselves.


