DeepMind says its new language model can beat others 25 times its size

Called RETRO (for “Retrieval-Enhanced Transformer”), the AI matches the performance of neural networks 25 times its size by pairing a relatively small model with an external database of text passages that it consults as it generates, cutting the time and cost needed to train very large models. The researchers also claim that this database makes it easier to analyze what the AI has learned, which could help with filtering out bias and toxic language.

“Being able to look things up on the fly instead of having to memorize everything can often be useful, as it is for humans,” says Jack Rae at DeepMind, who leads the firm’s language research.
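To make the “look things up on the fly” idea concrete, here is a minimal sketch of retrieval augmentation in general, not DeepMind’s actual RETRO implementation: a prompt is scored against a small passage database (here by simple word overlap, standing in for real neural retrieval) and the best matches are prepended before the model generates. The function name, scoring rule, and mini-database are all invented for illustration.

```python
# Toy sketch of retrieval-augmented generation (not RETRO's real method):
# score a small passage "database" by word overlap with the prompt and
# prepend the best matches before generating.

def retrieve_passages(prompt, database, k=2):
    """Return the k passages sharing the most words with the prompt (toy scoring)."""
    prompt_words = set(prompt.lower().split())
    scored = sorted(
        database,
        key=lambda passage: len(prompt_words & set(passage.lower().split())),
        reverse=True,
    )
    return scored[:k]

# Hypothetical mini-database standing in for RETRO's vast text collection.
database = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Photosynthesis converts sunlight into chemical energy in plants.",
    "Paris is the capital of France.",
]

prompt = "When was the Eiffel Tower in Paris completed?"
context = retrieve_passages(prompt, database)
augmented_prompt = " ".join(context) + "\n" + prompt
print(augmented_prompt)  # The language model would condition on this augmented input.
```

The point of the design is that facts live in the database rather than in the network’s parameters, so the network itself can stay much smaller.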

Language models generate text by predicting what words come next in a sentence or conversation. The larger a model, the more information about the world it can learn during training, which makes its predictions better. GPT-3 has 175 billion parameters—the values in a neural network that store data and get adjusted as the model learns. Microsoft’s Megatron-Turing language model has 530 billion parameters. But large models also take vast amounts of computing power to train, putting them out of reach of all but the richest organizations.
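As a much-simplified illustration of “predicting what words come next,” the sketch below builds a toy bigram model by counting word pairs in a tiny made-up corpus and predicting the most frequent follower. Real models like GPT-3 replace this count table with billions of learned parameters, but the prediction step is the same in spirit; the corpus and function names here are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy bigram "language model": count which word follows which in a tiny corpus,
# then predict the next word as the most frequent follower.
corpus = "the cat sat on the mat and the cat slept".split()

followers = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    followers[word][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat" (seen twice after "the", vs. "mat" once)
print(predict_next("cat"))  # -> "sat" or "slept" (a tie; Counter keeps first-seen order)
```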

With RETRO, DeepMind has tried to cut the costs of training without cutting how much the AI learns. The researchers trained the model on a vast data set of news articles, Wikipedia pages, books, and text from GitHub, an online code repository. The data set contains text in 10 languages, including English, Spanish, German, French, Russian, Chinese, Swahili, and Urdu.
