Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

This paper has been accepted at the Data Problems for Foundation Models workshop at ICLR 2024.

Large language models are trained on massive scrapes of the web, which are often unstructured, noisy, and poorly phrased. Current scaling laws show that learning from such data requires an abundance of both compute and data, which grows with the size of the model being trained. This is infeasible both because of the large compute costs and duration associated with pre-training, and because of the impending scarcity of high-quality data on the web. In this work, we propose Web Rephrase Augmented Pre-training (WRAP), which uses an off-the-shelf instruction-tuned model prompted to paraphrase documents on the web in specific styles such as “like Wikipedia” or in “question-answer format” to jointly pre-train LLMs on real and synthetic rephrases. First, we show that using WRAP on the C4 dataset, which is naturally noisy, speeds up pre-training by ~3x. At the same pre-training compute budget, it improves perplexity by more than 10% on average across different subsets of the Pile, and improves zero-shot question-answering accuracy across 13 tasks by more than 2%. Second, we investigate the impact of the rephrasing style on the performance of the model, offering insights into how the composition of the training data can affect the performance of LLMs in out-of-distribution (OOD) settings. Our gains are attributed to the fact that rephrased synthetic data (i) incorporates style diversity that closely reflects downstream evaluation style, and (ii) has higher “quality” than web-scraped data.
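The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt wording, the function names (`build_prompt`, `rephrase_corpus`, `mixed_stream`), and the `real_fraction` mixing parameter are all assumptions made for illustration, and the rephraser is passed in as a plain callable standing in for whatever instruction-tuned model is used. Only the two style names come from the abstract.

```python
import random
from typing import Callable, List

# Hypothetical instructions; only the style names ("like Wikipedia",
# "question-answer format") are taken from the abstract.
STYLE_PROMPTS = {
    "wikipedia": "Paraphrase the following text in high-quality English like Wikipedia:",
    "qa": "Convert the following text into question-answer format:",
}


def build_prompt(document: str, style: str) -> str:
    """Prepend the style instruction to a web document."""
    return f"{STYLE_PROMPTS[style]}\n\n{document}"


def rephrase_corpus(
    documents: List[str],
    rephraser: Callable[[str], str],
    style: str,
) -> List[str]:
    """Run every document through the rephraser (a stand-in for an
    instruction-tuned model) and collect the synthetic rephrases."""
    return [rephraser(build_prompt(doc, style)) for doc in documents]


def mixed_stream(
    real: List[str],
    synthetic: List[str],
    n_samples: int,
    real_fraction: float = 0.5,  # assumed ratio, not a value from the paper
    seed: int = 0,
) -> List[str]:
    """Sample a training stream that draws from the real corpus with
    probability `real_fraction` and from the synthetic corpus otherwise."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n_samples):
        pool = real if rng.random() < real_fraction else synthetic
        stream.append(rng.choice(pool))
    return stream


if __name__ == "__main__":
    web_docs = ["cheap shoes!!! click here 4 best price", "our team won 3-2 lol"]
    # Toy rephraser so the sketch runs without a model.
    toy_rephraser = lambda prompt: "REPHRASED: " + prompt.splitlines()[-1]
    synthetic_docs = rephrase_corpus(web_docs, toy_rephraser, style="wikipedia")
    for sample in mixed_stream(web_docs, synthetic_docs, n_samples=4):
        print(sample)
```

Keeping the rephraser as a callable means the same code works with any model backend, and the explicit mixing step reflects the abstract's point that pre-training uses real and synthetic text jointly rather than synthetic text alone.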
