Meta is giving away some of the family jewels: That's the gist of an announcement from the company formerly known as Facebook this week. In a blog post on the Meta AI site, the company's researchers announced that they've created a massive and powerful language AI system and are making it available free to all researchers in the artificial-intelligence community. Meta describes the move as an effort to democratize access to a powerful kind of AI, but some argue that not very many researchers will actually benefit from this largesse. And even as these models become more accessible to researchers, many questions remain about the path to commercial use.
Large language models are one of the hottest things in AI right now. Models like OpenAI's GPT-3 can generate remarkably fluid and coherent text in just about any format or style: They can write convincing news articles, legal summaries, poems, and advertising copy, or hold up their end of a conversation as customer-service chatbots or video-game characters. GPT-3, which broke the mold with its 175 billion parameters, is available to academic and commercial entities only via OpenAI's application and vetting process.
Meta's Open Pretrained Transformer (known as OPT-175B) matches GPT-3 with 175 billion parameters of its own. Meta is offering the research community not only the model itself but also its codebase and extensive notes and logbooks about the training process. The model was trained on 800 gigabytes of data from five publicly available data sets, which are described in the "data card" that accompanies a technical paper posted by the Meta researchers to the arXiv online preprint server.
Joelle Pineau, director of Meta AI Research Labs, tells IEEE Spectrum that she expects researchers to make use of this treasure trove in several ways. "The first thing I expect [researchers] to do is to use it to build other types of language-based systems, whether it's machine translation, a chatbot, something that completes text; all of these require this kind of state-of-the-art language model," she says. Rather than training their own language models from scratch, Pineau says, they can build applications and run them "on a relatively modest compute budget."
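For a sense of what building on a released model looks like in practice, here is a minimal sketch of the kind of text-completion application Pineau describes. It assumes the smaller OPT checkpoints are distributed through the open-source Hugging Face transformers library under names like facebook/opt-1.3b; the full OPT-175B weights are available only on request, so the example uses a smaller sibling.

```python
# A minimal sketch of a text-completion application built on a smaller,
# publicly released OPT checkpoint (the "modest compute budget" route).
# Assumes the checkpoints are distributed via Hugging Face transformers.
from transformers import pipeline

# "facebook/opt-1.3b" is one of the smaller siblings of OPT-175B.
generator = pipeline("text-generation", model="facebook/opt-1.3b")

prompt = "A customer-service chatbot should always"
# Sampling settings here are illustrative, not tuned for quality.
result = generator(prompt, max_new_tokens=40, do_sample=True, top_p=0.9)
print(result[0]["generated_text"])
```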
The second thing she expects researchers to do, Pineau says, is "pull it apart" to examine its flaws and limitations. Large language models like GPT-3 are famously capable of generating toxic language full of stereotypes and harmful bias; that troubling tendency is a result of training data that includes hateful language found in Reddit forums and the like. In their technical paper, Meta's researchers describe how they evaluated the model on benchmarks related to hate speech, stereotypes, and toxic-content generation, but Pineau says "there's so much more to be done." She adds that the scrutiny should be done "by community researchers, not inside closed research labs."
The paper states that "we still believe this technology is premature for commercial deployment," and says that by releasing the model with a noncommercial license, Meta hopes to facilitate the development of guidelines for responsible use of large language models "before broader commercial deployment occurs."
Within Meta, Pineau acknowledges that there's a lot of interest in using OPT-175B commercially. "We have a lot of groups that deal with text," she notes, and those groups might want to build specialized applications on top of the language model. It's easy to imagine product teams salivating over the technology: It could power content-moderation tools or text translation, could help suggest relevant content, or could generate text for the creatures of the metaverse, should it truly come to pass.
There have been other efforts to make an open-source language model, most notably from EleutherAI, a collective that released a 20-billion-parameter model in February. Connor Leahy, one of the founders of EleutherAI and founder of an AI startup called Conjecture, calls Meta's move a good step for open science. "Especially the release of their logbook is unprecedented (to my knowledge) and very welcome," he tells Spectrum in an email. But he notes that Meta's conditional release, making the model available only on request and with a noncommercial license, "falls short of truly open." EleutherAI isn't commenting on its plans, but Leahy says the group will continue working on its own language AI, and adds that OPT-175B will be helpful for some of its research. "Open research is synergistic in that way," he says.
EleutherAI is something of an outlier in AI research in that it's a self-organizing group of volunteers. Much of today's cutting-edge AI work is done within the R&D departments of deep-pocketed companies like Meta, Google, OpenAI, Microsoft, and Nvidia. That's because it takes an enormous amount of energy and compute infrastructure to train big AI systems.
Meta claims that training OPT-175B required one-seventh the carbon footprint of training GPT-3, yet as Meta's paper notes, that's still a significant energy expenditure. The paper says that OPT-175B was trained on 992 of Nvidia's 80-gigabyte A100 GPUs, with a carbon-emissions footprint of 75 tons, compared with an estimated 500 tons for GPT-3 (a figure that OpenAI has not confirmed); 75 is indeed roughly one-seventh of 500.
Meta's hope is that by offering up this "foundation model" for other entities to build on top of, it will at least reduce the need to train huge models from scratch. Deploying the model, Meta says in its blog post, requires only 16 of Nvidia's 32-gigabyte V100 GPUs. The company is also releasing smaller-scale versions of OPT-175B that can be used by researchers who don't need the full-scale model, or by those who are investigating the behavior of language models at different scales.
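As an illustration of that last use case, here is a hedged sketch of how a researcher might probe scale effects by running the same prompt through two differently sized checkpoints. The facebook/opt-125m and facebook/opt-1.3b names assume the same Hugging Face distribution as above; the prompt and decoding settings are purely illustrative.

```python
# An illustrative sketch of studying model behavior at different scales:
# run the same prompt through two differently sized OPT checkpoints.
# The "facebook/opt-*" names assume the Hugging Face distribution.
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = "Large language models sometimes produce"

for name in ["facebook/opt-125m", "facebook/opt-1.3b"]:
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    inputs = tokenizer(prompt, return_tensors="pt")
    # Greedy decoding keeps the comparison deterministic across sizes.
    outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
    print(name, "->", tokenizer.decode(outputs[0], skip_special_tokens=True))
```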
Maarten Sap, a researcher at the Allen Institute for Artificial Intelligence (AI2) and an incoming assistant professor at Carnegie Mellon University's Language Technologies Institute, studies large language models and has worked on methods to detoxify them. In other words, he's exactly the kind of researcher Meta is hoping to attract. Sap says that he'd "love to use OPT-175B," but "the biggest issue is that few research labs actually have the infrastructure to run this model." If it were easier to run, he says, he'd use it to study toxic-language risks and social intelligence within language models.
While Sap applauds Meta for opening up the model to the community, he thinks the company could go a step further. "Ideally, having a demo of the system and an API with much more control/access than [OpenAI's API for GPT-3] would be great for actual accessibility," he says. However, he notes that Meta's release of smaller versions is a good "second-best option."
Whether models like OPT-175B will ever become as safe and accessible as other kinds of enterprise software is still an open question, and there are different ideas about the path forward. EleutherAI's Leahy says that preventing broad commercial use of these models won't solve the problems with them. "Security through obscurity is not security, as the saying in the computer-security world goes," says Leahy, "and studying these models and finding ways to integrate their existence into our world is the only feasible path forward."
Meanwhile, Sap argues that AI regulation is needed to "prevent researchers, people, or companies from using AI to impersonate people, generate propaganda or fake news, or other harms." But he notes that "it's pretty clear that Meta is against regulation in many ways."
Sameer Singh, an associate professor at the University of California, Irvine, and a research fellow at AI2 who works on language models, praises Meta for releasing the training notes and logbooks, saying that such process information may end up being more useful to researchers than the model itself. Singh hopes that this kind of openness will become the norm. He also supports providing commercial access to at least the smaller models, since such access can be useful for understanding models' practical limitations.
"Disallowing commercial access completely or putting it behind a paywall may be the only way to justify, from a business perspective, why these companies should build and release LLMs in the first place," Singh says. "I suspect these restrictions have less to do with potential damage than claimed."
Eliza Strickland is a senior editor at IEEE Spectrum, where she covers AI, biomedical engineering, and other topics. She holds a master's degree in journalism from Columbia University.