Is GPT-3 Overhyped?

Naveen Joshi 16/07/2021

OpenAI’s GPT-3 could well become the ultimate AI model of the future.

However, it has limitations that continued research and development must overcome. Those limitations need to be acknowledged and analyzed to understand where the model can improve.

The Generative Pre-trained Transformer 3 (GPT-3) is a formidable creation, one that promises to take the field of artificial intelligence forward by leaps and bounds. The powerful text generator offers a glimpse of what natural language processing models and neural networks of the future will be able to do. GPT-3 is arguably the most advanced and powerful language model created to date. It possesses an astounding 175 billion parameters to regulate its language processing operations, and its other statistics and specifications are similarly impressive. While GPT-3 will continue to rightfully receive praise for what it can achieve, it is easy to get carried away, and appreciation gradually segues into hyperbole. The deep-learning-trained GPT-3, created by OpenAI, has limitations that need to be overcome for it to become even better. Here are some of the identified limitations of GPT-3.

GPT-3 Cannot ‘Understand’ Semantics

On a basic level, GPT-3 can be described as an advanced predictor and generator of text. To demonstrate its core capability, all a data scientist needs to do is supply the model with a few lines of text to work on. The model predicts the text most likely to follow that prompt, effectively ruling out less probable alternatives. Having produced a first passage, it continues generating by treating the initial input and that first output together as its next chunk of input text. This process carries on until a large article or essay is created, with data scientists and linguistic experts deciding on length and making basic edits along the way. In 2020, The Guardian published an article that was written by GPT-3 in exactly this way and later edited by the media company’s editors.
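
To make that loop concrete, here is a minimal sketch of the feed-the-output-back-in process described above. The predict_next_words function is a hypothetical stand-in for a real GPT-3 call, and the prompt and word target are illustrative assumptions rather than anything from the article.

# Minimal sketch of GPT-3-style iterative text generation (illustrative only).
# predict_next_words() is a hypothetical placeholder for an actual model call.

def predict_next_words(context: str) -> str:
    """Stand-in model: returns a short continuation of the given context."""
    # A real model would score and sample the statistically likeliest words.
    return " and so on"

def generate_article(prompt: str, target_words: int = 500) -> str:
    text = prompt
    while len(text.split()) < target_words:
        # Each step feeds the prompt plus everything generated so far
        # back into the model as the next chunk of input text.
        text += predict_next_words(text)
    return text

draft = generate_article("Why GPT-3 is both impressive and limited:")
# Data scientists and editors would still trim and correct this draft.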


The basic working of GPT-3 depends on the patterns it extracted, during pre-training, from an enormous corpus of previously published web text covering a given topic (and topics related to it). Having digested billions of web pages, the model generates words that are mathematically and statistically plausible responses to the input. Now, while all that information almost guarantees that GPT-3 picks up incredibly deep and nuanced patterns from its massive training data, comprehension issues can still creep into the output text. GPT-3 does not actually ‘know’ or ‘understand’ what certain words mean in specific situations; the model has no internal component that grasps the semantic realities of the world. As a result, a chunk of output text generated by GPT-3 may make sense from a mathematical point of view yet be incorrect, racist, sexist or otherwise biased, because the text-generating model applies no filters of its own.
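
As a toy illustration of what ‘mathematically plausible’ means here, the snippet below simply picks the highest-probability continuation from a hand-made distribution; the candidate words and probabilities are invented for illustration and are not GPT-3’s actual vocabulary or weights.

# Toy illustration: a language model chooses the statistically likeliest
# next word, with no notion of whether the result is true, fair or biased.
# The context and probabilities below are invented examples.

next_word_probs = {
    "The new CEO of the company is a": {
        "man": 0.55,      # may merely reflect biases in the training text
        "woman": 0.30,
        "veteran": 0.10,
        "robot": 0.05,
    }
}

def pick_next_word(context: str) -> str:
    candidates = next_word_probs[context]
    # Greedy choice: highest probability wins; truth and fairness never
    # enter the calculation.
    return max(candidates, key=candidates.get)

print(pick_next_word("The new CEO of the company is a"))  # -> "man"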

In other words, GPT-3 lacks the ability to reason beyond what is statistically sound text. Moreover, if the model is asked about content that is not adequately represented on the internet, it cannot generate meaningful, coherent output. Examples of these limitations can be found across the internet. GPT-3’s semantic shortcomings make it somewhat unreliable for compiling and generating text on rare or complicated topics.

GPT-3 Cannot Create Coherent Long Articles

Tellingly, The Guardian published a follow-up to the first article: a step-by-step guide to creating coherent prose from GPT-3’s output chunks. The first step states that organizations or individuals using GPT-3 to create articles must first hire a computer scientist simply to operate the text-generating model. This means that organizations adopting GPT-3 for their daily work will have to spend extra just to put it to use correctly. More importantly, several further steps are needed to get any useful output from a GPT-3-enabled system, including “clearing the robotic stream of consciousness that appeared in the worst GPT-3 text outputs.”

Beyond the costs of optimizing GPT-3’s outputs, the bigger problem is that the powerful text generator cannot be used on its own to compose long articles or blogs. GPT-3 generates its output word by word, and each word depends on a limited window of context, essentially the text immediately surrounding it. As a result, long outputs stop forming a coherent narrative after more than a few paragraphs. Humans have the cognitive and comprehension abilities to maintain a point of view across thousands of words of prose; GPT-3, in contrast, will produce an article that drifts off-topic or contradicts itself by the second or third paragraph. Its outputs therefore lack a sense of purpose, which limits their usability in content creation.
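
A minimal sketch of why this drift happens, assuming a model that only ‘sees’ a fixed number of the most recent words; the window size and helper function below are illustrative assumptions, not GPT-3’s real parameters.

# Illustrative sketch: generation with a fixed-size context window.
# Text older than the window no longer influences the next words, which is
# one reason long outputs drift off-topic or contradict earlier paragraphs.
# MAX_CONTEXT_WORDS and next_words() are assumptions made for illustration.

MAX_CONTEXT_WORDS = 200  # toy window, far smaller than GPT-3's actual limit

def next_words(visible_context: str) -> str:
    """Stand-in for a model call that continues only the visible text."""
    return " more text"

def generate_long_piece(prompt: str, target_words: int = 2_000) -> str:
    text = prompt
    while len(text.split()) < target_words:
        # Only the most recent MAX_CONTEXT_WORDS words are visible to the
        # model; the original prompt eventually falls out of view entirely.
        visible = " ".join(text.split()[-MAX_CONTEXT_WORDS:])
        text += next_words(visible)
    return text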

More worryingly, creating any kind of article with GPT-3’s involvement is a cumbersome and impractical exercise for organizations. In one study, data experts listed the difficulties involved in turning GPT-3’s output into publishable text. To create an article, editors and AI experts have to work together to arduously compile 500-word chunks from the massive stream of disjointed, often meaningless output produced by a GPT-3-powered NLP system. To write an article of, say, 5,000 words, the experts would spend many hours carefully curating 10 (!) such 500-word chunks before digitally stitching them together into something resembling a sensible piece. It has been found that only about 12% of the content generated by GPT-3 ends up being usable in articles, with human beings contributing the remaining 88%. Using GPT-3 for content creation is therefore expensive, difficult and ultimately inefficient, since a large share of its output is simply not fit for use.
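
A back-of-the-envelope calculation, using only the figures quoted above (a 5,000-word article, 500-word chunks and roughly 12% of the final text coming from GPT-3), shows how lopsided the division of labour is:

# Back-of-the-envelope numbers for the workflow described above.
# The figures are the ones quoted in the article and are illustrative.

target_words = 5_000    # desired article length
chunk_size = 500        # size of each curated chunk
gpt3_share = 0.12       # ~12% of the final text is usable GPT-3 output

chunks_needed = target_words // chunk_size        # 10 curated chunks
gpt3_words = target_words * gpt3_share            # ~600 words kept from GPT-3
human_words = target_words - gpt3_words           # ~4,400 words from humans

print(chunks_needed, round(gpt3_words), round(human_words))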

GPT-3’s Output Content Can Be Objectionable

While incoherent text is merely wasteful for organizations, it is still preferable to content that is categorically offensive and therefore unacceptable for publishing. Unfortunately, GPT-3 can produce that kind of content too. Several studies have noted GPT-3’s penchant for generating toxic language that exhibits biases based on race, religion, gender and ethnicity, among other attributes. This limitation stems primarily from biases present in the datasets used to train its neural networks, and it makes the model unusable for sensitive operations. As we know, the internet is full of toxic and obnoxious web pages containing inflammatory articles and speeches. If GPT-3 is allowed to digest such content, its output will simply reflect that negativity, and a GPT-3-powered system can end up producing racist text.

Racial biases are fairly common in GPT-3-generated text. OpenAI itself admits that its API models demonstrate discriminatory tendencies that show up in GPT-3’s generated text. This particular limitation can be mitigated by clear guidelines and rules about which web pages the model may draw on as reference material and which areas of the internet are off limits.
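
As a hedged sketch of what such a rule could look like in practice, the snippet below filters candidate source pages against a blocklist of disallowed domains; the domain names, URLs and function are invented for illustration and are not part of any real OpenAI tooling.

# Illustrative sketch: screening source web pages against a domain blocklist
# before they are used as training or reference material.
# The blocklist entries and candidate pages below are invented examples.

from urllib.parse import urlparse

BLOCKED_DOMAINS = {
    "hate-forum.example",       # hypothetical disallowed sources
    "extremist-blog.example",
}

def is_allowed(url: str) -> bool:
    """Return True if the page's domain (or a parent domain) is not blocked."""
    domain = urlparse(url).netloc.lower()
    return not any(domain == d or domain.endswith("." + d) for d in BLOCKED_DOMAINS)

candidate_pages = [
    "https://en.wikipedia.org/wiki/Natural_language_processing",
    "https://hate-forum.example/thread/123",
]

clean_sources = [url for url in candidate_pages if is_allowed(url)]
print(clean_sources)  # only the Wikipedia URL survives the filter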

GPT-3 May Not Even Be Classifiable As AGI

GPT-3’s lack of semantic understanding, limited causal reasoning and tendency to over-generalize have led many to believe that it is nothing more than a glorified transformer. Its text-generating abilities are the product of the data incorporated into it and its extensive pre-training. The text generated by a GPT-3-powered system almost always needs to pass through several rounds of editing and supervision before it is even remotely fit to publish, and, as we have seen, humans need to be heavily involved in overseeing that screening. These limitations lead many AI experts to conclude that such text-generating models cannot be classified as Artificial General Intelligence (AGI).

It is hard to say whether GPT-3 is overhyped or not. The technology has its weaknesses and must overcome several limitations before it can be considered a complete NLP model and a worthy showcase for AI. Several articles on the internet jump the gun and claim that GPT-3 is ready for use in fields such as healthcare. While arguments can be made for that, the fact is that, at least for now, GPT-3’s usage should be concentrated on specific fields and tasks that make full use of its powerful text-prediction and generation capabilities. The models also require further fine-tuning and several more rounds of development before they can be trusted with bigger responsibilities. Perhaps one should imagine GPT-3 as a really powerful fighter jet: despite its amazing capabilities, such a machine cannot be used for space exploration missions because of its obvious limitations in that domain. Similarly, we must keep GPT-3’s limitations in mind rather than setting it up for failure with unrealistic expectations.

Naveen Joshi

Tech Expert

Naveen is the Founder and CEO of Allerin, a software solutions provider that delivers innovative and agile solutions designed to automate, inspire and impress. He is a seasoned professional with more than 20 years of experience, including extensive work customizing open-source products for cost optimization in large-scale IT deployments. He is currently working on Internet of Things solutions with Big Data Analytics. Naveen completed his programming qualifications at various Indian institutes.

   