GPT is short for Generative Pre-trained Transformer, a language model introduced in a 2018 paper by Alec Radford and colleagues at OpenAI, the artificial intelligence research laboratory co-founded by Elon Musk. It is an autoregressive language model: a single transformer network trained to predict the next token from the text that precedes it. It acquires knowledge of the world and learns to process long-range dependencies by pre-training on a diverse corpus of written material with long stretches of contiguous text.

GPT-2 (Generative Pre-trained Transformer 2) was announced in February 2019. It is an unsupervised transformer language model trained on 8 million documents, about 40 GB of text in total, taken from articles shared via Reddit submissions. OpenAI was famously reluctant to release the full model at first, out of concern that it could be used to spam social networks with fake news.

In May 2020 OpenAI announced GPT-3 (Generative Pre-trained Transformer 3), a model with roughly two orders of magnitude more parameters than GPT-2 (175 billion versus 1.5 billion) and dramatically better performance.
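
As a quick sanity check on the "two orders of magnitude" claim, the ratio can be worked out from the two parameter counts quoted above. This is only a back-of-the-envelope sketch; the variable names are ours.

```python
import math

# Parameter counts as quoted in the text.
gpt2_params = 1.5e9
gpt3_params = 175e9

ratio = gpt3_params / gpt2_params   # how many times larger GPT-3 is
orders = math.log10(ratio)          # the same ratio on a log10 scale

print(f"GPT-3 is about {ratio:.0f}x larger ({orders:.2f} orders of magnitude)")
```

The ratio comes out slightly above 100, so "two orders of magnitude" is a fair rounding.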

Given any text prompt, GPT-3 will return a text completion, attempting to match the pattern you gave it. You can “program” it by showing it just a few examples of what you’d like it to do, and it will continue in the same vein, up to and including a complete article or story.
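
To make the idea of "programming by example" concrete, here is a minimal sketch of how such a few-shot prompt might be assembled as plain text. The function name, the labels and the translation examples are all hypothetical, and the snippet deliberately stops short of calling any API: it only builds the string that would be sent to the model.

```python
def build_few_shot_prompt(examples, query, input_label="Input", output_label="Output"):
    """Format (input, output) example pairs plus a new input as one text prompt."""
    blocks = [f"{input_label}: {src}\n{output_label}: {dst}" for src, dst in examples]
    # Leave the final output empty: the model is expected to continue the pattern.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


# Hypothetical examples: two English-to-French pairs, then a new word to translate.
examples = [("cheese", "fromage"), ("house", "maison")]
prompt = build_few_shot_prompt(examples, "book", "English", "French")
print(prompt)
```

The prompt ends on an unfinished "French:" line, so the most natural completion is the translation of the last word. Swapping in different example pairs changes the "program" without changing the model.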


GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans.

The last application has always worried OpenAI. GPT-3 is currently available through an API in private beta, with a paid version expected to follow. OpenAI has said it will terminate API access for obviously harmful use cases, such as harassment, spam, radicalization, or astroturfing.

While the people most obviously threatened are those who produce written work, such as scriptwriters, AI developers have already found surprising applications, such as using GPT-3 to write code.

Sharif Shameem, for example, wrote a layout generator where you describe in plain text what you want, and the model generates the appropriate code.

Jordan Singer similarly created a Figma plugin which allows one to create apps using plain text descriptions.

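In one demonstration it was even prompted to diagnose asthma and suggest medication, although a plausible-sounding answer is not the same as reliable medical advice.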

Other applications include using it as a search engine or oracle of sorts, and it can even be used to explain and expand on difficult concepts.

While it may seem that this approach leads directly to a general AI that can understand, reason and converse like a human, OpenAI warns that it may be running into fundamental scaling problems: GPT-3 required several thousand petaflop/s-days of compute, compared with tens of petaflop/s-days for the full GPT-2. It seems that while we are closer, the breakthrough that will make all our jobs obsolete is still some distance away.
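
For readers unfamiliar with the unit, one petaflop/s-day is the work done by a machine sustaining 10^15 floating-point operations per second for a full day. The sketch below converts the unit into raw operation counts. The two input figures are illustrative placeholders chosen to match the "several thousand" and "tens" ranges mentioned above, not reported measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
PETAFLOP_PER_SECOND = 1e15       # 10^15 floating-point operations per second


def pfs_days_to_flops(pfs_days):
    """Convert petaflop/s-days into a total count of floating-point operations."""
    return pfs_days * PETAFLOP_PER_SECOND * SECONDS_PER_DAY


# Placeholder values in the ranges the text describes (assumptions, not measurements).
gpt3_pfs_days = 3000   # "several thousand"
gpt2_pfs_days = 30     # "tens"

gpt3_flops = pfs_days_to_flops(gpt3_pfs_days)
gpt2_flops = pfs_days_to_flops(gpt2_pfs_days)

print(f"GPT-3: ~{gpt3_flops:.2e} floating-point operations")
print(f"GPT-2: ~{gpt2_flops:.2e} floating-point operations")
print(f"Ratio: ~{gpt3_flops / gpt2_flops:.0f}x")
```

With these placeholder inputs the compute gap is about a hundredfold, which is the same order as the growth in parameter count.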

Read more about GPT-3 on GitHub.