ChatGPT: The future of natural language processing

By Jongyeop Jeong, 12th Grade

On November 30, 2022, OpenAI, the company behind the AI art generator DALL-E, released a new AI chatbot named ChatGPT, built on its Generative Pre-trained Transformer (GPT) family of language models. This chatbot can generate human-like text with high accuracy and fluency across various topics. ChatGPT is widely regarded as one of the most advanced language models currently available to the public; GPT-3, the model family it builds on, has 175 billion parameters. The bot is designed around a dialogue format, allowing it to respond to follow-up questions, admit mistakes, challenge incorrect assumptions, and reject inappropriate requests. According to the company, ChatGPT has been trained using Reinforcement Learning from Human Feedback, in which human AI trainers rank the chatbot’s responses and those rankings are used to steer it toward better answers.

ChatGPT is itself a product of machine learning (ML), a type of AI that learns through experience. It works by pre-training a deep neural network on an immense dataset of text and then fine-tuning it to perform specific tasks, such as generating text or answering questions.

The language tool processes the input text and predicts the output that should follow. The two main features that make it “revolutionary” are its ability to understand the context of a conversation and its ability to generate coherent responses, even with limited input. The self-attention mechanism at the core of its transformer architecture lets it weigh the importance of different words and phrases relative to one another, which is how it keeps track of context. The same architecture lets it account for long-term dependencies in the input text and so produce coherent sequences of words.
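
To make the idea of self-attention concrete, here is a minimal sketch in plain Python. It is an illustration only, not OpenAI's code: the word vectors are made-up two-dimensional values, and each vector acts as its own query, key, and value, whereas a real transformer learns separate projections for each and stacks many such layers.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with no learned weights.

    Assumption for illustration: every word vector is used directly as
    query, key, and value.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score the current word against every word in the sentence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # Blend all word vectors according to their attention weights.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional vectors for the words "the", "cat", "sat".
words = ["the", "cat", "sat"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
outputs, weights = self_attention(vectors)
for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how strongly one word attends to every word in the sentence, including itself. The weights in a row always sum to 1, so the output for a word is a weighted blend of the whole sentence rather than the word in isolation.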

A growing number of companies are exploring the potential of integrating ChatGPT into their existing technologies and software, aiming to enhance their capabilities and improve performance. One notable example is Microsoft’s plan to incorporate ChatGPT into its Bing search engine so that it answers search queries with complete sentences rather than a list of links, which is expected to give Microsoft a competitive edge over Google. This might open a new era of search engines, in which users get summary sentences for their search results, gain an overview of the information they are looking for without having to click through multiple pages, and enjoy a more conversational and interactive search experience.

ChatGPT crossed one million users within a week of its launch, and its popularity has not faded. With its easy accessibility, requiring only an email address and a phone number, it has gained widespread use for various tasks. These include code generation, math problem solving, explaining complex concepts (such as quantum physics) in a simplified manner, writing academic papers and poems, and even generating songs inspired by existing artists. Its ability to provide solutions in natural-sounding language within a matter of seconds has contributed to its popularity, despite its occasional plausible-sounding but nonsensical answers.

However, many consider this new technology a threat to higher education. These are some quotes expressing an unfavorable view of ChatGPT in education: “… students will be able to use this technology undetected to complete assignments,” “… will be difficult for teachers to be able to tell the difference,” “Education may never be the same,” “… danger in education, and it’s not good for kids,” “AI has basically ruined homework.” In short, critics claim that students can exploit this chatbot to complete their assignments, leading to a crisis in learning and disrupting the academic curriculum. This concern is particularly acute for younger students, who may not have the sophistication to understand the content they are generating with AI. For example, if younger students were to use AI to write a history paper, they would be cheating not only on a writing exercise but also on learning history itself.

On the other hand, it’s worth mentioning that services for writing research papers and solving math problems have long existed and are far from new. An example is the Photomath app, which has been around since 2014 and allows students to scan a math problem and get the answer. In the humanities, selling academic papers has been a common practice for a long time. Educators and admission officers have been cognizant of these services and have developed methods to identify their use and take action against it.

If properly used, these AI systems have the potential to enhance higher education. They do write well-structured essays, but they do not, and are not expected to, entirely replace professional academic papers. This is because they lack the ability to truly comprehend the meaning behind the words they generate, which results in shallow responses that lack depth and insight. They are expected, however, to serve as a new instrument for working with large amounts of textual material. As highlighted by Sam Altman, co-founder of OpenAI, “It’s a mistake to be relying on it for anything important right now. It’s a preview of progress; we have lots of work to do on robustness and truthfulness.”

The change is now unavoidable, much like the transition of early humans from hunter-gatherers to farmers. Rather than playing the victim in the face of this rising technology, I believe that the education system should adapt its curricula to fit the new reality. If ChatGPT or any other technology aids students in plagiarism or other forms of cheating, then teachers should look for ways to break from conventional homework assignments and so reduce the effectiveness of those technologies. Meanwhile, anti-cheating software developers are already working on solutions, and even OpenAI, the company that developed ChatGPT, is researching ways to embed watermarks in ChatGPT’s output to make it easier for anti-plagiarism software to detect potential instances of cheating.

Edward Tian, a senior computer science student at Princeton University, has made remarkable progress in anti-plagiarism targeting ChatGPT by developing an application called GPTZero. This application, released on January 2, 2023, can estimate whether ChatGPT wrote a text. The motivation behind this development was to fight the increase in AI plagiarism that he saw, as many students were passing off AI-written assignments as their own. GPTZero uses “perplexity” and “burstiness” as indicators to detect whether a bot wrote an excerpt. The former measures how unpredictable the text is to a language model, with higher perplexity making the text more likely to be human-written because it means that the AI is less familiar with it. The latter measures how much the sentences vary, as humans tend to mix longer or more complex sentences with shorter ones, in contrast to the uniform style of AIs. The app isn’t foolproof, but Tian is in the process of improving the model’s accuracy. The significance of GPTZero isn’t whether it can detect a bot; it’s the fact that it demonstrated the factors that separate humans from AI, bringing transparency to AI.
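
The two indicators can be illustrated with a toy example in Python. This is a sketch under loud assumptions and not GPTZero's actual method: a tiny word-frequency model stands in for a real language model, perplexity is computed as the exponential of the average negative log-probability per word, and burstiness is treated as the spread of perplexity from sentence to sentence. All function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_unigram_model(reference_text):
    """Return a function giving add-one-smoothed word probabilities.

    Assumption: a word-frequency model replaces the large neural
    language model a real detector would use.
    """
    counts = Counter(tokenize(reference_text))
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words
    return lambda word: (counts[word] + 1) / (total + vocab)

def perplexity(words, prob):
    """Exponential of the average negative log-probability per word."""
    log_sum = sum(math.log(prob(w)) for w in words)
    return math.exp(-log_sum / len(words))

def burstiness(text, prob):
    """Standard deviation of per-sentence perplexity (an assumed definition)."""
    sentences = [tokenize(s) for s in re.split(r"[.!?]+", text)]
    scores = [perplexity(s, prob) for s in sentences if s]
    mean = sum(scores) / len(scores)
    return math.sqrt(sum((x - mean) ** 2 for x in scores) / len(scores))

# Hypothetical "training" text the toy model is familiar with.
prob = build_unigram_model("the cat sat on the mat. the dog sat on the rug.")

familiar = perplexity(tokenize("the cat sat on the mat."), prob)
surprising = perplexity(tokenize("quantum lobsters juggle violet equations."), prob)
uniform = burstiness("the cat sat. the cat sat.", prob)
varied = burstiness("the cat sat. quantum lobsters juggle.", prob)

print("perplexity:", familiar, surprising)
print("burstiness:", uniform, varied)
```

In this toy setting, the sentence built from words the model has already seen scores a lower perplexity than the one built from unseen words, and the text whose sentences all look alike scores a lower burstiness than the text that mixes a predictable sentence with a surprising one.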


