Did you hear about the Google employee who got fired for claiming one of their AI models had become sentient?
In a statement, Google described the claim as “wholly unfounded” and emphasized their commitment to “responsible innovation.” If only this unfortunate former employee had known: sensationalism and hype are almost never the correct response to new technology.
But the thing is, large language models like OpenAI’s ChatGPT and Google’s LaMDA do kind of seem sentient. If you don’t understand how these technologies work, they’re bound to seem like magic. And if even a Google engineer can be duped by a large language model, the rest of us definitely need to be on our toes.
So let’s talk about it. What are large language models? How do they work? And how can we use them to our advantage? (Not the other way around.)
A large language model (LLM) is an artificial intelligence system designed to understand and generate text-based language. LLMs are a type of generative AI, which refers to artificial intelligence systems and models that can generate new content, such as text, images, or even music, based on patterns they have learned from data. LLMs like OpenAI’s ChatGPT are trained on vast amounts of text-based data, much of it from the internet. They can interpret natural language prompts and generate responses that often appear almost indistinguishable from human-written text.
LLMs’ ability to generate human-like text has made them valuable tools to support everything from content creation to chatbots and customer support systems. Some people also use them as a research tool to get concise, synthesized information faster than they could by doing a Google search. But don’t rely too heavily on LLMs for research: they’ve been known to “hallucinate,” or make up false information. They’re a good place to start, but make sure a human fact checker keeps their hands on the wheel.
Have you ever heard that quote from science fiction writer Arthur C. Clarke that says, “Any sufficiently advanced technology is indistinguishable from magic”? Personally, I don’t love that take. In a world where so much is unknowable, technology is actually something we can understand, at least at a basic level. Large language models are no exception.
Some of the simplest language models were developed in the 1960s and used a rules-based system to mimic human conversation, while modern-day large language models are more complex. Here’s a simplified overview of how LLMs do what they do:
Pre-training: Before it can be used, an LLM is trained on a large dataset of text, usually from the internet. The largest models have billions of parameters (the internal values that get adjusted during training), and they learn grammar, context, and facts from that data. During training, they are given incomplete sentences and asked to practice predicting the missing words.
Tokenization: Once an LLM has been trained, it is ready to be used. To use an LLM, you start by giving it a prompt in the form of a phrase or sentence. The LLM divides the text from your prompt into smaller units called tokens, which it converts into numerical representations it can understand mathematically.
Transformer architecture: Introduced in 2017, the transformer is a type of neural network (a computational model loosely inspired by the structure and function of the human brain) that LLMs use to process tokens, understand context, and generate output text. This transformer architecture is the hallmark of the modern LLM. It's called a "transformer" because it transforms input data into output data. Transformers use something called a self-attention mechanism that allows them to weigh the importance of different parts of the input when making predictions. Transformer models are also capable of parallel processing, meaning they can process all parts of the input at once, which makes them much faster to train than older models that processed data one piece at a time.
Inference: When given a prompt, the pre-trained model uses its understanding of language and context to predict and generate the most likely next tokens, producing coherent, contextually relevant text as output—but notice I didn’t say correct (more on that later). Although LLMs give a convincing performance, these outputs aren’t actually original, creative thoughts. They are mathematically derived inferences or predictions of what text is most likely to follow your prompt, based on their training.
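To make the steps above a little less abstract, here is a minimal sketch in Python. It is a toy, not how a real LLM is built: the corpus, the whitespace tokenizer, and every function name are assumptions made up for illustration. The "pre-training" here just counts which word follows which, which stands in for the far larger neural network a real model trains. The attention function shows the weighting idea on its own and is not wired into the word predictor.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for "text from the internet".
CORPUS = "the cat sat on the mat . the cat ate the fish ."


def tokenize(text):
    """Split text into tokens. Real LLMs use subword tokenizers;
    splitting on whitespace is a simplification."""
    return text.lower().split()


def build_vocab(tokens):
    """Give each distinct token an integer ID, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def train_bigram_counts(token_ids):
    """Stand-in for pre-training: count which token ID follows which."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(token_ids, token_ids[1:]):
        counts[prev][nxt] += 1
    return counts


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights: how strongly one query
    vector 'attends' to each key vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


def predict_next(counts, token_id):
    """Stand-in for inference: pick the most frequent follower of token_id.
    Ties go to the lowest token ID; returns None if nothing ever followed it."""
    followers = counts.get(token_id)
    if not followers:
        return None
    return min(followers, key=lambda t: (-followers[t], t))


def generate(prompt, vocab, counts, max_new_tokens=3):
    """Tokenize the prompt, then append the likeliest next token repeatedly."""
    id_to_token = {i: t for t, i in vocab.items()}
    ids = [vocab[t] for t in tokenize(prompt)]  # KeyError on unknown words
    for _ in range(max_new_tokens):
        nxt = predict_next(counts, ids[-1])
        if nxt is None:
            break
        ids.append(nxt)
    return " ".join(id_to_token[i] for i in ids)


tokens = tokenize(CORPUS)
vocab = build_vocab(tokens)
counts = train_bigram_counts([vocab[t] for t in tokens])

print(generate("the", vocab, counts))  # prints "the cat sat on"
print(attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

Notice that the generated text is plausible only because those words followed each other in the training text. Nothing in the code checks whether the output is true, which is the same reason a real LLM can sound convincing and still be wrong.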
Large language models are being used everywhere. LLM-powered chatbots and virtual assistants provide natural language interactions for customer support and answering queries. LLMs are also invaluable for language translation, summarizing lengthy documents, and generating concise answers in question-answering systems. They also play a crucial role in sentiment analysis, helping businesses gauge public opinion and customer feedback.
In the healthcare sector, LLMs assist in medical research by analyzing vast volumes of medical literature and aiding in diagnostic processes. In the legal field, they streamline document analysis, contract review, and legal research. LLMs are also increasingly involved in code generation, supporting software developers with code snippets and debugging assistance. In education, they serve as powerful tools for tutoring, language learning, and creating educational content. And that’s just the tip of the iceberg of what this technology can do.
These versatile models, with their profound natural language processing and text generation capabilities, are revolutionizing industries and domains by automating tasks, improving processes, and enhancing user experiences.
Large language models and generative AI more broadly have opened the door to new possibilities in the realm of artificial intelligence. Some might even say they’ve blown the door off its hinges. So you’d be right to be excited about generative AI and LLMs. But we need to stay grounded in acknowledging their limitations, too.
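Here’s what to be careful of when it comes to using LLM-generated content:
Hallucinations: LLMs can make up false information and present it convincingly, so a human should fact-check their output.
Bias: Models can inherit biases from the data they were trained on.
Data privacy: Be careful about what sensitive information you share with an LLM, and check how that data is handled.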
Now that we understand the basics of LLM technology, how do we move forward? From their pre-training on vast datasets to their utilization of transformer architectures, LLMs have become invaluable tools across industries, enhancing customer support, aiding medical research, revolutionizing legal processes, and so much more. As we embrace the potential of LLMs, it's crucial to remain grounded in awareness of their limitations. AI hallucinations, biases inherited from training data, and data privacy concerns are all aspects that demand our vigilance.
To truly move forward with AI without the risks, we must strike a balance between harnessing its capabilities and actively addressing these challenges. A platform can help you do just that. Appian’s AI process platform comes with security certifications and private AI options that let you wade into the waters of artificial intelligence without putting things like data privacy at risk.