Large Language Model (LLM)
A neural network trained on massive text datasets that can understand and generate human language.
Large language models (LLMs) are deep learning systems trained on billions of words of text data. By learning the statistical patterns in that text, they can generate coherent prose, answer questions, write code, translate between languages, and handle a wide range of other language tasks.
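The core training objective behind this is next-token prediction from statistical patterns. A toy bigram "model" (a minimal sketch, nothing like a real LLM's neural network, and with a made-up corpus) shows the idea:

```python
from collections import Counter, defaultdict

# Tiny corpus standing in for internet-scale training data.
corpus = "the cat sat on the mat the cat ate the fish".split()

# "Training": count how often each word follows each other word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def predict_next(word):
    """Return the statistically most likely next word."""
    following = bigram_counts[word]
    return following.most_common(1)[0][0] if following else None

print(predict_next("the"))  # -> "cat", the most frequent successor of "the"
```

Real LLMs replace the count table with a neural network over subword tokens, but the objective — predict what comes next — is the same.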
Models like GPT-4, Claude, LLaMA, and Gemini represent the current state of the art. They use the transformer architecture and are trained using self-supervised learning on internet-scale data, then refined through techniques like reinforcement learning from human feedback (RLHF).
LLMs have created entirely new job categories and transformed existing ones. Engineers who can build applications on top of LLMs, fine-tune them for specific domains, or evaluate their outputs are in high demand across the AI industry.
Related Terms
Transformer
The neural network architecture behind modern LLMs, using self-attention mechanisms to process sequences in parallel.
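The self-attention mechanism at the heart of the transformer can be sketched in a few lines of NumPy. This is a single attention head with no learned weight matrices, masking, or multi-head logic — just the scaled dot-product core:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each position attends to all others."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of every query to every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V  # each output is a weighted mix of value vectors

# Three token positions with 4-dimensional embeddings (random, for illustration).
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = attention(x, x, x)  # "self"-attention: Q, K, V come from the same sequence
print(out.shape)  # (3, 4): one output vector per input position
```

Because every position's scores against every other position are computed in one matrix product, the whole sequence is processed in parallel rather than token by token as in recurrent networks.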
Fine-Tuning
The process of further training a pre-trained AI model on domain-specific data to improve its performance on particular tasks.
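The effect of fine-tuning — domain data shifting a pre-trained model's behavior — can be mimicked with simple word counts. This is only an analogy (real fine-tuning updates neural network weights by gradient descent), and all the corpora and numbers below are invented:

```python
from collections import Counter

# "Pre-trained" word counts from a broad general corpus (illustrative numbers).
pretrained = Counter({"patient": 1, "market": 6, "model": 4})

def top_word(counts):
    """The model's most likely word under its current statistics."""
    return counts.most_common(1)[0][0]

print(top_word(pretrained))  # -> "market": the broad model favors finance talk

# "Fine-tuning": further training on a domain-specific (medical) corpus.
domain_corpus = ["patient"] * 7 + ["diagnosis"] * 2
finetuned = pretrained + Counter(domain_corpus)

print(top_word(finetuned))  # -> "patient": domain data now dominates
```

The key property carries over: the general-purpose statistics are kept as a starting point, and a comparatively small amount of domain data nudges them toward the target task.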
Tokenization
The process of breaking text into smaller units (tokens) that language models can process.
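A greedy longest-match tokenizer shows how text becomes subword tokens. Production tokenizers (e.g. BPE-based ones) learn their vocabularies from data; the vocabulary here is hand-picked purely for illustration:

```python
def tokenize(text, vocab):
    """Greedy longest-match subword tokenization over a fixed vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest possible piece at position i first.
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            tokens.append(text[i])  # unknown character: fall back to char level
            i += 1
    return tokens

vocab = {"token", "ization", "un", "believ", "able", " "}
print(tokenize("unbelievable tokenization", vocab))
# -> ['un', 'believ', 'able', ' ', 'token', 'ization']
```

Subword vocabularies let a model cover rare or novel words ("unbelievable") by composing pieces it has seen often, instead of needing every full word in its vocabulary.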
Inference
The process of running a trained AI model to generate predictions or outputs from new input data.
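For LLMs, inference typically means autoregressive generation: feed the model a prompt, pick a next token, append it, and repeat. The sketch below replaces the neural network's forward pass with a hand-written probability table (all tokens and probabilities are made up) so the decoding loop itself is visible:

```python
# Stand-in for a trained model's forward pass: next-token probabilities.
model = {
    "<s>":    {"The": 0.9, "A": 0.1},
    "The":    {"cat": 0.6, "dog": 0.4},
    "cat":    {"sleeps": 0.7, "runs": 0.3},
    "sleeps": {"</s>": 1.0},  # end-of-sequence token
}

def generate(prompt="<s>", max_tokens=10):
    """Autoregressive greedy decoding: always take the most likely token."""
    token, output = prompt, []
    for _ in range(max_tokens):
        next_token = max(model[token], key=model[token].get)
        if next_token == "</s>":
            break
        output.append(next_token)
        token = next_token
    return " ".join(output)

print(generate())  # -> "The cat sleeps"
```

Greedy decoding is the simplest strategy; real systems usually sample from the distribution (with temperature, top-p, etc.) to get varied outputs.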
Foundation Model
A large-scale AI model trained on broad data that can be adapted to a wide range of downstream tasks.