No single person “invented AI”-it emerged from decades of work by many scientists, with John McCarthy often called the “father of AI” for coining the term and organizing the 1956 Dartmouth Conference.
Key figures include Alan Turing (1950 Turing test), Warren McCulloch & Walter Pitts (1943 neural nets), McCarthy, Marvin Minsky, Allen Newell, Herbert Simon, Arthur Samuel, Frank Rosenblatt, and the “godfathers of deep learning”: Geoffrey Hinton, Yann LeCun, and Yoshua Bengio.
The field was formally born at the 1956 Dartmouth Summer Research Project, where “artificial intelligence” was coined and researchers gathered to pursue the goal of creating thinking machines.
AI went through boom-bust cycles-optimistic 1960s, first AI winter (1970s), expert systems boom (1980s), second AI winter (late 1980s)-before today’s deep learning and large language models like GPT, Claude, and Gemini.
For staying sane amid AI hype cycles and weekly breakthroughs, professionals rely on curated, no-filler briefings like KeepSanity AI instead of drowning in daily newsletters.
Ask “who invented AI” and you’ll get a dozen different answers depending on who you ask and what they mean by “AI.” That’s because artificial intelligence isn’t a single gadget someone built in a garage-it’s a sprawling field that emerged from mathematics, philosophy, engineering, and computer science over more than a century.
John McCarthy is widely known as the “father of AI” because he coined the term “artificial intelligence” in 1955 and organized the famous Dartmouth workshop where the field got its name. But McCarthy himself would be the first to tell you he didn’t invent AI alone.
Here’s why the question is tricky:
“AI” can mean different things: the ancient idea of artificial humans and thinking machines, the academic field launched in the 1950s, or today’s machine learning and deep learning systems.
Modern AI is a cumulative invention built on Alan Turing’s theory of computation, McCulloch and Pitts’ 1943 artificial neural networks, Turing’s 1950 paper proposing the imitation game, the 1955 Dartmouth proposal, and decades of subsequent breakthroughs.
No single paper, conference, or inventor created AI-it crystallized through overlapping contributions from mathematicians, psychologists, engineers, and computer scientists across multiple countries.
The rest of this article moves chronologically: roots before 1950, birth of AI at Dartmouth, key “inventors” in each era, and how we arrived at GPT-style systems transforming industries today.
The question of who invented AI starts centuries before digital computers existed. Humans have imagined artificial beings and intelligent machines since ancient times, laying cultural and mechanical groundwork for what would eventually become AI research.
Here are the concrete milestones that planted the seeds:
Ancient myths of artificial beings: Greek mythology gave us Talos, a bronze automaton that guarded Crete. Jewish folklore described the Golem, a clay figure animated by rabbinical incantations. These weren’t real AI, but they showed humanity’s fascination with creating intelligent systems outside the human brain.
18th–19th century automata and calculating machines: Pierre Jaquet-Droz’s 1770s creations-like “The Writer,” which could compose custom sentences using thousands of precisely crafted parts-demonstrated programmable mechanical “intelligence.” Charles Babbage’s 1830s Analytical Engine designs envisioned a steam-powered general-purpose computer with punched cards, conditional branching, and what we’d now call an arithmetic logic unit. Ada Lovelace’s 1843 notes recognized this engine could compose music and handle non-numerical tasks, and she is widely credited as the author of the first computer program.
1921: The word “robot” enters the vocabulary: Karel Čapek’s science fiction play “Rossum’s Universal Robots” (R.U.R.) introduced the word “robot” from the Czech “robota,” meaning forced labor. The play depicted artificial humans rebelling against exploitation-embedding ethical dilemmas about machine intelligence that echo in today’s AI debates.
1939–1942: Electronic computers make programmable intelligence thinkable: John Atanasoff and Clifford Berry’s ABC (Atanasoff-Berry Computer) solved linear equations using 300 vacuum tubes. Later, ENIAC (completed 1946) scaled to 18,000 tubes and 30 tons, performing 5,000 additions per second. These machines proved that computing machinery could execute complex tasks previously requiring human intervention.
These developments established the core assumption behind AI: that human thought might be mechanized and simulated in hardware and software. The stage was set for the formal logic and computing power that would turn speculation into science.

The 1940s and early 1950s created the mathematical and computational toolkit that modern AI still relies on today. This period saw the birth of neural network theory, the formalization of machine intelligence, and the earliest experiments in teaching machines to reason.
Warren McCulloch and Walter Pitts published “A Logical Calculus of the Ideas Immanent in Nervous Activity,” modeling biological neurons as simple binary threshold logic units. Their insight was profound: networks of these units could compute any logical function, including all Boolean operations.
This was the first artificial neural networks model capable of universal computation in principle-though limited to feedforward structures without learning. It connected the human brain’s architecture to formal logic, suggesting intelligent behavior could emerge from simple components.
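To make the idea concrete, here is a minimal sketch in Python of a binary threshold unit in the McCulloch-Pitts spirit. It is an illustration, not the original 1943 formalism: the weights and thresholds are chosen by hand to show how Boolean gates, and therefore any Boolean function, fall out of simple units.

```python
def mcp_neuron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of binary inputs reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

# Basic Boolean gates as single threshold units (weights/thresholds picked by hand):
AND = lambda a, b: mcp_neuron([a, b], [1, 1], threshold=2)
OR  = lambda a, b: mcp_neuron([a, b], [1, 1], threshold=1)
NOT = lambda a:    mcp_neuron([a],    [-1],   threshold=0)

# Composing units in layers yields functions no single unit can compute, e.g. XOR:
XOR = lambda a, b: AND(OR(a, b), NOT(AND(a, b)))

assert [XOR(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 1, 0]
```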
Alan Turing provided the theoretical foundation for everything that followed:
| Year | Contribution | Impact |
|---|---|---|
| 1936 | “On Computable Numbers” paper introducing the Turing machine | Proved the halting problem undecidable and established universal computation |
| 1950 | “Computing Machinery and Intelligence” | Proposed the Turing test (originally called the imitation game) as an operational definition of machine intelligence |
Turing’s 1950 paper asked the question that launched a field: “Can machines think?” His answer was to sidestep philosophical debates and propose a practical test-if a human judge couldn’t distinguish a computer program from a human being in text-based conversation, the machine could be said to exhibit intelligent behavior.
Norbert Wiener’s 1948 work on cybernetics synthesized feedback control across machines and organisms. His book sold over 50,000 copies by 1950, introducing concepts like feedback loops and information entropy that became precursors to AI control systems and later reinforcement learning.
Before the field had a name, researchers were already building:
1951: SNARC (Stochastic Neural Analog Reinforcement Computer) – Marvin Minsky and Dean Edmonds built the first neural network machine using 3,000 vacuum tubes to simulate 40 neurons. It played a mouse-in-maze game by associating directions with rewards-demonstrating rudimentary machine learning despite analog limitations.
Early game-playing programs – Claude Shannon’s 1950 paper on programming a computer to play chess proposed evaluating positions with minimax search. By 1955–56, Allen Newell and Herbert Simon’s Logic Theorist had proved 38 of the first 52 theorems from Whitehead and Russell’s Principia Mathematica, showing that computer systems could engage in logical reasoning.
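Shannon’s core idea, minimax search, is simple enough to sketch in a few lines of Python. This toy version scores a hand-built game tree rather than chess positions; the tree and the leaf scores are invented purely for illustration.

```python
def minimax(node, maximizing):
    """Return the best achievable score assuming both players play optimally.

    A node is either a numeric leaf score (the static evaluation of a position)
    or a list of child nodes representing the moves available from it.
    """
    if isinstance(node, (int, float)):            # leaf: use the evaluation directly
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

# A tiny hand-made game tree: the maximizing player should pick the left branch,
# because the opponent will answer with 3 there but only 2 on the right.
toy_tree = [[3, 5], [2, 9]]
print(minimax(toy_tree, maximizing=True))   # -> 3
```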
This period accumulated the tools-logic networks, computability theory, feedback, and proto-programs-that the Dartmouth conference would unify into a named discipline.
If you must pick one event where AI was “invented” as a field, it is the 1956 Dartmouth Summer Research Project on Artificial Intelligence. This workshop didn’t solve AI-but it created AI as a funded, named research discipline that persists to this day.
John McCarthy, then an assistant professor of mathematics at Dartmouth still in his twenties, drove the vision. In 1955 he wrote, with Marvin Minsky, Nathaniel Rochester, and Claude Shannon, the proposal that coined the term “artificial intelligence,” choosing it for neutrality: it sidestepped “cybernetics” (associated with Norbert Wiener’s analog focus) and “automata” (too narrow).
McCarthy requested $13,500 from the Rockefeller Foundation to fund the workshop. He later called his proposal a “flag to the mast” at the AI@50 conference in 2006, marking the ambition that unified disparate ideas into a coherent field.
| Participant | Affiliation | Key Contribution |
|---|---|---|
| John McCarthy | Dartmouth | Coined “AI,” later created Lisp, pioneered time-sharing |
| Marvin Minsky | Harvard | Cognitive modeling, later co-founded the MIT AI Lab |
| Claude Shannon | Bell Labs | Information theory, entropy, cryptography |
| Nathaniel Rochester | IBM | Chief architect of the IBM 701, early pattern recognition |
| Allen Newell | RAND/Carnegie | Logic Theorist co-creator, problem-solving research |
| Herbert Simon | Carnegie | Logic Theorist co-creator, cognitive psychology pioneer |
The workshop ran from June 18 to August 17, 1956, in Hanover, New Hampshire, with about 11 core attendees and additional visitors such as Ray Solomonoff (induction work) and Oliver Selfridge.
The proposal’s manifesto-like language set the field’s agenda:
“Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”
Specific targets included natural language processing, early neural networks, abstraction, problem solving, and self-improvement-ideas that remain central to AI today.
1955–56: Logic Theorist – Newell, Shaw, and Simon created one of the first AI programs, proving mathematical theorems using heuristic tree search and the Information Processing Language (IPL).
1956 onwards: AI labs formed at MIT (1959), Carnegie Mellon, and Stanford (1963), establishing AI research as an academic discipline.
McCarthy later admitted at the 2006 reunion that collaboration at Dartmouth was imperfect-attendees arrived at different times and pursued individual agendas. But the workshop formalized AI as fundable science, sparking decades of progress through booms and winters.

Calling John McCarthy the “father of AI” is common, but the field has multiple “parents” with distinct contributions. Think of AI’s invention as a relay race rather than a solo sprint.
McCarthy earns the title for institutionalizing the field:
Coined “artificial intelligence” in the 1955 Dartmouth proposal
Created Lisp in 1958, the first functional programming language with dynamic typing and garbage collection, which became the workhorse of early AI systems
Pioneered time-sharing (1961), allowing interactive computing versus batch processing
Developed concepts like the “Advice Taker” (a 1959 proposal for a program that reasons from declarative commonsense knowledge) and circumscription (1980, non-monotonic logic for commonsense reasoning)
Minsky co-founded the MIT AI Lab with McCarthy in 1959 and advanced:
Frames (1974): Structured knowledge representation with slots and defaults
Perceptrons critique (1969): Book with Seymour Papert exposing linear separability limits of early neural networks, influencing the field for decades
Work on robotics and machine perception at the MIT AI Lab, including early robotic arms (Shakey, the first mobile robot to integrate computer vision, planning, and natural-language commands, was built at SRI rather than MIT)
This duo demonstrated that digital computers could engage in symbolic reasoning:
Logic Theorist (1955–56): Often cited as the first AI program; it proved 38 of 52 theorems from Principia Mathematica
General Problem Solver (late 1950s): Means-ends heuristic search that influenced cognitive psychology
Physical Symbol System Hypothesis (1976): The claim that manipulating symbols is both necessary and sufficient for general intelligent action
They shared the 1975 Turing Award for their foundational contributions to artificial intelligence and cognitive science.
Samuel coined the term “machine learning” in 1959 for his IBM 704 checkers program (1952–1959). Using minimax search with alpha-beta pruning and an early form of temporal-difference learning, the program improved through self-play (over 200,000 games) until it played better checkers than Samuel himself and, by the early 1960s, could defeat respected human players.
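Samuel’s key trick, later formalized as temporal-difference learning, was to nudge the evaluation of a position toward the evaluation of the position that followed it. The sketch below is a generic TD(0) value update over hypothetical position labels, not Samuel’s actual checkers code or board features.

```python
def td_update(values, state, next_state, reward, alpha=0.1, gamma=1.0):
    """Nudge the estimate for `state` toward reward + gamma * estimate of `next_state` (TD(0))."""
    target = reward + gamma * values.get(next_state, 0.0)
    values[state] = values.get(state, 0.0) + alpha * (target - values.get(state, 0.0))

values = {}
game = ["opening", "midgame", "endgame"]      # hypothetical position labels, not real checkers states
for _ in range(50):                           # replay the same won game many times (self-play caricature)
    for s, s_next in zip(game, game[1:]):
        td_update(values, s, s_next, reward=0.0)
    td_update(values, game[-1], None, reward=1.0)   # terminal reward: a win
print(values)   # all three positions drift toward 1.0, with credit reaching earlier positions more slowly
```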
Rosenblatt’s 1957 Perceptron was the first trainable neural network for pattern recognition. The Mark I hardware read images through a 20×20 array of 400 photocells, with weights stored in motor-driven potentiometers that were adjusted during learning via the perceptron learning rule (a precursor to later gradient-based training such as backpropagation). Despite limitations exposed by the Minsky-Papert critique, the Perceptron pioneered the approach that would dominate modern AI.
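The learning rule itself fits in a few lines. This is a minimal perceptron trained on a toy linearly separable problem (the OR function), not a reproduction of the Mark I; the data and learning rate are chosen only for illustration.

```python
def train_perceptron(samples, epochs=20, lr=0.1):
    """Learn weights and bias with the perceptron rule: w += lr * (target - prediction) * x."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            y = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            error = target - y
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

# OR is linearly separable, so the rule converges; XOR (as Minsky and Papert showed) would not.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_perceptron(data)
print([(1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0) for x, _ in data])  # -> [0, 1, 1, 1]
```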
Some authors call Alan Turing the “father of theoretical AI” for formalizing computation and the intelligence test. But historically, McCarthy gets the title for founding AI as an institution-naming it, convening its first workshop, and building its core programming language.
AI was a collaborative invention spanning theory, algorithms, languages, and hardware. No single computer scientist created it alone.
AI’s first decades followed a pattern that would repeat: rapid optimism, overpromising, disillusionment, funding cuts, and eventual recovery. Understanding this history of AI explains why the field seems to be “reinvented” every decade.
Early AI research focused on symbolic reasoning-manipulating symbols and rules to mimic human intelligence:
Lisp (1958): McCarthy’s programming language standardized recursion and garbage collection, becoming the lingua franca of AI
SHRDLU (1968–70): Terry Winograd’s system parsed English commands in a blocks world (“Pick up the red block”), executing over 1,000 commands with 95% accuracy using procedural attachment
DENDRAL (mid-1960s): Lederberg and Feigenbaum created the first expert system, inferring molecular structures from mass spectra using generate-and-test heuristics
Researchers predicted that machines would match human intelligence within a generation. Herbert Simon famously predicted in 1965 that “machines will be capable, within twenty years, of doing any work a man can do.”
Reality hit hard:
| Event | Year | Impact |
|---|---|---|
| ALPAC Report (US) | 1966 | Concluded machine translation was uneconomic after $20M of investment |
| Lighthill Report (UK) | 1973 | Criticized AI for “toy problems” that didn’t scale; triggered £1M funding cuts |
Early systems worked only on constrained toy problems. Combinatorial explosion-chess has a branching factor of 35 per move-made scaling impossible with available computing power. Government funding dried up, and AI research contracted.
Expert systems revived commercial interest:
R1/XCON at DEC (1980): Configured VAX computers using 10,000 rules, saving $40 million per year by 1986
MYCIN (1972–76, Stanford): Diagnosed bacterial infections with 69% accuracy (versus 65% for clinicians) using 450+ rules and backward chaining with certainty factors; a toy sketch of this style of reasoning follows this list
Japan’s Fifth Generation Computer Systems (1982–92): ¥50 billion investment in Prolog-based parallel inference, advancing hardware despite commercial failure
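MYCIN-style reasoning can be caricatured in a few lines of Python: rules fire backwards from a goal, and each rule passes on its conclusion weighted by a certainty factor. The rules, facts, and numbers below are invented for illustration, not MYCIN’s actual knowledge base, and the combination scheme is a simplified stand-in for its full certainty-factor calculus.

```python
# Each rule: (conclusion, list of premises, certainty factor of the rule itself). All hypothetical.
RULES = [
    ("bacterial_infection", ["fever", "high_white_cell_count"], 0.7),
    ("prescribe_antibiotic", ["bacterial_infection"], 0.8),
]
FACTS = {"fever": 1.0, "high_white_cell_count": 0.9}   # observed evidence with certainty factors

def certainty(goal):
    """Backward-chain: return a certainty factor for `goal` from known facts and rules."""
    if goal in FACTS:
        return FACTS[goal]
    best = 0.0
    for conclusion, premises, rule_cf in RULES:
        if conclusion == goal:
            premise_cf = min(certainty(p) for p in premises)   # a conjunction is as certain as its weakest premise
            best = max(best, rule_cf * premise_cf)             # keep the strongest supporting rule
    return best

print(certainty("prescribe_antibiotic"))   # 0.8 * (0.7 * min(1.0, 0.9)) = 0.504
```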
Corporations invested heavily, believing intelligent systems would revolutionize business.
Meanwhile, alternatives to symbolic AI emerged:
1982: John Hopfield’s neural networks demonstrated content-addressable memory with energy-based networks storing 0.15N patterns via Hebbian learning
1986: Backpropagation – Rumelhart, Hinton, and Williams popularized the backpropagation algorithm for training multilayer neural networks by gradient descent, sharply reducing error on benchmark learning tasks (a minimal sketch follows this list)
1988: Judea Pearl’s Bayesian networks brought rigorous probabilistic reasoning to AI, giving systems a principled way to handle uncertainty in diagnosis and prediction
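The core of backpropagation is the chain rule applied layer by layer. The NumPy sketch below trains a tiny one-hidden-layer network on XOR; it illustrates the 1986 idea under invented settings (network size, learning rate, iteration count) and is not a reproduction of the paper’s experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)            # XOR targets

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)               # hidden layer
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)               # output layer
sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr = 0.5

for step in range(10000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: chain rule from the squared error back through each layer.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent updates.
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())   # typically converges toward [0, 1, 1, 0]
```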
Expectations again outpaced delivery. Lisp machine companies like Symbolics collapsed (losing $100 million), and expert systems proved brittle outside narrow domains. The AI winter returned as corporate and government funding retreated.
But the tools developed during both winters-neural networks, probabilistic reasoning, faster hardware-would fuel the next wave.
“Who invented AI” changes meaning in this era. The focus shifts from symbolic programs to data-driven learning and deep neural networks-requiring new heroes and new infrastructure.
AI research regrouped around statistical methods:
Support Vector Machines (Vapnik, 1995): Kernel methods enabled non-linear classification, and SVM-based pipelines dominated many vision and text benchmarks through the 2000s
LSTM Networks (Hochreiter & Schmidhuber, 1997): Gated memory cells let recurrent networks capture long-range dependencies in sequences, later cutting error rates sharply in speech recognition
Big data became crucial. The AI community realized that algorithms were often less important than having massive labeled datasets.
| Year | Development | Significance |
|---|---|---|
| 2004–2006 | Face Recognition Grand Challenge | Showed large-scale, benchmark-driven progress |
| 2006 | Hinton’s deep belief networks | Revived deep learning via unsupervised pretraining |
| 2007–2009 | Fei-Fei Li’s ImageNet project | 14 million labeled images; the ILSVRC benchmark subset covers 1,000 classes |
| 2009 | GPU-accelerated training (Raina, Ng) | Roughly 60x speedup versus CPUs |
ImageNet became the benchmark that would define progress in image recognition and computer vision for a decade.
The ai boom of the 2010s traces directly to one paper:
AlexNet (Krizhevsky, Sutskever, Hinton) won the 2012 ImageNet competition by cutting top-5 error from 26% to 15%. The architecture used the following ingredients (sketched in toy form after this list):
5 convolutional layers with ReLU activations
60 million parameters
Dropout for regularization
Training on 2 NVIDIA GPUs for 5 days
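For readers who think in code, here is a toy convolutional classifier in PyTorch with the same ingredients (convolutions, ReLU, dropout). It is vastly smaller than AlexNet and assumes 32×32 RGB inputs; it illustrates the style of architecture, not the original network.

```python
import torch
import torch.nn as nn

class TinyConvNet(nn.Module):
    """A miniature AlexNet-style classifier: conv -> ReLU -> pool blocks, then dropout + linear."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(p=0.5),                    # the regularization trick AlexNet popularized
            nn.Linear(32 * 8 * 8, num_classes),   # assumes 32x32 inputs (CIFAR-sized images)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

logits = TinyConvNet()(torch.randn(1, 3, 32, 32))   # one fake RGB image
print(logits.shape)                                  # torch.Size([1, 10])
```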
By 2015, ImageNet top-5 error had dropped below 5%, beating the commonly cited human baseline of about 5.1%. Deep learning techniques had arrived.
2011: IBM Watson wins Jeopardy! – 100 servers, 15TB of data, confidence-weighted question answering defeated human champions
2011: Apple’s Siri launches – Mainstream virtual assistants brought speech recognition and natural language to billions of iPhones
2016: DeepMind’s AlphaGo defeats Lee Sedol 4:1 – Monte Carlo tree search combined with deep value and policy networks mastered Go, a game whose search space had long been considered far beyond brute force
Geoffrey Hinton, Yann LeCun, and Yoshua Bengio shared the 2018 Turing Award for their foundational contributions:
Hinton: Deep belief networks, backpropagation popularization
LeCun: Convolutional neural networks (1989), enabling 99% MNIST accuracy
Bengio: Word embeddings (2003), sequence modeling
Just as McCarthy was the father of symbolic AI, these three are widely seen as the central “inventors” of modern deep learning.

The most visible “AI” to the public today-ChatGPT-style tools, image generators, virtual assistants that actually work-rests on the transformer architecture and massive-scale training that would have been unthinkable at Dartmouth.
Google researchers published “Attention Is All You Need” in 2017, introducing self-attention mechanisms that could:
Process entire sequences in parallel, with a constant number of attention steps between any two positions, versus the O(n) sequential steps a recurrent network needs (a minimal sketch follows this list)
Scale to hundreds of billions of parameters
Handle human language with unprecedented fluency
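The central operation, scaled dot-product attention, fits in a few lines of NumPy. This is a single attention head without the learned query/key/value projection matrices, multi-head machinery, or masking of the full transformer; the token embeddings are random placeholders.

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention: softmax(Q K^T / sqrt(d)) V, here with Q = K = V = X."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                         # how strongly each token attends to each other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # row-wise softmax
    return weights @ X                                    # each output is a weighted mix of all tokens

tokens = np.random.default_rng(0).normal(size=(4, 8))     # 4 tokens, 8-dimensional embeddings
print(self_attention(tokens).shape)                       # (4, 8): every position sees every other in one step
```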
This single architectural innovation enabled the leap from BERT (2018, 340M parameters) to GPT-3 (2020, 175B parameters) and beyond.
| Year | System | Significance |
|---|---|---|
| 2014 | GANs (Goodfellow) | Min-max adversarial training enabled photorealistic synthetic images |
| 2020 | GPT-3 | 175B parameters, strong few-shot performance across dozens of language benchmarks |
| 2020 | Diffusion models (Ho et al.) | Iterative denoising for high-quality image generation |
| 2022 | ChatGPT launch | 100 million users in 2 months; conversational AI goes mainstream |
| 2023 | DALL·E 3, Midjourney V5 | Text-to-image reaches near-photorealistic quality |
Generative AI transformed from research curiosity to consumer product in under three years.
The “inventors” of today’s AI are increasingly large research teams with billion-dollar budgets:
OpenAI: GPT series, DALL·E, Sora (video generation), ChatGPT
Google DeepMind: AlphaGo, AlphaFold (protein structure prediction; 2024’s AlphaFold 3 extends to DNA, RNA, and ligand interactions), Gemini models
Anthropic: Claude series (200k context window, constitutional AI training)
Meta: LLaMA open-weight models, PyTorch framework
xAI, Mistral, Cohere: Emerging challengers in large language models
Modern AI development involves:
Research teams of hundreds of AI researchers and engineers
Training runs costing $100+ million (estimated for GPT-4)
Hardware vendors like NVIDIA (H100 chips delivering roughly 4 PFLOPS of low-precision compute)
Open-source communities contributing datasets, frameworks, and fine-tuning
Regulators developing governance (Biden Executive Order 2023, EU AI Act 2024)
The question “who invented AI” now has an answer more like “who invented the internet”-it’s an ecosystem, not a single genius.
Because breakthroughs now land weekly, professionals rely on curated weekly briefings instead of trying to track every minor model release. KeepSanity AI filters the signal from the noise-one email per week with only the major developments that actually matter.
AGI-systems matching or exceeding human-level, broad intelligence across diverse tasks-has not been invented yet, despite what marketing materials might suggest. Current systems, however impressive, are narrow AI optimized for specific complex tasks.
AGI is distinct from:
Narrow AI: Systems that excel at single tasks (AlphaGo plays Go, GPT writes text)
Current LLMs: Powerful pattern recognizers lacking genuine causal understanding or robust planning
Early AGI thinking includes:
I.J. Good (1965): Described an “ultraintelligent machine” capable of recursive self-improvement
Vernor Vinge (1993): Coined “technological singularity” for the point where AI-driven change becomes unpredictable
Labs like OpenAI and DeepMind explicitly target more general systems:
OpenAI o1 (2024): Reasoning chains attempting multi-step problem solving
DeepMind scaling research: Exploring whether scale alone leads to AGI
But significant gaps remain. LLMs show “emergence” (in-context learning appearing at 10B+ parameters) but struggle with:
Abstract reasoning benchmarks like ARC (30-40% accuracy vs. human 85%)
Genuine causal understanding vs. pattern correlation
Robust planning in novel environments
The potential for AGI raises alignment challenges:
Alignment problem: Ensuring advanced AI systems reliably pursue goals compatible with human values and intentions
Mesa-optimization risks: Inner misaligned goals emerging during training
Research organizations: CAIS, Anthropic’s Constitutional AI, OpenAI’s alignment team
Emerging regulation includes:
Biden Executive Order (2023): Red-team testing requirements for models above 10^26 FLOPs
EU AI Act (2024): High-risk system restrictions
China’s content-labeling rules for AI-generated media
Whoever “invents” AGI in the future will be standing on a century of prior AI inventions-from Turing’s formalism to McCulloch-Pitts neurons to transformers-not starting from zero.
These FAQs cover related questions not fully addressed above. For quick reference, here are the answers to what readers commonly ask about AI’s origins.
John McCarthy did not single-handedly create all AI techniques, but three contributions made him the field’s recognized founder:
Coined “artificial intelligence” in the 1955 Dartmouth proposal, giving the field its name
Organized the 1956 Dartmouth Summer Research Project, bringing together the field’s earliest researchers and formally launching AI as an academic discipline
Developed Lisp (1958) and foundational concepts in logical AI that powered decades of research
This combination of naming, convening, and technical innovation is why the AAAI and most historians call him the “father of AI.” But McCarthy himself acknowledged AI was a collaborative effort-he organized the wedding, but many people built the marriage.
Turing provided the theoretical bedrock rather than the institutional founding. His contributions include:
1936: Formalizing computation with the universal Turing machine concept, proving what is and isn’t computable
1950: Publishing “Computing Machinery and Intelligence,” which proposed the Turing test as a practical way to ask “Can machines think?”
Some historians call Turing the “father of theoretical computer science and AI.” His 1950 paper, “Computing Machinery and Intelligence,” remains foundational. But McCarthy is more specifically tied to AI as a named field because he created the term and organized its founding workshop.
AI’s path from lab to consumer followed a gradual timeline:
1970s–80s: Expert systems like DEC’s R1 deployed in corporations, but invisible to consumers
1990s–2000s: Search engines, recommendation systems, and spam filters quietly used AI techniques without marketing it as “AI”
2011 onward: Visible consumer AI emerged-Siri, Google Assistant, Alexa, and autonomous systems like Roomba brought AI to mainstream awareness
2022: ChatGPT made conversational generative AI a household concept
AI often “disappears” into everyday products once it works reliably. The language-translation feature in your browser, the speech recognition in your phone, the medical diagnosis support in hospitals-all use AI, but we stop calling them “AI” once they’re normal.
Generative AI is a branch of AI with multiple milestones rather than a single inventor:
| Era | Development | Inventor(s) |
|---|---|---|
| 1960s | ELIZA chatbot | Joseph Weizenbaum |
| 2014 | GANs enabling realistic images | Ian Goodfellow |
| 2017 | Transformers enabling fluent text | Vaswani et al. (Google) |
| 2020–22 | GPT-3, DALL·E, ChatGPT | OpenAI research teams |
Generative AI results from decades of progress in neural networks, optimization algorithms, massive data collection (big data), and exponentially growing computing power. It’s a symphony, not a solo.
AI progresses in waves, each bringing new excitement and eventual consolidation:
Symbolic reasoning (1950s–70s): Logic, theorem proving, expert rules
Expert systems (1980s): Knowledge-encoded business applications
Statistical ML (1990s): SVMs, probabilistic models, data-driven approaches
Deep learning (2010s): Neural networks at scale, image recognition breakthroughs
Transformers and agents (late 2010s–2020s): Large language models, autonomous systems
Each wave gets hyped as revolutionary, faces partial disillusionment, then consolidates into real products. The pattern repeats because each “invention” builds on previous foundations while introducing genuinely new capabilities.
Staying informed without burning out means tracking only the truly major shifts. KeepSanity AI delivers exactly this-one weekly email with the signal, zero daily filler to waste your time.
AI wasn’t invented by a single genius-it’s a century-long relay race of ideas passed from Turing to McCarthy to Minsky to Hinton and beyond. Every breakthrough stands on foundations laid by previous generations of researchers, engineers, and dreamers who imagined creating thinking machines.
The next major AI development could come from an established lab, an open-source community, or an unexpected direction entirely. That’s what makes tracking the field both exciting and exhausting.
If you need to stay informed but refuse to let newsletters steal your sanity, lower your shoulders. The noise is gone. Here is your signal.