AI and CompArch
200 Years Together
Fall 2025 2/26
The Birth of AI Field
● 2022: The launch of ChatGPT
● 2017: The introduction of the Transformer model and Tensor Cores
● 2014: Generative Adversarial Networks (GANs) invented
● 1966: ELIZA by Joseph Weizenbaum, the first chatbot
● 1958: Lisp by John McCarthy, the AI language
● 1956: AI officially established as a field of study
● 1950: The Turing Test by Alan Turing, a test for machine intelligence
● 1834: Babbage’s Analytical Engine
The Birth of AI Field
Charles Babbage designed the Analytical Engine (1834), though it was never completed. Annabella Byron, Ada Lovelace’s mother, called it a “Thinking Machine”; Ada Lovelace “programmed” it.
Lots of Irritating Silly Parentheses
● Lisp: LISt Processing language
● Designed by John McCarthy (MIT) in 1958 for AI research
● The second-oldest high-level programming language (after Fortran)
● First implemented on the IBM 704
Lists in Lisp
● A cons[tructed] pair: (Expr1 . Expr2)
● A list is a cons where Expr1 is a value and Expr2 is a list or NIL
● (1 . (2 . ("hello" . (-1 . ('world . NIL)))))
● Equivalently: (1 2 "hello" -1 'world)
● (SETQ my_list '(1 2 "hello" -1 'world)) ; assignment
● The quote ' prevents its argument from being evaluated as a Lisp expression
● Commands are lists, too!
IBM 704
● Introduced in 1954; Gene Amdahl was its chief architect, and John Backus’s team developed Fortran for it
● Registers:
  ● Accumulator
  ● Multiplier/Quotient
  ● Sense Indicator
  ● Index registers (XR1, XR2, XR3)
  ● Program counter (PC)
● Instruction formats: Type A and Type B
● Mean time to failure: 8 hours
“Type A” Instruction Format
● Four fields (36 bits total):
  ● 3-bit prefix (opcode)
  ● 15-bit decrement (for immediate operands or subtractive indexing)
  ● 3-bit tag for index-register selection (one bit per XR)
  ● 15-bit address (or immediate)
● Assembly macros for accessing the parts:
  ● CAR (Contents of the Address part of the Register)
  ● CDR (Contents of the Decrement part of the Register)
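The field layout above can be sketched as plain bit-field extraction (a hypothetical illustration, not a 704 emulator; field positions follow the slide's order, with the 3-bit prefix in the high bits of the 36-bit word):

```python
# Hypothetical sketch: unpack a 36-bit "Type A" word into its four fields.
# Bit layout (high to low): 3-bit prefix | 15-bit decrement | 3-bit tag
# | 15-bit address.
def unpack_type_a(word: int):
    assert 0 <= word < (1 << 36), "Type A words are 36 bits"
    prefix    = (word >> 33) & 0o7       # 3-bit opcode prefix
    decrement = (word >> 18) & 0o77777   # 15-bit decrement
    tag       = (word >> 15) & 0o7       # 3-bit index-register tag
    address   = word & 0o77777           # 15-bit address (or immediate)
    return prefix, decrement, tag, address

def pack_type_a(prefix, decrement, tag, address):
    return (prefix << 33) | (decrement << 18) | (tag << 15) | address
```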
Lisp + 704
● Store cons cells as “Type A” instruction words:
  ● Head pointer in the CAR
  ● Tail (“rest of the list”) pointer in the CDR
● Hence the Lisp functions CAR and CDR:
  ● (CAR my_list) ; 1
  ● (CDR my_list) ; (2 "hello" -1 'world)
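The cons/CAR/CDR correspondence can be mimicked in Python (an illustrative sketch, not Lisp: cons cells are modeled as two-element tuples, mirroring the address and decrement halves of a 704 word):

```python
# Illustrative model of Lisp cons cells: each cell holds a head (CAR)
# and a tail (CDR), just as a 704 word held an address and a decrement.
NIL = None

def cons(head, tail):
    return (head, tail)

def car(cell):
    return cell[0]  # "contents of the address part"

def cdr(cell):
    return cell[1]  # "contents of the decrement part"

def from_pylist(xs):
    """Build a cons list, e.g. [1, 2] -> (1 . (2 . NIL))."""
    out = NIL
    for x in reversed(xs):
        out = cons(x, out)
    return out

my_list = from_pylist([1, 2, "hello", -1, "world"])
```

With these definitions, `car(my_list)` returns 1 and `cdr(my_list)` is the cons list holding the remaining elements, matching the slide's examples.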
Lisp Machines
● General-purpose computers designed to run Lisp efficiently
● An example of high-level-language computer architecture (machines were also built for ALGOL 60, BASIC, Pascal, Ada, Occam, Java, etc.)
● Built in the 1980s
● Manufacturers: Symbolics, Lisp Machines Incorporated, Texas Instruments, Xerox
● Operating systems written in Lisp Machine Lisp, Interlisp, and Common Lisp
● Hardware: a stack machine whose instruction set was optimized for Lisp
The Birth of AI Field
● 2022: The launch of ChatGPT
● 2017: The introduction of the Transformer model and Tensor Cores (“Attention Is All You Need” by Vaswani et al.)
Transformers
● A transformer is a neural network built of two repeated building blocks:
  ● Self-attention: weighs the importance of tokens in an input sequence to better capture the relations between them
  ● MLP (multi-layer perceptron): a feedforward network of fully connected neurons with a nonlinear activation function
● Used for NLP (Natural Language Processing): chatbots, text generation, summarization
● BERT (Bidirectional Encoder Representations from Transformers) from Google (embeddings, Google Search)
● Generative AI
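The self-attention block above can be sketched in a few lines (a pure-Python simplification: a single head with no learned Q/K/V projections, so queries, keys, and values are all the raw token vectors):

```python
import math

def softmax(xs):
    m = max(xs)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(X):
    """X: list of token vectors. Each output is a weighted mix of all
    token vectors, weighted by scaled dot-product similarity."""
    d = len(X[0])
    out = []
    for q in X:                      # one query per token
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in X]
        w = softmax(scores)          # attention weights sum to 1
        out.append([sum(wj * v[i] for wj, v in zip(w, X))
                    for i in range(d)])
    return out
```

A real transformer layer adds learned projection matrices for Q, K, and V, multiple heads, masking, and a residual connection, but the weighting mechanism is the same.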
From Transformers to Linear Algebra
● Matrix-multiply (matmul) heavy components:
  ● Embeddings/projections
  ● (Masked) multi-headed self-attention
  ● Feed-forward network
  ● Linear output
● Matmul complexity:
  ● Practically, O(N^3)
  ● Theoretically, O(N^2.371339)
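The cubic cost is visible in a naive triple-loop multiply (a sketch; real frameworks call tuned BLAS or tensor-core kernels instead):

```python
# Naive matrix multiply: three nested loops, so O(N^3) scalar
# multiply-adds for square N x N matrices, the "practical" bound above.
def matmul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must match"
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]
```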
Graphics Processing Units (GPUs)
● Designed for computer games (real-time 3D graphics), now also used for AI
● NVIDIA GPU:
  ● Connects to a CPU (“the host”) over a PCIe or NVLink bus; GPUs interconnect with one another over NVLink
  ● Includes 80–132 streaming multiprocessors (SMs)
  ● Includes 10–100 MB of L2 cache shared among the SMs
  ● Includes 40–100 GB of device memory (HBM), shared among the SMs
Streaming Multiprocessor
● Provides 64–128 FP32 cores, 32–64 FP64 cores, and 4–8 tensor cores
● Provides 65,536 32-bit registers
● Includes 128–256 KB of L1 cache
● Executes instructions SIMD-style (“single instruction, multiple data”): a group of threads (a warp) executes the same instruction on different data simultaneously
CUDA Cores
● CUDA (Compute Unified Device Architecture) is NVIDIA’s parallel computing platform
● A CUDA core within an SM serves as the functional unit of parallel processing
● CUDA software enables the use of GPUs for general-purpose processing
● CUDA programs are written in C/C++ with extensions
Tensor Cores
● Specialized units that perform matrix multiplications much faster than general-purpose CUDA cores
● Perform fused matrix multiply-and-add: D = A × B + C
● Deliver much higher throughput
● Work well with lower-precision data (FP16 / INT8 / INT4)
● Reduce memory bandwidth demands
● Increase effective cache capacity
● Cut energy consumption
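The multiply-and-add primitive can be illustrated on small tiles (a plain-Python sketch; actual tensor cores execute this on small fixed-size low-precision tiles in a single hardware operation):

```python
# Sketch of the tensor-core primitive: D = A x B + C on an n x n tile.
def mma(A, B, C):
    n = len(A)
    return [[C[i][j] + sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)]
```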
What’s Next?
Operating Systems