What's an Algorithm?

Definition - Algorithm: a set of instructions, or rules that, starting with initial data, solves a class of problems using precise operations, executed mechanically, without creative human intervention.

Cooking recipes could be thought of as algorithms. They enumerate a step-by-step procedure which achieves a task (the final dish) starting with initial data (ingredients) and asking you to execute relatively straight forward instructions. Yet, any of us who tried to use a cookbook know that 'straight forward' is such a relative term. Recipes tend to be ambiguous and could potentially be a nightmare for the beginner chef. So, then, what should we ask of a good recipe?

Algorithms' properties

To qualify as an algorithm, the set of instructions should have some properties:

1. Preciseness - Or non-ambiguity. Each step must be non-ambiguous and executable without creative intervention.
2. Generality - It should solve a class of problems, not just one particular problem.
3. Finiteness - The algorithm must finish in a finite amount of time. For all practical purposes it should finish in reasonable time (e.g. before we die).
4. Correctness - The end result should be correct for all inputs.

Do recipes have the above properties? Supposedly they are correct, though some of your dinner guests might disagree (if they feed it to your dog while nobody's watching that might be a sign). They are also finite. What they lack is generality, since they cook one dish using fixed input. And what most of them lack is preciseness, which, on one hand makes them bad examples of algorithms, but on the other hand leaves way for creativity and uniqueness.

Word's origins

The word algorithm has interesting roots. Some say it comes from al-Khwārizmī, a Persian mathematician who lived at around 800 and wrote books on linear and quadratic equations. Other people say it is an anagram of logarithm, a mathematical operation which is an inverse of the exponential. In the old days there seemed to be little distinction between algorithms and 'math recipes'. For a long time the word was linked with Euclid's algorithm for computing the greatest common divisor. See The timeline of algorithms for more details. Also, for a more detailed introduction to the concept visit Algorithm. We'll move on to see how we can create and describe algorithms.

Flow Charts

Flow charts are one way to describe algorithms (another way is pseudocode, but we won't be dealing with that one here). They consist of the following symbols:

• • • • Computation box

As the name suggests, these boxes are used to compute things. They usually comprise mathematical computations. Examples:

• computes the expression and stores the result (in this case ) in .
• takes the old value of , adds to it and stores the result in variable , overwriting the old value. If was initially , the final value will be , after executing this block.
• negates the value stored in and assigns it to . In our case would be .

These boxes deal with the flow of data in and out of our algorithm. We can read (or input) initial values, as well as write (or output) the results. Examples:

• asks for a value to be entered and stores that value in . The 'R' is for 'read'.
• asks for two values, one to be stored in and the other in .
• writes (or outputs, displays) the value currently stored in . The 'W' is for 'write'.

Decision box

A decision symbol is used to create a 'fork on the road'. It branches the execution depending on the result of the condition inside the box. If the expression is true, the execution of the algorithm continues on the right branch. If the expression is false the execution continues on the left branch. Examples:

• tests if is less than zero, when reaching this box in the execution of the algorithm. If is less than zero, we move on to the boxes on the right branch. If is greater than zero, or equal to zero, we continue on the left branch.
• tests if 's value is at least 's value plus . If so, we continue on the right branch. If not, we continue on the left branch.

We'll get a better idea about decision boxes when we design our first flow chart.

Terminator box

I wouldn't even call this a proper symbol, we could probably do without it just fine. Its only role is to set where the algorithm starts and where it ends. Here are, for now, the two terminators we will use:

• is where the flow chart starts.
• is where the flow chart ends.

Example one: Absolute value

We will now plunge into the hot waters of flow charts by designing our first one. And no, it's not the 'hello world' example, it's boring and so classical. Let us try to compute the following math function: What function is this? Indeed, it's the absolute value, or modulus function. Let's build a flow chart that computes it:

To execute a flow chart, we start at the START symbol and follow the arrows. First thing we read , the number for which we want to compute the absolute value. Then we test if it's less than zero. If so, we follow the right branch, the branch, where we compute , the absolute value, as negative . If is greater or equal to zero than we follow the branch, making the value for . Variable holds the value of the function . In the end we display and we stop at the STOP symbol.

The flow chart implements the definition function exactly. One alternative is to display the result on each branch, without storing it in (which means we don't really need , we used it for extra clarity).

Example two: Sum of the first n integers

Having successfully survived the encounter with our first algorithm, let's go on with a second one, shall we? Let's compute the following summation: In human language that is the sum of the first integers: Number is the input of our algorithm, while number is the output.

One way to do this is to have a counter, a variable that counts from to ; let's name it . More precisely, will hold all values from to , one after another. For that, we create a loop in which is increased by on each iteration:

Remember, is interpreted as taking the old value of , then adding to it, and then storing it as the new value of . Therefore, this flow chart section will add one, to , in a loop. However, it will never end (and we know that is not quite an algorithm). To fix it, we have to add a condition to exit the loop when exceeds :

There is one more thing we need to do: our counter has to start somewhere. Let's initialize it with , the first number we need to add:

Now, that we have a counter which cycles through the numbers we need to add, all that's left to do is add it to our sum . The sum needs to be initialized, as well. What should its initial value be? Math teaches us that zero is probably a good candidate, since it is the neutral value for addition.

The counter has dual purpose, in our algorithm. First, it makes sure that the loop is executed times. Second, it is used in computing the sum. In general, counters control the number of times the loop gets executed, but they don't always take part in the computation, as in our case.

The summation variable, is an accumulator. It accumulates the result incrementally.

By this time I'd expect math geeks to jump up and down crying there's a simpler and more efficient way to do this. Indeed, there is a formula that computes the sum of the first n integers (derived from the general case of arithmetic progressions): The flow chart for computing the sum becomes much simpler:

This is clearly a simpler and faster solution. There are two morals to this. First, we started with a more complicated solution to get our feet wet with a more complex example. Second, math is a powerful tool when building algorithms. Don't ignore it. Gauss certainly didn't. In elementary school his teacher, J.G. Büttner tried to occupy pupils by making them add up the integers from 1 to 100. The young Gauss produced the correct answer within seconds by a flash of mathematical insight, to the astonishment of all. Gauss had realized that pairwise addition of terms from opposite ends of the list yielded identical intermediate sums: 1 + 100 = 101, 2 + 99 = 101, 3 + 98 = 101, and so on, for a total sum of 50 × 101 = 5050. For more information, see Gauss' Biography and Gauss's Day of Reckoning.

To make everyone happy, including folks who might think "if he wanted a more complex example, why didn't he pick one that didn't have a trivial solution", imagine the problem of computing the product of the first integers, which mathematicians call the factorial:

Definition - Factorial: Not surprisingly, the algorithm looks the same as our first solution, except the sign in becomes a ( is the symbol for multiplication) and has to be initialized to . There is no simple formula for computing .

This example introduces two common tools in algorithms: counters and accumulators. Let's try and define them.

Definition - Counter: a variable that keeps track of the number of times an event occurs. It is usually incremented by one at a time, and it is used frequently to count (and decide) the number of times a loop is executed.

Definition - Accumulator: a variable that holds a partial result and that aggregates values obtained incrementally along the computation process. The values are usually computed in a loop and they are usually added to the accumulator (though they could contribute in a more complex way). At the end of the process the accumulator contains the final result.

In other words, a counter counts, and an accumulator accumulates. What a surprise!

Homework

Solve the following problems: First Degree Equation, Second Degree Equation. Absolute value of a number Absolute value of a number - version 2 Simplified counter Finite counter Finite counter counting from 1 to n Sum of integers from 1 to n Sum of integers from 1 to n

IFoundations

Introduction3

1The Role of Algorithms in Computing5

1.1Algorithms5

1.2Algorithms as a technology11

2Getting Started16

2.1Insertion sort16

2.2Analyzing algorithms23

2.3Designing algorithms29

3Growth of Functions43

3.1Asymptotic notation43

3.2Standard notations and common functions53

4Divide-and-Conquer65

4.1The maximum-subarray problem68

4.2Strassen's algorithm for matrix multiplication75

4.3The substitution method for solving recurrences83

4.4The recursion-tree method for solving recurrences88

4.5The master method for solving recurrences93

4.6Proof of the master theorem97

5Probabilistic Analysis and Randomized Algorithms114

5.1The hiring problem114

5.2Indicator random variables118

5.3Randomized algorithms122

5.4Probabilistic analysis and further uses of indicator random variables130

IISorting and Order Statistics

Introduction147

6Heapsort151

6.1Heaps151

6.2Maintaining the heap property154

6.3Building a heap156

6.4The heapsort algorithm159

6.5Priority queues162

7Quicksort170

7.1Description of quicksort170

7.2Performance of quicksort174

7.3A randomized version of quicksort179

7.4Analysis of quicksort180

8Sorting in Linear Time191

8.1Lower bounds for sorting191

8.2Counting sort194

8.4Bucket sort200

9Medians and Order Statistics213

9.1Minimum and maximum214

9.2Selection in expected linear time215

9.3Selection in worst-case linear time220

IIIData Structures

Introduction229

10Elementary Data Structures232

10.1Stacks and queues232

10.3Implementing pointers and objects241

10.4Representing rooted trees246

11Hash Tables253

11.2Hash tables256

11.3Hash functions262

11.5Perfect hashing277

12Binary Search Trees286

12.1What is a binary search tree?286

12.2Querying a binary search tree289

12.3Insertion and deletion294

12.4Randomly built binary search trees299

13Red-Black Trees308

13.1Properties of red-black trees308

13.2Rotations312

13.3Insertion315

13.4Deletion323

14Augmenting Data Structures339

14.1Dynamic order statistics339

14.2How to augment a data structure345

14.3Interval trees348

Introduction357

15Dynamic Programming359

15.1Rod cutting360

15.2Matrix-chain multiplication370

15.3Elements of dynamic programming378

15.4Longest common subsequence390

15.5Optimal binary search trees397

16Greedy Algorithms414

16.1An activity-selection problem415

16.2Elements of the greedy strategy423

16.3Huffman codes428

16.4Matroids and greedy methods437

16.5A task-scheduling problem as a matroid443

17Amortized Analysis451

17.1Aggregate analysis452

17.2The accounting method456

17.3The potential method459

17.4Dynamic tables463

Introduction481

18B-Trees484

18.1Definition of B-trees488

18.2Basic operations on B-trees491

18.3Deleting a key from a B-tree499

19Fibonacci Heaps505

19.1Structure of Fibonacci heaps507

19.2Mergeable-heap operations510

19.3Decreasing a key and deleting a node518

19.4Bounding the maximum degree523

20van Emde Boas Trees531

20.1Preliminary approaches532

20.2A recursive structure536

20.3The van Emde Boas tree545

21Data Structures for Disjoint Sets561

21.1Disjoint-set operations561

21.3Disjoint-set forests568

21.4Analysis of union by rank with path compression573

VIGraph Algorithms

Introduction587

22Elementary Graph Algorithms589

22.1Representations of graphs589

22.3Depth-first search603

22.4Topological sort612

22.5Strongly connected components615

23Minimum Spanning Trees624

23.1Growing a minimum spanning tree625

23.2The algorithms of Kruskal and Prim631

24Single-Source Shortest Paths643

24.1The Bellman-Ford algorithm651

24.2Single-source shortest paths in directed acyclic graphs655

24.3Dijkstra's algorithm658

24.4Difference constraints and shortest paths664

24.5Proofs of shortest-paths properties671

25All-Pairs Shortest Paths684

25.1Shortest paths and matrix multiplication686

25.2The Floyd-Warshall algorithm693

25.3Johnson's algorithm for sparse graphs700

26Maximum Flow708

26.1Flow networks709

26.2The Ford-Fulkerson method714

26.3Maximum bipartite matching732

26.4Push-relabel algorithms736

26.5The relabel-to-front algorithm748

VIISelected Topics

Introduction769

28Matrix Operations813

28.1Solving systems of linear equations813

28.2Inverting matrices827

28.3Symmetric positive-definite matrices and least-squares approximation832

29Linear Programming843

29.1Standard and slack forms850

29.2Formulating problems as linear programs859

29.3The simplex algorithm864

29.4Duality879

29.5The initial basic feasible solution886

30Polynomials and the FFT898

30.1Representing polynomials900

30.2The DFT and FFT906

30.3Efficient FFT implementations915

31Number-Theoretic Algorithms926

31.1Elementary number-theoretic notions927

31.2Greatest common divisor933

31.3Modular arithmetic939

31.4Solving modular linear equations946

31.5The Chinese remainder theorem950

31.6Powers of an element954

31.7The RSA public-key cryptosystem958

31.8Primality testing965

31.9Integer factorization975

32String Matching985

32.1The naive string-matching algorithm988

32.2The Rabin-Karp algorithm990

32.3String matching with finite automata995

32.4The Knuth-Morris-Pratt algorithm1002

33Computational Geometry1014

33.1Line-segment properties1015

33.2Determining whether any pair of segments intersects1021

33.3Finding the convex hull1029

33.4Finding the closest pair of points1039

34NP-Completeness1048

34.1Polynomial time1053

34.2Polynomial-time verification1061

34.3NP-completeness and reducibility1067

34.4NP-completeness proofs1078

34.5NP-complete problems1086

35Approximation Algorithms1106

35.1The vertex-cover problem1108

35.2The traveling-salesman problem1111

35.3The set-covering problem1117

35.4Randomization and linear programming1123

35.5The subset-sum problem1128

VIIIAppendix: Mathematical Background

Introduction1143

ASummations1145

A.1Summation formulas and properties1145

A.2Bounding summations1149

BSets, Etc.1158

B.1Sets1158

B.2Relations1163

B.3Functions1166

B.4Graphs1168

B.5Trees1173

CCounting and Probability1183

C.1Counting1183

C.2Probability1189

C.3Discrete random variables1196

C.4The geometric and binomial distributions1201

C.5The tails of the binomial distribution1208

DMatrices1217

D.1Matrices and matrix operations1217

D.2Basic matrix properties1222

Bibliography1231

Index

Supplemental Content

Bug reports

A list of known bugs and errata may be found here. Please review this list before sending a new bug report. Reports of previously-unreported bugs, misprints, and other errata may be sent to clrs-bugs@mit.edu

Solutions PDF

As of the third edition, solutions for a select set of exercises and problems are available here in PDF format: Intro to Algorithms Solutions PDF.

This file contains the follow content:

Chapter 2: Getting Started

Chapter 3: Growth of Functions

Chapter 4: Divide-and-Conquer

Chapter 5: Probabilistic Analysis and Randomized Algorithms

Chapter 6: Heapsort

Chapter 7: Quicksort

Chapter 8: Sorting in Linear Time

Chapter 9: Medians and Order Statistics

Chapter 11: Hash Tables

Chapter 12: Binary Search Trees

Chapter 13: Red-Black Trees

Chapter 14: Augmenting Data Structures

Chapter 15: Dynamic Programming

Chapter 16: Greedy Algorithms

Chapter 17: Amortized Analysis

Chapter 21: Data Structures for Disjoint Sets

Chapter 22: Elementary Graph Algorithms

Chapter 23: Minimum Spanning Trees

Chapter 24: Single-Source Shortest Paths

Chapter 25: All-Pairs Shortest Paths

Chapter 26: Maximum Flow

Instructor's Manual and Figure Files

An instructor's manual is available to instructors who have adopted the book for course use. The manual is not available to students, or to those studying the book on their own. Instructors with access to the manual must agree not to share the manual's password, to make any of the solutions publicly accessible, on the Web or otherwise, or to make any of their own solutions publicly available.

Instructors using the MIT Press English language edition of the book may request access to the online instructor’s manual and figure file via the Instructor Resources link listed to the left under the Instructor Resources heading.

Instructors using the book in another language should contact the publisher licensing the book in that language.

The downloadable instructor's manual is updated periodically; it was last updated February 24, 2014.

Downloaded with the instructor's manual is a file of the figures/illustrations in the textbook and PDFs of pseudocode in the instructor's manual notes (not the solutions).

clrscode3e

The clrscode3e package gives you pseudocode in the style of the third edition. You can download the package and its documentation here: clrscode3e package

Web Material

Chapters 19 and 27 were removed from the second edition, and chapters 20, 26, and 28 were substantially revised from the second to the third edition. Those second edition chapters are available here.

Chapter 19 Bionomial Heaps
Chapter 20 Fibonacci Heaps
Chapter 26 Maximum Flow
Chapter 27 Sorting Networks
Chapter 28 Matrix Operations

Professor jokes

Wondering about the professor names in the text? The jokes are explained here.

Request an Exam or Desk Copy

Use the Request Exam/Desk Copy link listed to the left under Instructor Resources

As of the third edition, this textbook is published exclusively by the MIT Press.

Please refer any inquiries to the appropriate representative of the MIT Press:

IN THE UNITED STATES AND CANADA, contact:

Michelle Pullano, Textbook Manager
mpullano@mit.edu
617-253-3620
The MIT Press
One Rogers Street
Cambridge, MA 02142

IN THE UK AND CONTINENTAL EUROPE, MIDDLE EAST, INDIA, ISRAEL, AND AFRICA, contact:

Judith Bullent
jbullent@mitpress.org.uk
+44 (0) 20 7306 0603
The MIT Press
Suite 2
1 Duchess Street
London W1W 6AN
UK

IN SOUTH AND CENTRAL AMERICA, AUSTRALIA, NEW ZEALAND, AND ASIA (EXCLUDING INDIA AND PAKISTAN), contact:

Jessica Lawrence-Hurt
jclh@mit.edu
617-258-0582
The MIT Press
One Rogers Street
Cambridge, MA 02142

IN INDIA AND PAKISTAN, contact:

PHI Learning Limited, Delhi, India
complimentary@phindia.com