Advanced Data Structures using ‘C’ Unit 1
Unit 1 Advanced Topics in Trees and
Their Applications
Structure
1.1 Introduction
Objective
Self Assessment Questions
1.2 Analyzing the BST Search Algorithm
Self Assessment Questions
1.3 Inserting Nodes into a BST
1.4 The Order of Insertion Determines the BST's Topology
1.5 Deleting Nodes from a BST
Self Assessment Questions
1.6 Traversing the Nodes of a BST
1.6.1 Preorder Traversal
1.6.2 Inorder Traversal
1.6.3 Postorder Traversal
Self Assessment Question
1.7 The Cost of Recursion
1.8 Binary Search Trees in the Real-World
1.9 Big-O Notation
1.9.1 Application of Big-O notation to algorithm analysis
1.10 Big- and Big-
1.10.1 Properties of the sets O(f(n)) and (f(n))
Self Assessment Questions
1.10.2 Other useful mathematical formulae
Self Assessment Questions
1.11 Height Balanced Trees
1.11.1 AVL Trees
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 2
1.11.2 Red-Black Trees
1.11.3 Lemma
1.11.4 Insertion into a red-black tree
1.11.5 Deletions from a Red-Black tree
1.12 The A-A tree
1.13 AVL Trees
1.13.1 Definition
1.13.2 Worst case height of an AVL tree with n Nodes
1.14 The Binary Heap
1.14.1 Definition
1.14.2 Insertions into a Binary Heap
1.14.3 Deletions from a Binary Heap
1.14.4 Analysis of Insertion Algorithm
1.14.5 Build Heap -- building a tree from a collection of forests
1.15 Summary
1.16 Terminal Questions
1.1 Introduction
A binary search tree is a special kind of binary tree designed to improve the efficiency of searching through the contents of a binary tree. Binary search trees exhibit the following property: for any node n, every descendant node's value in the left subtree of n is less than the value of n, and every descendant node's value in the right subtree is greater than the value of n. A subtree rooted at node n is the tree formed by imaging node n was a root. That is, the subtree's nodes are the descendants of n and the subtree's root is n itself. Figure 1.1 illustrates the concept of subtrees and the binary search tree property.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 3
Figure 1.1: Subtrees and the binary search tree property
Figure 1.2 shows two examples of binary trees. The one on the right, binary
tree (b), is a BST because it exhibits the binary search tree property. Binary
tree (a), however, is not a BST because not all nodes of the tree exhibit the
binary search tree property. Namely, node 10's right child, 8, is less than 10
but it appears in node 10's right subtree. Similarly, node 8's right child, node
4, is less than 8 but appears on node 8's right subtree. This property is
violated in other locations, too. For example, node 9's right subtree contains
values less than 9, namely 8 and 4.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 4
Figure 1.2: Comparison of non-BST binary tree (a) and a BST binary tree (b)
Note that for the binary search tree property to be upheld, the data stored in
the nodes of a BST must be able to be compared to one another.
Specifically, given two nodes, a BST must be able to determine if one is less
than, greater than, or equal to the other.
Now, imagine that you want to search a BST for a particular node. For
example, for the BST in Figure 1.2 (the binary tree (b)), imagine that we
wanted to search for the node 10. A BST, like a regular binary tree, only has
a direct reference to one node, its root. Can you think of an optimal way to
search the tree to see if node 10 exists? There's a better way than
searching through each node in the tree.
To see if 10 exists in the tree we can start with the root. We see that the
root's value (7) is less than the value of the node we are looking for.
Therefore, if 10 does exist in the BST, it must be in the root's right subtree.
Therefore, we continue our search at node 11. Here we notice that 10 is
less than 11, so if 10 exists in the BST it must exist in the left subtree of 11.
Moving onto the left child of 11, we find node 10, and have located the node
we are looking for.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 5
What happens if we search for a node that does not exist in the tree? Imagine that we wanted to find node 9. We'd start by repeating the same steps above. Upon reaching node 10, we'd see that node 10 was greater than 9, so 9, if it exists, must be in 10's left subtree. However, we'd notice that 10 has no left child, therefore 9 must not exist in the tree. More formally, our searching algorithm goes as follows. We have a node n we wish to find (or determine if it exists), and we have a reference to the BST's root. This algorithm performs a number of comparisons until a null reference is hit or until the node we are searching for is found. At each step we are dealing with two nodes: a node in the tree, call it c, that we are currently comparing with n, the node we are looking for. Initially, c is the root of the BST. We apply the following steps:
1. If c is a null reference, then exit the algorithm. n is not in the BST.
2. Compare c's value and n's value.
3. If the values are equal, then we found n.
4. If n's value is less than c's then n, if it exists, must be in the c's left subtree. Therefore, return to step 1, letting c be c's left child.
5. If n's value is greater than c's then n, if it exists, must be in the c's right subtree. Therefore, return to step 1, letting c be c's right child.
We applied these steps earlier when searching for node 10. We started with the root node and noted that 10 was greater than 7, so we repeated our comparison with the root's right child, 11. Here, we noted 10 was less than 11, so we repeated our comparison with 11's left child. At this point we had found node 10. When searching for node 9, which did not exist, we wound up performing our comparison with 10's left child, which was a null reference. Hence we deduced that 9 did not exist in the BST.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 6
Objective:
At the end of this unit the learner is expected to have
The complete understanding of the advanced topics in Trees and their applications.
Concept to analyze the Binary Search tree, insertion and deletion of nodes from BST
Knowledge of Node traversing and cost recursion algorithms
Complete understanding of Big-O notation and its usages
The Concept of Height Balanced Trees, A-A trees AVL trees
Understanding of different Binary Heap algorithms
Self Assessment Questions
1. Explain the concept of Binary Search Tree. How it differs from a Binary Tree? 2. What happens if we search for a node that does not exist in the tree?
1.2 Analyzing the BST Search Algorithm
For finding a node in a BST, at each stage we ideally reduce the number of nodes we have to check by half. For example, consider the BST in Figure 1.3, which contains 15 nodes. When starting our search algorithm at the root, our first comparison will take us to either the root's left or right child. In either case, once this step is made the number of nodes that need to be considered has just halved, from 15 down to 7. Similarly, at the next step the number is halved again, from 7 down to 3, and so on.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 7
Figure 1.3: BST with 15 nodes
The important concept to understand here is that ideally at each step in the
algorithm the number of nodes that have to be considered has been cut in
half. Compare this to searching an array. When searching an array we have
to search all elements, one element at a time. That is, when searching an
array with n elements, after we check the first element, we still have n – 1
elements to check. With a BST of n nodes, however, after checking the root
we have whittled the problem down to searching a BST with n/2 nodes.
Searching a binary tree is similar in analysis to searching a sorted array. For
example, imagine you wanted to find if there is a John King in the
phonebook. You could start by flipping to the middle of the phone book.
Here, you'd likely find people with last names starting with the letter M.
Because K comes before M alphabetically, you would then flip halfway
between the start of the phonebook and the page you had reached in the
Ms. Here, you might end up in the Hs. Since K comes after H, you'd flip half
way between the Hs and the Ms. This time you might hit the Ks, where you
could quickly see if James King was listed.
This is similar to searching a BST. With an ideally arranged BST the
midpoint is the root. We then traverse down the tree, navigating to the left
and right children as needed. These approaches cut the search space in
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 8
half at each step. Such algorithms that exhibit this property have an
asymptotic running time of log2 n, commonly abbreviated lg n. The growth
rate of log2 n compared to linear grown is shown in the graph in Figure 1.4.
Due to log2 n's slower growth than linear time, algorithms that run in
asymptotic time log2 n are said to be sub linear.
Figure 1.4: Comparison of linear growth rate vs. log2 n
While it may appear that the logarithmic curve is flat, it is increasing, albeit
rather slowly. To appreciate the difference between linear and sublinear
growth, consider searching an array with 1,000 elements versus searching a
BST with 1,000 elements. For the array, we'll have to search up to 1,000
elements. For the BST, we'd ideally have to search no more than ten nodes!
(Note that log10 1024 equals 10.)
Throughout our analysis of the BST search algorithm, I've repeatedly used
the word "ideally." This is because the search time for a BST depends upon
its topology, or how the nodes are laid out with respect to one another. For a
binary tree like the one in Figure 1.3, each comparison stage in the search
algorithm trims the search space in half. However, consider the BST shown
in Figure 1.5, whose topology is synonymous to how an array is arranged.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 9
Figure 1.5: Example of BST that will be searched in linear time
Searching the BST in Figure 1.8 will take linear time because after each
comparison the problem space is only reduced by 1 node, not by half of the
current nodes, as with the BST in Figure 1.2.
Therefore, the time it takes to search a BST is dependent upon its topology.
In the best case, the time is on the order of log2 n, but in the worst case it
requires linear time. As we'll see in the next section, the topology of a BST
is dependent upon the order with which the nodes are inserted. Therefore,
the order with which the nodes are inserted affects the running time of the
BST search algorithm.
Self Assessment Questions
1. Write the algorithm for analysis of a Binary Search Tree Algorithm.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 10
1.3 Inserting Nodes into a BST
We've seen how to search a BST to determine if a particular node exists, but we've yet to look at how to add a new node. When adding a new node we can't arbitrarily add the new node; rather, we have to add the new node such that the binary search tree property is maintained. When inserting a new node we will always insert the new node as a leaf node. The only challenge, then, is finding the node in the BST which will become this new node's parent. Like with the searching algorithm, we'll be making comparisons between a node c and the node to be inserted, n. We'll also need to keep track of c's parent node. Initially, c is the BST root and parent is a null reference. Locating the new parent node is accomplished by using the following algorithm: Step 1: If c is a null reference, then parent will be the parent of n. If n's value is less than parent's value, then n will be parent's new left child; otherwise n will be parent's new right child. Step 2: Compare c and n's values. Step 3: If c's value equals n's value, then the user is attempting to insert a duplicate node. Either simply discard the new node, or raise an exception. (Note that the nodes' values in a BST must be unique.) Step 4: If n's value is less than c's value, then n must end up in c's left subtree. Let parent equal c and c equal c's left child, and return to step 1. Step 5: If n's value is greater than c's value, then n must end up in c's right subtree. Let parent equal c and c equal c's right child, and return to step 1.
This algorithm terminates when the appropriate leaf is found, which attaches the new node to the BST by making the new node an appropriate child of parent. There's one special case you have to worry about with the insert
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 11
algorithm: if the BST does not contain a root, then there parent will be null, so the step of adding the new node as a child of parent is bypassed; furthermore, in this case the BST's root must be assigned to the new node. Figure 1.6 depicts the BST insert graphically.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 12
Figure 1.6: Illustration of Insert into a BST Both the BST search and insert algorithms share the same running time –log2 n in the best case, and linear in the worst case. The insert algorithm's running time mimics the search's because it, essentially, uses the same tactics used by the search algorithm to find the location for the newly inserted node.
1.4 The Order of Insertion Determines the BST's Topology
Since newly inserted nodes into a BST are inserted as leaves, the order of insertion directly affects the topology of the BST itself. For example, imagine we insert the following nodes into a BST: 1, 2, 3, 4, 5, and 6. When 1 is inserted, it is insert as the root. Next, 2 is inserted as 1's right child. 3 is inserted as 2's right child, 4 as 3's right child, and so on. The resulting BST is one whose structure is precisely that of the BST in Figure 1.4.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 13
If the values 1, 2, 3, 4, 5, and 6 had been inserted in a more intelligent manner, the BST would have had more breadth, and would have looked more like the tree in Figure 1.2. The ideal insertion order would be: 4, 2, 5, 1, 3, 6. This would put 4 at the root, 2 as 4's left child, 5 as 4's right child, 1 and 3 as 2's left and right children, and 6 as 5's right child. Because the topology of a BST can greatly affect the running time of search, insert, and (as we will see in the next section) delete, inserting data in ascending or descending order (or in near order) can have devastating results on the efficiency of the BST.
1.5 Deleting Nodes from a BST
Deleting nodes from a BST is slightly more difficult than inserting a node because deleting a node that has children requires that some other node be chosen to replace the hole created by the deleted node. If the node to replace this hole is not chosen with care, the binary search tree property may be violated. For example, consider the BST in Figure 1.6. If the node 150 is deleted, some node must be moved to the hole created by node 150's deletion. If we arbitrarily choose to move, say node 92 there, the BST property is deleted since 92's new left subtree will have nodes 95 and 111, both of which are greater than 92 and thereby violating the binary search tree property. The first step in the algorithm to delete a node is to first locate the node to delete. This can be done using the searching algorithm discussed earlier, and therefore has a log2 n running time. Next, a node from the BST must be selected to take the deleted node's position. There are three cases to consider when choosing the replacement node, all of which are illustrated in Figure 1.7.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 14
Case 1: If the node being deleted has no right child, then the node's left child can be used as the replacement. The binary search tree property is maintained because we know that the deleted node's left subtree itself maintains the binary search tree property, and that the values in the left subtree are all less than or all greater than the deleted node's parent, depending on whether the deleted node is a left or right child. Therefore, replacing the deleted node with its left subtree will maintain the binary search tree property. Case 2: If the deleted node's right child has no left child, then the deleted node's right child can replace the deleted node. The binary search tree property is maintained because the deleted node's right child is greater than all nodes in the deleted node's left subtree and is either greater than or less than the deleted node's parent, depending on whether the deleted node was a right or left child. Therefore, replacing the deleted node with its right child will maintain the binary search tree property. Case 3: Finally, if the deleted node's right child does have a left child, then the deleted node needs to be replaced by the deleted node's right child's left-most descendant. That is, we replace the deleted node with the deleted node's right subtree's smallest value. Note: Realize that for any BST, the smallest value in the BST is the left-most node, while the largest value is the right-most node. This replacement choice maintains the binary search tree property because it chooses the smallest node from the deleted node's right subtree, which is guaranteed to be larger than all node's in the deleted node's left subtree. Too, since it's the smallest node from the deleted node's right subtree, by placing it at the deleted node's position, all of the nodes in its right subtree will be greater. Figure 1.7: illustrates the node to choose for replacement for each of the three cases
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 15
Figure 1.7: Cases to consider when deleting a node
As with the insert and searching algorithms, the asymptotic running time of
delete is dependent upon the BST's topology. Ideally, the running time is on
the order of log2 n. However, in the worst case, it takes linear time.
Self Assessment Questions
1. Explain the process of Insertion and deletion of an element in an Binary
Search Tree with appropriate example.
2. What are the cases to be considered while choosing the replacement
node in a Binary Search Tree for various operations?
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 16
1.6 Traversing the Nodes of a BST
With the linear, contiguous ordering of an array's elements, iterating through
an array is a straightforward manner: start at the first array element, and
step through each element sequentially. For BSTs there are three different
kinds of traversals that are commonly used:
Preorder traversal
Inorder traversal
Postorder traversal
Essentially, all three traversals work in roughly the same manner. They start
at the root and visit that node and its children. The difference among these
three traversal methods is the order with which they visit the node itself
versus visiting its children. To help clarify this, consider the BST in Figure
1.8. (Note that the BST in Figure 1.4 is the same as the BST in Figure 1.6
and is repeated here for convenience.)
Figure 1. 8: A sample Binary Search Tree
1.6.1 Preorder Traversal
Preorder traversal starts by visiting the current node - call it c, then its left
child, and then its right child. Starting with the BST's root as c, this algorithm
can be written out as:
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 17
Step 1: Visit c. This might mean printing out the value of the node, adding the node to a List, or something else. It depends on what you want to accomplish by traversing the BST. Step 2: Repeat step 1 using c's left child. Step 3: Repeat step 1 using c's right child. Imagine that in step 1 of the algorithm we were printing out the value of c. In this case, what would the output be for a preorder traversal of the BST in Figure 1.8? Well, starting with step 1 we would print the root's value. Step 2 would have us repeat step 1 with the root's left child, so we'd print 50. Step 2 would have us repeat step 1 with the root's left child's left child, so we'd print 20. This would repeat all the way down the left side of the tree. When 5 were reached, we'd first print out its value (step 1). Since there are no left or right children of 5, we'd return to node 20, and perform its step 3, which is repeating step 1 with 20's right child, or 25. Since 25 has no children, we'd return to 20, but we've done all three steps for 20, so we'd return to 50, and then take on step 3 for node 50, which is repeating step 1 for node 50's right child. This process would continue on until each node in the BST had been visited. The output for a preorder traversal of the BST as shown in Figure 1.8 would be: 90, 50, 20, 5, 25, 75, 66, 80, 150, 95, 92, 111, 175, 166, 200. Understandably, this may be a bit confusing. Perhaps looking at some code will help clarify things. The following code shows a method to iterate through the items of BST in a preorder traversal. Note that this method takes in a Binary Tree Node class instance as one of its input parameters. This input node is the node c from the list of the algorithm's steps. Realize that a preorder traversal of the BST would begin by calling this method, passing in the BST's root.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 18
1.6.2 Inorder Traversal
Inorder traversal starts by visiting the current node's left child, then the current node, and then its right child. Starting with the BST's root as c, this algorithm can be written out as: Step 1: Repeat step 1 using c's left child. Step 2: Visit c. This might mean printing out the value of the node, adding the node to an ArrayList, or something else. It depends on what you want to accomplish by traversing the BST. Step 3: Repeat step 1 using c's right child. Applying an inorder traversal to the BST in Figure 1.1, the output would be: 5, 20, 25, 50, 66, 75, 80, 90, 92, 95, 111, 150, 166, 175, 200. Note that the results returned are in ascending order.
1.6.3 Postorder Traversal
Finally, postorder traversal starts by visiting the current node's left child, then its right child, and finally the current node itself. Starting with the BST's root as c, this algorithm can be written out as: Step 1: Repeat step 1 using c's left child. Step 2: Repeat step 1 using c's right child. Step 3: Visit c. This might mean printing out the value of the node, adding the node to an ArrayList, or something else. It depends on what you want to accomplish by traversing the BST. The output of a postorder traversal for the BST in Figure 1.8 would be: 5, 25, 20, 66, 80, 75, 50, 92, 111, 95, 166, 200, 175, 150, 90.
Realize that all three traversal times exhibit linear asymptotic running time. This is because each traversal option visits each and every node in the BST precisely once. So, if the numbers of nodes in the BST are doubled, the amount of work for a traversal doubles as well.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 19
Self Assessment Question
1. Draw all possible Binary Search Trees for the data elements 5, 9, 12.
2. Create a Binary Search Tree using the following data entered as a sequence set: 7, 10, 14, 23, 33, 56, 66, 70, 80
3. Write the procedure of inserting an element 77 in the Tree shown in Figure 1.8.
1.7 The Cost of Recursion
Recursive functions are often ideal for visualizing an algorithm, as they can often eloquently describe an algorithm in a few short lines of code. However, when iterating through a data structure's elements in practice, recursive functions are usually sub-optimal. Iterating through a data structure's elements involves stepping through the items and returning them to the developer utilizing the data structure, one at a time. Recursion is not suited to stopping abruptly at each step of the process.
1.8 Binary Search Trees in the Real-World
While binary search trees ideally exhibit sublinear running times for insertion, searches, and deletes, the running time is dependent upon the BST's topology. The topology is dependent upon the order with which the data is added to the BST. Data being entered that is ordered or near-ordered will cause the BST's topology to resemble a long, skinny tree, rather than a short and fat one. In many real-world scenarios, data is naturally in an ordered or near-ordered state. The problem with BSTs is that they can become easily unbalanced.
A balanced binary tree is one that exhibits a good ratio of breadth to depth. As we will discuss in the following sections, there are a special class of BSTs that are self-balancing. That is, as new nodes are added or existing nodes are deleted, these BSTs automatically adjust their topology to
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 20
maintain an optimal balance. With an ideal balance, the running time for inserts, searches, and deletes, even in the worst case, is log2 n. The following table is an alphabetized list of definitions for terms used throughout this Unit. Binary tree A tree where each node has at most two children. Binary search tree A binary tree that exhibits the following property: for any node n, every descendant node's value in the left subtree of n is less than the value of n, and every descendant node's value in the right subtree is greater than the value of n. Cycle A cycle exists if starting from some node n there exists a path that returns to n. Internal node A node that has one or more children. Leaf node A node that has no children. Node Nodes are the building blocks of trees. A node contains some piece of data, a parent, and a set of children. (Note: the root node does not contain a parent; leaf nodes do not contain children.) Root The node that has no parent. All trees contain precisely one root. Subtree A subtree rooted at node n is the tree formed by imaging node n was a root. That is, the subtree's nodes are the descendants of n and the subtree's root is n itself.
Running Time Summaries: The following table lists the common operations that are performed on a BST and their associated running times.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 21
Operation
Best Case
Running Time
Worst Case
Running Time
Search Log2 n n
Insert Log2 n n
Delete Log2 n n
Preorder Traversal n
Inorder Traversal n
Postorder Traversal n
1.9 Big-O Notation
Definition: Let f(n) and g(n) be any two functions defined on the set of
natural numbers.
We say that f(n) = O(g(n))
if there exist constants c and N0, where c is a real number greater than
0 and N0 is an element of the set of natural numbers, such that for all n N0
f(n) < cg(n).
Illustration:
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 22
The above graph illustrates the meaning of f(n) = O(g(n)). The function g(n)
grows asymptotically as fast or faster than f(n). For all n > N0 the value of
cg(n) > f(n). As shown in the illustration, it is not necessarily true that cg(n)
is greater than f(n) for all values for which these functions are defined, but
for all values beyond an initial N0. It is also not necessary for N0 to be the
first value for which the inequality is true.
1.9.1 Application of Big-O notation to algorithm analysis
Consider selection sort.
void SelectionSort(int A[ ], int n) {
int i = 0;
while (i < n - 1) {
int j = i + 1;
while (j < n) {
if (A[j] < A[i])
swap(A[j], A[i])
j++;
}
i++;
}
}
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 23
How many times is the comparison if (A[i] < A[j]) executed? In the first iteration of the outer loop, j takes on the values from 1 to n-1 -- there are n-1 comparisons performed. In each subsequent iteration, i is incremented by 1 and there is one fewer comparison in the inner loop. The total number of comparisons is therefore the sum: n-1 f(n) = (n-1) + (n-2) + (n-3) + ..+ 1 = k k=1 where f(n) is the number of compares as a function of the size of the list n. We claim that f(n) = O(n2). Proof: If f(n) = O(n2), then there exist constants c, N0 such that f(n)<cn2 for all nN0. Choose c = 2, N0 = 4. This choice is not unique, but we just have to find any pair of c, N such that the definition is satisfied. Let's try this pair! n-1 k < 2n2 for all n 4. k=1 Prove by induction: Base case: n = 4 3 k = 1 + 2 + 3 = 6 < 2n2 = 2*42 = 32 True k=1 Inductive hypothesis: Assume P(n) is true --the statement for size n is true-- Show that then P(n+1) is true. n-1 P(n): k < 2n2 for all n 4. assumed true k=1 n P(n+1): k < 2(n+1)2 for all n 4. k=1
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 24
n-1 = k + n < 2n2 + n < 2n2 + 4n + 2 = 2(n+1)2 . k=1 Since the comparison is the statement in the algorithm that is executed most frequently, and the number of times it is called as a function of n is shown to be O(n2), we have shown that selection sort is O(n2). Example 2: Show that Towers of Hanoi is O(2n) Let T(n) = the number of disk moves to move a tower of size n. To move a tower of size n from peg 1 to peg 3 you first move a tower of size n-1 from peg 1 to peg 2, then move a disk from peg 1 to peg 3, and finally move the tower of size n-1 from peg 2 to peg 3. The number of disk moves (T(n)) is the number of moves to move the tower of size n-1 (T(n-1)) twice plus the single disk move. T(n) = 2 T(n-1) + 1 where T(1) = 1 To prove that T(n) is O(2n), we must show that for a particular choice of c and N0, T(n) < c(2n-2) for all n N0 {since any function 2n - k is also O(2n) --We need to strengthen the claim in order to prove the result -- Choose c= 3 and N0 = 2. Base case: n = 2 T(2) = 2T(1) + 1 = 3 < c(2n-2) = 3 (22-2) = 6 True. Inductive hypothesis: Assume P(n) is true --the statement for size n true-- Show that then P(n+1) is true. P(n): T(n) < 3(2n-2) P(n+1):T(n+1) < 3(2n+1-2) T(n+1) = 2T(n) + 1 < 2* 3(2n-2) + 1 = 3(2n+1-4) + 1 = 3(2n+1-2) -6 + 1 < 3(2n+1-2)
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 25
Example 3: Show that the Fibonacci sequence f(n) = f(n-1) + f(n-2) where f(0) = 0 and f(1) = 1 is big-O of 2n. Find a pair of constants c and N0 such that f(n) < c2n for all n N0. Choose c = 1 and N0 = 2 and prove by induction. Base case: f(2) = f(1) + f(0) = 1 + 0 = 1 < 22 True Inductive hypothesis: Assume P(n) is true --the statement for size n is true. Show that then P(n+1) is true. P(n): f(n) < 2n for all n 2 assumed true P(n+1): f(n+1) < 2n+1 for all n 2 f(n+1) = f(n) + f(n-1) < 2f(n) < 2(2n) = 2n+1
Self Assessment Questions
1. Show that Towers of Hanoi is O(2n). 2. What do you mean by a Big O Notation? What is the application of Big O Notation in Algorithmic Analysis? 3. Show that the Fibonacci sequence f(n) = f(n-1) + f(n-2) where f(0) = 0 and f(1) = 1 is big-O of 2n. Find a pair of constants c and N0 such that f(n) < c2n for all n N0. 1.10 Big- and Big- Just as Big-O defines an asymptotic ceiling on the growth of a function, Big- defines and asymptotic floor. Definition: Let f(n) and g(n) be any two functions defined on the set of natural numbers. We say that f(n) = (g(n)) if there exist constants c and N0,
where c is a real number greater than 0 and N0 is an element of the set of natural numbers, such that for all n N0 f(n) > cg(n).
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 26
If there exist a pair of positive constants c1 and c2 such that for some N0 f(n)
< c1 g(n) and f(n) > c2 g(n) for all n N0, then f(n) = (g(n)) and g(n)
provides a tight asymptotic bound for the function
Illustration:
1.10.1 Properties of the sets O(f(n)) and (f(n))
Show that the set of functions (f(n)) is an equivalence class.
Let g(n) and h(n) be any two functions contained in (f(n)). Use the
definition of big- to demonstrate that the elements of this set are
reflexive, transitive and symmetric under the relation big-.
Does big-O also form an equivalence class?
Show that the functions f(n) and T(n) that respectively count the number
of compares in selection sort and the number of moves in Towers of
Hanoi are (n2) and (2n) respectively.
Demonstrate that the Fibonacci sequence f(n) = f(n-1) + f(n-2) is not
(2n).
We have already shown that f(n) = O(2n). We must now find constant c1
and N0' such that
f(n) > c12n for all n N0' then
f(n) = f(n-1) + f(n-2) > c12n-1 + c12n-2 c12n
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 27
but c1 factors out of this expression, therefore any constant that satisfies the base case will do equally well (poorly). For the inductive hypothesis to be satisfied, we must have 2n-1 + 2n-2 2n 2n-2(2 + 1) = 3* 2n-2 < 2n contradiction! no pair of c1, N0 exist Self Assessment Questions 1. Define Big- and Big-. What is their significance? 2. What are the properties of the sets O(f(n)) and (f(n))? 3. Does big-O also form an equivalence class? Explain. 1.10.2 Other useful mathematical formulae
Logarithms
y = lg(n) <=> n = 2y log of a product: lg(ab) = lg(a) + lg(b) log of a quotient: lg(a/b) = lg(a) - lg(b) log of unity: logb(1) = 0 in any base b n lg(n!) = lg(n) + lg(n-1) +..+ lg(1) = lg(i) i=1 n lg(i) = (nlg(n)) i=1 Proof: n lg(i) = O(nlg(n)) i=1 The largest term in this sum is lg(n), and there are n terms in the sum. Therefore n lg(i) < nlg(n) for all n 2 i=1
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 28
n Now to show that lg(i) = (nlg(n)) i=1 n (n/2) n (n/2) lg(i) = lg(i) + lg(i) lg(i) + (n/2) lg(n/2) (n/2) lg(n/2) (A) i=1 i=1 i=(n/2)+1 i=1 Since every term in the second sum (from I = (n/2)+1 to n) is greater than the largest term in the first sum and the first sum contains all positive terms. We can show that (n/2) > n/4 and therefore (n/2) lg(n/2) > n/4 lg (n/4) This can be written as n/4 lg(n) - n/4 lg(4) = n/4 lg(n) - n/2 Next we show that there exists a c and N0 such that n/4 lg(n) - n/2 > c nlg(n) for all n N0 Chose c = 1/8, then we find if n/4 lg(n) - n/2 > n/8 lg(n) for some range of n? On Solving, we get n/8 lg(n) > n/2 when lg(n) > 4 or n > 24 = 16. Choose N0 = 16. and since all of the inequalities in equations (A) above are in the direction , we have shown that n lg(i) = (nlg(n)) i=1 n Since lg(i) = (nlg(n)) i=1 n and lg(i) = O(nlg(n)) i=1 we have by definition that it is (n lg(n)).
Application -- Show that all sorting algorithms are (nlg(n))
Given a list of n items, there are n! possible arrangements of these items.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 29
Let each one of these arrangements be a leaf in a binary a binary tree with
the original arrangement as the root. Then the height of this (balanced)
binary tree is lg(n!) and we have just seen that this function is (nlg(n)).
The length of the path from the root to any one of the leaves can be no
shorter than a constant multiple of nlg(n), and no comparison sorting
algorithm can do better than this number of compares.
Note we use the weaker claim that lg(n!) = (nlg(n)) rather than the stronger
statement that it is (nlg(n)) because no comparison sorting algorithm can
do better than the minimum number of compares to go from the root to the
appropriate leaf of the sorting tree, but not all sorting algorithms will be this
efficient.
Self Assessment Questions
1. Show that all sorting algorithms are (nlg(n))
2. Explain Height Balanced Tree with an appropriate example.
3. How a Red Black Tree differs from a Height Balance Tree? Discuss.
4. Prove that a red-black tree with n internal nodes has height at most
2lg(n+1).
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 30
5. Explain the process of insertion of an element in a Red Black Tree. 6. What is an A-A Tree? Explain. 1.11 Height Balanced Trees 1.11.1 AVL Trees Definition: An AVL tree is a binary search tree whose left subtree and right subtree differ in height by at most 1 unit, and whose left and right subtrees are themselves AVL trees. 1.11.2 Red-Black Trees Definition: A red-black tree is a binary search tree whose leaves are external nodes. A red-black tree must satisfy the following properties:
1) Every node is either red or black
2) The root is black
3) Every leaf (NIL) is black
4) If a node is red, then both its children are black
5) For each node, all paths from that node to a descendant leaf contain the same number of black nodes
1.11.3 Lemma: A red-black tree with n internal nodes has height at most 2lg(n+1)
Proof: The tree rooted at any node x contains at least 2bh(x) – 1 internal nodes.
Base case: If the height of x is zero, then x must be a leaf and the tree rooted at x contains at least 20 – 1 = 0 internal nodes.
Inductive hypothesis: Consider a node x that has a positive height and is an internal node with two children. Each child has a black-height of either bh(x) or bh(x) – 1 depending upon whether the child is red or black, respectively. Since the height of a tree rooted at a child of x is less than the height of the tree rooted at x itself, we can apply the inductive hypothesis to
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 31
conclude that each subtree rooted at a child has at least 2bh(x)-1 –1 internal nodes. Thus the tree rooted at x has at least (2bh(x)-1 –1) + (2bh(x)-1 –1) + 1 = 2bh(x) – 1 internal nodes. Let h be the height of the tree. According to properties 4 and 5 at least half the node on any path from the root to a leaf, not including the root, must be black. Consequently the black height of the root must be at least h/2. Thus n 2h/2 – 1 and (n+1) 2h/2 Take the lg of both sides of this inequality and find that lg(n+1) h/2 or h 2lg(n+1) 1.11.4 Insertion into a red-black tree Insertions are done at a leaf and will replace an external node with an internal node with two external children.
The newly inserted node is always red. If its parent is black, no additional readjustment needs to be done to maintain properties 4 and 5 above. If the parent is red there are three cases to consider:
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 32
Case1: parent(y) and uncle (w) are both red. Make both nodes black and
their parent red. Let node x now denote the grandparent of the original x
(the node that was changed to red), and continue applying this algorithm. If
new x is the root, make it black and you are finished.
Case 2: parent (y) is red, uncle (w) is black, and x is a right child of a parent
that is A left child (or symmetrically equivalent case of x is a left child of a
parent that is a right child)
Rotate left (or symmetrical case – right) about the parent and denote the
rotated parent node as x – a left child of a left child (or symmetrically
equivalent). This now becomes Case 3.
Case 3: parent (y) is red, uncle (w) is black, and x is a left (right) child of a
left (right) child parent.
Rotate right (left) about the grandparent and make the rotated Grandparent
red and the rotated parent black. Done.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 33
1.11.5 Deletions from a Red-Black tree
To remove a node from a red black tree, proceed as you would in a binary search tree. If you replace the deleted node with the leftmost descendant of its right child, make the color of the replacement node the same as the color of the node being removed, and remove the leftmost descendant of the right child from its original position in the tree. Otherwise splice out the node to be removed in the same manner as described for the binary search tree. If the node that is (actually) removed is black, add an additional unit of blackness to the child that replaces the parent in the position it formerly occupied, and restore the black height property along the path where it was decreased. Let x denote the node carrying the additional unit of blackness, and consider the following four cases:
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 34
Case 1: The sibling of node x is red
A left rotation about the parent produces a tree with equivalent black heights
where the parent of x is now red and its sibling is black. This now becomes
one of the other three cases.
Case 2: The sibling and both of its children are black
Change the color of the sibling to red and promote the extra unit of
blackness to the parent.
Case 3: The sibling of node x is black, and the sibling's left (right) child is
red.
Perform a right (left) rotation about the sibling to convert to Case 4. Change
the colors of the rotated nodes as indicated below.
Case 4: The sibling of node x is black and, and the sibling's right (left) child
is red. Rotate left (right) about the sibling and change colors as indicated
below.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 35
1.12 The A-A tree
Maintaining the invariants in a red-black tree can be simplified by requiring
that a left child of any node cannot be red. In the diagram below, red nodes
are represented with a horizontal link. A red node, therefore is said to be on
the same level as its parent, and the level of a node is used instead of color
as the node's balance information. (the nullNode sentinel is at level 0)
The level of a node is:
Level 1 if the node is a leaf. The same level as its parent if the node is
red (therefore only a right child). One less than the level of its parent if
the node is black
The tree is rebalanced after an insertion or deletion with series of skew and
split operations. When a red child is also a left child (as will be the case
when 2 is inserted into the above tree) a skew operation is performed.
As is the case when 2 is inserted into the tree above, the skew operation
may produce a node with two right children at the same level (two adjacent
red nodes). This is resolved with a split operation in which a rotation around
the first right child is performed.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 36
As an example, look at the insertion of 2 into the left subtree of the AA-tree
depicted above. A newly inserted node is always red and inserted at level 1.
The skew operation produces a rotation in which 2 becomes a left child of
5's parent (and is therefore black) and 5 becomes a red child of 2. This
operation produces 2 consecutive right horizontal links (2 red nodes in a
row). The skew operation is followed by a split operation (which cures the
immediate problem of two consecutive horizontal links) but promotes the left
horizontal link problem up the tree.
Add, skew, split is recursively performed on an insertion and the tree is
rebalanced.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 37
1.13 AVL Trees 1.13.1 Definition An AVL tree is a binary search tree whose left subtree and right subtree differ in height by no more than 1, and whose left and right subtrees are they AVL trees. To maintain balance in a height balanced binary tree, each node will have to keep an additional piece of information that is needed to efficiently maintain balance in the tree after every insert and delete operation has been performed. For an AVL tree, this additional piece of information is called the balance factor, and it indicates if the difference in height between the left and right subtrees is the same or, if not, which of the two subtrees has height one unit larger. If a node has a balance factor rh (right high) it indicates that the height of the right subtree is 1 greater than the height of the left subtree. Similarly the balance factor for a node could be lh (left high) or eh (equal height). Example: Consider the AVL tree depicted below. The right subtree has height 1 and the left subtree, height 0. The balance factor of the root is tilted toward the right (right high – rh) since the right subtree has height one larger than the left subtree. Inserting the new node 21 into the tree will cause the right subtree to have height 2 and cause a violation of the definition of an AVL tree. This violation of the AVL property is indicated at the root by showing that the balance factor is now doubly unbalanced to the right. The other balance factors along the path of insertion will also be changed as indicated. The node holding 12 is also doubly unbalanced to the right.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 38
The AVL property of this tree is restored through a succession of rotations.
The root of this tree is doubly unbalanced to the right. The child of this
unbalanced node (node 12) is also doubly unbalanced rh, but its child (node
24) is lh. Before a rotation around the root of the right subtree can be
performed, a rotation around node 24 is required so that the balance factors
of both the child and grandchild of the unbalanced node – the subtree root
(node 12) – agree in direction (both rh in this case).
21
21
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 39
Now both nodes 12 and 21 have a balance factor of rh, and a left rotation
can be performed about the node 12. This rotation reduces the height of the
right subtree by 1 and will restore AVL balance to the tree.
Next we add a new key 6 to this tree. Addition of a new node holding this
key causes the root to become doubly unbalanced to the right.
Since the root is doubly unbalanced right and its right child is left high, we
must first perform a right rotation around node 21. Now a rotation around
the root readjusts the balance of the tree.
6
6
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 40
The left rotation about the root promotes the right child of the original root
(node 12), and makes the old root (node 4) the left child of the new root –
replacing the left subtree originally attached to node 12. The former left
subtree of node 12 is now the right subtree of node 4. In a left rotation, all of
the keys in the left subtree of the right child must be greater than the key of
the root and less than the key of the parent. When the root becomes the left
child of the parent, the keys in this subtree remain in the left subtree of the
new root and in the right subtree of the new left child.
1.13.2 Worst case height of an AVL tree with n Nodes
We will examine this question by first considering the minimum number of
nodes in an AVL tree of height h. Consider an AVL tree of height h whose
left subtree is of height one less than the height of its right subtree, and
assume that this condition is recursively true also for all subtrees
6
6
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 41
Such an arrangement produces an AVL tree of height h with the minimum
number of nodes. Let F(h) = the number of nodes in a (min-node) AVL tree
of height h.
Then F(h) = F(h-1) + F(h-2) + 1 //the number of nodes in the two subtrees
plus the root
Where F(1) = 2 and F(2) = 4
Add 1 to both sides of this equation.
F(h) + 1 = F(h-1) +1 + F(h-2) + 1
Let f(h) = F(h) + 1
Then
f(h) = f(h-1) + f(h-2)
Where
f(1) = 3, f(2) = 5
Except for the base case, this is the Fibonacci recurrence relation.
Try the solution f(h) = ah. Substitute this value into the recurrence relation
and solve for a.
ah – ah-1 – ah-2 = 0
ah-2( a2 – a – 1) = 0
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 42
From the quadratic formula we obtain a = (1 + √5)/2 and a = (1 - √5)/2 and f(h) = c1((1 + √5)/2)h + c2((1 - √5)/2)h where (1 - √5)/2 -0.618034 and will always contribute a term of absolute value less than 1 to f(h) for any h. and using the initial conditions to evaluate c1 and c2, we obtain f(h) ((5 + 2)/5)(( (1 + √5)/2)n now, taking the log of both sides of this (approximate) equality, and dropping all but the largest terms, we obtain: h 1.44 lg n where n = (F(h), the number of nodes in the sparsest AVL tree with n nodes. Note how this result compares with other similar search length results for other binary trees: Worst case path length for an AVL tree with n nodes -- 1.44 lg n Worst case path length for a red-black tree with n nodes -- 2 lg n Average case path length for a binary search tree with n nodes -- 1.39 lg n In general the maximum path length from the root to any leaf for an AVL tree will be much closer to lg n (the best case result) for a randomly generated AVL tree. 1.14 The Binary Heap 1.14.1 Definition A binary heap is a complete binary tree -- a tree that is completely filled except possibly at the bottom level, which is filled from left to right with no missing nodes. A (min) heap has the following heap order property -- In a binary heap for every node X with a parent P, the key in P is less than or equal to the key in X. An example of a binary heap is shown below:
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 43
The binary heap is implemented as a binary tree that "lives in an array". The complete binary tree is implemented as an array with the root located at index 1 of the array, and the children of the node at index i located at indices 2i and 2i+1 (if either or both children are present). The attribute next points to the next location in the tree, where an insertion can occur. 1.14.2 Insertions into a Binary Heap An insertion into a binary heap is performed by inserting the new node in the location referenced by next in the array, and then "sifting it up" by comparing the key of the newly inserted node with the key of the parent. If the new key is less than the key of the parent, the two items are swapped and the "sift up" procedure continues until either the parent's key is less than or equal to the key of the child, or the new key reaches the root of the tree. The sift_up algorithm is given below. The parameter i is the index of the key to be sifted up and the array A is the array "holding the heap".
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 44
template <class Object> void Heap<Object>::siftUp(int i) { if (i == 1) return; j = i/2; if (A[i].key < A[j].key) { swap(A[i], A[j]) siftUp(j); } } The following sequence of diagrams illustrate the sift_up algorithm.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 45
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 46
1.14.3 Deletions from a Binary Heap
Only the root node may be removed from the heap. It is replaced by the last leaf in the tree (the element just before the index last). This element is then "sifted down" by a succession of pairs of comparisons and swaps until the heap property is restored. In the example above, the key three is removed by a deletemin operation and replaced by the last leaf (with key 41). The two children (5 and 10) are compared, and the "winner" of this test (key 5) is compared with the parent and swapped. After the first siftDown step, key 5 is the new root, and key 41 sits at index 2. The process is repeated as illustrated below, and the heap order is restored.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 47
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 48
The deletemin function is written as: template <class Object> Object Heap<Object>::deletemin( ) { assert (next > 1); Object temp = A[1]; next--; if (next > 2) { A[1] = A[next]; siftDown(1); } return temp; } The siftDown function appears as: template <class Object> void Heap<Object>::siftDown(int i) { if (((2*i+1) < next) && (A[2*i+1].key < A[2*i].key)) if (A[2*i+1].key < a[i].key) { swap (A[i], A[2*i+1]); siftDown(2*i+1); } else if ((2*i < next) && (A[2*i].key < A[i].key)) { swap (A[i], A[2*i]); siftDown(2*i); } }
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 49
1.14.4 Analysis of Insertion Algorithm
Consider first the case of building a heap by a process of n consecutive insertions. The cost of the ith insertion will be lg (i) and the cost of constructing a heap of size n will be n T(n) = (lg(i)) i=1 To show that T(n) is (n lg(n)) we must first show that it is O(n lg(n)) -- that there exist constants c1 and n0 such that T(n) < c1 n lg(n) for all n n0. n n T(n) = (lg(i)) < lg(i) = lg(1) + lg(2) + .. + lg(n) < n lg(n) i=1 i=1 for all n 2 (c1=1) Next, we must show that T(n) = (n lg(n)). We must find constants c2, n0 such that T(n) > c2 n lg(n) for all n n0 . n T(n) = (lg(i)) = (lg(1)) + (lg(2)) + … +(lg(n/2)) + (lg(n/2 +1)) + .+ (lg(n)) i=1 and since the first n/2 terms are greater than or equal to zero, T(n) > (lg((n/2) +1)) +.. .+ (lg(n)) Every one of the remaining n/2 terms is greater than or equal to (lg(n/2)) T(n) > (n/2)(lg(n/2)-1) = (n/2)(lg(n) - n But we can show that if we choose c2 = 1/4 and n0 = 17, then (n/2)(lg(n) - n > 1/4 n lg(n) for all n 17 1/4 n lg(n) > n and lg(n) > 4 for all n 17 Thus T(n) is both O(n lg(n)) and (n lg(n)). Therefore T(n) = (n lg(n)).
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 50
1.14.5 Build Heap -- building a tree from a collection of forests
Instead of constructing a heap of size n nodes by inserting one key at a time into a heap, one can initialize a heap by passing an array of n objects during construction, then sifting down each key beginning at index n/2 and ending at index 1 to heap order the objects in the array. template <class Object> Heap<Object>::Heap(Object X[ ], int n) { //copy the array X into the Heap attribute A for (int i = 1; i<= n; i++) A[i] = X[i-1]; //index 0 is not used by the heap next = n+1; for (i = n/2; i >= 1; i--) siftDown(i); } An analysis of this function immediately yields the fact that the first for loop has a cost of n copy operations. We next count the (worst case) number of (pairs of) compares needed to heap order the array. The diagram below will help explain the cost of building a heap from a forest of trees.
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 51
The total cost of the n/2 siftDown steps is given by: lg(n) T(n) = n/2i+1 i < (n/2i) i = n (i/2i) < C n i=0 i=1 i=1 where C is a constant since the series (i/2i) converges (by ratio test) Therefore we have shown that T(n) = O(n). 1.15 Summary Tree structures support various basic dynamic set operations including Search, Predecessor, Successor, Minimum, Maximum, Insert, and Delete in time proportional to the height of the tree. Ideally, a tree will be balanced and the height will be log n where n is the number of nodes in the tree. To ensure that the height of the tree is as small as possible and therefore provide the best running time, a balanced tree structure like a red-black tree, AVL tree, or b-tree must be used. When working with large sets of data, it is often not possible or desirable to maintain the entire structure in primary storage (RAM). Instead, a relatively small portion of the data structure is maintained in primary storage, and additional data is read from secondary storage as needed. Unfortunately, a magnetic disk, the most common form of secondary storage, is significantly slower than random access memory (RAM). In fact, the system often spends more time retrieving data than actually processing data. 1.16 Terminal Questions
1. Assume an = O(bn), can we say that bn = O(an), where a b, is also true? Justify your answer.
2. List the following functions in the order of their asymptotic behavior (slowest increase first).
a) n! b) 2n c) n5 d) n lg n e) n + lg f) n g) lg( lg n) h) en
Advanced Data Structures using ‘C’ Unit 1
Sikkim Manipal University Page No.: 52
3. Show each red-black tree that results after successively inserting the keys 41, 38, 31, 12, 19, 8 into an initially empty red-black tree.
4. Show that the longest simple path from a node x to a descendant leaf in a red-black tree has length at most twice that of the shortest simple path from node x to a descendant leaf.
5. Describe a red-black tree on n keys that realizes the largest possible ratio of red internal nodes to black internal nodes. What is this ratio? What tree has the smallest possible ratio, and what is the ratio?
6. Insert the keys in the order shown into an initially empty AVL tree. Show the sequence of trees produced by these insertions.
A, Z, B, Y, T, M, E, W, D, G, X, F, P, O, C, H, Q, S
7. Delete keys F, O, X in order from the AVL tree generated above. Show the sequence of trees produced by these insertions.
8. Suppose that a node x is inserted into a red-black tree and immediately deleted. Is the resulting red-black tree always the same as the initial red-black tree? Justify your answer.
No comments:
Post a Comment