Friday, 23 November 2012

Data Structures using „C‟ - Sorting Methods

Data Structures using „C‟ Unit 9

Unit 9 Sorting Methods
Structure:
9.1 Introduction
9.2 Overview of Sorting Methods
Self Assessment Questions
9.3 How do you sort?
Self Assessment Questions
9.4 Evaluating a Sorting Algorithms
9.4.1 Stability on Sorting algorithm
Self Assessment Questions
9.5 Internal Sorting
9.5.1 Insertion Sort
9.5.2 Bubble Sort
9.5.3 Selection Sort
9.5.4 Shell Sort
9.5.5 Quick Sort
9.5.6 Tree Sort
Self Assessment Questions
9.6 External Sorts
9.6.1 Merge Sort
9.6.2 2-Way Merge Sort
9.7 Summary
9.8 Terminal Questions
9.1 Introduction
Retrieval of information is made easier when it is stored in some predefined order. Sorting is, therefore, a very important computer application activity. Many sorting algorithms are available. Differing environments require
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 239
differing sorting methods. Sorting algorithms can be characterized in the following two ways:
1. Simple algorithms which require the order of n2 (written as O (n2) comparisons to sort n items.
2. Sophisticated algorithms that require the O(nlog2n) comparisons to sort items.
The difference lies in the fact that the first method move data only over small distances in the process of sorting, whereas the second method method large distances, so that items settle into the proper order sooner, thus resulting in fewer comparisons. Performance of a sorting algorithm can also depend on the degree of order a heady present in the data.
There are two basic categories of sorting methods: „Internal Sorting‟ and „External Sorting‟. Internal sorting are applied when the entire collection of data to be sorted is small enough that the sorting can take place within main memory. The time required to read or write is not considered to be significant in evaluating the performance of internal sorting methods. External sorting methods are applied to larger collection of data which reside on secondary devices read and write access time are major concern in determing sort performances.
In this unit we will study some methods of internal sorting and External sorting.
Objectives
At the end of this unit, you will be able to understand the:
 Overview of Sorting Methods
 Various sorting methods and its complexity
 Several performance criteria to be used in evaluating a sorting algorithm
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 240
 Internal and external Sorting method with various types of algorithms and their implementation in C.
9.2 Overview of Sorting Methods
In the last units array, the efficient routine for finding data in an array was based on the premise that the data being searched was already sorted. Indeed, computers are used so extensively to process data collections that in many installations, a great deal of their time is spent maintaining that data in sorted order in the first place. It turns out that methods of searching a sorted list have much in common with methods of achieving that sorted condition.
In order to concentrate on sorting abstractions without having to be concerned with the type of data being sorted, most of the code presented in the rest of this unit will be designed to sort only a single kind of data, namely arrays of cardinals. Only minor modifications in a few places are necessary to use the code presented in the next few sections for other kinds of data. There is a surprisingly diverse collection of algorithms that have been developed to solve the apparently simple problem of "Sorting".
The general sorting problem is simple enough to describe: Given an initially unordered array of N records, with one field distinguished as the key, rearrange the records so they are sorted into increasing (or decreasing) order according to each record's key.
Sorting is the problem of taking an arbitrary permutation of n items and rearranging them into the total order.
X1  Xj I  j
 Increasing or Decreasing Order? The same algorithm can be used by both all we need do is change  to  in the comparison function as we desire.
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 241
 What about equal keys? Does the order matter or not? May be we need to sort on secondary keys or leave in the same order as the original permutations.
 What about non-numerical data? Alphabetizing is sorting text strings and libraries have very complicated rules concerning punctuation etc. Is Brown-Williams before or after Brown America before or after Brown John?
Sorting algorithms are used in all kinds of applications and are necessary for instance, if we plan to use efficient searching algorithms like Binary Search or Interpolation Search since these require their data to be sorted.
There are dozens of algorithms, the choice of which depends on factors such as the number of items relative to working memory, knowledge of the orderliness of the items or the range of the keys, the cost of comparing keys vs. the cost of moving items, etc.
To choose an algorithm we attempt to characterize the performance of the algorithm with respect to an array of size N. We then determine which operations are critical for each type of problem. For example, in sorting we can characterize the performance of a sorting algorithm by.
1. the number of times it compares an element in the array to another value (comparisons) or
2. the number of times it moves an element from or to a position in the array (swaps).
Sorting arrays lets us pay particular attention to the mechanics of the sort algorithm. Arrays, though simple, have the benefits of being internal (memory resident), randomly accessible (using indices), and capable of allowing us to accomplish the sort 'in place'. Of course there are sorts for trees, as well as special sorts for external data structures. But these often observe the sort itself with lots of housekeeping overhead.
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 242
The amount of extra memory used by a sort is important. Extra memory is used by sort in the following ways:
 sort in place using no extra memory except for a small stack or table
 use a linked-list and each element requires a link pointer
 need enough extra memory to store a copy of the array to be sorted
Normally, when considering a sorting problem, we will assume that the number of records to be sorted Is small enough that we can fit the entire data set in the computer's memory (RAM) all at once. When this Is true, we can make use of an internal sorting algorithm, which assumes that any key or record can be accessed or moved at any time. That is, we have "random access" to the data.
Sometimes, when sorting an extremely large data set such as Census Data, there are simply , too many records for them to all fit in memory at once. In this case, we have to resort to external sorting algorithms that don't assume we have random access to the data. Instead, these algorithms assume the data is stored on magnetic tapes or disks and only portions of the data will fit in memory. These algorithms use "sequential access" to the data and proceed by reading in, processing, and writing out blocks of records at a time. These partially sorted blocks need to be combined or merged in some manner to eventually sort the entire list.
One final Issue to keep in mind when Implementing a sorting algorithm is the size of the records themselves. Many sorting algorithms move and interchange records in memory several times before they are moved into their final sorted position, For large records, this can add up to lots of execution time spent simply copying data. A popular solution to this problem is called "indirect sorting". The Idea is to sort the indices of the records, rather than the records themselves.
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 243
Self Assessment Questions
1. What do you mean by Sorting method discuss in brief.
2. Explain the characteristics of the performance of a sorting algorithm
9.3 How do you sort?
There are several different ideas which lead to sorting algorithms:
1. Insertion - putting an element in the appropriate place in a sorted list yields a larger sorted list.
2. Exchange - rearrange pairs of elements which are out of order, until no such pairs remain.
3. Selection - extract the largest element form the list, remove it, and repeat.
4. Distribution - separate into piles based on the first letter, then sort each pile.
5. Merging -Two sorted lists can be easily combined to form a sorted list.
There are many different methods used for sorting. Quite frequently, a combination of these methods are used to perform a sort. We will cover four common sorting methods, which we termed selection, insertion, comparison, and divide and conquer. These common sorting methods can be represented as algorithmic functions, which are step by step, problem solving procedures that have a finite number of steps.
Sorting methods can be grouped in to various subgroups that share common themes.
1) Priority queue sorting methods:
Example: Selection Sort and Heap Sort
2) Divided-and-conquer method:
Example: MergeSort and Quicksort
3) Insertion based sort:
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 244
Example: InsertionSort
4) Other methods:
Example: BubbleSorl and ShellSort
Self Assessment Questions
1. List the various methods involves in sorting algorithms.
9.4 Evaluating a Sorting Algorithms
There are several performance criteria to be used in evaluating a sorting algorithm:
1. Running time: Typically, an elementary sorting algorithm requires O(N2) steps to sort N randomly arranged Items. More sophisticated sorting algorithms require O(N log N) steps on average. Algorithms differ in the constant that appears in front of the N2 or N log N. Furthermore, some sorting algorithms are more sensitive to the nature of the input than others. Quicksort, for example, requires O(N log N) time in the average case, but requires O(N2) time in the worst case.
2. Memory requirements: The amount of extra memory required by a sorting algorithm is also an important consideration. In place sorting algorithms are the most memory efficient since they require practically no additional memory. Linked list representations require an additional N words of memory for a list of pointers. Still other algorithms require sufficient memory for another copy of the input array. These are the most inefficient in terms of memory usage.
3. Stability: This is the ability of a sorting algorithm to preserve the relative order of equal keys in a file.
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 245
9.4.1 Stability on Sorting algorithm
When a sorting algorithm is applied to a set of records, some of which share the same key, there are several different orderings that are all correctly sorted. If the ordering of records with identical keys is always the same as in the original input, then we say that the sorting algorithm used is "stable". This property can be useful. For instance, consider sorting a list of student records alphabetically by name, and then sorting the list again, but this time by letter grade in a 1 particular course. If the sorting algorithm is stable, then all the students who got "A" will be listed alphabetically. Stability is a difficult property to achieve if we also want our sorting algorithm to be efficient Sorting algorithms are often subdivided into "elementary" algorithms that are simple to implement compared to more complex algorithms that, while more efficient, are also more, difficult to understand, implement, and debug.
It is not always true that the more complex algorithms are the preferred ones, Elementary algorithms are generally more appropriate in the following situations:
1) less than 100 values to be sorted
2) the values will be sorted just once
3) special cases such as:
a) the input data are "almost sorted"
b) there are many equal keys .
In general, elementary sorting methods require O(N2) steps for N random key values. The more complex methods can often sort the same data in just O(N log N) steps. Although it is rather difficult to prove, it can be shown that roughly N log N comparisons are required, in the general case.
Examples of elementary sorting algorithms are: selection sort, insertion sort, shell sort and bubble sort. Examples of sophisticated sorting algorithms are quicksort, radix sort, heapsort and mergesort.
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 246
Self Assessment Questions
1. Discuss the various types of performance criteria to be used in evaluating a sorting algorithm:
2. Explain where elementary algorithms are more appropriate.
9.5 Internal Sorting
In internal sorting, all the data to be sorted is available in the high speed main memory of the computer. We will study the following methods of internal sorting:
1. Insertion sort
2. Bubble sort
3. Quick sort
4. 2-Way Merge sort
5. Heap sort
9.5.1 Insertion Sort
An insertion sort has the advantage that it is simple to understand and simple to implement. Unfortunately, it is rather slow. Given an unsorted array of integer values, an insertion sort visits each element of the array, in turn. As it visits a particular element, it scans the array from the beginning up to the determines where in that segment of the array the current value belongs. It then inserts the current value in that location and shifts every element of the array to the right, up to the present location. It then goes on to the next location (value) in the array. Notice that one index is going from 0 to n and for each such value and another index is scanning the array from 0 to the value of the first index. The result of this is - that this type of sort is O(n2).
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 247
 Insertion Sort
This is a naturally occurring sorting method exemplified by a card player
arranging the cards dealt to him. He picks up the cards as they are dealt
and inserts them into the required position. Thus at every step, we insert an
item into its proper place in an already ordered list.
We will illustrate insertion sort with an example below given and presenting
the formal algorithm.
Example 1: Sort the following list using the insertion sort method:
Step 1 1 < 4, Therefore insert before 4
Step 2 3 > 1, 3 Insert between 1 & 4
Step 3 2 > I, 2, Insert between I & 3
Step 4 5 > I, 2,3,4, Insert after 4 (5)
Insertion sort
Thus to find the correct position search the list till an item just greater than
the target is found. Shift all the items from this point one, down the list.
Insert the target in the vacated slot.
We now present the algorithm for insertion sort.
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 248
ALGORITHM: INSERT SORT
INPUT: LIST[ ] of N items in random order.
OUTPUT: LIST[ ] of N items in sorted order.
1. BEGIN
2. FORI = 2 TO N DO
3. BEGIN
4. IF LIST[I] LIST[I-1]
5. THEN BEGIN
6. J = I.
7. T = LIST [I] /*STORE LIST [I] */
8. REPEAT /* MOVE OTHER ITEMS DOWN THE LIST.
9. J = J-1
10. LIST [J + 1] = LIST [J];
11. IF J = 1 THEN
12. FOUND = TRUE
13. UNTIL (FOUND = TRUE)
14. LIST [I] = T
15. END
16. END
17. END
Example : Program to sort n numbers using insertion sort.
#include<stdio.h>
voide insertion_sort(int a[],int n)
{
int i,j,item;
for(i=0;i<n;i++)
{
/* item to be inserted */
item = a[i];
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 249
/* Try from (i-1)th position */
j=i-1;
while(j>=0 && item<a[j])
{
A[j+1] = a[j] /* Move the item to the next position */
j--; /* and update the position */
}
A[j+1]=item; /* appropriate position found and so insert item */
}
}
void main()
{
int i, n,a[20];
printf(“Enter the no. of elements to sort \n”);
scanf(“%d”, &n);
printf(“Enter n elements \n”);
for(i=0;i<n;i++)
scanf(“%d”,&a[i]);
insertion_sort(a,n);
printf(“The sorted array is \n”);
for(i=0;i<n;i++)
printf(“%d\n”,a[i]);
}
9.5.2 Bubble Sort
In a bubble sort we try to improve on the performance of the two previous sorts by making more exchanges in each pass. Again we will make several passes over the array (n -1 to be exact). For the first pass, we begin at element O and proceed forward to element n- 1. Along the way we compare each element to the element that follows it. If the current element is greater
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 250
than that in the next location, then they are in the wrong order, and we swap them. In this way, an element early in the array that has a very large value is able to percolate (bubble) upward. In the next pass we do exactly the same thing. But now the last element in the array is guaranteed to be where it belongs. so our pass need only proceed as far as element n -2 of the array. This will move the second largest value in the array into the next to last position. This proceeds with each pass encompassing one less element of the array. The result, aside from a sorted array, is that we make n passes over n elements. This sort method, too, is O(n2).
The bubble sort can be made to work in the opposite direction. moving the least value to the beginning of the array on each pass. This is sometimes referred to as a stone sort. Though the names tend to get confused, In addition there are some pretty obvious improvements that can made rather readily to the bubble sort. For example, at a certain point in our multiple passes over the array it may be the case that the rest of the array is already sorted. As soon as we make a pass in which no exchanges are made this must be the case. So we can keep a counter of exchanges that take place in each pass and quit immediately if the counter is 0 at the end of a pass. Another variation that is often seen is called a shaker sort. In this you simple do one pass of a bubble sort upward, followed by a downward pass of a stone sort. This doesn't gain you much in efficiency, but it is cute.
In this sorting algorithm, multiple Swapping take place in one pass. Smaller elements move or 'bubble' up to the top of the list, hence the name given to the algorithm.
In this method adjacent members of the list to be sorted are compared. If the item on top is greater than the item immediately below it, they are swapped. This process is carried on till the list is sorted.
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 251
The detailed algorithm follows:
Algorithm for Bubble Sort
INPUT: LIST [] of N items in random order
O UTPUT: LIST [] of N items sorted in ascending order.
1. SWAP = TRUE
PASS = 0/
2. WHILE SWAP = TRUE DO
BEGIN.
2.1 FOR I = 0 TO (N-PASS) DO
BEGIN
2.1.1 IFA[I] >A [I+1]
BEGIN
TMP = A[I]
A[I] = A[I + 1]
A[I + 1] = TMP
SWAP = TRUE
END
ELSE
SWAP = FALSE
2.1.2 PASS = PASS + 1
END
END
Total number of comparisons in Bubble sort are
= (N -1) + (N -2) ...+ 2 + 1
=(N-1)* N =O(N2)
2
This inefficiency is due to the fact that an item moves only to the next position in each pass.
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 252
Example : C program, Function to arrange numbers in ascending order using bubble sort technique.
#include<stdio.h>
void bubble_sort(int a[], int n)
{
int i; /* To access subsequent item while comparing*/
int j; /* Keep track of the passes */
int temp; /* Used to exchange the item */
int sum; /* Holds the total number of exchanges */
int pass; /*Holds the number of passes required */
int exchag; /* Holds the number of exchanges in each pass */
int flag; /* Indicate any exchange has been done or not */
sum = 0;
pass = 0;
for(j=1;j<n;j++)
{
exchg = 0; /* number of exchanges just before the pass */
flage = 0; /* No exchange been done */
for(i=0;i<n-j;i++)
{
if(a[i]>=a[i+1])
{
/* Exchange and update the number of exchange in the current pass*/
temp=a[i];
a[i]=a[i+1];
a[i+1=temp;
exchg++;
sum++ /* Update the total number of exchanges */
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 253
flag=1; /* Exchange has been done */
}
}
pass++; /* update the number of passes */
printf(“Number of exchanges in pass : %d=%d\n”,j,exchg);
print(“Total number of exchanges = %d\n”,sum);
}
void main()
{
int i,n,a[20];
printf(“Enter the number of items to sort\n);
scanf(%d,&n);
print(“Enter the items to sort\n);
for(i=0;i<n;i++)
scanf(“%d”,&a[i]);
bubble_sort(a,n);
printf(“The sorted items are \n”);
for(i=0;i<n;i++)
{
Printf(“%d\n”,a[i]);
}
}
Note: At least one pass is required to check whether the items are sorted. So, the best case time complexity is O(1).
9.5.3 Selection Sort
A selection sort is slightly more complicated. but relatively easy to understand and to code. It is one of the slow sorting technique. Given an unsorted array of integer values, a selection sort visits each element of the array, in turn. As it visits an element, it employs a second index to scan the
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 254
array from the present location to the end while it identifies the least (smallest) value in that segment of the array. It then swaps that least value with the current value. Because one index scans the array once while another index scans part of the array each time, this sort algorithm is also O(n2).
Simple steps for selection sort. [Assume we are sorted the n items in array A]
for i = 1 to n do
for j = j+1 to n do
if A[i] > A[j] then swap(A[i],A[j])
Example :
Program to illustrate the Selection sort, assume we have an array „A‟ with „n‟ elements.
Void selectionsort(int A[], int n)
{
Int minindex,j,p,tmp;
for(p=0;p< n-1;p++)
{
minindex = p;
for (j= p+1; j<n; j++)
if(A[j]<A[minindex]
minindex = j;
}
tmp = A[p];
A[p]=A[minindex];
A[minindex]=tmp;
}
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 255
Example : C program, Function to arrange number in ascending order using Selection sort technique.
#include<stdio.h>
voide selection_sort(int a[],int n);
{
int i,j,pos,small,temp;
for(i=0;i<n-1;i++)
{
small=a[i]; /* Initial small number in ith pass */
pos=i; /* Position of smaller number */
/* Find the minimum of remaining elements along with the position */
for(j=i+1;j<n;j++)
{
if(a[i]<small)
{
small=a[j];
pos=j;
}
}
/* Exchange ith item with least item */
temp=a[pos];
a[pos] = a[i];
a[i]=temp;
}
}
void main()
{
int i,n,a[20];
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 256
printf(“Enter the number of elements to sort\n”);
scanf(%d”,&n);
printf(“Enter %d elements to sort \n”,n);
for(i=0;i<n;i++)
scanf(“%d”,&a[i]);
selection_sort(a,n)
printf(“The sort elements are \n”);
for(i=0;i<n;i++)
printf(“%d”,&a[i]);
}
9.5.4 Shell Sort
The shell sort is an improved version of the selection sort. It attempts to effect this improvement by finding and making exchanges that have a big impact on the eventual sorted order. That is, we want a value to make one long jump to near its eventual location, rather than lots of little exchanges that move it little closer each time. To do this we select a gap, usually something slightly less than half the size of the array. We then make a pass along the array comparing elements that are a gap distance apart. If they need to be swapped. then we do so. On the next pass and subsequent passes we decrease the size of the gap by half. Thus on our last pass the gap is just one. But by then all values are quite near their final location. This sort can be improved on slightly using the kind of modifications that we have seen in earlier sorts. The best optimized version of shell sort is theoretically O(n1.2).
Simple steps for Shell sort : [Assume we have an array „A‟ with „n‟ elements]
for(gap=n/2; gap>0; gap/=2)
{
for(i=gap; i<n; i++)
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 257
for(j=i-gap; j>=0; j -= gap)
{
if(!COMPARE(a, j, ,j+gap))
break;
SWAP(a, j, j+gap);
}
}
Example :
Program to illustrate the Shell sort, assume we have an array „A‟ with „n‟ elements.
void shellsort(int A[], int n)
{
int p, j, increment, tmp;
for( increment = 1;increment<=n; increment*=2);
for( increment = (increment/2)-1; increment >0; increment=( increment -1)/2);
{
for(p= increment; p<n; p++)
{
tmp=A[p];
for(j=p; j>= increment && A[j- increment]>tmp; j-= increment)
A[j]=A[j- increment]
A[j]=tmp;
}
}
}
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 258
9.5.5 Quick Sort
Quick sort doesn't look at all like any of the sorts we have examined up until now. It is a partition sort. It is one of the speediest of these 'in place' array sorts. It is also one of the most complex to code. We say that it is faster than its siblings, but in fact, it is also an O(n2) algorithm. It gets its reputation for speed from the fact that it is O(n2). only in the worst case, which almost never occurs. In practice quick sort is an O(n Log n) algorithm.
Quick sort begins by picking an element to be the pivot value. There is much debate, about how to best pick this initial element. In terms of understanding how quick sort works, it really doesn't matter. They can just pick the first element in the array. With the pivot to work with, the array is divided into three parts: all values less than the pivot, the pivot, and an values greater than the pivot. When this is finished, the pivot value is in its proper position in the sorted array. Quick sort, then, recursively applies this same process to the first partition, containing low values, and to the second position, which contains high values. Each time at least one element (the pivot) finds its final resting place and the partitions get smaller. Eventually the partitions are of size one and the recursion ends.
Quick sort is fast. It is also difficult to code correctly the first time. Most designers seem to believe that implementing quick sort for data set sizes less than 300 to 500 is wasted effort.
This is the most widely used internal sorting algorithm. In its basic form, it was invented by C.A.R. Hoare in 1960. Its popularity lies in the ease of implementation, moderate use of resources and acceptable behaviour for a variety of sorting cases.
The basis of quick sort is the 'divide' and conquer' strategy i.e. Divide the problem [list to be sorted] into sub-problems [sub-Iists], until solved sub problems [sorted of. sub-lists] are found. This is implemented as
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 259
1. Choose one item A [I] from the list A[ ].
2. Rearrange the list so that this item is in the proper position i.e. all preceding items have a lesser value and all succeeding items have a greater value than this item.
1. A [0], A[1]…...A[I-1] in sub list 1
2. A [I]
3. A [I + 1], A[I + 2]……...A[N] in sublist 2
3. Repeat steps 1 & 2 for sublist 1 & sublist 2 till A[ ] is a sorted list.
As can be seen, this algorithm has a recursive structure.
Step 2 or the 'divide' procedure is of at most importance in this algorithm. This is usually implemented as follows:
1. Choose A[I] the dividing element.
2. From the left end of the list (A[0] onwards) scan till an item A[R] is found whose value is greater than A[l]
3. From the right end of list [A[N] backwards] scan till an item A[L] is found whose value is less than A[I].
4. Swap A[R] & A[L].
5. Continue steps 2, 3 & 4 till the scan pointers cross. Stop at this stage.
6. At this point sublist1 & sublist2 are ready.
7. Now do the same for each of sublist1 & sublist2.
We will now give the implementation of Quicksort and illustrate it by an example.
Quicksort (int A[], int X, int I)
{
int L, R, V
1. If (IX)
{
2. V= A[I], L = X-I, R = I;
3. For (;;)
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 260
{
4. While (A[ + + L] V);
5. While (A[--R] V);
6. If (L = R) /* left & right ptrs. have crossed */
7. break;
8. Swap (A, L, R) /* Swap A[L] & A[R] */
}
9. Swap (A, L, I)
10. Quicksort (A, X, L-1)
11. Quicksort (A, L + 1, I)
}
}
Quicksort is called with A,I, N to sort the whole file.
Example:
Consider the following list to be sorted in ascending order. 'ADD YOUR
MAN'.
(Ignore blanks)
N = 10
A[ ]=
Quicksort (A, 1, 10)
1. l0 >1
2. V= A[10] = 'N'
L =I-1= 0
R = I= 10
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 261
4. A [4]=‟Y‟>V; therefore, L=4
5. A [9]=‟A‟ <V; therefore, R=9
6. L < R
8. SWAP (A, 4,9) to get
4. A[5] =‟O‟ > V; Therefore L =5
5. A[8] -'M' < V; Therefore R =8
6. L<R
8. SWAP (A, 5,8) to get]
4. A[6]=‟U‟ >V;..L=6
5. A[5]=‟M‟<V;R=5
6. L<R,..break.
9. SWAP (A,6,10) to get
At this point „N‟ is in its correct place.
A[6], A[1] to A[5] constitutes sublist1.
A[7] to A[10] constitutes sublist2. Now
10. Quicksort (A,1,5)
11. Quicksort (A,6,10)
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 262
The Quicksort algorithm uses the O(N Log2 N).comparisons on average.
The performance can be improved by keeping in mind the following points.
1. Switch to a faster sorting scheme like insertion sort when the sublist size becomes comparatively small.
2. Use a better dividing element I in the implementations; We have always used A[N] as the dividing element. A useful method for the selection of a dividing element is the Median-of three method.
Select any 3 elements from the list. Use the median of these as the dividing element.
Example : C program to sort the numbers in ascending order using quick sort.
#include<stdio.h>
/* Function to partition the array for quick sort*/
int partition(int a[],int low, int high)
{
int i,j,temp,key;
key=a[low];
i=low+1;
j=high;
while(1)
{
while(i<high&&key>=a[i])
i++;
while(key<a[j])
j--;
if(i<j)
{
temp=a[i];
a[i]=a[j];
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 263
a[j]=temp;
}
else
{
temp=a[low];
a[low]=a[j];
a[j]=temp;
}
}
}
/* function to sort the numbers in ascending order using quick sort */
void qucksort(int a[], int low, int high)
{
int j;
if(low<high)
{
j=partition(a,low,high); /* partion the array into 2 subtables */
quicksort(a,low,j-1) /* Sort the left part of the array */
quicksort(a,j+1,high); /* Sort the right part of the array */
}
}
void main()
{
int I,n,a[20];
printf(“Enter the value for n \n”);
scanf(%d”,&n);
printf(“Enter the number to be sorted \n);
for(i=0;i<n;i++)
scanf(%d”, &a[i]);
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 264
quicksort(a,0,n-1);
printf(“The sorted array is \n”);
for(i=0;i<n;i++)
printf(“%d\n”,a[i]);
}
9.5.6 Tree Sort
The use of a tree to sort an array is also a departure from the sorts that we have looked at so far. While all of those earlier sorts did their work 'in place', a tree sort requires additional space for a tree to be constructed. Our goal, in every case, has been to end up with the original array in sorted order. To do that with a tree sort we must copy the data from the tree back into the array by doing an in-order traversal of the completed tree.
A tree sort proceeds by visiting each element of the array and adding it to an ordered binary tree. The obvious disadvantage to this is the additional space required to hold the tree. When all of the elements of the array have been added to the tree, we walk the tree and repopulate the array in sorted order.
In the worst case, the original array is already in sorted order. If this happens, then for each element of the array we will end up adding that element as a leaf on a tree that is really a linked list. In this worst case a tree sort is O(n2). In practice this seldom happens, so tree sort is presumed to be O(n Log n). This is a good example of a typical space/time trade-off.
Heap Sort
A Heap sort is an efficient sort. It Is also, perhaps, the most difficult to understand. A heap sort can be done 'in place'. That is, it can be done in an array without creating any additional data structure. Recall that a tree sort has to build a tree as it proceeds. A heap sort does not have to create an actual heap (tree) representation. It can simply rearrange the array into a heap representation.
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 265
The representation of a heap using an array depends on the fact that a tree representing a heap is a complete tree. There are no gaps between the leaves on the bottom level. The first element of the array is the root of the heap. The second and third elements of the array are level 1 of the heap. The third, fourth, fifth, and sixth elements are level 2 of the heap, and so on. If the heap is not a full tree, there will be some elements missing from the part of the array that represents the final level of the heap; but these will be on the end of the array. There will be no gaps.
A heap sort can be accomplished by heapifying the original, unsorted array. Then select the root of the heap, which must be the largest element. Swap this value with the last element in the heap and reheapify .Notice that the heap portion of the array shrinks as you do this. The array is partitioned into the sorted portion, at the end, and the heap portion, at the beginning. The sorted portion grows and the heap portion shrinks until only the entire array is sorted.
Heap sort is an O(n log n) sort. It is especially useful for very large data sets of specialized data.
Self Assessment Questions
1. Discuss the Insertion sort with suitable example.
2. Write a C program to arrange the numbers in ascending order using bubble sort technique.
3. Write algorithm for Selection sort.
4. Write the advantages of Quick Sort
9.6 External Sorts
9.6.1 Merge Sort
Like the other sorts we have seen, we will look at merge sort (and later at radix sort) In terms of a simple Internal array. These sorts, though, are really
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 266
only useful when they are applied to large external data structures. Merge sort is a very old sort. It was used extensively when the primary external data store was magnetic tapes. It actually takes its name from the action of merging 2 or more sorted tapes to make a single sorted tape. In fact, there are a variety of sorts known a merge sorts. They have names such as balanced merge, natural merge, and polyphase merge; and are all variations on the same theme.
A true merge sort is a two part process. The first phase is called the distribution phase and the second is called the merge phase. These two phases may be alternated several times before the sort is complete. In the course of the distribution phase, data elements must be written to a new array (or a new file). As a consequence, merge sorts (all of them) require at least 2n space.
In a distribution phase elements from the original (or current) array are written into a new array (or arrays) such that the new array(s) are individually sorted, This can be done in several ways. One way is to select items from the original array and place them into bucket arrays. There might be a bucket to hold values 0-10, another for values 11-20, and so on. The buckets can be sorted independently, then passed to a merge phase to recombine them into a final sorted file.
In another approach to the distribution phase, values are taken from the original array until there is a drop down (i.e. an out of order value is encountered). The new partial
An array is then merged with what remains of the original array. In this fashion, each distribution/merge phase guarantees that one element is sorted. This approach, however, may require up to n distribution/merge phases. Nonetheless, merge sorts are O(n log n), with the drawback that they need extra space to operate.
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 267
9.6.2 2-Way Merge Sort
Merge sort is also one of the 'divide and conquer' class of algorithms. The basic idea into this is to divide the list into a number of sublists, sort each of these sublists and merge them to get a single sorted list. The recursive implementation of 2- way merge sort divides the list into 2 sorts the sublists and then merges them to get the sorted list. The illustrative implementation of 2 way merge sort sees the input initially as n lists of size 1. These are merged to get 0/2 lists of size 2. These n/2 lists are merged pair wise and so on till a single list is obtained. This can be better understood by the following example. This is also called CONCATENATE SORT.
We give here the recursive implementation of 2 Way Merge Sort
Mergesort (int List[ ], int, low, int high)
{
int mid;
1. Mid = (low + high)/2;
2. Mergesort (LIST, low, mid);
3. Mergesort (LIST, mid + 1, high);
4. Merge (low, mid, high, List, FINAL)
}
Merge (int low, int mid, int high, int LIST [ ], int FINAL)
{
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 268
int a, b, c, d;
a = low b = low c = mid + 1
While (a < = mid and c < = high) do
{
If LIST [ a] < = LIST [c] then
{
FINAL [b] = LIST [a]
a=a+1
}
else
{
FINAL [b] = LIST [c]
c=c+1
}
b=b+1
}
If (a > mid) then
for d = c to high do
{
B [bl = LIST [d]
b = b+1
}
Else
for d = a to mid do
{
B[b] = A[d]
b = b+l.
}
}
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 269
To sort the entire list, Mergesort should be called wit h LIST, 1, N.
Mergesort is the best method for sorting linked lists in random order. The total computing time is of the 0(0 log2 n).
The disadvantage of using mergesort is that it requires two arrays of the same size and type for the merge phase. That is, to sort and list of size n, it needs space for 2n elements.
9.7 Summary
Sorting is a process of arranging a series in either ascending or descending order of their values/keys. Sorting a set of data assumes, first of all, that each element in the set has a value that we can use as the sort key, the field on which we will sort. In context to this, we discussed the various sorting techniques covers the internal and external sorting techniques with the their evaluating a Sorting Algorithms such as Insertion Sort, Bubble Sort, Selection Sort, Shell Sort, Quick Sort, Tree Sort, External Sorts, Merge Sort and 2-Way Merge Sort.
9.8 Terminal Questions
1. Discuss the different ideas which lead to Sorting Algorithms.
2. Explain what are the criteria to be used in evaluating a Sorting Algorithm?
3. Explain any two Internal Sorting with algorithm.
4. Write a „C‟ program to sort „N” numbers using insertion sort.
5. Write the algorithm and „C‟ program for sorting the numbers in ascending order using quick sort technique.
6. Write note on:
a) Heap sort
b) Merge sort
Data Structures using „C‟ Unit 9
Sikkim Manipal University Page No.: 270
References :
1. Y. Langsam, M. J. Augenstein and A. M. Tannenbaum, “Data Structures Using C and C++", Prentice-Hall India, Second Edition.
2. R. Kruse, "Data Structures and Program Designs", Prentice-Hall India, Third Edition.
3. R. F. Gilberg, “Data Structures: A Pseudo code Approach with C", Thomson Learning.
4. Tremble and Sorenson,” Data Structures and Algorithms", Tata McGraw-Hill.
5. M. A. Weisis, "Data Structures and Algorithm Analysis in C++", Addison Wesley, Low Price Edition.

No comments:

Post a Comment