KBSE30

Merge Sort

Analysis of Algorithms
Merge sorting is an effective sorting algorithm based on merge operations. This algorithm is a very typical application using the divide and conquer method (Divide and Conquer). Combine existing ordered subsequences to obtain a completely ordered sequence; that is, first make each subsequence in order, and then make the subsequences in order. If two ordered lists are merged into one ordered list, it is called two-way merge.
The basic idea:
First divide the array into two sub-arrays recursively, until there is only one element in the array, and then call the function to sort the two sub-arrays, because this function will be pushed onto the stack when recursively dividing the array, so this function really The function is to sort two ordered sub-arrays;
The basic steps:
1. Judging the validity of the parameters, that is, the recursive exit;
2. No matter what, the array is divided into two sub-arrays directly;
3. The function of dividing the array is called recursively, and finally divided into the array with only one element, which also means that the array is ordered;
4. Then call the sort function to merge the two ordered arrays into one ordered array;
5. The steps of the sorting function, compare the elements of the two arrays, and store the large/small elements in a temporary array. If the elements of one array are taken out, then directly put the elements of the other array To the temporary array, and then copy the elements in the temporary array to the actual array;

Implementation code

#include<stdio.h>
 
 #define LEN 12 // Macro defines the size of the array
 static int tmp[LEN] = {0};// Set temporary array
 
// print array
 void print_array(int *array)
 {
     int index = 0;
     printf("\narray:\n");
     for (; index <LEN; index++){
         printf(" %d, ", *(array + index));
     }
     printf("\n");
 }
 
// Sort two ordered arrays into one array
 void _mergeSort(int *array, int start, int middle, int end)
 {
    int first = start;
    int second = middle + 1;
    int index = start;
    while ((first <= middle) && (second <= end)){
        if (array[first] >= array[second])
            tmp[index++] = array[second++];
        else
            tmp[index++] = array[first++];
    }
    while(first <= middle) tmp[index++] = array[first++];
    while(second <= end) tmp[index++] = array[second++];
 
    for (first = start; first <= end; first++)
        array[first] = tmp[first];
 }
 
// Recursively divide the array
 void mergeSort(int *array, int start, int end)
 {
     if (start >= end)
         return;
     int middle = ((end + start) >> 1);
     mergeSort(array, start, middle);// Recursively divide the left array
     mergeSort(array, middle+1, end);// Recursively divide the array on the right
     _mergeSort(array, start, middle, end);// Combine two ordered arrays into an ordered array
 }
 
 int main(void)
 {
     int array[LEN] = {2, 1, 4, 0, 12, 520, 2, 9, 5, 3, 13, 14};
     print_array(array);
     mergeSort(array, 0, LEN-1);
     print_array(array);
     return 0;
 }



Analyze the above code: In fact, the above code mainly consists of two functions, the first is the function of dividing the array, and the second is the merging function of merging two ordered arrays; here we need to use a temporary array, some people are in Apply for a dynamic array in the main function, and then let all recursive calls use the array; some people apply for a temporary array in the merge function; and my method is to define a global temporary array; in fact, I feel that these methods are all The same is the same, because whether it is a dynamic array or a global static array, a copy of data will be saved when the sort function is called by recursive release; if a temporary array is defined in the sort function, it should be the same as the previous method, because it is a local temporary array , Stored in the stack space, when the function is called, it will be released immediately. So I personally feel that these three methods are similar (if it is a temporary array defined in the merge function, all need to be pushed onto the stack; while the others only need to be pushed into the space occupied by useful data)

Time Complexity

Time complexity analysis of merging: Mainly consider the time cost of two functions. First, the array division function mergeSort(); Second, the ordered array merge function _ mergeSort().
The time complexity of the _mergeSort() function is O(n), because the code has 2 loops of length n (non-nested), so the time complexity is O(n);
A simple analysis of the time T[n] consumed by the merge sort with element length n: call the mergeSort() function to divide the two parts, then the time spent on sorting each small part is T[n/2], and finally The time taken by the _mergeSort() function to merge the two ordered arrays into one ordered array is O(n);

    Formula: T[n] = 2T[n/2] + O(n);

    The formula is not deduced carefully, you can refer to the following: Quick sort of sorting algorithm and the derivation of time complexity in its time complexity and space complexity;

    So the result is: T[n] = O( nlogn)

    Because these steps are required no matter what the element is in, the time spent is constant, so the optimal time complexity, worst time complexity and average time complexity of the algorithm are the same as: O (nlogn ); Someone seems to say that the worst time complexity is not O(nlogn). 

Space complexity
The space complexity of the merge is the space occupied by the temporary array and the data pushed onto the stack during recursion: n + logn; so the space complexity is: O(n)

Trade time for space
I saw that many blogs on the Internet share the merge sort method whose space complexity is only O(1); because the space consumed by the traditional merge sort is mainly in the merge function (combining two ordered functions into one ordered function) , So if you want to make the time complexity O(1), then you can only make a fuss about the merge function. The code is not listed. The main idea is to use quick sort (in fact, it is equivalent to the merge function is replaced by the quick sort function); although this method can reduce memory consumption, it will bring time loss. Because this time complexity has become O(n^2); so this method is not an idea that has the best of both worlds;

Conclusion
Although merge sort is relatively stable, it is also very effective in time (the worst time complexity and optimal time complexity are both O(nlogn)), but this kind of algorithm consumes space, and generally speaking, internal sorting will not be used This method, but use quick sort; external sort will only consider the use of this method.

Leave a Reply

Your email address will not be published. Required fields are marked *