INTRODUCTION |
Computers are far more than pure calculating machines for simply crunching numbers. Rather, it is known that a considerable portion of the processing time of all computers active worldwide is used for sorting and searching data. It is therefore important that a program not only provides the correct data, but is also optimized. This applies to:
Basically, one should already think of optimizing in the beginning of the development process, since it is usually difficult to optimize a sloppily written program afterwards. In this chapter you will examine the optimization of the execution time when sorting data. You will also get to know the limits of computer science and computer use because a problem, for which there is probably an algorithmic solution process but even the fastest computer would need hundreds of years to solve, is considered to be an unsolvable problem.
|
SORTING LIKE CHILDREN: CHILDREN SORT |
The sorting and ordering of a set of objects with the comparative operations larger, smaller and equal [more... In mathematics this is called an order relation] is and remains a standard task in computer science. Although you will find library routines in all common high-level programming languages that you can sort with, you should include the concepts of sorting to your standard knowledge because there are always situations where you need to implement or optimize the sorting yourself. A collection of unsorted objects is referred to as a set. The objects are saved in a one-dimensional data structure in the computer where a list is especially well suited [more... Earlier arrays were often used, but have the disadvantage of a fixed length]. In the program, consider the dwarfs to be actors of the game library JGameGrid. You can easily display their sprite images in a grid [more... With today's programming languages it is possible without significant additional effort to sort interesting objects as numbers]. The height of the sprite image (in pixels) also serves as a measure of the body size. Algorithms are often borrowed directly from processes that we also apply in your everyday life. If you ask children how they organize a set of objects by size, they often describe the process as follows: "You take the smallest (or biggest) object and set it down in order". This solution process seems very plausible, but there is a problem for the computer because it can not determine the smallest or biggest object like humans can with just a simple glance. It must first look for the smallest object in the unordered list by running through all the objects in order and comparing them. To implement the sorting process, in this case called children sort, you need a function getSmallest(row) which returns the smallest dwarf from the reviewed list. You can start as follows: Save the first list element in the variable smallest and run through all subsequent elements in a for-loop. If the element you are looking at is smaller than smallest, replace smallest with it. You use two lists for children sort, one list startList with the given objects and another list targetList which is initially empty. You search for the smallest element in startList, remove it from there, and then add it to the back of targetList until startList is empty. from gamegrid import * from random import shuffle def bodyHeight(dwarf): return dwarf.getImage().getHeight() def updateGrid(): removeAllActors() for i in range(len(startList)): addActor(startList[i], Location(i, 0)) for i in range(len(targetList)): addActor(targetList[i], Location(i, 1)) def getSmallest(li): global count smallest = li[0] for dwarf in li: count += 1 if bodyHeight(dwarf) < bodyHeight(smallest): smallest = dwarf return smallest n = 7 makeGameGrid(n, 2, 170, Color.red, False) setBgColor(Color.white) show() startList = [] targetList = [] for i in range(0 , n): dwarf = Actor("sprites/dwarf" + str(i) + ".png") startList.append(dwarf) shuffle(startList) updateGrid() setTitle("Children Sort. Press <SPACE> to sort...") count = 0 while not isDisposed() and len(startList) > 0: c = getKeyCodeWait() if c == 32: smallest = getSmallest(startList) targetList.append(smallest) startList.remove(smallest) count += 1 setTitle("Count: " + str(count) + " <SPACE> for next step...") updateGrid() setTitle("Count: " + str(count) + " All done")
|
MEMO |
With children sort, besides the given unordered list of length n, you need a second list that eventually also has the length n. If n is very large this could lead to a memory problem. [more...
Sorting algorithms, which require a second data structure of the same length, are called You can easily figure out how many elementary steps are required to solve the problem: Regardless of how the objects are arranged in the given list, you first need to run through all n elements of the list, then through n-1, etc. In addition to this, you have to perform the operation of moving the found element from the start list to the target list each time. The number of operations c is therefore the sum of all natural numbers from 2 to n + 1 as you can see with your variable count. Using the formula for sums of natural numbers we get:
For example, when n = 1000 we already get
steps! As you can see, the quadratic term prevails for large n values and this is why we say
|
SORTING THE CARD GAME: INSERTION SORT |
When you hold the cards in a fan shape while playing cards, you often intuitively use another sorting method: You add each newly obtained card to your 'fan' in a particular place where it fits best according to its value. In your program that inserts the disordered cards from the start list (the deck) into the target list (your hand), you proceed exactly in this way: You take card by card from left to right from the start list and run though the already ordered target list from the left to right as well. As soon as the card you picked up is considered to be higher in value than the last one considered in your hand, you add it to the target list right there. from gamegrid import * from random import shuffle def cardValue(card): return card.getImage().getHeight() def updateGrid(): removeAllActors() for i in range(len(startList)): addActor(startList[i], Location(i, 0)) for i in range(len(targetList)): addActor(targetList[i], Location(i, 1)) n = 9 makeGameGrid(n, 2, 130, Color.blue, False) setBgColor(Color.white) show() startList = [] targetList = [] for i in range(0 , 9): card = Actor("sprites/" + "hearts" + str(i) + ".png") startList.append(card) shuffle(startList) updateGrid() setTitle("Insertion Sort. Press <SPACE> to sort...") count = 0 while not isDisposed() and len(startList) > 0: getBg().clear() c = getKeyCodeWait() if c == 32: pick = startList[0] # take first startList.remove(pick) i = 0 while i < len(targetList) and cardValue(pick) > cardValue(targetList[i]): i += 1 count += 1 targetList.insert(i, pick) count += 1 setTitle("Count: " + str(count) + " <SPACE> for next step...") updateGrid() setTitle("Count: " + str(count) + " All done")
|
MEMO |
This sorting method is called insertion sort. The number of steps necessary depends on the order of the cards in the deck. The most steps are needed in a situation where the deck of cards is coincidentally ordered in reverse. One can reflect about it, or find out with a computer simulation, that the number of steps on average (for large n) increase with n2 / 4 , and thus the average complexity is also O(n2), as with children sort. |
SORTING WITH BUBBLES: BUBBLE SORT |
A known way to sort objects in a list is to repeatedly run through the list from left to right and to always swap two adjacent elements if they are in the wrong order. With this method, first the largest element moves successively from left to right until it has arrived in the final cell. In the next pass you start again on the left, but only go up to the second to last element since the largest is already in place. No second list is necessary in this process [more... We call such a sorting process in-place or in-situ].from gamegrid import * from random import shuffle def bubbleSize(bubble): return bubble.getImage().getHeight() def updateGrid(): removeAllActors() for i in range(len(li)): addActor(li[i], Location(i, 0)) def exchange(i, j): temp = li[i] li[i] = li[j] li[j] = temp n = 7 li = [] makeGameGrid(n, 1, 150, Color.red, False) setBgColor(Color.white) show() for i in range(0 , n): bubble = Actor("sprites/bubble" + str(i) + ".png") li.append(bubble) shuffle(li) updateGrid() setTitle("Bubble Sort. Press <SPACE> for next step...") k = n - 1 i = 0 count = 0 while not isDisposed() and k > 0: getBg().fillCell(Location(i, 0), makeColor("beige")) getBg().fillCell(Location(i + 1, 0), makeColor("beige")) refresh() c = getKeyCodeWait() if c == 32: count += 1 bubble1 = li[i] bubble2 = li[i + 1] refresh() if bubbleSize(bubble1) > bubbleSize(bubble2): exchange(i, i + 1) setTitle("Last Action: Exchange. Count: " + str(count)) else: setTitle("Last Action: No Exchange. Count: " + str(count)) getBg().clear() updateGrid() if i == k - 1: k = k - 1 i = 0 else: i += 1 getBg().clear() refresh() setTitle("All done. Count: " + str(count))
|
MEMO |
The largest elements move over to the right, just like bubbles move up in water. Because of this, the name of this sorting algorithm is bubble sort. You can think about it, or check the built-in step counter, its complexity is independent of the arrangement of the elements in the specified list, but again of the order O(n2). To make the demonstration a bit more exciting, both cells whose bubbles were compared last are colored using the background method fillCell(). The background color can be cleared with getBg().clear(). You have to call refresh() so that the image is re-rendered on the screen. |
SORTING WITH LIBRARY ROUTINES: TIMSORT |
Since sorting is one of the most important tasks, all high-level programming languages provide built-in library functions for sorting. In Python, the function sorted(list, cmp) even belongs to the standard built-in functions, which means that it can be used without having to import anything. You can thus save yourself from having to write a sorting algorithm, but in order to do that you have to learn how the library function is used. Clearly it need the list to be sorted as a parameter. With a second parameter, you have to tell it according to which classification criterion it should sort the objects. You define the sorting criterion in a function which here you call compare(). This function has to obtain two objects as parameters and return one of the three values 1, 0, and -1, depending on whether the first object is greater, equal to, or less than the second object. You pass the library function sorted() your freely chosen function name as a second parameter or using the named parameter cmp. from gamegrid import * from random import shuffle def bodyHeight(dwarf): return dwarf.getImage().getHeight() def compare(dwarf1, dwarf2): if bodyHeight(dwarf1) < bodyHeight(dwarf2): return -1 elif bodyHeight(dwarf1) > bodyHeight(dwarf2): return 1 else: return 0 def updateGrid(): removeAllActors() for i in range(len(li)): addActor(li[i], Location(i, 0)) n = 7 li = [] makeGameGrid(n, 1, 170, Color.red, False) setBgColor(Color.white) show() for i in range(0 , n): dwarf = Actor("sprites/dwarf" + str(i) + ".png") li.append(dwarf) shuffle(li) updateGrid() setTitle("Timsort. Press any key to get result...") getKeyCodeWait() li = sorted(li, cmp = compare) updateGrid() setTitle("All done.")
|
MEMO |
If you want to use library functions to sort, you have to specify with a comparison function how two elements are compared to find out whether they are greater, equal or lesser [more...
For a list of numbers that you'd sort ascending, you need no comparison indicate, the usual The algorithm used in Python was invented by Tim Peters in 2002 and is thus called Timsort. It has (on average) the order O(nlog(n)). So, for example, when n = 106 only about 107 operations are necessary instead of about 1012 as it would be with a sorting algorithm with the order O(n2). |
EXERCISES |
|
ADDITIONAL MATERIAL |
OVERLOADING THE COMPARISON OPERATIONS |
Comparing two objects is an important operation. You can use the 5 comparison operations <, <=, ==, >, = > for numbers. In Python it is possible to apply these operators to some other data types, for example for dwarfs. Through this, your code gains elegance and clarity. You can do the following: In the class definition of your data type, define the methods _lt_(), __le__(), __eq__(), __ge__(), __gt__() that return the Boolean values of the comparison operations less, less-and-equal, equal, greater-and-equal, greater. In addition, you can also define the method __str()__, which is used when the function str() is called. However this has nothing to do with sorting. In the class Dwarf (derived from Actor), you also save the name of the dwarf as an instance variable that you can write out as a TextActor upon updateGrid(). from gamegrid import * from random import shuffle class Dwarf(Actor): def __init__(self, name, size): Actor.__init__(self, "sprites/dwarf" + str(size) + ".png") self.name = name self.size = size def __eq__(self, a): # == return self.size == a.size def __ne__(self, a): # != return self.size != a.size def __gt__(self, a): # > return self.size > a.size def __lt__(self, a): # < return self.size < a.size def __ge__(self, a): # >= return self.size >= a.size def __le__(self, a): # <= return self.size <= a.size def __str__(self): # str() function return self.name def compare(dwarf1, dwarf2): if dwarf1 < dwarf2: return -1 elif dwarf1 > dwarf2: return 1 else: return 0 def updateGrid(): removeAllActors() for i in range(len(row)): addActor(row[i], Location(i, 0)) addActor(TextActor(str(row[i])), Location(i, 0)) n = 7 row = [] names = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"] makeGameGrid(n, 1, 170, Color.red, False) setBgColor(Color.white) show() for i in range(0 , n): dwarf = Dwarf(names[i], i) row.append(dwarf) shuffle(row) updateGrid() setTitle("Press any key to get result...") getKeyCodeWait() row = sorted(row, cmp = compare) updateGrid() setTitle("All done.")
|
MEMO |
The use of comparison operators for arbitrary data types is not mandatory, but elegant. One says that the operators are overloaded. |