Talk:Tree sort
This article is rated Start-class on Wikipedia's content assessment scale. It is of interest to the following WikiProjects: | |||||||||||||||||||||||||||||||||||||||||
|
Merge and Redirect to Binary Search Tree
editI would suggest merging and redirecting Tree sort to binary search tree. The information on Tree sort is already covered in binary search tree under the Operations section. It seems unnecessary to have a separate entry for this topic. 300user (talk) 04:57, 13 February 2008 (UTC)
Big O Notation
editAfter looking up this page, I noticed the article does not discuss speed, so I asked my professor. He told me it was O(n log n) so I put that in the article. After playing with the code myself, it seems to me to be more like O(n) - I'm going to defer to him and leave the main page as is, but if someone with more knowlege of algorithm speed knows exactly how fast tree sort runs would like to double check me and my professor, I'd appreciate it.
- Binary_search_tree has Θ(n^2) for worst case. Maybe your professor was referring to the average performance and not the worst-case performance. Rōnin (talk) 20:10, 7 April 2009 (UTC)
- It would seem from the article that O(n^2) is the worst case for Binary Search Trees and O(n log n) for balanced trees.Rōnin (talk) 20:32, 7 April 2009 (UTC)
- I agree, and will update the page accordingly. Performance depends on whether an unbalanced or a balanced (binary) tree is used, balanced performing better in O-terminology. Rijk J.C. van Haaften (talk) 09:58, 15 April 2009 (UTC)
- It would seem from the article that O(n^2) is the worst case for Binary Search Trees and O(n log n) for balanced trees.Rōnin (talk) 20:32, 7 April 2009 (UTC)
Are there any references suggesting that this should be O(n log n) average, rather than just best case? The average case would be interesting. --Thomasda (talk) 14:13, 12 November 2010 (UTC)
I'm curious how the best case O(n) runtime was arrived at. As the article mentions, constructing a tree requires inserting n elements, with each insertion taking log(n) time. I don't think you can construct a BST in less than O(n log n), which should place a bound on the search algorithm speed. Igmcdowell (talk) 18:58, 9 June 2011 (UTC)
- Not all insertions necessarily have to take Θ(log(n)) time. Suppose for example that the input is already sorted. Then you'll be inserting the elements into a tree in a sorted order. If you use an unbalanced tree structure (or a splay tree, if you want to get O(n log n) time in the worst case), then each insertion takes O(1) time - you create a new node and attach previous tree to it as a subtree. -- X7q (talk) 19:51, 9 June 2011 (UTC)
- Okay, suppose the input is already sorted and the resulting Binary Search Tree looks like a linked list (i.e. a degenerate case). How long does it take to build that linked list, using the normal Binary Search Tree insertion algorithm? Answer: O(n^2) time, not O(n). You can't just make up a new BST insertion algorithm... or can you? Suppose you check the input first, to see if it's already sorted (and therefore we can switch to a different BST insertion algorithm). Then we don't need to worry about building a BST; the job-to-be-done, i.e. sorting, does not need to be done: the input is already sorted. In fact, we could modify all sorting algorithms to first check if the input is sorted, resulting in an O(n) best-case time complexity for every sort algorithm (modified in this way)! But that's not what's done. In fact, one normally uses tree sort to put incoming data, from a stream, into a BST: one doesn't yet have all the input data. In summary, I think the best case time complexity for tree sort is O(n log n), which is the time complexity to build a balanced BST (using the normal BST insertion algorithm). I would appreciate it if someone checks this and repairs the page accordingly. Drttm (talk) 21:42, 7 August 2013 (UTC)
- I totally agree with Drttm argument here; it does not seem possible to build a degenerated tree in linear time, simply because you always start from the root (so you end up spending 1 step, 2 steps, 3 steps, etc ... ending up with the sum of the first n numbers, which is n*(n+1)/2 ~ O(n^2)). I am using CLRS book for a course now, and on the chapter 12 about Binary Search Trees it states that all basic operations, including insertion, run in O(h); where "h" is the height of the tree ( ~ lg(n) ). So claiming that an insertion can take O(1), seems to contract such result ... and I would not dare to contradict Cormen et al, would you? ;-|
- Furthermore, putting aside the trivial case of adding an already-sorted-check in the algorithm, if we claim that a comparison-based algorithm like TreeSort can do O(n), we may sound counter-intuitive for someone studying the cited book (CLRS). This is because, prior introducing non-comparison algorithms that can actually achieve linear time, authors present Theorem 8.1; the theorem states that any comparison sort algorithm belongs to Omega(n * lg(n)). Though the result is presented for the worse case scenarios, the bottom line sounds like:
- a) You can not sort in linear time using comparisons-based algorithms, unless you add the check to see if data is already sorted (but that check does not really count as sorting, I think). Some algorithms may not require the check, because they have it already built in, like InsertionSort.
- b) If you want to achieve linear time, you need to use other types of algorithms which are not based on comparisons (like CountSort or RadixSort).
- Stating the opposite to the points above, will most likely confuse students from CS which are checking CLRS or similar literature.
- Thus, I think we must reconsider in putting away that claim (that there are cases where TreeSort can do O(n)).
- Thanks. — Preceding unsigned comment added by Dario.bahena (talk • contribs) 03:23, 29 September 2014 (UTC)
- I slapped on some {{citation needed}} templates for the best case performance. It doesn't seem right to me that the balanced case should be more expensive: the Day–Stout–Warren algorithm can balance a tree in linear time, after which you can use, e.g., AVL tree operations to keep it balanced. In fact the C++ standard mandates that
std::set
construction takes linear time for sorted inputs, and this is commonly implemented using red-black trees. QVVERTYVS (hm?) 09:49, 29 September 2014 (UTC)
- I slapped on some {{citation needed}} templates for the best case performance. It doesn't seem right to me that the balanced case should be more expensive: the Day–Stout–Warren algorithm can balance a tree in linear time, after which you can use, e.g., AVL tree operations to keep it balanced. In fact the C++ standard mandates that
C++ not illuminating
editThe C++ code example is not illuminating in the slightest. Most of the sort pages show implementations of the *idea*, while this C++ code is just... "call the library function." Terrible. —Preceding unsigned comment added by 209.6.209.152 (talk) 19:59, 23 January 2010 (UTC)
I removed it. -- C. A. Russell (talk) 10:43, 19 August 2010 (UTC)
- I disagree. The C++ version that was on there was the only version on the page that performed the efficient O(n log n) sort. The very description of this sorting algorithm calls for the use of a particular data structure, in this case a self-balancing binary search tree. So the example does just that -- use the data structure. How such a data structure might be implemented internally is not relevant, and is described by other articles. This is called abstraction -- we treat it as a black box, knowing what this data structure guarantees us. The brevity of the code is just a reflection of simplicity of the description of this sorting method. --71.141.136.107 (talk) 08:17, 20 August 2010 (UTC)
I agree that the C++ snippet doesn't add anything to the article. Showing that it can be done in C++ using STL doesn't provide any useful demonstration of the algorithm. 173.188.180.118 (talk) 03:52, 19 November 2010 (UTC)
I agree with the people who don't want it there. It doesn't actually show how the algorithm works, and thus doesn't really belong on the page. Rōnin (talk) 02:37, 20 November 2010 (UTC)
- I agree that the C++ code shown is not useful here. The purpose of code samples should be to show how the algorithm could be implemented, not how to use a library. --Joshua Issac (talk) 02:10, 11 March 2011 (UTC)
Best case of O(n) time, O(1) space
editIt just struck me that the best case time and space complexity are actually O(1), if a tree is used that counts duplicates (i.e. using a key-value BST instead of a dynamic set one, and using the value as a counter). As this is OR, I haven't added it to the article. QVVERTYVS (hm?) 14:20, 9 January 2014 (UTC)
- If we assume that each insert and each retrieving operation takes a constant amount of time, c, then calling n inserts and n retrieve operations on n elements will take (n+n)*c. Thus, even in that ideal case, tree sort is O(n) in the best case.
- If what you're suggesting involves placing the elements in some special data structure before sorting them, then that operation, too, is at least O(n), since you're processing each one in some way or another. Rōnin (talk) 18:10, 9 January 2014 (UTC)
- Facepalm O(1) time per insertion, of course, so O(n) total. Sorry. QVVERTYVS (hm?) 22:04, 9 January 2014 (UTC)
Why does this binary tree sort not work?
editI have been struggling with this program for too long. Please look and tell me what I am doing wrong:
// BT.CPP /** Traverse a created binary tree and present in sorted order. */ #include <stdio.h> struct node { int key_value; node *left; node *right; }; void printInorder(struct node *leaf); class btree { public: btree(); ~btree(); void insert(int key); node *search(int key); void destroy_tree(); private: void destroy_tree(node *leaf); void insert(int key, node *leaf); node *search(int key, node *leaf); void printInorder(); void printInorder(int key, node *leaf); node *root; }; btree::btree() { root = NULL; } btree::~btree() { destroy_tree(); } void btree::destroy_tree(node *leaf) { if (leaf != NULL) { destroy_tree(leaf->left); destroy_tree(leaf->right); delete leaf; } } void btree::insert(int key,node *leaf) { if (key < leaf->key_value) { if (leaf->left != NULL) { insert(key,leaf->left); } else { leaf->left = new node; leaf->left->key_value = key; leaf->left->left = NULL; leaf->left->right = NULL; } } else { if (leaf->right != NULL) { insert(key,leaf->right); } else { leaf->right= new node; leaf->right->key_value = key; leaf->right->left = NULL; leaf->right->right = NULL; } } } node *btree::search(int key, node *leaf) { if (leaf != NULL) { if (key == leaf->key_value) { return leaf; } if (key < leaf->key_value) { return search(key, leaf->left); } else { return search(key, leaf->right); } } return leaf; } void btree::insert(int key) { if (root != NULL) { insert(key, root); } else { root = new node; root->key_value = key; root->left=NULL; root->right=NULL; } } node *btree::search(int key) { return search(key, root); } void btree::destroy_tree() { destroy_tree(root); } void btree::printInorder() { printfInorder(root); } void btree::printInorder(int key, node *leaf) { if (leaf != NULL) { search(key,leaf->left); printf("%d, ",leaf->key_value); search(key,leaf->right); } return leaf; } /** using data from page 45 of book */ int MyData[12] = {128,76,106,402,100,46,354,1018,112,28,396,35}; int main() { int i; btree MyBtree; printf("The original order:\n\n"); for (i=0;i<12;i++) { printf("%d, ",MyData[i]); MyBtree.insert(MyData[i]); } /** So now, how do I get it to print in sorted order? */ printf("The sorted order:\n\n"); MyBtree.printInorder(); return 0; }