Talk:Sorting algorithm

Latest comment: 1 year ago by Bubba73 in topic tk pic
Former featured article candidateSorting algorithm is a former featured article candidate. Please view the links under Article milestones below to see why the nomination was archived. For older candidates, please check the archive.
Article milestones
DateProcessResult
August 30, 2004Featured article candidateNot promoted

How bad is this code

edit

template <class iterator> void gnome(iterator first, iterator last){ // O(n^2) iterator pos = first; while (pos <= last) { if ((pos == first) || ((*pos) >= *(pos - 1))) ++pos; else { std::iter_swap(pos, pos+1); --pos; } } }

Is this a genuine question about the quality of your code? If so, I'd probably suggest posting on StackExchange or something. — Preceding unsigned comment added by 2605:6000:1522:400B:10FB:405:731E:8F1B (talk) 15:59, 4 October 2017 (UTC)Reply

Slowsort

edit

I'm surprised at not finding slowsort here. While it is one of the less serious kind of algorithms, it is interesting since it is, however slowly, steadily progressing to its goal (as opposed to, say, bogosort). The algorithm:

  1. Find the maximum
    1. Find the maximum of the first half
    2. Find the maximum of the second half
    3. Find the largest of those maxima
  2. Sort the remainder

It is described at http://c2.com/cgi/wiki?SlowSort and was, according to that page, first published in the satyrical article "Pessimal algorithms and the simplexity of computations" by A. Broder and J. Stolfi, in ACM SIGACT News vol. 16 no. 7, pages 49--53, fall 1984. I haven't located that article yet, though. Both links given there are broken. —Preceding unsigned comment added by SQB (talkcontribs) 20:08, 26 July 2009 (UTC)Reply

Stupid sort

edit

Stupid sort - This article was copied from the deleted Wikipedia article which sat in the "Stupid sort" page. -- A perfect demonstration how stupidity proliferates in the internet with the help and andorsement of wikipedia :-) 23:25, 6 February 2010 (UTC)

Yes, the algorithm does seem to be a joke. I recall seeing it here before and wondered what had happened to it. But it seems the term is also used to refer variously to bogosort or to gnome sort. Currently, stupid sort redirects to bogosort.... — Smjg (talk) 02:26, 16 December 2011 (UTC)Reply

Why are worst case time complexities wrong?

edit

For example, the time complexity for bubble sort is listed as n2.

While its computational complexity (time + space) is on the order of n2 (2n2-2n), the worst case time complexity is not n2.

worst case time complexity (comparisons) for bubble sort is (n2/2) - (n/2). But they are all wrong. —Preceding unsigned comment added by 74.140.173.132 (talk) 05:57, 24 February 2010 (UTC)Reply

I don't understand what you say, because Ordo((n2/2) - (n/2)) = Ordo(n2). Any term less than n2 is overshadowed by n2, and the constant 1/2 is irrelevant. The complexity is a scalability function when the function compares to itself dealing with increasingly huge arrays. Rursus dixit. (mbork3!) 19:48, 6 July 2010 (UTC)Reply

the wrong is the thing —Preceding unsigned comment added by 203.110.243.22 (talk) 12:07, 20 November 2010 (UTC)Reply

There is nothing wrong with worst case time complexitiy

edit

(n2/2) - (n/2) IS O(n2)

Missing O()?

edit

Should it be O(n log n) instead n log n in comparison table? —Preceding unsigned comment added by Conan (talkcontribs) 00:06, 9 May 2010 (UTC)Reply

Quick sort error

edit

Comparison table states that its "memory" is log(n). But, as far as I know, quick sort is in-place sort. Even its own article said so: "...Coupled with the fact that quicksort is an in-place sort and uses no temporary memory...". —Preceding unsigned comment added by 85.222.169.174 (talk) 13:44, 4 June 2010 (UTC)Reply

As it says later in the quicksort article, memory is required for the stack frames which are implicit in the recursion. "No temporary memory" is just wrong. 67.162.90.113 (talk) 19:59, 2 November 2010 (UTC)Reply

Randomized Quiz Sort Algorithm

edit

In the worst case, quicksort sorts a group of items in O(N2), where N is the number of items. If randomization is incorporated into the algorithm, however, the chances of the worst case actually occurring become diminishingly small, and on average, quicksort has a runtime of O(N*Log(N)). Other algorithms guarantee a runtime of O(N*Log(N)), even in the worst case, but they are slower in the average case. Even though both algorithms have a runtime proportional to N*Log(N), quicksort has a smaller constant factor - that is it requires C*N*Log(N) operations, while other algorithms require more like 2*C*N*Log(N) operations —Preceding unsigned comment added by Tarun telang (talkcontribs) 10:31, 11 June 2010 (UTC)Reply

"Comparison of algorithms" table ordering

edit

I find it rather ironic that this table does itself not appear to be sorted! The "exchanging" methods are kept together and are ordered alphabetically, but other than that the order seems to be pretty random. I am in favour of sorting this table according to Method and then name, but other options such as by average case complexity and then name also seem fine to me. It may be like a bit of a plug that Timsort has been placed at the top of this table. There seems to be less reason for it to appear in the table compared to some algorithms that are currently omitted.

Secondly, I'd also like to see a best case column added to the table. Then it would make sense to add other algorithms such as Binary Insertion Sort whose best case differs from normal Insertion Sort, but the average and worst cases are the same. It would also make sense to hilight the fact that bubble sort (amoung others) can be implemeneted with an O(n) best case. Several other algorithms such as Bitonic Merge Sort, Bingo Sort, Odd-Even Transposition Sort, Strand Sort, Squeeze Sort, and Order Sort to name a few, could also be added.

Malcolm —Preceding unsigned comment added by 118.92.109.220 (talk) 03:01, 13 June 2010 (UTC)Reply

13 years later. I just sorted the table in a somewhat heuristic order of best to worst for general use. General use in this case means that you are generally concerned with speed or memory used when you have a large number of elements to sort. For small number of elements, unless you do it repetitively and a lot, nobody is really concerned with speed. So they are now ordered, first by best average, then best worst case, then best best case, then memory used, then stability. If your concern is code size or memory used, you can still reorder the list as you wish. Hope this helps. Now ordering the list in this way reveals that there may be some issues. If you compare Comb sort with Library sort for instance, they have identical best and worst limits, but very different averages. Is this really correct? Dhrm77 (talk) 19:05, 22 October 2023 (UTC)Reply

Issues with attached file: SortingAlgoComp.png

edit

The linked file "File:SortingAlgoComp.png" has a very poor quality of information. There are some spelling mistakes "Tourament" (missing 'n') and "Interger" (extra 'r') The description states: "This chart shows the complexity of different algorithms when sorting one million integers (from 1 to 1000).". However it is not clear where these numbers come from. Are they comparison counts? Are they swap (or assignment) counts? A combination of both of these measures? What was the initial ordering of the data? Over how many runs were these numbers averaged? How is it that so many of the algorithms all have exactly 1.99316x10^7 as the count? What does "Sorting networks" mean in this picture, and how is the result for it able to not be a whole number? How was bogosort measured given the ridiculously high number shown?

I believe that in its current form, this image adds no value to the article and should be removed. If the information is to be explained and kept then it should converted to a textual form instead.

Malcolm 118.92.109.220 (talk) 04:45, 13 June 2010 (UTC)Reply

I've just removed it. It is incoherent, and it is OR. But mostly it is incoherent. Apparently nobody cares for the quality of this article, that this was allowed to linger here for a year and a half at least, and God knows for how long really. Appalling. WillNess (talk) 17:39, 25 December 2011 (UTC)Reply

Fixing a lede sentence

edit

This sentence makes no sense to me: "Efficient sorting is important to optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to work correctly." I can't fix it because I don't know what it's trying to say. So I'm inviting others to work on this. If no fix is forthcoming, I will likely delete this sentence.

By the way, it might be helpful in the summary of each algorithm to describe its performance on various types of datasets (e.g., random, nearly sorted, reversed, few unique elements). MichaelBluejay (talk) 12:56, 22 June 2010 (UTC)Reply

--- How about " ...) which require input data to be in sorted lists."

Your suggestion about how sorts handle pathological cases is a good one, but I suspect it can quickly degrade into confusion. For instance, performance of quicksort on pre-structured data will depend a lot on how the pivot is chosen, and this decision is up to the implementer, and not really part of the quicksort definition. Many implementations have 'tweaks' to protect against pathological cases.

184.147.234.114 (talk) 02:27, 10 February 2013 (UTC)Reply

Are sorting networks truly stable sorts?

edit
 
A simple sorting network consisting of four wires and five connectors

Consider this sorting network as included in the Sorting network article. Let's put [2, 1a, 1b, 0] on the wires with 'a' and 'b' as markers for the initial order without them belonging to the sorting key. If this sorting network is stable, we can expect it to return [0, 1a, 1b, 2]. The way I see it, it just doesn't:

2, 1a, 1b, 0 is the initial configuration.
1b, 1a, 2, 0 results from the first comparison. 2 > 1, thus we swap.
1b, 0, 2, 1a results from the second comparison. 1 > 0, thus we swap.
0, 1b, 1a, 2 results from the next two comparisons. We swap both since 1 > 0 and 2 > 1.
0, 1b, 1a, 2 is the result. No swap in the last comparison since 1 = 1.

However, [0, 1b, 1a, 2] does not equal [0, 1a, 1b, 2], which was our expected result. From this counterexample follows that sorting networks are not stable sorts in the general case.

So… am I missing anything here? —Preceding unsigned comment added by 129.13.72.197 (talk) 11:23, 23 June 2010 (UTC)Reply

Agreed - there may be a subclass of networks which are stable, but I've not found any discussion of this; I've looked at various example diagrams and have found it easy to find counter-examples in each. Stability would almost certainly come at a cost in extra levels (and more expensive than just 'tagging' the values to force stability). Table entry "Yes" is either plain wrong, or misleading. 184.147.234.114 (talk) 02:05, 10 February 2013 (UTC)Reply

Quantum sort

edit

Quantum sorting is not faster than normal sorting (http://arxiv.org/abs/quant-ph/0102078) but might offer better time-space tradeoffs. Thus, Quantum sort does not need to have its own article. I propose merging the brief content from quantum sort into the sorting algorithm article. Comments? Please also comment on the quantum sort talk page so that we can delete that article and incorporate it into this one. --DFRussia (talk) 20:56, 29 June 2010 (UTC)Reply

Bubble sort

edit

The section Bubble sort claims:

This algorithm is highly inefficient, and is rarely used[citation needed], except as a simplistic example.

Looks like false to me. Because of the code brevity of the algorithm, it instead should be heavily used on pretty short lists in naive Q&D implementations where a sort is needed. Also it is often combined with quicksort and quickersort for when the to-be-sorted sublists are under a certain length, such as 3 or 4 elements. This use to improve the sorting speed. So the statement is dubious. Rursus dixit. (mbork3!) 19:57, 6 July 2010 (UTC)Reply

I don't think that's true. If quick & dirty is required, one can implement Quicksort in a single line as in [1]. For sorting of short sub-lists, I have observed variants on Insertion sort much more often than Bubble sort. For the case of finite lengths on those sub-lists, optimal compare-and-swap sequences are known for lists with a maximum length of at least 47 items, probably more. I think there's even a sequence on Sloane's for that, can't seem to find it right now. —Preceding unsigned comment added by 129.13.72.197 (talk) 16:17, 10 July 2010 (UTC)Reply

Joshua Kehn blog

edit

Is this sorting demo page an external link that should or should not be included in the article. Baltar, Gaius (talk) 05:06, 6 January 2011 (UTC)Reply

Discussion on this sorting algorithm page and its inclusion in the external links section.

I have undone the addition of this link because, IMO, it is blatant self promotion. Furthermore, I and at least one other editor report problems with the animations there crashing or hanging for substantial periods of time. Reyk YO! 06:48, 10 October 2010 (UTC)Reply

Seeing as this section was already there

If you want to include the links, bring the subject up on the article's talk page. Supply links to the existing discussion. State why you think the links are appropriate for WP. I'd like to see your argument why WP should tolerate links that hang a user's browser or denigrate the use of IE.

I believe these links are appropriate for WP because currently there is not enough diversity in the existing examples. You have one that constrains a viewer to a having the JRE installed, and another is non-interactive through the use of animated GIFs. From this StackOverflow question I found this sorting demo page.

The sorting demo is built on open standards, using HTML5's canvas (with support for IE) and JavaScript to display an interactive animated sort. This demo is compatible with almost all browsers and operating systems, including support for mobile devices such as Apple's iPhone, iPod, iPad and other mobile devices (namely Android based). Open standards allow for community improvement and investigation, rather then simply seeing a few bars move you can download the source and research it yourself. While I will concede that this is less of an issue in sort algorithms that have publicly implementations, it is a good path to follow and more conducive to student learning purposes.

In regards to the hanging a user's browser I find no issue with it. If you do then I'm sure contacting the author with more specific details will yield a fix. As far as denigrating the use of Internet Explorer, the most offensive statement I could find is as follows:

: IE users will notice a overall sluggishness due to IE not implementing native <canvas> support. You have my sympathies. 

Internet Explorer does not have native canvas support, and the author supports this browser through the use of a excanvas plugin that will reduce the speed at which the animation can be displayed.

Baltar, Gaius (talk) 20:17, 2 December 2010 (UTC)Reply

  • Oppose. The site is anti-IE - a stance that WP should avoid. Suggesting that people switch browsers is not reasonable. The requested EL has crashed other browsers. Author states bubble sort will cause browsers to go to lunch for a while. That other demos have problems/requirements does not justify including this one; two wrongs don't make a right. I have a HUGE ISSUE with the notion that a WP user follows a link and his browser locks up. We should not direct people to bad implementations. Despite JK's good intentions, we are not a source of β-testers. Glrx (talk) 21:14, 2 December 2010 (UTC)Reply

Suggesting that a user support standards compliant browsers is reasonable, and the fact that extra effort seems to be made to support Internet Explorer should be noted. The Author states Some sorts (bubble) are particularly intensive and not recommended for older / slower computers., not that will cause browsers to go to lunch for a while. There is a momentary stall considering the large amount of pre-computation required to smoothly render the bubble sort. Considering the source (JavaScript) one can assume that the browsers with faster JavaScript engines will perform better, this is not bias but rather pertinent information.

The argument that other demos have gaps means this one should not isn't valid. This demo is filling gaps left by others, and as such plays a vital role in offering the most options to users. Suggesting that this is beta software is incorrect. Baltar, Gaius (talk) 14:56, 6 December 2010 (UTC)Reply

Continuing. If there are not more opposed I would like inclusion of this link. Baltar, Gaius (talk) 14:56, 6 December 2010 (UTC)Reply

It does not work that way. You want to add some challenged material, so you must garner a WP:Consensus to make that addition. You don't have a consensus. Furthermore, others have reverted the links. Glrx (talk) 16:27, 6 December 2010 (UTC)Reply
Can you explain what issues you still have? I believe the anti-IE and crashing browsers have both been explained unless you have other issues you would like to bring up. Baltar, Gaius (talk) 19:35, 6 December 2010 (UTC)Reply
I would appreciate at least a semblance of discussion. If you could explain what other issues you have I would like a chance to debate them. One other person reverted the link, I don't see that as a consensus. There are two people in a discussion, how do we involve more? — Preceding unsigned comment added by Baltar, Gaius (talkcontribs) 14:20, 13 December 2010 (UTC)Reply
Support inclusion - In my opinion, it is a highly helpful link which will help users visualize the process of sorting algorithms. I just ran it on IE, and it worked just fine (albeit slower than on FireFox). Reaper Eternal (talk) 02:55, 21 December 2010 (UTC)Reply
I also had to remove the 3O template since three editors: Baltar, Gaius (talk · contribs), Glrx (talk · contribs), and Reyk (talk · contribs) are already involved. 3O requires that only two editors be involved in the dispute. I have, however, offered my opinion on the matter above. Reaper Eternal (talk) 03:00, 21 December 2010 (UTC)Reply
My mistake, I forgot {{Reyk}} was included. Baltar, Gaius (talk) 05:34, 21 December 2010 (UTC)Reply
  • Weak oppose- I agree with Glrx that we should not be directing people to implementations of sorting algorithms that can hang the browser. I am no longer so concerned about denigrating IE; the warning against it is actually rather mild. Reyk YO! 22:28, 22 December 2010 (UTC)Reply
What sorts have you found that hang the browser? Baltar, Gaius (talk) 02:33, 23 December 2010 (UTC)Reply
Bubble sorts at full size cause Firefox to hang for a while. Ztothefifth (talk) 02:47, 23 December 2010 (UTC)Reply
Basically this. I have sometimes seen the bubble sort at full size grind to a halt and not recover at all, forcing me to shut it down in task manager, though it usually does right itself eventually. I also get random five to ten second hangs occasionally with the other algorithms. Not game to even try it with IE. Reyk YO! 02:59, 23 December 2010 (UTC)Reply
To be fair there is a warning before doing a full sized bubble sort. The default sort set is significantly smaller then a full sized set. Baltar, Gaius (talk) 14:31, 23 December 2010 (UTC)Reply

What other links to animations are there? All I see is http://www.sorting-algorithms.com/. Ztothefifth (talk) 18:50, 23 December 2010 (UTC)Reply

Another possible link is http://sortvis.org/. Ztothefifth (talk) 21:06, 23 December 2010 (UTC)Reply

As far as I know there are not many sorting demo's out there, let along interactive ones such as this, which is why I'm pushing strongly for inclusion. Most of the demos are simple. Either pictures or animated gifs. The few more complex ones are applets designed to only run one sort with a specific set – essentially no better then an animated gif. This demo is highly interactive and adjustable. Baltar, Gaius (talk) 01:10, 24 December 2010 (UTC)Reply

Attempting Consensus Can we concede that this page offers a interactive sort visualization that is not tied to any platform or operating system requirements, promoting open standards, and has minor stability problems on few platforms? If we can, I believe that inclusion is supported. Baltar, Gaius (talk) 15:18, 4 January 2011 (UTC)Reply

No, there is not a consensus for inclusion. Glrx (talk) 17:37, 4 January 2011 (UTC)Reply
What are your requirements for consensus? I would like to work with you to agree on a point where this link can be included. I will contact the author and see if there is anything he is willing to change or alter that would better encourage a consensus. Does anyone else have points (for or against) that they would like to bring up? Baltar, Gaius (talk) 18:29, 5 January 2011 (UTC)Reply
You could also open a request for comment on this article to get more opinions. Reaper Eternal (talk) 19:12, 5 January 2011 (UTC)Reply
Thank you, I have done so. Baltar, Gaius (talk) 03:35, 6 January 2011 (UTC)Reply

Strong Support These arguments are ridiculous; if you run IE, you should not expect anything to work IMO. But jokes aside, this is a great visualization of sorting algorithms that can help the reader understand the differences. In many cases, and especially so with our CS articles, the articles are too technical (for the average reader), involving runtime complexity, which data structures the algorithms are best implemented in, etc. As a Computer Scientist, I understand these are invaluable to the article. However, they are not "newbie friendly" at all. This provides a great top-level introductory view of sorting algorithms. Also, W/R/T to the larger demos taking longer to pre-process, I think that is a fine and normal limitation. To be honest, I'm surprised that such a trivial "inconvenience" perhaps really outweighs the benefit of deeper understanding of algorithms. Sheesh. jheiv talk contribs

Update: I ran this on IE and didn't experience any noticeable slowness. I have a pretty decent setup though, so that may explain the difference. My guess is that the warning is just there as a courtesy for users who have slowers computers and run IE. Certainly this is not a bad thing... jheiv talk contribs 01:40, 7 January 2011 (UTC)Reply

Glrx I would like to come to a consensus with you before adding the link back in. I would ask that you list any remaining objections to inclusion. This has been dragged on long enough. Baltar, Gaius (talk) 18:43, 7 January 2011 (UTC)Reply


You don't come to a WP:Consensus with me; that is something that happens with a group of editors. I clearly oppose the addition. I have stated reasons to support that position. Although I have some other objections, I don't believe they are worth going into right now.
Normally, the primary disputants don't declare a consensus; they let others do that to avoid the appearance of bias. This discussion started a long time ago, nobody declared a consensus, and the matter sort of died on its own. You keep trying to revive it with the hope of a different result. That's OK, but the landscape hasn't changed. If I read between the lines, User:Reaper Eternal believes there was not a consensus for inclusion because he suggested a RFC.
I'm opposed; it hung my IE browser. There have been some improvements, but I remain opposed. My take is that WP should not link to sites where the user can get into trouble. I don't expect the average WP reader to be sophisticated about using websites or avoiding trouble. Certainly, somebody learning about computing complexity for the first time may not appreciate the implications of sorting parameters. User:Reyk has voiced opposition and noted the simulations also hung Firefox. Reyk has moderated that position somewhat, but Reyk reports "random five to ten second hangs" with other simulations. Reyk is understandably reluctant to attempt viewing the simulations in IE. User:Ztothefifth hasn't declared either way, but Ztothefifth has confirmed the bubble sort causes "Firefox to hang for a while". (Ztothefifth's query about other simulations out there suggests he is reluctant to include the link.)
You are coming across as strong advocate for Joshua Kehn. That position troubles me. Instead of addressing the stability issues, you want to overlook them as "minor stability problems on a few platforms". Your statement that the JK link "promot[es] open standards" is irrelevant to this discussion; it suggests you support an agenda to get browser suppliers to support certain open standards. I don't think that is a goal of WP, and WP should not directing its users to sites with the hope they will complain about the limitations of their current browser or switch to a different browser. User:Reaper Eternal has voiced support for the link and did not experience trouble. I wish RE had said more, but that's fine. After the RFC, User:jheiv came in with strong support. Jheiv comes across as dismissive of anything that hangs IE (even with all joking aside), and that would prompt me to give his support little weight. He characterizes the simulations as not "newbie friendly". His second comment comes across as more unbiased.
As it stands, I don't see a consensus. Since the RFC, there's only been one addition for inclusion, and I don't see that addition creating consensus. At this point, with this set of editors, if both Reyk and Ztothefifth were to support inclusion, then I'd be overruled. Even if just one of two supported inclusion, you might have a consensus. If the link were included, I'd put a black box warning warning on it.
Glrx (talk) 21:02, 7 January 2011 (UTC)Reply
PS. See also WP:ELNO numbers 8, 11, and 16. Glrx (talk) 21:20, 7 January 2011 (UTC)Reply
I will address your concerns later this evening. Baltar, Gaius (talk) 21:46, 7 January 2011 (UTC)Reply
First off, WP:ELNO 8 doesn't apply because <canvas> isn't a plugin. 11 doesn't apply because it's a dedicated page, not a personal site or blog. 16 doesn't apply because it is continuously functional, and will remain functional.
Your stated reasons (anti-IE / unstable) have been addressed. The page is not anti-IE. The page is stable. There are conditions that on older machines / inadequate browsers can produce unforeseen problems – but no more so then a Flash plugin crashing or a Applet refusing to load.
My take is that WP should not link to sites where the user can get into trouble. The users aren't getting into trouble. If you believe this is the case, why not remove every link to a Flash enabled or Applet page from WP because there are cases (especially with older machines) where these plugins will cripple a browser.
Certainly, somebody learning about computing complexity for the first time may not appreciate the implications of sorting parameters I agree, and it appears so does the author. The page is set to a default level which produces no reported crashes.
You are coming across as strong advocate for Joshua Kehn. That position troubles me Please enlighten me as to why you are troubled with my position. My position is something worthwhile and decent being so violently opposed by someone for ungrounded reasons.
Your statement that the JK link "promot[es] open standards" is irrelevant to this discussion; it suggests you support an agenda to get browser suppliers to support certain open standards. This may be your take on it but it is not my intent. My intent is to offer an alternative to the current which is widely accepted and in use. Why should there only be an Applet and a set of animated gif's?
I don't think that is a goal of WP, and WP should not directing its users to sites with the hope they will complain about the limitations of their current browser or switch to a different browser Where (anywhere, please) is this implied (by me or JK)? It surely isn't said, so it must be implied.
He characterizes the simulations as not "newbie friendly". He does not. He says In many cases, and especially so with our CS articles, the articles are too technical (for the average reader), involving runtime complexity, which data structures the algorithms are best implemented in, etc.
There is no need for a black box warning given that there is ample warning of any slight instability on the page itself.
I will contact Reyk and Ztothefifth via their talk pages and ask for for them to revisit this. Baltar, Gaius (talk) 02:53, 8 January 2011 (UTC)Reply
  • I remain opposed to the addition of this link. I've tested the page on Firefox and it frequently hangs for five to ten seconds regardless of the type of sort or the number of items being sorted. Once or twice it has even caused Firefox to stall completely, forcing me to shut it down in Task Manager. I do not regard this as a "minor instability" but a fairly serious one and I think it would be irresponsible to link to it. Finally I think you're treating the consensus building process as "keep talking and talking until everyone does what I tell them", and that's not how it works. Reyk YO! 05:23, 9 January 2011 (UTC)Reply
Reyk, can you clarify which options you are using to create these hangs? I can't duplicate this behavior at all, even on my slowest machine. — Preceding unsigned comment added by Baltar, Gaius (talkcontribs) 13:05, 9 January 2011 (UTC)Reply

Oppose crashed or stalled IE8. This is supposed to be an encyclopedia not a soapbox for some pinhead to tell me what software to run on my computer. If you think it is such a brilliant demo turn it into a gif and have done with it.Greglocock (talk) 04:27, 11 January 2011 (UTC)Reply

WP:NPA? Reaper Eternal (talk) 23:11, 11 January 2011 (UTC)Reply
It'd be NPA if the editor promoting the website link, and the website author, were the same person. I referred to the website author as a pinhead. Obviously they cannot be the same person as that would be COI, and we know that the wiki-editor is a man of integrity, not a pinhead, and so would not commit COI, or indeed, pinheadedness. Don't we? Greglocock (talk) 02:21, 12 January 2011 (UTC)Reply
Note that the author ("pinhead") is not telling you what software to run on your computer. In fact, he is telling you that there is very few requirements, namely a browser made in the past five years. The demo running applets is telling you what to run on your computer.
Turning it into a GIF would completely destroy the uniqueness of the demo.
I don't particularly like what you are suggesting, but feel correcting you would be a waste of time. I'm thoroughly disheartened with the Wikipedia editing system now. Such violent opposition to anything new. The slightest (and do I mean slightest, I have only made it crash a handful of times) instability should be grounds for complete removal? Fine. I remain supportive of inclusion if this issue comes up again. Baltar, Gaius (talk) 05:14, 13 January 2011 (UTC)Reply

Bubble sort stability

edit

I think bubble sort is stable (in contrary to what that written in the table) —Preceding unsigned comment added by 79.177.127.22 (talk) 09:55, 2 November 2010 (UTC)Reply

I just reverted some changes that claimed BS stability depends on the implementation. BS should be stable because neighboring elements are not swapped if they are already in order (and two equal elements are in order). Consequently, equal elements will not get out of order. (They can get out of order when non-neighboors are swapped as in Shell short.) Sadly, my statements are WP:NOR, and I don't have a WP:RS citation for BS stability. Glrx (talk) 21:55, 7 November 2010 (UTC)Reply

Log Base

edit

When it says n*log(n) what is the base of the log function? Is it binary (2), decimal (10), natural (e), or some other base? —Preceding unsigned comment added by 69.91.165.139 (talk) 20:04, 15 November 2010 (UTC)Reply

In big O order notation, the base of the logarithm doesn't matter -- log2, log10, and ln are all related by just a constant. In particular algorithms, there will be a logical base to use: a binary sort would be base 2 and a radix 10 sort would be 10. But again, in terms of big O, the actual base doesn't matter; constant multipliers are discarded. Glrx (talk) 20:49, 15 November 2010 (UTC)Reply

Manual (Physical) Sorting Algorithms

edit

I came looking for assistance in manually sorting tens of thousands of random Election Ballots into a hundred geographic (Precinct) districts. Similarly the Postal Services machine-sort millions of pieces of mail into millions of buckets (zipcodes, routes and addresses). Carrier personnel manually direct-sort their mail into an array of "pigeon-holes" before delivery distribution.

Unfortunately all emphasis here is towards electronic Data-processing. I've got to give these lots of thought to determine applicability to physical-sorts, using physical motions, requiring real-time, and limited buckets within reach or operator constraints. Does EDP sort-theory apply, or have lessons for large physical sorts?

Are info or comments about best physical-sort algorithms appropriate here? What differences apply due to human or mechanical vs electronic/cyber media and speeds? HalFonts (talk) 07:20, 19 November 2010 (UTC)Reply

It is usual to measure comparison sorting algorithms based on the number of comparisons, and assume that the number of element moves (or copies) is proportional. As a physical sort problem, consider sorting a deck of playing cards, commonly by insertion sort. Unlike for EDP, a whole group of already sorted cards moves up or down as one operation. I did once, just to see that I could do it, sort a deck of cards with quicksort. Some EDP sort algorithms easily adapt as physical sort algorithms, and others don't. Reminds me, that radix sort was commonly used to sort decks of computer cards in the early EDP days. There were machines built to do this combination of electronic and physical sorting. Gah4 (talk) 13:44, 4 September 2019 (UTC)Reply

TimSort

edit

The best case for TimSort should be O(n). It is not just a hybrid mergesort but a hybrid natural mergesort. If the data contains runs that are ordered or reverse ordered or a mixture, then it is between O(n) and O(n * log n). See http://bugs.python.org/file4451/timsort.txt

The same can be said for SmoothSort. If the runs are ordered (but not reverse ordered), then it is O(n). The best cases on these should reflect this. Stephen Howe (talk) 22:39, 25 November 2010 (UTC)Reply

Shell sort

edit

What's happened to this section? Just two curly brackets at the moment. Sam Dutton (talk) 14:01, 26 November 2010 (UTC)Reply

How does it have two different averages? —Tamfang (talk) 07:43, 15 December 2010 (UTC)Reply

Because it is an open problem as to what the asymptotic behaviour of shell sort is. Different gap sequences have different asymptotic behaviour. I will see if better bounds can be found as I believe recent publications by computer theorists have managed to give tighter bounds for shell sort.

Stephen Howe (talk) 21:25, 20 December 2010 (UTC)Reply

In the table of comparisons, shouldn't Shell sort be categorized as "Exchanging and Insertion"? It's more of a hybrid of both than just insertion. — Preceding unsigned comment added by 67.183.10.109 (talk) 19:15, 27 August 2017 (UTC)Reply

Quicksort can be stable?

edit

I dont think this is true, it is only true for quicksort for linked lists. It is not how the pivot is handled, as the article describes, but the fact that all equal elements within the partition set must retain their original order while partitioning. To do so, requires a stable partition (function in C++ library). stable partition requires auxilary memory and cant be done in O(N) without the memory. Any comments?

Stephen Howe (talk) 21:46, 10 December 2010 (UTC)Reply

See Quicksort#Complex version. Glrx (talk) 05:28, 11 December 2010 (UTC)Reply

Thank you. I have seen. But there is nothing on this page that talks about stable partitioning for quicksort. In any case, I think I would have heard something of it. In the past 20 years, Bentley & McIlroy's groundbreaking "Engineering a Sort function" in 1993 introduced 3-way partitioning (sometimes known as "fat pivot") for Quicksort which has also been investigated by Sedgewick See "http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.14.8162&rep=rep1&type=pdf" "http://www.cs.princeton.edu/~rs/talks/QuicksortIsOptimal.pdf" which are downloadable PDF's. In terms of partitioning, 2-way partitioning is a maximum of N/2 swaps, for 3-way partitioning (Dutch National Flag problem), it can be as little as ((5 * N) / 9) swaps. But in both cases, neither are stable. It is not pivot element that matters, it is the fact that large blocks of multiple duplicates still retain there original order. I have yet to see stable partioning can be done in O(N) swaps with no auxilary memory. Therefore I motion this reverts to quicksort not being stable unless someone posts URLs to white papers that indicate that recent research indicate that Quicksort can be made stable. NOTE: Quicksort for linked lists can be stable, but I am talking about the array version which I believe this Wiki article addresses.

Stephen Howe (talk) 22:37, 12 December 2010 (UTC)Reply

Perfect Sort / Triangulation Sort

edit

Is a kind of un-algorithm. It simply triangulates from what data is so far sorted (discouraged / takes extra time). But implemented (see code) it shows shows some surprising benefits. Is sometimes referred to being "a sort in which data becomes available for streaming rapidly" (and formidable multi-field). Some books refer to such a method but I haven't seen it one of them code it or compare it's benefits. Interesting for just that. Enjoy. User:Sven nestle2/Perfect Sort / Triangulating Sort

You need to cite WP:RS for this algorithm. Who has evaluated this algorithm? Also watch out for WP:NOR. Glrx (talk) 21:40, 6 June 2011 (UTC)Reply

Excuse me Glrx did you have time to see my corrections yet?

      • I've completely rewritten please look ***

I am trying to provide a good topic. Please say if there is any reason to hold it back.

Wikipedia:Identifying reliable sources. "~ the reliability of a source depends on context. Each source must be carefully weighed to judge ~" I don't see other sorts showing publisher. Wikipedia:No original research. "~ that includes any analysis or synthesis of published material that serves to advance a position not advanced by the sources ~". I'm sure I didn't make any unfounded conclusions. Triangulation - very basic stuff.

But that's parsing hairs. The original post was bad and I'm not sure if you've looked at the new post.

User:Sven nestle2 has continually edited comments out of order, deleted other editors comments, and changed his comments after they received responses. The record for this thread is a mess. This thread starts with this edit.
Bottom line: Sven has not identified any reliable source for his perfect/triangulation sort.
Glrx (talk) 16:41, 12 June 2011 (UTC)Reply
Ah thanks again for the communicae. Mr Glrx I'm unsure what you mean "edited out of order" as my article was or is a work in progress as is yours. If your implying my wiki editing skills need improving I'm sure to admit it. I'm unsure of your use of "thread" as wiki does not resemble chat forum software. My user page should not be an article page is that it? I was going to spend that time fixing later does it matter?
<< But what about the content as it stands now? Any likes or dislikes there? << ?
Sven nestle2 (talk) 20:18, 12 June 2011 (UTC)Reply
What you put on your user page is (mostly) your own business.
Each section on a Talk page is functionally a conversation thread; the lack of inherent forum-style structure obliges us to carefully maintain conventions, among which perhaps the most important is not to damage old comments, because each participant needs to be able to see (and understand) what has gone before. Therefore: try to to place new remarks where they fit the flow, and do not remove or replace old remarks (unless it's yours and you see that you made a mistake before anyone responded). —Tamfang (talk) 18:47, 13 June 2011 (UTC)Reply
Sven writes, "I don't see other sorts showing publisher." Each of the sort methods listed in the article has its own article, which ought to list references; they are not necessarily repeated in this umbrella article. —Tamfang (talk) 19:06, 13 June 2011 (UTC)Reply
Sven, in your exuberance to rewrite your own remarks you removed mine:
What books mention this algo? Any webpages (other than your own)? —Tamfang (talk) 07:21, 11 June 2011 (UTC) restored—Tamfang (talk) 18:47, 13 June 2011 (UTC)Reply

I haven't tried to read the code at User:Sven nestle2/Perfect Sort / Triangulating Sort, but from the verbal description it seems that you're merely giving an exotic name to multikey comparison (if x.key1==y.key1 then compare x.key2 with y.key2, and so on) which can be incorporated into any sort algo. —Tamfang (talk) 18:53, 13 June 2011 (UTC)Reply

Mr. Tamfang thank you. As to removal of past edits yes I was trying to keep my talk section small but I see your point. I'm sorry if my article is not easily instructive I try to make it clear soon. I note you are not Glrx who I asked and have no response from.

As to your multi-key comment you are somewhat right: what you listed would work in one dimension (but yields no new index in one dimension). However it would not work in two and higher dimensions as some way of indexing must be found. It must specify AND track traversal some way. Many sorts do this with assumed partitions to specify and recursion to track. Recursion invisibly "stores" traversal indexing but with poor efficiency. It is amazing that "Triangulation Sort" keeps up near highest speeds and can beat other sorts at a few things isn't it? Recursion wastes but also looses order (you cannot output until all records are finished with such methods) (also: debugging an unusual orders can be frustrating). A sort cannot waste indexing time and still have it. Triangulations sort's best features are not "raw time" - as stated above and on it's page.

If however you believe sorting two and higher dimensions is not a problem and requires no indexes please say how :) I'd love to hear it. Also I do not see a reference for "multi key comparison" algorithm. Is there one other than "triangulation sort"?

— Preceding unsigned comment added by 68.227.219.57 (talk) 20:45, 14 June 2011 (UTC)Reply

Oh, I misunderstood the point of "dimensions". When is it appropriate to sort in two or more dimensions? —Tamfang (talk) 06:33, 15 June 2011 (UTC)Reply

Who said all data must be one dimensional? Do not believe this person. — Preceding unsigned comment added by 68.227.219.57 (talk) 16:08, 17 June 2011 (UTC)Reply

Do you always answer a question with a non sequitur? —Tamfang (talk) 02:26, 29 June 2011 (UTC)Reply

Library sort -- 2004 or 2006?

edit

The overview says library sort was first published in 2004, but the library sort page says 2006. Which is correct? John lindgren (talk) 13:49, 10 June 2011 (UTC)Reply

Spaghetti (Poll sort)

edit

I found the general algorithm for spaghetti sort, as well as linking articles. — Preceding unsigned comment added by Ahshabazz (talkcontribs) 00:11, 27 June 2011 (UTC)Reply

I just removed Spaghettisort from the list, because by the wikipedia article that describes it, it is not a real sorting algorithm. Spaghetti sort requiers that a "massively parallel CPU system" needs to be able to find the longest spaghetti in o(1) time, in order for this to work, but without describing that part of the sorting algorithm, the description is incomplete. There is no such thing as an o(1) sorting algorithm, so the precondition that makes "Spaghetti sort" able to exist as an algorithm, does not exist. In other words "Spaghetti sort algorithm" is not a Computer Science algorithm, but a method with spaghetti, that would require a sorting algorithm on a computer, which is not related to spaghetti. Dybdahl (talk) 06:25, 7 July 2011 (UTC)Reply

  • Remove spaghetti sort from the article. In the interim, I moved it to the impractical group requiring special hardware. I also changed the space metric to reflect reality of an analog comparison. Glrx (talk) 18:15, 7 July 2011 (UTC)Reply

I agree with removing it from this article, but note that the requirement is O(1), not o(1). Finding the largest strand in time O(1) is impractical, requiring custom purpose parallel hardware scaling with problem size; finding it in o(1) is impossible. CRGreathouse (t | c) 03:50, 7 October 2011 (UTC)Reply

  • Keep. I agree that the spaghetti sort, as described in the article on spaghetti sort, is not useful for real computer programs running on classic binary von Neumann computers. However, as far as I can tell, this "sorting algorithm" article is about all kinds of sorting algorithms, whether or not they are useful for computer programs on classic binary von Neumann computers, so I am inclined to keep at least a mention of spaghetti sort in this article. --DavidCary (talk) 21:16, 10 March 2013 (UTC)Reply

A question - spaghetti sort as currently listed claims O(n^2) memory, but its article claims O(n) memory, which seems far more intuitive to me as each value is only tracked once (on its own processor). Is there a reason for this difference? — Preceding unsigned comment added by 12.184.104.214 (talk) 04:57, 21 December 2012 (UTC)Reply

Sleep sort

edit

I find this sorting algorithm to be amusing:

http://rosettacode.org/wiki/Sorting_algorithms/Sleep_sort — Preceding unsigned comment added by 174.137.113.43 (talk) 00:50, 20 July 2011 (UTC)Reply

This should be included as a non-comparison sort, as there isn't anything there that compares to the slow comparison sort.. 'slow sort' at the moment.--31.185.238.2 (talk) 00:38, 12 March 2012 (UTC)Reply

add information about online sorting algorithms

edit

that... you could add in the table if an algorithm is online or not. http://en.wikipedia.org/wiki/Category:Online_sorts Mdtr (talk) 03:26, 3 August 2011 (UTC)Reply

Clarification of Notation

edit

I would appreciate a clarification, either on this page or the logarithms page, of what is meant by  . KlappCK (talk) 15:00, 30 August 2011 (UTC)Reply

The equation has been clarified to be n*(logn)^2, which matches what is stated on STL's webpage and in STL books (Duvavic1 (talk) 01:08, 14 September 2011 (UTC))Reply

Duplicates-Removing Sorts

edit

According to the definition given at the start of the article, duplicates-removing sorting is not sorting. But that kind of algorithms exist; some languages/run-time systems have built-in sort functions with the possibility to specify whether it is needed to remove the duplicates at the same time while sorting, or to preserve them.

One example: pigeonhole sort with removal of duplicates is essentially what the array-based sieve of Eratosthenes is doing while collecting ⁄ marking the composites, and list-based variant exists which performs duplicates-removing mergesort.

Where and how can this be addressed? Or is it already, somewhere? Any pointers, anyone? Thanks. WillNess (talk) 08:20, 22 September 2011 (UTC)Reply

Duplicate-removing sorts can be done in 2-steps: 1. The sort is done 2. An extra pass is made comparing an element with the next element. If the next element is identical on the key, then it should be overwritten by the next successive element. So for identical elements, the first element is preserved. Stephen Howe (talk) 21:31, 28 January 2012 (UTC)Reply

A New Sorting Algorithm: Over-sort

edit

A I just published a new sorting algorithm I worked on for the last 18+ months. I think Over-sort is past "worthy of notice", and into ground breaking territory, but I'm probably biased. — Preceding unsigned comment added by 84.229.4.103 (talk) 08:24, 14 October 2011 (UTC)Reply

standard template library stable_sort() is not always in place merge sort

edit

Standard Template Library includes a stable_sort() function, but does not define how it is implemented. In the case of Microsoft's Visual Studio, stable_sort() uses a temp vector (1/2 the size of the data to be sorted), and is not an in-place merge sort. It uses insertion sort to create chunks of 32 elements, then merges chunks until chunk size equals vector size. The sort algorithms section should not include a reference to Standard Template Library as an example of an in place merge sort. A specific implementation of STL's stable_sort() might use an in-place merge sort, but stable_sort() is not defined that way. Rcgldr (talk) 02:06, 9 December 2011 (UTC)Reply

The ISO C++ standard (1998 and now 2011) does give implementation constraints on how stable_sort should be implemented. It says "Complexity: It does at most N log2(N) (where N == last - first) comparisons; if enough extra memory is available, it is N log(N)." It works out that 3 STL algorithms are adaptive: inplace_merge(), stable_partition(), stable_sort(). They are allowed to acquire a temporary buffer and if the memory is available, then the minimum complexity is achieved. If the memory is not available but a smaller amount is available, then the algorithm degrades gracefully in the direction of the maximum complexity. In the case of Visual Studio, it uses a hybrid merge sort where the initial step is various insertion sorts before merging the results back and forth to the temporary buffer. For VS2003, VS2005, VS2008 the temporary buffer is the size of all the elements. For VS2010, the temporary buffer is 0.5 the size of all the elements as the well-known trick of merge-sorting 2 halves and then merging one half in the temporary buffer with the other remaining half is employed. This cuts down the overhead. Any algorithm that meets the complexity and stability requirements of ISO C++ can be used but most STLs use a hybrid adaptive mergesort. Stephen Howe (talk) 22:02, 28 January 2012 (UTC)Reply

VS2005 stable_sort allocates a temporary buffer 0.5 times the size of the data to be sorted, so does VS2010 and VS2015. I suspect all versions of VS stable_sort allocate a half size temporary buffer. stable_partition does allocate a full size temporary buffer. Rcgldr (talk) 08:23, 15 December 2015 (UTC)Reply

Implementation code improvements for all sort algorithms

edit

Before you start the sorting of the N given elements you should check if the given elements already have the desired sort order (check if (a[i-1] <= a[i]) for all i=(1, .. N-1)). This additional check operation has only a linear run time complexity O(N). That means if the given array is already sorted (this is always the best case) so the run time complexity is always O(N). This additional check operation can be executed at the begin of each sort algorithm. That means in particular that each sort algorithm in the best case always has a linear run time complexity O(N).

Aragorn321 (talk) 05:54, 19 July 2012 (UTC)Reply

Yellow boxes in the "best case" column.

edit

Currently, this column has only red (n^2 or worse) and green (better than n^2). Should nlog(n) be made yellow, for consistency's sake? 140.247.41.147 (talk) 13:48, 23 July 2012 (UTC)Reply

BULL

edit

This "page maker" is deleting or mis-representing code for faster sort algorithms. Never stating where they came from or sources. Giving confusing explinations for advanced algorthms and leaving out practical issues such as found in unix sort(1) (ie, stability, tape v. memory, caching). Deleting sumbitted old and new algorthims (ie, mine but others). But has publishes "timsort" which is a joke algorithm. Say he has a tester and all algorithms but has no evidence of having the same. And the sources cited have 0 to do with the origion books and creators of the algorithms.

Quickersort AND Quickestsort

edit

August 31, 2012

To whom it may concern (at Wikipedia),

Back in the late 1970's I read about two sorting algorithms that are not mentioned at all in this "Sorting algorithm" article. They are Quickersort and Quickestsort. At that time: Quickersort was the latest published improvement of Quicksort; and Quickestsort was the fastest sorting method possible theoretically and (its algorithm was) a trade secret of the U.S.A. Federal Government. If anyone knows more (history) about this, please add it to this "Sorting algorithm" article. Quicksort and Quickersort were written about in the book: "Sorting and Sort Systems" by Harold Lorin, ISBN 0-201-14453-0, Copyright © 1975 by Addison-Wesley Publishing Company, Inc., part of "The Systems Programming Series" [book #4 of 4]. I don't remember where I read about Quickestsort. I thought it was in that book, but it is not listed in the index.

Sincerely yours,

JPD 22:15, 31 August 2012 (UTC)

Sort in random

edit
$ echo "Wikibooks Wikipedia Wiktionary" | sort --random-sort
Wikipedia
Wikibooks
Wiktionary

Unix Sort's algorithm

edit

The sort in GNU/free Linux uses merge sorting which is explained in the Sorting algorithm wiki page.

new development of Sort

edit

Sort uses merge sorting and is speedy to complete 1 column sorting (in a table of rows and colums of words to be selected and sorted).

There is a new sort(1), headsort(1), using an algorthim with opposite speed properties of merge sort that when sorting more than one column can finish in 1/2 the time, or when only needing to begin streaming the sorted data can finish in 1/2 the time (thus head), and can fork sorting jobs better (though neither does by default): but more often looses to sort's merge when sorting 1 column: it has opposite benefits of merge. Streaming and columns headsort(1) can being streaming in less than 1/2 the time. It's used when use columns and piping data are more often than not used.

If you are interested in headsort(1) "1/2 the time" contact johnandsara2@cox.net or sven_nestle2 on wiki with subject "sort algorithm".

It's good to note that for top speed no algorithm is needed just memory: however quite impractically allot is need for any appreciable task especial where used for random tasks; such a thing is useable only within a single program.

ok, let me try describe Euclid sort

edit

Euclid sort sorts in O(N*log(N)) time n needs O(1) mem first we got two groups of sorted items to b merged lets try to describe backward possible variant: first we c what is the last item (in reverse order) from the second group that remains at its place second we c where is moving up the first contestant , in reverse order, by following in reverse order, the elements of first group third we c how many elements from the second group will come compactly with the contestant that i mentioned fourth we apply some Euclid arange function to put the elements in the rite place when exit the loop we probabily will have at least an Euclid arange (as much, but no more) to get the array sorting done

idd wish credits if the case Florin Matei, Romania 93.118.212.93 (talk) 07:36, 22 February 2013 (UTC)Reply

Please read WP:OR; if this were published, it could be added (provided we can figure out what it means). — Arthur Rubin (talk) 16:55, 25 February 2013 (UTC)Reply

ok no problem, suit urself abt that policy of urs i dont think its wrong or anything i was just trying make a decent buck, noone i know likes to publish my ideas, like i said im a poor encoder also thow i got a few ideas, u might b ceen my ip page thank u... n im not playing the smartes nor the poorest here: i just feel like found a way to speak abt my creativity. Florin 93.118.212.93 (talk) 15:36, 27 February 2013 (UTC)Reply

an evn more simple sorting algo that challenges O(N*log(n)) time n o(1) mem

edit

its all abt merging two sorted groups: main repeated action is to move same number of elements from group first n group second (group interchange same no of elements) so we'll have all the elements of groups (sorted in the future) but not in place, then well repeat starting with the group of less length 4 not charging that much the stack... as a strategy we can alternate between remained smaller group n the bigger 1:1 alternate that assure O(log(n)) stack with some strategy against hazard. when memorize parameters of a group in the stack , memorize also the "breaking point" also

 93.118.212.93 (talk) 08:28, 4 March 2013 (UTC)Reply

Is there a name for this sort algorithm?

edit

This sort only works on a sequential set of numbers in some random permutation, using swaps to sort the set of numbers. The worst case number of swaps is n-1. Example code for when the set of number is {0, 1, ..., n-1} stored in an array a[]:

void sort(void)
{
int i;
    for(i = 0; i < n; i++)
        while(   a[i] != a[a[i]])
            swap(a[i],   a[a[i]]);
}

Several

Rcgldr (talk) 09:18, 27 April 2013 (UTC)Reply

Infinite loop? (Made a mental off-by-one error here...) —Ruud 10:38, 28 April 2013 (UTC)Reply
Given the constraint that the array of size n contains a permuted set of numbers {0, 1, ... , n-1}, the worst case number of swaps is n-1. I've since figured out that this algorithm can be expanded to sort an array based on a set of sorted indexes:
//      A is an array of size n
    A[] = ... ;

//      generate array of indexes I
    for(i = 0; i < n; i++)
        I[i] = i;

//      sort I based on values in A
    sort(I, A);

//      sort A using the sorted list of indexes in I
//      while also sorting I
    for(i = 0; i < n; i++){
        while(     I[i] !=  I[I[i]] ){
            swap(A[I[i]], A[I[I[i]]]);
            swap(  I[i],    I[I[i]] );
        }
    }

Rcgldr (talk) 01:04, 29 April 2013 (UTC)Reply

Cycle sort would seem like an appropriate name, but it has already been taken, although it does seems to be based on the same underlying principle: "It is based on the idea that the permutation to be sorted can be factored into cycles, which can individually be rotated to give a sorted result." I wouldn't be surprised if this sort doesn't have a name, as it seems to be mostly of theoretical interest: with the given preconditions it shouldn't be too hard to keep the list sorted in memory in the first place. The extension of the sort to work on key-value pairs can be applied to any sorting algorithm. —Ruud 17:10, 29 April 2013 (UTC)Reply
It's a special case of cycle sort: special_cycle_sort in B. K. Haddon, "Cycle-Sort: A Linear Sorting Method", The Computer Journal, 1990. —Ruud 17:16, 29 April 2013 (UTC)Reply
Since it swaps instead of rotating, maybe it should be called swap sort? I forgot to mention that each swap places at least one element in it's "correct" location. Rcgldr (talk) 07:55, 30 April 2013 (UTC)Reply
Most in-place sorting algorithms, including bubble sort and quicksort, work by swapping elements. This specific algorithm, though, goes through each cycle in the permutation one-at-a-time. Thus the number of iterations of the for-loop in which you perform at least one iteration of the nested while-loop is always exactly equal to the number of cycles in the permutation. —Ruud 11:54, 30 April 2013 (UTC)Reply
  • One interesting aspect of this small loop is that if you consider sorting I by the values of A as a permutation, then this loop "unpermutes" I, while at the same time it "permutes" A. This occurs because the values in the locations swapped in I are used as indexes for the locations swapped in A. Rcgldr (talk) 23:04, 29 April 2013 (UTC)Reply
So the main enhancement of this algorithm is the part that sorts the array A[] while performing the special cycle sort on I[]. I searched again but wasn't able to to fing any references citing that special cycle sort could be used for this purpose. Rcgldr (talk) 04:01, 1 May 2013 (UTC)Reply

Adding explanation for in-place sort

edit

There should be a section explaining what an in place sort is just like there is a section for explaining stability of sorting — Preceding unsigned comment added by Nileshrathi01 (talkcontribs) 14:07, 5 October 2013 (UTC)Reply

0-1 principle

edit

From my university studies I recall there being something called the 0-1 principle. The statement is something like this: "If an oblivious comparison-based sorting algorithm correctly sorts data consisting of only 0s and 1s, then it correctly sorts any data."

I forget the exact wording, but the idea of it is that, if an algorithm meeting certain criteria works as a sorting algorithm for arrays/lists containing only two distinct values, then it works as a sorting algorithm for lists whose elements are taken from an arbitrarily large totally-ordered set.

Does anybody know anything about this? The search hits I've found are explicitly about sorting networks, but what I was taught referred to sorting algorithms more generally. Obviously not all 0-1 sorting algorithms are general sorting algorithms - counting sort with only two counters is an obvious counter-example (pardon the pun). But what are the "certain criteria"? Something that can be said about most sorting algorithms is

  • the only way in which the elements are examined is to use a "less than" or "greater than" operator, which has a boolean result
  • elements are moved from place to place, rather than making many copies of an element and using these as the output

But the aforementioned counting sort can be adapted to meet these criteria, so they are clearly not sufficient.

It may be the case that it is only a theorem for sorting networks, and my lecturer just intended us to use it as a heuristic test to see if a sorting algorithm seems correct. Still, can anyone provide any further insight into this, such that we might be able to add something to the article about it? — Smjg (talk) 17:42, 30 November 2013 (UTC)Reply

@Smjg: A little late for this, but if you do manage to find a reliable source explaining this, I'm sure it could be integrated into the article. – FenixFeather (talk)(Contribs) 04:36, 11 April 2014 (UTC)Reply
That's kinda obvious.... — Smjg (talk) 14:52, 12 April 2014 (UTC)Reply

For what it's worth, I've just tried searching again and found this: [2]

"The most interesting proof uses the so-called 0-1 Sorting Lemma. It allows us to restrict our attention to an input of 0’s and 1’s only, and works for any “oblivious comparison-exchange” algorithm. (Oblivious means: Whether you exchange two values must only depend on the relative order of the two values, and not on anything else.)"
"Whenever the algorithm compares a pair of 1’s or 0’s, it is not important whether it exchanges the values or not, so we may simply assume that it does the same as on the input x."

What is important, however, is whether the algorithm has any conditional behaviour, other than just the swapping of the elements, based on the result of the comparison.

Maybe "oblivious comparison-exchange" was the phrasing my lecturer used. I guess that this phrase means an operation of the form "if A > B, swap A and B", and the essence is that the data is in a black box and the algorithm has access to it only by this operation (which returns no value, hence "oblivious") plus knowledge of the length of the array.

This means that, for a given algorithm and input size, the sequence of oblivious comparison-exchanges is a constant. So what we really have is an algorithm for building a sorting network and then applying it. That said, on a given pair A and B, it can equally be either "if A > B, swap A and B" or "if B > A, swap B and A". So if you don't consider sorting networks to be allowed to have reverse comparators, then we have a generalisation of the sorting network 0-1 principle to a wider class of algorithms. Otherwise, it doesn't really generalise the principle at all.

Maybe what we can still say is that the 0-1 principle can be used to prove the correctness of some sorting algorithms. I'll see what I can come up with.... — Smjg (talk) 14:52, 12 April 2014 (UTC)Reply

Sorting networks section still needs work

edit

"Sorting networks" aren't a singular algorithm that can be described as a single best/worst/average case for time and memory, but they're listed as such in the chart. They're also listed in a special category for sorts that are incredibly impractical or require special hardware. That may be true for the AKS sorting network it's apparently referring to, but small and efficient sorting networks are actually a lot faster than insertion sort, even though insertion sort is credited as being the fastest. In my experience they're anywhere from 2 to 5 times faster than insertion sort.

What should be done about this? My recommendation is to remove sorting networks from the chart entirely and only refer to them as a concept that can be applied to many sorting algorithms (AKS, even-odd, bitonic, bubble and insertion, etc.), either to construct hardware or to make an algorithm for fixed-size data sets. — Preceding unsigned comment added by BonzaiThePenguin (talkcontribs) 03:40, 11 April 2014 (UTC)Reply

It doesn't seem like there is a reference supporting its position in the table, so if you have a good source that describes sorting networks as a strategy that can be applied to many different sorting algorithms, I think you could easily make that change. – FenixFeather (talk)(Contribs) 04:34, 11 April 2014 (UTC)Reply

Explanation of Comparison Sort methods

edit

I'd like to see an explanation of the terms used in the Method column of the Comparison Sorts table in the Comparison of Algorithms section: Partitioning, Merging, Selection, Insertion, Exchanging, Distribution, Random Shuffling. These don't seem to be explained until later in the article, if at all. I think a short paragraph or illustration describing these methods is warranted.2620:36:8000:10:95B8:2D6C:17C9:12AB (talk) 16:59, 17 September 2014 (UTC)Reply

Quicksort variants

edit

If "Quicksort for linked lists" has different characteristics than "Quicksort for arrays" -- one is stable, and the other is not -- then they are different enough for this article to list them as two separate algorithms. Stephen Howe suggested they were different in #Quicksort can be stable?.

Is there a more reliable source that mentions "Quicksort for linked lists" than "Stable Quicksort With Linked List" or "Optimal Quicksort for Single Linked List" ? — Preceding unsigned comment added by DavidCary (talkcontribs) 22:36, 21 August 2015

They are the same algorithm. You can view the linked list as supplying an extra O(n) space for making a stable partition; the array version doesn't have that extra space, so it cannot do a stable partition. Glrx (talk) 21:00, 26 August 2015 (UTC)Reply

Is there a more reliable source that mentions "Quicksort for linked lists" than Yes. [1] I believe it covers "Quicksort for linked lists". I have a copy, I will check. Stephen Howe (talk) 00:44, 14 December 2016 (UTC)Reply

References

  1. ^ Robert Sedgewick's Algorithms 4th edition.

Usage of bigO notation. "At least O(...)"

edit

Big-O should denote upper bounds, and Big-Omega denotes lower bounds. I realize that at least colloquially, people don't pay that much attention and say Big-O in all cases, often intending a meaning closer to that of Big-Theta (the conjunction of Big-O and Big-Omega), but that is incorrect usage and should be corrected on Wikipedia, or am I missing a semi-formal convention which standardizes that usage? At the very least the article on Big-O notation seems to agree with me at a quick glance.

"A fundamental limit of comparison sorting algorithms is that they require linearithmic time – O(n log n) – in the worst case"

"Comparison-based sorting algorithms (...) need at least O(n log n) comparisons for most inputs."

"These are all comparison sorts, and so cannot perform better than O(n log n) in the average or worst case."

Syrak (talk) 22:49, 16 November 2015 (UTC)Reply

@Qwertyus:
I've reverted Qwertyus; my revert replaces the Ω() bound in favor of O(n).
Higher up in that section, the best average/worst case performance is described as O(nlogn).
These are all comparison sorts, and so cannot perform better than O(n log n) in the average or worst case.
There are strange scope of quantification issues.
Saying a sorting algorithm is O(nlogn) allows the execution with particular input to be less than nlogn; bubble sort taking n comparisons is not a surprise; it does not contradict the bound.
Some algorithms are strictly Θ(nlogn). Heapsort goes through all the motions no matter what. But that doesn't mean all sorting has such a lower bound.
Saying a sorting algorithm is Ω(nlogn) says it can never do better than nlogn. Bubblesort does better once in a while.
Throwing in "average case" or "worst case" muddies an order statistic.
Speaking of Ω() in the context of "worst case" is really speaking in terms of an upper bound (the worst case) rather than a lower bound.
The meaning of "average case" for many sorts is the same as "worst case". Yes, I know quicksort's worst case is a counterexample to that statement.
I don't see Ω() having the precision claimed.
Glrx (talk) 21:13, 30 November 2015 (UTC)Reply
CLRS, Theorem 8.1 (3rd ed., p. 193):
Any comparison sort algorithm requires Ω(n log n) comparisons in the worst case.
This is exactly the bound we're trying to express: no algorithm can beat an n log n lower bound in its worst case. The theorem allows linear best cases (it doesn't concern them) and quadratic worst cases (since n2 = Ω(n log n)). CLRS don't say much about the average case, except giving it as an exercise to proof that linear average case is impossible.
Our current formulation is in fact imprecise. A linear time algorithm still runs in O(n log n), so in that sense it doesn't perform any better. QVVERTYVS (hm?) 21:58, 30 November 2015 (UTC)Reply
(e/c) I don't dispute that a sorting algorithm requires k * (nlogn) comparisons in the worst case.
I dispute that makes the algortithm complexity Ω(nlogn).
The CT definition of Ω() does not have the notion of worst case in it. It says that for all possible inputs, the lower bound cannot be beaten. ∀ inputs is not the same as ∃ a worst case input.
Quicksort requires n2 comparisons in the worst case; would we then say that quicksort is Ω(n2)?
It's true in the worst case, but it doesn't seem right to me. I don't like putting quantification monikers on order statistics that have lower and upper bounds built into them.
To put it another way, I can use Ω(), Θ(), and O() to characterize heap and bubble sort:
  • Heap sort: Θ(nlogn)
  • Bubble sort: Ω(n) and O(n2)
Those statements make sense as complexity theory lower (best case) and upper (worst case) bounds. And I don't have to add "best" or "worst" monikers.
If you want to be precise, you do. Heap sort is also Ω(n) and O(n2) because Θ(nlogn) is within those bounds. A precise statement would be that bubble sort is Θ(n2) in the worst case and Θ(n) in the best case. QVVERTYVS (hm?) 08:23, 1 December 2015 (UTC)Reply
In that context, a claim that comparison sorts are Ω(nlogn) is confusing.
I don't see the same confusion with a statement that a comparison sort cannot be better than O(nlogn).
Perhaps the best way out is to avoid order notation and just say comparison sorts require at least nlogn comparisons for some inputs.
Glrx (talk) 23:25, 30 November 2015 (UTC)Reply
That would be a good way of stating the claim, but it's still a classical theorem that I think should be stated as precisely as CLRS state it as well. QVVERTYVS (hm?) 10:08, 1 December 2015 (UTC)Reply
Just a little comment about "at least nlogn comparisons" suggestion. This just doesn't work, because actually it is "at least Cnlogn comparisons" where C is unknown constant. So to say number of operations is proportional to nlogn in worst case scenario (rather than bounded by this exact number), which basically is the meaning on O(nlogn) notation. Trimutius (talk) 14:57, 3 December 2015 (UTC)Reply
  • If the lower bound of a comparison sort is   comparisons, but the article says  , then one could, for example, infer that a comparison sort that only requires   comparisons exists, because   (see doi:10.1145/1008328.1008329).   notation would be appropriate; such an algorithm would not pass because  , no matter how close c is to 0 (also see same page).
  • The definitions of big O and Ω notations in Knuth's paper is as follows. Let f and g be functions that map to the real numbers:
-Esquivalience t 22:29, 16 February 2016 (UTC)Reply
Well we should just think about what is important for sorts. Best cases of algorithms actually refer to Ω(f(n)), while worst cases refer to O(f(n)), and average case refers to some heuristics, which were made based on all possible inputs. But what is important for the algorithm is for it to be as fast as possible in worst possible case, so O(f(n)) is the only one in practical use for computer algorithms. And even though in tables we can mention that best cases are Ω(f(n)), this information is redundant and not relevant to the subject of the article. Trimutius (talk) 20:47, 17 February 2016 (UTC)Reply


The theorem does not state "the lower bound of a comparison sort is   comparisons".
We are interested in the order statistics of sorting algorithms.
The notation has an shorthand connotation just like graph theory algorithms.
Let n be an input sequence of n elements; f(n) is the number of comparisons an algorithm needs to sort n. Shorthand is just f(n), but realize that the shorthand f(n) is not single valued.
Given a particular algorithm, the meaning of O() and Ω() are clear. The first is an upper bound on the comparisons and second is a lower bound on the comparisons. There is no input sequence that will take more than O() comparisons and no input sequence that will take fewer than &Omega().
The theorem states that for some input sequences (not necessarily all input sequences), a comparison sort will require some multiple of n log n comparisons.
That means a comparison sorting algorithm cannot be better than O(n log n).
If somebody claims to have an O(n) comparison sort, then we know they are wrong (or there's a mistake in the theorem).
The theorem does NOT mean that all comparison sorts must be at least Ω(n log n).
We know that the lowly bubble sort is Ω(n) because it will finish in n comparisons for sorted input sequences.
Claiming a sort is Ω(n) does not violate the theorem.
When another quantification ("in the worst case") is placed outside of the notation, it walks outside of the order statistic notation and butchers an interior quantifier.
As it turns out, the order notation is inadequate for what you want to say here. It also adds confusion rather than clarifies.
Glrx (talk) 02:47, 18 February 2016 (UTC)Reply


I'm not really sure what the talk is about? It is pretty obvious that in all referenced literature they use Big O Notation for a particular reason. Because Ω just isn't really important. In all cases where "upper bound" (or "worst case") is used Big O Notation should be used (and only there), though looking at the current state of article this notation is used for average cases for some reason in many cases.
Phrases like "average time complexity of O(n log n)" just don't make sense, and there are at least couple of places where it states that. I'm against using Ω though, because that one isn't related to. We need some kind of alternative. For example just simply saying with words rather than notation: "average time complexity proportional to nlogn". Because that what people are basically trying to say.
Trimutius (talk) 22:26, 19 February 2016 (UTC)Reply
I also agree that it would be best to state the theorem without using asymptotic notation; even TAOCP tends to avoid it:
A possible replacement is (ref is TAOCP chapter 5):

The number of comparisons required by a comparison sorting algorithm is roughly n log n comparisons in the worst case (some algorithms require proportionately n log n comparisons);[1] in better cases, it is possible to perform comparison sorting in at most a multiple of n comparisons.[2]

-Esquivalience t 02:05, 6 March 2016 (UTC)Reply

References

  1. ^ Knuth 1998, § 5.3.1: The best worst case ... hence roughly n lg n comparisons are needed.
  2. ^ Knuth 1998, ch. 5: On the other hand, if the keys are known to be randomly distributed with respect to some continuous numerical distribution, we will see that sorting can be accomplished in O(N) steps on the average.
  • Knuth, Donald (1998). The Art of Computer Programming, Volume 3: Sorting and Searching (2nd ed.). Addison-Wesley Professional. {{cite book}}: Invalid |ref=harv (help)
Hello @Glrx:
I've corrected again O with Ω. Reading what has been written on this talk page, I think clarification is needed.
First, I hope we all agree that "A has a fundamental requirement of B" has the same meaning than "A requires B" (unless you want to do some nasty knowledge obfuscation).
Second, the sentences "Comparison sorting algorithms have a fundamental requirement of Ω(n log n) comparisons" or "Comparison sorting algorithms have a fundamental requirement of O(n log n) comparisons" are ambiguous since it doesn't say for which complexity measure. Let us denote X a symbol in {O, Ω}. Unambiguous sentences would be "Comparison sorting algorithms require X(n log n) comparisons for worst case complexity." or "Comparison sorting algorithms require X(n log n) comparisons for best case complexity." or "Comparison sorting algorithms require X(n log n) comparisons for average complexity.", etc. (Average complexity is just the expected complexity for the uniform distribution over all instances of size n. You can have "expected complexity" for other probability distributions.) However, in complexity theory, when the complexity measure is not given explicitly, it is assumed most of the time that it is the worst case complexity.
Third, the choice of the complexity measure (worst case, best case, average, etc.) is independent of the choice of the asymptotic notation ; you can have theorems with O and worst case complexity, with Ω and worst case complexity, with O and best case complexity, with Ω and best case complexity, with O and average complexity, with Ω and average complexity, etc. Please don't assume that worst case complexity is some kind of negation and that Ω is also some kind of negation, like when you say "dont worst-worst the notation" ; it isn't double negation that cancels out or anything similar.
Fourth, if you want to upper bound worst case complexity, you have to prove that all instances are below the bound. If you want to lower bound worst case complexity, you just have to prove that some instance is above the bound. It's pure logic.
Fifth, Trimutius said "I'm not really sure what the talk is about? It is pretty obvious that in all referenced literature they use Big O Notation for a particular reason. Because Ω just isn't really important.". It is true that O is much more frequently encountered than Ω but it isn't "Because Ω just isn't really important.". The fundamental reason for that is that most of the scientific articles present a result on the complexity of a problem ; hence to give an upper bound on the complexity of a problem, it is sufficient to find SOME algorithm for the problem and upper bound the complexity of this algorithm. But if you want to show that the complexity of the problem is lower bounded, you have to prove that ALL possible algorithms for this problem have this lower bound. This is a difference between "there exists one algorithm" and "for all algorithms" that makes it much more harder in general to prove lower bounds for complexity of problems. The effort to lower bound the complexity of the algorithm is usually not done, since they cannot conclude anything with it for the problem, which is the ultimate goal, and also since most of the time the only easy-to-prove lower bound an algorithm has is the trivial lower bound of Ω(n) (the algorithm must read the input). It is expected by most computer scientist that P doesn't equal NP but that is extremely hard to prove because it implies proving a superpolynomial lower bound for some NP problem and thus having to deal with all possible algorithms for this problem. The lower bound for "Comparison sorting algorithms" is really an exception because it simply uses decision trees to be proved. No not-trivial lower bound is known for ALL sorting algorithms. In general, O results are much more easier to prove than Ω results.
Sixth, so now consider the truth value of the four sentences: S1 = "Comparison sorting algorithms require O(n log n) comparisons for worst case complexity.", S2 = "Comparison sorting algorithms require O(n log n) comparisons for best case complexity.", S3 = "Comparison sorting algorithms require Ω(n log n) comparisons for worst case complexity." or S4 = "Comparison sorting algorithms require Ω(n log n) comparisons for best case complexity.". We can translate the mathematical asymptotic notation as follows: S1 = "ALL comparison sorting algorithms require, asymptotically, up to a constant factor, at most n log n comparisons for worst case complexity.", S2 = "ALL comparison sorting algorithms require, asymptotically, up to a constant factor, at most n log n comparisons for best case complexity.", S3 = "ALL comparison sorting algorithms require, asymptotically, up to a constant factor, at least n log n comparisons for worst case complexity." or S4 = "ALL comparison sorting algorithms require, asymptotically, up to a constant factor, at least n log n comparisons for best case complexity.". S1 and S2 are false since you can design poorly some very bad comparison sorting algorithm that will repeat an arbitrary high number of times, say nn, the comparison between the first and second element before proceeding to do a merge sort for example and ultimately solving the problem of sorting. With this poor algorithm the best and worst case will not be too far apart compared to the order of magnitude of their values, and way above n log(n). On the other hand, S3 is true, that's the classical result using decision trees. S4 is false since some comparison sorting algorithm may use only n-1 comparisons on already sorted input which would be the best case. Hence the only correct sentence is "Comparison sorting algorithms require Ω(n log n) comparisons for worst case complexity.". I'll let you check by yourself that "Comparison sorting algorithms require O(n log n) comparisons for whatever complexity." is also false (you can always make a poor algorithm).
I hope I helped at least someone on this topic. Ask questions if it wasn't clear but be aware that I'm not always connected to Wikipedia so the answer may take some time. Best regards, SectionFinale (talk) 15:56, 17 December 2017 (UTC)Reply

NEW - release of hsort (triangulation sort) and why it matters

edit

hello and thank you for reading about this "not paper book published" interesting sort algo.

http://sourceforge.net/projects/headsort/

This isn't on the main page or book but i beleive DOES matter. Simply because it was never tried in publications doesn't mean it doesn't matter; results matter, and are in forma gnu sort(1) for trial.

Mergesort is stable and fast, it heuristicly "bins" like radix and speedily compares LINES of data using strcmp - making MULTI-DIMENSIONAL sorting (multi field) very fast. extra fields (dimensions) are (smartly) catenated onto previous ones (gnu sort does this) and never compared if strcmp stops before then.

However - Mergesort heuristic failure is comparing item that have ALREADY been compared.

Traditionally it is thought a "triangulation sort" cannot be fast because it needs two or thee dimensions to compare characters (also no cpu strcmp to use). However it gains by never comparing two things twice. That still lags behind merge sort best case, but does win in some cases.

"hsort" or headsort can win in a few major ways.

ONE: when the first bin is done (which is triangulated and cannot finish sooner), headsort program can begin piping sorted output. (unlike a simple "sift up" sort, it can finish the tail quickly as well). this is important if anyone is waiting to view data or has a battery that cannot withstand needless background heat while they view.

TWO: "hsort" found two NEW improvements which may well make it a good algo to use in general. First instead of data[line][field][characters] it tried data[field][line][characters] and nearly being deleted attempt: found a special case which allowed a dimension reduction of 1, at moderate cost (a good improvement). Doing so ate up all time to try the other...

THREE: Next and recently, HSORT tried a simpler dimensional reduction. Using a 1d array sorted that output ptr to 3d array. Impossible per say: but distribution counting allows it on the tail, so it is fully 1/2 true.

FOUR: the old adage *foo[][][] is expesive is not as true as it had been in the 8086 days. New CPU can do some extra dimensions in 1 CPU tick, which remove more of the remaining handicap of the 'hsort' algorithm in practice.

With those two (or fee) dimensional reductions my tests show that hsort beats "merge sort" used in gnu sort heavily in best cases, and also still wins in headsort's worst cases: just all around faster FOR multi dimensional sorting.

I'd like others comments but I'm well aware its' rare any proffessors exchange comments on algorithms.

hsort is debuggable (unlike recursive heuristics, it's flow is mostly natural)

hsort also has implications in distributive computing because it can recurse dimensions and doesn't require merging or waiting per say (implementation of piping finished bins in 123 order aside which is easy, which is always true single process)

Allowing an sssumption this hsort is faster in a worthy manner, it does matter in that scientific algorithms can finish or never finish. Many math algorithms have sorting at their core (for example, factoring). While for "web server technology" sorting might not be important - it is important whether scientific processes can finish and show a result. And that is simply re-iterating a textbook big-O-notation explanation.

LASTLY. hsort algorithm is not an idea it's available for use (has other algo in a collection too) as a gnu-sort compatible program, and has a demo "mini sort" prog to make it's portability and use more clear.

— Preceding unsigned comment added by 72.219.207.23 (talk) 23:16, 19 March 2016 (UTC)Reply

The example image of stable sorting is not quite correct

edit

The image that shows whether a sort is stable or not (Sorting_stability_playing_cards) correctly shows the output of an unstable sort. The example of a stable sort can really only be said to not be shown to be unstable. If the description indicated that it would consistently (or provably) work in that way, it would be better. Dcrafti (talk) 03:59, 17 January 2017 (UTC) DcraftiReply

Comb sort in the comparison table

edit

The section "Comparison of algorithms" gives n2 as the average time complexity of comb sort. I am not competent enough to suggest what the correct complexity is (I suspect that this is a rather sophisticated problem with a result of something like nsome number between 1 and 2), but n2 cannot be right; in practice this algorithm sorts huge lists at speeds comparable to nlogn algorithms. Same about worst case actually. — Preceding unsigned comment added by 2001:7D0:8891:1380:BC17:C2C1:9181:5154 (talk) 07:53, 7 August 2017 (UTC)Reply

Shell sort is in the wrong section

edit

Shell sort is clearly not a variant of bubble sort. Why is it listed under the headline "bubble sort and variants"? Dp99 (talk) 09:34, 6 June 2018 (UTC)Reply

It you wanted to know how to make bubblesort into an NlogN sort, shellsort would do it. Gah4 (talk) 20:38, 4 September 2019 (UTC)Reply

Twin Sort Algorithm

edit

The "Twin Sort Technique" (https://arxiv.org/abs/1710.07992) is the latest core computer sorting algorithm and got published recently, and I am the main author of it. Can this be covered and added to your portal (https://en.wikipedia.org/wiki/Sorting_algorithm)? Millions of students, researchers and engineers get benefited out of this addition.

VeereshDevireddy (talk) 15:10, 16 August 2018 (UTC)Reply

The answer is "definitely not" until it is widely cited in professional literature. After looking at your paper, I predict that time will never come. McKay (talk) 07:30, 17 August 2018 (UTC)Reply

cache

edit

It occurs to me that caching effects on modern (or not so modern) processors can greatly affect the speed of some memory access patterns, and that could be significant in some sort algorithm comparisons. Not knowing where else to mention it, though it might be useful in pages for specific sort algorithms, I am starting here. Gah4 (talk) 20:43, 4 September 2019 (UTC)Reply

I agree with "caching effects on modern (or not so modern) processors can greatly affect the speed of some memory access patterns". "(...) and that could be significant in some sort algorithm comparisons", yes. For instance, a sorting algorithm which works on the whole input from the start (such as bubblesort) is less cache-aware than an algorithm such as mergesort which can focus on local solutions, which tend to be contiguous in memory, before going over a larger slice of whole input. BernardoSulzbach (talk) 22:28, 4 September 2019 (UTC)Reply
One example of this is on a processor with an 8 way cache, a 4 way merge sort (without heap, just an if/else tree to determine smallest element), there are 4 cache lines for the 4 runs being merged, and a 5th cache line for the merged output. The end result in my testing is that 4-way merge sort is about 15% faster than 2-way merge sort. Rcgldr (talk) 02:53, 6 September 2019 (UTC)Reply


Example stable sort vs. FERPA

edit

The dynamic web page in the second paragraph implies (at least to me) a Publicly accessible web page, whose content runs afoul of FERPA.

The second paragraph of the Stability section starts with:

Stability is important for the following reason: say, if the data is sorted first by student name, in some cases, dynamically on the webpage, and now the data is again sorted by which class section they are in.

A webpage containing student name and corresponding class section data would at first glance appear to be "'directory information" but is actually an education record. Therefore if the web page is accessible by others than those with a defined, institutional, education purpose, the web page would be in violation of FERPA.

My feeling is those learning these topics should not be creating things, that, if used with real-life data, would be in violation of the law.

"What is an education record? | Protecting Student Privacy". studentprivacy.ed.gov. US Department of Education. Archived from the original on Christmas 2018. Retrieved 26 February 2020 – via https://studentprivacy.ed.gov/frequently-asked-questions. [...]records include but are not limited to grades, transcripts, class lists, student course schedules, health records (at the K-12 level), student financial information (at the post secondary level), and student discipline files. [...] {{cite web}}: Check date values in: |archive-date= (help); External link in |via= (help)

None of this is relevant to the article. The article is about algorithms for sorting data, not about what should be done with the data. The text also says nothing about publishing the data, so the question of what is legal to publish is entirely irrelevant. McKay (talk) 22:25, 27 February 2020 (UTC)Reply
"FERPA Tutorial - Directory Information|When is Directory Information Not Really Directory Information?". Office of The University Registrar - Penn State. Retrieved 26 February 2020. It is important to also understand the concept of "implicit disclosure." An implicit disclosure may occur when a list consists only of directory information but the list itself by definition reveals non-directory information. For example, a list of names and email addresses of all students who have a particular grade-point average reveals the students' GPAs. Likewise, a class list containing names and email addresses of the students reveals class enrollments. Since neither grade-point average nor class enrollment are directory items, releasing these lists without prior consent of the students constitutes a FERPA violation.{{cite web}}:  CS1 maint: url-status (link)

--Lent (talk) 08:55, 26 February 2020 (UTC)Reply

How is this stable sort example?

edit
Try out this sandbox example, which tracks the sorting changes for two different stable sorting examples on the same data:
See User:Lent/sandbox
Thanks --Lent (talk) 11:20, 29 February 2020 (UTC)Reply

Static stable sorting example

edit
Unsorted table→Sorted by State Code→ Stable Sort by State, then City
Row City
State Code
Old

New
Row
City
State Code
Original

Initial Sorting

Following
stable sort
City
State Code
1 Chicago IL 13 → 1 Rockford IA 13 → 1 → 1 Rockford IA
2 Rockford IL 2 → 2 Rockford IL 4 → 5 → 2 Champaign IL
3 Evanston IL 3 → 3 Evanston IL 1 → 4 → 3 Chicago IL
4 Champaign IL 1 → 4 Chicago IL 3 → 3 → 4 Evanston IL
5 Detroit MI 4 → 5 Champaign IL 2 → 2 → 5 Rockford IL
6 New York NY 12 → 6 Rockford MI 5 → 7 → 6 Detroit MI
7 Buffalo NY 5 → 7 Detroit MI 12 → 6 → 7 Rockford MI
8 Milwaukee WI 15 → 8 Rockford MN 15 → 8 → 8 Rockford MN
9 Albany NY 11 → 9 Syracuse NY 9 → 12 → 9 Albany NY
10 Green Bay WI 6 → 10 New York NY 7 → 11 → 10 Buffalo NY
11 Syracuse NY 7 → 11 Buffalo NY 6 → 10 → 11 New York NY
12 Rockford MI 9 → 12 Albany NY 11 → 9 → 12 Syracuse NY
13 Rockford IA 14 → 13 Rockford TN 14 → 13 → 13 Rockford TN
14 Rockford TN 8 → 14 Milwaukee WI 10 → 15 → 14 Green Bay WI
15 Rockford MN 10 → 15 Green Bay WI 8 → 14 → 15 Milwaukee WI

Static Stable sorting tables with dividers

edit
Unsorted table→Sorted by State Code→ Stable Sort by State, then City
Row City
State Code
Old

New
Row
City
State Code
Original

Initial Sorting

Following
stable sort
City
State Code
1 Chicago IL 13 → 1 Rockford IA 13 → 1 → 1 Rockford IA
2 Rockford IL 2 → 2 Rockford IL 4 → 5 → 2 Champaign IL
3 Evanston IL 3 → 3 Evanston IL 1 → 4 → 3 Chicago IL
4 Champaign IL 1 → 4 Chicago IL 3 → 3 → 4 Evanston IL
5 Detroit MI 4 → 5 Champaign IL 2 → 2 → 5 Rockford IL
6 New York NY 12 → 6 Rockford MI 5 → 7 → 6 Detroit MI
7 Buffalo NY 5 → 7 Detroit MI 12 → 6 → 7 Rockford MI
8 Milwaukee WI 15 → 8 Rockford MN 15 → 8 → 8 Rockford MN
9 Albany NY 11 → 9 Syracuse NY 9 → 12 → 9 Albany NY
10 Green Bay WI 6 → 10 New York NY 7 → 11 → 10 Buffalo NY
11 Syracuse NY 7 → 11 Buffalo NY 6 → 10 → 11 New York NY
12 Rockford MI 9 → 12 Albany NY 11 → 9 → 12 Syracuse NY
13 Rockford IA 14 → 13 Rockford TN 14 → 13 → 13 Rockford TN
14 Rockford TN 8 → 14 Milwaukee WI 10 → 15 → 14 Green Bay WI
15 Rockford MN 10 → 15 Green Bay WI 8 → 14 → 15 Milwaukee WI

Stable sorting example

edit
Unsorted table
City State
Chicago IL
Rockford IL
Evanston IL
Champaign IL
Detroit MI
New York NY
Buffalo NY
Milwaukee WI
Albany NY
Green Bay WI
Syracuse NY
Rockford MI
Rockford IA
Rockford TN
Rockford MN
Sorted Alphabetically by only City
City States
Albany NY
Buffalo NY
Champaign IL
Chicago IL
Detroit MI
Evanston IL
Green Bay WI
Milwaukee WI
New York NY
Rockford MI
Rockford IA
Rockford TN
Rockford MN
Rockford IL
Syracuse NY
Sorted Alphabetically by only State
City States
Rockford IA
Rockford IL
Evanston IL
Chicago IL
Champaign IL
Rockford MI
Detroit MI
Rockford MN
Syracuse NY
New York NY
Buffalo NY
Albany NY
Rockford TN
Milwaukee WI
Green Bay WI
City States
Sorted Alphabetically by State, then City
Rockford IA
Champaign IL
Chicago IL
Evanston IL
Rockford IL
Detroit MI
Rockford MI
Rockford MN
Albany NY
Buffalo NY
New York NY
Syracuse NY
Rockford TN
Green Bay WI
Milwaukee WI
City States
Sorted Alphabetically by City, then State
Albany NY
Buffalo NY
Champaign IL
Chicago IL
Detroit MI
Evanston IL
Green Bay WI
Milwaukee WI
New York NY
Rockford IA
Rockford IL
Rockford MI
Rockford MN
Rockford TN
Syracuse NY

Derived from Jeffery S. Leon's Sorting Algorithms — Stability". [1]

To sort by the above tables, click on the triangle. Then hold the Shift key and click on the triangle to sort by another column, while keeping the order for the first column when values of the second column are equal.

Criticisms, Suggestions and Comments

edit
Apparently this renders statically on mobile devices, with no interactive sortability for the user. Also MOS:COLLAPSE means we shouldn't hide the content by default. Perhaps a version of this should be a subpage, as a more detailed alternative to the playing card example shown. --Lent (talk) 09:19, 28 February 2020 (UTC)Reply
Please check out the evolving page over at: User:Lent/sandbox
Thanks --Lent (talk) 11:20, 29 February 2020 (UTC)Reply


I need some advice on MOS Table Summaries, Table Color (used to emphasize motion of rows during sorting), and the five across tables.

Also the mobile Wikipedia seems to expand the tables by default, with no obvious way to hide the table.

Also which, if any data elements, should I get rid of? I think omitting a few more cities named Rockford might help, right?

Anyway, I would like this to be a nice example, which allows interaction using the built-in Wikipedia functionality for sortable tables.

Finally, where can I find documentation for normal users that the holding Shift key while clicking a sort triangle maintains the previous column(s) sort order? Did you know about this?

--Lent (talk) 06:13, 28 February 2020 (UTC)Reply

  1. ^ Leon, Jeffrey S. (13 January 2008). "Sorting Algorithms — Stability" (PDF). Department of Mathematics, Statistics, and Computer Science | University of Illinois at Chicago. Archived (PDF) from the original on 27 November 2014. Retrieved 28 February 2020. Our sorting software might allow sorting on only one field at a time. We may 1) First sort the array A alphabetically by city. 2) Then sort the array A alphabetically by state, using a stable sorting algorithm.
The tables work fine for me, but they actually don't really illustrate the notion of stability all that well. The example that was sorted "by state, then by city" sounds like it should be illustrating the effect of two successive sort operations: the first by state, and the second by city; but it's actually showing the data sorted a single time "by state, then by city" which is equivalent to sorting by city, then doing a second stable sort by state. If the salient concept here is stability, and you have multiple sorting steps, the key point to illustrate is that later sort steps preserve ordering from earlier steps, but in the example in the table, later sort steps actually consider all the columns, and so the earlier steps are irrelevant. It makes for quite a confusing illustration of stability. --Doradus (talk) 12:48, 6 March 2020 (UTC)Reply

Average Case vs Expected Runtime for Bogosort

edit

I would like to discuss a revert done by an anonymous IP. The reason for the revert was that the expected runtime is given in the average case. This is, however, wrong. The expected runtime is different from the average case. These two names are different concepts: the average case is a description of the input. The input meaning here the list to sort. We assume some distribution of the input and calculate the expectation for the runtime over this distrubtion. This is different for "expected runtime". The 'expected' in 'expected runtime' does not relate to the input, but instead to what we often call 'random bits'. The 'random bits' are not part of the input (this would make the whole concept of probablistic programming ad absurdum). Therefore we can not say that the average-case is identical to the expected runtime. We have infinitely many expected runtimes, especially we have a worst-case expected runtime, a best-case expected runtime and again we also have an average-case expected runtime.

Even worse, saying that the worst-case runtime of bogosort is infinite is at best misleading. That is because there is really no concept in which infinite would be a meaningful runtime for bogosort. I would accept using the phrase 'certainly unbounded', meaning assuming that there is no randomess, the runtime is not bounded by any function depening on the input. Even saying that the runtime is unbounded (without using the word certainly) is misleading as the probability for unbounded runtime is 0 and even the expected runtime is at most n x !n. I would therefore ask to redo my change to fit more with the given source (which uses the same semantics) and with the current research. --PhoenixIra (talk) 17:25, 22 June 2020 (UTC)Reply

Worst case is commonly meant to be the worst possible case, which is unbounded for bogosort. The input in this case would commonly include the worst possible random seed as well. At any rate, we follow the sources, and your addition (and the matching addition at Bogosort) was unsourced. Cite a reputable source and interprets that the worst case behavior as you are and perhaps we can use it. - MrOllie (talk) 17:51, 22 June 2020 (UTC)Reply
I don't agree. Citing CLRS: "The worst-case running time of an algorithm gives us an upper bound on the running time for any input. Knowing it provides a guarantee that the algorithm will never take any longer. We need not make some educated guess about the running time and hope that it never gets much worse." (Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms, Third Edition. MIT Press 2009, page 28) and later claryfing: "When analyzing the running time of a randomized algorithm, we take the expectation of the running time over the distribution of values returned by the random number generator. We distinguish these algorithms from those in which the input is random by referring to the running time of a randomized algorithm as an expected running time. In general, we discuss the average-case running time when the probability distribution is over the inputs to the algorithm, and we discuss the expected running time when the algorithm itself makes random choices." (Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms, Third Edition. MIT Press 2009, page 117). CLRS also sometimes uses the phrase 'worst-case' complexity when refering to randomized algorithms, when they really mean 'certain worst-case'. Including random bits in the input would also make the runtimes totally absurd: for an infinite runtime, you would need infinite random bits, making an infinite input. This is obviously broken. Again, expectation and worst/average-case are different concepts. Coming to bogosort, we have 'Sorting the slow way: an analysis of perversely awful randomized sorting algorithms': "Let S denote the number of swaps carried out by bogo-sort on a given input x of length n. Then E[S]=(n − 1)n! in the worst and average case." and "Let C denote the number of comparisons carried out by bogo-sort on a given input x of length n. Then E(C) = (e − 1)n! + n − O(1) in the worst-case" (Gruber, H.; Holzer, M.; Ruepp, O., "Sorting the slow way: an analysis of perversely awful randomized sorting algorithms", 4th International Conference on Fun with Algorithms, Castiglioncello, Italy, 2007 (PDF), Lecture Notes in Computer Science, 4475, Springer-Verlag, pp. 183–197, Corollary 6 and 10) and .Again, I am on board with using the phrase "unbounded certainl runtime" and "n x !n expected runtime", however, infinite is wrong and don't mentioning the worst-case expected runtime is bad. --PhoenixIra (talk) 18:11, 22 June 2020 (UTC)Reply
The Gruber et al. conference paper looks good, you've convinced me. Just add a citation to that. - MrOllie (talk) 18:24, 22 June 2020 (UTC)Reply

"For optimum efficiency, the input data should be stored ... "

edit

An introductory sentence says, "For optimum efficiency, the input data should be stored in a data structure which allows random access rather than one that allows only sequential access."

But that is true only when the cost of random access is very small. If the volume of input data is too long to fit in available fast memory (e.g., RAM), then having input data stored in a data structure residing on slow memory (e.g., hard disk, or magnetic tape) then such a data structure can be terrible.

I'm changing the introductory sentence to say, "For optimum efficiency, input data in fast memory should be stored in a data structure which allows random access rather than one that allows only sequential access." DavidForthoffer (talk) 20:41, 4 February 2021 (UTC)Reply

Bad behavior for parallel sort

edit

First bullet in 'Classification'

Computational complexity (worst, average and best behavior) in terms of the size of the list (n). For typical serial sorting algorithms, good behavior is O(n log n), with parallel sort in O(log2 n), and bad behavior is O(n2)

Does anyone know what the 'O' for bad behaviour for parallel sorting is?

Darcourse (talk) 10:12, 22 July 2021 (UTC)Reply

Comparison list is incomplete Tim Sort is missing

edit

Link: https://en.wikipedia.org/wiki/Timsort — Preceding unsigned comment added by 2001:16b8:1e5b:ba00:2c00:cbff:fe64:7ae (talkcontribs)

No, it is there. Look again. - MrOllie (talk) 19:26, 18 August 2021 (UTC)Reply

Online sorting

edit

I was surprised to find no page on wikipedia for comparing online sorting algorithms, and no section in the Online algorithm page. I'm not sure if it deserves its own page, a section in this page with a new comparison table, or maybe a column in an existing comparison table? Privychick (talk) 13:32, 18 March 2022 (UTC)Reply

tk pic

edit

What is the point of the tk screenshot at the top?? —Tamfang (talk) 00:46, 21 April 2023 (UTC)Reply

Seeing no response ... —Tamfang (talk) 04:52, 3 November 2023 (UTC)Reply
I agree that it wasn't very good. But I think something should be there. Do you see any good ones at Category:Sort algorithms? Perhaps File:Visualization of Gnome sort.gif Bubba73 You talkin' to me? 05:29, 3 November 2023 (UTC)Reply
Perhaps an animation of a less stupid sort. ;) Heapsort or Quicksort for example. —Tamfang (talk) 06:23, 3 November 2023 (UTC)Reply
There are some better ones at the Wikimedia Commons: Category:Animations of sort algorithms, see this Bubba73 You talkin' to me? 06:31, 3 November 2023 (UTC)Reply

AI Sorting

edit

Just wondered if AI sorts should be added:

GrahamHardy (talk) 16:42, 23 June 2023 (UTC)Reply

By which I guess you mean improved algorithms recently discovered by AIs. If they are novel concepts they ought to be listed, but the videos seem to say they are speedier versions of known approaches. (I'll try to read the articles linked under the first video.) —Tamfang (talk) 23:45, 18 July 2023 (UTC)Reply
The DeepMind AI sorting algorithm targets sorting 3 to 5 elements and in the case of 5 elements, saves one compare. Not much of a breakthrough. Rcgldr (talk) 07:38, 24 July 2023 (UTC)Reply
That could potentially be significant if it is in a recursive divide-and-conquer algorithm. Bubba73 You talkin' to me? 05:36, 3 November 2023 (UTC)Reply

mistakes and ArrayV

edit

1 strength of ArrayV is, that the numbers, which are visualized as bars, are always integers, as I have never seen pigeonhole sort failing to sort an array.

There are some wrong time complexities in Sorting algorithm#Others.

Simple pancake sort should be O(n) best and O(n2) average and worst and Bitonic sorter should be O(n log2n), best average and worst, where log2n means (log n)2. Well, these time complexities fit well to the results in https://www.youtube.com/watch?v=8MsTNqK3o_w&t=2825s andhttps://www.youtube.com/watch?v=v4Uo0Pi6odo&t=15s. The fastest a sorting algorithm can reach is O(n). In array v oof 2, sb broke ArrayV by running parallel bitonic sort with 16384 numbers, but I think, it was just a freeze bug, like https://www.youtube.com/watch?v=vlLrmCifVRY&t=1646s.

Another strength of ArrayV is how the sorting algorithms are visualized. AceOfSpadesProduc100 himself said in (follow-up video) Threaded pattern-defeating merge sort on ArrayV-v4.0, in Parallel bitonic sort on ArrayV, in Parallel odd-even merge sort on ArrayV and in ArrayV-v4.0's parallel sorting algorithms, that he uses an Intel Core i7-9750H, which has 6 cores and 12 threads. But, the way ArrayV visualizes the sorting algorithms lets the parallel sorting algorithms do the fully parallel method. For example, parallel bitonic sort could use flan sort rotations. 94.31.83.138 (talk) 19:43, 23 August 2023 (UTC)Reply

corrections for "Explaining EVERY Sorting Algorithm" parts

edit

bold = correction

part 2: https://www.youtube.com/watch?v=wqibJMG42Ik

patience sort: O(n) best, O(n log n) average and worst, O(n) memory, stable

tournament sort: O(n log n) best, average and worst, O(n) memory, stable

bitonic sort: O(n log^2 n) best, average and worst, O(1) memory, unstable

odd-even merge sort: O(n log^2 n) best, average and worst, O(1) memory, unstable

pairwise sorting network: O(n log^2 n) best, average and worst, O(1) memory, unstable

stooge sort: O(n^2.71) best average and worst, O(1) memory, unstable

slow sort: O(n^(log n)) best average and worst, O(1) memory, unstable

variants and hybrids: https://www.youtube.com/watch?v=FntVy6lPVyo

cartesian tree sort: O(n log n) best, average and worst, O(n) memory, stable

Stalin sort: O(n) best, average and worst, O(1) memory, stable

sleep sort: O(n) best, O(n + r) average and worst, O(n) memory, stable

miracle sort: O(n) best (on a sorted list), O(inf) average and worst, O(1) memory, stable

power sort: O(n * n!) best, average and worst, O(n * n!) memory, stable

A sorting algorithm can not be faster than O(n).

Stalin sort increases unsorted values to the largest previously read value (https://www.youtube.com/watch?v=hyOlWQ9MLPI). For example, [1, 2, 5, 3, 6, 4, 10] becomes [1, 2, 5, 5, 6, 6, 10].

Sleep sort is a time-based sorting algorithm. The smallest number wakes up first and the largest number wakes up last (https://www.youtube.com/watch?v=ktgxMtWMflU&t=210s). Equal numbers will wake up at the same time, but waking up is swapless.

Miracle sort only makes comparisons and power sort prints all the permutations in auxiliary arrays, so both of them are swapless.

Swaps of non-adjacent numbers are what makes a sorting algorithm unstable (https://www.youtube.com/watch?v=KJuxI1BBLyQ&t=403s).

some examples

cycle sort, bitonic sort, odd-even merge sort, pairwise sorting network, stooge sort and slow sort: Swap non-adjacent numbers. = unstable

insertion sort and binary insertion sort: Swap only adjacent numbers. = stable

patience sort, tournament sort, cartesian tree sort, Stalin sort, sleep sort, miracle sort and power sort: no swaps = stable 94.31.89.138 (talk) 20:28, 28 October 2023 (UTC)Reply