Talk:Golden-section search
This article is rated Start-class on Wikipedia's content assessment scale.
Untitled
Needs expansion and inline references. Adam McCormick (talk) 00:53, 21 March 2008 (UTC)
Motivation, etc
Fibonacci search was actually introduced by J. Kiefer in 1953. Kiefer also introduced the golden section search, i.e., using the limit of
F(n−1)/F(n) ≈ 0.618 as the constant ratio. — Preceding unsigned comment added by 109.66.185.182 (talk) 17:14, 24 December 2011 (UTC)
I have changed a few things on the recent addition.
The motivation previously provided might not have been as clear as it could be, so I have added a sentence at the beginning. The new motivation was first of all wrong, and secondly, it removed the explanation of the intervals a, b, c in the diagram.
The selection of the "probe" point at the golden ratio does not guarantee the fastest possible convergence; the fastest possible convergence ratio is not known a priori. What the choice does guarantee is a constant narrowing ratio of the search interval, so that at each step no one subinterval is favored over another in the search.
I think the other additions are excellent. PAR 21:38, 13 February 2007 (UTC)
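For concreteness, here is the standard derivation behind the constant-narrowing-ratio point above: if every step is to scale the bracket by the same factor r while reusing the surviving interior point, that point must land exactly where an interior point of the shrunken bracket belongs, which forces r² = 1 − r. The positive root is r = (√5 − 1)/2 ≈ 0.618, the reciprocal of the golden ratio, with the new probe placed at the mirror position 1 − r ≈ 0.382.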
Hi David - on the recent additions - yes, that's the part that was missing. It looks a lot better. PAR 22:04, 13 February 2007 (UTC)
- Thanks — I'm glad we seem to have converged to something we both agree is an improvement. —David Eppstein 22:08, 13 February 2007 (UTC)
Problem with choosing the next d in example code?
- d = c + resphi*(b - c)
The above line does not work for both branches the algorithm can take. —Preceding unsigned comment added by 71.237.136.91 (talk) 05:41, 30 May 2010 (UTC)
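A minimal sketch of one way to repair this (not the article's original code; the function name and the branch test are mine): place the new probe d in the larger of the two subintervals, mirroring the formula for the left-hand branch. This version re-evaluates f(c) on every call, so it is illustrative rather than efficient.

import math

resphi = 2 - (1 + math.sqrt(5)) / 2   # 2 - phi, about 0.381966

def golden_section_search(f, a, c, b, tol):
    # Bracket [a, b] with interior point c; assumes f(c) < f(a), f(b).
    if abs(b - a) < tol:
        return (a + b) / 2
    if b - c > c - a:                  # right subinterval is larger
        d = c + resphi * (b - c)       # the line quoted above
        if f(d) < f(c):
            return golden_section_search(f, c, d, b, tol)
        return golden_section_search(f, a, c, d, tol)
    else:                              # left subinterval is larger
        d = c - resphi * (c - a)       # mirrored formula
        if f(d) < f(c):
            return golden_section_search(f, a, d, c, tol)
        return golden_section_search(f, d, c, b, tol)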
Problem with the Termination Condition section?
I hesitate to make the change myself, but there is clearly at least a notation issue in that section. It starts out by listing x1, x2, x3, and x4 as the relevant variables, but in the equation we find x0 and no x4. In the original text (Numerical Recipes in C, page 402), there is no x4 and the equation is written as:

|x3 − x0| > τ(|x1| + |x2|)

as opposed to

|x3 − x0| < τ(|x1| + |x2|)

in this article. The fact that x3 didn't change tells me that this is not just an off-by-1 problem. I think the issue arises from the fact that earlier in the article the notation is 1-based, not 0-based. My proposed change would be to list the formula as:

|x3 − x1| < τ(|x2| + |x4|)

The biggest reason I didn't change it though is the following line that refers to the precision of x2 and x4, and I don't know which notation scheme this is referring to (I imagine it's the 1-based one).
(sorry for the deletion in the history - realized I wasn't logged in when I wrote it, wanted to get it put back in my name so I can address comments/questions)
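If it helps, the proposed condition as a small helper function (a sketch; x1 and x3 are the outer points, x2 and x4 the interior ones, following the article's 1-based labeling):

def converged(x1, x2, x3, x4, tau=1e-8):
    # Stop when the outer bracket [x1, x3] is small relative to the
    # magnitudes of the interior points x2 and x4.
    return abs(x3 - x1) < tau * (abs(x2) + abs(x4))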
Why?
What, if any, are the benefits over binary search, and when? —Preceding unsigned comment added by 193.184.83.235 (talk) 10:19, 13 May 2011 (UTC)
- They solve different problems. Binary search looks for a particular value in a monotonic function (one that only goes up, or only goes down) while this one looks for the value where a function that goes up and then down takes its maximum value. —David Eppstein (talk) 15:59, 13 May 2011 (UTC)
- Is it simply a version of ternary search where the intermediate test points are chosen a little more carefully than thirds? The interval scales by 1/phi = 0.618... instead of 2/3 = 0.666... https://github.com/cp-algorithms/cp-algorithms/pull/1119
- You can also do ternary search as binary search on finite differences, assuming a discrete search space. Wqwt (talk) 02:29, 19 June 2024 (UTC)
- You can also do binary search with breakpoints at 1/3 instead of at their worst-case optimal location, which for binary search is 1/2. But why would you? Anyway this is probably not the same in that a ternary search would split the current interval into three every time, with two test points, while this search takes an already-split-in-two interval and splits its larger part with only one more test point. —David Eppstein (talk) 04:32, 19 June 2024 (UTC)
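A toy sketch of the accounting behind that comparison (the structure and names are mine): a ternary step spends two fresh evaluations to shrink the bracket to 2/3 of its length, while a golden step spends one fresh evaluation to shrink it to 1/φ ≈ 0.618, because the surviving interior point is reused.

def ternary_step(f, a, b):
    # Two new evaluations; bracket shrinks to 2/3 of its length.
    m1, m2 = a + (b - a) / 3, b - (b - a) / 3
    return (a, m2) if f(m1) < f(m2) else (m1, b)

def golden_step(f, a, b, x, fx):
    # One new evaluation; bracket shrinks to ~0.618 of its length.
    # x is the surviving interior point (initially at fraction
    # 2 - phi ~ 0.382 of the bracket) and fx = f(x); the probe y is
    # its mirror image, so only f(y) must be computed.
    y = a + b - x
    fy = f(y)
    if x > y:
        x, y, fx, fy = y, x, fy, fx
    return (a, y, x, fx) if fx < fy else (x, b, y, fy)

So k ternary steps cost 2k evaluations for a shrink of (2/3)^k, while k golden steps cost k evaluations for a shrink of about 0.618^k.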
x1, x2, etc vs. a, b, c
It seems odd that the top part of the article talks about a, b, c being segment lengths, while the pseudocode algorithm assumes them to be points on the real line (the comment line notwithstanding). I think it would be much clearer to use x1, x2, etc. in the pseudocode. Am I missing something? Shabbychef (talk) 18:46, 6 February 2012 (UTC)
- Absolutely agree. I think it should be fixed. Also, gr = (math.sqrt(5)-1)/2 in the Python program is badly named: this is not the golden ratio but its reciprocal, 1/φ ≈ 0.618.
Strictly monotonic
It seems that the code as presented does not adequately handle the case where f(x) == f(b), yet Wikipedia's definition of unimodal function only says the function needs to be monotonic, which is defined as non-decreasing (i.e., constant values are okay). As I don't think there's any good way of remedying this, maybe the article should be made more precise instead? I wouldn't know if e.g. “strictly unimodal” would make any sense, though. Sesse (talk) 14:37, 13 August 2012 (UTC)
- What is the "correct" result? f(x)==f(b) has no minimum. PAR (talk) 22:22, 13 August 2012 (UTC)
- In that case, any x would do. With a function like e.g. f(x)==abs(floor(x)), there would certainly be a minimum at f(x)=1 (even if we exclude the discontinuity at x=0), but the algorithm as stated might very well wander off and pick e.g. x=3 instead. Maybe this should be emphasized with an assert(f(x) != f(b)) in the code? -Sesse (talk) 18:58, 14 August 2012 (UTC)
- Can you give a specific example involving f(x)==abs(floor(x)), including lower boundary (x1), center value (x2) and upper boundary (x3) where x1<=x2<=x3 for which it "wanders off" and gives an incorrect answer? PAR (talk) 01:42, 15 August 2012 (UTC)
- It's not exactly the “wandering off” problem I saw in my earlier testing (my function is actually quite a lot more complicated, and reducing it to a simple test case is not all that easy), but for f(x)=abs(ceil(x)), a=-100, b=1, c=16, tau=1e-3, the algorithm (at least when I try it in C) returns x=1.000009, whose f(x) = 2. In testing this, I also found another bug: if b=x=0, the algorithm will never terminate, since the right side of the inequality is zero. You can see this with e.g. f(x) = abs(ceil(x)), a=-10, b=-1 and c=10; relatively quickly, it converges on a=b=c=0 and just loops infinitely from there. Just changing < to <= fixes that one easily enough, though. -Sesse (talk) 09:38, 15 August 2012 (UTC)
- I found a slightly better example: f(x)=abs(ceil(x)), a=-1000, b=-989, c=303. At some point, this reduces to a=-3.532451, b=-1.999827, c=0.480010, for which x=-1.052613, and then the algorithm picks the wrong way by setting c=x, taking the correct solution outside the [a,c] interval. It eventually chooses to return x=-2.000174, for which f(x)=2, which is obviously wrong (e.g. x=-0.5 would give f(x)=0). -Sesse (talk) 09:55, 15 August 2012 (UTC)
- As there's been no response for over a month, I'm making the change. -Sesse (talk) 22:58, 19 September 2012 (UTC)
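The non-termination in the b=x=0 case is easy to see in isolation. A sketch (the helper name is mine; a, b, x follow the old code's variables as I read the thread above):

def should_stop(a, b, x, tau=1e-6):
    # The old test with '<' replaced by '<=', per the fix above.
    return abs(a - b) <= tau * (abs(b) + abs(x))

print(should_stop(0.0, 0.0, 0.0))  # True; with a strict '<' this was
                                   # 0 < 0, i.e. False, forever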
Link to Brent's Method is incorrect
There are two algorithms, both named Brent's method. The link to the article on Brent's method points to an article on root-finding, not function minimization. — Preceding unsigned comment added by Rwsaustin (talk • contribs) 20:12, 21 October 2013 (UTC)
Why this algorithm sometimes doesn't find the global maximum
As you probably already know, the bisection method is guaranteed to always find a zero of the function, even if the function has several zeros.
Sometimes I want to find the maximum of a function that I suspect is not "strictly unimodal". For example, let's say I have a polynomial function p(x) that I suspect has two local maxima. Or let's say I have a continuous piecewise linear function L(x) with a finite number of pieces, none of which are horizontal, but that I suspect has several maxima.
For the above functions, once I've found locations x1, x2, x3 such that x1 < x2 < x3 and f(x1) < f(x2) and f(x3) < f(x2), I know that at least one local maximum is in the range x1 to x3.
When given those points x1, x2, x3, is the golden section search guaranteed to converge on a local maximum (which may or may not be the global maximum)?
I think this article would be improved by a few words discussing what happens when this algorithm is applied to a function that is not "a strictly unimodal function". In other words, I think this article would be improved by explaining why this algorithm sometimes doesn't find the global maximum when the function is not a strictly unimodal function. --DavidCary (talk) 15:35, 17 March 2015 (UTC)
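As a partial answer (a sketch, with my own naming): each step preserves exactly the bracketing property described above, so for functions whose local maxima don't accumulate, the search converges to some local maximum inside the original bracket, though not necessarily the global one; which one it finds depends on the comparisons along the way.

def golden_max_step(f, x1, x2, x3, f2):
    # Entry invariant: x1 < x2 < x3 with f(x2) > f(x1) and f(x2) > f(x3).
    x4 = x1 + x3 - x2              # mirror-image probe, one new evaluation
    f4 = f(x4)
    lo, hi = (x2, x4) if x2 < x4 else (x4, x2)
    flo, fhi = (f2, f4) if x2 < x4 else (f4, f2)
    # Exact ties (flo == fhi) are the non-strict case discussed in the
    # "Strictly monotonic" section above; strict unimodality rules them out.
    if flo > fhi:
        return x1, lo, hi, flo     # same invariant holds on [x1, hi]
    return lo, hi, x3, fhi         # same invariant holds on [lo, x3]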
Is the Python program for golden-section search broken?
I may be wrong here, but it seems there are at least three problems:
- It is inconsistent with the article: a, b are used for points and not for interval lengths, and phi/gr in the program is not the golden ratio.
- On each step it evaluates the function at 2 points. Evaluating the function at only one point per step is IMHO the main advantage over ternary search. There is commented-out code, but IMHO it creates more confusion than it adds value.
- It doesn't work. If you run it on the interval [-z, z], the first step produces the bigger interval ~[-0.2z, 2.2z] instead of shrinking it.
Av life (talk) 01:41, 25 September 2015 (UTC)
- All the edits after 16 March 2015 have been to add and then tweak the Python code. I don't have the energy to study it at the moment, but it looks typical of a home-grown solution that is not really appropriate here. The simplest fix might be to just remove the Python code if you are confident it has problems. I confirm its style is undesirable. Johnuniq (talk) 02:02, 25 September 2015 (UTC)
- I have problems with the method too - it fails to converge correctly when told to find the minimum of the function x*x in the interval [-30, 30] with a starting point of 20. It gets stuck around 7.082039324993694 for about 30 iterations before getting unstuck and continuing to converge. — Preceding unsigned comment added by 90.217.67.144 (talk) 11:50, 27 May 2016 (UTC)
Neither of the Python code excerpts works at all. It works for a parabola, probably just because it is symmetric, but fails spectacularly for my function, which looks almost like a parabola and is almost symmetric, yet it converges to either the upper or lower boundary instead of the maximum. The function I use is -x - 113162 / (x / (x + 5619543) + 0.181412) ** 1.216066 + 902011
- it is strictly unimodal, positive in the search range, and has its global maximum at x=29444.44. But this implementation "converges" to either of the boundaries in so many iterations, whereas a simple secant search, with the derivative approximated by secants, always converges to within 1e-10 of the maximum in as little as 7 iterations and 22 function evaluations. I was interested to see how my algorithm performs against GSS and was pleased to find the Python implementations here, only to find out that they were totally broken. As I've lost more time trying to debug each of these 3 non-working Python excerpts and writing this comment than it would have taken to implement the GSS from scratch, I have to agree with Johnuniq and I shall remove it; if I manage to implement sensible working GSS code in Python I shall add it here then. --Ztn (talk) 21:30, 3 April 2022 (UTC)
- I was mistaken above. The problem with all the excerpts here is that they look for the minimum instead of the maximum. Since x² does not have a minimum it doesn't work. Ztn (talk) 22:19, 3 April 2022 (UTC)
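Since the broken excerpts were removed, here is a minimal reference sketch of the standard iterative version (my naming, not the removed code): it keeps two interior points, reuses one function value per iteration, and shrinks the bracket by 1/φ each pass.

import math

invphi = (math.sqrt(5) - 1) / 2       # 1/phi, about 0.618 (not phi itself)

def gss_min(f, a, b, tol=1e-8):
    # Minimize a strictly unimodal f on [a, b]; for a maximum, pass -f.
    c = b - invphi * (b - a)          # left interior point
    d = a + invphi * (b - a)          # right interior point
    fc, fd = f(c), f(d)
    while abs(b - a) > tol:
        if fc < fd:                   # minimum lies in [a, d]
            b, d, fd = d, c, fc       # old c becomes the new d: value reused
            c = b - invphi * (b - a)
            fc = f(c)                 # the single new evaluation
        else:                         # minimum lies in [c, b]
            a, c, fc = c, d, fd       # old d becomes the new c: value reused
            d = a + invphi * (b - a)
            fd = f(d)
    return (a + b) / 2

For example, gss_min(lambda x: (x - 2)**2, -30, 30) returns approximately 2.0.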
More efficient (and correct) implementations
The first Python version is illustrative, but not efficient, while the recursive ones were difficult to follow and (for the Python example) apparently wrong.
I replaced all but the first version with alternatives that are correct and reuse previously calculated values. This feature is the main reason the technique is interesting, and finding a good example of this is remarkably hard.
The code samples are translated and optimized versions of
http://mathfaculty.fullerton.edu/mathews/n2003/goldenratiosearchmod.html
Warren-macevoy (talk) 09:14, 11 February 2018 (UTC)
Golden and Bisection Method
Numerical Recipes states that "[t]he golden section search guarantees that each new function evaluation will (after self-replicating ratios have been achieved) bracket the minimum to an interval just 0.61803 times the size of the preceding interval," which is certainly nice, but with bisection (which is easier to code and easier to understand) the minimum is bracketed by an interval just 0.5 times the size of the preceding interval. So why bother with the extra effort of coding the golden-section intervals and processing the extra steps needed for 0.61803-times bracket reductions when bisection achieves 0.5-times bracket reductions? If there is a benefit to golden section (over bisection) it certainly doesn't seem to be mentioned here (nor is the existence of bisection even mentioned). Can anyone help me understand any benefits of golden section over bisection? 71.120.2.107 (talk) 02:06, 30 October 2022 (UTC)mjd
- To find which way to jump at each step of bisection you have to evaluate two function values at the midpoint and compare them. To find which way to jump with the golden ratio search you only need one new evaluation. So there are fewer evaluations overall. —David Eppstein (talk) 05:03, 30 October 2022 (UTC)
- I am not sure I understand what you are saying, so please let me explain what I do. The article itself says, "The technique derives its name from the fact that the algorithm maintains the function values for four points whose three interval widths are in the ratio φ:1:φ where φ is the golden ratio. These ratios are maintained for each iteration and are maximally efficient."
When I do a bisection search I start with three, not four, values that I know bracket the minimum. How do I know this? By starting with three evaluations. In order to be sure I am bracketing a minimum, the value of the function at the two endpoints must be greater than the value at the midpoint. It is not required that the midpoint be halfway between the two, but it must be between the two, and the value of the function at that point must be lower than the value of the function at the two endpoints. So: three evaluations of the function to initialize the search. I have to start with the minimum bracketed for this to happen, but this is required for the golden section method as well, so this starting point is either the same for both methods, or bisection has an advantage here if this article is correct in claiming *four* evaluations of the function are required for each step.
Now, for bisection, after evaluating the function at these three initial values of the abscissa, I choose the *two* ordered pairs that have the lowest ordinate values (the midpoint and the endpoint with the lower ordinate value of the two) and choose a new abscissa that is just the average of those two abscissas. That is my new abscissa point. I do exactly *one* function evaluation at that point, and I now have a new triplet composed of the two ordered pairs selected from the starting triplet and the new ordered pair I just calculated at their midpoint. That is one complete bisection step. For every subsequent step I perform exactly one additional evaluation of the function, and at every successive step I reduce the interval bracketing the minimum by half. So: one evaluation per bracket reduction of 0.5.
Golden section performs one evaluation per step, as you state here, and the article is clear that the interval reduction is only 0.61803 as opposed to 0.5. So what am I doing wrong with the bisection search that only requires me to do one evaluation per step, if as you state everyone else requires two function evaluations per step? I'm really very interested in understanding this, because if I can search more effectively I'd really love to do so. Thanks. 71.120.2.107 (talk) 15:23, 30 October 2022 (UTC)mjd
- So how do you go about finding your starting three evaluations with the property that the midpoint is the minimum? And what do you do when your new interval has a midpoint that is not the minimum of the three? Golden section search does not require the middle point to be the minimum point. —David Eppstein (talk) 16:18, 30 October 2022 (UTC)
- You're right - thanks - I took it for granted that the new minimum was between the lowest pair in the old triplet; when I checked, it was obvious that isn't always the case. So I see it is required for bisection that the midpoint evaluation be done on both intervals, so bisection does take on average more evaluations than golden section minimization. For the first question, the first triplet with the middle point having the minimum function value of the three: I have to estimate that and check I did it right before bisection starts. I see no perfect way to do that for either golden section or bisection. For both, that is the first minimum requirement, so you have to find it by some brute-force method. This article on golden section starts out, "[f]or a strictly unimodal function with an extremum inside the interval, it will find that extremum." That's the case for both. Your question, "how do you go about finding your first three evaluations," I assume, is not a hint that this is performed differently for the two approaches. — Preceding unsigned comment added by 71.120.2.107 (talk) 21:50, 30 October 2022 (UTC)
- The point is that for golden section search, the three selected points can be chosen purely based on their index; their values do not need any special ordering. It is automatically true that the middle point is not the largest of the three, and that is all that it needs. —David Eppstein (talk) 23:11, 30 October 2022 (UTC)
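Put quantitatively (assuming, as above, two evaluations per bisection halving and one per golden-section step): bisection shrinks the bracket by a factor of 0.5 per two evaluations, i.e. √0.5 ≈ 0.707 per evaluation, while golden section shrinks it by 1/φ ≈ 0.618 per evaluation. For the same final bracket width, golden section therefore needs about ln(0.707)/ln(0.618) ≈ 0.72 times as many function evaluations.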
Conditions
I think it's easy to devise an example of a continuous and even differentiable function for which the method won't converge to a local extremum, so the claim that it will always converge to a local extremum seems wrong. Imagine something like x^3 sin(1/x): for a function like that, the method can converge to 0, though 0 isn't a local extremum. KhayaleevT (talk) 13:40, 26 November 2022 (UTC)