Wikipedia:Reference desk/Archives/Computing/2024 August 7

August 7

Single versus Multiple Exit Points in a Function

When I was in school back in the 90s, we were taught that a function should have only one exit point. Do they still teach this? I'm asking because I'm coming across a lot of code when doing code reviews where the developer has multiple exit points and I'm wondering if I should ask them to change their code to have one exit point or let it slide. For example, I often see code like this:

        private static bool IsBeatle1(string name)
        {
            if (name == "Paul") 
            {
                return true;
            }
            if (name == "John")
            {
                return true;
            }
            if (name == "George")
            {
                return true;
            }
            if (name == "Ringo")
            {
                return true;
            }
            return false;
        }

Personally, this is how I would have written this code:

        static bool IsBeatle2(string name)
        {
            bool isBeatle = false;
            if (name == "Paul")
            {
                isBeatle = true;
            }
            else if (name == "John")
            {
                isBeatle = true;
            }
            else if (name == "George")
            {
                isBeatle = true;
            }
            else if (name == "Ringo")
            {
                isBeatle = true;
            }
            return isBeatle;
        }

So, my question is twofold:

  1. Do they still teach in school that a function should have a single exit point?
  2. When doing code reviews, should I ask the developer to rewrite their code to have one single exit point? Yes, I realize that this second question is a value judgement but I'm OK with hearing other people's opinions.

Pealarther (talk) 11:21, 7 August 2024 (UTC)[reply]

If there were only one school with only one instructor, your answer could be "yes" or "no." However, there are millions of schools with millions of instructors. So, the only correct answer is "both." Yielding functions and scripting languages have changed what is considered optimal when writing functions. So, it comes down to what the function does, what language is being used, and what the instructor feels like teaching. 75.136.148.8 (talk) 11:49, 7 August 2024 (UTC)[reply]
  • Many things taught in the '90s, and especially the '80s, are now realised to be unrealistic.
There is no reason why functions should only have one exit point. What's important is that some boundary exists somewhere where you can make absolute statements about what happens when crossing that boundary. Such boundaries are commonly functions, but it's broader than that too. In this case, we can define a contract, 'Leaving this function will always return a Boolean, set correctly for isBeatleness.' That's sufficient. To then mash that into this type of simplistic 'Only call return once, even within a trivial function' is pointless and wasteful.
You might also look at 'Pythonic' code, the idiomatic style of coding optimised for use in Python. This raises exceptions quite generously, see Python syntax and semantics#Exceptions. The boundary here is not the function but the scope of the try...except block. In Pythonic code, the exception handler that catches the exception might be a very long way away. Andy Dingley (talk) 13:58, 7 August 2024 (UTC)[reply]
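As a rough illustration of the point about exceptions, here is a minimal Python sketch in which a function has two ways out (a normal return and a raised exception) and the handler sits two calls away from the raise. The names `NotABeatleError`, `require_beatle`, `greeting` and `greet_all` are invented for this example and are not from any library.

```python
BEATLES = frozenset({"Paul", "John", "George", "Ringo"})


class NotABeatleError(ValueError):
    """Hypothetical exception type used only for this illustration."""


def require_beatle(name: str) -> str:
    # Exit 1: abnormal exit via an exception.
    if name not in BEATLES:
        raise NotABeatleError(name)
    # Exit 2: normal return.
    return name


def greeting(name: str) -> str:
    # No error handling here; an exception simply propagates through.
    return f"Hello, {require_beatle(name)}!"


def greet_all(names: list[str]) -> list[str]:
    messages = []
    for name in names:
        # The try...except block is the boundary, not the function that raised.
        try:
            messages.append(greeting(name))
        except NotABeatleError as exc:
            messages.append(f"{exc.args[0]} is not a Beatle")
    return messages
```

Calling `greet_all(["Paul", "Mick"])` handles the failure for "Mick" in `greet_all`, even though it was raised inside `require_beatle`.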
Yes, it was accepted wisdom (at least in academic teaching of programming) in the 1980s, and Pascal (the main teaching language in a lot of academic settings) effectively enforced it (at least in the academic versions people taught - I rather think Turbo Pascal, which was always more pragmatic, will not enforce this). But it leads to some horrible patterns:
  1. checking that inputs and other preconditions are acceptable leads to deeply nested ifs, with the "core" of the function many levels deep.
  2. "result" accumulation - especially where "break" is also prohibited (with the same reasoning), where the function has "finished" its calculation, but has to set a result variable, which then trickles down to the eventual real return at the end of the function. This (and the break prohibition) leads to fragile "are we done yet" booleans.
So the restriction was an attempt to avoid bad code, but in doing so produced lots of different kinds of bad, unreadable, fragile code. So it's a daft restriction.
I've no idea what academics teach now, and frankly what universities (often in toy or abstract cases) do is seldom what industry does. So let's look at what industry does:
  • Code Complete says: "Minimize the number of returns in each routine. It's harder to understand a routine when, reading it at the bottom, you're unaware of the possibility that it returned somewhere above. For that reason, use returns judiciously—only when they improve readability."
  • Neither the C++ Core Guidelines nor Google's C++ style guide seems to say anything on the topic
  • Notable codebases like Chrome, the Linux Kernel, PDF.js, IdTech3, MySQL, LLVM, and gcc all frequently use multiple return paths.
That doesn't mean "just return willy-nilly wherever", as that can be as bad - Code Complete gives smart advice. But it's a bad rule, which won't produce better code in real circumstances, and will frequently produce worse code. "Write good code" can't be boiled down to such simple proscriptions. -- Finlay McWalter··–·Talk 14:13, 7 August 2024 (UTC)[reply]
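The nesting and result-accumulation patterns described above can be sketched in Python. This is only an illustration under assumed requirements (return `None` for missing, empty or non-numeric input); the function names are hypothetical.

```python
def average_single_exit(values):
    # Single exit: every precondition adds a nesting level, and the
    # "core" computation ends up three levels deep.
    result = None
    if values is not None:
        if len(values) > 0:
            if all(isinstance(v, (int, float)) for v in values):
                result = sum(values) / len(values)
    return result


def average_early_return(values):
    # Guard clauses: each precondition is checked and dismissed up front,
    # so the core computation sits at the top level of the function.
    if values is None:
        return None
    if len(values) == 0:
        return None
    if not all(isinstance(v, (int, float)) for v in values):
        return None
    return sum(values) / len(values)
```

Both versions behave identically; the difference is only in how far the reader has to track the state of `result` before reaching the final `return`.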
I tend to agree with the OP. However, the example of multiple exits he gives is not that bad because they are all right together. It would be worse practice to have four exits randomly spread out in a routine. Bubba73 You talkin' to me? 04:08, 8 August 2024 (UTC)[reply]
The underlying rationale for the directive to have a single exit is to make it easier to ascertain that a required relationship between the arguments and the return value holds, as well as (for functions that may have side effects) that a required postcondition holds – possibly involving the return value. If the text of the body of a function fits on a single screen, forcing a single exit will usually make the code less readable. As long as it is easy to find all exits – much easier with on-line editors than with printouts of code on punch cards as was common before the eighties – the directive no longer fulfills a clear purpose.  --Lambiam 08:14, 8 August 2024 (UTC)[reply]
  • Back when I was at school we didn't have computers and nobody taught software at all.  
I would view a long if...else... chain as generally evil, and would see the sequence of if...return tests as somewhat "better".
However, I don't do C#, so I would have written something in PHP like this:
        function IsBeatle3(string $name): bool {
          return $name == "Paul" || $name == "John" || $name == "George" || $name == "Ringo";
        }
This single statement highlights that there is one test and one exit point and that the function always returns a value of the same type.
For something heavier, or perhaps if there were more than two return values, I might have used a switch statement. Perhaps something like this:
        function IsBeatle4(string $name): bool {
          switch($name) {
            case "Paul":
            case "John":
            case "George":
            case "Ringo":
              return true;

            default:
              return false;
          }
        }
Here, the switch is doing the comparisons for you and using the switch statement highlights that all cases have been handled and there is no other way out. — GhostInTheMachine talk to me 17:58, 18 August 2024 (UTC)[reply]
It's a structured programming thing and enforced by mostly pascaline languages for mostly ideological reasons. It comes from the same mindset as the dislike of gotos and ultimately, a desire for and anticipation of automatic verification. In C-likes that had return from the beginning, if/else if/else if/else looks supremely nooby, whereas if/return/if/return is idiomatic, especially when handling errors. Aecho6Ee (talk) 03:53, 22 August 2024 (UTC)[reply]

How are one-time passwords secure?

To log into my Mailchimp account, I need a password plus a one-time code I either read off the Google Authenticator app on my Samsung tablet, or off the iCloud keychain. The two sources always give the same code, and to set them up, I had to enter a 16-letter code. My question is: how does any of this increase security? To get the one-time code, all a hacker needs is the 16-letter code used, and they're good to go. It just seems like a second password but more complicated. I thought the idea of one-time codes was that it would be something I know (password) and something I have (my tablet). But in fact the something I have is only useful because of the 16-letter code (something else I know). Amisom (talk) 15:48, 7 August 2024 (UTC)[reply]

If you know the secret key (the code you started with), the current time, and the algorithm, you can produce the OTP key at any point in time. 75.136.148.8 (talk) 17:21, 7 August 2024 (UTC)[reply]
Or indeed, as I said, all you need is the secret key and a widely available app like Google Authenticator. So my question is how and why that is more secure than a password alone. Amisom (talk) 17:23, 7 August 2024 (UTC)[reply]
The issue is if your communication is being intercepted, someone is looking over your shoulder, or a bug in the browser state means the text you entered (which should be forgotten immediately) is retained in memory, and a wrongdoer can recover it later. If you were sending a shared secret (e.g. a password), now the enemy has your password. If all you enter is the OTP, which expires in a minute or two, the enemy has only seconds to use it. As the OTP is generated from the 80-bit shared secret with a one-way function (in this case, a cryptographic hash function), they can't reverse the OTP to recreate the 80-bit secret. The 80-bit shared secret key should not be your regular password, nor derived from it. Typically, when setting up a TOTP entry in Authenticator, the service (e.g. Mailchimp) should generate an 80-bit random key and usually shows this on screen with a QR code (for Google Authenticator to read). After that, the 80-bit shared secret is never passed between the two parties. -- Finlay McWalter··–·Talk 18:02, 7 August 2024 (UTC)[reply]
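To make the one-way derivation concrete, here is a minimal Python sketch of how such codes are commonly computed. It assumes the standard HOTP/TOTP construction (RFC 4226 and RFC 6238) with HMAC-SHA-1, six digits and a 30-second time step, and it assumes the 16-letter setup code is a Base32 encoding of the 80-bit secret (16 letters × 5 bits). Whether Mailchimp uses exactly these parameters is not stated in this thread; the function names `hotp` and `totp` are illustrative.

```python
import base64
import hashlib
import hmac
import struct


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based OTP: HMAC-SHA-1 over an 8-byte big-endian counter."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32: str, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """Time-based OTP: the counter is the number of whole steps since the epoch."""
    key = base64.b32decode(secret_b32.upper())
    return hotp(key, int(unix_time) // step, digits)
```

With the RFC 4226 test key `b"12345678901234567890"`, `hotp(key, 0)` gives `"755224"`. Only the short-lived code ever travels to the server; recovering the key from it would mean inverting the HMAC, and six decimal digits carry far less information than an 80-bit key in any case.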
Technically, as the concern mentions, if I had thousands of computers all attempting to authenticate at the same time, I could have each one attempt thousands of possible OTP keys based on trying every possible original seed value used to set up the OTP. If one works, I can continue using it without the extra overhead of trying millions of combinations. But, even if only 16 hex values were used in the random initial seed, there would be over 1,000,000,000,000 possible values to try. As mentioned above, you can't intercept this as you can with a password. It is not transmitted anywhere. The user never types it into anything after setting up the OTP. But, the concern is not completely without merit. It is possible that someone could randomly pick out the original value used to set up the whole thing and then have their own copy of it to use. It comes down to the old analogy of you can spend billions to build a system to work out a person's OTP and hack into their bank account or you can spend $5 on a good hammer and force them to give you their phone so you can use it to log in easily. 75.136.148.8 (talk) 19:53, 7 August 2024 (UTC)[reply]
There are 16¹⁶ different hexadecimal strings of length 16, which is more than 1.8×10¹⁹. This is a whole lot more than 1,000,000,000,000.  --Lambiam 21:51, 7 August 2024 (UTC)[reply]
The comment above that mentions an 80-bit shared secret. Assuming 8 bits per character, that is 10 characters, not 16. Regardless, it is correct to state that it is not likely someone will brute force an OTP secret easily. 12.116.29.106 (talk) 12:40, 8 August 2024 (UTC)[reply]
I was reacting to the comment immediately above my reaction, which stated,
"But, even if only 16 hex values were used in the random initial seed, there would be over 1,000,000,000,000 possible values to try."
If "16" was a typo for "10", the hexadecimal strings of length 10 are more than 1.2×10²⁴ in number, still more than 1,000,000,000,000 by over a factor of 1,000,000,000,000. These days you can buy a 7168-core GPU with a clock speed of 2.4 GHz, so trying 1,000,000,000,000 values is not an obvious impossibility. Five-character random passwords are not safe against brute force.  --Lambiam 22:16, 8 August 2024 (UTC)[reply]
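The key-space figures in this exchange can be checked with exact integer arithmetic. The sketch below is only a back-of-the-envelope aid: `seconds_to_exhaust` and `OPTIMISTIC_RATE` are hypothetical names, and one guess per core per clock cycle is a deliberately generous assumption rather than a measured rate.

```python
# Number of possible secrets under each reading discussed above.
keyspaces = {
    "16 hex digits": 16 ** 16,              # 2**64
    "10 hex digits": 16 ** 10,              # 2**40
    "10 eight-bit characters": 256 ** 10,   # 2**80
    "16 Base32 letters": 32 ** 16,          # 2**80
}


def seconds_to_exhaust(keyspace: int, guesses_per_second: float) -> float:
    """Time to try every value once at a given (assumed) guessing rate."""
    return keyspace / guesses_per_second


# Assumption: each of 7168 cores makes one guess per 2.4 GHz clock cycle.
OPTIMISTIC_RATE = 7168 * 2.4e9

if __name__ == "__main__":
    for label, size in keyspaces.items():
        duration = seconds_to_exhaust(size, OPTIMISTIC_RATE)
        print(f"{label}: {size:.2e} secrets, {duration:.2e} s to exhaust")
```

Sixteen Base32 letters and ten 8-bit characters both give 2⁸⁰, about 1.2×10²⁴, which matches the 80-bit secret mentioned earlier; sixteen hexadecimal digits give 2⁶⁴, about 1.8×10¹⁹, and ten hexadecimal digits give only 2⁴⁰, about 1.1×10¹².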