In the C++ programming language, a reference is a simple reference datatype that is less powerful but safer than the pointer type inherited from C. The name C++ reference may cause confusion, as in computer science a reference is a general concept datatype, with pointers and C++ references being specific reference datatype implementations. The definition of a reference in C++ is such that it does not need to exist as an object in its own right; that is, it need not occupy storage. It can be implemented as a new name for an existing object (similar to the renames keyword in Ada).

Syntax and terminology


The declaration of the form:

<Type>& <Name>

where <Type> is a type and <Name> is an identifier is said to define an identifier whose type is lvalue reference to <Type>.[1]

Examples:

int a = 5;
int& r_a = a;

extern int& r_b;

Here, r_a and r_b are of type "lvalue reference to int"

int& Foo();

Foo is a function that returns an "lvalue reference to int"

void Bar(int& r_p);

Bar is a function with a reference parameter, which is an "lvalue reference to int"

class MyClass { int& m_b; /* ... */ };

MyClass is a class with a member which is an lvalue reference to int.

int FuncX() { return 42; }
int (&f_func)() = FuncX;
int (&&f_func2)() = FuncX; // essentially equivalent to the above

FuncX is a function that returns a (non-reference type) int and f_func is an alias for FuncX

const int& ref = 65;

const int& ref is an lvalue reference to const int bound to a temporary object holding the value 65.
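As a minimal sketch of the same binding with a class type: when a reference to const is initialized directly from a temporary, the temporary stays alive for as long as the reference does. The function names MakeGreeting and GreetingLength are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper that returns a temporary (a prvalue).
std::string MakeGreeting() { return "hello"; }

std::size_t GreetingLength() {
  // The temporary returned by MakeGreeting() is bound to a reference to
  // const, so its lifetime is extended to match that of greeting.
  const std::string& greeting = MakeGreeting();
  assert(!greeting.empty());
  return greeting.size();
}
```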

int arr[3];
int (&arr_lvr)[3] = arr;
int (&&arr_rvr)[3] = std::move(arr);
typedef int arr_t[3];
int (&&arr_prvl)[3] = arr_t{}; // arr_t{} is an array prvalue
int *const & ptr_clv = arr; // same as int *const & ptr_clv = &arr[0];
int *&& ptr_rv = arr;
// int *&arr_lv = arr; // Error: Initializing an lvalue reference to non-const type with an rvalue

arr_lvr is a reference to an array. When initializing a reference to array, array-to-pointer conversion does not take place, but it does take place when initializing a reference to pointer. Since array-to-pointer conversion returns a prvalue, only lvalue references to const and rvalue references can be initialized with its result. Similarly, when initializing a reference to function, function-to-pointer conversion does not take place (see f_func above), but it does take place when initializing a reference to function pointer:

int FuncX() { return 42; }
int (*const &pf_func)() = FuncX; // same as int (*const &pf_func)() = &FuncX;
int (* &&pf_func2)() = FuncX;

The declaration of the form:

<Type>&& <Name>

where <Type> is a type and <Name> is an identifier is said to define an identifier whose type is rvalue reference to <Type>. Since the name of an rvalue reference is itself an lvalue, std::move must be used to pass an rvalue reference to a function overload accepting an rvalue reference parameter. An rvalue reference to a cv-unqualified template parameter of the function template in which it appears, or auto&& (except when deduced from a brace-enclosed initializer list), is called a forwarding reference (referred to as a "universal reference" in some older sources[2]) and can act as an lvalue or rvalue reference depending on what is passed to it.[3] When used as function parameters, forwarding references are often combined with std::forward to pass the argument on to another function while preserving the value category (lvalue or rvalue) it had when passed to the calling function.[4]
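The behaviour described above can be sketched as follows. The names Describe, Relay, RelayNamed and CheckRelay are hypothetical and chosen only for this illustration; the overloads simply report which value category they received.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overloads that report the value category of their argument.
std::string Describe(int&) { return "lvalue"; }
std::string Describe(int&&) { return "rvalue"; }

// T&& is a forwarding reference here because T is deduced from the argument.
template <typename T>
std::string Relay(T&& value) {
  // std::forward restores the value category the caller used.
  return Describe(std::forward<T>(value));
}

// A named rvalue reference is itself an lvalue, so without std::move
// the lvalue overload is selected.
std::string RelayNamed(int&& value) { return Describe(value); }

void CheckRelay() {
  int n = 1;
  assert(Relay(n) == "lvalue");       // T is deduced as int&
  assert(Relay(2) == "rvalue");       // T is deduced as int
  assert(RelayNamed(3) == "lvalue");  // the name "value" is an lvalue
}
```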

Types which are of kind "reference to <Type>" are sometimes called reference types. Identifiers which are of reference type are called reference variables. To call them variable, however, is in fact a misnomer, as we will see.

References are not objects and references can only refer to object or function types. Arrays of references, pointers to references and references to references are not allowed because they require object types. int& i[4], int&*i and int& &i will cause compilation errors (while int(& i)[4] (reference to an array) and int*&i (reference to a pointer) will not, assuming they are initialized). References to void are also ill-formed because void is not an object or function type, but references to void * can exist.

Declaring a reference itself as const or volatile (for example, int& volatile i) also fails unless the qualifier is applied through a typedef or decltype, in which case the const/volatile is ignored. However, if template argument deduction takes place and a reference type is deduced (which happens when forwarding references are used and an lvalue is passed to the function), or if a typedef, using or decltype denotes a reference type, it is possible to take a reference to that type. In that case the rule used to determine the resulting type is called reference collapsing and works as follows: given a type T and a type TR that is a reference to T, attempting to create an rvalue reference to TR yields TR, while attempting to create an lvalue reference to TR yields an lvalue reference to T. In other words, lvalue references override rvalue references, and rvalue references to rvalue references stay unchanged.

int i;
typedef int& LRI;
using RRI = int&&;

LRI& r1 = i; // r1 has the type int&
const LRI& r2 = i; // r2 has the type int&
const LRI&& r3 = i; // r3 has the type int&

RRI& r4 = i; // r4 has the type int&
RRI&& r5 = 5; // r5 has the type int&&

decltype(r2)& r6 = i; // r6 has the type int&
decltype(r2)&& r7 = i; // r7 has the type int&
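The collapsing rules stated in the comments above can also be checked at compile time. The following is a small sketch using std::is_same from <type_traits>; the aliases repeat LRI and RRI from the example so that the block stands on its own.

```cpp
#include <type_traits>

using LRI = int&;
using RRI = int&&;

// An lvalue reference to any reference type collapses to an lvalue reference.
static_assert(std::is_same<LRI&, int&>::value, "& applied to int& gives int&");
static_assert(std::is_same<RRI&, int&>::value, "& applied to int&& gives int&");

// An rvalue reference to TR yields TR unchanged.
static_assert(std::is_same<LRI&&, int&>::value, "&& applied to int& gives int&");
static_assert(std::is_same<RRI&&, int&&>::value, "&& applied to int&& gives int&&");

// const applied to a reference type through an alias is ignored.
static_assert(std::is_same<const LRI&, int&>::value, "const is dropped");
```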

A non-static member function can be declared with a ref qualifier. This qualifier participates in overload resolution and applies to the implicit object parameter like const and volatile but unlike those two, it does not change the properties of this. What it does is mandate that the function be called on an lvalue or rvalue instance of the class.

#include <iostream>

struct A
{
  A() = default;
  void Print()const& { std::cout << "lvalue\n"; }
  void Print()const&& { std::cout << "rvalue\n"; }
};

int main()
{
    A a;
    a.Print();            // prints "lvalue"
    std::move(a).Print(); // prints "rvalue"
    A().Print();          // prints "rvalue"
    A&& b = std::move(a);
    b.Print();            // prints "lvalue"(!)
}

Relationship to pointers


C++ references differ from pointers in several essential ways:

  • A reference itself is not an object, it is an alias; any occurrence of its name refers directly to the object it references. A pointer declaration creates a pointer object which is distinct from the object the pointer refers to.
    • Containers of references are not allowed because references are not objects, while containers of pointers are commonplace for polymorphism.
    • It is not allowed to create a reference of reference, because references can only refer to objects (or functions).
  • Once a reference is created, it cannot be later made to reference another object; it cannot be reseated. This is often done with pointers.
  • References cannot be null, whereas pointers can; every reference refers to some object, although it may or may not be valid.
  • References cannot be uninitialized. Because it is impossible to reinitialize a reference, they must be initialized as soon as they are created. In particular, local and global variables must be initialized where they are defined, and references which are data members of class instances must be initialized in the initializer list of the class's constructor. For example:

    int& k; // compiler will complain: error: `k' declared as reference but not initialized
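For the data member case, a minimal sketch follows. The class Counter and the function AddTwice are hypothetical names introduced only to show a reference member being bound in the constructor's initializer list.

```cpp
#include <cassert>

// Hypothetical class used only for illustration.
class Counter {
 public:
  // The reference member must be bound in the member initializer list;
  // leaving it out of the list would not compile.
  explicit Counter(int& total) : total_(total) {}

  // Writes through the reference to the object it was bound to.
  void Add(int amount) { total_ += amount; }

 private:
  int& total_;
};

int AddTwice(int start, int amount) {
  int total = start;
  Counter counter(total);  // counter now aliases the local variable total
  counter.Add(amount);
  counter.Add(amount);
  assert(total == start + 2 * amount);
  return total;
}
```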
    

There is a simple conversion between pointers and references: the address-of operator (&) will yield a pointer referring to the same object when applied to a reference, and a reference which is initialized from the dereference (*) of a pointer value will refer to the same object as that pointer, where this is possible without invoking undefined behavior. This equivalence is a reflection of the typical implementation, which effectively compiles references into pointers which are implicitly dereferenced at each use. Though that is usually the case, the C++ Standard does not force compilers to implement references using pointers.
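The conversion in both directions can be sketched as follows; the function name SameObject is hypothetical.

```cpp
#include <cassert>

bool SameObject() {
  int value = 7;
  int& ref = value;   // ref is an alias for value
  int* ptr = &ref;    // address-of applied to a reference yields the referent's address
  int& again = *ptr;  // a reference initialized from *ptr refers to the same object
  again = 8;          // modifies value through the second reference
  assert(value == 8);
  return ptr == &value && &again == &value;
}
```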

A consequence of this is that in many implementations, operating on a variable with automatic or static lifetime through a reference, although syntactically similar to accessing it directly, can involve hidden dereference operations that are costly.

Also, because the operations on references are so limited, they are much easier to understand than pointers and are more resistant to errors. While pointers can be made invalid through a variety of mechanisms, ranging from carrying a null value to out-of-bounds arithmetic to illegal casts to producing them from arbitrary integers, a previously valid reference only becomes invalid in two cases:

  • If it refers to an object with automatic allocation which goes out of scope,
  • If it refers to an object inside a block of dynamic memory which has been freed.

The first is easy to detect automatically if the reference has static scoping, but is still a problem if the reference is a member of a dynamically allocated object; the second is more difficult to detect. These are the only concerns with references, and are suitably addressed by a reasonable allocation policy.

Uses of references

  • Other than just a helpful replacement for pointers, one convenient application of references is in function parameter lists, where they allow passing of parameters used for output with no explicit address-taking by the caller. For example:
void Square(int x, int& out_result) {
  out_result = x * x;
}

Then, the following call would place 9 in y:

int y;
Square(3, y);

However, the following call would give a compiler error, since lvalue reference parameters not qualified with const can only be bound to lvalues, and the literal 6 is not one:

Square(3, 6);
  • Returning an lvalue reference allows function calls to be assigned to:
    int& Preinc(int& x) {
      return ++x;  // "return x++;" would have been wrong
    }
    
    Preinc(y) = 5;  // same as ++y, y = 5
    
  • In many implementations, normal parameter-passing mechanisms often imply an expensive copy operation for large parameters. References qualified with const are a useful way of passing large objects between functions that avoids this overhead:
    void FSlow(BigObject x) { /* ... */ }  
    void FFast(const BigObject& x) { /* ... */ }
    
    BigObject y;
    
    FSlow(y);  // Slow, copies y to parameter x.
    FFast(y);  // Fast, gives direct read-only access to y.
    

If FFast actually requires its own copy of x that it can modify, it must create a copy explicitly. While the same technique could be applied using pointers, this would involve modifying every call site of the function to add cumbersome address-of (&) operators to the argument, and would be equally difficult to undo, if the object became smaller later on.
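A rough sketch of both points, under stated assumptions: BigObject here is a hypothetical stand-in holding a std::vector<int>, and SumAfterDoubling and SumViaPointer are illustrative names that do not appear in the original example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the BigObject type used above.
struct BigObject {
  std::vector<int> data;
};

// Takes the argument by reference to const, then copies it explicitly
// because the function needs an object it can modify.
int SumAfterDoubling(const BigObject& x) {
  BigObject local = x;  // explicit copy; the caller's object is untouched
  int sum = 0;
  for (int& item : local.data) {
    item *= 2;
    sum += item;
  }
  return sum;
}

// Pointer-based alternative: every call site has to write &y.
int SumViaPointer(const BigObject* x) {
  assert(x != nullptr);  // unlike a reference, a pointer may be null
  int sum = 0;
  for (int item : x->data) sum += item;
  return sum;
}
```

With a BigObject named y, the calls would read SumAfterDoubling(y) and SumViaPointer(&y); only the second requires the address-of operator at the call site.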

Polymorphic behavior


Continuing the relationship between references and pointers (in C++ context), the former exhibit polymorphic capabilities, as one might expect:

#include <iostream>

class A {
 public:
  A() = default;
  virtual void Print() { std::cout << "This is class A\n"; }
};

class B : public A {
 public:
  B() = default;
  void Print() override { std::cout << "This is class B\n"; }
};

int main() {
  A a;
  A& ref_to_a = a;

  B b;
  A& ref_to_b = b;

  ref_to_a.Print();
  ref_to_b.Print();
}

The source above is valid C++ and generates the following output:

This is class A

This is class B

References

  1. ^ ISO/IEC 14882, clause 9.3.3.2, paragraph 1.
  2. ^ Sutter, Herb; Stroustrup, Bjarne; Dos Reis, Gabriel. "Forwarding References" (PDF).
  3. ^ ISO/IEC 14882, clause 13.10.2.1, paragraph 3.
  4. ^ Becker, Thomas. "C++ Rvalue References Explained". Retrieved 2022-11-25.