Reference (computer science)

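In computer programming, a reference is a value that enables a program to indirectly access a particular datum, such as a variable's value or a record. Accessing the datum is called dereferencing the reference. A reference is distinct from the datum itself.

A reference is an abstract data type and may be implemented in many ways: as a memory address (a pointer), an index into an array or table, an operating system handle, a physical address on a storage device, or a network address such as a URL.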

Formal representation

A reference R is a value that admits one operation, dereference(R), which yields a value. Usually the reference is typed so that it returns values of a specific type, e.g.:^[1]^[2]

interface Reference<T> {
  T value();
}

Often the reference also admits an assignment operation store(R, x), meaning it is an abstract variable.^[1]
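The two operations can be modeled in a few lines of Python. This is only a sketch: the class name Ref and the method names dereference and store are hypothetical, chosen to mirror the notation above rather than any library API.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Ref(Generic[T]):
    """Hypothetical reference cell: one slot that can be read and overwritten."""

    def __init__(self, value: T) -> None:
        self._value = value

    def dereference(self) -> T:
        # dereference(R): yield the value the reference currently leads to
        return self._value

    def store(self, value: T) -> None:
        # store(R, x): make later dereferences yield x
        self._value = value


r = Ref(10)
alias = r                 # a second name for the same reference, not a copy
alias.store(42)           # storing through one name...
result = r.dereference()  # ...is visible through the other
```

Because alias and r are the same reference, the store made through alias is observed when r is dereferenced.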

Use

References are widely used in programming. They are the basis of indirect addressing and of many linked data structures, such as linked lists. References increase flexibility in where objects can be stored, how they are allocated, and how they are passed between areas of code. As long as one can access a reference to the data, one can access the data through it, and the data itself need not be moved. References also make it easier to share data between different areas of code, since each area keeps its own reference to the data.
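Sharing through references can be sketched in Python as follows. The function and variable names are hypothetical; the point is only that both parts of the program hold a reference to one list and nothing is copied.

```python
shared_log = []  # the data lives in one place


def record_event(log, message):
    # The producer appends through its reference to the list.
    log.append(message)


def count_events(log):
    # The consumer reads through its own reference to the same list.
    return len(log)


producer_view = shared_log
consumer_view = shared_log

record_event(producer_view, "started")
record_event(producer_view, "finished")
seen = count_events(consumer_view)
```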

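References can cause significant complexity in a program, partially due to the possibility of dangling and wild references. Opaque references are nonetheless simpler to analyze than raw pointers, because they do not allow pointer arithmetic.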

The mechanism of references, if varying in implementation, is a fundamental programming language feature common to nearly all modern programming languages. Even some languages that support no direct use of references have some internal or implicit use. For example, the call by reference calling convention can be implemented with either explicit or implicit use of references.

Examples

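Pointers are the most primitive type of reference. Inappropriate use of pointers can lead to undefined behavior, particularly due to dangling pointers or wild pointers. Smart pointers are opaque data structures that act like pointers but can only be accessed through particular methods.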

A file handle is a type of reference (for example, a FILE * in the C standard I/O library), used to abstract file content. It usually represents both the file itself, as when requesting a lock on the file, and a specific position within the file's content, as when reading a file.

In distributed computing, a reference may contain more than an address or identifier. For example, a WSDL description of a remote web service can be viewed as a form of reference; it includes a complete specification of how to locate and bind to a particular web service. A reference to a live distributed object is another example: it is a complete specification for how to construct a small software component called a proxy, which then engages in a peer-to-peer interaction and through which the local machine may gain access to data that is replicated or exists only as a weakly consistent message stream. In all these cases, the reference includes the full set of instructions, or a recipe, for how to access the data; in this sense, it serves the same purpose as an identifier or address in memory.

If we have a set of keys K and a set of data objects D, any well-defined (single-valued) function from K to D ∪ {null} defines a type of reference, where null is the image of a key not referring to anything meaningful.
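A lookup table is perhaps the simplest concrete instance of such a function. In the Python sketch below, the table contents and the function name dereference are hypothetical, and None plays the role of null.

```python
from typing import Optional

# Hypothetical key set K (strings) and data objects D (small records).
table = {
    "alice": {"age": 30},
    "bob": {"age": 25},
}


def dereference(key: str) -> Optional[dict]:
    # Single-valued: each key yields exactly one object,
    # or None when the key refers to nothing meaningful.
    return table.get(key)


found = dereference("alice")
missing = dereference("carol")
```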

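An alternative representation of such a function is a directed graph called a reachability graph, in which each datum is a vertex and an edge runs from u to v if the datum in u refers to the datum in v. Such graphs are useful in garbage collection, where they can be used to separate accessible from inaccessible objects.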
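Treating references as edges of a directed graph, the objects that cannot be reached from a set of starting points are the inaccessible ones. The Python sketch below assumes a hypothetical refs mapping from each object name to the names it refers to; unlike the key function above, an object here may hold several references.

```python
def reachable(roots, refs):
    """Return the set of objects reachable from roots by following references."""
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        # Follow every outgoing reference of this object.
        stack.extend(refs.get(node, ()))
    return seen


# Hypothetical object graph: "d" and "e" are not referred to by anything reachable from "a".
refs = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"], "e": ["e"]}

accessible = reachable(["a"], refs)
inaccessible = set(refs) - accessible
```

Note that "e" refers to itself yet is still inaccessible, which is the kind of cycle that a pure reference count cannot detect.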

External and internal storage

In many data structures, large, complex objects are composed of smaller objects. These objects are typically stored in one of two ways:

With internal storage, the contents of the smaller object are stored inside the larger object.
With external storage, the smaller objects are allocated in their own location, and the larger object only stores references to them.

Internal storage is usually more efficient, because there is a space cost for the references and dynamic allocation metadata, and a time cost associated with dereferencing a reference and with allocating the memory for the smaller objects. Internal storage also enhances locality of reference by keeping different parts of the same large object close together in memory. However, there are a variety of situations in which external storage is preferred:

If the data structure is recursive, meaning it may contain itself. This cannot be represented in the internal way.
If the larger object is being stored in an area with limited space, such as the stack, then we can prevent running out of storage by storing large component objects in another memory region and referring to them using references.
If the smaller objects may vary in size, it is often inconvenient or expensive to resize the larger object so that it can still contain them.
References are often easier to work with and adapt better to new requirements.

Some languages, such as Java, Smalltalk, Python, and Scheme, do not support internal storage. In these languages, all objects are uniformly accessed through references.
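A recursive structure such as a linked list illustrates external storage: each node stores only a reference to the next node rather than the node itself. The Python sketch below uses a hypothetical Node class and helper function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None  # reference to another Node, or None


def to_list(head):
    """Walk the chain of references and collect the values."""
    out = []
    node = head
    while node is not None:
        out.append(node.value)
        node = node.next
    return out


head = Node(1, Node(2, Node(3)))
values = to_list(head)

# Inserting a node only rewires one reference; no existing node is moved or resized.
head.next = Node(99, head.next)
values_after = to_list(head)
```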

Language support

Assembly

In assembly language, it is typical to express references using either raw memory addresses or indexes into tables. These work, but are somewhat tricky to use, because an address tells you nothing about the value it points to, not even how large it is or how to interpret it; such information is encoded in the program logic. The result is that misinterpretations can occur in incorrect programs, causing bewildering errors.
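The problem can be imitated in Python with the standard struct module, using a byte array as a toy memory and an integer index as the address. The layout and variable names are assumptions made for this sketch.

```python
import struct

memory = bytearray(16)  # toy "memory"; an address is just an index into it
address = 4

# Store a 32-bit little-endian float at the address.
struct.pack_into("<f", memory, address, 1.0)

# The address alone does not say how to read the bytes back.
as_float = struct.unpack_from("<f", memory, address)[0]  # intended interpretation
as_uint = struct.unpack_from("<I", memory, address)[0]   # same bytes read as an unsigned integer
```

Both reads succeed, but only one matches what was stored; nothing in the address itself distinguishes them.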

Lisp

One of the earliest opaque references was that of the Lisp language cons cell, which is simply a record containing two references to other Lisp objects, including possibly other cons cells. This simple structure is most commonly used to build singly linked lists, but can also be used to build simple binary trees and so-called "dotted lists", which terminate not with a null reference but a value.
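The structure can be imitated in Python for illustration. The Cons class, the NIL name, and the elements helper are hypothetical stand-ins, not Lisp itself.

```python
from typing import Any, NamedTuple


class Cons(NamedTuple):
    car: Any  # reference to the first object
    cdr: Any  # reference to the second object, often another Cons


NIL = None  # stand-in for the null reference that ends a proper list

proper = Cons(1, Cons(2, Cons(3, NIL)))  # the list (1 2 3)
dotted = Cons(1, Cons(2, 3))             # the dotted list (1 2 . 3)


def elements(cell):
    """Collect the car of each cell; also return whatever ends the chain."""
    items = []
    while isinstance(cell, Cons):
        items.append(cell.car)
        cell = cell.cdr
    return items, cell


proper_items, proper_end = elements(proper)
dotted_items, dotted_end = elements(dotted)
```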

C/C++

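The pointer is the main form of reference in C. Its successor C++ tried to increase the type safety of pointers with new cast operators, reference types, and smart pointers in its standard library, but still retained the ability to circumvent these safety mechanisms for compatibility.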

Fortran

Fortran does not have an explicit representation of references, but does use them implicitly in its call-by-reference calling semantics. A Fortran reference is best thought of as an alias of another object, such as a scalar variable or a row or column of an array. There is no syntax to dereference the reference or manipulate the contents of the referent directly. Fortran references can be null. As in other languages, these references facilitate the processing of dynamic structures, such as linked lists, queues, and trees.

Object-oriented languages

A number of object-oriented languages such as Eiffel, Java, C#, and Visual Basic have adopted a much more opaque type of reference, usually referred to as simply a reference. These references have types like C pointers indicating how to interpret the data they reference, but they are typesafe in that they cannot be interpreted as a raw address and unsafe conversions are not permitted. References are extensively used to access and assign objects. References are also used in function/method calls or message passing, and reference counts are frequently used to perform garbage collection of unused objects.

Functional languages

In Standard ML, OCaml, and many other functional languages, most values are persistent: they cannot be modified by assignment. Assignable reference cells provide mutable variables, data that can be modified. Such reference cells can hold any value, and so are given the polymorphic type α ref, where α is to be replaced with the type of value pointed to. These mutable references can be pointed to different objects over their lifetime. For example, this permits building of circular data structures. The reference cell is functionally equivalent to a mutable array of length 1.
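The equivalence with a mutable array of length 1 can be sketched in Python, where a one-element list stands in for the reference cell. This is an analogy under that assumption, not ML code, and the names are hypothetical.

```python
def make_counter():
    cell = [0]  # a one-element list used as a reference cell

    def increment():
        cell[0] = cell[0] + 1  # assign through the cell
        return cell[0]

    return increment


counter = make_counter()
first = counter()
second = counter()

# A circular structure: create the record first, then point its cell back at the record.
node = {"name": "loop", "next": [None]}
node["next"][0] = node
```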

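To preserve safety and efficient implementations, references cannot be type-cast in ML, nor can pointer arithmetic be performed. In the functional paradigm, many structures that would be represented using pointers in a language like C are instead represented using other facilities, such as the algebraic datatype mechanism. The programmer is then able to enjoy certain properties (such as the guarantee of immutability) while programming, even though the compiler often uses machine pointers "under the hood".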

Perl/PHP

Perl supports hard references, which function similarly to those in other languages, and symbolic references, which are just string values that contain the names of variables. When a value that is not a hard reference is dereferenced, Perl treats it as a symbolic reference and resolves it to the variable whose name is given by the value.^[3] PHP has a similar feature in the form of its $$var syntax.^[4]
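The idea of a symbolic reference can be approximated in Python by looking up a name in a table at the time of use. This is an analogy only, not Perl or PHP code; the namespace dictionary and the function name are hypothetical.

```python
namespace = {"color": "red", "size": 10}  # hypothetical table of variables by name


def deref_symbolic(name):
    # A symbolic reference is only a string; it is resolved by name when used.
    return namespace[name]


symbolic = "color"
before = deref_symbolic(symbolic)

namespace["color"] = "blue"  # rebinding the name changes what the symbolic reference yields
after = deref_symbolic(symbolic)
```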

Python

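In Python, a variable is a name bound to an object, so every variable holds a reference. The built-in function id(var) returns the identity of the object that var refers to; in the CPython implementation this is the object's memory address. Python's references behave differently from references in C++: assignment rebinds a name to an object rather than copying a value into it.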

A constant (for example, the integer 2) is an object in Python. When a programmer executes a = 2, Python binds the name a to the object for the constant 2. Assigning a new number to a binds it to a different object, so changing the value of a numeric variable changes its reference. For mutable types (for example, a list), changing the contents in place does not change the variable's reference.
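The difference between rebinding and in-place mutation can be observed with id and the is operator. The variable names below are illustrative.

```python
a = 2
old = a    # keep a second reference to the original int object
a = a + 1  # binds a to a different int object
rebinding_changed_object = a is not old

items = [1, 2]
list_id_before = id(items)
items.append(3)  # mutates the same list object in place
mutation_kept_object = id(items) == list_id_before
```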

When a variable is passed as a function argument, the function receives a reference to the object, not a copy of its value. For mutable types, this means the function can easily modify the caller's object. For immutable types, assigning a new value inside the function only rebinds the function's local name to a different object, so the caller's variable is unchanged.
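A short sketch of both cases, with hypothetical function names:

```python
def append_item(seq):
    seq.append(4)  # mutates the object the caller also refers to


def rebind_number(n):
    n = n + 1      # rebinds only the local name n
    return n


numbers = [1, 2, 3]
append_item(numbers)

count = 10
returned = rebind_number(count)
```

After the calls, the caller's list has gained an element, while the caller's integer variable still refers to its original value.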

References

^ Sherman. ISBN 978-3-540-15212-5.

^ "Reference (Java Platform SE 7)". docs.oracle.com. Retrieved 10 May 2022.
^ "perlref". perldoc.perl.org. Retrieved 2013-08-19.
^ "Variable variables - Manual". PHP. Retrieved 2013-08-19.

External links

Pointer Fun With Binky Introduction to pointers in a 3-minute educational video – Stanford Computer Science Education Library
