Programmers, particularly those with a mathematical background, often use words from mathematics when describing their systems. Unfortunately, they also often do so without consideration of the non-mathematical background of their colleagues. I thought I might talk today a bit about the word “idempotent”. This is a very easy concept but lots of people don’t know that there is a word for it.

There are two closely related *mathematical* definitions for “idempotent. First, a **value** is “*idempotent under function foo*” if the result of doing foo to the value results in the value right back again. Another common way to say this is that the value is a *fixpoint* of foo.

Second, a **function** is “idempotent” if the result of doing it twice — where we are feeding the output of the first call into the second call — is exactly the same as the result of doing it once. Or, in other words, *every output of the function is idempotent under it*. Or in even more other words, *every output of an idempotent function is a fixpoint of the function.*

The most trivial mathematical example of the second kind is the *constant function*. If f(x) = c, then clearly f(x) = f(f(x)) = f(f(f(x))) … so f is idempotent (and the constant is idempotent under it).

The identity function f(x) = x is also idempotent; I’m sure you see why.

The function that takes a finite set of numbers and returns a set containing only its largest element is idempotent (and every one-element set is idempotent under it). I’m sure you can think of lots of examples of idempotent functions.

To get a handle on the other sense, pick an operation — say, doubling. The only value which is idempotent under that operation is zero. The operation “subtracting any non-zero value” has no idempotent values. Squaring has two idempotent values, zero and one.

The way programmers use “idempotent” is slightly different, but this concept comes up all the time in practical programming, particularly around caching logic. Usually when used in the computer science sense we don’t mean that the effect of the function is invariant under *composition of the function with itself*, but rather that it is* invariant over any number of calls greater than one*. For example, I don’t know how many times I’ve written:

HRESULT GetTypeLibCreator(ICreateTypeLib2 ** ppctl) { if (this->m_pctl == NULL) { HRESULT hr; hr = CreateTypeLib2(SYS_WIN32, pszName, &this->m_pctl); if (FAILED(hr)) return hr; } *ppctl = this->m_pctl; this->m_pctl->AddRef(); return S_OK; }

A nice little idempotent function — calling it two, three, or any other number of times with the same arguments has exactly the same result as calling it once.

The place you see the other characterization of idempotence all the time is in C++ header files. Include a needed header zero times and you’ll get “not defined” errors. Accidentally include it twice and you’ll get “redefinition” errors. It’s a major pain to make sure that every header file is included exactly once. Therefore, most headers use some trick to make them idempotent under the inclusion operation:

#ifndef STUFF_H_INCLUDED #define STUFF_H_INCLUDED // headers here #endif // STUFF_H_INCLUDED

or in more modern systems, the “#pragma once” directive makes headers idempotent under inclusion.

Great Post. One element that is often overlooked (granted a niche concern, but when it applies can be critical) is idempotence in the time domain. Sure you will get back the same value, but will it take the same time to calculate.

In real-time application (which are distinct from low-latency) getting this part right is often the hardest part especially with modern languages, processors, etc….

Pingback: Life, part 12 | Fabulous adventures in coding