# THIS BLOG HAS MOVED

Hello all!

The Components Programming blog has moved to a new domain.

The Wordreference site will no longer be updated.

See you there!

Fernando.
http://componentsprogramming.com

# Writing min function, part 4: Const-Correctness

This is the fourth article of the series called “Writing min function”.

I still have to solve two mistakes made in the code of the previous posts. One of them is C++ specific, and the other it is a mistake that could be made in any programming language.

Originally, I wanted to address the two mistakes in the same article, but when I started writing about the first one, the article became longer than I wanted, so I will address each mistake in separate posts.

Therefore, this article will deal with an issue that only concerns to C++ programmers, so if C++ is not of your interest, feel free to jump directly to the fifth article.

In general, I want that the algorithms written in my articles are not tied to a specific programming language, but in order to write the algorithms in a practical way, I have to do it in a real language, and C++ is my choice.
Once the mistakes have been corrected, with the help of other people, I’ll start writing the min function in other programming languages.

So, to start, I’ll update the usage code example to reflect things I would like to do using the min function:

void raise_salary(employee& e, float increment) {
e.salary += increment;
}


The raise_salary function is very simple, takes a reference (non-const) to an employee and increases her salary by increment. I would like to use it in the following way:

void usage_with_mutable_objects() {
employee e1 {1, "John", 5'000.0f};
employee e2 {3, "George", 4'500.0f};

raise_salary(
min(e1, e2, salary_comparator{}),
500.0f
); //#1 Compile-time error
}


(employee and salary_comparator were defined in Part 3)

In #1 we get a compile-time error, but, why?
Remember our last version of the min function:

template <TotallyOrdered T>
T const& min(T const& a, T const& b) {
if (a < b) return a;
return b;
}

template <typename T, StrictWeakOrdering Cmp>
requires SameType<ArgumentType<Cmp>, T>
T const& min(T const& a, T const& b, Cmp cmp) {
if (cmp(a, b)) return a;
return b;
}


(Note: When I refer to the min function, actually I am not referring to a single function, but to a family of functions)

It takes two constant-references to two objects (and a comparator); and returns a constant reference to one of the objects. That means that the returned reference cannot be modified, because it is returned as const.
The raise_salary function tries to modify the object returned by the min function, causing the compilation-time error, because you can’t modify something declared as const.

If I manually “inliniarize” the code of the functions min and raise_salary I would get something like:

// manually-inliniarized code

employee e1 {1, "John", 5'000.0f};
employee e2 {3, "George", 4'500.0f};

employee const& m = e1 < e2 ? e1 : e2;
m.salary += 500.0f; //Compile-time error


In the last code, it is more clear that we are trying to modify something that is constant.
The variable m is a constant-reference pointing (or rather, referencing) to e1 or to e2.
Both e1 and e2 were not declared as constant, so they are mutable objects.
Then,  why the min function returns a constant reference and not just an ordinary (non-const) reference?
Consider the following variation of a wrong min function:

//Obs: It is not valid C++ code (ill-formed code)
template <typename T, StrictWeakOrdering Cmp>
requires SameType<ArgumentType<Cmp>, T>
T& wrong_min(T const& a, T const& b, Cmp cmp) {
if (cmp(a, b)) return a;
return b;
}


It is not valid C++ code (or using Standard C++ terminology, it is ill-formed code), it doesn’t compiles, because we are trying to return something that is constant as non-constant. And what’s up with that?
Well, suppose that the wrong_min function compiles, in a fictitious programming language, then consider the following usage code:

void usage_wrong_min() {
employee const e1 {1, "John", 5'000.0f};
employee const e2 {3, "George", 4'500.0f};
//...


The two employees are declared as const objects, meaning that they are immutable objects, so, they cannot be modified. Then consider the folliwing usage of the wrong_min function.

//...
employee& m = wrong_min(e1, e2, salary_comparator{});
//...


Because, wrong_min function returns a non-const reference, we can assign the result of the function to a non-const reference, and then we could modify the object referenced by m:

//...
m.salary += 500.0f; //#2
}


Remember that the wrong_min function is not valid, but, if it were valid, we would be in trouble, since we would be modifying an object that can not be modified, because it is constant or immutable.

You might wonder: “But… they are objects, so they reside in memory, and memory can be modified,… what is the problem with that? ”

According to the C++ Standard [1], “…any attempt to modify a const object during its lifetime (3.8) results in undefined behavior.
In practice, a const object could be placed by the compiler (and the operating system) in a read-only segment of memory. Any attempt to write to read-only memory causes a segmentation fault, probably causing an abnormal termination of the process (program crash).

That is the reason why C++ protects us from writing things like wrong_min.

Returning to the min function, then, why not remove const from everywhere?
Let’s see:

template <TotallyOrdered T>
T& min(T& a, T& b) {
if (a < b) return a;
return b;
}


If we wrote the min function as above, then we would have problems with the following code:

int m = min(5, 7); // Compile-time error


The code above doesn’t compile. Why?
In C++, the values 5 and 7 are called integer-literal, more specifically they are called decimal integer literal (base ten). Literals are something that can’t be modified, they are constants. So to accept the code above we have to return to our original min function, using const everywhere.
But, remember, we want our min function also allows us to work with mutable objects.

So, we want a function that:

1. Returns a constant-reference if any of its parameters is a constant object, or,
2. Returns a non-constant-reference if both parameters are non-constant objects.

The simplest way to do it in C++ is a little verbose, but it works:

template <TotallyOrdered T>
T const& min(T const& a, T const& b) {
if (a < b) return a;
return b;
}

template <TotallyOrdered T>
T& min(T& a, T& b) {
if (a < b) return a;
return b;
}


We have to repeat the code, the first function covers the case 1 and the second one covers the case 2. Maybe in a future version of C++ this can be done without the need for repetition of code.

So I’m going to write the full family of min functions we have so far:

template <typename T, StrictWeakOrdering Cmp>
requires SameType<ArgumentType<Cmp>, T>
T const& min(T const& a, T const& b, Cmp cmp) {
if (cmp(a, b)) return a;
return b;
}

template <typename T, StrictWeakOrdering Cmp>
requires SameType<ArgumentType<Cmp>, T>
T& min(T& a, T& b, Cmp cmp) {
if (cmp(a, b)) return a;
return b;
}

template <TotallyOrdered T>
T const& min(T const& a, T const& b) {
if (a < b) return a;
return b;
}

template <TotallyOrdered T>
T& min(T& a, T& b) {
if (a < b) return a;
return b;
}


Remember that in the following article I will address the last outstanding issue. So do not take the above code as totally definitive.

And, with the code above, we can write things like the following:

employee const e1 {1, "John", 5'000.0f};
employee const e2 {2, "Peter", 6'000.0f};
employee e3       {3, "George", 4'500.0f};
employee e4       {4, "Frank", 5'000.0f};

employee const& m1 = min(e1, e2, salary_comparator{});
employee const& m2 = min(e1, e3, salary_comparator{});
employee const& m3 = min(e4, e2, salary_comparator{});

min(e3, e4, salary_comparator{}).salary += 500.0f;
//or
raise_salary(
min(e3, e4, salary_comparator{}), 500.0f
);


Finally, I wrote another version of the min function that looks simpler because it prevents writing the same code twice, but it uses some template-metaprogramming (TMP) hacks.

The code looks like the following:

template <TotallyOrdered T, TotallyOrdered U>
requires SameType<T, U>
HERE_SOME_MAGIC min(T&& a, U&& b) {
if (a < b) return (HERE_SOME_MAGIC)a;
return (HERE_SOME_MAGIC)b;
}


Then, I presented the code to Alex, because I wanted to know what he thought, and he answered me with the following advice:

– Alex Stepanov: “My standard advice: concentrate on algorithms and data structures. Do not spend time on meta-programming. If it is not in Knuth (or something like Knuth), do not do it.

At first I did not understand the reason for his answer, but then, after spending hours focusing on details of the template-metaprogramming code, I understood that it is better to focus on real problems, which are the algorithms.

Even so, if you want to take a look at the code, write me by private and I will gladly share it (See About page).

I didn’t want to make an article explaining Const-Correctness so long, because there are countless books and articles on the Internet [2] that explain the topic better than me. But as I tried to explain some things, I thought it was necessary to extend the explanation, which ended in this article.

## References

[1] Latest publicly available C++ Draft Standard (n3797) (at September 2014) [dcl.type.cv] paragraph 4:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3797.pdf

[2] For better references about Const-Correctness and ROM-ability, see:

# Writing min function, part 3: Weakening the ordering

This is the third article of the series called “Writing min function”.

Now we understand what Concepts are (do we?), I will try to complete the min function.

What do I mean with “complete”?

Well, we need to see how to use our min function, but first, let’s write the min function in a real programming language.

From now on I will write code in C++. Why C++?, it is a topic for another article. But don’t worry, I will try to analyze all the algorithms in several programming languages in the following articles.

Here is the code of the min function from part 2:

//Note 1: this code is Concept-enabled C++.
//        Later we will see how to use it.
//Note 2: Still has errors!
template <TotallyOrdered T>
T const& min(T const& a, T const& b) {
if (a < b) return a;
return b;
}


(We will see how to define the TotallyOrdered concept in C++ later. For now, think in the mathematical definition presented before)

And we use it in this way:

void usage_with_builtin_types_simple() {
int a = 12;
int b = 34;

//using variables
cout << min(a, b); //print the result on standard output

//using integer-literals (base10)
cout << min(1, 2);

//using variables and literals
cout << min(a, 2);

//assigning the result (copy)
int m1 = min(a, b);
int m2 = min(1, 2);
}


These are very simple examples using the int builtin type, but our min function is supposed to be Generic, so it has to operate with all types that satisfy the TotallyOrdered concept. So let’s see a little more complicated example.

Suppose we maintain the employee database of a company and we want to take two employees and know which is the minimum of the two. Let’s see it on code. Here a simplified Employee class written in C++:

struct employee { string name; float salary; };


And now we use the min function with Employees:

void usage_with_employees() {
employee e1 {"John", 5000.0f};
employee e2 {"Peter", 6000.0f};
employee e3 {"George", 4500.0f};
employee e4 {"Frank", 5000.0f};

employee m = min(e1, e2); // ???
}


What happend at line 7?

Well, we should get a compile-time error saying that the employee type doesn’t satisfy the TotallyOrdered concept.

Why?

First, syntactically there is no way of comparing two employees using the less-than-operator required by the TotallyOrdered concept.
If we use C++ templates without Concepts (or duck-typing templates) we will get a compile-time error pointing to the min function, saying that a < b could not be done. Instead, if we use a dynamic duck-typing language, like Python, we will get a similar error but at runtime.
The compiler (or interpreter) doesn’t know how to do a < b for employees, so this is the reason why we get the error.

So, how to make my Employee type to satisfy the TotallyOrdered concept?

Let’s start satisfying the requirements imposed by the concept:

1. Employee must satisfy the Regular concept.
2. We have to provide an operator< with the signature: Employee x Employee -> bool
3. The operator< must be a total ordering relation.

For now I want to skip the points 1 and 3, we will see them in another article. So let’s concentrate on point 2, which is a syntactic requirement, that specify that we have to provide the operator<, so let’s do it:

bool operator<(employee const& a, employee const& b) {
return ???????;
}


This is the canonical way on C++ for implementing a less-than-operator, but … What should I put on line 2?

Actually, I don’t know. That answer should be given by the designer of the Employee class. Well…, that’s me (?).

First, remember, total ordering, is, roughly speaking, some kind of Natural Ordering. So we need to know what is the natural ordering of Employees.
Maybe the natural ordering of employees is by name, maybe by salary, … I don’t know. This depends on the domain of the application. In a company, I think, employees have a unique identification number, maybe that is a good candidate to implement total ordering. So, let’s modify our Employee class:

struct employee { int id; string name; float salary; };


Now, let’s finalize our less-than-operator:

bool operator<(employee const& a, employee const& b) {
return a.id < b.id;
}


Now, we have an Employee class with a natural ordering, by id, that satisfies the TotallyOrdered concept (remember, we are ignoring the points 1 and 3).

But, what if we want to know who is the lowest paid employee, and then raise his salary. Should we modify the less-than-operator to compare by salary?

bool operator<(employee const& a, employee const& b) {
return a.salary < b.salary;
}


Is it OK?
No, we are imposing a default un-natural ordering to employee’s, it is not the right way to do it.

Changing the Employee’s natural ordering is not an option, so, we need another min function, one that takes a relation as parameter.
Let’s do it in the old-unconstrained-way (wrong way?), that is, without Concepts:

//Note: It compiles, but is incorrect, still, be patient!
template <typename T, typename Comparator>
T const& min(T const& a, T const& b, Comparator cmp) {
if (cmp(a, b)) return a;
return b;
}


Now we can write code like the following:

struct salary_comparator {
bool operator()(employee const& a, employee const& b) const {
return a.salary < b.salary;
}
};

void usage_with_employees() {
employee e1 {1, "John", 5000.0f};
employee e2 {2, "Peter", 6000.0f};
employee e3 {3, "George", 4500.0f};
employee e4 {4, "Frank", 5000.0f};

// using natural employee ordering (by id)
employee m = min(e1, e2);

// using another (unnatural) ordering
employee m2 = min(e1, e2, salary_comparator{});

// using another (unnatural) ordering, with lambdas
employee m3 = min(e1, e2, [](employee const& a, employee const& b){
return a.name < b.name; } );
}


But so far, I have not mentioned anything that an experienced programmer does not know, the use of predicates (comparators) is a common thing in practically all programming languages.
What most programmers (and existing implementations) forget is to specify the semantics of the predicate.

So, what are the semantic requirements?
What is Comparator?

Comparator is a Relation, that is, a binary Predicate. What kind of relation?
It is an ordering. What kind of ordering?
Is it a total ordering relation? Well, let’s check it:

Remember, a Relation r is a Strict Total Ordering if: For all a, b and c in the domain of the r, the following must hold:
Transitivity: if r(a, b) and r(b, c) then r(a, c)
Trichotomy: only one of the following holds, r(a, b), r(b, a) or a = b

It’s easy to prove that the transitivity axiom holds so I will skip it; but, what about trichotomy? Let’s prove it with an example. Given our previous defined employees:

employee e1 { 1, "John", 5000.0f };
employee e2 { 2, "Peter", 6000.0f };
employee e3 { 3, "George", 4500.0f };
employee e4 { 4, "Frank", 5000.0f };


And our salary_comparator, here called r in order to abbreviate it:

• Take e1 and e2: According to trichotomy: only one of the following holds, r(e1, e2), r(e2, e1) or e1 = e2
What is the result of r(e1, e2)? r(e1, e2) means e1.salary < e2.salary, which means: 5000 < 6000, so it holds. The other two propositions are false (intentionally omitted), so trichotomy holds in that case.
• Take e1 and e3: Following the same analysis,  r(e1, e3) means e1.salary < e3.salary, which means: 5000 < 4500, so it doesn’t hold. But r(e3, e1) means … 4500 < 5000 which is true, and the last proposition is false (again, intentionally omitted), so trichotomy holds.
• Now, take e1 and e4: Following the same analysis, r(e1, e4) and r(e4, e1) are both false, so, the proposition e1 = e4 have to be true if we want trichotomy holds.
Is e1 = e4 true?
No! because e1 is not equal to e4, they are differents employees, they are not the same.

Then, the trichotomy axiom does not hold, that is, the salary_comparator relation is not a Total Ordering on the Employee Set. This means that Total Ordering is too restrictive.

So, what kind of ordering relation should be our Comparator?

Partial Ordering? No, we saw in Part1 that Partial Ordering is too weak to define min.
We need something in between Partial and Total Ordering: what we need is called Weak Ordering[1].

Roughly speaking, weak ordering says that if r(a, b) and r(b, a) are false, then, a and b are equivalents.

So, let’s modify the min function to introduce weak ordering:

//Note: yes! you guess it, it is incorrect, still!
template <typename T, StrictWeakOrdering Cmp>
requires SameType<ArgumentType<Cmp>, T>
T const& min(T const& a, T const& b, Cmp cmp) {
if (cmp(a, b)) return a;
return b;
}


The code above means that we have a function called min, that takes two formal parameters, a and b, both of the same type, called T.
The funcion has a third formal parameter, cmp, that models the concept called StrictWeakOrdering. The “requires” clause means that T (the type of a and b) and the argument type of the Comparator (Cmp) must be the same.

Well, in this article I explained what Weak Ordering means and why it is important, I want to end it with a quote from Alex:
“Mathematicians are happy with Total and Partial ordering. But most of them don’t know what is Weak Ordering. It is not a common term in mathematics but it is essential in computer science, because when we want to order things, we want to order by something. For example by social security number, by name, by age”.

In the next article, finally, I will tell you what are the mistakes that remain to be addressed.

You can get the complete source code on my Github repository.

## References

[1] For a formal definition of Weak Ordering see:
http://www.elementsofprogramming.com/eop-concepts.pdf

# Writing min function, part 2: Understanding Concepts

This is the second article of the series called “Writing min function”.

I want complete the min function and fix the mistakes mentioned in the previous post. But first, we have to understand Concepts, so let’s review the last version of the function with its requirements.

// Requires:
//  The type of a is equal to the type of b, and it is called T,
//  and T is TotallyOrdered
min(a, b) {
if (a < b) return a
return b
}


Here we specify that:

• the formal parameters a and b are of the same type, and we called this type: T.
• T models the concept called TotallyOrdered.

“A type models a concept, if the requirements expressed by the concept are satisfied for this type”

So, let’s review the formal definition of the TotallyOrdered concept:

$TotallyOrdered(\texttt{T}) \triangleq \hspace{37mm}\texttt{//line1} \\ \hspace*{13mm} \texttt{Regular(T)} \hspace{50mm}\texttt{//line2} \\ \hspace*{7mm} \land <\texttt{: T x T} \rightarrow \textnormal{bool} \hspace{30mm}\texttt{//line3} \\ \hspace*{5mm} \land total\_ordering(<) \hspace{36mm}\texttt{//line4}$

• line1: A type T models the TotallyOrdered concept if:
• line2: The type T also has to model the Regular[1] concept. This means that TotallyOrdered is defined in terms of Regular.
• line3: A procedure less-than-operator (<) with the signature: T x T -> bool, must exist. This is the syntactic rule that allows us to write things like: a < b
• line4: This is a semantic requirement, meaning that the less-than-operator procedure has to be a Total Ordering relation.

But… remember that there are two kinds of TotalOrdering.
Do we mean Reflexive or Strict TotalOrdering? Because it would be one or the other.

Let’s review the difference with examples:

• An example of Reflexive Total Ordering is the $\leq$ relation on the Natural numbers set, or in other words, ($\mathbb{N}$, $\leq$) is a Reflexive Totally Ordered Set.
That is, for all a, b and c in $\mathbb{N}$, the following must hold:
Transitivity: if a $\leq$ b and b $\leq$ c then a $\leq$ c
Antisymmetry: if a $\leq$ b and b $\leq$ a then a $=$ b
Totality: a $\leq$ b or b $\leq$ a

($\mathbb{N}$, $\geq$) is another example of a Reflexive Totally Ordered Set.

• An example of Strict Total Ordering is the $\textless$ relation on the Natural numbers set, or in other words, ($\mathbb{N}$, $\textless$) is a Strict Totally Ordered Set.
That is, for all a, b and c in $\mathbb{N}$, the following must hold:
Transitivity: if a $\textless$ b and b $\textless$ c then a $\textless$ c
Trichotomy: only one of the following holds, a $\textless$ b, b $\textless$ a or a $=$ b

($\mathbb{N}$, $>$) is another example of a Strict Totally Ordered Set.

Exercise 1: Prove that
a. $\leq$ is a Transitive relation
b. $\leq$ is an Antisymmetric relation
c. $\leq$ is a Total relation
on $\mathbb{N}$

Exercise 2: Prove that
a. $\textless$ is a Transitive relation
b. $\textless$ obeys the Trichotomy law
on $\mathbb{N}$

So, we have four options with which the TotallyOrdered concept could be defined: $\textless$, $\leq$, $>$, or, $\geq$. Whatever we choose is a right decision, but we have to choose.

Exercise 3: Do you know why any of the four options is a right choice?

I’m lying, actually we will not choose anything, the TotallyOrdered concept is defined using $\textless$, but here I will show the thinking behind the choice.

We have two choices to make:

1. Less… vs. Greater…: $\textless$ or $\leq$ vs. $>$ or $\geq$
2. Reflexive vs. Strict: $\leq$ or $\geq$ vs. $\textless$ or $>$

As we know, for 1, Alex has chosen Less…. The rationale for his decision is simple: Counting!
$\textless$ is the natural order of natural numbers. Why? Usually we count in ascending order, it is the natural way of counting.

Having chosen Less… then, we have to chose between Reflexive ($\leq$) and Strict ($\textless$).
Alex has chosen Strict ($\textless$) and his reasoning is: “It is one character less” (len(‘<‘) < len(‘<=’))
But maybe you could think: “This is not a good reason”.
The Alex’s answer is: “Well, we can choose either because they are equivalents!, then, you could use any decision procedure, such as, fewer typing”
Finally, another fundament: “Mathematicians consistently use $\textless$ in their books as the primary ordering, when they talk about, for example, Totally Ordered Fields they write all the axioms in terms of $\textless$

Summarizing, the choice is to use LessThan that is Strict, so we use $\textless$.

Well, now we understand what we mean when we use the TotallyOrdered concept, but  this post is ended and we haven’t written any new code.
You must be thinking: “Anyone knows how to write the min function without having any knowledge about abstract algebra”.
The Alex’s answer is: “Yes, they may know how to write it, but they implemented it incorrectly time and time again. How I know that? Because I was one of them.”

And this is mine: “Remembering some mathematics doesn’t do any harm”

In the next post I will write some code. Be patient 🙂

## References

[1] Regular is another concept, maybe the most important one, I will cover it later, but for the moment we will ignore it. If you want to see its definition, see http://www.elementsofprogramming.com/eop-concepts.pdf [page 1]