**ChatGPT:** Here is an example of a parallel program in C that uses OpenMP to search an array of integers for a given integer:

```
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
int search(int *arr, int n, int value) {
int i;
int found = -1;
#pragma omp parallel for
for (i = 0; i < n; i++) {
if (arr[i] == value) {
found = i;
}
}
return found;
}
int main(int argc, char *argv[]) {
int n, value, found, i;
printf("Enter the number of elements in the array: ");
scanf("%d", &n);
int arr[n];
printf("Enter the elements of the array: ");
for (i = 0; i < n; i++) {
scanf("%d", &arr[i]);
}
printf("Enter the value to search for: ");
scanf("%d", &value);
found = search(arr, n, value);
if (found == -1) {
printf("The value %d was not found in the array\n", value);
} else {
printf("The value %d was found at index %d in the array\n", value, found);
}
return 0;
}
```

This program uses the `#pragma omp parallel for` directive to parallelize the for loop that iterates through the array. This directive tells the OpenMP runtime to create a team of threads and divide the loop iterations among them. Each thread will execute the loop body with its own private copy of the `i` variable.

When the loop finishes, the value of `found` is -1 if the value was not found in the array, otherwise it will be the index of the first occurrence of the value in the array.

It is important to note that when you are using OpenMP, the order of the elements in the array may change and the program may not find the first occurrence of the value.

**Me: Your solution has a data race involving variable found. Try to correct it.**

**ChatGPT:** You are correct, the code I provided has a data race involving the `found` variable. Here is a corrected version of the program that avoids the data race:

```
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>
int search(int *arr, int n, int value) {
int i;
int found = -1;
#pragma omp parallel for
for (i = 0; i < n; i++) {
if (arr[i] == value) {
#pragma omp critical
{
if (found == -1) {
found = i;
}
}
}
}
return found;
}
int main(int argc, char *argv[]) {
int n, value, found, i;
printf("Enter the number of elements in the array: ");
scanf("%d", &n);
int arr[n];
printf("Enter the elements of the array: ");
for (i = 0; i < n; i++) {
scanf("%d", &arr[i]);
}
printf("Enter the value to search for: ");
scanf("%d", &value);
found = search(arr, n, value);
if (found == -1) {
printf("The value %d was not found in the array\n", value);
} else {
printf("The value %d was found at index %d in the array\n", value, found);
}
return 0;
}
```

In this version, I’ve added the `#pragma omp critical` directive to the block of code that updates the `found` variable. This directive creates a critical section, which is a block of code that can be executed by only one thread at a time. By enclosing the block of code that updates the `found` variable in a critical section, we ensure that only one thread can update the variable at a time, which eliminates the data race.

The `#pragma omp critical` directive is necessary to ensure that the variable `found` is updated atomically, meaning that no other thread will see an intermediate value.

It is important to note that the critical section may cause a serialization of the execution and affect the performance, so be careful when using it.
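As an aside, the critical-section version removes the data race but still does not guarantee that the *first* occurrence is returned: whichever matching thread enters the critical section first wins. The following is a minimal sketch of one alternative, assuming a compiler that supports OpenMP 3.1 or later (needed for the `min` reduction). The name `search_first` is hypothetical and is not part of the conversation above.

```c
#include <assert.h>
#include <limits.h>

/* Returns the smallest index i with arr[i] == value, or -1 if there is none.
 * Each thread keeps a private copy of `first`, initialized by the reduction
 * to the largest int; the copies are combined with min when the loop ends. */
int search_first(const int *arr, int n, int value) {
    assert(n >= 0);
    int first = INT_MAX;
    #pragma omp parallel for reduction(min : first)
    for (int i = 0; i < n; i++) {
        if (arr[i] == value && i < first) {
            first = i;
        }
    }
    return first == INT_MAX ? -1 : first;
}
```

No critical section is needed here because no thread writes to a shared variable inside the loop. If the code is compiled without OpenMP support, the pragma is ignored and the loop simply runs sequentially with the same result.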

VerifyThis is an annual program verification competition held as part of ETAPS. It is an opportunity for community members and tool authors to showcase their work and learn from each other with hands-on exercises.

The competition proceeds in three rounds. In each round, participants are given 90 minutes to implement and prove specified properties of a given algorithm and/or data structures. They are free to use any verification tools they choose.

Xavier Denis and I are co-chairing VerifyThis 2023. We are currently looking for problem submissions. If you have recently encountered an interesting challenge in your work, don’t hesitate to submit it. Typical challenges have clear input-output specifications and often incorporate one or more of the following: heap allocation, concurrency, arithmetic reasoning. A challenge usually describes a problem using natural language together with some pseudocode, and then provides a list of properties or “verification tasks” of varied levels of difficulty. Contributors are encouraged to look at the Archive of previous problems.

An award will be given for any submission used in the competition.

To avoid spoiling the competition for others, we ask that you keep the subject of your submission private.

Submissions should be sent by email to xldenis@lri.fr and siegel@udel.edu.

Submission Criteria:

- A brief yet precise problem description, specifically identifying verification sub-tasks.
- A solution to the challenge is strongly encouraged, otherwise please provide a sketch of correctness.
- The description document can use any reasonable format, including plain text or PDF.

A concurrency flag has two states: *up* or *down*. It supports two atomic operations: *lower* and *raise*. The lower operation is enabled only when the flag is up, in which case it changes the state to down. If lower is invoked when the flag is down, it blocks until the flag is up. The raise operation is only defined when the flag is down, in which case it changes the state to up. It is erroneous to attempt to raise the flag when it is already up. Note the asymmetry: it is never erroneous to invoke lower — the operation will simply block until the flag can be lowered. Raise never blocks, but it should only be invoked when the flag is down.

(The concept of a semaphore in concurrency theory is due to Edsger Dijkstra. He used the letter V to represent the raise operation, and P for the lower operation. V stands for *verhogen*, Dutch for “to raise”. P may stand for *passering* (“passing”) or *probeer te verlagen* (“try to decrease”).)
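To make the two operations concrete, here is a minimal sketch of a concurrency flag in plain C11, assuming `<stdatomic.h>` is available. The names `flag_raise` and `flag_lower` are hypothetical (chosen to avoid clashing with the standard `raise` function from `<signal.h>`), and the busy-wait loop is only one possible way to model "blocks until the flag is up".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef atomic_bool flag_t;   /* false = down, true = up */

/* raise: only legal when the flag is down; never blocks. */
void flag_raise(flag_t *f) {
    bool was_up = atomic_exchange(f, true);
    assert(!was_up);          /* raising a flag that is already up is an error */
}

/* lower: waits until the flag is up, then atomically sets it down. */
void flag_lower(flag_t *f) {
    bool expected = true;
    while (!atomic_compare_exchange_weak(f, &expected, false)) {
        expected = true;      /* a failed CAS overwrote expected; reset it */
    }
}
```

The compare-and-swap in `flag_lower` performs the "is it up?" check and the change to down as a single atomic step, so two threads can never both lower the same raised flag.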

This post looks at the problem of designing a *barrier* protocol using concurrency flags. A *barrier* is a synchronization operation involving a set of threads. The basic property of a barrier is that no thread shall leave the barrier until all threads in the set have entered the barrier. A second desirable property is *no unnecessary delay* — once all threads have entered the barrier, all threads should be able to depart without unnecessary blocking. A third desirable property is *reusability* — the barrier should be able to be invoked multiple times in a program and work correctly.

It is easy to design a barrier with the first property but not the second: never let any thread leave the barrier. We will see it is actually tricky to design a barrier with all three properties.

The case of two threads is a good place to start. We can implement a 2-thread barrier using two flags, f1 and f2, both initially down. The protocol is

```
Thread 1: Thread 2:
raise f1; lower f1;
lower f2; raise f2;
```

The first operation that will execute in this protocol is thread 1’s raise f1. After that, thread 1 must block, as f2 is down, but thread 2 can now lower f1. Thread 1 is still blocked, so thread 2 raises f2 and now thread 2 may depart the barrier. Thread 1 may now lower f2 and depart.

The basic barrier property holds: if thread 1 arrives first to the barrier, it raises f1 but then blocks at the lower f2 and cannot proceed until thread 2 arrives and eventually raises f2. If thread 2 arrives at the barrier first, it blocks, at the lower f1. The property of no unnecessary delay is admittedly vague, but at a minimum it requires that once every thread arrives, eventually every thread should be able to leave, and that is the case for this protocol. The third property, reusability, also holds — I’ll hold off on a proof of that, but for now observe that at least the flags are returned to their original state after both threads leave the barrier.

It is helpful to think of “raise” as sending a signal, and “lower” as receiving that signal. Then we can view the protocol as follows:

    Thread 1:                      Thread 2:
    raise f1;   ------------>      lower f1;
    lower f2;   <------------      raise f2;

First thread 1 sends a signal to thread 2. Thread 2 receives the signal and then sends a signal to thread 1. Thread 1 receives the signal.

We can obtain a more rigorous analysis of the protocol by examining its **state space**. The *state space* is a directed graph in which the nodes are program states, and an edge represents an atomic transition from one state to another. A state encodes the values of all variables in the program, including the location (program counter) of each thread, and the state of each flag. The *reachable state space* is the subgraph consisting of the states and transitions that are reachable from the initial state.

Let’s put our 2-thread barrier in a loop, to analyze reusability, and also label the two statements in each thread to provide values for the location:

    Thread 1:               Thread 2:
    while (true) {          while (true) {
      0: raise f1;            0: lower f1;
      1: lower f2;            1: raise f2;
    }                       }

When Thread 1’s location is 0, its next action will be the raise operation, after which it moves to location 1. From location 1 (assuming f2 is up), Thread 1 lowers f2 and moves to location 0, etc. A state of this program may be represented as a tuple [pc1, pc2, f1, f2], where pc1 and pc2 are locations in {0,1}, and f1 and f2 are in {U,D} (U=up, D=down). The initial state is [0,0,D,D]. From that state, only one transition is enabled, Thread 1’s raise of f1, and the resulting state is [1,0,U,D]. Continuing in this way, we obtain the following reachable state space, a single cycle of four states: [0,0,D,D] → [1,0,U,D] → [1,1,D,D] → [1,0,D,U] → [0,0,D,D].

The reachable state space shows us that the barrier program is deterministic: at each reachable state, there is one and only one enabled transition. Furthermore, we may interpret location 0 as being “outside” the barrier and location 1 as being “inside” the barrier. The reachable state space also shows that no thread will leave the barrier until both have entered: the leaving transitions are the two transitions that change a 1 to a 0; the entering transitions change a 0 to a 1. The only path from the initial state proceeds with two enters, followed by two exits, repeated forever. This also shows there is no unnecessary delay (or at least all threads eventually leave the barrier), and the barrier is reusable, since we are doing all of this inside a loop. Incidentally we can also check that flags are used correctly, i.e., there is never an attempt to raise a flag that is already up.

One of the first times I showed this 2-flag barrier in a class, a student asked whether both threads could raise first, then lower. I thought about it. The two raises could occur in either order — fine, you end up at the same state in either case. Then the two lowers could happen in either order — also fine. And neither lower could occur without the raise in the other thread occurring first, guaranteeing the basic barrier property. So I said it looked right, but I would check. I went back to the office and thought about it. The program would become:

    Thread 1:               Thread 2:
    while (true) {          while (true) {
      0: raise f1;            0: raise f2;
      1: lower f2;            1: lower f1;
    }                       }

Let’s do the same state-enumeration analysis. This time the reachable state space looks like this:

The problem occurs from a state like [0,1,U,D]. From this state, Thread 2 could lower f1 and return to location 0, but also Thread 1 could attempt to raise f1 when f1 is already up, leading to the ERRor state. In fact, this barrier works correctly once, but it is not reusable: in the erroneous path, Thread 1 races to the second iteration of the loop and attempts to raise f1, while Thread 2 has not yet completed its lower operation in the first iteration. This is the kind of subtle interleaving-dependent issue that makes concurrent programming hard (and fun).
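The hand enumeration for both protocols can be reproduced with a few lines of C. The following is a minimal sketch, not the tool used in this post: it packs a state [pc1, pc2, f1, f2] into four bits and explores all states reachable from [0,0,D,D]. The names `Step`, `GOOD`, `BAD` and `explore` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { RAISE, LOWER } Op;
typedef struct { Op op; int flag; } Step;    /* flag: 0 = f1, 1 = f2 */

/* prog[t][pc] is the statement thread t executes at location pc. */
static const Step GOOD[2][2] = {
    { {RAISE, 0}, {LOWER, 1} },   /* Thread 1: raise f1; lower f2 */
    { {LOWER, 0}, {RAISE, 1} },   /* Thread 2: lower f1; raise f2 */
};
static const Step BAD[2][2] = {
    { {RAISE, 0}, {LOWER, 1} },   /* Thread 1: raise f1; lower f2 */
    { {RAISE, 1}, {LOWER, 0} },   /* Thread 2: raise f2; lower f1 */
};

/* State bits: bit 0 = pc1, bit 1 = pc2, bit 2 = f1, bit 3 = f2 (1 = up).
 * Returns the number of reachable states; sets *error if some reachable
 * state enables a raise of a flag that is already up. */
int explore(const Step prog[2][2], bool *error) {
    bool seen[16] = { false };
    int stack[16], top = 0, count = 0;
    *error = false;
    seen[0] = true;
    stack[top++] = 0;                         /* initial state [0,0,D,D] */
    while (top > 0) {
        int s = stack[--top];
        count++;
        for (int t = 0; t < 2; t++) {
            Step st = prog[t][(s >> t) & 1];
            bool up = (s >> (2 + st.flag)) & 1;
            if (st.op == RAISE && up) { *error = true; continue; }  /* erroneous raise */
            if (st.op == LOWER && !up) continue;                    /* thread is blocked */
            int next = s ^ (1 << t) ^ (1 << (2 + st.flag));         /* flip pc and flag */
            assert(next >= 0 && next < 16);
            if (!seen[next]) { seen[next] = true; stack[top++] = next; }
        }
    }
    return count;
}
```

Calling `explore(GOOD, &err)` visits 4 states and leaves `err` false, while `explore(BAD, &err)` visits 6 states and sets `err`, which matches the erroneous raise reachable from [0,1,U,D].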

Now let’s see how we can automate this analysis. I will use CIVL, but any standard model checker will do. Here is a CIVL-C model of the first (correct) barrier program:

    typedef _Bool flag_t;

    flag_t f1=0, f2=0;

    void raise(flag_t * f) {
      $assert(!*f);
      *f = 1;
    }

    void lower(flag_t * f) {
      $when(*f) *f = 0;
    }

    void thread1() {
      while (1) {
        raise(&f1);
        lower(&f2);
      }
    }

    void thread2() {
      while (1) {
        lower(&f1);
        raise(&f2);
      }
    }

    int main() {
      $proc t1 = $spawn thread1();
      $proc t2 = $spawn thread2();
      $wait(t1);
      $wait(t2);
    }

A flag is just a Boolean variable. The restriction on raise is checked by an assertion. The “when” statement in function lower causes the caller to block until the condition in parentheses is true, then executes the body of the statement; the check that the condition is true and execution of *f = 0 occur atomically, i.e., no other thread executes between those two points.

Running CIVL on this program yields:

    siegel@monk barrier % civl verify barrierGood.cvl
    CIVL v1.21+ of 2021-11-04 -- http://vsl.cis.udel.edu/civl

    === Source files ===
    barrierGood.cvl  (barrierGood.cvl)

    === Command ===
    civl verify barrierGood.cvl

    === Stats ===
       time (s)            : 0.5
       memory (bytes)      : 285212672
       max process count   : 0
       states              : 17
       states saved        : 24
       state matches       : 1
       transitions         : 20
       trace steps         : 17
       valid calls         : 96
       provers             : z3
       prover calls        : 0

    === Result ===
    The standard properties hold for all executions.
    siegel@monk barrier %

The key part of the output is “The standard properties hold for all executions”, indicating the assertion is never violated. Let’s do the same for the erroneous protocol:

    typedef _Bool flag_t;

    flag_t f1=0, f2=0;

    void raise(flag_t * f) {
      $assert(!*f);
      *f = 1;
    }

    void lower(flag_t * f) {
      $when(*f) *f = 0;
    }

    void thread1() {
      while (1) {
        raise(&f1);
        lower(&f2);
      }
    }

    void thread2() {
      while (1) {
        raise(&f2);
        lower(&f1);
      }
    }

    int main() {
      $proc t1 = $spawn thread1();
      $proc t2 = $spawn thread2();
      $wait(t1);
      $wait(t2);
    }

Checking…

    siegel@monk barrier % civl verify barrierBad.cvl
    CIVL v1.21+ of 2021-11-04 -- http://vsl.cis.udel.edu/civl

    Violation 0 encountered at depth 15:
    CIVL execution violation in p1 (kind: ASSERTION_VIOLATION, certainty: PROVEABLE)
    at barrierBad.cvl:7.2-13
      $assert(!*f);
      ^^^^^^^^^^^^
    Assertion: !(*(f))
     -> !*&<d0>f1
     -> !true
     -> false

    Context:
    true

    Call stacks:
    process 0:
      main@4 barrierBad.cvl:32.2-6 "$wait"
    process 1:
      raise@7 barrierBad.cvl:7.2-8 "$assert" called from
      thread1@13 barrierBad.cvl:17.4-8 "raise"
    process 2:
      lower@10 barrierBad.cvl:12.2-6 "$when" called from
      thread2@18 barrierBad.cvl:25.4-8 "lower"

    Logging new entry 0, writing trace to CIVLREP/barrierBad_0.trace
    Terminating search after finding 1 violation.

    === Source files ===
    barrierBad.cvl  (barrierBad.cvl)

    === Command ===
    civl verify barrierBad.cvl

    === Stats ===
       time (s)            : 0.49
       memory (bytes)      : 285212672
       max process count   : 0
       states              : 15
       states saved        : 21
       state matches       : 0
       transitions         : 18
       trace steps         : 14
       valid calls         : 71
       provers             : z3
       prover calls        : 0

    === Result ===
    The program MAY NOT be correct.  See CIVLREP/barrierBad_log.txt

The output indicates the assertion is violated and shows that process 1 (i.e., Thread 1) is in the “raise” call while process 2 is just before the lower. We can see more details, including a blow-by-blow account of the execution that arrived at that point, as follows:

    siegel@monk barrier % civl replay -showTransitions barrierBad.cvl
    CIVL v1.21+ of 2021-11-04 -- http://vsl.cis.udel.edu/civl

    Initial state:
    State (id=9)
    | Path condition
    | | true
    | Dynamic scopes
    | | dyscope d0 (parent=NULL, static=1)
    | | | variables
    | | | | f1 = NULL
    | | | | f2 = NULL
    | Process states
    | | process 0
    | | | call stack
    | | | | Frame[function=main, location=0, barrierBad.cvl:4.0-16 "flag_t f1=0, f2=0", dyscope=d0]

    Executed by p0 from State (id=9)
      0->1: f1=false at barrierBad.cvl:4.0-16 "flag_t f1=0, f2=0"
      1->2: f2=false at barrierBad.cvl:4.0-16 "flag_t f1=0, f2=0"
      2->3: t1=$spawn thread1() [t1:=p1] at barrierBad.cvl:30.2-28 "$proc t1 = $spawn thread1 ... )"
      3->4: t2=$spawn thread2() [t2:=p2] at barrierBad.cvl:31.2-28 "$proc t2 = $spawn thread2 ... )"
    --> State (id=24)

    Step 1: Executed by p1 from State (id=24)
      12->13: LOOP_BODY_ENTER (guard: 1!=0) [$assume(true)] at barrierBad.cvl:16.9 "1"
    --> State (id=26)

    Step 2: Executed by p1 from State (id=26)
      13->7: raise(&<d0>f1) at barrierBad.cvl:17.4-13 "raise(&f1)"
    --> State (id=29)

    Step 3: Executed by p2 from State (id=29)
      16->17: LOOP_BODY_ENTER (guard: 1!=0) [$assume(true)] at barrierBad.cvl:23.9 "1"
    --> State (id=31)

    Step 4: Executed by p2 from State (id=31)
      17->7: raise(&<d0>f2) at barrierBad.cvl:24.4-13 "raise(&f2)"
    --> State (id=34)

    Step 5: Executed by p2 from State (id=34)
      7->8: $assert(!*&<d0>f2) at barrierBad.cvl:7.2-13 "$assert(!*f)"
    --> State (id=36)

    Step 6: Executed by p1 from State (id=36)
      7->8: $assert(!*&<d0>f1) at barrierBad.cvl:7.2-13 "$assert(!*f)"
    --> State (id=38)

    Step 7: Executed by p1 from State (id=38)
      8->9: *(&<d0>f1)=true at barrierBad.cvl:8.2-7 "*f = 1"
      9->RET: raise(...) return at barrierBad.cvl:9.0 "}"
    --> State (id=43)

    Step 8: Executed by p1 from State (id=43)
      14->10: lower(&<d0>f2) at barrierBad.cvl:18.4-13 "lower(&f2)"
    --> State (id=46)

    Step 9: Executed by p2 from State (id=46)
      8->9: *(&<d0>f2)=true at barrierBad.cvl:8.2-7 "*f = 1"
      9->RET: raise(...) return at barrierBad.cvl:9.0 "}"
    --> State (id=51)

    Step 10: Executed by p2 from State (id=51)
      18->10: lower(&<d0>f1) at barrierBad.cvl:25.4-13 "lower(&f1)"
    --> State (id=54)

    Step 11: Executed by p1 from State (id=54)
      10->11: *(&<d0>f2)=false at barrierBad.cvl:12.12-17 "*f = 0"
      11->RET: lower(...) return at barrierBad.cvl:13.0 "}"
    --> State (id=59)

    Step 12: Executed by p1 from State (id=59)
      12->13: LOOP_BODY_ENTER (guard: 1!=0) [$assume(true)] at barrierBad.cvl:16.9 "1"
    --> State (id=61)

    Step 13: Executed by p1 from State (id=61)
      13->7: raise(&<d0>f1) at barrierBad.cvl:17.4-13 "raise(&f1)"
    --> State (id=64)

    Step 14: Error 0:
    CIVL execution violation in p1 (kind: ASSERTION_VIOLATION, certainty: PROVEABLE)
    at barrierBad.cvl:7.2-13
      $assert(!*f);
      ^^^^^^^^^^^^
    Assertion: !(*(f))
     -> !*&<d0>f1
     -> !true
     -> false

    Context:
    true

    Call stacks:
    process 0:
      main@4 barrierBad.cvl:32.2-6 "$wait"
    process 1:
      raise@7 barrierBad.cvl:7.2-8 "$assert" called from
      thread1@13 barrierBad.cvl:17.4-8 "raise"
    process 2:
      lower@10 barrierBad.cvl:12.2-6 "$when" called from
      thread2@18 barrierBad.cvl:25.4-8 "lower"
    --> State (id=66)

    Step 15: Trace ends after 15 trace steps.
    Violation(s) found.

    === Source files ===
    barrierBad.cvl  (barrierBad.cvl)

    === Command ===
    civl replay -showTransitions barrierBad.cvl

    === Stats ===
       time (s)            : 0.44
       memory (bytes)      : 285212672
       max process count   : 0
       states              : 16
       valid calls         : 72
       provers             : z3
       prover calls        : 0

That, in a nutshell, is how you can use a model checker to verify, or find a bug in, a concurrent algorithm. And if you are a teacher looking for examples of simple “truthy” concurrent protocols with subtle bugs, you can’t do much better than this erroneous 2-thread barrier.

The problem description is here. The problem is inspired by the “world’s simplest lock-free hash table”, from Jeff Preshing’s blog. The problem describes a lock-free concurrent hash set that supports adding an element and membership test. The algorithms use an atomic compare-and-swap operation.

Here are the properties to prove:

- empty(*n*) creates an empty set with capacity *n*
- member(*k*) == true, if insert(*k*) has been executed before (and returned true)
- member(*k*) == false, if no insert(*k*) that returned true can have been executed
- termination
- every key is contained in the table at most once
- if insert returns false, the table is full

Here is the CIVL solution I came up with during the competition:

```
/* Challenge 3: concurrent hash set
Author: Stephen Siegel (Team CIVL)
Algorithm is implemented exactly as written.
What we check: there are two threads, each does two inserts.
The keys to insert are chosen nondeterministically within some bounded range.
Assertions are checked throughout (see comments).
The hash function is a fixed function.
*/
$input int NB = 4; // upper bound on N
$input int N; // the length of the array that will serve as the hash set
$assume ( 1<=N && N<=NB );
$input int KB = 4; // upper bound on keys. All keys lie in [0,KB)
int DATA[2][2]; // DATA[i][0], DATA[i][1] are the keys thread i will insert
#include <stdlib.h>
#include <stdio.h>
typedef int K; // just using nonnegative ints for the keys, for now
#define key_invalid -1
typedef K * hset; // hset is array of length N of K
/* Standard CAS, exactly as specified in problem */
K compare_and_swap(K *target, K oldv, K newv) {
K result;
$atomic {
if (*target == oldv) *target = newv;
result = *target;
}
return result;
}
// returns something in [0,n), a function of k
size_t get_hash(size_t n, K k) {
// for now. Could make this more general but to enumerate all hash
// functions (maps from [0,KB) to [0,N)) would take N^KB cases!
return (3*k+7)%n;
}
/* Is the hash set full? */
_Bool is_full(hset t) {
return $forall(int i | 0<=i && i<N) t[i] != key_invalid;
}
/* Creates the empty hash set with capacity n */
hset empty(size_t n) {
hset t = (hset)malloc(n*sizeof(K));
for (size_t i=0; i<n; ++i) t[i] = key_invalid;
return t;
}
/* Need to free memory allocated by empty, else memory leaks */
void delete(hset t) {
free(t);
}
/* insert function, exactly as specified */
_Bool insert(K k, hset t) {
size_t n = N;
size_t i0 = get_hash(n,k);
size_t i = i0;
do {
{ // optimization
K kk = t[i]; // reads are atomic always
if (kk == k) return 1;
if (kk != key_invalid) {
i = (i+1)%n;
continue;
}
}
K k1 = compare_and_swap(&t[i], key_invalid, k);
if (k1 == k) return 1;
i = (i+1)%n;
} while (i != i0);
return 0;
}
/* member test, exactly as specified */
_Bool member(hset t, K k) {
size_t n = N; // t.length
size_t i0 = get_hash(n,k);
size_t i = i0;
do {
K k1 = t[i]; // atomic
if (k1 == k) return 1;
if (k1 == key_invalid) return 0;
i = (i+1)%n;
} while (i != i0);
return 0;
}
// the result of the calls to insert will be stored here...
_Bool result[2][2] = {{0,0},{0,0}};
/* The function executed by each thread. */
void thread(int id, hset t) {
for (int j=0; j<2; j++) {
result[id][j] = insert(DATA[id][j], t);
$assert(result[id][j] || is_full(t));
}
}
// checks: every key is contained in the table at most once...
void checkUniqueKeys(hset t) {
for (int i=0; i<KB; i++) {
int c = 0;
for (int j=0; j<N; j++)
if (t[j] == i) c++;
$assert(c<=1);
}
}
int main() {
hset t = empty(N);
printf("N=%d\n", N);
// empty(n) creates an empty set with capacity n...
$assert($forall (int i | 0<=i && i<N) t[i]==key_invalid);
// choose DATA to insert...
for (int i=0; i<2; i++)
for (int j=0; j<2; j++)
DATA[i][j] = $choose_int(KB); // try 3 here for faster
printf("DATA = {{%d, %d}, {%d, %d}}\n",
DATA[0][0], DATA[0][1], DATA[1][0], DATA[1][1]);
$proc p1 = $spawn thread(0, t), p2 = $spawn thread(1, t);
$wait(p1); $wait(p2);
// member(k) == true, if insert(k) has been executed before (and returned true)...
for (int i=0; i<2; i++)
for (int j=0; j<2; j++)
if (result[i][j]) $assert(member(t, DATA[i][j]));
// member(k) == false, if no insert(k) that returned true can have been executed...
for (int k=0; k<KB; k++) {
if (!$exists (int i,j | 0<=i && i<2 && 0<=j && j<2)
k==DATA[i][j] && result[i][j])
$assert(!member(t, k));
}
checkUniqueKeys(t);
delete(t);
}
// termination: no cycles in state space or deadlocks ==> termination
```

The CIVL code follows the C++-like pseudocode given in the problem description very closely. I used CIVL to verify certain properties involving two concurrently executing threads with an upper bound (NB) on the capacity N of the hash set. Each thread chooses two keys nondeterministically and inserts those keys into the set. Then membership tests are performed. The output (excerpted):

    CIVL v1.21+ of 2021-11-04 -- http://vsl.cis.udel.edu/civl
    N=4
    DATA = {{0, 0}, {0, 0}}
    DATA = {{0, 0}, {0, 1}}
    DATA = {{0, 0}, {0, 2}}
    DATA = {{0, 0}, {0, 3}}
    DATA = {{0, 0}, {1, 0}}
    .
    .
    .
    DATA = {{3, 3}, {3, 0}}
    DATA = {{3, 3}, {3, 1}}
    DATA = {{3, 3}, {3, 2}}
    DATA = {{3, 3}, {3, 3}}

    === Source files ===
    c3_4.cvl  (c3_4.cvl)

    === Command ===
    civl verify c3_4.cvl

    === Stats ===
       time (s)            : 89.92
       memory (bytes)      : 1765801984
       max process count   : 3
       states              : 1488999
       states saved        : 1075525
       state matches       : 120256
       transitions         : 1609254
       trace steps         : 682079
       valid calls         : 2637416
       provers             : cvc4, z3, why3
       prover calls        : 10456

    === Result ===
    The standard properties hold for all executions.
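For readers who want to experiment with the algorithm outside CIVL, here is a minimal sketch of the same insert and member operations in plain C11, assuming `<stdatomic.h>` is available. It is not the verified model: the names `hset_init`, `hset_hash`, `cas_value`, `hset_insert` and `hset_member` are hypothetical, the capacity is passed explicitly, and the optional read-before-CAS optimization is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define KEY_INVALID (-1)

/* Marks every slot of the n-slot table as empty. */
void hset_init(atomic_int *t, size_t n) {
    for (size_t i = 0; i < n; i++) atomic_init(&t[i], KEY_INVALID);
}

/* Same fixed hash as the CIVL model: (3k+7) mod n. Keys are assumed >= 0. */
static size_t hset_hash(size_t n, int k) {
    assert(k >= 0 && n > 0);
    return (3 * (size_t)k + 7) % n;
}

/* CAS that returns the value held by *target right after the operation. */
static int cas_value(atomic_int *target, int oldv, int newv) {
    int expected = oldv;
    if (atomic_compare_exchange_strong(target, &expected, newv))
        return newv;      /* swap succeeded, so *target now holds newv */
    return expected;      /* swap failed; expected was set to the value seen */
}

bool hset_insert(atomic_int *t, size_t n, int k) {
    size_t i0 = hset_hash(n, k), i = i0;
    do {
        if (cas_value(&t[i], KEY_INVALID, k) == k) return true;
        i = (i + 1) % n;
    } while (i != i0);
    return false;         /* probed every slot: no free slot and k not present */
}

bool hset_member(atomic_int *t, size_t n, int k) {
    size_t i0 = hset_hash(n, k), i = i0;
    do {
        int cur = atomic_load(&t[i]);
        if (cur == k) return true;
        if (cur == KEY_INVALID) return false;
        i = (i + 1) % n;
    } while (i != i0);
    return false;
}
```

With a table of capacity 2, inserting the keys 0 and 1 succeeds, inserting 0 a second time also returns true without adding a duplicate, and inserting a third distinct key returns false because the table is full.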

In this post I will show an Alloy model I made for my course *Logic in Computer Science*. We spend a good bit of time in the course doing derivations (formal proofs) by Natural Deduction (ND), first for Propositional Logic (PL), then for First Order Logic. Here are the rules of ND for PL as presented in *Rigorous Software Development* by Almeida, Frade, Pinto and de Sousa, Springer, 2011:

A judgment in this logic is a *sequent* of the form \(\Gamma\vdash A\), where \(\Gamma\) is a set of (propositional) formulas and \(A\) is a formula. The rules specify when you can conclude new judgments from old. For example, rule (Ax) says that one can always conclude \(\Gamma\cup\{A\}\vdash A\) for any set \(\Gamma\) and formula \(A\) (\(\Gamma, A\) being shorthand for the set \(\Gamma\cup\{A\}\)). Rule \((\textrm{I}_\neg)\) says that if one has derived \(\Gamma,A\vdash\bot\) then one may conclude \(\Gamma\vdash\neg A\). Here is a derivation of the “law of the excluded middle”, \(\emptyset\vdash p\vee\neg p\):

\[
\begin{array}{rrcll}
0. & \neg(p\vee\neg p), \neg p & \vdash & \neg p & (\textrm{Ax}) \\
1. & \neg(p\vee\neg p), \neg p & \vdash & p\vee\neg p & (\textrm{I}_{\vee2})0\\
2. & \neg(p\vee\neg p), \neg p & \vdash & \neg(p\vee\neg p) & (\textrm{Ax})\\
3. & \neg(p\vee\neg p), \neg p & \vdash & \bot & (\textrm{E}_\neg)1,2\\
4. & \neg(p\vee\neg p) & \vdash & p & (\textrm{RAA})3\\
5. & \neg(p\vee\neg p) & \vdash & \neg(p\vee\neg p) & (\textrm{Ax})\\
6. & \neg(p\vee\neg p) & \vdash & p\vee\neg p & (\textrm{I}_{\vee1})4\\
7. & \neg(p\vee\neg p) & \vdash & \bot & (\textrm{E}_\neg)6,5\\
8. & & \vdash & p\vee\neg p & (\textrm{RAA})7
\end{array}
\]
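For comparison, the same argument can be written as a proof term in a proof assistant. The following is a sketch in Lean 4 (not part of the course material), where `Classical.byContradiction` plays the role of (RAA) and applying a negation to a proof plays the role of \((\textrm{E}_\neg)\). The theorem name `lem` is hypothetical.

```lean
theorem lem (p : Prop) : p ∨ ¬p :=
  Classical.byContradiction fun (h : ¬(p ∨ ¬p)) =>   -- step 8: RAA
    have hp : p :=
      Classical.byContradiction fun (hnp : ¬p) =>     -- step 4: RAA
        h (Or.inr hnp)                                -- steps 0-3
    h (Or.inl hp)                                     -- steps 5-7
```

The two nested uses of `byContradiction` correspond to the two applications of (RAA) in the derivation, and the hypotheses `h` and `hnp` are the formulas that appear on the left of the turnstile.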

The following Alloy model specifies an ND derivation for PL. Using this model, the Alloy Analyzer can automatically find a derivation for a given sequent. It is a good demonstration of the expressiveness of the Alloy logic. In 218 lines (including comments), the syntax of PL, derivations, and the rules of ND are fully specified, as well as several examples:

```
/* Alloy model of Natural Deduction for Propositional Logic
Stephen Siegel, University of Delaware, Nov. 2021
This model will find ND derivations for sequents in PL, i.e.,
judgments of the form S |- f, where S is a set of PL formulas
(the antecedents) and f is a PL formula (the consequent).
*/
-- derivations are trees but can be ordered for readability,
-- though this can increase runtime significantly.
-- children of node d always occur before d in the total order...
open util/ordering[Derivation]
/* PL Formulas */
abstract sig Formula { } -- all PL formulas
one sig False extends Formula { } -- the formula "False"
sig Prop extends Formula { } -- propositions
abstract sig Op {}
one sig NOT, AND, OR, IMPLIES extends Op {}
sig CompoundFormula extends Formula { -- formulas formed by a connective
op: one Op, -- the operator
arg1: one Formula, -- argument 1 (required)
arg2: lone Formula -- argument 2 (optional)
}{
op = NOT iff no arg2 -- only NOT takes 1 argument
}
-- no formula can occur as a sub-formula of itself...
fact wellFormedFormula { no f: Formula | f in f.^( arg1 + arg2 ) }
/* Derivations */
abstract sig Rule {} -- enumeration of the rule schema for ND
one sig Ax, RAA, EAnd1, EAnd2, IAnd, EOr, IOr1, IOr2, EImp, IImp, ENot, INot
extends Rule {}
sig Derivation { -- a derivation is a recursive (tree-like) structure:
rule: one Rule, -- the rule of which this derivation is an instance
premise1, premise2, premise3: lone Derivation, -- at most 3 sub-derivations
antecedents: set Formula, -- left side of sequent of conclusion
consequent: one Formula -- right side of sequent of conclusion
}
fun antes[d: Derivation]: set Formula { d.antecedents }
fun form[d: Derivation]: one Formula { d.consequent }
fun premise[]: Derivation->Derivation { premise1 + premise2 + premise3 }
fact wellFormedDerivation { -- each derivation is the application of a rule
all d: Derivation | {
d.premise in prevs[d] -- use only if ordering Derivation
ax[d] or eAnd1[d] or eAnd2[d] or iAnd[d] or eOr[d] or iOr1[d] or
iOr2[d] or eImp[d] or iImp[d] or eNot[d] or iNot[d] or raa[d]
}
}
-- use this to say every Derivation is used in the derivation
-- that means the scope for Derivation must be exact, rather
-- than upper bound
pred exactRoot[d: Derivation] { d.*premise = Derivation }
-- When ordering,
-- the scope is specified exactly rather than as an upper bound,
-- so need to do something with the derivations that are not used.
-- say all derivations not reachable from root d just stutter d
pred isRoot[d: Derivation] {
all d1: Derivation - d.*premise | {
d1.rule = d.rule
d1.premise1 = d.premise1
d1.premise2 = d.premise2
d1.premise3 = d.premise3
d1.antecedents = d.antecedents
d1.consequent = d.consequent
}
}
/* Derivation rules */
-- Rule 1: Ax: S,f |- f
pred ax[d: Derivation] {
d.rule = Ax and no d.premise and form[d] in antes[d]
}
-- Rule 2: Reductio-Ad-Absurdum: if S,!f |- False then S |- f
pred raa[d: Derivation] {
d.rule = RAA and no d.premise2 + d.premise3
let d1 = d.premise1 |
form[d1] = False and some g: CompoundFormula |
g.op = NOT and g.arg1 = form[d] and antes[d1] = antes[d] + g
}
-- Rule 3: Eliminate-and-1: if S |- f&g then S |- f
pred eAnd1[d: Derivation] {
d.rule = EAnd1 and no d.(premise2 + premise3)
antes[d.premise1] = antes[d]
let h = form[d.premise1] | h.op = AND and h.arg1 = form[d]
}
-- Rule 4: Eliminate-and-2: if S |- f&g then S |- g
pred eAnd2[d: Derivation] {
d.rule = EAnd2 and no d.(premise2 + premise3)
antes[d.premise1] = antes[d]
let h = form[d.premise1] | h.op = AND and h.arg2 = form[d]
}
-- Rule 5: Introduce-And: if S |- f and S |- g then S |- f&g
pred iAnd[d: Derivation] {
d.rule = IAnd and no d.premise3
let d1 = d.premise1, d2=d.premise2, h=form[d] |
antes[d1] = antes[d2] and antes[d1] = antes[d] and
h.op = AND and h.arg1 = form[d1] and h.arg2 = form[d2]
}
-- Rule 6: Eliminate-Or: if S |- f1|f2 and S,f1 |- h and S,f2 |- h then S |- h
pred eOr[d: Derivation] {
d.rule = EOr
let S = antes[d], d1 = d.premise1, d2=d.premise2, d3=d.premise3 |
antes[d1] = S and antes[d2] = S + form[d1].arg1 and antes[d3] = S + form[d1].arg2
and form[d1].op = OR and form[d2] = form[d] and form[d3] = form[d]
}
-- Rule 7: Introduce-Or-1: if S |- f then S |- f|g
pred iOr1[d: Derivation] {
d.rule = IOr1 and no d.premise2 + d.premise3
antes[d.premise1] = antes[d]
form[d].op = OR and form[d].arg1 = form[d.premise1]
}
-- Rule 8: Introduce-Or-2: if S |- f then S |- g|f
pred iOr2[d: Derivation] {
d.rule = IOr2 and no d.premise2 + d.premise3
antes[d.premise1] = antes[d]
form[d].op = OR and form[d].arg2 = form[d.premise1]
}
-- Rule 9: Eliminate-Implies: if S |- f->g and S |- f then S |- g
pred eImp[d: Derivation] {
d.rule = EImp and no d.premise3
let d1 = d.premise1, d2 = d.premise2 | {
antes[d1] = antes[d] and antes[d2] = antes[d]
form[d2].op = IMPLIES and form[d2].arg1 = form[d1]
form[d] = form[d2].arg2
}
}
-- Rule 10: Introduce-Implies: if S,f |- g then S |- f->g
pred iImp[d: Derivation] {
d.rule = IImp and no d.premise2 + d.premise3
let d1 = d.premise1, h=form[d] |
h.op = IMPLIES and antes[d1] = antes[d] + h.arg1 and form[d1] = h.arg2
}
-- Rule 11: Eliminate-Not: if S |- !f and S |- f then S |- g
pred eNot[d: Derivation] {
d.rule = ENot and no d.premise3
let d1 = d.premise1, d2 = d.premise2 |
antes[d1] = antes[d] and antes[d2] = antes[d] and
form[d2].op = NOT and form[d2].arg1 = form[d1]
}
-- Rule 12: Introduce-Not: if S,f |- False then S |- !f
pred iNot[d: Derivation] {
d.rule = INot and no d.premise2 + d.premise3
let d1 = d.premise1 |
form[d].op = NOT and antes[d1] = antes[d] + form[d].arg1 and
form[d1] = False
}
/* Examples */
-- A |- A (simple application of Ax rule)
pred ax1[d: Derivation] {
isRoot[d]
let f = antes[d] | one f and f=form[d] and f in Prop
}
-- A & B |- B & A (and is commutative)
pred andCommute[d: Derivation] {
isRoot[d]
let f = antes[d], g = form[d] |
one f and g.op = AND and f.(arg1+arg2) in Prop and f.op = AND and
g.arg1 = f.arg2 and g.arg2 = f.arg1 and f.arg1 != f.arg2
}
-- A & (B | C) |- (A & B) | (A & C) (and distributes over or)
pred dist1[d: Derivation] {
isRoot[d]
let f = antes[d] |
one f and f.op = AND and f.arg2.op = OR and
let a = f.arg1, b = f.arg2.arg1, c = f.arg2.arg2, g = form[d] | {
a+b+c in Prop
a != b and a != c and b != c
g.op = OR
g.arg1.op = AND and g.arg1.arg1 = a and g.arg1.arg2 = b
g.arg2.op = AND and g.arg2.arg1 = a and g.arg2.arg2 = c
}
}
-- |- P | !P (Law of the Excluded Middle)
pred LEM[d: Derivation] {
isRoot[d]
no antes[d]
let f = form[d], p = f.arg1, f2 = f.arg2 |
f.op = OR and p in Prop and f2.op = NOT and f2.arg1 = p
}
-- |- !(A & B) -> (!A | !B)
pred deMorgan1[d: Derivation] {
exactRoot[d]
no antes[d]
let f = form[d], f1 = f.arg1, f2 = f.arg2, f11=f1.arg1,
f21 = f2.arg1, f22 = f2.arg2, a = f11.arg1, b = f11.arg2 |
a != b and a+b in Prop and f.op = IMPLIES and
f1.op = NOT and f11.op = AND and
f2.op = OR and f21.op = NOT and f21.arg1 = a and f22.op = NOT and f22.arg1 = b
}
/* Runs. Solving times shown for Plingeling on MacBook Pro M1 */
run ax1 -- 1s
run andCommute for 2 Prop, 5 Formula, 4 Derivation -- 1s
run dist1 for 3 Prop, 9 Formula, 13 Derivation -- 7s
run LEM for 1 Prop, 5 Formula, 9 Derivation -- 10s
run deMorgan1 for 2 Prop, 10 Formula, 15 Derivation -- varies wildly
```

The model defines the syntax for a propositional logic formula, and then defines the structure of a derivation, which is a recursive (tree-like) structure. Each node is the application of one of 12 rules, has some number (\(\leq 3\)) of children (premises, or sub-derivations), and a conclusion, which is a sequent of the form \(S\vdash f\), where \(S\) is a set of formulas (the *antecedents*) and \(f\) is a formula (the *consequent*). Then there is a series of examples where I describe the desired sequent and ask Alloy to find a derivation of it.

From the Alloy Analyzer menu, if we select to execute the “run LEM…” command, the result, after a few seconds, is an instance of the model. The instance can be viewed in Alloy’s Table view mode. The first table shows the Formulas. In addition to the atomic formulas False (\(\bot\)) and Prop0 (\(p\)), the following three compound formulas appear:

These represent, in order: \(\neg(p\vee\neg p)\), \(\neg p\), and \(p\vee\neg p\). Now here is the derivation, exactly as it appears in Alloy’s Table view:

which is exactly the derivation one would write by hand and is shown above.

Notice that we never had to specify any algorithm to find derivations. Rather, we just had to tell Alloy what a derivation looks like, and Alloy—and the PLingeling SAT solver—do the hard work of finding something matching our description.
