This problem was part of the VerifyThis 2022 verification competition. The competition took place on April 2, 2022, in München, Germany, as part of the ETAPS conference. The competition consists of three challenges. Each challenge describes some algorithms, including some pseudo-code, and a list of properties to verify. Participants use any platform of their choice to implement, specify, and verify the algorithms. Each challenge lasts one hour and 50 minutes.
The problem description is here. The problem is inspired by the “world’s simplest lock-free hash table”, from Jeff Preshing’s blog. The problem describes a lock-free concurrent hash set that supports adding an element and membership test. The algorithms use an atomic compare-and-swap operation.
Here are the properties to prove:
- empty(n) creates an empty set with capacity n
- member(k) == true, if insert(k) has been executed before (and returned true)
- member(k) == false, if no insert(k) that returned true can have been executed
- termination
- every key is contained in the table at most once
- if insert returns false, the table is full
Here is the CIVL solution I came up with during the competition:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 |
/* Challenge 3: concurrent hash set Author: Stephen Siegel (Team CIVL) Algorithm is implemented exactly as written. What we check: there are two threads, each does two inserts. The keys to insert are chosen nondeterministically within some bounded range. Assertions are checked throughout (see comments). The hash function is a fixed function. */ $input int NB = 4; // upper bound on N $input int N; // the length of the array that will serve as the hash set $assume ( 1<=N && N<=NB ); $input int KB = 4; // upper bound on keys. All keys lie in [0,KB) int DATA[2][2]; // DATA[i][0], DATA[i][1] are the keys thread i will insert #include <stdlib.h> #include <stdio.h> typedef int K; // just using nonnegative ints for the keys, for now #define key_invalid -1 typedef K * hset; // hset is array of length N of K /* Standard CAS, exactly as specified in problem */ K compare_and_swap(K *target, K oldv, K newv) { K result; $atomic { if (*target == oldv) *target = newv; result = *target; } return result; } // returns something in [0,n), a function of k size_t get_hash(size_t n, K k) { // for now. Could make this more general but to enumerate all hash // functions would take KB^N cases! return (3*k+7)%n; } /* Is the hash set full? */ _Bool is_full(hset t) { return $forall(int i | 0<=i && i<N) t[i] != key_invalid; } /* Creates the empty hash set with capacity n */ hset empty(size_t n) { hset t = (hset)malloc(n*sizeof(K)); for (size_t i=0; i<n; ++i) t[i] = key_invalid; return t; } /* Need to free memory allocated by empty, else memory leaks */ void delete(hset t) { free(t); } /* insert function, exactly as specified */ _Bool insert(K k, hset t) { size_t n = N; size_t i0 = get_hash(n,k); size_t i = i0; do { { // optimization K kk = t[i]; // reads are atomic always if (kk == k) return 1; if (kk != key_invalid) { i = (i+1)%n; continue; } } K k1 = compare_and_swap(&t[i], key_invalid, k); if (k1 == k) return 1; i = (i+1)%n; } while (i != i0); return 0; } /* member test, exactly as specified */ _Bool member(hset t, K k) { size_t n = N; // t.length size_t i0 = get_hash(n,k); size_t i = i0; do { K k1 = t[i]; // atomic if (k1 == k) return 1; if (k1 == key_invalid) return 0; i = (i+1)%n; } while (i != i0); return 0; } // the result of the calls to insert will be stored here... _Bool result[2][2] = {{0,0},{0,0}}; /* The function executed by each thread. */ void thread(int id, hset t) { for (int j=0; j<2; j++) { result[id][j] = insert(DATA[id][j], t); $assert(result[id][j] || is_full(t)); } } // checks: every key is contained in the table at most once... void checkUniqueKeys(hset t) { for (int i=0; i<KB; i++) { int c = 0; for (int j=0; j<N; j++) if (t[j] == i) c++; $assert(c<=1); } } int main() { hset t = empty(N); printf("N=%d\n", N); // empty(n) creates an empty set with capacity n... $assert($forall (int i | 0<=i && i<N) t[i]==key_invalid); // choose DATA to insert... for (int i=0; i<2; i++) for (int j=0; j<2; j++) DATA[i][j] = $choose_int(KB); // try 3 here for faster printf("DATA = {{%d, %d}, {%d, %d}}\n", DATA[0][0], DATA[0][1], DATA[1][0], DATA[1][1]); $proc p1 = $spawn thread(0, t), p2 = $spawn thread(1, t); $wait(p1); $wait(p2); // member(k) == true, if insert(k) has been executed before (and returned true)... for (int i=0; i<2; i++) for (int j=0; j<2; j++) if (result[i][j]) $assert(member(t, DATA[i][j])); // member(k) == false, if no insert(k) that returned true can have been executed... for (int k=0; k<KB; k++) { if (!$exists (int i,j | 0<=i && i<2 && 0<=j && j<2) k==DATA[i][j] && result[i][j]) $assert(!member(t, k)); } checkUniqueKeys(t); delete(t); } // termination: no cycles in state space or deadlocks ==> termination |
The CIVL code follows the C++-like pseudocode given in the problem description very closely. I used CIVL to verify certain properties involving two concurrently executing threads with an upper bound (N) on the capacity of the hash set. Each thread chooses two keys nondeterministically and inserts those keys into the set. Then membership tests are performed. The output (excerpted):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 |
CIVL v1.21+ of 2021-11-04 -- http://vsl.cis.udel.edu/civl N=4 DATA = {{0, 0}, {0, 0}} DATA = {{0, 0}, {0, 1}} DATA = {{0, 0}, {0, 2}} DATA = {{0, 0}, {0, 3}} DATA = {{0, 0}, {1, 0}} . . . DATA = {{3, 3}, {3, 0}} DATA = {{3, 3}, {3, 1}} DATA = {{3, 3}, {3, 2}} DATA = {{3, 3}, {3, 3}} === Source files === c3_4.cvl (c3_4.cvl) === Command === civl verify c3_4.cvl === Stats === time (s) : 89.92 memory (bytes) : 1765801984 max process count : 3 states : 1488999 states saved : 1075525 state matches : 120256 transitions : 1609254 trace steps : 682079 valid calls : 2637416 provers : cvc4, z3, why3 prover calls : 10456 === Result === The standard properties hold for all executions. |