Generating Power Set
Iterating over all possible subsets of a set is a problem that can arise not only in competitive programming but also in day-to-day programming. While the total number of possible subsets is quite huge (2^n where n is the cardinality of the input set), often we want an algorithm that systematically considers all subsets for small values of n.
A power set is a set of all subsets of a given input set. Often we may want to simply iterate over all members of the power set, but sometimes (if we have enough memory) we may need the power set itself. In the following post, I have attempted to present some algorithms (with C++11 implementations) that solve both kinds of problems (simple iteration and complete power set construction).
Recursive algorithm for power set generation
Power set generation is a problem that yields naturally to a recursive algorithm. Consider a set say, $\lbrace 1, 2, 3, 4 \rbrace$. If we had a function that can generate the power set, $\mathbb{S}$, of $\lbrace 2, 3, 4 \rbrace$ we can generate the power set of the original input set by appending 1 to each member of $\mathbb{S}$ (call the result $\mathbb{T}$) and then taking the union of $\mathbb{S}$ and $\mathbb{T}$.
As an example, let us try generating the powerset of $\lbrace 1, 2, 3 \rbrace$. The power set of $\lbrace 2, 3 \rbrace, \mathbb{S},$ is $\lbrace \phi, \lbrace 2 \rbrace , \lbrace 3 \rbrace , \lbrace 2, 3 \rbrace \rbrace$. Adding $1$ to each member of $\mathbb{S}$, gives us, $\mathbb{T}$, $\lbrace \lbrace 1 \rbrace, \lbrace 1, 2 \rbrace, \lbrace 1, 3 \rbrace, \lbrace 1, 2, 3 \rbrace \rbrace$. Union of $\mathbb{S}$ and $\mathbb{T}$ gives us the required answer: $\lbrace \phi, \lbrace 2 \rbrace , \lbrace 3 \rbrace , \lbrace 2, 3 \rbrace, \lbrace 1 \rbrace, \lbrace 1, 2 \rbrace, \lbrace 1, 3 \rbrace, \lbrace 1, 2, 3 \rbrace \rbrace$.
Translating the above idea to code (C++11):
1// Accepts a set of integers and optionally a position to start reading the
2// array from.
3// Returns the powerset of integers from arr [pos=0 ... arr.size()-1]
4vector< vector<int> > powerset1(const vector<int> &arr, int pos=0) {
5 int arr_size = arr.size();
6 vector< vector<int> > results;
7
8 if (pos >= arr_size) {
9 // Recursion base case
10 results.push_back(vector<int>());
11 return results;
12 }
13
14 vector< vector<int> > rest = powerset1(arr, pos+1);
15 results = rest;
16 for (auto &subset : rest) {
17 subset.push_back(arr[pos]);
18 results.push_back(subset);
19 }
20 return results;
21}
I have assumed a set of integers that do not have duplicates. The code can be easily modified to accept other datatypes or maybe templates. If the vector has duplicate elements, just copy and deduplicate it before passing.
Since this method returns the complete power set, the space complexity is of the order of 2 ^ n. The time complexity is exponential.
Iterative algorithm, same idea
1vector< vector<int> > powerset2(const vector<int> &arr) {
2 int arr_size = arr.size();
3
4 vector< vector<int> > pset;
5 pset.push_back(vector<int>());
6
7 for (int pos = 0; pos < arr_size; pos++) {
8 vector< vector<int> > tmp(pset);
9 for (auto &subset : tmp)
10 subset.push_back(arr[pos]);
11 copy(tmp.begin(), tmp.end(), back_inserter(pset));
12 }
13
14 return pset;
15}
Iterating over power set, using a bitmask (lexical ordering)
We can use a bitmask of length = size of input set to denote a subset choice. For example, a selection of {1, 2} in {1, 2, 3, 4} can be denoted by 0011, where the LSB (least significant bit) denotes 0th index and MSB denotes the highest possible index (arr.size() - 1). $\phi$ is denoted by 0000 and a selection of all integers is denoted by 1111.
According to this idea, each set in the powerset can be represented by a bitmask. Therefore, iterating over all possible bitmasks will do the trick. Note that all possible bitmasks of length n is simply [0 … 2^n - 1].
This solution is sufficient for most practical applications running on a single computer. I am going to use a 64-bit unsigned long long for this. If you have a set bigger than this, you cannot do it in a reasonable amount of time on a single machine.
1void powerset3(const vector<int> &arr) {
2 int arr_size = arr.size();
3
4 unsigned long long lim = (1 << arr_size) - 1;
5 for (unsigned long long i = 0; i <= lim; i++) {
6 unsigned long long mask = i;
7
8 for (int idx = 0; idx < sizeof(mask) * 8; idx++) {
9 if ((mask >> idx) & 1)
10 cout << arr[idx] << " ";
11 }
12 cout << "\n";
13 }
14}
Iterating over power set, using banker’s sequence
Using banker’s sequence we can generate subsets in a monotonically increasing order of cardinality. Depending on your use case, it may be extremely useful. The idea is to use a bitmask as before but we will consider all bitmasks with set bits = k before we consider any mask with set bits > k.
1void powerset4(const vector<int> &arr) {
2 int arr_size = arr.size();
3 vector<bool> mask(arr_size, false);
4
5 for (int i = 0; i <= arr_size; i++) {
6 fill(mask.begin(), mask.end()-i, false);
7 fill(mask.end()-i, mask.end(), true);
8
9 do {
10 for (int j = 0; j < arr_size; j++) {
11 if (mask[j])
12 cout << arr[j] << " ";
13 }
14 cout << "\n";
15 } while(next_permutation(mask.begin(), mask.end()));
16 }
17}
Conclusion
As Donald Knuth once remarked, there seems to be as many algorithms for unsorting data as there are for sorting it. There are many approaches and algorithms for power set generation. I hope I have provided a decent coverage of the most popular ones.
As always, comments are welcome.