Recently, there was a local contest which I set a problem for (task 4 in this pdf).
The abridged problem statement is that you choose an array $$$A$$$ of length $$$N$$$ where the elements are $$$<2^X$$$. The grader will pick $$$[L,R]$$$ and return $$$A_L \oplus A_{L+1} \oplus \ldots \oplus A_R$$$, where $$$\oplus$$$ is the bitwise xor operation. You need to find any integer $$$z$$$ such that $$$L \leq z \leq R$$$.
I set this in contest with the constraints of $$$N=2^{19}$$$ and $$$X=192 \approx \frac{(\log N)^2}{2}$$$ where you also had to answer queries quickly.
The main idea of my solution is to keep splitting the query range in half. Suppose we know that $$$l \leq L \leq R \leq r$$$. Let $$$m=\frac{l+r}{2}$$$.
Then there are $$$3$$$ cases:
- answer $$$m$$$
- recurse to $$$[l,m)$$$
- recurse to $$$(m,r]$$$
Let us put a bit on position $$$m$$$. If that bit is activated, then it is case 1. Otherwise, we need to differentiate case 2 and 3. Notice that in the array $$$[1,2,1,4,1,2,1]$$$, all the subarrays have non-zero xor-sums. So we can put this array on the left side. If there exists any bit that is activated in that range, it is case 2. Otherwise, it is case 3.
In total, this uses something like $$$2+3+4+\ldots+\log N$$$ bits.
#include <bits/stdc++.h>
using namespace std;
#define ll long long
#define rep(x,start,end) for(auto x=(start)-((start)>(end));x!=(end)-((start)>(end));((start)<(end)?x++:x--))
#define all(x) (x).begin(),(x).end()
#define sz(x) (int)(x).size()
const int L=18;
int n=1<<(L+1);
vector<bitset<192>> v;
void dnc(int l,int r,int idx,int d){
if (d==0) return;
int m=l+r>>1;
v[m][idx]=1;
idx-=d;
rep(x,l,m){
v[x][idx+__builtin_ctz(x+1-l)]=1;
}
dnc(l,m-1,idx-1,d-1);
dnc(m+1,r,idx-1,d-1);
}
vector<bitset<192>> init(){
rep(x,0,n) v.push_back(bitset<192>(0));
dnc(0,n-2,191,L);
v[n-1][0]=1;
return v;
}
int answer(bitset<192> v){
if (v[0]) return n-1;
int l=0,r=n-1;
int idx=191,d=L;
while (d){
int m=l+r>>1;
if (v[idx]) return m;
bool flag=false;
rep(x,idx-d,idx) if (v[x]) flag=true;
if (flag) r=m-1;
else l=m+1;
idx-=d+1;
d--;
}
return l;
}
During testing, oolimry and icypiggy decided it would be funny to write solutions that had $$$X \approx c \cdot \log N$$$ (of course they also had to ensure they could answer queries quickly so these are more of describing speedup techniques rather than techniques for reducing number of bits).
(this was written by oolimry)
Let $$$B = \sqrt N$$$, we split it into $$$B$$$ buckets of size $$$B$$$ each. There are $$$3$$$ layers.
We generate $$$B$$$ random $$$36$$$-bit numbers such that all $$$O(B^2)$$$ ranges have unique xors, which we can store all unique xors into a map. Let $$$R[i]$$$ be i-th the random $$$36$$$ bit number. This will be used in layer 1 and layer 2.
For layer 1 ($$$36$$$ bits), the position is blank if $$$i \text{ mod } B != 0$$$. Else if $$$i \text{ mod } B == 0$$$ (i.e the boundary between two buckets), the position has the value $$$R[i/B]$$$ for those 36 bits.
Layer 2 ($$$36$$$ bits), the position has value $$$R[i \text{ mod } B]$$$.
If the range extends across more than $$$1$$$ bucket, layer 1 can detect it. Otherwise, layer 2 will tell us that the range is of the form $$$L + kB$$$ to $$$R + kB$$$ for some $$$k$$$, where $$$k$$$ is the bucket number. Now we use layer 3 to determine which bucket.
Layer 3 ($$$38$$$ bits) is assigned to all positions and can just be assigned randomly. To find the value of $$$k$$$, just check all B buckets and verify if the range xor matches. Layer 3 serves as a random checksum.
[The values of 36 and 38 are arbitrary]
#include <bits/stdc++.h>
using namespace std;
#define all(x) (x).begin(), (x).end()
#define sz(x) ((int) x.size())
#define show(x) cerr << #x << " is " << x << endl;
#define show2(x, y) cerr << #x << " is " << x << ", " << #y << " is " << y << endl;
#define show3(x,y,z) cerr << #x << " is " << x << " " << #y << " is " << y << " " << #z << " is " << z << endl;
typedef long long lint;
typedef pair<lint, lint> ii;
typedef pair<lint, ii> iii;
mt19937_64 rng(time(0));
int n;
lint arr[2005];
bitset<192> arrb[2005];
vector<iii> M;
lint bitsA = 38;
lint bucket = 2000;
lint bitsD = 36;
bitset<192> brr[2005];
bitset<192> firstmask;
bitset<192> secondmask;
bitset<192> thirdmask;
vector<bitset<192> > toset;
vector<bitset<192> > prefxor;
pair<int, vector<bitset<192> > > init(){
int n = 1<<19;
toset.resize(n);
prefxor.resize(n);
rng.seed(1633860579000212400);
for(int i = 0;i < bucket;i++){
arr[i] = rng() & ((1LL<<bitsA) - 1LL);
}
for(int l = 0;l < bucket;l++){
lint X = 0;
for(int r = l;r < bucket;r++){
X ^= arr[r];
M.push_back(iii(X, ii(l,r)));
}
}
sort(all(M));
for(int i = 0;i < bucket;i++){
bitset<192> bs(arr[i]);
arrb[i] = bs;
}
for(int i = 0;i < n;i++){
toset[i] |= arrb[i%bucket];
if(i%bucket == 0){
toset[i] |= (arrb[i/bucket] << bitsA);
}
bitset<192> bs(rng()&((1LL<<bitsD)-1));
bs <<= (2*bitsA);
toset[i] |= bs;
}
for(int i = 0;i < n;i++){
prefxor[i] = toset[i];
if(i != 0) prefxor[i] ^= prefxor[i-1];
}
return {n, toset};
}
int answer(bitset<192> bs){
lint B = 0;
for(int b = 0;b < bitsA;b++){
if(bs[b+bitsA]) B |= (1LL<<b);
}
auto it = lower_bound(all(M), iii(B, ii(-1,-1)));
if(it != M.end() and it->first == B){
return (it->second.first) * bucket;
}
lint A = 0;
for(int d = 0;d < bitsA;d++){
if(bs[d]) A |= (1LL<<d);
}
auto it2 = lower_bound(all(M), iii(A, ii(-1,-1)));
for(int b = 0;b < bucket;b++){
int L = b*bucket + (it2->second).first;
int R = b*bucket + (it2->second).second;
bitset<192> BBS = prefxor[R];
if(L > 0) BBS ^= prefxor[L-1];
if(BBS == bs){
return L;
}
}
assert(false);
}
(this was written by errorgorn so might be a gross oversimplication :P)
If we randomly assign bits such that they were periodic with period $$$T$$$ and their xor-sum of a subarray with length $$$T$$$ was $$$0$$$, then we could find several candidates for $$$(l \text{ mod } T,r \text{ mod } T)$$$. We can store these bits for several coprime values of $$$T$$$ then we have another checksum to ensure that our candidate is correct.
Do note that this solution actually recovers the original subarray!
#include <bits/stdc++.h>
#include "battleship.h"
using namespace std;
const int N = (1<<19);
vector<bitset<192>> V;
const int m0 = 801;
const int m[3] = {797, 787, 773};
const int bits = 21;
mt19937 gen(0); //Standard mersenne_twister_engine seeded with rd()
uniform_int_distribution<> distrib(1, (1<<bits)-1);
int crt0[m0+2][m0+2];
int crt1[m0+2][m0+2];
int crt2[m0+2][m0+2];
int ctr[(1<<bits)+1];
pair<short,short> p[m0*m0];
int next_random() {
return distrib(gen);
}
const bitset<192> mask = bitset<192>((1<<22)-1);
vector<bitset<192>> init() {
V.resize(N);
for(int i=0; i<N; i++) V[i].reset();
vector<int> v;
v.push_back(0);
set<int> s;
while(v.size() <= m[0]) {
int x;
do {
x = next_random();
} while(s.find(x)!=s.end()); // make sure x does not already exist
s.insert(x);
v.push_back(x);
}
for(int i=0; i<m[0]; i++) {
for(int j=0; j<i; j++) {
ctr[v[i]^v[j]]++;
}
}
for(int i=1; i<=(1<<bits); i++) {
ctr[i] += ctr[i-1];
}
for(int i=0; i<m[0]; i++) {
for(int j=0; j<i; j++) {
int* pt = &(ctr[v[i]^v[j]]);
(*pt)--;
p[*pt] = make_pair(i,j);
}
}
for(int a=0; a<3; a++) {
v[m[a]] = 0;
//v.push_back(0);
for(int i=0; i<N; i++) {
int r = i % m[a];
int d = v[r+1] ^ v[r]; // copy the bits of d into V[i]
V[i] |= ((bitset<192>)(d))<<(a*22);
}
// counting sort (i,j) based on (v[i]^v[j])
}
for(int i=0; i<N; i++) {
long long tmp = (79143LL*i) ^ (79143LL*(i+1));
V[i] |= ((bitset<192>)(tmp)) << 66;
}
for(int i=1; i<=N; i++) {
crt1[i%m[0]][i%m[2]] = i;
crt2[i%m[0]][i%m[1]] = i;
}
return V;
}
int answer(bitset<192> x) {
// extract the lowest 66 bits into a number
int a[3];
a[0] = static_cast<int>((x&mask).to_ulong());
a[1] = static_cast<int>(((x>>22)&mask).to_ulong());
a[2] = static_cast<int>(((x>>44)&mask).to_ulong());
long long z = (((x>>66)).to_ullong());
if(a[2]==0) {
int b = 0;
int c = 1;
for(int i1=ctr[a[b]]; i1 < ctr[a[b]+1]; i1++) {
pair<short,short> p0 = p[i1];
for(int i2=ctr[a[c]]; i2 < ctr[a[c]+1]; i2++) {
pair<short,short> p1 = p[i2];
int st = crt2[p0.first][p1.first];
int en = crt2[p0.second][p1.second];
if(st<=N && en<=N && (z ^(79143LL*st) ^ (79143LL*en))==0) {
return (st+en)/2;
}
swap(p1.first, p1.second);
st = crt2[p0.first][p1.first];
en = crt2[p0.second][p1.second];
if(st<=N && en<=N && (z ^(79143LL*st) ^ (79143LL*en))==0) {
return (st+en)/2;
}
}
}
}
if(a[1]==0) {
int b = 0;
int c = 2;
for(int i1=ctr[a[b]]; i1 < ctr[a[b]+1]; i1++) {
pair<short,short> p0 = p[i1];
for(int i2=ctr[a[c]]; i2 < ctr[a[c]+1]; i2++) {
pair<short,short> p1 = p[i2];
int st = crt1[p0.first][p1.first];
int en = crt1[p0.second][p1.second];
if(st<=N && en<=N && (z ^(79143LL*st) ^ (79143LL*en))==0) {
return (st+en)/2;
}
swap(p1.first, p1.second);
st = crt1[p0.first][p1.first];
en = crt1[p0.second][p1.second];
if(st<=N && en<=N && (z ^(79143LL*st) ^ (79143LL*en))==0) {
return (st+en)/2;
}
}
}
}
if(true) {
int b = 1;
int c = 2;
for(int i1=ctr[a[b]]; i1 < ctr[a[b]+1]; i1++) {
pair<short,short> p0 = p[i1];
for(int i2=ctr[a[c]]; i2 < ctr[a[c]+1]; i2++) {
pair<short,short> p1 = p[i2];
//int st = crt0[p0.first][p1.first];
//int en = crt0[p0.second][p1.second];
int st = static_cast<int>((391139LL * p1.first) + (217213LL * p0.first)) % (m[1]*m[2]);
int en = static_cast<int>((391139LL * p1.second) + (217213LL * p0.second)) % (m[1]*m[2]);
if(st<=N && en<=N && (z ^(79143LL*st) ^ (79143LL*en))==0) {
return (st+en)/2;
}
swap(p1.first, p1.second);
st = static_cast<int>((391139LL * p1.first) + (217213LL * p0.first)) % (m[1]*m[2]);
en = static_cast<int>((391139LL * p1.second) + (217213LL * p0.second)) % (m[1]*m[2]);
if(st<=N && en<=N && (z ^(79143LL*st) ^ (79143LL*en))==0) {
return (st+en)/2;
}
}
}
}
assert(false);
return 0;
}
Anyways, this raises some interesting questions about the minimal $$$X$$$ we need if we completely do not care about answering queries quickly. A trivial lower bound is $$$X \approx \log N$$$ since we could query $$$[pos,pos]$$$ for all values of $$$pos$$$. An upper bound is randomly choosing bits. I would think that it is reasonable to assume that the xor-sum of subarrays should be independent and we should require somewhere around $$$X \approx 4 \cdot \log N$$$ bits?
Is there a better lower and upper bound $$$X$$$ in this problem?
I believe that I can improve the lower bound to $$$2 \log n$$$ (asymptotically) by showing that any array $$$A$$$ for which the problem is 100% solvable will also allow you to recover the interval $$$[L, R]$$$ (up to a small edge case that doesn't affect asymptotics). In this case, being able to recover any interval means that any two distinct intervals must have different xor-sums.
Let $$$P_i = A_1 \oplus A_2 \oplus \dots \oplus A_i$$$. Then the xor-sum of $$$[L,R] $$$ is $$$P_{L-1} \oplus P_R$$$. Now the only way an interval is un-recoverable is if there are two interval $$$[L,R], [L',R']$$$ (WLOG $$$L \leq L'$$$) with the same xor-sum. This means $$$P_{L-1} \oplus P_R = P_{L'-1} \oplus P_R' \implies P_{L-1} \oplus P_R \oplus P_{L'-1} \oplus P_R' = 0 \implies P_{L-1} \oplus P_{L'-1} = P_R \oplus P_R'$$$.
Now if we know the original problem is solvable, then there must be some $$$z$$$ we can answer such that $$$L \leq z \leq R$$$, $$$L' \leq z \leq R'$$$. This implies $$$L' \leq z \leq R \implies L' - 1 < R$$$.
Now this means that the two intervals $$$[L, L'-1], [R+1, R']$$$ (if they exist) must have the same xor-sum and do not overlap, meaning the original problem was not 100% solvable. Note that if $$$R > R'$$$ we use $$$[R'+1,R]$$$ instead.
The only exception is when $$$L = L'$$$ or $$$R = R'$$$. However in either case this leads to some interval $$$[B, C]$$$ with xor-sum $$$0$$$, which means (unless $$$B = C$$$) that the xor-sums of the non-overlapping intervals $$$[B,B],[B+1,C]$$$ are the same, which is bad.
The only remaining exception is an interval of length $$$1$$$ with xor-sum $$$0$$$, which we can see is an actual counterexample to my earlier claim in the form of arrays like $$$A = [0, 1]$$$ for which the original problem is solvable, but there are two distinct intervals with xor-sum 1. However, as we can have at most one element $$$A_i$$$ equal to 0, this only allows us to increase $$$N$$$ by at most 1.
For completeness, I'll explain why being able to recover the interval forces $$$X \geq 2 \log N$$$ asymptotically. This is just because there must be at least $$$\binom{N}{2}$$$ xor-sums of intervals, and each xor-sum is between $$$0$$$ and $$$2^X$$$. As each xor-sum must be distinct, we get $$$2^X \geq \binom{N}{2} \implies X \geq \log \binom{N}{2} \approx 2 \log N$$$.
The upper bound is $$$2 \cdot \log N$$$ bits. Check XX OpenCup GP of Zhejiang Problem J.