Hi. I want to share a simple method for coordinate compression. Even noob can write it.
vector<int> a(n);
// read the vector
vector<int> b = a;
sort(b.begin(), b.end());
map<int, int> m;
for (int i = 0; i < n; i++) {
m[b[i]] = i;
}
for (int i = 0; i < n; i++) {
a[i] = m[a[i]];
}
Now every value of an array lies in [0, n)
. The most convineint it that if you need the original value for a[i]
, you can just write b[a[i]]
.
And how do you write it?
I think when you are initializing the map, you need to use
m[b[i]]
instead ofm[a[i]]
, right?Yes, thank you.
+1, this method is faster because it doesn't use sets/maps
The one from the blog doesn't map to values 0,1,2,3,... but instead, it leaves some gaps. This might be confusing while debugging.
EDIT: It might be even faster to sort pairs (value, index) to avoid binary search. This requires iterating the sorted sequence and doing
nxt++
when we have value different than previous value. Longer implementation, better runtime (I think).I know my method isn't fast. I just think it's very easy to comprehend for beginners.
Thanks for your idea, it is greate, here is my "short" template implementation from your code UwU
I don't claim that [N lower bounds] are slower than [sort]. I claim that sort + lower bounds are together slower than sort (even if it's sort on pairs).
Also, it's faster to choose the size of vector M while creating it, instead of doing emplace_back many times.
Yes sir, thanks for pointing my mistakes
And yes sir, but I just afraid that some class might not have constructor.
Here are my fixed versions
Is there anything in C++ that can be compared with
<
and!=
and doesn't have a constructor? Even if you create your own class with some int/string attributes, they are 0/empty by default. So it's only if you create a class with some custom constructor with arguments? Seems very rare in CP.Btw. I don't think something so simple should be templated and copy-pasted to use during a contest. In the long run, you are better off by just implementing it every time. This way you get to know an algorithm well and you can modify it.
Yes sir, it is really just very small chances for random cases when I read other solutions with non-constructor custome class/struct, but/and all of them also dont have comparators
Can I ask for a question related to the implementations ?
Yesterday I use priority_queue with value comparator and it is amolst TLE while using position comparator give TLE. Why using position are slower
And is it good to use my third implementation of coordinate compression than my other two ? (comparing using position instead of directly compare the values)
Accessing something in a separate array costs time and it isn't cache-efficient. It's better to have a value here.
Just run some tests and see which version is faster.
Oh thanks sir, I thought using index will reduce the amount of extra memory but now I know sometimes cache miss is really a problem
Yes sir, but instead of copying the template to every problems, I usually only write the
template<>
when I face such problem, since the algorithm is simple but it is still a bunch of lines of code if we use many times :VThat's nice code. It help me. Thank you!
Better
if(!m.count(b[i]) m[b[i]]=i;
My way
Notice that by using set, this would work faster when the number of duplicates higher, but will work slower when all array are unique
Notice that by using set, this would work faster when the number of duplicates higher, but will work slower when all array are unique
I came across a problem on CSES Salary Queries where Code using Set+Map for compression resulted in TLE, and Code using Vector+Sort+Map gave AC.
PS: The Time-Limit of the problem is 1s.
you don't need sets
Thanks for giving me a new idea
I forgot
auto
method so I thought it would be long to use something likebut also I think I can use
unordered_map
to assigning value faster (if hash function for specific class then it would be more worse)I write it the same way as you
Can you suggest some problems based on this cool trick?
Many algorithms only need the order of the number compare to each other and not they value, so something like {$$$1, 4, 9, 16$$$} can be use as {$$$1, 2, 3, 4$$$} with return the same result (smaller value, faster to do). So when you face some problem like that, we can compress the array down to smaller (since the limitation for $$$n$$$ usually much smaller than maximum value of $$$a_i$$$). One algorithm usually use this trick is Sweep Line
Well, for example 61E - Противник слаб. And in general it is used when you want to store a set of numbers with a segment tree.
[Deleted]
In unordered_map the keys are not sorted
I use this template: