mouse_wireless's blog

By mouse_wireless, history, 6 years ago, In English

If you've been doing competitive programming for a while, you're probably aware by now that dynamically allocating memory (i.e. using new or malloc) is pretty expensive in terms of computation time (both because of slow allocation and potential cache misses).

What many people do to avoid dynamically allocated memory is implement a "memory cache": a pre-allocated array of elements from which you can fetch elements at will. In code, it might look something like this:

elem elemCache[MAX_NUMBER_OF_ELEMENTS];  // pre-allocated pool of nodes
int nxtElemCache = 0;                    // index of the first unused slot

elem *fetchElement() {
  // hand out the next unused slot; it is never freed and never moves
  return &elemCache[nxtElemCache++];
}

And instead of using the new operator, you just call the fetchElement function (you can also overload operator new itself, but that is slightly less readable for the purposes of this thread, so we'll leave it at that).
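If you do want the operator new route, a minimal sketch could look like the following (my illustration, not code from the post; the field names and the bound are placeholders): the class-specific operator new simply hands out the next slot of the pool, and operator delete becomes a no-op.

#include <cstddef>

const int MAX_NUMBER_OF_ELEMENTS = 1 << 22;  // assumed bound, pick it per problem

struct elem {
  int lft, rgt;        // placeholder fields, just for the sketch
  long long sum;

  // route every `new elem` through the pre-allocated pool
  void *operator new(std::size_t);
  void operator delete(void *) {}  // no-op: the pool owns all the memory
};

elem elemCache[MAX_NUMBER_OF_ELEMENTS];
int nxtElemCache = 0;

void *elem::operator new(std::size_t) {
  return &elemCache[nxtElemCache++];
}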

The problem arises when you do not know what MAX_NUMBER_OF_ELEMENTS is. For example, say you are implementing a data structure (maybe a persistent segment tree) which you are not very familiar with. You can usually approximate this number (take the segment tree example: you know that at every step you create at most two nodes and you do O(log N) steps, so 2 * N * log N is probably a safe upper bound; maybe increase the factor for safety). But since you are not familiar with the structure, you might not know a good upper bound. Maybe you just don't want to waste time thinking about it, or maybe you are simply afraid that your calculations are wrong and don't want to risk a wasted submission.
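As a quick sanity check of that kind of estimate (my numbers, not the author's): for a persistent segment tree over N = 2·10^5 elements, log2 N is about 18, so 2 · N · log N comes out to roughly 7.2·10^6 nodes. Whether you can afford to double or triple that "for safety" then depends on the node size and the memory limit, and that is exactly the kind of calculation the approach below lets you skip.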

An elegant solution is to use std::deque. An example of how this would work in code:

deque<elem> elemCache;

elem *fetchElement() {
  elemCache.push_back(elem());
  return &elemCache.back();
}

Simply add an element to the back of your deque cache and return the address of this element.
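To make this concrete, here is roughly how the deque cache plugs into a persistent segment tree point update (a sketch of mine with made-up node fields, not the author's code). Note that cur stays valid even though the recursive calls keep pushing new nodes into the deque.

#include <deque>
using namespace std;

struct elem {
  elem *lft = nullptr, *rgt = nullptr;
  long long sum = 0;
};

deque<elem> elemCache;

elem *fetchElement() {
  elemCache.push_back(elem());
  return &elemCache.back();
}

// Persistent point update: clone the root-to-leaf path, share everything else.
elem *update(elem *old, int lo, int hi, int pos, long long val) {
  elem *cur = fetchElement();           // stays valid: the deque never moves it
  if (old) *cur = *old;                 // copy the previous version's node
  if (lo == hi) { cur->sum += val; return cur; }
  int mid = (lo + hi) / 2;
  if (pos <= mid) cur->lft = update(old ? old->lft : nullptr, lo, mid, pos, val);
  else            cur->rgt = update(old ? old->rgt : nullptr, mid + 1, hi, pos, val);
  cur->sum = (cur->lft ? cur->lft->sum : 0) + (cur->rgt ? cur->rgt->sum : 0);
  return cur;
}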

You might be asking what makes std::deque such a good candidate for this job. Well, there are quite a few reasons:

  • pushing to the end of std::deque takes amortized constant time (similar to std::vector)

  • unlike something like std::vector, references to elements of a deque are never invalidated when pushing at the front or back of the structure, and this is guaranteed by the standard; once you have allocated an element, it stays at that address for good, which is exactly the behavior you want: the pointers you hand out never become invalid (a short demonstration follows this list);

  • unlike something like std::list, it incurs very little memory overhead;

  • cache efficiency is maintained (again, unlike something like std::list), since internally std::deque holds contiguous chunks of memory.
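To illustrate the reference-stability point from the list above, here is a small standalone check (my own sketch, not from the post):

#include <cassert>
#include <deque>
#include <vector>

int main() {
  std::vector<int> v;
  v.push_back(1);
  int *pv = &v[0];
  for (int i = 0; i < 1000; ++i) v.push_back(i);
  // pv may now be dangling: a vector reallocation moves the elements

  std::deque<int> d;
  d.push_back(1);
  int *pd = &d.front();
  for (int i = 0; i < 1000; ++i) d.push_back(i);
  assert(*pd == 1);  // guaranteed: deque push_back never invalidates references
  (void)pv;
  return 0;
}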

Here are some submissions for last contest's problem F using a persistent segment tree:

As a later edit: this method also supports deletion (and later re-using the "deleted" memory) with the help of an additional stack. The idea is simple: instead of physically deleting a node, just throw it into a "recycle bin". When you want to fetch a new node, first check whether there is anything in the recycle bin. Implementation below:

stack<elem*> recycleBin;  // pointers to "deleted" nodes, ready for reuse
deque<elem> elemCache;

elem *fetchElement() {
  // prefer a recycled node over growing the cache
  if (!recycleBin.empty()) {
    elem *temp = recycleBin.top();
    recycleBin.pop();
    return temp;
  }
  elemCache.push_back(elem());
  return &elemCache.back();
}

inline void deleteElement(elem *e) {
  // "deleting" just means remembering the node so it can be handed out again
  recycleBin.push(e);
}
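As a quick illustration of the recycling (my own example, continuing with the definitions above): deleting a node and then fetching again hands back the same address, so the slot really is reused. Note that the recycled node still holds its old field values, so reset it yourself if you rely on fresh defaults.

elem *a = fetchElement();
deleteElement(a);           // goes into the recycle bin; nothing is actually freed
elem *b = fetchElement();   // pops the recycled slot instead of growing the deque
// here a == b, and *b still contains whatever *a held before "deletion"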


»
4 years ago, # | Rev. 2

It took me a moment to understand, so I'll share it here:

  • std::deque is NOT faster than a static array. The time difference between 39729224 and 39783231 in favour of std::deque is caused by the static-array version allocating significantly more memory (and perhaps by setting non-default values in the constructor). Here is a faster submission using a static array: 103981478.

  • the overhead of std::deque is CRAZY SMALL for this application, which is great. But that doesn't mean it is the best structure or that we should use it everywhere.

  • std::deque != avoiding dynamically allocated memory. Just because the container is responsible for the dynamic allocation doesn't mean we are avoiding it in our code.

  • cache efficiency is maintained (again, unlike something like std::list), since internally std::deque holds contiguous chunks of memory.

    Yes and no. cppreference says that each chunk can store ~16 underlying objects. After we fill one chunk, we need to allocate memory for another one. But the current chunk and the new chunk will not necessarily be "close" to each other in memory, which can cause a drop in performance in some applications, such as linear search.

  • readability tip: use std::stack if you only intend to push and access the top element; std::stack is a container adapter (over std::deque by default).
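In code, that tip just amounts to the following (std::deque is the default underlying container of std::stack, so the declaration used in the post already goes through it):

#include <deque>
#include <stack>

struct elem;  // the node type from the post above

std::stack<elem*> recycleBin;  // container adapter, std::deque underneath by default
// exactly equivalent to: std::stack<elem*, std::deque<elem*>> recycleBin;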