Hello!
I wish to continue the discussion on C++ input/output performance, which was started by freopen long ago. freopen compared the speed of the two common input/output methods in C++: the stdio library, inherited from C (<stdio.h>), and the newer iostreams library (<iostream>/…). However, these tests did not account for the fact that there are several important iostreams optimizations which can be used. They have been mentioned more than once on Codeforces (first, second, third). I have written a program that compares the performance of the stdio and iostreams libraries with these optimizations turned on.
UPD1: Added information on _CRT_DISABLE_PERFCRIT_LOCKS.
UPD2: Added printf()/scanf() calls on strings for completeness.
What are the optimizations?
The first one is enabled by placing this line at the beginning of the program, before any input/output:
ios_base::sync_with_stdio(false);
This command turns off iostreams and stdio synchronization (description). It is on by default, which means that calls to iostreams and stdio functions can be freely interleaved even for the same underlying stream. When synchronization is turned off, mixing calls is no longer allowed, but iostreams can potentially operate faster.
The second optimization is about untying cin from cout:
cin.tie(NULL);
By default, cin is tied to cout, which means that cout is flushed before any operation on cin (description). Turning this feature off allows iostreams, again, to operate faster. One should be careful with this optimization in interactive problems: it should either not be used, or an explicit flush should be issued each time.
I should also note that frequent use of endl negatively affects iostreams performance, because endl not only outputs a newline character, but also flushes the stream's buffer (description). You can simply output '\n' or "\n" instead of endl.
What tests are included in the program?
I have tried to reproduce the most typical cases that occur when solving problems.
- int input/output with stdio, iostreams and, for comparison, custom functions
- double input/output
- Character input/output
- String input/output: both char * and std::string
What tests are not included in the program?
- long long: I estimate it to give roughly the same relative results as int
- Manual conversion of int to a custom char buffer of a fairly large size, which would then get directly output with fwrite()/cout.write() (and the same for input)
- Rather unusual character input/output method with cin.rdbuf()->sgetc() and cout.rdbuf()->sputc()
- Any tests that change the stream buffer size (it seems that in GCC iostreams is unaffected by user settings for standard streams). This can be potentially explored more thoroughly.
How do I run this?
Compile the program, not forgetting optimization (-O2/Release), then run it with the working directory set to the directory that contains the program binary. If you get an Access denied message on Windows, running the program with elevated privileges could help. The program will need about two hundred megabytes of free space in the directory for temporary files.
Additional notes
- Why does each test need a separate process?
Because ios_base::sync_with_stdio(false) disallows combined stdio and iostreams usage, and also, theoretically, prohibits using freopen() to redirect cin/cout.
- Why is the test file removed before each new test?
To have equal conditions for all runs. However, this is debatable. Maybe it would be better to overwrite the file?
- Why does the child process measure the time, and not the parent process?
To exclude process creation/destruction time from the results.
- Why can't you use something more precise like getrusage() instead of clock()?
I can. That is, once I understand how to do it on Windows :-)
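The in-process measurement with clock() described in these notes can be sketched as follows. cpu_seconds is a hypothetical helper written for illustration, not a function taken from the benchmark program.

```cpp
#include <ctime>

// Runs the workload and returns the time reported by std::clock() for it,
// in seconds. Measuring inside the process keeps process creation and
// destruction out of the result.
template <class Workload>
double cpu_seconds(Workload run) {
    const std::clock_t start = std::clock();
    run();
    const std::clock_t finish = std::clock();
    return static_cast<double>(finish - start) / CLOCKS_PER_SEC;
}
```

A child process would wrap its single test in a call such as cpu_seconds([] { /* test body */ }) and report the returned value to the parent.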
The results
I ran the tests on a PC with Pentium 4, so the figures might look a bit big.
- For Visual C++ 2010: http://pastie.org/4680309
int, printf          9.45    9.48    9.44
int, cout           22.03   22.01   22.21
int, custom/out     11.17   11.06   11.20
int, scanf           5.04    4.77    4.82
int, cin            20.26   20.16   20.16
int, custom/in      10.25   10.25   10.25
double, printf      19.23   18.98   18.95
double, cout        37.49   37.52   37.44
double, scanf       12.11   11.75   11.73
double, cin         26.88   26.57   26.57
char, putchar       13.29   13.76   13.48
char, cout          23.52   24.15   23.41
char, getchar       12.87   12.82   12.74
char, cin           16.13   16.22   16.50
char *, printf       6.88    6.74    6.57
char *, puts         3.95    3.82    3.95
char *, cout         6.36    6.32    6.43
string, cout         6.40    6.40    6.61
char *, scanf        6.16    6.10    6.13
char *, gets         3.98    3.96    3.96
char *, cin          8.72    8.91    8.85
string, getline     11.70   11.47   11.53
Here, everything is obvious: stdio is a lot faster than iostreams. It is notable that printf()/scanf() are even faster than the custom-written functions for int (but see addendum below). puts()/gets() are faster than printf()/scanf() on strings, which is understandable. Writing a std::string takes the same time as for char *, but reading to a std::string is slower, most likely because of the need to dynamically allocate memory.
- For MinGW (GCC 4.7.0): http://pastie.org/4680314
int, printf          9.72    9.61    9.61
int, cout            6.08    6.05    6.10
int, custom/out      2.73    2.75    2.76
int, scanf           5.01    5.01    5.01
int, cin             3.99    4.04    4.04
int, custom/in       0.86    0.86    0.87
double, printf      22.51   22.40   22.42
double, cout       110.98  111.77  111.01
double, scanf       12.18   12.20   12.17
double, cin        118.87  118.84  118.87
char, putchar        1.67    1.65    1.64
char, cout           3.93    3.87    3.85
char, getchar        0.78    0.80    0.80
char, cin            3.29    3.31    3.29
char *, printf       5.55    5.47    5.49
char *, puts         5.37    5.32    5.41
char *, cout         8.72    8.72    8.78
string, cout         8.74    8.71    9.06
char *, scanf        7.07    7.04    7.02
char *, gets         3.84    3.79    3.77
char *, cin          5.30    5.38    5.35
string, getline     14.15   14.12   14.16
This one is not so one-sided. Quite unexpectedly, it turns out that iostreams is about 20-40% faster than stdio for int. The custom int functions beat both by a significant margin, though. For double it's reversed: iostreams is very slow. putchar()/getchar() work about 2-4 times faster than cout/cin for character input/output. String input/output does not differ as much, and stdio is mostly faster here too, although cin beats scanf() when reading a char *. puts()/gets() are again faster than printf()/scanf() on string input/output. As in the previous case, std::string takes the same time as char * to be output, but more time to be input.
I leave it up to the readers to draw conclusions and decide what to use. Constructive discussion is welcome.
Addendum
For Visual C++, there is a method to significantly speed up basic operations on stdio streams by turning off stream locking for the getchar(), putchar() and some other functions. To do this, add this line before any #includes:
#define _CRT_DISABLE_PERFCRIT_LOCKS
(description). This will work only if the following conditions are also met:
- The program must be statically linked with the standard library (/MT; Codeforces seems to do this)
- The program can include <stdio.h>, but must not include <cstdio> or any of the iostreams headers (<iostream>/…)
As an alternative to the magic above, you can just use _putchar_nolock()/_getchar_nolock() instead of putchar()/getchar(). Linux also has similar functions: link.
With this optimization character input/output speed increases nearly ninefold (!), and so does the speed of custom int functions:
int, custom/out      1.70    1.70    1.72
int, custom/in       1.28    1.26    1.28
char, putchar        1.72    1.62    1.61
char, getchar        1.36    1.34    1.36
MinGW does this by default and is not subject to the aforementioned restrictions.
Can you tell me about custom/in and custom/out? What do they mean? Thank you.
Ctrl+F "custom" in the program linked in the post.
(I see that this is a rather old entry, but since it was brought up in recent actions, I will add my input.)
I think it would also be interesting to have results for different precision settings: "not set" (which is probably always a bad idea at contests) vs. small (e.g. 2) vs. large (e.g. 10).
Is ios::sync_with_stdio(0); the same as ios_base::sync_with_stdio(0)?
Do you mean std::basic_ios? Yes, it is inherited from the std::ios_base class.
Here are my results, using Ubuntu 18.04, 3.8GHz i7-7700HQ CPU, g++ 7.4.0 (compiled with -std=c++11 since gets is removed from c++14):
andreyv, do you mind if I modify this code (with attribution) for testing? If possible, what license do you release this code under?
Sure. This code is in the public domain.
Also tentative evidence that (CodeForces compilers) GNU C++17 7.3.0 I/O is faster than GNU C++11 5.1.0 I/O.
awesome post!