Python is a great programming language: simple, expressive, compact.↵
↵
In Codeforces, Python is often the best choice for Div 2 A and B problems. For example, problem [problem:600A] is very easy to write in Python: first tokenize the string with the built-in `split()` function, then try to parse the integers with the built `int()`, then output the comma-separated strings of results with `",".join(lst)`.↵
↵
For the more complex problems, writing fast-enough Python code is often a challenge. Here is a list of tips to improve performance.↵
↵
- Use PyPy instead of the standard Python intepreter. According to [20 offical benchmarks](http://speed.pypy.org/) it is on average 7 times faster than Python 2. PyPy2 is in my opionion the best option at the moment for competitive programming.↵
↵
- Append to an existing string with "+=" instead of concatening **more than two** strings with "+" and storing the result with "=". (With two strings, both ways work fine. `s1 = s1 + s2` is fast because Python interpreter optimizes that to `s1 += s2`.)↵
↵
- `string.join(iterable)` is as fast as it gets.↵
↵
- list comprehension is faster than `for` loops.↵
↵
- prefer `xrange()` instead of `range()`, as the former returns an iterator, while the latter a new list.↵
↵
- avoid `zip()` in Python 2 (constructs) a list, prefer a `for` loop instead. `zip()` is fine in Python 3 because there it returns an iterator.↵
↵
- it's ok to use Python list as an array; it has O(1) element access time. It is NOT ok to use it as queue, even if the list is sorted. Use `collections.deque` instead if you need fast extraction of the max element.↵
↵
↵
↵
### Experimental results↵
↵
For the following tests I used a Lenovo laptop with Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz. Each test was run 10 times, and the average results are reported.↵
↵
### I/O: Python vs. `printf/scanf` vs. `cin/cout`↵
↵
**The problem:** read a file with integers from 0 to 500000, store them in an array, then print out the array.↵
↵
**Implementations:** [scanf/printf](http://pastebin.com/LbFKyXJM), [cin/cout](http://pastebin.com/jHgkrnxU), ["fast" cin/cout](http://pastebin.com/6BGYQNe0), [Python 2](http://pastebin.com/4VwDk9iQ), [Python 3](http://pastebin.com/tTprhue4).↵
↵
**The results:**↵
↵
~~~~~↵
scanf/printf: 0.1056 sec↵
cin/cout: 0.2274 sec↵
"fast" cin/cout: 0.0943 sec↵
↵
Python 2: 0.3535 sec↵
PyPy 2: 0.1645 sec↵
↵
Python 3: 0.2535 sec↵
PyPy 3: 0.3324 sec↵
~~~~~↵
↵
**Verdict:** C++ is faster, but PyPy2 should be fast enough for most problems.↵
↵
### Iterator construction vs. list construction↵
↵
**The problem:** sum all integers from 0 to 10e6 by using a `for` loop.↵
↵
**The results:** ↵
↵
Loop with `range`:↵
↵
~~~~~↵
python2: 0.502402067184 sec↵
pypy2: 0.0170 sec↵
~~~~~↵
↵
Loop with `xrange`:↵
↵
~~~~~↵
python2: 0.334251880646 sec↵
pypy: 2: 0.0166962146759 sec↵
~~~~~↵
↵
Loop with `range` in Python3 that is in fact the same as `xrange` in Python2:↵
↵
~~~~~↵
python3: 0.6606624750002084 sec↵
pypy3: 0.0168 sec↵
~~~~~↵
↵
**Verdict:** the differences between PyPy and standard Python are actually much much more important than range vs. xrange!↵
↵
### List construction: iterative growth vs. pre-reserving all memory at start.↵
↵
**The problem:** Create a list with 1e6 elements.↵
↵
The code when using iterative growth:↵
↵
~~~~~↵
l = []↵
for x in range(1000000):↵
l.append(x)↵
~~~~~↵
↵
The code when pre-reserving memory:↵
↵
~~~~~↵
l = [0] * 1000000↵
for x in range(1000000):↵
l[x] = x↵
~~~~~↵
↵
**The results:** ↵
With append:↵
↵
~~~~~↵
pypy2: 0.483808994293 sec↵
python2: 0.698101997375 sec↵
pypy3: 0.510159969329 sec↵
python3: 0.844806871000 sec↵
~~~~~↵
↵
With []:↵
↵
~~~~~↵
pypy2: 0.0570020675659 sec↵
python2: 0.618577957153 sec↵
pypy3: 0.06054091453552246 sec↵
python3: 0.6261008259998562 sec↵
~~~~~↵
↵
**Verdict:** PyPy is able to take advantage of the faster algorithm (with one-time growth), while standard Python is not able to. The PyPy speedup is an order-of-magnitude (~10 times).↵
↵
↵
**Result summary**: PyPy2 beats the others hands down in these tests. PyPy3 looks a bit unread at the moment, sometimes slower than the standard Python.
↵
In Codeforces, Python is often the best choice for Div 2 A and B problems. For example, problem [problem:600A] is very easy to write in Python: first tokenize the string with the built-in `split()` function, then try to parse the integers with the built `int()`, then output the comma-separated strings of results with `",".join(lst)`.↵
↵
For the more complex problems, writing fast-enough Python code is often a challenge. Here is a list of tips to improve performance.↵
↵
- Use PyPy instead of the standard Python intepreter. According to [20 offical benchmarks](http://speed.pypy.org/) it is on average 7 times faster than Python 2. PyPy2 is in my opionion the best option at the moment for competitive programming.↵
↵
- Append to an existing string with "+=" instead of concatening **more than two** strings with "+" and storing the result with "=". (With two strings, both ways work fine. `s1 = s1 + s2` is fast because Python interpreter optimizes that to `s1 += s2`.)↵
↵
- `string.join(iterable)` is as fast as it gets.↵
↵
- list comprehension is faster than `for` loops.↵
↵
- prefer `xrange()` instead of `range()`, as the former returns an iterator, while the latter a new list.↵
↵
- avoid `zip()` in Python 2 (constructs) a list, prefer a `for` loop instead. `zip()` is fine in Python 3 because there it returns an iterator.↵
↵
- it's ok to use Python list as an array; it has O(1) element access time. It is NOT ok to use it as queue, even if the list is sorted. Use `collections.deque` instead if you need fast extraction of the max element.↵
↵
↵
↵
### Experimental results↵
↵
For the following tests I used a Lenovo laptop with Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz. Each test was run 10 times, and the average results are reported.↵
↵
### I/O: Python vs. `printf/scanf` vs. `cin/cout`↵
↵
**The problem:** read a file with integers from 0 to 500000, store them in an array, then print out the array.↵
↵
**Implementations:** [scanf/printf](http://pastebin.com/LbFKyXJM), [cin/cout](http://pastebin.com/jHgkrnxU), ["fast" cin/cout](http://pastebin.com/6BGYQNe0), [Python 2](http://pastebin.com/4VwDk9iQ), [Python 3](http://pastebin.com/tTprhue4).↵
↵
**The results:**↵
↵
~~~~~↵
scanf/printf: 0.1056 sec↵
cin/cout: 0.2274 sec↵
"fast" cin/cout: 0.0943 sec↵
↵
Python 2: 0.3535 sec↵
PyPy 2: 0.1645 sec↵
↵
Python 3: 0.2535 sec↵
PyPy 3: 0.3324 sec↵
~~~~~↵
↵
**Verdict:** C++ is faster, but PyPy2 should be fast enough for most problems.↵
↵
### Iterator construction vs. list construction↵
↵
**The problem:** sum all integers from 0 to 1
↵
**The results:** ↵
↵
Loop with `range`:↵
↵
~~~~~↵
python2: 0.502402067184 sec↵
pypy2: 0.0170 sec↵
~~~~~↵
↵
Loop with `xrange`:↵
↵
~~~~~↵
python2: 0.334251880646 sec↵
pypy
~~~~~↵
↵
Loop with `range` in Python3 that is in fact the same as `xrange` in Python2:↵
↵
~~~~~↵
python3: 0.6606624750002084 sec↵
pypy3: 0.0168 sec↵
~~~~~↵
↵
**Verdict:** the differences between PyPy and standard Python are actually much much more important than range vs. xrange!↵
↵
### List construction: iterative growth vs. pre-reserving all memory at start.↵
↵
**The problem:** Create a list with 1e6 elements.↵
↵
The code when using iterative growth:↵
↵
~~~~~↵
l = []↵
for x in range(1000000):↵
l.append(x)↵
~~~~~↵
↵
The code when pre-reserving memory:↵
↵
~~~~~↵
l = [0] * 1000000↵
for x in range(1000000):↵
l[x] = x↵
~~~~~↵
↵
**The results:** ↵
With append:↵
↵
~~~~~↵
pypy2: 0.483808994293 sec↵
python2: 0.698101997375 sec↵
pypy3: 0.510159969329 sec↵
python3: 0.844806871000 sec↵
~~~~~↵
↵
With []:↵
↵
~~~~~↵
pypy2: 0.0570020675659 sec↵
python2: 0.618577957153 sec↵
pypy3: 0.0605409145355
python3: 0.6261008259998
~~~~~↵
↵
**Verdict:** PyPy is able to take advantage of the faster algorithm (with one-time growth), while standard Python is not able to. The PyPy speedup is an order-of-magnitude (~10 times).↵
↵
↵
**Result summary**: PyPy2 beats the others hands down in these tests. PyPy3 looks a bit unread at the moment, sometimes slower than the standard Python.