TL;DR: Will people get angry if the judging servers for a contest use ARM64?
I'm setting up the hosting infrastructure for the Bay Area Programming Contest using DOMjudge, and as part of this setup I need to choose what machines the judgehosts will run on. I've decided to use AWS EC2 for hosting since I've already tested setting up DOMjudge as a dry run on there.
Of the machine families that EC2 provides, I've chosen to go with the burstable (T) instances, run in unlimited mode. (Also, if anyone has experience with using these for programming contests, I'd love to hear how it went for you. I read through some AWS docs and it seems that the compute performance should be stable if run in unlimited mode, but I haven't done any hands-on testing yet.) I'm currently picking between the T3 and T4g instances. T3 runs x86_64 and T4g runs ARM64.
I would like to use T4g because it's cheaper, but I'm concerned that some common hacks (e.g. the avx2 pragma) only work on x86_64 and that people will be angry about that. I tried doing some research into what architectures are commonly used in CP judges and I could only find Codeforces, which apparently uses Intel Skylake, and IOI, which uses Intel Core i5. Also, USACO Training mentions in section 1-2 that "programs are run on a modern processor but times are scaled to a 700 MHz Pentium III". All of these use x86, AFAIK. So ultimately I have two questions:
- From a technical perspective: What common hacks that competitive programmers use won't work on ARM64?
- From an organizing perspective: As a contest organizer, is it my responsibility to make sure these features are available? Or are participants responsible for understanding the pragmas and other lines of code that are often blindly copy-pasted?
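For concreteness, here is a minimal sketch of how a participant could keep the avx2 pragma from breaking a submission on a non-x86 judge. The macro name USING_AVX2_PRAGMA and the function sum_of_products are hypothetical, made up for this illustration. Excluding Clang in the guard is a conservative assumption on my part, not something a judge requires.

```cpp
#include <cstddef>
#include <cstdint>

// Enable AVX2 code generation only when compiling with GCC for x86_64.
// On any other target (e.g. ARM64) the pragma is skipped and the code
// below still compiles as plain portable C++.
// USING_AVX2_PRAGMA is a hypothetical macro name used for illustration.
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__)
#pragma GCC target("avx2")
#define USING_AVX2_PRAGMA 1
#else
#define USING_AVX2_PRAGMA 0
#endif

// Hypothetical hot loop that the compiler may auto-vectorize when
// optimizations are on. The result is the same on every architecture;
// only the running time can differ.
constexpr std::int64_t sum_of_products(const std::int32_t* a,
                                       const std::int32_t* b,
                                       std::size_t n) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += static_cast<std::int64_t>(a[i]) * b[i];
    }
    return total;
}
```

With a guard like this the solution still compiles on ARM64, but it simply loses the speedup there, so it does not answer the timing question by itself.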
Yes, people will be very angry if you go with anything other than x86.
The biggest issue is not pragmas (some of them can crash your program on older-but-still-in-use x86, let alone arm, and even those who use them blindly are usually aware of this), but that all sorts of __builtin intrinsics that worked predictably on x86 would either crash or become ridiculously slow on arm, and that, perhaps even more importantly, certain things that are left undefined or implementation-defined in the programming language will be different across architectures (e.g., floating-point numbers, endianness, pointers, memory). In addition, the performance of solutions (especially those relying on vectorization) could no longer be scaled using a simple multiplier: one could, in theory, make a change that makes a program 2x faster on x86 but 2x slower on arm and vice versa.
It would be interesting to conduct a proper study, but I'd estimate that among CF C++ solutions (even those that do not explicitly use any x86 features), around 5-10% would either WA, RE or TL after re-running them on arm and scaling their running time.
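To illustrate the implementation-defined part, here is a small sketch that reports a few properties which differ between typical Linux toolchains on the two architectures. As far as I know, plain char is signed on x86_64 Linux and unsigned on ARM64 Linux, and long double is the 80-bit x87 format (64 mantissa bits) on x86_64 but a 128-bit format (113 mantissa bits) on ARM64 Linux. Other platforms may differ. All function names here are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>
#include <string>

// Widens a char holding the bit pattern 0xF0 to int.
// Yields -16 where plain char is signed and 240 where it is unsigned,
// so code that stores negative values in a char behaves differently.
constexpr int widen_high_byte() {
    return static_cast<char>(0xF0);
}

// Checks byte order at run time by looking at the first byte of a
// 32-bit integer with value 1.
bool is_little_endian() {
    const std::uint32_t value = 1;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &value, 1);
    return first_byte == 1;
}

// Builds a one-line summary of the properties above, suitable for
// printing from a test submission on the judge.
std::string describe_platform() {
    std::string out = "plain char is ";
    out += std::numeric_limits<char>::is_signed ? "signed" : "unsigned";
    out += "; long double mantissa bits: ";
    out += std::to_string(std::numeric_limits<long double>::digits);
    out += "; byte order: ";
    out += is_little_endian() ? "little-endian" : "big-endian";
    return out;
}
```

Printing describe_platform() from a practice submission is one cheap way to see what a given judge actually does before the contest starts.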
In general, you want your server hardware to be as close to the participants' hardware as possible, and unless you can give every participant an M1/M2 MacBook, you will have far fewer problems by sticking with x86.
I think it's a choice that organizers have to make at the end of the day. If you want to be like USACO/other OIs and ban pragmas/attributes to avoid "unfair" unexpected speedups, using ARM64 would be a better choice than x86_64 in this respect (though the validity of such arguments is very debatable). If you're going down this path, you should be able to guarantee solvability on both the problem-setting side and the programming-language side.
Anything compiler-specific (__builtin*, as pointed out in the comment above) should be viewed with caution, though with modern C++ you don't need intrinsics in most cases: for instance, C++20 provides standard functions for computing clz, ctz and popcount. You should also look out for compilers/interpreters for other languages (if they're allowed) being buggy or unsupported on ARM64.
The practical advice is to stick with x86 as already pointed out, since not only is it closer to most people's machines, but it also has potentially better language support. If you're going with ARM64, you probably also want some kind of custom invocation, so this might end up costing you more than what you'd pay with x86.
I'd prefer promoting clean coding practices and raising awareness in the community that using non-standard/platform-specific hacks in solutions is not good. As nor mentioned, all bit-counting operations are now part of the C++ standard, so it's time to switch to them.