nubskr's blog

By nubskr, 6 months ago, In English

Hi Codeforces, I was curious about how this platform works, so I implemented my own version from scratch. It has all the essentials of an online judge, and I also added features such as collaborative coding and group voice chat with up to 10 users.

Users can scrape problems from Codeforces or just create their own problems.

This is how it looks: image for reference

You can check it out here

Feel free to ask anything. It was all fueled by curiosity :)

Update: I recently pushed another version. It includes a complete refactor of the backend and improves various things, such as the handling of concurrent submissions, and problem parsing has also become much faster with caching. Do check it out! :)

Update 1: I'm planning to add more features to the system, feel free to give any recommendations.

  • Vote: I like it
  • +186
  • Vote: I do not like it

»
6 months ago, # |
  Vote: I like it -10 Vote: I do not like it

So Cool!! One of the best projects I have ever seen!!

»
6 months ago, # |
  Vote: I like it +38 Vote: I do not like it

If you are using the free tier on AWS to host your website, keep in mind that Amazon won't notify you when you cross the free usage limit on EC2 or other instances.
This happened to me once, and I got a bill of $1000 (which is a lot of money in my country).

You have built a nice website btw. Very impressive.

»
6 months ago, # |
  Vote: I like it -75 Vote: I do not like it

looks dumb

»
6 months ago, # |
  Vote: I like it +14 Vote: I do not like it

How do you get the full test case data for the problem? AFAIK, there is no convenient way (yet) to get it.

  • »
    »
    6 months ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    I don't. The scraper only scrapes the data on the problem page (you can check out the scraper here); the main test cases need to be added manually.

    I'm planning to add parsing of the main test cases in the future though (at least the ones which are accessible).

    • »
      »
      »
      7 weeks ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      You can also write a generator that produces some test cases.

      You can get the answers for those test cases by checking them against anyone's submission on CF using scraping, I guess.

      • »
        »
        »
        »
        4 weeks ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I had a similar idea but was daunted by the effort-to-reward ratio of implementing it.

        • »
          »
          »
          »
          »
          4 weeks ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Well, if you decide to make problems in the future, you will need a generator to make the test cases, right? Of course it isn't possible to manually enter up to 1e5 numbers.

          And making a generator is only one day's hard work. After that you can just adjust the constraints, press run, and sit back and relax while all the test cases are made automatically.
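
          A minimal sketch of that workflow, assuming a made-up problem (print the sum of an array) and a trusted reference solution available as a function. The names `generate_case`, `reference_answer` and `build_tests` are hypothetical and not part of the project:

```python
import random


def generate_case(n_max: int, value_max: int, seed: int) -> str:
    """Return one test case: a line with n, then a line with n integers."""
    rng = random.Random(seed)  # seeded, so every case can be regenerated later
    n = rng.randint(1, n_max)
    values = [rng.randint(1, value_max) for _ in range(n)]
    return f"{n}\n{' '.join(map(str, values))}\n"


def reference_answer(case: str) -> str:
    """Stand-in for a trusted solution; the assumed problem is the array sum."""
    lines = case.splitlines()
    return str(sum(map(int, lines[1].split()))) + "\n"


def build_tests(count: int, n_max: int, value_max: int) -> list[tuple[str, str]]:
    """Pair each generated input with the reference solution's output."""
    cases = [generate_case(n_max, value_max, seed) for seed in range(count)]
    return [(case, reference_answer(case)) for case in cases]
```

          Adjusting the constraints then means changing the arguments, for example `build_tests(20, 10**5, 10**9)`. Corner cases (minimum sizes, all-equal values and so on) usually still need hand-written cases on top of the random ones.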

          • »
            »
            »
            »
            »
            »
            3 weeks ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            There seem to be too many corner cases. Would you like to contribute to it?

»
6 months ago, # |
  Vote: I like it +17 Vote: I do not like it

What if someone submits malicious code to corrupt the backend?

  • »
    »
    6 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Apologies for the late reply, I was caught up in something. Every compilation request triggers the creation of a Docker container running Alpine Linux (one of the most lightweight distros out there, with the Docker container requiring no more than 8 MB). Each container has nothing in it but the compiler and the code to be executed: it compiles, sends out the output, and gets destroyed, all within about 2 seconds. So even if someone sends in code that tries to modify the backend files, they can't, because the container does not have access to them.

    Even if they try to do the C++ equivalent of "sudo rm -rf /*", it won't matter, because each container is created fresh on demand and has nothing to do with the last thing it compiled. You should read more about Docker.

    tl;dr: The containers are isolated runtime environments that only have access to the submitted code. They can't even access the internet, so one can cause any mayhem they want in there and it won't matter.
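
    A rough sketch of what launching such a throwaway container could look like, assuming the backend shells out to the `docker` CLI. The image name, mount path and resource limits are made-up values, and the project may well use the Docker API directly instead:

```python
def sandbox_argv(image: str, workdir: str, command: list[str]) -> list[str]:
    """Build a `docker run` command line for one disposable container."""
    return [
        "docker", "run",
        "--rm",                 # remove the container as soon as it exits
        "--network", "none",    # no network access from inside the container
        "--memory", "256m",     # hypothetical memory cap
        "--pids-limit", "64",   # hypothetical process cap (limits fork bombs)
        "--volume", f"{workdir}:/sandbox:ro",  # only this submission, read-only
        image,
        *command,
    ]
```

    For example, `sandbox_argv("judge-image", "/srv/submissions/42", ["sh", "-c", "g++ /sandbox/main.cpp -o /tmp/main && /tmp/main"])` returns a list that can be passed to `subprocess.run`. Nothing from the backend's own file system is mounted, which is the property described above.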

    • »
      »
      »
      6 months ago, # ^ |
        Vote: I like it +1 Vote: I do not like it

      While Docker containers are definitely far more secure than they used to be a few years ago, I still wouldn't consider it a replacement for a proper sandbox when executing untrusted code.

      • »
        »
        »
        »
        6 months ago, # ^ |
        Rev. 2   Vote: I like it +9 Vote: I do not like it

        Upon more research, it seems that kernel-level exploits might still be an issue, since the kernel is shared with the main backend. At the same time, my risk profile isn't that high, and I think keeping the systems updated should protect against most well-known vulnerabilities; the kernel exploits seem more like a black swan scenario.

        I'll still consider adding complete virtualization though at some point in the future.

    • »
      »
      »
      5 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Can you provide more resources to learn about Docker?

  • »
    »
    7 weeks ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Maybe we can identify those words and not run the code in the first place.

    • »
      »
      »
      5 weeks ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      This won't be reliable, as languages keep changing and future versions of a language might bring new ways to do something, so this would just be like putting band-aids on a water leak.
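
      A small illustration of why keyword matching is fragile even without new language features. The blacklist and the `passes_filter` helper are hypothetical; the C++ snippet uses ordinary preprocessor token pasting:

```python
BANNED = ("system(", "fork(", "remove(")  # hypothetical blacklist


def passes_filter(source: str) -> bool:
    """Naive check: accept the source if no banned substring appears in it."""
    return not any(word in source for word in BANNED)


# The preprocessor pastes `sys` and `tem` together at compile time, so the
# banned substring never appears in the text that the filter inspects.
OBFUSCATED = """
#include <cstdlib>
#define GLUE(a, b) a##b
int main() { return GLUE(sys, tem)("echo hi"); }
"""

PLAIN = '#include <cstdlib>\nint main() { return system("echo hi"); }\n'
```

      Here `passes_filter(PLAIN)` is `False` but `passes_filter(OBFUSCATED)` is `True`, even though both programs make the same call.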

»
6 months ago, # |
  Vote: I like it 0 Vote: I do not like it

How do you handle the security aspects?

  • »
    »
    6 months ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    Hey, you can check my reply to the comment above if you're wondering about malicious code injection attacks. Here are some other things:

    As the architecture in the README shows, each compilation request to the backend passes through a rate limiter: each IP address is limited to 100 submissions per 15 minutes. And like I said in the previous reply, the submitted code never touches the actual backend, only a Docker container.
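
    A minimal in-memory sketch of such a limit, assuming a sliding window per IP address. Whether the project uses a sliding or fixed window, and where it stores the counters, is not stated here; the `RateLimiter` class is hypothetical:

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit: int = 100, window: float = 15 * 60, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behaviour can be tested
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str) -> bool:
        now = self.clock()
        recent = self.hits[ip]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

    A request handler would call `limiter.allow(client_ip)` and reject the submission when it returns `False`.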

    Each submission also has a time limit of 2 seconds. If it doesn't finish executing within that time, the container is terminated and a TLE status is sent, so that it doesn't keep running indefinitely.
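
    A sketch of the time limit idea using only Python's standard library, assuming the submission is started as a child process. The `run_with_limit` helper and the status strings other than TLE are made up for illustration:

```python
import subprocess


def run_with_limit(argv: list[str], stdin_text: str, limit_s: float = 2.0) -> tuple[str, str]:
    """Run argv and return (status, stdout); status is "TLE", "RE" or "OK"."""
    try:
        done = subprocess.run(
            argv, input=stdin_text, capture_output=True, text=True, timeout=limit_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return "TLE", ""
    if done.returncode != 0:
        return "RE", done.stdout
    return "OK", done.stdout
```

    One caveat: this only kills the direct child. If the child is a `docker run` client, the container itself may need to be stopped separately.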

    Basically I'm running this script inside the docker container to evaluate submissions.

    Feel free to ask anything :)

    • »
      »
      »
      6 months ago, # ^ |
        Vote: I like it +22 Vote: I do not like it
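      #include <fstream>
      #include <cstdlib>
      
      int main () {
        // Write an "Accepted" verdict into the file the judge reads.
        std::ofstream fout ("verdict.txt");
        fout << "Accepted" << '\n';
        fout.close();
      
        // Mark the file read-only.
        std::system("chmod 444 verdict.txt");
      }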
      

      :))

      • »
        »
        »
        »
        6 months ago, # ^ |
        Rev. 3   Vote: I like it +1 Vote: I do not like it

        :(( On second thought, someone can also modify the expected output to match their own output. I'll patch this soon.

        • »
          »
          »
          »
          »
          6 months ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          He's talking about remote code execution: if you haven't isolated runtime access, the code will be able to run arbitrary functions in the OS, maybe exploiting your servers for free data storage, for instance.

          • »
            »
            »
            »
            »
            »
            6 months ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            No, he's not. Have you read any of my replies to the comments above?

      • »
        »
        »
        »
        5 months ago, # ^ |
          Vote: I like it +8 Vote: I do not like it

        Hey there, I recently shipped a new version of the submissions system here. Can you please try to hack this one as well? I tried to cover all the sneaky ways and implement proper sandboxing.

        Some highlights:

        • All code is executed by a separate non-root user who can't touch either the verdict or the expected output
        • I made sure that injected code can't get to root (check the code for how I did it ;))
        • Since the user can't get to root, they can't see the verdict or expected_output. They can do anything they want with the output though, and I tried to handle some sneaky aspects.

        :))

        • »
          »
          »
          »
          »
          5 months ago, # ^ |
          Rev. 2   Vote: I like it 0 Vote: I do not like it

          I think now you need someone more experienced to break into this system. :)

          Only one thing comes to mind: compiling probably isn't really that safe. At the very least you should put a timeout on it (since you can write pretty much arbitrarily complex preprocessor macros), but I'm not sure; maybe it is possible to do worse things with it (especially since you are compiling as root).

          If you want to look into sandboxing even more, here is a repo on how IOI does it. There's also a bunch of papers linked in the README about other considerations they took into account.

        • »
          »
          »
          »
          »
          5 months ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

           What does this thing do?

          • »
            »
            »
            »
            »
            »
            5 months ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            Hey there, it reads a random string of length 69 from urandom.

            The script then sets this as the password for root, so nobody can get into the root userspace.
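
            For illustration only, the same idea in Python: draw 69 characters from the operating system's secure random source. The `random_password` helper and the alphanumeric alphabet are assumptions; the actual script may do this differently, and setting the password is left out:

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # assumed character set


def random_password(length: int = 69) -> str:
    """Return `length` characters chosen with the OS's secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

            With 62 possible characters in each of 69 positions, guessing the result is not a practical attack.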

    • »
      »
      »
      6 months ago, # ^ |
        Vote: I like it +7 Vote: I do not like it

      By the way, if you are using diff to compare outputs, at least use diff --ignore-trailing-space --strip-trailing-cr.

      (That is, if you want to keep things simple and not introduce custom checkers.)
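
      A comparison in the same spirit, written as a single linear pass in Python. This is a sketch, not a drop-in replacement for diff: the `normalize` and `outputs_match` helpers are hypothetical, and ignoring blank lines at the very end is an extra assumption beyond what the two flags do:

```python
def normalize(text: str) -> list[str]:
    """Split into lines, dropping trailing whitespace and carriage returns."""
    lines = [line.rstrip() for line in text.split("\n")]
    while lines and lines[-1] == "":
        lines.pop()  # assumption: blank lines at the very end are ignored
    return lines


def outputs_match(actual: str, expected: str) -> bool:
    """Compare two outputs line by line after normalization."""
    return normalize(actual) == normalize(expected)
```

      For example, `outputs_match("1 2 3 \r\n4\r\n", "1 2 3\n4\n")` is `True`, while a difference inside a line, such as `"1 2"` versus `"1  2"`, still counts as a mismatch.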

      • »
        »
        »
        »
        5 months ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Also --brief in order not to make big files for no reason.

        But it's best not to use diff anyway, because of its quadratic complexity.

»
6 months ago, # |
  Vote: I like it -58 Vote: I do not like it

nobody asked

»
6 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Which tool did you use to create that high-level system design?

»
6 months ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

Nice one. I was also trying to build something like this. While doing so, I heard about Linux Containers. They are lightweight and faster to boot than Docker, but harder to implement. You can refer to https://youtu.be/SD4KgwdjmdI?si=DVDdtuBRFFh5z0YX.

Correct me if I am wrong. Also, please share the resources you used as references.

  • »
    »
    5 months ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    Hey, I checked it out, and it seems to work in a similar way to mine, but better.

    For starters, it also uses Docker as a base and then runs a custom script inside the container to execute the code. Mine and the one used there are similar in various ways (the latter being a lot more mature), but mine also determines the verdict inside the container, which was a stupid idea and easily exploitable, as someone mentioned above.

    One fix that I'm planning to add is to make the container just spit out the program output and then compare that with the expected output outside the container. That way the container won't need access to the verdict and expected output files, making it essentially bulletproof against those attacks.
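
    A sketch of that planned split, assuming the host starts the sandbox as a child process and reads its stdout. The `judge` function and the verdict strings are hypothetical, and time limits are left out for brevity:

```python
import subprocess


def judge(sandbox_argv: list[str], stdin_text: str, expected: str) -> str:
    """Run the sandboxed program and decide the verdict on the host side."""
    done = subprocess.run(sandbox_argv, input=stdin_text, capture_output=True, text=True)
    if done.returncode != 0:
        return "Runtime Error"
    # `expected` only ever lives in this process; the child receives just stdin.
    if done.stdout.split() == expected.split():
        return "Accepted"
    return "Wrong Answer"
```

    Because the submission can only influence its own stdout and exit code, writing files inside the container no longer affects the verdict. The comparison here is token-based, which is one possible choice among several.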

»
6 months ago, # |
  Vote: I like it 0 Vote: I do not like it

You could host it on vercel if you want free hosting.

  • »
    »
    6 months ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    Hi, the backend uses the Docker API, Redis, and a file system, which can't be hosted on just any random host; this needs a proper VM.

»
5 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Damn bro nice one. BTW which year are u in?

»
5 months ago, # |
  Vote: I like it +13 Vote: I do not like it

This is probably more reliable and secure than actual codeforces :icant:

»
5 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Looks cool

»
5 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Ok

»
2 months ago, # |
  Vote: I like it 0 Vote: I do not like it

When are you adding support for other languages? I would like to contribute if possible.

  • »
    »
    7 weeks ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Hi, apologies for the late reply. At this point the only thing needed to add a language is its compiler, so do let me know if you have any lightweight Docker images for the language you need.

    Also, you can contribute here :)

»
7 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

For security, you can take a look at Judge0.

»
7 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

Doesn't GitHub already have something called Codespaces?

»
7 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

and i am batman

»
3 weeks ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

Kudos on the completion of the project!!!

I saw that you have been working on this since October of last year, demmmmm!!!!

UPD: My bad, I just saw that you completed it 5 months ago, but this is still very impressive.