Блог пользователя vlchen888

Автор vlchen888, история, 4 года назад, По-английски

Hey all,

We are excited to invite everyone to the Quora Programming Challenge 2021, a free programming competition open to participants from around the world*! The competition will take place from Saturday, February 6th, 2021 to Sunday, February 7th, 2021, in two divisions of 4 hours each, to accommodate different timezones.

Quora is a platform to ask questions, get useful answers, and share what you know with the world. On this contest, we include some problems inspired by real-world challenges that Quora engineers faced in building and growing the product. In addition to competitive programming-style questions, there will also be some machine learning problems. We hope that this contest will be fun for everyone!

The top contestants in each division will be able to win the following prizes:

  • 1st place: $2000
  • 2nd place: $1000
  • 3rd place: $500
  • 4th — 10th: $200
  • Top 50 finishers: Quora Challenge T-Shirt

Our recruiting team will also be reaching out to top participants for their interest in interviewing for roles at Quora. We recently became a remote-first company, so you don’t have to relocate to work with us.

Register for the competition here by 2/5/2021: https://www.quora.com/careers/challenge.

Hope to see you at the contest!

  • Проголосовать: нравится
  • +127
  • Проголосовать: не нравится

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Auto comment: topic has been updated by vlchen888 (previous revision, new revision, compare).

»
4 года назад, # |
  Проголосовать: нравится +9 Проголосовать: не нравится

Deja vu.

»
4 года назад, # |
  Проголосовать: нравится +21 Проголосовать: не нравится

Why is LinkedIn profile a required field?
If it is so much required, why does it accept https://linked.in/none as answer?
Mind that the answer is accurate and truthful to the extent possible, as required in the rules.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +79 Проголосовать: не нравится

    It's not really surprising that the form doesn't validate that the URL is a real profile URL, they did some rough validation and that's enough; do you expect forms to also automatically check that your City is a "real city" against some magical database before allowing your submission? No one ever does that.

    LinkedIn profile is required because, clearly, Quora is running this (for free, with real prize money!) as a tool to recruit people to work there.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      For important stuff, there's a way to validate an address: "We sent you a mail / message / etc. with a code, please input it in the next form to complete the registration process".

      This one looks mandatory (required field in the form) but not enforced (no validation and no specific mention of LinkedIn profile in the rules, only "fill the form accurately and truthfully"). Hence the question, which is more about the company's intent: if it is really required, it can be clearer in the rules or via validation, if not, the required status can be dropped.

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится +11 Проголосовать: не нравится

        Validation only ever really happens during actual account creation; no one ever validates things like that in registration forms.

        I guess the real question is whether they'll disqualify you if you don't have a LinkedIn? I'm not sure if they will, but from the rules about "accurately and truthfully", it sounds like they can, so I'd just make a LinkedIn and leave it mostly empty.

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

The problems in the 2014 version of this challenge has a not-strictly-algorithmic challenge, in particular Sorted Set where you have to parallelize a task, which is not common in CP. Are these types of questions fair game under your description of "competitive programming-style questions"?

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +46 Проголосовать: не нравится

    Hi, thanks for the question! This iteration of the challenge will contain problems that fall into two categories.

    1. “Algorithm” problems: These are competitive programming-style questions that have strictly algorithmic solutions. It is possible to achieve a full score on these problems. Most of the problems will fall into this category.
    2. “Machine Learning” problems: These are problems around machine learning concepts. Scoring is problem-dependent, but roughly would be based on some scaled accuracy metric. We do not guarantee that it is possible to achieve a full score on these problems.
    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +55 Проголосовать: не нравится

      I think many people are familiar with type 1, but for type 2, do you think you could provide a simple example problem ahead of time?

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится +5 Проголосовать: не нравится

        +1. Will any machine learning background be required to solve those problems?

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        Hi Anand, you can take a look at this problem from our 2014 contest for an example of what we’re considering an ML problem: https://www.quora.com/q/quorahaqathon/Quora-Haqathon-Labeler. These problems will not have perfect solutions, and you’ll find knowledge of machine learning useful for solving them. Resources are constrained though, so you won’t be training a SOTA model, and heuristic based approaches may be sufficient or superior for some problems.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +53 Проголосовать: не нравится

      Have you considered splitting the challenge into two contests? Also it's been my experience that it's quite hard to give a meaningful solution to a ML problem in 5 hours :(.

      Will it be clear on which problems you can guarantee a full score?

»
4 года назад, # |
  Проголосовать: нравится -33 Проголосовать: не нравится

Since I don't really like Quora as a platform (I believe in the idea, but not the current implementation), is there any way for me to register for the contest without logging into a Quora account? I try to minimize my social media accounts, especially those based around ads and data monetization (which is all of them nowadays, except Microsoft Circles).

Additionally, what's the approximate value of a Quora Challenge T-Shirt? I'm trying to decide whether it's worth attending based on the expected value.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +8 Проголосовать: не нравится

    I think a shirt like that mainly has value to you, so no one else can answer that question. Unless you mean the monetary value of a shirt which can't be over 15 dollars (and could be a lot less)

»
4 года назад, # |
  Проголосовать: нравится +13 Проголосовать: не нравится

Reminder that registration for this contest is closing soon!

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +20 Проголосовать: не нравится

    Am I right that all problems allow partial scoring, and that wrong attempts don't increase penalty? Will there be a limit on a number of submissions per problem?

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +28 Проголосовать: не нравится

    29 January 2021 — "In the week before the contest, you should receive another email from us with the link to the contest platform and instructions on how to log in. We will use the same link for both divisions of the contest, and you will be able to log in to either one (or both) of the divisions."

    I have not received another email ...

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Ok I have received an email

      "The Quora Programming Challenge 2021 is almost upon us!

      Here is the login information that you will need to access the contest (please save this information):

      Contest URL: challenge.quora.com ..."

»
4 года назад, # |
Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

Missed registration by some time, thought I will be able to register by 5 Feb IST time :(

»
4 года назад, # |
Rev. 3   Проголосовать: нравится +6 Проголосовать: не нравится
From https://challenge.quora.com/rules

I find this interesting for discussion :)

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +22 Проголосовать: не нравится

    Doubling the time limit is understandable but not needed; doubling the memory limit is just weird.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      I think they were right to allow double time limit (for java at least, no idea about python).

      I personally solved the datacenter problem from division 2 first in java, got TLE, implemented same approach (almost same code) which got accepted.

»
4 года назад, # |
Rev. 3   Проголосовать: нравится +6 Проголосовать: не нравится

Any Java Coders out here, I keep getting the error, "Execution failed because the return code was nonzero". I named my file as Example.java and removed the package and also removed the public keyword from my class. But this continues to show up. I tried submitting a solution using CPP and It got accepted. (BTW this is for the practice session.)

EDIT : (I managed to solve it, my file name had to be "example.java" and not "Example.java")

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    There isn't a custom invocation like what Codeforces have that can show us what is the error code :/

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Actually, I missed the practice session. Can you tell me that what should be the name of class for Java solution ? Or can we name it anything ?

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      What I think is, class and file name will be the same as question name.

      But the thing is, they have not mentioned this anywhere and if you name it something else then a weird error will come

      • »
        »
        »
        »
        4 года назад, # ^ |
        Rev. 3   Проголосовать: нравится +4 Проголосовать: не нравится

        "Since the grading system renames files to match the task name, you will need to use that same task name for the public class name in Java solutions. In the practice problem, this would be "example", since the grading system renames the file to be example.java" . They've sent something like this as an announcement. So, according to it name your file as the "questionname".java and make sure there are no capitals.

»
4 года назад, # |
  Проголосовать: нравится +33 Проголосовать: не нравится

Will standings page be available during the contest? I don't see one in practice session.

»
4 года назад, # |
  Проголосовать: нравится +29 Проголосовать: не нравится

"A Participant choosing to participate in both divisions could win prizes in both divisions. Limit one (1) prize per division per Participant.".

So, is it possible that best 50 in both divisions is exactly the same?

»
4 года назад, # |
  Проголосовать: нравится +2 Проголосовать: не нравится

FYI, here are the Installed Python packages quoted from the official website:

  • Keras 2.4.3
  • numpy 1.19.5
  • Keras-Preprocessing 1.1.2
  • lightgbm 3.1.1
  • pandas 1.1.5
  • scikit-learn 0.24.1
  • scipy 1.5.4
  • tensorboard 2.4.1
  • tensorboard-plugin-wit 1.8.0
  • tensorflow 2.4.1
  • tensorflow-estimator 2.4.0
  • torch 1.7.1
  • xgboost 1.3.3
»
4 года назад, # |
  Проголосовать: нравится -8 Проголосовать: не нравится

Is there any possibility of late registration for either division ?

»
4 года назад, # |
  Проголосовать: нравится +92 Проголосовать: не нравится

anton would've surely rejected atleast 6/7 problems. :)

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +18 Проголосовать: не нравится

    seeing same type of comments nowadays It feels that anton have redefined competitive programming.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +56 Проголосовать: не нравится

      I think this problemset would have been rejected by 2015 coordinators as well.

      • »
        »
        »
        »
        4 года назад, # ^ |
        Rev. 4   Проголосовать: нравится -18 Проголосовать: не нравится

        why because they were too difficult ? I usually observe that people like kind of easy problem set and dislike difficult one.

        Also notice that format was completely different . It was 4 hrs instead of 2 hrs.

        -is-this-fft- i don't know why i can't reply two times in 10 minutes , So you want to say because problems were kind of popular or well known and thus they were boring ? From "standard" i usually feel well known problems which people solve while learning some topic.

        Was only me who found problemset was only for > master people ?

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится +32 Проголосовать: не нравится

          No, because of problem style. Most of the problems were very standard, even a few years ago. I basically quit the contest out of boredom.

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится +15 Проголосовать: не нравится

          -is-this-fft- i don't know why i can't reply two times in 10 minutes ,

          Probably because of unrated and low contribution.

          So you want to say because problems were kind of popular or well known and thus they were boring ? From "standard" i usually feel well known problems which people solve while learning some topic.

          Yes, that. For example there was a basic "learning König's theorem" problem and a pretty basic "learning HLD" problem. Note that using these techniques to set problems is fine, but not if knowing the thing is basically the whole problem. I also think that it is fine if there is an "educational" problem from time to time, but if all problems are like that then the problemset is bad.

          Was only me who found problemset was only for > master people ?

          Well, not master+, more like expert+. But you're right that this contest wasn't very green-friendly.

          • »
            »
            »
            »
            »
            »
            4 года назад, # ^ |
              Проголосовать: нравится +56 Проголосовать: не нравится

            There was no need for HLD, binary climb was more than enough

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

In problem C(Problem ID: students), won't the graph always be bipartite? (Solving with this assumption fetched me 21 points though) or was it an implementation mistake?

21 Pointer CODE
  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +19 Проголосовать: не нравится

    It is bipartite but your complexity is wrong.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Yeah I did the same. But I dont know anything about Maxmimum Independent set. So I just copied some template from github.

    I am really curious about the expected solution.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +3 Проголосовать: не нравится

    Yes, the graph is bipartite. But I think you are assuming that you can always get a maximum independent set in a bipartite graph by taking all the nodes in one of it sides (same color). That's clearly not true.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Yes, you're absolutely right, so is finding maximum independent set of a graph some standard problem?

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Can you please tell me what is wrong with my code. I also used bipartite graph and took max of two colors for every connected component. https://ideone.com/cKVuu2

      • »
        »
        »
        »
        4 года назад, # ^ |
        Rev. 2   Проголосовать: нравится +1 Проголосовать: не нравится

        Imagine bipartie graph

        1 1
        2 1
        3 1
        3 2
        3 3
        

        Each half has 3 vertices, so your solution will output 3. But if we take 1, 2 from left side and 2, 3 from right side we can achieve 4

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится 0 Проголосовать: не нравится

          I'm sorry but didn't understand what you mean by 1 2 and 3 in that array. Basically I did a bfs on each connected components and assigned 0 or 1 color to each node and took the maximum of two colors for each connected components. Can you please tell me what is wrong with my methodology here.

          • »
            »
            »
            »
            »
            »
            4 года назад, # ^ |
              Проголосовать: нравится +1 Проголосовать: не нравится

            Sorry, it was badly formatted. Hope it is understandable now

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +3 Проголосовать: не нравится

    I think you have to find the maximum independent set instead of choosing only one color.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Maximum independent set = total vertexes — edges in maximum matching

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Do you mean maximum bipartite matching? If yes can you share some insight of how you got this formula(sorry if it's something very standard)

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится +10 Проголосовать: не нравится

        From König's theorem we know that number of vertexes in minimal cover equals to number of edges in maximum matching. So total number of vertex is sum of vertex in minimum cover and maximal independent set so we have this equation.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      How to find maximum matching for this graph? My Dinic only gets sub 2.

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится +24 Проголосовать: не нравится

        If u use dinic on vertex only if it appears in the input, which is $$$4\times 10^5$$$, then it would be fast enough. Since it works in O(sqrt(V) * E) in bipartite graph.

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Is the wall problem (forgot the exact name) a connected component dp problem? Can't find a valid transition during contest.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +10 Проголосовать: не нравится

    Yea it is, you just need to maintain the current type of connected components in each column and also which squares in the column are "active".

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +8 Проголосовать: не нравится

      Any chance you could send me your solution? I'm not sure why mine doesn't pass, and I'd like to stress test it to find a countercase.

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится +10 Проголосовать: не нравится
      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        I coded the bruteforce solution during the contest to debug my full solution. I felt like it was worth it since it is difficult to debug otherwise, and at least I can gain 14 pts from the first subtask in case I did not manage to debug it until the end of the contest.

        Even the O(2^N) solution requires a MST. :|

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
          Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

          I also wrote a brute force in contest, both to submit for 14 points and to use for stress testing. Unfortunately, I was unable to find a countercase using that solution, so I requested other competitors' solutions so that I could try to generate a countercase with a higher N. Unfortunately, even that didn't work out, so apparently there's a relatively obscure bug somewhere in my solution.

          Did your brute force give you the countercase you were looking for, and if so, would you mind sharing the case? It sounds like a fair number of others failed on the same test case I did (TC7 from subtask one), so there's a decent chance countercases that worked for others will also break my solution.

          • »
            »
            »
            »
            »
            »
            4 года назад, # ^ |
            Rev. 3   Проголосовать: нравится +18 Проголосовать: не нравится

            Did your brute force give you the countercase you were looking for, and if so, would you mind sharing the case?

            Fortunately, yes. This is the countercase

            Спойлер

            It should return 111 where my buggy full solution code returns 115. It does not pass the first hidden testcase though. The issue was related to the first and the third row squares being connected while the second is not.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Thanks man

»
4 года назад, # |
  Проголосовать: нравится +8 Проголосовать: не нравится

How can people apart from top-50 check their ranks?

»
4 года назад, # |
  Проголосовать: нравится +10 Проголосовать: не нравится

Did anyone get an issue on the wall task where on test case 7 the code would throw a runtime error? I know others who had the same problem

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    I didn't end up with an RTE, but I got WA7 on the first subtask and couldn't finish debugging before the end of the round. I'm also looking for a countercase; I've stress tested my solution on thousands of cases against a slow solution I wrote and the correct solutions I've just been sent and can't find a case that I WA. If anyone happens to have gotten WA7/RTE7, found a countercase, and successfully debugged their solution, that case would be very much appreciated!

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Can you show your code?

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        Sure, here's a link: https://pastebin.com/pRjgYdNH.

        Essentially, I use a state that, after all vertical walls prior to a given column and all horizontal walls in that column have been processed, indicates the connectivity structure of the current column and which connected components contain workers. There are then 22 states, though I track them as values from 0 to 47 to make keeping track of the underlying values easier--essentially, for a given state S, column 0 is in connected component 0, S % 2 is the connected component containing column 1, (S / 2) % 3 is the connected component containing column 2, and S / 6 is a bitmask representing which components contain workers, and 32 possible transitions at each column (as there are five walls that could be removed in each transition). I precalculate all possible transitions, after which my solution is a fairly straightforward DP approach.

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится +13 Проголосовать: не нравится

          I ran a stress-test against your solution compiled with sanitizers, and what I found is that your code fails with RE when $$$n$$$ is $$$1$$$. The problem is that in line 150 you allocate an array of size $$$N-1$$$, and it seems to be not legal (maybe it is undefined behaviour, I am not sure): I get an error "variable length array evaluates to non-positive value 0".

»
4 года назад, # |
  Проголосовать: нравится +7 Проголосовать: не нравится

Will tasks be available to submit somewhere later?

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

what was test case 7 of problem escape. it was getting stuck.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    One thing that I missed in first attempt and got WA at test 7 was that guards will also be stopped by the walls. So you can't just check manhattan distance of a cell from a guard and determine its feasibility.

»
4 года назад, # |
  Проголосовать: нравится +24 Проголосовать: не нравится

How to solve flipped? My 93 points code: CODE

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +24 Проголосовать: не нравится

    I got AC by assuming each column originally followed a normal distribution with its post-flip mean and standard deviation and picking the rows where the product of the normal PMF corresponding to each value assuming the row is not flipped divided by the product of the normal PMF corresponding to each value assuming the row is flipped is minimized. Intuitively, I assumed that the flips didn't substantially change the overall distribution of each column and then picked the rows where flipping most drastically increases the probability that the values in the row appear in their corresponding distributions.

    Here's my code (excluding my template): https://pastebin.com/YtYNwPh5

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +24 Проголосовать: не нравится

    I computed the mean and standard deviation of each column. Then, for each row $$$r$$$, let $$$f(r)$$$ denote the sum of squared distance from 0.5 of all the value of the cumulative normal distribition on each value in the row (drawn from the estimated normal distribution for each column). Do the same for reversed row $$$r'$$$ and finally sort all rows by the value $$$f(r')-f(r)$$$ and output the first $$$b$$$ rows.

»
4 года назад, # |
Rev. 2   Проголосовать: нравится +37 Проголосовать: не нравится

Seems like the problemset is not (yet) publicly available. I uploaded the problemset to my server for public consumption (so those who missed the registration can also follow the discussion).

»
4 года назад, # |
  Проголосовать: нравится +7 Проголосовать: не нравится

Does anyone know the ML techniques that should be used for the last 2 problems?

Also, I was getting RTE for 3 cases in problem Flipped with Cpp. I switched to Java and I was able to pass those cases. I suspect that I had to use the extra memory limit.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +13 Проголосовать: не нравится

    For the second last problem, we want to select the reversed rows. A cell is misplaced if its value deviates from the mean of the column. We do want to normalise the deviation by dividing with the standard deviation of the column. The rows with many very misplaced cells are likely reversed.

    We can reverse the b worst rows, and the result was already quite good (around 70?).

    The issue with reversing many rows at once is that we might reverse the wrong row. One way is to reverse one row at a time, and recalculate the mean and standard deviation. But this is too slow.

    The middle ground is to reverse a portion of the b rows, in maybe 16 portions. I scored 99 points (even though it displayed 100 on the console).

    The "ML technique" here is statistical knowledge of mean and standard deviation.

    Code: https://github.com/tonghuikang/codecomp/blob/quora-2021-1/template/f.py

    For the last problem, I used LightGBM (which is one the packages Quora allows, and is faster than XGBoost).

    The challenge is feature engineering. Each data should have the same set of features, but what we see in the data is that each user can visit multiple sites. What I did is to cycle through the visits — each user will revisit the first site again. For the time, I calculated the time difference, and cycle. Each data will have 200+200 features.

    I was prepared to handle more feature engineering and fine-tune the threshold, but I got full marks after a few tries.

    The "ML technique" feature engineering, and trust the black box GBDT model and pray.

    Code: https://github.com/tonghuikang/codecomp/blob/quora-2021-1/template/spam.ipynb

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +3 Проголосовать: не нравится

      For the last problem, I think trying to do feature engineering is unnecessary. They tell you the exact model used to generate the data (a Markov chain for each user type), and there are pretty much no hidden variables, so you can directly estimate the parameters of the Markov chain using the input data, and then directly compute the Bayesian probabilities of the user types given the test inputs.

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

For problem Walls, how can we use the size of the grid to our advantage, or is there any algorithm to find the minimum spanning tree of a subgraph?

»
4 года назад, # |
  Проголосовать: нравится +55 Проголосовать: не нравится

Will there be editorial?

»
4 года назад, # |
  Проголосовать: нравится +3 Проголосовать: не нравится

Sorry for Asking, but can anyone help me with problem B " Escape ". I could not find the idea how to mark all the cell visible to atleast 1 guard any help.

  • »
    »
    4 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +4 Проголосовать: не нравится

    For each cell $$$c$$$ that a guard with radius $$$r$$$ is on, set $$$dist[c] = -(r+1)$$$. Now do a multisource shortest path using a priority queue starting from all these cells. For any cell, if the shortest distance is found to be less than 0 at the end, then it is covered by atleast one guard.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Do you mean we should use multisource BFS with priority queue to prioritize popping the cells with lower $$$dist[c]$$$ first? Because when using a normal queue it won't be correct I think, for instance on this example.

      5 10 2
      ##########
      #S.......#
      #........#
      #E.......#
      ##########
      3 4 1
      3 9 10
      
    • »
      »
      »
      4 года назад, # ^ |
      Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

      shiven But what if I come across a cell that is already marked(monitored by another guard), now should I add this cell in the queue ? If yes then time complexity will be quadratic, if No, then what if a cell adjacent to that cell is not monitored by the previous guards, but is monitored by the current guard ?

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        You'll be using a priority queue. It's almost dijkstra.

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится 0 Проголосовать: не нравится

          ok, but what should we do if we come across a cell that is already guarded by someone else, should we add it to priority queue ? If No then what if a cell adjacent to that cell is not monitored by the previous guards, but is monitored by the current guard ?

»
4 года назад, # |
  Проголосовать: нравится +1 Проголосовать: не нравится

How to solve problem "Escape". My approach was to first block all the cells which are at the distance <= d from guard. Then I find the length of the shortest path using bfs. It runs on subtask 1 only. Code

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +1 Проголосовать: не нравится

    Hey, I used multisource BFS, pushed the coordinate of every guard in the queue with their distance, and mark the cell visited if a guard can reach the cell. Now, if our source is visited means at least one guard can reach our initial position and we can't go any further, so "IMPOSSIBLE". If our destination is visited means we can't go to destination, so "IMPOSSIBLE" and if all the neighbors of destination are visited then there is no chance to go to destination cell so the answer is "IMPOSSIBLE". If none of the conditions is satisfied above then apply simple BFS from source to destination. If we can reach the destination cell then print the distance else "IMPOSSIBLE".

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Division 1 recently concluded! Congratulations to all the winners and really happy to see all of the interesting discussion here. We hope you enjoyed the Division 1 problems and hope to see you at Division 2, which is starting in less than 6 hours: Feb 7 2021, 01:00 to 05:00 (UTC + 00:00)

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Can someone provide me the final ranklist link?

»
4 года назад, # |
  Проголосовать: нравится -15 Проголосовать: не нравится

How tf ML problems are ML?

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Does anyone happen to have the sample I/O for Spam downloaded, and if so, would you mind posting it publicly? I wrote a solution to the problem after the contest ended, and I'm curious about how it would perform on a larger test case.

»
4 года назад, # |
  Проголосовать: нравится +3 Проголосовать: не нравится

Any link for tutorial?

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Can anybody tell the efficient solution for escape problem for full points ? None of the comments in this blog mention the efficient solution .

I think Multi-source BFS would give TLE as well , because you will visit a cell say (x,y) which is being gaurd by multiple enemies , so you will have to mark it multiple times which will waste time , and you cant ignore to mark it multiple times as well as each enemy may have different 'd' .

Please someone throw light on the efficient solution for escape ?

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    You only need to keep track of the enemy with the longest remaining range if starting from that cell. That way you only need to propagate from that cell once.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Did you use a priority queue to maintain the "enemy with longest remaining range"?

»
4 года назад, # |
  Проголосовать: нравится +10 Проголосовать: не нравится

Very surprised that https://en.wikipedia.org/wiki/Strong_connectivity_augmentation is not given as a CP problem before, or maybe my Google-fu skill is bad.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Same, I found a solution on Google, but I could only get the first subtask because I had some weird bug causing me to WA on the second subtask.

    Best I could find was this link in Russian, which I tried to piece together with Google Translate :)

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Doesn't this give the solution as well?

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        Yes, but good luck trying to understand / implement this in 45 mins if you haven't seen the idea before.

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        That gives the correct answer, I believe, but getting the construction right is the (much) harder part of the problem. I attempted to implement the approach given in this paper http://www.springer.com/cda/content/document/cda_downloaddocument/9780387235288-c2.pdf but was unsuccessful in doing so (I suspect because I couldn't get the details of the T > S case, which isn't discussed in the paper, right).

        I was not a fan of this problem, given that it's relatively standard and is essentially an implementation (or copying/pasting, if anyone was able to find code online) test. (I also liked malicious less than the other ML problems--it felt like it required much more esoteric knowledge of machine learning, while most of the other ML problems could be solved using general intuition.)

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится +3 Проголосовать: не нравится

          My guess is, it's very hard to give meaningful ML problems within the time frame of 3-4 hour contest that is not just merely pulling something from the library. Also the satisfaction depends on how quickly you get to a good score :)

          I got quite frustrated with malicious; turns out I only scratch the surface idea, kudos to huikang for going into exploring the dataset and figuring out the correct insight!

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Oops, I forgot the crucial line 327 of breaking when you find a matching. My code now AC's on CSES.

      Submission

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        Actually, the paper I was following to implement this code also forgot to put that line in. Though it's definitely my fault for not checking their code first before blindly implementing :P

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится -8 Проголосовать: не нравится

        CSES' testcases are weak. I stresstested my code and it got wrong answer on simple testcase.

»
4 года назад, # |
  Проголосовать: нравится +2 Проголосовать: не нравится

https://cses.fi/problemset/task/1685 The exact same problem.. This is a joke :/
The quora one got me wrong answer on TC 6.

»
4 года назад, # |
  Проголосовать: нравится +25 Проголосовать: не нравится

Finally tourist and I have something in common — we are the only ones in the top 100 who completely solve malicious before the last hour.

https://www.quora.com/q/quoraprogrammingchallenge/Division-2-One-Hour-Remaining

Code: https://github.com/tonghuikang/codecomp/blob/quora-2021-2/template/malicious.ipynb

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Where can the list of all Div1 problems be found ?

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

A chance of editorials for the divisions in this contest?

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

By any chance if anyone who has downloaded the problemset for Div2 can share it here?

»
4 года назад, # |
  Проголосовать: нравится +4 Проголосовать: не нравится

Someone pls share the implementation of Escape and Students problem of DIV-1 Thanks in advance:)

  • »
    »
    4 года назад, # ^ |
    Rev. 4   Проголосовать: нравится +2 Проголосовать: не нравится

    Here is my code for Problem Escape (excluding my template):

    AC_CODE

    EDIT: It seems the test-cases for this problems were weak, thanks to quinoa for giving a counter case for the above solution. The following solution seems correct to us now, though confirmation of the same is appreciated.

    REVISED_CODE
    • »
      »
      »
      4 года назад, # ^ |
      Rev. 4   Проголосовать: нравится +2 Проголосовать: не нравится

      Did you get AC with this? I don't think it's correct since you're not prioritizing the nodes with greatest remaining distance when popping them from the queue. For instance it will fail on this testcase. Your code outputs "2" whereas it should say "IMPOSSIBLE".

      5 10 2
      ##########
      #S.......#
      #........#
      #E.......#
      ##########
      3 4 1
      3 9 10
      
      

      Whereas when removing the first guard

      5 10 1
      ##########
      #S.......#
      #........#
      #E.......#
      ##########
      3 9 10
      

      your code gives "IMPOSSIBLE" (which is correct).

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится +2 Проголосовать: не нравится

        True, it seems like test-cases were weak, so how will we remove them prioritizing the distance, using a priority queue. Can you share your implementation?

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится +3 Проголосовать: не нравится

          I could not solve the problem, my solution only worked for the small test case. I think if you change your queue by a priority queue then it will work. Because then we will visit every cell when the remaining distance is maximal and then we should never visit it again. Whereas when using a normal queue we might visit a node when this distance is not maximal, and it is incorrect to label it as visited at this point.

          But I'm also looking for some confirmation that this is the expected solution..

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится +1 Проголосовать: не нравится
»
4 года назад, # |
  Проголосовать: нравится +3 Проголосовать: не нравится

How do we solve datacenter and skyscraper(div-2) optimally?

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +6 Проголосовать: не нравится

    Datacenter: transform each coordinate (x, y) into (x + y, x — y). The problem is now equivalent to finding the datacenter i that minimizes

    $$$\sum_{j = 1}^n |x_i - x_j| + |y_i - y_j|$$$

    which is much nicer to handle: precompute some prefix / suffix sum and iterate through each possible datacenter as the choice.

    Skyscraper: maintain a segment tree where each node contains the maximum height (and possibly the minimum height, not sure if this is needed). The goal is, for a skyscraper i, to quickly find the minimum and maximum indices L and R such that all skyscrapers within [L; R] have heights no more than that of i. You can adapt the segment tree to quickly find these indices in O(log N) time.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится +6 Проголосовать: не нравится

      I don't get the solution for Data Center. Can you please elaborate why it is equivalent?

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        We need to compute the distance of two points $$$(x_1, y_1)$$$ and $$$(x_2, y_2)$$$ under the original and transformed formula. By suitable translation and reflection, we can shift these two points to $$$(0, 0)$$$ and $$$(x, y)$$$, and furthermore we can assume $$$x \ge y \ge 0$$$ (the formula is similar for all other cases).

        Under the original formula, the distance of two points is $$$\max (x, y) = x$$$.

        Under the transformed formula, the distance of two transformed points $$$(0, 0)$$$ and $$$(x + y, x - y)$$$ are $$$(x + y) + (x - y) = 2x$$$. So the new distance under the transformed formula is just double the old distance.

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится +13 Проголосовать: не нравится

          Another way to think about this is we are rotating the entire coordinate by 45 degrees. So Manhattan distance becomes Chebyshev (chessboard) distance, and vice-versa.

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      How do you adapt segment tree for this problem?

      • »
        »
        »
        »
        4 года назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        Store the maximum for its range in every node.

        Now, to find the smallest $$$i$$$ in the range $$$[l, r]$$$, for which $$$h[i] > x$$$, descend carefully down the segtree.

        Something like:

        if (tree[left_child] > x) return descend(left_child);
        else return descend(right_child);
        

        This way, you will find the required index in $$$O(log N)$$$, per query.

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится 0 Проголосовать: не нравится

          Does it mean we have to decompose [i+1, n] into $$$O(\log n)$$$ intervals and find the first one having max > h[i]?

»
4 года назад, # |
  Проголосовать: нравится -6 Проголосовать: не нравится

Every time there is an asterisk after "world", I automatically assume it excludes Iran. What a shame :(

»
4 года назад, # |
Rev. 2   Проголосовать: нравится +10 Проголосовать: не нравится

How to solve div2 message for any decent score? The English prediction problem.

I know that the dataset is generated from actual English phrases, so I've tried different greedy techniques of picking words based on some brewed up weighted score (frequency, position of the missing word, distance to other words of the same phrase, etc.). But I wasn't able to get anywhere 30+

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +10 Проголосовать: не нравится

    We were tasked to predict the missing word in the sentence.

    When I want to predict the missing word b in a triplet a,b,c — I consider for every possible b, the frequency of (a,b), the frequency of (b,c) and the freqency of (a,b,c). The candidate word for b that appears the most frequently will be my prediction.

    It makes sense if we give greater weights to (a,b,c) because it is less likely to appear compared to (a,b) or (b,c). I count the freqency every appearance of (a,b,c) 10 times rather than once that I did for (a,b) and (b,c). This scored around 47/100 points.

    To get 78/100 points, I considered two words before and two words after the missing word. In the final minutes of the contest, I was fine-tuning the weights and submitted the code repeatedly.

    Code: https://github.com/tonghuikang/codecomp/blob/quora-2021-2/template/f.py

    It is also possible to consider up to 30 characters before and after for a slightly higher score at a very little increase in complexity, but my code was not structured to do that.

    For the full score, I think we need to use a neural network model. The general problem here is masked language encoding.

    https://keras.io/examples/nlp/masked_language_modeling/

    There are readily available templates out there, but it needs to download billon-parameter models which is not allowed by the online judge. We need to build (or copy) a downsized model that is suitable for the dataset.

  • »
    »
    4 года назад, # ^ |
    Rev. 2   Проголосовать: нравится +10 Проголосовать: не нравится

    I think I got lucky when hitting high score pretty quickly.

    First I got 70 pts by keeping track of a frequency table of all 2-gram (an ordered pair of word (x, y)), and for each possible candidate w, its score is freq (p, w) * freq (w, r) where p and r are the word before and after w, respectively (if there's no such word p/r or the pair doesn't exist in the frequency table, assign the weight of 1 for that pair).

    Return the word of highest score.

    I got 82 by improving the above idea with including the 3-grams.

»
4 года назад, # |
  Проголосовать: нравится +40 Проголосовать: не нравится

Got my T-shirt (Singapore)

Applied and got an offer, and I have accepted the offer ... Yay

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    congrats !

    Did you get any link to track the T-shirt? (cause no link sent to me)

    • »
      »
      »
      4 года назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      There was no tracking link sent to me too, the package does not indicate UPS or FedEx.