Qingyu's blog

By Qingyu, 4 hours ago, In English

I've checked: today is not April 1st.

(source: 12 Days of OpenAI: Day 12 https://www.youtube.com/watch?v=SKBG1sqdyIU)

  • Vote: I like it
  • +174
  • Vote: I do not like it

»
4 hours ago, # |
  Vote: I like it +29 Vote: I do not like it

Merry Christmas!

»
4 hours ago, # |
  Vote: I like it +31 Vote: I do not like it

thanks for guiding me to become red

»
4 hours ago, # |
  Vote: I like it +10 Vote: I do not like it

Anyone know why o1 is rated 1891 here? From https://openai.com/index/learning-to-reason-with-llms/ o1 preview and o1 are rated 1258 / 1673, respectively.

  • »
    »
    4 hours ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Benq do you think it's the end?

    • »
      »
      »
      4 hours ago, # ^ |
        Vote: I like it -6 Vote: I do not like it

      end for us mortal humans, not for gods...

      • »
        »
        »
        »
        3 hours ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        At this rate, it will be over for these so-called gods soon. It is chess all over again.

  • »
    »
    4 hours ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    1891 was o1-ioi I think

    • »
      »
      »
      4 hours ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      hm, o1-ioi is only 1807 in the link I shared though

      • »
        »
        »
        »
        4 hours ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        it's probably o1 with high-compute like in the pro plan.

  • »
    »
    114 minutes ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Possibly it's "o1 pro mode" or a finetune like o1-ioi or some other o1 model idk at this point because there's so many

»
4 hours ago, # |
  Vote: I like it -7 Vote: I do not like it

in 5 years, there will be no way to pretend that the average human is worth more than a rock

»
4 hours ago, # |
  Vote: I like it +21 Vote: I do not like it

I'll wait until it starts participating in live contests and having Red performance

»
4 hours ago, # |
  Vote: I like it +8 Vote: I do not like it

damn im cooked

»
4 hours ago, # |
  Vote: I like it 0 Vote: I do not like it

Not possible...

»
4 hours ago, # |
Rev. 2   Vote: I like it +5 Vote: I do not like it

I doubt that AI will be able to do better math research than humans 5 years from now.

  • »
    »
    4 hours ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    That's the only thing you're gonna be able to do 5 years later — doubt.

  • »
    »
    72 minutes ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Is this a prediction about humans now vs AIs in 5 years or AI + human in 5 years vs AIs in 5 years?

»
3 hours ago, # |
  Vote: I like it +21 Vote: I do not like it

From the presentation we know that o3 is significantly more expensive. o1-pro currently takes ~3 minutes to answer a single query. Based on the price difference, o3 could be something like 40-100 (or more?) times slower. A CF contest lasts at most 3 hours. How can o3 get to 2700 if it would spend all of that time solving problem A? It would be very interesting to read the paper about o3, and specifically how they measure its performance.
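
A quick back-of-the-envelope sketch of the arithmetic in the comment above, in Python. All numbers are the commenter's estimates (about 3 minutes per o1-pro query, a guessed 40-100x slowdown, a 3-hour contest), not measured figures, and the function names are made up for illustration. It also assumes the extra cost translates one-to-one into sequential wall-clock time, which may not hold if the work is run in parallel.

```python
# Rough estimate only: every constant below is an assumption from the comment,
# not a published figure about o1-pro or o3.
O1_PRO_MINUTES_PER_QUERY = 3   # "~3 minutes" per o1-pro answer
SLOWDOWN_RANGE = (40, 100)     # guessed o3 slowdown relative to o1-pro
CONTEST_MINUTES = 3 * 60       # a contest lasting at most 3 hours


def sequential_minutes(slowdown: int) -> int:
    """Minutes for one answer if cost scaled purely as sequential time."""
    return O1_PRO_MINUTES_PER_QUERY * slowdown


def fits_in_contest(slowdown: int) -> bool:
    """True if a single answer would finish within the contest duration."""
    return sequential_minutes(slowdown) <= CONTEST_MINUTES


if __name__ == "__main__":
    for slowdown in SLOWDOWN_RANGE:
        minutes = sequential_minutes(slowdown)
        verdict = "fits" if fits_in_contest(slowdown) else "does not fit"
        print(f"{slowdown}x slower: {minutes} min per answer, {verdict} in one contest")
```

Under these assumptions a single answer takes between 120 and 300 minutes, so even the optimistic end of the range leaves time for at most one problem in a 3-hour round.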

  • »
    »
    2 hours ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    It must be parallelized. Surely there is something like MCTS involved

»
3 hours ago, # |
  Vote: I like it +19 Vote: I do not like it

I will personally volunteer myself as the first human coder to participate in the inevitable human vs AI competitive programming match.

»
3 hours ago, # |
  Vote: I like it +41 Vote: I do not like it

I'll only believe it once it has been tested in a live contest.

»
3 hours ago, # |
  Vote: I like it 0 Vote: I do not like it

Dude, I feel seriously threatened.

»
3 hours ago, # |
  Vote: I like it +7 Vote: I do not like it

If o3 really has a deep understanding of the core principles of competitive programming, I think it can also become a great problemsetting assistant. Of course it won't be able to make AGC-level problems, but imagine having more frequent solid div.2 contests. That would be great.

»
3 hours ago, # |
  Vote: I like it +16 Vote: I do not like it

Is this real life?

»
3 hours ago, # |
  Vote: I like it 0 Vote: I do not like it

How do these things perform on marathon tasks? Psyho

»
2 hours ago, # |
  Vote: I like it -6 Vote: I do not like it

I don't see why people are paranoid about those insane ratings claimed by OpenAI. I guess they're worried about cheaters, but why? Competitive programming isn't only about Codeforces — it's a whole community. In every school and country, we know each other personally, we see each other solve problems live, and we compete against each other in onsite contests. So we know each other's level. When we see someone who we know isn't a strong competitive programmer suddenly ranking in the top 5 of a Codeforces contest, it doesn't mean much. We just feel sorry for them that they've started cheating. It will be more funny when we see a red coder who can't qualify for ICPC nationals from their university.

  • »
    »
    117 minutes ago, # ^ |
      Vote: I like it +12 Vote: I do not like it

    I think you're not seeing the bigger picture: the implications for competitive programming are huge. 1) We might lose sponsors and sponsored contests, because now contest performance isn't a signal for hiring or even skill? 2) Let's not kid ourselves: a lot of people are here just to grind out CP for a job or CV, and that's totally fine. Now they will be skewing the ratings for literally everyone. 3) From 2 it may follow that the Codeforces Elo system completely breaks and we'll have no rating? The incentive to compete would be completely gone, which will further drive down the size of the active community. There are many more; I bet you could even prompt ChatGPT for them :D

    • »
      »
      »
      86 minutes ago, # ^ |
        Vote: I like it -7 Vote: I do not like it

      we'll have no rating
      And then we will have no cheaters. Happy ending

  • »
    »
    117 minutes ago, # ^ |
      Vote: I like it +9 Vote: I do not like it

    It will be more funny when we see a red coder who can't qualify for ICPC nationals from their university.

    It's not funny, it happens quite often, for example, at our university(

    • »
      »
      »
      83 minutes ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Red was just an example. A more accurate example would be a team of newbies qualifying while a team of reds fails to do so. Don't tell me that's still not funny.

  • »
    »
    106 minutes ago, # ^ |
      Vote: I like it +2 Vote: I do not like it

    I think it has major implications for the whole world, not only competitive programming. For example, the pace of mathematical research could easily double almost overnight (realistically, over a period of about a year).

»
111 minutes ago, # |
  Vote: I like it +3 Vote: I do not like it

According to this article, it does not seem practical for the average user to run?

Quoting, "Granted, the high compute setting was exceedingly expensive — in the order of thousands of dollars per task, according to ARC-AGI co-creator Francois Chollet."

However, this is indeed a large step forward for AI.

»
110 minutes ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

O1: I'm faster than humans
O3: I'm better pal

;(

»
70 minutes ago, # |
  Vote: I like it 0 Vote: I do not like it

Do I still have a chance to reach LGM before AI?

»
66 minutes ago, # |
  Vote: I like it +11 Vote: I do not like it

OpenAI is lying. I bought 1 month of o1 and it is nowhere near a 1900 rating. It is as bad as me. I think they lie on purpose because they are burning a lot of money and they want people to buy their model.

»
62 minutes ago, # |
  Vote: I like it 0 Vote: I do not like it

Day by day I am getting mindfucked with these latest AI updates so much that I might lose my sanity.

»
53 minutes ago, # |
  Vote: I like it +5 Vote: I do not like it

I'm a bit skeptical. o1 is claimed to have a rating around 1800 and I've seen it fail on many div2Bs.

»
14 minutes ago, # |
  Vote: I like it +1 Vote: I do not like it

If I already have lower rating than o1-preview, why should I be concerned?