• TootSweet@lemmy.world
    10 months ago

    if you want to take OpenAI’s own research into account

    No thank you.

    OlympicArena validation set (text-only)

    “Our extensive evaluations reveal that even advanced models like GPT-4o only achieve a 39.97% overall accuracy (28.67% for mathematics and 29.71% for physics)”

    • The OlympicArena analysis that you cited.
    • Sl00k@programming.dev
      10 months ago

      The jump from GPT-4o to o1 (the preview, not the full release) was a 20% cumulative knowledge jump. If that’s not an improvement in accuracy, I’m not sure what is.