• will_a113@lemmy.ml
    English
    arrow-up
    24
    ·
    3 days ago

    The original GPT-4 is just an LLM though, not multimodal, and its training cost is still estimated to be over 10x R1’s, if you believe the numbers. I think where R1 is compared to 4o is in so-called reasoning, where you can see the chain of thought or internal prompt paths that the model uses to (expensively) produce an output.

    • jacksilver@lemmy.world
      arrow-up
      5
      arrow-down
      2
      ·
      edit-2
      3 days ago

      I’m not sure how good a source it is, but Wikipedia says it was multimodal and came out about two years ago - https://en.m.wikipedia.org/wiki/GPT-4.

      That being said, the comparisons are of LLM benchmarks against GPT-4o, so maybe that’s a valid argument for the LLM capabilities.

      However, I think a lot of the more recent models are pursuing architectures with the ability to act on their own, like Claude’s computer use - https://docs.anthropic.com/en/docs/build-with-claude/computer-use, which DeepSeek R1 is not attempting.

      Edit: and I think the real money will be in the more complex models focused on workflow automation.