• pepperfree@sh.itjust.works (OP)
      10 days ago

      Llama 3.3 was good, though. For multimodal, Llama 4 also uses the Llama 3.2 approach, where image and text are handled in a single model instead of using CLIP or SigLIP.