edit to clarify a misconception in the comments: this is an Instagram post, so “caption” refers to the description under the image or video

as an example, this text i am typing now is also a “caption”

just saying because someone started a debate misunderstanding this to be about subtitles (aka “closed captions”) and that’s just not the case 👍

  • thejoker954@lemmy.world
    English · 7 up, 17 down · 24 days ago

    If you use any generic LLM then yes, but there are LLMs that are trained specifically for captioning/transcripts, just not necessarily in real time. (Like I said in another reply, it’s probably not an LLM, but as there is no ‘real’ AI, that’s what I’m calling all this AI bullshit.)

    Doing it “live” is what increases the error rate.

      • thejoker954@lemmy.world
        English · 3 up, 7 down · 24 days ago

        I have to disagree with you. AI is never a more accurate way to describe what we have now, not until they call true AI something different.

        I know it’s a weird hill to die on, but die on it I will. Calling one artificial intelligence and the other virtual intelligence could work.

        Also, it’s my understanding that LLMs are considered a type of neural net, so I don’t see how it’s more accurate to call it a neural net than an LLM.

        And they are all subsets of machine learning, so calling it an ML model leads me back to the same issue I have with “AI” (and the same reason those loser USB fucks can suck a bag of dildos): lack of clarity about what it can actually do.

        • Norah (pup/it/she)@lemmy.blahaj.zone
          English · 5 up, 1 down · 23 days ago

          Then call it ML or a neural net. Using the term LLM like you are for other forms of machine learning is just going to cause needless confusion, like it has in this thread.

          • thejoker954@lemmy.world
            English · 1 up · 23 days ago

            No. “Machine learning” is the root of the tree.

            Or, to steal from another commenter’s attempt to have me call it that: that would be like calling a chihuahua a wolf.

            Machine learning -> neural net -> LLM. That’s the basic “path”. I don’t CARE if LLM is technically wrong when using machine learning or neural net is also inaccurate.

            If anything, y’all should be arguing for me to call it ASR 2.0.

    • RushLana@lemmy.blahaj.zone
      English · 6 up, 1 down · 24 days ago

      I will frame it another way. You cannot automate subtitles or captions. And I always find that reviewing automated output is harder than doing it yourself.