• Curtis "Ovid" Poe (he/him)@fosstodon.org
    3 months ago

    @zogwarg OK, my grammar may have been awkward, but you know what I meant.

    Meanwhile, those of us working with AI and providing real value will continue to do so.

    I wish people would start focusing on the REAL problems with AI and not keep pretending it’s just a Markov Chain on steroids.

    • zogwarg@awful.systems
      3 months ago

      On a less sneerious note, I would draw distinctions between:

      • Being able to extract value from LLM/GenAI
      • LLM/GenAI being able to sustainably produce value (without simple theft, and without cheaper alternatives being available)

      And so far I’ve really not been convinced of the latter.

      • Curtis "Ovid" Poe (he/him)@fosstodon.org
        3 months ago

        @zogwarg

        Consider traditional databases, which let you search for strings. Vector databases let you search by meaning.

        For one client, someone could search for “videos about cats”. With stemming and stop words, that becomes “cat” and the results might be lists of videos about house cats and maybe the unix “cat” command. Tigers, lions, cheetahs? Nope.

        A vector database will return tigers/lions/cheetahs because it “knows” they are cats. A much smarter search. I’ve built that for a client.
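To make the contrast above concrete, here is a minimal sketch, not the system built for the client. The video titles, the three-dimensional "embeddings", the query vector, and the helper names (`stem`, `keyword_search`, `vector_search`) are all hypothetical and invented for illustration; a real setup would get its vectors from an embedding model and store them in a vector database.

```python
import math

# Hypothetical catalogue: title -> toy vector (feline-ness, wild-ness, software-ness).
# These numbers are made up; a real system would use an embedding model.
VIDEOS = {
    "House cat chases laser": (0.9, 0.1, 0.0),
    "Using the unix cat command": (0.1, 0.0, 0.9),
    "Tigers hunting at dusk": (0.8, 0.9, 0.0),
    "Cheetah sprint in slow motion": (0.8, 0.8, 0.0),
    "Sourdough baking basics": (0.0, 0.0, 0.1),
}

STOP_WORDS = {"videos", "about", "the", "a", "in", "at", "of"}


def stem(word):
    """Crude plural stripping, standing in for a real stemmer."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def keyword_search(query):
    """Return titles that share at least one stemmed, non-stop-word term with the query."""
    terms = {stem(w) for w in query.split() if w.lower() not in STOP_WORDS}
    return sorted(
        title for title in VIDEOS
        if terms & {stem(w) for w in title.split()}
    )


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vector, k=3):
    """Return the k titles whose toy vectors are closest to the query vector."""
    ranked = sorted(VIDEOS, key=lambda t: cosine(query_vector, VIDEOS[t]), reverse=True)
    return ranked[:k]


# Assumed vector for the query "videos about cats": strongly feline, slightly wild.
CAT_QUERY_VECTOR = (1.0, 0.3, 0.0)

keyword_hits = keyword_search("videos about cats")
vector_hits = vector_search(CAT_QUERY_VECTOR)
```

With these toy numbers, the keyword search matches only the two titles containing the literal term "cat" (including the unix command), while the vector search ranks the house cat, cheetah, and tiger videos highest. How well that carries over to real data depends entirely on the embedding model.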

        • Curtis "Ovid" Poe (he/him)@fosstodon.org
          3 months ago

          @zogwarg For a traditional database, you can get those “lions/cheetahs/tigers” by manually attaching metadata to all videos. That is slow, error-prone, and expensive. It also only works for the metadata you *think* to assign to videos.

          A good vector database takes a query in natural language and lets you search the “meaning” of unstructured data. You can search a data corpus much faster this way even though it’s largely unstructured data!

          That’s real value, and it’s not expensive.

          • zogwarg@awful.systems
            3 months ago

            I realize it’s probably a toy example, but specifically for “cats” you could achieve similar results by running a thesaurus/synonym-set on your stem words. That has the added benefit that a client could add custom synonyms for more domain-specific stuff that the LLM would probably not know, and would not reliably learn through in-prompt examples or fine-tuning. (Although I’d argue that if I’m looking for cats, I don’t want to also see videos of tigers, or results based on the LLM’s “understanding” of what a cat might be.)
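The synonym-set approach described above could look roughly like the following sketch. The synonym table, the stop-word list, and the function names (`stem`, `expand_query`) are hypothetical; a production setup would more likely use a search engine's built-in synonym support and a proper stemmer.

```python
STOP_WORDS = {"videos", "about", "the", "a", "of"}

# Hypothetical built-in synonym sets, keyed by stem.
BASE_SYNONYMS = {"cat": {"kitten", "feline"}}


def stem(word):
    """Crude plural stripping, standing in for a real stemmer."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def expand_query(query, custom_synonyms=None):
    """Stem the query, drop stop words, and add synonyms for each remaining stem.

    custom_synonyms is an optional client-supplied mapping of term -> extra terms;
    it is merged into a copy of BASE_SYNONYMS so the defaults stay untouched.
    """
    synonyms = {key: set(values) for key, values in BASE_SYNONYMS.items()}
    for key, extra in (custom_synonyms or {}).items():
        synonyms.setdefault(stem(key), set()).update(stem(w) for w in extra)

    terms = set()
    for word in query.split():
        if word.lower() in STOP_WORDS:
            continue
        stemmed = stem(word)
        terms.add(stemmed)
        terms |= synonyms.get(stemmed, set())
    return terms


default_terms = expand_query("videos about cats")
big_cat_terms = expand_query(
    "videos about cats",
    custom_synonyms={"cat": ["tigers", "lions", "cheetahs"]},
)
```

The design point is that the expansion is explicit: a client who wants big cats included opts in by adding those synonyms, and a client who only wants house cats leaves them out.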

            For the labeling of videos itself, the most valuable labels would be added by humans, and/or full-text search on the transcript of the video if applicable, speech-to-text being more in the realm of traditional ML than in the realm of GenAI.

            As a minor quibble, your use case of GenAI is not really “Generative”, which is the main thing it’s being sold as.

            • Curtis "Ovid" Poe (he/him)@fosstodon.org
              3 months ago

              @zogwarg I’ve written up a quick explanation at https://gist.githubusercontent.com/Ovid/17b19faf2fb7e0019e375e97f0a4c8af/raw/196735daa5274ded8f2363a41d78a490e8325f67/vector.txt

              And yes, this is still GenAI. “Gen” doesn’t just mean “generating text”. It also relates to “understanding” (cough) the meaning of your prompt and having a search space where it can match your meaning with the meaning of other things. That’s where it starts to “generate” ideas. For vector databases, instead of generating words based on the meaning, it’s generating links based on the meaning.

              • self@awful.systems
                3 months ago

                fosstodon is the programming dot dev of mastodon and I mean that in every negative way you can imagine

                your posts all give me slimy SEO vibes and you haven’t shown any upward trajectory since claiming that only generative AI lacks a separation between code and data (fucking what? seriously, think on this) so you’re getting trimmed

                • froztbyte@awful.systems
                  3 months ago

                  I just ended up throwing the name into a search engine (one of those boring old actually search engine things; how pedestrian of me)

                  I’m Curtis “Ovid” Poe. I’ve been building software for decades. Today I largely work with generative AI, Perl, Python, and Agile consulting. I regularly speak at conferences and corporate events across Europe and the US.

                  ah.

                • FRACTRANS@awful.systems
                  3 months ago

                  back when I used the wider fediverse more frequently I had fosstodon on mute for a significant amount of time

                  glad to know it’s still Like That