• DandomRude@lemmy.world
    11 hours ago

    I use various AI models and I repeatedly notice that certain information is withheld or misrepresented, even though it is freely available in abundance and is therefore part of the training data.

    I don’t think this is a coincidence, especially since the operators of all cloud LLMs are so business-minded.

    • porcoesphino@mander.xyz
      3 hours ago

      A bunch of this can be expected failure modes for LLMs. Do you have a list of short examples to get an idea?

      • DandomRude@lemmy.world
        2 hours ago

        Yes, it’s clear that some of this may have to do with the fact that even if cloud LLMs have live browsing capabilities, they often still rely on outdated information from their training data. I am simply describing my impressions from somewhat extensive use of cloud LLMs.

        I don’t have a list of examples, but in my comment above I have mentioned two that I find suspicious.

        I simply think that these products should be used with skepticism as a matter of principle. This is simply because none of the companies that offer them are known for ethical behavior - quite the opposite.

        In the case of Google, for example, I don’t think it will be long before (public) advertising opportunities are implemented in Gemini, because Google’s business model is essentially the advertising business. The other cloud LLMs are also products of purely profit-oriented companies - and manipulating public opinion is a multi-billion dollar business that they will certainly not want to miss out on. Social media platforms have demonstrated this in the past, as have Google and others with their “classic” search engines, targeting, and data-selling schemes. Whether this raises ethical issues is likely to be of little concern to these companies, as their only concern is profit.

        The simple fact is that it is completely unclear what logic the providers use to regulate the output. It is equally unclear what criteria are used to select training data (here, too, the output can already be influenced by deliberately omitting certain information).

        What I am getting at is that it can be assumed that all providers are interested in maximizing profits—and it is therefore likely that they will allow themselves to be paid to specifically promote certain topics, products, or even worldviews, or to withhold information that is unwelcome to wealthy interest groups.

        As a regular user of cloud LLMs, I have the impression that this is already happening. I cannot prove this though, as it would require systematic, scientific studies to demonstrate whether and to what extent manipulation occurs. Unfortunately, I do not know whether such studies already exist.

        However, it is a fact that in the past, all technologies that could have been used to serve humanity have been massively abused for profit. I don’t understand why it should be any different with cloud LLMs, which are offered exclusively by some of the world’s largest corporations.

        • porcoesphino@mander.xyz
          1 hour ago

          Yeah, I’m not disagreeing with the probable outcome here. I just think that, at this point in time, it’s more likely that the LLM output is doing its stochastic thing and your human brain is seeing patterns in it. But I was also curious how wrong I was, and that’s part of why I asked for some examples. Not that I could really validate them.

          • DandomRude@lemmy.world
            25 minutes ago

            Yes, that could well be the case. Perhaps I am overly suspicious, but because the potential of LLMs to influence public opinion is so high due to their reach and the way they present information, I think it is highly likely that the companies offering them are already profiting from this, or at least will do so very soon.

            Musk is already demonstrating in his clumsy way that it is easily possible to manipulate the output in a targeted manner if you have full control over the model – and this isn’t the first time he has attracted attention for doing so. You almost have to be grateful to him for it, because it’s so obvious. If you do it more subtly, it’s even more dangerous.

            In any case, the fact is that the more people use LLMs, the more “interpretive authority” will be centralized, because the development and operation of LLMs is so costly that only a few large corporations can afford it – and they want to make money and are unscrupulous in doing so.

            Either way, we will not be able to rely on people’s ability to recognize attempts at manipulation. I think this is already evident from the fact that obvious misinformation on mainstream social media platforms and elsewhere is believed unquestioningly by so many people. Unfortunately, the effects are disastrous: if people were more critical, Trump would never have become US president, for example – certainly not twice.

      • DandomRude@lemmy.world
        8 hours ago

        For example, objective information about Israel’s actions in Gaza. The International Criminal Court issued arrest warrants against leading members of the government a long time ago, and the UN OHCHR classifies the actions of the State of Israel as genocide. However, these facts are by no means presented as clearly as would be appropriate given the importance of these institutions. Instead, when asked whether Israel is committing genocide, one receives vague, meaningless answers. Only when specifically asked whether numerous reputable institutions actually classify Israel’s actions as genocide do most LLMs reveal that much, if not all, evidence points to this being the case. In my opinion, this is a deliberate method of obscuring reality, as the vast majority of users will not or cannot ask questions if they are unaware of the UN OHCHR’s assessment or do not know that arrest warrants have been issued against leading members of the Israeli government on suspicion of war crimes (many other reputable institutions have come to the same conclusion as the UN OHCHR and the International Criminal Court).

        Another example: if you ask whether it is legally permissible to describe Donald Trump as a rapist, you will be told that this is defamation. However, a judge in the Carroll case has explicitly stated that this description applies to Trump – so it is in fact legally permissible to describe him as such. Again, this information is only available upon explicit request, if at all. This also distorts reality for people who are not yet informed. However, since many people initially seek information from LLMs, this leads to them being misinformed because they lack the background knowledge to ask explicit follow-up questions when given misleading answers.

        Given the influence of both Israel and the US president, I cannot help but suspect that there is an intention behind this.

        • markko@lemmy.world
          3 hours ago

          “Given the influence of both Israel and the US president, I cannot help but suspect that there is an intention behind this.”

          Not to mention the large number of Israelis (often former Mossad/intelligence agents) directly involved in US tech companies.