I just don’t get how so many people swear by it. Every time I lower my expectations for what it might be useful at, it proceeds to fail at exactly that the moment I have a use case I think one of the LLMs could actually tackle. Every step of the way. I keep being told the LLMs are amazing, and that I only had a bad experience because I hadn’t used the very specific model and version they love. And every time I try to verify that advice (my workplace is so die-hard about this they pay for access to every popular model and tool), it does roughly the same stuff, ever so slightly shuffling what it gets right and wrong.
I feel gaslit, because it keeps on being uselessly unreliable at every task I could conceivably find it useful for.
I’ve had similar experiences. Try something semi-difficult and it fails, sometimes at least in an entertainingly shit way. Try something simple where I already know the answer? Good chance there’s at least one fundamental issue with the output.
So what are people who use this tech actually getting out of it? Do they just have it regurgitate things from StackOverflow? Do they have a higher tolerance for cleaning up trash? Or do they just not check the output?