  • If there’s one thing that coding LLMs do “well”, it’s exposing the need for code-generation frameworks. All of the enterprise applications I have worked on in modernity were, by volume, mostly boilerplate and glue. If a statistically significant portion of a code base is boilerplate and glue, then the magical statistical machine will mirror that.

    LLMs may simulate filling this need in some cases, but of course they are spitting out statistically mid code.

    Unfortunately, committing engineering effort to write code that generates code in a reliable fashion doesn’t really capture the imagination of money or else we would be doing that instead of feeding GPUs shit and waiting for digital God to spring forth.
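
    A rough sketch of the kind of thing I mean by “code that generates code in a reliable fashion” (the field spec, names, and template here are invented for illustration, not from any real framework): a deterministic template expander whose output you can diff, test, and review like any other artifact.

    ```python
    # Toy deterministic code generator: expand a hypothetical field spec into a
    # dataclass. The same input always produces the same output, and the
    # generator itself is ordinary code you can test and version.
    FIELDS = [("user_id", "int"), ("email", "str"), ("created_at", "datetime")]

    def generate_dataclass(name: str, fields: list[tuple[str, str]]) -> str:
        lines = [
            "from dataclasses import dataclass",
            "from datetime import datetime",
            "",
            "@dataclass",
            f"class {name}:",
        ]
        lines += [f"    {field_name}: {type_name}" for field_name, type_name in fields]
        return "\n".join(lines) + "\n"

    if __name__ == "__main__":
        print(generate_dataclass("UserRecord", FIELDS))
    ```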

  • One thing I’ve heard repeated about OpenAI is that “the engineers don’t even know how it works!” and I’m wondering what the rebuttal to that point is.

    While it is possible to write near-incomprehensible code and build an extremely complex environment, there is no reason to think there is absolutely no way to derive a theory of operation, especially since every part of the whole runs on deterministic machines. And yet I’ve heard this repeated at least twice (once on the Panic World pod, once on QAA).

    I would believe it’s possible to build a system so complex and so poorly documented that it is incomprehensible on its surface, but the context in which the claim is made is not one of technical incompetence; rather, the claim is often hung as bait to draw one toward thinking that maybe we could bootstrap consciousness.

    It seems like magical thinking to me, and a way of saying one or both of “we didn’t write shit down and therefore have no idea how the functionality works” and “we do not practically have a way to determine how a specific output was arrived at from any given prompt.” The first is in part or on the whole unlikely, since the system has to be comprehensible enough that new features can be added, and engineers would have to grok things well enough to do that. The second is a side effect of not being able to observe all of the actual inputs at the time a prompt was made (e.g. training data, user context, and system context can all be viewed as implicit inputs to a function whose output is, say, 2 seconds of Coke ad slop).
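
    A toy illustration of that second point (nothing here calls a real model; every name and value is invented): the visible prompt is only one argument to the generation function, and the implicit inputs change the output just as much.

    ```python
    # Hypothetical sketch: treat generation as a pure function of ALL its inputs.
    # An observer who only sees `prompt` cannot account for the rest.
    import hashlib

    def generate(weights_version: str, system_context: str, user_context: str,
                 prompt: str, seed: int) -> str:
        # Stand-in for a forward pass: the output is fully determined by the
        # arguments, but most of them are invisible at prompt time.
        digest = hashlib.sha256(
            f"{weights_version}|{system_context}|{user_context}|{prompt}|{seed}".encode()
        ).hexdigest()
        return f"2 seconds of Coke ad slop #{digest[:8]}"

    # Same prompt, different implicit inputs, different output:
    print(generate("weights-v1", "sys-A", "user-123", "make a soda ad", seed=7))
    print(generate("weights-v1", "sys-B", "user-123", "make a soda ad", seed=7))
    ```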

    Anybody else have thoughts on countering the magic of “the engineers don’t know how it works!”?