• bitwolf@sh.itjust.works
    3 hours ago

    I had a similar experience processing PDFs of building plans. A batch of 4-8k PDFs took 5-10 minutes in Python.

    I ended up switching to Node.js, which processed the same PDFs in 120 seconds.

    Over the years I’ve only really found Python useful for interviewing and occasionally in CI pipelines.

    • wheezy@lemmy.ml
      3 minutes ago

      It’s a great tinkering language, which is a lot of what I do for personal projects no one else will ever see. I find its biggest strength is also its biggest weakness: it’s so easy to write that you assume you don’t have to care about the under-the-hood stuff.

      But something as simple as using a list instead of a set can turn a 2 minute script into a 2 hour script pretty quickly.

      I remember when I first started using it, I was building a list and then checking each element of another list to see whether it was contained in the first.

      My list was static during the comparison, so I just did an “x in Y”.

      Y was massive though.

      If I had been using any other language I would have thought about the data type more and obviously used a set/hash for O(1) lookup. But since I was new to Python I didn’t even think about it, because it didn’t seem to give a fuck about data types.

      A simple set_a = set(list_a) was all I needed. I think Python is so easy to pick up that no one even bothers to optimize what they are writing, so you get even worse performance than you should.
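
For anyone who wants to see the difference, here is a rough sketch of the list-versus-set lookup described above. The data, sizes, and function names are made up for illustration (only `list_a` and `set_a` echo the comment), and actual timings will depend on your machine and data.

```python
import timeit

# Hypothetical data: a static list to search in, and a batch of values to look up.
list_a = list(range(10_000))
queries = list(range(9_500, 10_500))  # half of these are present in list_a


def count_hits_list(haystack, needles):
    """Membership test against a list: each `in` scans the list, O(n) per lookup."""
    return sum(1 for x in needles if x in haystack)


def count_hits_set(haystack, needles):
    """Build a set once, then each `in` is an average O(1) hash lookup."""
    set_a = set(haystack)
    return sum(1 for x in needles if x in set_a)


if __name__ == "__main__":
    # Both approaches find the same matches; only the cost differs.
    print(count_hits_list(list_a, queries), count_hits_set(list_a, queries))

    # Time a single run of each; the gap widens as the list grows.
    t_list = timeit.timeit(lambda: count_hits_list(list_a, queries), number=1)
    t_set = timeit.timeit(lambda: count_hits_set(list_a, queries), number=1)
    print(f"list: {t_list:.4f}s  set: {t_set:.4f}s")
```

Note that the set version only works if the elements are hashable, and the one-time cost of `set(list_a)` pays off only when you do many lookups against the same collection.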