Links are almost always base64-encoded now, and the online URL decoders always produce garbage. I was wondering if there is a project out there that would let me self-host this type of tool?

I’d probably network this container through gluetun because, yanno, privacy.

Edit to add: Doesn’t have to be specifically base64 focused. Any link decoder that I can use in a privacy-respecting way would be welcome.

Edit 2: See if your solution will decode this link (the one in the image): https://link.sfchronicle.com/external/41488169.38548/aHR0cHM6Ly93d3cuaG90ZG9nYmlsbHMuY29tL2hhbWJ1cmdlci1tb2xkcy9idXJnZXItZG9nLW1vbGQ_c2lkPTY4MTNkMTljYzM0ZWJjZTE4NDA1ZGVjYSZzcz1QJnN0X3JpZD1udWxsJnV0bV9zb3VyY2U9bmV3c2xldHRlciZ1dG1fbWVkaXVtPWVtYWlsJnV0bV90ZXJtPWJyaWVmaW5nJnV0bV9jYW1wYWlnbj1zZmNfYml0ZWN1cmlvdXM/6813d19cc34ebce18405decaB7ef84e41 (it should decode to this page: https://www.hotdogbills.com/hamburger-molds)

  • e0qdk@reddthat.com
    22 hours ago

    There’s something else going on there besides base64 encoding of the URL – possibly they have some binary tracking data or other crap that only makes sense to the creator of the link.

    It’s not hard to write a small Python script that gets what you want out of a URL like that though. Here’s one that works with your sample link:

    #!/usr/bin/env python3
    
    import base64
    import binascii
    import itertools
    import string
    import sys
    
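    input_url = sys.argv[1]
    parts = input_url.split("/")

    # Walk the URL right-to-left, building progressively longer chunks:
    # "last", "second-to-last/last", and so on.
    for chunk in itertools.accumulate(reversed(parts), lambda b, a: "/".join([a, b])):
        try:
            # Bytes that are not ASCII are dropped rather than raising.
            text = base64.b64decode(chunk).decode("ascii", errors="ignore")
            # Keep only the leading run of printable characters.
            clean = "".join(itertools.takewhile(lambda x: x in string.printable, text))
            print(clean)
        except binascii.Error:
            # Not valid base64 (e.g. wrong length); try the next, longer chunk.
            continue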
    

    Save that to a file like decode.py and then you can run it on the command line like python3 ./decode.py 'YOUR-LINK-HERE'

    e.g.

    $ python3 ./decode.py 'https://link.sfchronicle.com/external/41488169.38548/aHR0cHM6Ly93d3cuaG90ZG9nYmlsbHMuY29tL2hhbWJ1cmdlci1tb2xkcy9idXJnZXItZG9nLW1vbGQ_c2lkPTY4MTNkMTljYzM0ZWJjZTE4NDA1ZGVjYSZzcz1QJnN0X3JpZD1udWxsJnV0bV9zb3VyY2U9bmV3c2xldHRlciZ1dG1fbWVkaXVtPWVtYWlsJnV0bV90ZXJtPWJyaWVmaW5nJnV0bV9jYW1wYWlnbj1zZmNfYml0ZWN1cmlvdXM/6813d19cc34ebce18405decaB7ef84e41'
    https://www.hotdogbills.com/hamburger-molds/burger-dog-mold
    

    This script works by splitting the URL at ‘/’ characters and then recombining the parts (right-to-left) and checking if that chunk of text can be base64 decoded successfully. If it can, it then takes any printable ASCII characters at the start of the decoded string and outputs them (to clean up the garbage characters at the end). If there’s more than one possible valid interpretation as base64, it will print them all as it finds them.
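
The long path segment in the sample link contains a '_', which suggests it may use the URL-safe base64 alphabet with the '=' padding stripped. Assuming that holds, a rough variant of the script above could decode each path segment on its own with base64.urlsafe_b64decode instead of recombining parts. This is only a sketch: the names decode_segments and strip_query are made up for illustration, and dropping the query string is just one guess at how to get rid of tracking parameters.

```python
#!/usr/bin/env python3
"""Sketch: decode URL-safe base64 path segments (an assumption about the link format)."""

import base64
import binascii
import sys
from urllib.parse import urlsplit, urlunsplit


def decode_segments(url):
    """Yield each '/'-separated piece of `url` that decodes to an http(s) URL."""
    for segment in url.split("/"):
        # Restore the trailing '=' padding that links often drop.
        padded = segment + "=" * (-len(segment) % 4)
        try:
            text = base64.urlsafe_b64decode(padded).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            # Not base64, or not text once decoded; skip this segment.
            continue
        if text.startswith(("http://", "https://")):
            yield text


def strip_query(url):
    """Drop the query string and fragment (hypothetical cleanup step)."""
    scheme, netloc, path, _query, _fragment = urlsplit(url)
    return urlunsplit((scheme, netloc, path, "", ""))


if __name__ == "__main__" and len(sys.argv) > 1:
    for decoded in decode_segments(sys.argv[1]):
        print(strip_query(decoded))
```

Saved under a name like decode_urlsafe.py (hypothetical) and run on the sample link from the post, this should print https://www.hotdogbills.com/hamburger-molds/burger-dog-mold. Printing the decoded value without strip_query would also show the sid and utm_* parameters that were encoded along with the page address. Unlike the script above, it will miss an encoded URL that spans more than one path segment.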