• pieland@piefed.social
      2 days ago

      I haven’t tried this, but this is what I’ve read:

      You can highlight the redacted text, copy it, and paste the text into another document (like Word, WordPad, Notepad, etc.).
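      The copy-and-paste trick works only when the text layer is still in the file underneath the black boxes. As a rough sketch of the same idea in code, assuming the pdfplumber library mentioned elsewhere in this thread is installed, the text layer can be dumped page by page. The function names `page_texts` and `main` are made up for this illustration.

```python
import sys


def page_texts(pdf):
    """Return the text layer of each page; pages without text yield ''."""
    # extract_text() can return None or '' when a page has no text layer.
    return [page.extract_text() or "" for page in pdf.pages]


def main(pdf_path):
    # Third-party dependency, imported here so the helper above stays usable
    # without it: pip install pdfplumber
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        for number, text in enumerate(page_texts(pdf), start=1):
            print(f"--- page {number} ---")
            print(text)


if __name__ == "__main__":
    # Only run when called as: python dump_text.py some_file.pdf
    if len(sys.argv) == 2 and sys.argv[1].lower().endswith(".pdf"):
        main(sys.argv[1])
```

      If the output contains the words that appear blacked out in a viewer, the redaction was only drawn on top of the text. If the pages come back empty, the text was probably removed or the page is a scanned image.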

      Another method I saw mentioned on Facebook:

      “The backgrounds are transparent. Pull them into photoshop and throw a layer of white between the text and the black background and you have your text.”

      If anything is unclear, please ask. Even if I can’t answer it, maybe someone else can.

    • adhd_traco@piefed.social
      2 days ago

      It looks like the maintainers of the project have updated their project and page. It’s much simpler now.

      Installation:

      pip install pdfplumber pymupdf

      Usage:

      python redact_extract.py example.pdf

      • pieland@piefed.social
        2 days ago

        If anyone has any questions about this comment as well, please feel free to ask. I know what this comment means, but a few years ago I wouldn’t’ve had a clue.

    • adhd_traco@piefed.social
      2 days ago

      Here are step-by-step instructions for the tool that the OP of the Reddit thread open-sourced. It creates a side-by-side PDF of the redacted and unredacted versions at the end.

      No root access is required at any point.

      1. Download and extract the files from https://github.com/leedrake5/unredact?tab=readme-ov-file

      2. Create a Python virtual environment. Make sure the destination folder (~/.env here) doesn’t already exist.

      python3 -m venv ~/.env

      3. Activate the environment.

      source ~/.env/bin/activate
      You should now see (.env) before your prompt.

      4. Install the Python dependencies.

      pip install pdfplumber pymupdf
      You’re all set up.

      5. With the virtual environment still active, as indicated by the (.env) before your prompt, navigate to the downloaded GitHub project folder, where the ‘redact_extract.py’ file is located.

      6. Copy whatever PDF document you want to try to unredact to the same location.

      7. Execute the script.

      python redact_extract.py taco_crimes.pdf

      The script should now have created a file in the current location with the redacted and unredacted versions side by side.
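      For anyone curious how a bad redaction can be spotted in the first place, here is a minimal sketch of one possible approach: look for characters whose bounding box overlaps a black-filled rectangle. This is not taken from redact_extract.py, and I don't know how that script works internally. It assumes pdfplumber's `page.chars` and `page.rects` dictionaries (with `x0`, `x1`, `top`, `bottom` and `non_stroking_color` keys), and the helper names are invented for the example. CMYK fills and redactions drawn as images or annotations are not handled.

```python
def overlaps(char, rect):
    """True if the character's box intersects the rectangle's box."""
    return (
        char["x0"] < rect["x1"] and char["x1"] > rect["x0"]
        and char["top"] < rect["bottom"] and char["bottom"] > rect["top"]
    )


def is_black_fill(color):
    """True for a grayscale or RGB fill whose components are all zero."""
    if color is None:
        return False
    if isinstance(color, (int, float)):
        color = (color,)
    return len(color) in (1, 3) and all(c == 0 for c in color)


def hidden_text(chars, rects):
    """Join the characters that sit underneath any black-filled rectangle."""
    black = [r for r in rects if is_black_fill(r.get("non_stroking_color"))]
    covered = [c for c in chars if any(overlaps(c, r) for r in black)]
    return "".join(c["text"] for c in covered)


def scan(pdf_path):
    """Map page number to the text found under black rectangles."""
    import pdfplumber  # third-party: pip install pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        return {p.page_number: hidden_text(p.chars, p.rects) for p in pdf.pages}
```

      Calling `scan("taco_crimes.pdf")` would return a dictionary with one entry per page. An empty string for a page means nothing was found under a black box with this simple check, which does not prove the redaction is sound.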

      To leave the virtual environment:

      deactivate

      To enter it again:

      source ~/.env/bin/activate

      To remove everything cleanly, just delete the virtual environment folder (~/.env in this case).
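      If you'd rather not type the setup steps by hand, they can be sketched with Python's standard library. This is only an illustration under a few assumptions: it runs from the folder that contains redact_extract.py, and it calls the virtual environment's own interpreter directly instead of activating the environment, which has the same effect for these commands. The function names are made up for the example.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir):
    """Path of the interpreter inside the virtual environment."""
    env_dir = Path(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def build_commands(env_dir, pdf_path, script="redact_extract.py"):
    """The two commands from the steps above, using the venv's interpreter."""
    py = str(venv_python(env_dir))
    return [
        [py, "-m", "pip", "install", "pdfplumber", "pymupdf"],
        [py, script, str(pdf_path)],
    ]


def run(env_dir, pdf_path):
    """Create the environment, install the dependencies, run the script."""
    env_dir = Path(env_dir).expanduser()
    if env_dir.exists():
        # Same precaution as in step 2: don't reuse an existing folder.
        raise FileExistsError(f"{env_dir} already exists")
    venv.create(env_dir, with_pip=True)
    for command in build_commands(env_dir, pdf_path):
        subprocess.run(command, check=True)
```

      For example, `run("~/.env", "taco_crimes.pdf")` would mirror the walkthrough above. Deleting the environment folder afterwards removes everything the sketch installed.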


      The project linked in evacide’s Mastodon toot is even simpler to install. Create and activate a virtual environment like before, but at a different location (.env1 instead of .env, for example).

      Then install the tool from pip in the virtual environment:

      pip install x-ray

      The tool is now installed and can be executed with a pdf file like so:

      xray /path/to/your/file.pdf

      https://github.com/freelawproject/x-ray

      (sorry for the bad formatting. after posting, I can’t preview anymore to figure out how to fix it.)

        • adhd_traco@piefed.social
          2 days ago

          I haven’t tried any of these yet, other than verifying with the sample from the Reddit OP.

          Here’s a Google Drive link from the Reddit thread with three files: one original justice.gov PDF with bad redactions, the same file unredacted, and a third PDF showing the two side by side. By default, OP’s tool creates this side-by-side PDF.