• adhd_traco@piefed.social
    20 hours ago

    Here are step-by-step instructions for the tool that the OP of the Reddit thread open-sourced. At the end, it creates a side-by-side PDF of the redacted and unredacted versions.

    No root access is required at any point.

    1. Download and extract the files from https://github.com/leedrake5/unredact?tab=readme-ov-file

    2. Create a Python virtual environment. Make sure the destination folder (~/.env here) doesn’t already exist.

    python3 -m venv ~/.env

    3. Activate the environment.

    source ~/.env/bin/activate
    You should now see (.env) before your prompt.

    4. Now install the Python dependencies.

    pip install pdfplumber pymupdf
    You’re all set up.
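
    If you want to double-check the setup before continuing, here is a minimal sketch. It relies on two things: the activate script sets the VIRTUAL_ENV variable, and PyMuPDF is importable under the module name fitz. The helper names venv_status and check_modules are made up for this example and are not part of the tool.

```shell
# Report whether a virtual environment is active.
# (The activate script exports VIRTUAL_ENV; deactivate unsets it.)
venv_status() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "active: $VIRTUAL_ENV"
  else
    echo "not active"
  fi
}

# Report, for each module name given, whether Python can import it.
check_modules() {
  for mod in "$@"; do
    if python3 -c "import $mod" 2>/dev/null; then
      echo "ok: $mod"
    else
      echo "missing: $mod"
    fi
  done
}

venv_status
# pymupdf is imported as "fitz"
check_modules pdfplumber fitz
```

    If a module shows up as missing, the environment is probably not active, or the pip install step ran outside of it.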

    5. With your virtual environment still active (indicated by the (.env) before your prompt), navigate to the downloaded GitHub project, where the ‘redact_extract.py’ file is located.

    6. Copy whatever PDF document you want to try to unredact to the same location.

    7. Execute the script.

    python redact_extract.py taco_crimes.pdf

    The script should now have created a file in the current location with the redacted and unredacted versions side by side.
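
    If you have several PDFs to try, a small loop saves retyping the command. This is only a sketch: for_each_pdf is a hypothetical helper, not something the project ships, and it assumes the script takes one PDF path per run, as in the command above.

```shell
# Run a command once for every .pdf file in a directory,
# appending the file path as the last argument.
for_each_pdf() {
  dir=$1
  shift
  for pdf in "$dir"/*.pdf; do
    # If the glob matched nothing, the pattern stays literal; skip it.
    [ -e "$pdf" ] || continue
    "$@" "$pdf"
  done
}

# Example, run from the project folder with the environment active:
#   for_each_pdf . python redact_extract.py
```

    Quoting "$pdf" keeps file names with spaces intact.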

    To leave the virtual environment:

    deactivate

    To enter it again:

    source ~/.env/bin/activate

    To delete everything cleanly, just delete the virtual environment folder (~/.env in this case).


    The project linked in evacide’s Mastodon toot is even simpler to install. Create and activate a virtual environment like before, but at a different location (.env1 instead of .env, for example).

    Then install the tool from pip in the virtual environment:

    pip install x-ray

    The tool is now installed and can be executed with a pdf file like so:

    xray /path/to/your/file.pdf

    https://github.com/freelawproject/x-ray

    (Sorry for the bad formatting. After posting, I can’t preview anymore to figure out how to fix it.)

      • adhd_traco@piefed.social
        1 day ago

        I haven’t tried any yet, except to verify with the sample from the Reddit OP.

        Here’s a Google Drive link from the Reddit thread with three files: one original justice.gov PDF with bad redactions, the same file unredacted, and a third, single PDF showing the two side by side. By default, OP’s tool creates this side-by-side PDF.