Download watermarked file

Hi there.

I’m looking for a way to download a watermarked file using the API with a viewer user.
I saw the platform docs mentioning that when using a Viewer user to download a watermarked file, you need to first open the file for preview and then the download becomes available.

How can I do the same thing using the API instead of the UI?

Hi @juliano.net, welcome to the forum!

You started with a very interesting question, so let's go step by step.

As you stated, and as the support note confirms, a watermarked download is only possible when the user is a viewer.

This works fine in the previewer UI.

However, when we try to download the file through the API for a user with viewer permission, we get an HTTP 403 (access denied).

If the user is an editor or above, they can download the file without the watermark, since they can edit it.

In order to get around this you can download the PDF representation of the file. This will include the watermark, and it is accessible to the viewer.

Not sure if you are familiar with file representations; we have an article on that topic:

Consider these helper functions for file representations (I’m using Python):

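from typing import List, Optional

import requests

# Client, FileMini, FileFullRepresentationsEntriesField and
# FileFullRepresentationsEntriesStatusStateField come from the Box Python SDK.


def do_request(url: str, access_token: str) -> bytes:
    """GET a URL with a bearer token and return the raw response body."""
    resp = requests.get(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
    resp.raise_for_status()
    return resp.content


def file_representations(
    client: Client, file: FileMini, rep_hints: Optional[str] = None
) -> List[FileFullRepresentationsEntriesField]:
    """Get file representations"""
    file = client.files.get_file_by_id(
        file.id, fields=["name", "representations"], x_rep_hints=rep_hints
    )
    return file.representations.entries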


def representation_download(
    access_token: str,
    file_representation: FileFullRepresentationsEntriesField,
    file_name: str,
):
    if (
        file_representation.status.state
        != FileFullRepresentationsEntriesStatusStateField.SUCCESS
    ):
        print(
            f"Representation {file_representation.representation} is not ready"
        )
        return

    url_template = file_representation.content.url_template
    url = url_template.replace("{+asset_path}", "")
    file_name = (
        file_name.replace(".", "_").replace(" ", "_")
        + "."
        + file_representation.representation
    )

    content = do_request(url, access_token)

    with open(file_name, "wb") as file:
        file.write(content)

    # Adjacent f-strings are concatenated, so the message has single spacing
    print(
        f"Representation {file_representation.representation}"
        f" saved to {file_name}"
    )

This will download the PDF representation of a Word document, which includes the watermark.

    FILE_DOCX = "1418524151903"

    # Make sure the file exists
    file_docx = client.files.get_file_by_id(FILE_DOCX)
    print(f"\nFile {file_docx.name} ({file_docx.id})")

    # Get PDF representation
    file_docx_repr_pdf = file_representations(client, file_docx, "[pdf]")
    access_token = client.auth.retrieve_token().access_token
    representation_download(access_token, file_docx_repr_pdf[0], file_docx.name)
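
If the representation has not been generated yet, representation_download above just prints "is not ready" and returns. One way to handle that is to re-check the state a few times before giving up. This is only a rough sketch: wait_until_ready, get_state, attempts and delay_seconds are names I made up for illustration, and I am assuming the state eventually reads "success" as in the check above.

```python
import time
from typing import Callable


def wait_until_ready(
    get_state: Callable[[], str],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call get_state() until it returns "success" or the attempts run out."""
    for attempt in range(attempts):
        if get_state() == "success":
            return True
        # Do not sleep after the final attempt
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

Here get_state would be a small function that calls file_representations again and returns the state of the first entry as a string. The sleep parameter is only there so the loop can be exercised without actually waiting.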

Let us know if this helps

Cheers

Hi @rbarbosa, thanks for your response.

I saw a code snippet using the file representations feature; however, when I made a request to the endpoint, I only got JSON like this (taken from your blog article, but the structure was the same):

[
...
    {
        "representation": "jpg",
        "properties": {
            "dimensions": "32x32",
            "paged": "false",
            "thumb": "true"
        },
        "info": {
            "url": "https://api.box.com/2.0/internal_files/1294096878155/versions/1415005971755/representations/jpg_thumb_32x32"
        }
    },
...
]

There were no url_template fields in there. I’ll check the Python SDK source code to find out whether I’m missing an HTTP header or anything like that.

Not sure what you mean by your comment:

Was this a previous attempt?

That is a jpeg image representation, specifically an icon.

Did you try requesting a PDF representation?
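
In case you are calling the REST endpoint directly rather than going through the SDK, here is a rough sketch of how that request could be put together with plain requests. As far as I know, the x-rep-hints header is what makes the response include content.url_template for the requested representation. The function names and the "YOUR_ACCESS_TOKEN" placeholder are mine, just for illustration.

```python
from typing import Any, Dict, Tuple

API_BASE = "https://api.box.com/2.0"


def build_representations_request(
    file_id: str, access_token: str, rep_hints: str = "[pdf]"
) -> Tuple[str, Dict[str, str], Dict[str, str]]:
    """Return (url, params, headers) for a file request that asks for representations."""
    url = f"{API_BASE}/files/{file_id}"
    params = {"fields": "name,representations"}
    headers = {
        "Authorization": f"Bearer {access_token}",
        # Names the representation we want details for (here: PDF)
        "X-Rep-Hints": rep_hints,
    }
    return url, params, headers


def fetch_representations(file_id: str, access_token: str) -> Any:
    """Send the request and return the representation entries from the JSON body."""
    import requests  # imported here so the builder above has no dependencies

    url, params, headers = build_representations_request(file_id, access_token)
    resp = requests.get(url, params=params, headers=headers)
    resp.raise_for_status()
    return resp.json()["representations"]["entries"]
```

Calling fetch_representations("1418524151903", "YOUR_ACCESS_TOKEN") with a real token should return a list of entries for that file.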

For example when requesting the PDF representation, I get this JSON:

[
    {
        "content": {
            "url_template": "https://public.boxcloud.com/api/2.0/internal_files/1418524151903/versions/1590468968479/representations/pdf/content/{+asset_path}?watermark_content=1708453905"
        },
        "info": {
            "url": "https://api.box.com/2.0/internal_files/1418524151903/versions/1590468968479/representations/pdf"
        },
        "properties": {
            "dimensions": null,
            "paged": null,
            "thumb": null
        },
        "representation": "pdf",
        "status": {
            "state": "success"
        }
    }
]
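
To go from an entry like that to something you can download, a minimal sketch could look like the following. It works on the plain JSON dict rather than the SDK objects, and representation_content_url is a hypothetical helper name. It mirrors the earlier Python code: only a "success" state is accepted, and {+asset_path} is replaced with an empty string, which is what I use for a PDF.

```python
from typing import Any, Dict, Optional


def representation_content_url(
    entry: Dict[str, Any], asset_path: str = ""
) -> Optional[str]:
    """Return the download URL for a representation entry, or None if it is not usable."""
    state = entry.get("status", {}).get("state")
    if state != "success":
        return None
    template = entry.get("content", {}).get("url_template")
    if not template:
        # Entries without content.url_template cannot be downloaded this way
        return None
    return template.replace("{+asset_path}", asset_path)
```

The returned URL still has to be requested with the Authorization: Bearer header, the same way do_request does it. For an entry that only has "info" (like the jpg one you posted), the function returns None.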

Let us know.

Best regards

Actually, it worked. I may have mixed in other request settings in the middle of the REPL session.

Thank you, @rbarbosa.
