How to download files from a shared folder

Hi,
I need to download all the files in
https://app.box.com/s/rf6p81j3o507e8c5saywtlc1p91f8po9
I cannot download it through the browser because the files are > 150 GB. Thus, I want to create a script that downloads the files one by one.

To do so, I signed up for Box and created a Custom App. On the ‘purpose’ drop-down menu, I selected ‘other’. Next, I selected ‘server authentication (with JWT)’. I then navigated to the Configuration tab and clicked ‘Generate a Public/Private Keypair’, which downloaded a file named 0_XXXXXenv_config.json, where the X’s are random digits/characters. I renamed this file to config.json and then tried to run:

from boxsdk import JWTAuth, Client

auth = JWTAuth.from_settings_file('config.json')
client = Client(auth)
auth.authenticate_instance()

shared_folder = client.get_shared_item("https://app.box.com/s/rf6p81j3o507e8c5saywtlc1p91f8po9")
for item in shared_folder.get_items(limit=1000):
    client.file(file_id=item.id).download_to(item.name)
    break

but I get:

boxsdk.exception.BoxOAuthException: 
Message: Please check the 'sub' claim. The 'sub' specified is invalid.
Status: 400
URL: https://api.box.com/oauth2/token
Method: POST
Headers: {'Date': 'Tue, 23 Apr 2024 08:49:18 GMT', 'Content-Type': 'application/json', 'Strict-Transport-Security': 'max-age=31536000', 'Set-Cookie': 'box_visitor_id=6627760e46a336.78777513; expires=Wed, 23-Apr-2025 08:49:18 GMT; Max-Age=31536000; path=/; domain=.box.com; secure; SameSite=None, bv=DSYS-1179; expires=Tue, 30-Apr-2024 08:49:18 GMT; Max-Age=604800; path=/; domain=.app.box.com; secure, cn=4; expires=Wed, 23-Apr-2025 08:49:18 GMT; Max-Age=31536000; path=/; domain=.app.box.com; secure, site_preference=desktop; path=/; domain=.box.com; secure', 'Cache-Control': 'no-store', 'Via': '1.1 google', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'Transfer-Encoding': 'chunked'}

I cannot list the files, let alone download them. What am I doing wrong?

Hi @TravisPetit , welcome to the forum!

Looks like something is off in your config.json file.

The sub claim represents the box_subject_id. It should be your enterprise ID if the box_sub_type is enterprise, which it should be in your case. As a side note, you could also specify a sub type of user and put the user ID in the sub claim.
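To make the relationship between sub and box_sub_type concrete, here is a minimal sketch of the claims that go into the JWT assertion sent to the token endpoint. The SDK builds these for you; build_claims is a hypothetical helper written only for illustration, and the 45-second expiry is an arbitrary choice for this sketch.

```python
import time
import uuid

# Token endpoint that appears in the "aud" claim (same URL as in the error above).
BOX_TOKEN_URL = "https://api.box.com/oauth2/token"


def build_claims(config, sub_type="enterprise", user_id=None):
    """Return the JWT claims for a Box token request (hypothetical helper)."""
    if sub_type == "enterprise":
        # For an enterprise token, "sub" is the enterprise ID from config.json.
        subject = config["enterpriseID"]
    elif sub_type == "user":
        # For a user token, "sub" is the ID of the user to act as.
        if user_id is None:
            raise ValueError("user_id is required when sub_type is 'user'")
        subject = user_id
    else:
        raise ValueError(f"unknown sub_type: {sub_type!r}")
    if not subject:
        raise ValueError("the subject is empty")

    return {
        "iss": config["boxAppSettings"]["clientID"],
        "sub": str(subject),
        "box_sub_type": sub_type,
        "aud": BOX_TOKEN_URL,
        "jti": uuid.uuid4().hex,       # unique ID for this assertion
        "exp": int(time.time()) + 45,  # short-lived expiry (arbitrary here)
    }
```

If the value that ends up in sub does not belong to an enterprise or user that the app is allowed to act for, the token request is rejected with the "Please check the 'sub' claim" message.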

First, I would check the config.json. Here is a sample:

{
  "boxAppSettings": {
    "clientID": "fe...so",
    "clientSecret": "3N...xi",
    "appAuth": {
      "publicKeyID": "39749s1s",
      "privateKey": "-----BEGIN ENCRYPTED PRIVATE KEY-----\nMI...G\nlCE=\n-----END ENCRYPTED PRIVATE KEY-----\n",
      "passphrase": "fa...c3"
    }
  },
  "enterpriseID": "877840855",
  "webhooks": {
    "primaryKey": "zX...E0",
    "secondaryKey": "Ew...nf"
  }
}

Make sure your enterprise ID matches the one shown under Admin console → Billing.
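If you want to sanity-check the file before handing it to the SDK, something like the sketch below might help. It only checks that the fields from the sample are present and non-empty. find_config_problems and check_config_file are hypothetical helpers, and treating an enterpriseID of 0 as a problem is an assumption on my part.

```python
import json

REQUIRED_APP_AUTH_KEYS = ("publicKeyID", "privateKey", "passphrase")


def find_config_problems(config):
    """Return a list of human-readable problems found in a parsed config."""
    problems = []
    settings = config.get("boxAppSettings", {})
    for key in ("clientID", "clientSecret"):
        if not settings.get(key):
            problems.append(f"boxAppSettings.{key} is missing or empty")

    app_auth = settings.get("appAuth", {})
    for key in REQUIRED_APP_AUTH_KEYS:
        if not app_auth.get(key):
            problems.append(f"boxAppSettings.appAuth.{key} is missing or empty")

    # Assumption: a usable enterprise ID is a non-zero string of digits.
    enterprise_id = str(config.get("enterpriseID") or "")
    if not enterprise_id.isdigit() or int(enterprise_id) == 0:
        problems.append("enterpriseID is missing, empty, or zero")
    return problems


def check_config_file(path="config.json"):
    """Load a JWT config file and report any problems found in it."""
    with open(path, encoding="utf-8") as handle:
        return find_config_problems(json.load(handle))
```

An empty list means the structure looks right. It does not prove that the enterprise ID is the correct one, so the comparison against the admin console is still needed.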

Another common mistake I make is forgetting to re-authorize the JWT application every time I change its configuration.

Make sure you have submitted your application under Authorization in the developer console.

Then, in the admin console → Apps, under the custom apps manager, authorize the app.

This seems to be quite a popular use case. There are several posts on the forum mentioning the need to build a script to download data.

Just for fun, here is an example similar to yours that skips the .gz files:

"""demo to download files from a box web link"""

import os
from boxsdk import JWTAuth, Client


def main():
    auth = JWTAuth.from_settings_file(".jwt.config.json")
    auth.authenticate_instance()
    client = Client(auth)

    web_link_url = "https://app.box.com/s/rf6p81j3o507e8c5saywtlc1p91f8po9"

    user = client.user().get()
    print(f"User: {user.id}:{user.name}")

    shared_folder = client.get_shared_item(web_link_url, "")
    print(f"Shared Folder: {shared_folder.id}:{shared_folder.name}")
    print("#" * 80)

    print("Type\tID\t\tName")
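    # create the target directory on first run so chdir does not fail
    os.makedirs("downloads", exist_ok=True)
    os.chdir("downloads")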
    items = shared_folder.get_items()
    download_items(items)
    os.chdir("..")


def download_items(items):

    for item in items:
        if item.type == "folder":
            if not os.path.exists(item.name):
                os.mkdir(item.name)
            os.chdir(item.name)

            # print the folder name
            print("-" * 80)
            print(f"\n\n{item.type}\t{item.id}\t{item.name}")
            print("-" * 80)

            download_items(item.get_items())
            os.chdir("..")

        if item.type == "file":
            print(f"{item.type}\t{item.id}\t{item.name}", end="")

            # skip compressed files (any name ending in .gz)
            if item.name.endswith(".gz"):
                print("\t .gz skipped")
                continue

            with open(item.name, "wb") as download_file:
                item.download_to(download_file)
            print("\tdone")


if __name__ == "__main__":
    main()
    print("Done")

Resulting in:

User: 20344589936:UI-Elements-Sample
Shared Folder: 193110430595:INTERVAL_Metabolon_GWAS_summary_stats
################################################################################
Type    ID              Name
--------------------------------------------------------------------------------


folder  193712488944    M00053
--------------------------------------------------------------------------------
file    1134408152107   INTERVAL_M00053_formattedForMeta_sorted_chr_1.txt.gz     .gz skipped
file    1134416904386   INTERVAL_M00053_formattedForMeta_sorted_chr_1.txt.gz.tbi        done
file    1134408265055   INTERVAL_M00053_formattedForMeta_sorted_chr_10.txt.gz    .gz skipped
file    1134423566812   INTERVAL_M00053_formattedForMeta_sorted_chr_10.txt.gz.tbi       done
file    1134409796066   INTERVAL_M00053_formattedForMeta_sorted_chr_11.txt.gz    .gz skipped
file    1134417464230   INTERVAL_M00053_formattedForMeta_sorted_chr_11.txt.gz.tbi       done
file    1134414789727   INTERVAL_M00053_formattedForMeta_sorted_chr_12.txt.gz    .gz skipped
file    1134410779924   INTERVAL_M00053_formattedForMeta_sorted_chr_12.txt.gz.tbi       done
file    1134416678707   INTERVAL_M00053_formattedForMeta_sorted_chr_13.txt.gz    .gz skipped
file    1134418213051   INTERVAL_M00053_formattedForMeta_sorted_chr_13.txt.gz.tbi       done
file    1134417716501   INTERVAL_M00053_formattedForMeta_sorted_chr_14.txt.gz    .gz skipped
file    1134412411878   INTERVAL_M00053_formattedForMeta_sorted_chr_14.txt.gz.tbi       done
file    1134411158800   INTERVAL_M00053_formattedForMeta_sorted_chr_15.txt.gz    .gz skipped
file    1134417029597   INTERVAL_M00053_formattedForMeta_sorted_chr_15.txt.gz.tbi       done
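As a variation on the script above, here is a sketch that builds each local path explicitly instead of calling os.chdir, which may be easier to reason about if a download fails midway. plan_downloads and download_all are hypothetical helpers. They only assume that each item has type, name, get_items() (for folders) and download_to() (for files), as used in the script above.

```python
import os


def plan_downloads(items, base_dir="downloads", skip_suffixes=(".gz",)):
    """Yield (item, local_path) for every file that should be downloaded."""
    for item in items:
        target = os.path.join(base_dir, item.name)
        if item.type == "folder":
            # Recurse into the folder, using its name as the next path segment.
            yield from plan_downloads(item.get_items(), target, skip_suffixes)
        elif item.type == "file" and not item.name.endswith(skip_suffixes):
            yield item, target


def download_all(items, base_dir="downloads"):
    """Download every planned file, creating local folders as needed."""
    for item, path in plan_downloads(items, base_dir):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as handle:
            item.download_to(handle)
```

With the client from the script above, this would be called as download_all(shared_folder.get_items()).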

Let us know if this helps.

Hi @rbarbosa, thank you so much for your reply.
I do not have an Authorization tab. Instead, I have ‘General Settings’, ‘Configuration’ and ‘App Diagnostics’ tabs, as shown in this screenshot.

Am I doing something wrong? I chose JWT as an authentication method.

Thanks again.

Best
Travis

Hi @TravisPetit

That explains it!

It seems to me that you have either a free account or a “Personal Pro” account, neither of which has the admin console or application approval.

These accounts do not support CCG or JWT applications, only OAuth 2.0.

We do have a free developer account that will enable you to do this.

You have a couple of options moving forward:

  • Create a new free developer account and discard the existing one. Check your current usage, files, shared links, etc. before you do.
  • Use your current account with OAuth 2.0. Not ideal for scripting, but doable.
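To show why the OAuth 2.0 route is less convenient for scripting, here is a rough sketch of the two pieces of the authorization-code flow. A person has to open the first URL in a browser and approve the app before the script can exchange the returned code for a token. build_authorize_url and build_token_request are hypothetical helpers, and the redirect URI is whatever you configured for your app.

```python
import secrets
from urllib.parse import urlencode

AUTHORIZE_URL = "https://account.box.com/api/oauth2/authorize"
TOKEN_URL = "https://api.box.com/oauth2/token"


def build_authorize_url(client_id, redirect_uri, state=None):
    """Return (url, state); the user must open the URL in a browser."""
    # A random state value lets you verify the redirect came from this request.
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    })
    return f"{AUTHORIZE_URL}?{query}", state


def build_token_request(client_id, client_secret, code):
    """Return the form fields to POST to TOKEN_URL once you have the code."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,
    }
```

The manual browser step is the part that JWT avoids, which is why JWT is the more comfortable choice for an unattended download script.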

Let us know which you would prefer.
Feel free to send me a private message with more details so I can identify your current account (is it the one associated with your forum user email?).

Hi @rbarbosa

I created a new developer account, and authorized a new JWT app. I can run your script now.
Thank you very much!

Best,
Travis

Perfect, happy to help!
