Search for Folder using Metadata Query

For all intents and purposes, I’m using cURL commands to operate my application.

Project: I have a large number of folders (1000+) in the root of my application. Each folder has a standard structure of sub-folders, and within each sub-folder there can be a large number of individual files (1000+).

Need: I have a database with associated records. Each record is given an identifier (UUID). I created a metadata template with one field called UUID. The goal is to add a metadata instance carrying the associated UUID to the related files and folders. When you pull up a record in the database, the application will search all metadata for its UUID and return the appropriate files.

Problem: I have successfully done so with files. I added a metadata instance, searched for the UUID, and got back a list of files (99% of the time it returns just one). I have NOT been able to do so with folders. I need to be able to search for a metadata instance on a “folder” and retrieve it.

Because of the sheer number of folders, listing them is not practical. list_items_in_folder has a 100-entity limit per call, and looping through the results without finding the specified folder gives me a lot of false NOT FOUND errors, simply because the right folder was not included. I’m also trying to make this application run at a reasonable speed, because we can have a good number of simultaneous users. Currently, iterating through multiple loops at 100 items per call and drilling down through sub-directories creates too many API calls and bogs down the system.

What I’m looking for is more guidance on how to search for metadata on folders. Also given the scope of the project, I’m open to other suggestions on how to approach the problem.

Hi @comcduarte, welcome to the forum!

Just to see if I got your use case right, and before we jump into code, I’ve tried to implement it manually in Box. Please check my end result:

I’ve got a couple of folders:

Metadata for folder a is aaa, and for folder b is bbb:

And within each folder 3 files:

Metadata on each file is its number:

Same for folder b:

Now if I search for UUID aaa, I get folder A:

And if I search for UUID 013 I get file 013:

Is this what you are trying to get to?

Here are some search examples using cURL and metadata for the above example:

The first step is to identify the correct metadata template, for example:

curl --location 'https://api.box.com/2.0/metadata_templates/enterprise' \
--header 'Authorization: Bearer ...'

Outputs:

        {
            "id": "de0c9fdd-aa40-4430-9e52-8c6c1187ae55",
            "type": "metadata_template",
            "templateKey": "uuidTemplate",
            "scope": "enterprise_877840855",
            "displayName": "UUID Template",
            "hidden": false,
            "copyInstanceOnItemCopy": false,
            "fields": [
                {
                    "id": "9951183a-645c-40fd-9523-363e5423b431",
                    "type": "string",
                    "key": "uuid",
                    "displayName": "UUID",
                    "hidden": false,
                    "description": "UUID"
                }
            ]
        },

Now let’s search for file 013 using the UUID 013:

curl --location 'https://api.box.com/2.0/metadata_queries/execute_read' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ...' \
--data '{
  "from": "enterprise_877840855.uuidTemplate",
  "query": "uuid = :xxx",
  "query_params": {
    "xxx": "013"
  },
  "ancestor_folder_id": "0",
  "fields": [
    "type",
    "id",
    "name",
    "metadata.enterprise_877840855.uuidTemplate.uuid"
  ]

}'

Resulting in:

{
    "entries": [
        {
            "name": "File 013.docx",
            "etag": "1",
            "metadata": {
                "enterprise_877840855": {
                    "uuidTemplate": {
                        "$scope": "enterprise_877840855",
                        "$template": "uuidTemplate",
                        "$parent": "file_1410606801538",
                        "$version": 1,
                        "uuid": "013"
                    }
                }
            },
            "id": "1410606801538",
            "type": "file"
        }
    ],
    "limit": 100
}
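The "limit": 100 in the response matters if a query can ever match more than one page of items. Here is a minimal Python sketch of following pages, assuming the endpoint uses marker-based paging with a `marker` field in the request body and a `next_marker` field in the response (please verify both names against the API reference). The names `collect_all_entries` and `fetch_page` are hypothetical; `fetch_page` stands for whatever function sends one request body and returns the parsed JSON.

```python
def collect_all_entries(fetch_page, body, max_pages=50):
    """Collect entries across pages by following next_marker.

    fetch_page: hypothetical callable taking a request body (dict)
                and returning the parsed JSON response (dict).
    body:       the metadata query body, as in the cURL example.
    max_pages:  safety cap so a bad marker cannot loop forever.
    """
    entries = []
    marker = None
    for _ in range(max_pages):
        page_body = dict(body)  # copy so the caller's body is not mutated
        if marker:
            # Assumption: the next page is requested via a "marker" field.
            page_body["marker"] = marker
        page = fetch_page(page_body)
        entries.extend(page.get("entries", []))
        # Assumption: a missing or empty next_marker means no more pages.
        marker = page.get("next_marker")
        if not marker:
            break
    return entries
```

For a UUID lookup that normally returns one item this loop ends after the first page, so it costs a single API call in the common case.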

The same, but now for Folder A using UUID aaa:

curl --location 'https://api.box.com/2.0/metadata_queries/execute_read' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ...' \
--data '{
  "from": "enterprise_877840855.uuidTemplate",
  "query": "uuid = :xxx",
  "query_params": {
    "xxx": "aaa"
  },
  "ancestor_folder_id": "0",
  "fields": [
    "type",
    "id",
    "name",
    "metadata.enterprise_877840855.uuidTemplate.uuid"
  ]

}'

Resulting in:

{
    "entries": [
        {
            "name": "Folder A",
            "etag": "0",
            "metadata": {
                "enterprise_877840855": {
                    "uuidTemplate": {
                        "$scope": "enterprise_877840855",
                        "$template": "uuidTemplate",
                        "$parent": "folder_243470257892",
                        "$version": 1,
                        "uuid": "aaa"
                    }
                }
            },
            "id": "243470257892",
            "type": "folder"
        }
    ],
    "limit": 100
}
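The two requests above differ only in the UUID value, and the same query returns both files and folders. Below is a minimal Python sketch of the same call, assuming only the standard library. The endpoint, headers, and body fields are copied from the cURL examples; the function names and the `:value` placeholder name are hypothetical choices for illustration.

```python
import json
import urllib.request

QUERY_URL = "https://api.box.com/2.0/metadata_queries/execute_read"


def build_query_body(scope, template_key, uuid, ancestor_folder_id="0"):
    """Build the request body from the cURL examples for a given UUID."""
    template = f"{scope}.{template_key}"
    return {
        "from": template,
        "query": "uuid = :value",
        "query_params": {"value": uuid},
        "ancestor_folder_id": ancestor_folder_id,
        "fields": ["type", "id", "name", f"metadata.{template}.uuid"],
    }


def run_query(access_token, body):
    """POST the body to the metadata query endpoint and return parsed JSON."""
    request = urllib.request.Request(
        QUERY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def entries_of_type(result, item_type):
    """Keep only entries of one type ("file" or "folder") from a result."""
    return [e for e in result.get("entries", []) if e.get("type") == item_type]
```

A call such as `entries_of_type(run_query(token, build_query_body("enterprise_877840855", "uuidTemplate", "aaa")), "folder")` would then keep just the folder entries, so a single request replaces the folder-by-folder listing loop.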

For more information, check out this article:

Let us know if this helps.

Cheers

First may I say that I have never experienced a reply so prompt, professional, and thorough in most of the boards I participate in. It is much appreciated.

Your interpretation was exact. Your result is what I would have expected, but when I submitted the ticket my result was only listing those types that were files. Mind you, I was only using one test example, so I had one file with UUID set, and one folder with the same UUID set. My query only produced the file.

After reading your example, I tried the exact same code I had an issue with, and two entities appeared, both the file and the folder, which is what I was working towards.

Is there a time delay, or index delay when assigning metadata? This turned out to be a non-issue.

Also, if you can help me understand: why does the output from this include an array with dollar signs ‘$’ in the keys? None of the other API functions have that, and it’s really messing with me.

"$scope": "enterprise_877840855",

Hi,

Thanks for your feedback.

As for the $ sign in the keys, they represent global keys, or keys that exist in all metadata templates, and are set automatically, as opposed to the ones we can create and change.
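If the `$` keys get in the way when parsing, they can be separated from the template's own fields in client code. This is a minimal Python sketch with hypothetical helper names. Reading `$parent` as "<type>_<id>" is an assumption based only on the two sample responses above ("file_1410606801538" and "folder_243470257892").

```python
def split_instance(instance):
    """Split a metadata instance into ($-prefixed keys, template fields)."""
    system = {k: v for k, v in instance.items() if k.startswith("$")}
    fields = {k: v for k, v in instance.items() if not k.startswith("$")}
    return system, fields


def parent_of(system):
    """Return (item_type, item_id) parsed from the "$parent" value.

    Assumption: "$parent" looks like "<type>_<id>", as in the samples.
    """
    item_type, _, item_id = system["$parent"].partition("_")
    return item_type, item_id


# Instance copied from the Folder A response above.
instance = {
    "$scope": "enterprise_877840855",
    "$template": "uuidTemplate",
    "$parent": "folder_243470257892",
    "$version": 1,
    "uuid": "aaa",
}
system, fields = split_instance(instance)
print(fields)             # {'uuid': 'aaa'}
print(parent_of(system))  # ('folder', '243470257892')
```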

When creating this example I also experienced a delay on the search, less than a minute if I remember correctly.

I’m not 100% sure, but I think this happens only when the template is applied for the first time, since that is when the indexes are created. Once the index exists, adding new metadata to a file or updating existing metadata should show up in search almost immediately.
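If a search right after tagging comes back empty, a short retry can separate "not indexed yet" from "not found". This is a minimal Python sketch, not a Box-recommended pattern. The helper name, the attempt count, and the delay are hypothetical values picked to cover roughly the one-minute delay mentioned above, and `fetch` stands for any function that runs the query and returns the parsed JSON.

```python
import time


def wait_for_entries(fetch, attempts=6, delay_seconds=10.0, sleep=time.sleep):
    """Call fetch() until it returns a non-empty "entries" list.

    fetch:  hypothetical callable returning the parsed query response.
    sleep:  injectable for testing; defaults to time.sleep.
    Returns the last response, which may still have no entries.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = {}
    for attempt in range(attempts):
        result = fetch()
        if result.get("entries"):
            return result
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next try
    return result
```

Each retry is another API call, so this is best kept to the moment just after metadata is first applied, not used on every lookup.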

Also, for context, I’m using a free developer account, not a Box enterprise account. This account has very few resources and very little content, which also means I don’t have an easy way to perform some of this testing.

Your setup with 1000+ folders and 1000+ files is a great test case. Let us know if you run into unacceptable response times.

Cheers
