Box AI API Documentation

Introduction

NOTE: This is a sneak preview of how we expect the AI API to look. It is for informational purposes only, is not yet in production, and is subject to change prior to General Availability. Please use it in the spirit in which it is intended, and feel free to share thoughts, comments, and feedback in this thread.

The Box AI API extends the power of Box AI to your custom applications. Imagine the Box AI Q&A functionality built into your third-party integration, or the ability to generate content like you can in Notes, right in your product’s content editor. The Box AI API maintains enterprise-grade security, privacy, and compliance requirements. As long as your content is already in Box, you can use the API to tap into that content in third-party and custom apps with peace of mind.

The API is built to be easy to use. Simply call the Ask endpoint, provide your document(s), and ask any question. You can also use this endpoint to generate text on the fly. Choose to receive the answer all at once, or stream it to get the answer one token at a time. Streaming lets you start showing the answer sooner, similar to the user experience in the ChatGPT app.

Create your app

The first step is to create your app. We have tested with both JWT auth and OAuth2, so select the authentication that makes the most sense for your application. Be sure to capture the client ID and provide it to us so we can enable the scope for your application.

Authentication

The easiest way to handle authentication is to use a developer token. When you generate it from your scope-enabled app, the token automatically carries the required scope, allowing you to begin testing and coding immediately. Please note that this token is only valid for one hour. For more information, visit the API Reference page for the token endpoint. You can also use OAuth 2. In this case, you must specifically request the ai.readwrite scope.
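As a rough illustration of the two options above, here is a minimal Python sketch. It assumes the token is sent as a standard Bearer header and that the OAuth 2 flow uses Box's usual authorize URL; the function names and the redirect URI are hypothetical and only for illustration.

```python
from urllib.parse import urlencode

# Assumption: the standard Box OAuth 2 authorize endpoint is used for this scope too.
AUTHORIZE_URL = "https://account.box.com/api/oauth2/authorize"


def build_auth_headers(access_token):
    """Headers for an API call made with a developer token or OAuth 2 access token."""
    if not access_token:
        raise ValueError("access_token must be a non-empty string")
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }


def build_authorize_url(client_id, redirect_uri, scope="ai.readwrite"):
    """URL to send a user to when explicitly requesting the ai.readwrite scope."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,  # space-separated if you need more than one scope
    })
    return f"{AUTHORIZE_URL}?{query}"
```

Remember that a developer token expires after one hour, so a long-running test script will need a fresh one.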

Requests

Object names in bold are required

POST /2.0/ai/ask

This endpoint sends your prompt, along with your content or the text representation of your file, to the LLM and returns the answer with an HTTP 200 response and the JSON object documented below. You must always provide a file ID, as this is how the API validates that you and your instance of Box have access to Box AI. You can optionally provide a string in the content object. If you do, the content string takes precedence over the content of the file represented by the id.

| Object | Type | Description | Values |
| --- | --- | --- | --- |
| mode | enum | This tells Box AI what type of request you will be making | "single_item_qa" - Ask a question about a single document; "multiple_item_qa" - Ask a question about a group of items. This is not yet fully implemented, so you may experience issues. |
| prompt | string | The question you wish to ask about your document or content | "Summarize this document" |
| items | array | This is an array of JSON objects that describe the file or content you wish to add to your context | [{"id": "123", "type": "file", "content": "Add this string to my context"}] |
| items.id | string | The Box file ID you wish to add to your context. The API will pull the text representation of this file. | "1233039227512" |
| items.type | enum | The type of object. | "file" - the items.id points to a file. This is the only type implemented; "folder" - the items.id points to a folder. This is not yet supported; "hub" - the items.id points to a hub |
| items.content | string | This can be any text you wish to add to the LLM context you wish to ask questions in. | "Add this additional information to my context in ChatGPT" |

Example Request

{
    "mode": "single_item_qa", // enum: single_item_qa, multiple_item_qa
    "prompt": "What is this content about?",
    "items": [
        {
            "id": "123",
            "type": "file", // enum: file (eventually we will support other types like folder)
            "content": "string"
        },
        {
            "id": "123",
            "type": "file",
            "content": "string"
        },
        {
            "id": "123",
            "type": "hub",
            "content": "string"
        },
        {
            "id": "123",
            "type": "folder",
            "content": "string"
        }
    ]
}
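For anyone who wants to try the request from code, the following is a minimal sketch using only the Python standard library. It assumes the endpoint lives on the same https://api.box.com host shown in the metadata example in this post and that the token is sent as a Bearer header; build_ask_request is a hypothetical helper name, not part of any SDK.

```python
import json
import urllib.request

# Assumption: same host as the metadata suggestions example in this post.
API_BASE = "https://api.box.com/2.0"


def build_ask_request(token, file_id, prompt, content=None):
    """Build (but do not send) a single-document request for POST /2.0/ai/ask."""
    item = {"id": file_id, "type": "file"}  # "file" is the only implemented type
    if content is not None:
        # When present, the content string takes precedence over the file's text.
        item["content"] = content
    body = {"mode": "single_item_qa", "prompt": prompt, "items": [item]}
    return urllib.request.Request(
        f"{API_BASE}/ai/ask",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To actually send it, pass the returned object to urllib.request.urlopen and read the JSON body from the response.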

Response

| Object | Type | Description | Values |
| --- | --- | --- | --- |
| answer | string | The answer from the LLM in JSON format | "The document is a script titled \"Five Feet and Rising\" by Peter Sollett." |
| created_at | string | The datetime stamp with timezone information that the answer was generated. | "2012-12-12T10:53:43-08:00" |
| completion_reason | string | The indicator that the processing is finished. | "done" |

Example Response

{
  "answer": "The document is a script titled \"Five Feet and Rising\" by Peter Sollett. It begins with a group of girls dancing on the sidewalk, their movements accompanied only by the sound of their bodies. The story then focuses on Amanda, a 14-year-old girl who sits outside her apartment listening to music and reading a magazine. She interacts with Jenette, Aaron, Donna, Michelle, Victor, Carlos, and other characters throughout the script.\nAmanda later goes to Pitt Street Pool where she meets Erica and Francesca along with another girl named Shai. Meanwhile, Victor talks to Carlos about Eddie's cousin from Compost while they are at the pool as well.\nLater in the day on Amanda's block again Chris walks past her stoop while holding his deflated football before walking away.\nOverall,the script follows various characters' interactions in different locations such as streets and pools throughout one day.",
  "created_at":"2023-07-24T09:50:09.350045274-07:00",
  "completion_reason":"done"
}
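One practical detail in the example response is that created_at carries nine fractional digits, which older Python versions cannot parse directly. The sketch below, with hypothetical helper names, shows one way to read the response and trim the timestamp to microseconds. It assumes the field names stay exactly as in the example above.

```python
import json
import re
from datetime import datetime

_FRACTION = re.compile(r"\.(\d+)")


def parse_created_at(value):
    """Parse an ISO 8601 timestamp, trimming fractional seconds to 6 digits."""
    trimmed = _FRACTION.sub(
        lambda m: "." + m.group(1)[:6].ljust(6, "0"), value, count=1
    )
    return datetime.fromisoformat(trimmed)


def parse_ask_response(raw):
    """Turn the JSON response body into a small dict with typed values."""
    data = json.loads(raw)
    return {
        "answer": data["answer"],
        "created_at": parse_created_at(data["created_at"]),
        # completion_reason may be absent, e.g. on partial streamed responses
        "finished": data.get("completion_reason") == "done",
    }
```

Trimming loses the sub-microsecond digits, which should not matter for display or ordering purposes.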

POST /2.0/ai/text_gen

This endpoint sends your prompt, along with your content or the text representation of your file, to the LLM and returns the answer with an HTTP 200 response and the JSON object documented below. You must always provide a file ID, as this is how the API validates that you and your instance of Box have access to Box AI. You can optionally provide a string in the content object. If you do, the content string takes precedence over the content of the file represented by the id. You can use dialogue_history to keep track of previous text generation requests in a given session.

| Object | Type | Description | Values |
| --- | --- | --- | --- |
| prompt | string | The question you wish to ask about your document or content | "Summarize this document" |
| items | array | This is an array of JSON objects that describe the file or content you wish to add to your context | [{"id": "123", "type": "file", "content": "Add this string to my context"}] |
| items.id | string | The Box file ID you wish to add to your context. The API will pull the text representation of this file. | "1233039227512" |
| items.type | enum | The type of object. | "file" - the items.id points to a file. This is the only type implemented; "folder" - the items.id points to a folder. This is not yet supported; "hub" - the items.id points to a hub |
| items.content | string | This can be any text you wish to add to the LLM context you wish to ask questions in. | "Add this additional information to my context in ChatGPT" |
| dialogue_history | array | Dialogue history contains the previous prompts and answers for the same item(s) | [{"prompt": "What is this content about?", "answer": "This is about public API schemas", "created_at": "2012-12-12T10:53:43-08:00"}] |
| dialogue_history.prompt | string | The previous question asked about the current file in the same conversation | "What is this content about?" |
| dialogue_history.answer | string | The answer to the previous question asked about the current file in the same conversation | "This is about public API schemas" |
| dialogue_history.created_at | string | The datetime stamp with timezone information that the previous question was asked and answered | "2012-12-12T10:53:43-08:00" |

Example Request

{
    "prompt": "What is this content about?",
    "items": [
        {
            "id": "123",
            "type": "file", // enum: file (eventually we will support other types like folder)
            "content": "string"
        },
        {
            "id": "123",
            "type": "file",
            "content": "string"
        },
        {
            "id": "123",
            "type": "hub",
            "content": "string"
        },
        {
            "id": "123",
            "type": "folder",
            "content": "string"
        }
    ],
    "dialogue_history": [
        {
            "prompt": "What is this content about?",
            "answer": "This is about public API schemas",
            "created_at": "2012-12-12T10:53:43-08:00"
        },
        {
            "prompt": "What is this content about?",
            "answer": "This is about public API schemas",
            "created_at": "2012-12-12T10:53:43-08:00"
        }
    ]
}
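To show how dialogue_history might be carried from one call to the next, here is a minimal sketch. The helper names are hypothetical, and it assumes the answer and created_at values from each response are simply echoed back in the next request, which matches the shapes above but is not confirmed behavior.

```python
def build_text_gen_body(prompt, file_id, history=None):
    """Request body for POST /2.0/ai/text_gen, including any earlier turns."""
    return {
        "prompt": prompt,
        "items": [{"id": file_id, "type": "file"}],
        "dialogue_history": list(history or []),
    }


def record_turn(history, prompt, response):
    """Return a new history list with the latest prompt/answer pair appended."""
    return history + [{
        "prompt": prompt,
        "answer": response["answer"],
        "created_at": response["created_at"],
    }]
```

A session would start with an empty list, call record_turn after each response, and pass the growing list into the next build_text_gen_body call.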

Response

| Object | Type | Description | Values |
| --- | --- | --- | --- |
| answer | string | The answer from the LLM in JSON format | " about" |
| created_at | string | The datetime stamp with timezone information that the answer was generated. | "2012-12-12T10:53:43-08:00" |
| completion_reason | string | The indicator that the processing is finished. This will only be included in the final response in the stream. | "done" |

Example Response

{
  "answer": "The document is a script titled \"Five Feet and Rising\" by Peter Sollett. It begins with a group of girls dancing on the sidewalk, their movements accompanied only by the sound of their bodies. The story then focuses on Amanda, a 14-year-old girl who sits outside her apartment listening to music and reading a magazine. She interacts with Jenette, Aaron, Donna, Michelle, Victor, Carlos, and other characters throughout the script.\nAmanda later goes to Pitt Street Pool where she meets Erica and Francesca along with another girl named Shai. Meanwhile, Victor talks to Carlos about Eddie's cousin from Compost while they are at the pool as well.\nLater in the day on Amanda's block again Chris walks past her stoop while holding his deflated football before walking away.\nOverall,the script follows various characters' interactions in different locations such as streets and pools throughout one day.",
  "created_at":"2023-07-24T09:50:09.350045274-07:00",
  "completion_reason":"done"
}

GET /2.0/metadata_instances/suggestions

| Query Parameter | Type | Description | Values |
| --- | --- | --- | --- |
| item | string | The reference for the item to pull metadata from. | item=file_1233039227512 |
| scope | string | The metadata template scope | scope=enterprise_867 |
| template_key | string | The metadata template key to apply | template_key=myAwesomeTemplate |
| confidence | string | The confidence level the suggestion must surpass. Only "experimental" is supported in this release. | confidence=experimental |

Example Request

GET https://api.box.com/2.0/metadata_instances/suggestions?item=file_12345&scope=enterprise_867&template_key=myAwesomeTemplate&confidence=experimental
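If you are assembling this URL in code, a small helper keeps the query parameters properly encoded. This is only a sketch; build_suggestions_url is a hypothetical name, and the file_ prefix on the item value is taken from the example above.

```python
from urllib.parse import urlencode

SUGGESTIONS_URL = "https://api.box.com/2.0/metadata_instances/suggestions"


def build_suggestions_url(file_id, scope, template_key, confidence="experimental"):
    """Build the GET URL for metadata suggestions on a single file."""
    query = urlencode({
        "item": f"file_{file_id}",
        "scope": scope,
        "template_key": template_key,
        "confidence": confidence,  # only "experimental" is supported for now
    })
    return f"{SUGGESTIONS_URL}?{query}"
```

Calling build_suggestions_url("12345", "enterprise_867", "myAwesomeTemplate") reproduces the example request URL above.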

Response

The response is a JSON object presenting suggestions for possible metadata key-value pairs based on your template and document. The table below describes the fields in the response.

| Object | Type | Description | Values |
| --- | --- | --- | --- |
| $scope | string | The scope provided in the request. | "enterprise_867" |
| $template_key | string | The metadata template key specified in the request. | "myAwesomeTemplate" |
| suggestions | JSON object | A JSON object containing the metadata suggestions, by field. | {"stringFieldKey": "fieldVal1", "floatFieldKey": 124.0, "enumFieldKey": "EnumOptionKey", "multiSelectFieldKey": ["multiSelectOption1", "multiSelectOption5"]} |

Example Response

[
  {
    "$scope": "enterprise_867",
    "$templateKey": "myAwesomeTemplate",
    "suggestions": {
      "stringFieldKey": "fieldVal1",
      "floatFieldKey": 124.0,
      "enumFieldKey": "EnumOptionKey",
      "multiSelectFieldKey": ["multiSelectOption1", "multiSelectOption5"]
    }
  },
  {
    "$scope": "enterprise_867",
    "$templateKey": "myOtherAwesomeTemplate",
    "suggestions": {
      "stringFieldKey": "fieldVal1",
      "floatFieldKey": 124.0,
      "enumFieldKey": "EnumOptionKey",
      "multiSelectFieldKey": ["multiSelectOption1", "multiSelectOption5"]
    }
  }
]
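As a rough sketch of consuming this response, the helper below groups the suggestions by scope and template key. The function name is hypothetical. Because the field table writes the key as $template_key while the example response uses $templateKey, the code accepts either spelling rather than assuming one.

```python
import json


def suggestions_by_template(raw):
    """Map (scope, template key) to the suggested field values."""
    result = {}
    for entry in json.loads(raw):
        # Accept both spellings seen in this post for the template key field.
        template_key = entry.get("$templateKey", entry.get("$template_key"))
        result[(entry["$scope"], template_key)] = entry["suggestions"]
    return result
```

The returned values are suggestions only, so a caller would still review them before writing anything back through the standard Metadata API.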

Additional Resources

To help you get started with the API, we are providing some additional resources.

Sample Code

Troubleshooting

  • The item object contains a Box file ID in the id object. This is required for authorization of Box AI API access. It can also contain a string in the content object. If you provide both, the content string will take precedence over the content of the file represented by the id.
  • Multi-document support is not yet fully implemented in the backend. The API accepts mode: "multiple_item_qa", but such requests may result in some errors.
  • Multi-document requests support a maximum total size of 10MB across all included files. This limit applies to the actual files, not their text representations. Larger requests will continue to work, but answer quality and API performance will degrade. Some file types, such as CSV and Excel files, are harder to parse and will give a worse experience at a smaller size.
  • If you upload a new file and immediately make an API call, the call will often fail because the text representation has not been generated yet. The API checks whether a text representation is available. If it is not, the API returns an error and starts creating one, so retrying a little later will likely work.
  • Metadata suggestions are just that: suggestions. To associate that metadata with the file, you will need to use the standard Metadata API.
  • If you see a 403 error, be sure you have the ai.readwrite scope, as well as sufficient scope to access the file, such as root_readwrite.
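For the newly uploaded file case above, a simple retry with a growing delay is usually enough. The following is a minimal sketch with a hypothetical helper name. The post does not say which status code is returned while the text representation is being generated, so this version retries on any HTTP error by default, and a real client would likely narrow that (for example, a 403 scope problem will not be fixed by waiting).

```python
import time
import urllib.error


def call_with_retry(send, attempts=4, initial_delay=2.0,
                    retry_on=(urllib.error.HTTPError,), sleep=time.sleep):
    """Call send() and retry with a doubling delay if it raises a retryable error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay)
            delay *= 2
```

Here send is any zero-argument callable that performs the API request, such as a lambda wrapping urllib.request.urlopen.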

As announced today at BoxWorks, we will be releasing the Box AI API set in the coming months. This is a sneak peek at how the endpoints are structured and what you can do with them. Feel free to ask any questions or share any thoughts!


Attaching a summary of yesterday’s Q&A with our product team on the API here: AI and Productivity - BoxWorks recap

If you are interested in participating in our limited beta, please reach out.