Large file process

The large file process lets you upload files larger than 50MB in multiple data chunks. It is designed for uploading compressed files that exceed the 100MB serving limit; you can then expand the compressed archives using the API or UI.

Files over 100MB cannot be served directly through the static site process or downloaded using the File Store API. Individual files contained within a tar or gzip archive must not exceed 100MB if they are to be served by File Store once the archive is expanded into a folder.

The process uses the existing API endpoints for POST and PATCH, described in the API reference documentation. Additional fields and headers orchestrate a multi-stage process to upload large files across several requests.

Large file upload process limitations

These guidelines and recommendations apply specifically to the Large file process:

  • 50MB upload limit for large files uploaded through the File Store UI.

  • 100MB size limit for non-tar files uploaded through the large file process.

  • Upload chunk conditions apply to all files (see the chunk-planning sketch after this list):

    • The first upload chunk must be larger than 5MB.

    • The final upload chunk can be less than 5MB.

    • Upload chunks must be queued sequentially.

  • Long-running POST or PATCH operations are abandoned after one day.

  • Parameters successfully added or changed as part of an abandoned operation are retained.

  • A PATCH operation that adds other parameters (for example, source or access) takes effect immediately.

  • For an initial POST operation that creates a large file, the file data is returned as empty until the process fully completes.

  • During a PATCH operation on an existing file, the existing file data continues to be returned until the process finishes.
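
As a rough illustration of the chunk conditions above, the following sketch plans sequential byte ranges that respect the 5MB minimum chunk size. It assumes, as the TypeScript example later on this page does, that 5MB means 5 × 1024 × 1024 bytes; the planChunks helper is purely illustrative and not part of the API.

// Minimal sketch: split a file of totalSize bytes into sequential upload chunks.
// Every chunk except the last must meet the 5MB minimum; only the last may be smaller.
const MIN_CHUNK_SIZE = 5 * 1024 * 1024; // assumes 5MB means 5 MiB

interface Chunk {
  start: number; // first byte offset of the chunk (inclusive)
  end: number; // last byte offset of the chunk (inclusive)
}

function planChunks(totalSize: number, chunkSize = MIN_CHUNK_SIZE): Chunk[] {
  if (chunkSize < MIN_CHUNK_SIZE) {
    throw new Error("Every chunk except the last must be at least 5MB");
  }
  const chunks: Chunk[] = [];
  for (let start = 0; start < totalSize; start += chunkSize) {
    chunks.push({ start, end: Math.min(start + chunkSize, totalSize) - 1 });
  }
  return chunks; // send these in order; only the final chunk may be under 5MB
}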

Starting the large file upload process

The large file upload process is started with a POST operation to the file creation endpoint if you are uploading a new file, or with a PATCH operation to the file change endpoint if you are updating an existing file.

The initial request is sent as a JSON payload requesting the start of a resumable upload.

{
  "uploadType": "resumable",
  "source": "myFile.json", (1)
  "access": "public" (2)
}
1 Optional on the PATCH endpoint.
2 Optional on the PATCH endpoint.

If the request is sent using a POST operation, a file ID is returned for the new file; this ID must be used in all subsequent requests to the data upload PATCH endpoint.

Examples

Replace the values enclosed in [] with their required values.
POST
curl -X POST https://[SERVER_URL]/__dxp/service/file-storage/api/v2/file -H "Content-Type: application/json" -H "x-api-key: [API_KEY]" -H "x-dxp-tenant: [TENANT_ID]" -d '{ "uploadType": "resumable", "source": "myNewFile.json", "access": "public" }'
PATCH
curl -X PATCH https://[SERVER_URL]/__dxp/service/file-storage/api/v2/file/[FILE_ID] -H "Content-Type: application/json" -H "x-api-key: [API_KEY]" -H "x-dxp-tenant: [TENANT_ID]" -d '{ "uploadType": "resumable" }'

Sending chunks of data

Once the large file upload process is started, data must be PATCHed to the file change endpoint. A Content-Range header must be provided to track what chunk of the upload is being processed. The only unit supported by this process is bytes. A size must be provided; * or unknown sizes are not supported. Read the MDN Web Docs on Content-Range for more information.

Data chunks must be sent sequentially, so a synchronous process is advised. The server returns a 202 Accepted response with no content for all but the last chunk. Upon receiving the final chunk, the server returns a 200 OK response with the updated file information in the response body.
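
To make the header arithmetic explicit, here is a small sketch of building a Content-Range value for one chunk; both byte offsets are inclusive and the total must be the exact file size. The contentRange helper is illustrative only, and the values match the examples that follow.

// Sketch: build the Content-Range header value for one chunk.
// start and end are byte offsets (both inclusive); total is the exact file size in bytes.
function contentRange(start: number, end: number, total: number): string {
  return `bytes ${start}-${end}/${total}`;
}

contentRange(0, 6427780, 6427798); // "bytes 0-6427780/6427798" (first chunk in the examples below)
contentRange(6427781, 6427797, 6427798); // "bytes 6427781-6427797/6427798" (final chunk in the examples below)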

Examples

Replace the values enclosed in [] with their required values.
First data chunk
curl -X PATCH https://[SERVER_URL]/__dxp/service/file-storage/api/v2/file/2eb852c8-5d00-4658-b46a-56e29641ccc0 -H "Content-Type: application/octet-stream" -H "x-api-key: [API_KEY]" -H "x-dxp-tenant: [TENANT_ID]" -H "Content-Range: bytes 0-6427780/6427798" --data-binary "@myPartialFile.txt"
Final data chunk
curl -X PATCH https://[SERVER_URL]/__dxp/service/file-storage/api/v2/file/2eb852c8-5d00-4658-b46a-56e29641ccc0 -H "Content-Type: application/octet-stream" -H "x-api-key: [API_KEY]" -H "x-dxp-tenant: [TENANT_ID]" -H "Content-Range: bytes 6427781-6427797/6427798" --data-binary "@myPartialFinal.txt"

TypeScript implementation example

This example of a Node.js script written in TypeScript pushes the contents of a large file to the Large File endpoint.

#!/usr/bin/env node
import fs from "fs";
import path from "path";

/**
 * This is a demo script to upload a large file to the large file (LF) endpoint of File Store.
 * It should be run with Node.js.
 */

const API_KEY = "YOUR_API_KEY"; // You can generate an API key from the DXP Console
const TENANT_ID = "YOUR_TENANT_ID"; // You can find your tenant ID in the DXP Console
const DXP_URL = "YOUR_DXP_URL"; // This is the base URL of your DXP instance
const LFS_ENDPOINT = "__dxp/service/file-storage/api/v2/file"; // This is the LF endpoint
const FILE_NAME = "YOUR_FILE_NAME"; // The name of the file you want to upload

/**
 * This is the main function that will be run when the script is executed.
 * It will upload the file to the large files endpoint.
 * It will first make an initial request to the LF endpoint to get the file ID.
 * Then, it will stream the file data to the LF endpoint in chunks.
 * The chunk size is set to 5MB.
 * The script will log the progress as it goes.
 * If there are any errors, they will be logged to the console.
 * A success message will be logged to the console if the file is uploaded successfully.
 */
(async () => {
  console.log("Uploading file...");
  try {
    // Make our initial request to the LF endpoint
    const { fileId } = await doInitialRequest();

    // Now we can start streaming the file data to the LF endpoint
    await doStreamRequests(fileId);

    // Done
    console.log("File uploaded successfully!");
  } catch (error) {
    console.error("Error uploading file:", error);
  }
})();

/**
 * This function will stream the file data to the LF endpoint in chunks.
 * @param fileId The file ID returned from the initial request
 */
async function doStreamRequests(fileId: string) {
  // Get the file size so we can calculate the number of chunks we need to send
  const { size } = fs.statSync(path.resolve("./", FILE_NAME));

  // Set the chunk size to 5MB
  const chunkSize = 5 * 1024 * 1024;

  // Calculate the number of chunks we need to send
  const numberOfChunks = Math.ceil(size / chunkSize);

  // Create a buffer to store the chunk data
  const buffer = Buffer.alloc(chunkSize);

  // Open the file for reading
  const file = fs.openSync(path.resolve("./", FILE_NAME), "r");

  // Loop through the file and send the chunks
  for (let i = 0; i < numberOfChunks; i++) {
    // Read the chunk from the file
    const bytesRead = fs.readSync(file, buffer, 0, chunkSize, i * chunkSize);

    // Log the progress
    console.log(
      `Uploading chunk ${i + 1} of ${numberOfChunks} (${bytesRead} bytes)`
    );

    // Send the chunk to the LF endpoint
    const response = await fetch(`${DXP_URL}/${LFS_ENDPOINT}/${fileId}`, {
      method: "PATCH",
      headers: {
        "x-api-key": API_KEY,
        "x-dxp-tenant": TENANT_ID,
        "content-type": "application/octet-stream",
        "content-range": `bytes ${i * chunkSize}-${
          i * chunkSize + bytesRead - 1
        }/${size}`,
      },
      body: buffer.subarray(0, bytesRead),
    });

    // The server responds with 202 Accepted for intermediate chunks and
    // 200 OK with the updated file information for the final chunk
    if (!response.ok) {
      throw new Error(`Chunk ${i + 1} failed with status ${response.status}`);
    }
  }

  // Close the file
  fs.closeSync(file);
}

/**
 * This function will make the initial request to the LF endpoint to get the file ID.
 * @returns The response from the LF endpoint
 */
async function doInitialRequest() {
  // Make our initial request to the LF endpoint
  const response = await fetch(`${DXP_URL}/${LFS_ENDPOINT}`, {
    method: "POST",
    headers: {
      "x-api-key": API_KEY,
      "x-dxp-tenant": TENANT_ID,
      "content-type": "application/json",
    },
    body: JSON.stringify({
      uploadType: "resumable",
      source: FILE_NAME, // optional on the PATCH endpoint
      access: "public", // optional on the PATCH endpoint
    }),
  });

  // Fail fast if the initial request was rejected; otherwise fileId would be undefined
  if (!response.ok) {
    throw new Error(`Initial request failed with status ${response.status}`);
  }

  return await response.json();
}
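
The script relies on the global fetch API, so it requires Node.js 18 or later; run it with a TypeScript runner such as ts-node, or compile it with tsc and run the emitted JavaScript.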