
Unstructured

This example covers how to use Unstructured to load files of many types. Unstructured currently supports loading text files, PowerPoint presentations, HTML, PDFs, images, and more.

Setup

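You can run Unstructured locally on your computer using Docker. To do so, you need to have Docker installed. You can find the instructions to install Docker here. The following command starts the Unstructured API in a background container and publishes it on port 8000: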

docker run -p 8000:8000 -d --rm --name unstructured-api quay.io/unstructured-io/unstructured-api:latest --port 8000 --host 0.0.0.0

Usage

Once Unstructured is running, you can use the following code to load a file from your computer:

import { UnstructuredLoader } from "langchain/document_loaders/fs/unstructured";

const options = {
  apiKey: "MY_API_KEY",
};

const loader = new UnstructuredLoader(
  "src/document_loaders/example_data/notion.md",
  options
);
const docs = await loader.load();
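
The example above passes only an API key, while the Setup section starts a container on port 8000. The sketch below shows one way to build options that point at that local container. It assumes the loader options accept an `apiUrl` field and that the container serves the `/general/v0/general` route; neither is stated on this page, so check both against your installed versions. The helper name `localUnstructuredOptions` is hypothetical.

```typescript
// Hypothetical helper (not part of LangChain): builds loader options that
// target a locally running Unstructured API container.
interface LocalUnstructuredOptions {
  apiUrl: string; // assumed option name; verify against your LangChain version
  apiKey?: string;
}

function localUnstructuredOptions(
  port: number = 8000,
  host: string = "localhost"
): LocalUnstructuredOptions {
  // Reject values that cannot be a TCP port before building the URL.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`Invalid port: ${port}`);
  }
  // Assumed route for the container's general partitioning endpoint.
  return { apiUrl: `http://${host}:${port}/general/v0/general` };
}

// Matches the port published by the docker run command in Setup.
const localOptions = localUnstructuredOptions(8000);
console.log(localOptions.apiUrl);
```

If the assumption holds, `localOptions` can be passed as the second argument to `new UnstructuredLoader(...)` in place of the `options` object shown above.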


Directories

You can also load all of the files in a directory using UnstructuredDirectoryLoader, which inherits from DirectoryLoader:

import { UnstructuredDirectoryLoader } from "langchain/document_loaders/fs/unstructured";

const options = {
  apiKey: "MY_API_KEY",
};

const loader = new UnstructuredDirectoryLoader(
  "langchain/src/document_loaders/tests/example_data",
  options
);
const docs = await loader.load();
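
A directory load can return many documents from many files, so it can help to summarize the result before using it. The sketch below counts documents per value of a metadata key. It uses a minimal stand-in type with `pageContent` and `metadata` fields rather than importing LangChain, and the `filename` key and sample data are assumptions for illustration; inspect the metadata of your own documents to see which keys are present.

```typescript
// Minimal stand-in for a loaded document; only the fields used here.
interface LoadedDocument {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Counts documents per value of the given metadata key.
// Documents with a missing or non-string value fall under "(unknown)".
function countByMetadataKey(
  docs: LoadedDocument[],
  key: string
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const doc of docs) {
    const value = doc.metadata[key];
    const label = typeof value === "string" ? value : "(unknown)";
    counts.set(label, (counts.get(label) ?? 0) + 1);
  }
  return counts;
}

// Illustrative data only; real metadata keys depend on the loader output.
const sampleDocs: LoadedDocument[] = [
  { pageContent: "Heading", metadata: { filename: "notion.md" } },
  { pageContent: "Body text", metadata: { filename: "notion.md" } },
  { pageContent: "Slide 1", metadata: { filename: "deck.pptx" } },
  { pageContent: "Orphan", metadata: {} },
];

const countsByFile = countByMetadataKey(sampleDocs, "filename");
countsByFile.forEach((count, file) => console.log(`${file}: ${count}`));
```

With real data, pass the `docs` array returned by `loader.load()` in place of `sampleDocs`.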
