Partition and load files either remotely, using the unstructured-client SDK and
the Unstructured API, or locally, using the unstructured library.
API:
This package is configured to work with the Unstructured API by default.
To use the Unstructured API, set
partitionViaApi: true and define apiKey. If you are running the Unstructured
API locally, you can change the API URL by defining url when you initialize the
loader. The hosted Unstructured API requires an API key. See the links under
References to learn more about our API offerings and get an API key.
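The rules above can be summarized in a small validation sketch. The helper name resolve_api_options is hypothetical, as is the UNSTRUCTURED_API_KEY environment variable used as a fallback. The sketch also assumes that a self-hosted API reached through a custom url may not need a key, which the text implies but does not state.

```python
import os
from typing import Optional


def resolve_api_options(
    partition_via_api: bool,
    api_key: Optional[str] = None,
    url: Optional[str] = None,
) -> dict:
    """Hypothetical helper: build the loader options described above."""
    if not partition_via_api:
        # Local partitioning needs neither a key nor a URL.
        return {"partitionViaApi": False}

    # Fall back to an environment variable (the variable name is an assumption).
    key = api_key or os.environ.get("UNSTRUCTURED_API_KEY")

    options = {"partitionViaApi": True}
    if url is not None:
        # A custom url points the loader at a self-hosted API instance.
        options["url"] = url
    if key:
        options["apiKey"] = key
    elif url is None:
        # No custom url means the hosted API, which requires a key.
        raise ValueError("The hosted Unstructured API requires an API key.")
    return options
```

Calling resolve_api_options(True, url="http://localhost:8000") would return the options for a locally running API, while resolve_api_options(True) with no key available raises an error.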
Local:
To partition files locally, you must have the unstructured package installed.
You can install it with pip install unstructured.
By default the file loader uses the Unstructured partition function and will
automatically detect the file type.
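As a rough illustration of local partitioning, the sketch below calls the unstructured library's partition function directly. The wrapper name partition_locally is hypothetical, and this is not necessarily how the loader invokes the library internally.

```python
from typing import Any, List


def partition_locally(file_path: str, **unstructured_kwargs: Any) -> List[Any]:
    """Partition one file with the locally installed unstructured library."""
    try:
        # Imported lazily so this module loads even without the package.
        from unstructured.partition.auto import partition
    except ImportError as exc:
        raise ImportError(
            "Local partitioning needs the unstructured package: "
            "pip install unstructured"
        ) from exc

    # partition() detects the file type and dispatches to a matching partitioner.
    return partition(filename=file_path, **unstructured_kwargs)
```

Each returned element carries the extracted text along with metadata about where it came from in the document.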
In addition to document-specific partition parameters, Unstructured has a rich set
of "chunking" parameters for post-processing elements into more useful text segments
for use cases such as Retrieval-Augmented Generation (RAG). You can pass additional
Unstructured kwargs to the loader to configure these settings.
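To make the idea of chunking concrete, here is a simplified sketch that greedily merges consecutive element texts up to a character limit. It is an illustration only, not Unstructured's chunking algorithm. The function name chunk_elements is hypothetical, and max_characters is named after a similar Unstructured chunking parameter.

```python
from typing import List


def chunk_elements(texts: List[str], max_characters: int = 500) -> List[str]:
    """Greedily merge consecutive element texts into chunks."""
    chunks: List[str] = []
    current = ""
    for text in texts:
        # Join elements with a blank line, as paragraphs would be.
        candidate = f"{current}\n\n{text}" if current else text
        if current and len(candidate) > max_characters:
            # Adding this element would overflow, so close the current chunk.
            chunks.append(current)
            current = text
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


elements = ["Title", "First paragraph.", "Second paragraph."]
print(chunk_elements(elements, max_characters=25))
# ['Title\n\nFirst paragraph.', 'Second paragraph.']
```

Note that this sketch never splits a single element, so an element longer than max_characters becomes its own oversized chunk.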
Unstructured document loader interface.
Setup: Install the package:
Set the API key in your environment:
Instantiate:
References
https://docs.unstructured.io/api-reference/api-services/sdk
https://docs.unstructured.io/api-reference/api-services/overview
https://docs.unstructured.io/open-source/core-functionality/partitioning
https://docs.unstructured.io/open-source/core-functionality/chunking