The Get elements endpoint returns the parsed elements (chunks/partitions) of a source, in the same format as the elements returned by Get build status. Each item carries explicit fields, including element_id, element_type, text, markdown, html, img_base64 (optional), position, page_number, bounding_box, and page_layout.

Endpoint overview

GET https://sources.graphorlm.com/get-elements

Authentication

This endpoint requires authentication using an API token. Include your API token as a Bearer token in the Authorization header.
Learn how to create and manage API tokens in the API Tokens guide.

Request format

Headers

| Header | Value | Required |
| --- | --- | --- |
| Authorization | Bearer YOUR_API_TOKEN | Yes |

Query parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file_id | string | Yes | Unique identifier of the source |
| page | integer | No | 1-based page number. Use with page_size to enable pagination |
| page_size | integer | No | Number of elements per page (1–100). Use with page |
| suppress_img_base64 | boolean | No | When true, img_base64 is omitted from each element (reduces payload size) |
| type | string | No | Filter by element type (e.g. NarrativeText, Title, Table) |
| page_numbers | list of integers | No | Restrict to specific page numbers (repeat param for multiple: ?page_numbers=1&page_numbers=2) |
| elementsToRemove | list of strings | No | Element types to exclude (repeat param for multiple) |

When page and page_size are omitted, all elements are returned (no pagination). When only one is provided, pagination is not applied.
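
The parameter rules above can be sketched as a small query builder. This is a minimal sketch, not an official client: the function name build_query and its keyword arguments are hypothetical, and only the query parameter names come from the table. Because supplying just one of page and page_size leaves pagination off, the sketch treats half a pair as a caller mistake and raises instead.

```python
from urllib.parse import urlencode


def build_query(file_id, page=None, page_size=None, suppress_img_base64=False,
                element_type=None, page_numbers=None, elements_to_remove=None):
    """Return the query string for Get elements (hypothetical helper)."""
    if not file_id:
        raise ValueError("file_id is required")
    # Pagination only applies when both values are present, so reject half a
    # pair instead of silently receiving every element.
    if (page is None) != (page_size is None):
        raise ValueError("page and page_size must be provided together")
    pairs = [("file_id", file_id)]
    if page is not None:
        if page < 1 or not 1 <= page_size <= 100:
            raise ValueError("page must be >= 1 and page_size must be 1-100")
        pairs += [("page", page), ("page_size", page_size)]
    if suppress_img_base64:
        pairs.append(("suppress_img_base64", "true"))
    if element_type:
        pairs.append(("type", element_type))
    # List parameters are repeated once per value.
    pairs += [("page_numbers", n) for n in (page_numbers or [])]
    pairs += [("elementsToRemove", t) for t in (elements_to_remove or [])]
    return urlencode(pairs)


print(build_query("file_abc123", page=1, page_size=10, page_numbers=[1, 2]))
# file_id=file_abc123&page=1&page_size=10&page_numbers=1&page_numbers=2
```

Raising on a lone page or page_size is a design choice of this sketch; the endpoint itself accepts such a request and returns all elements.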

Response format

Success response (200 OK)

Paginated response with items as BuildStatusElement (same shape as elements in Get build status):
{
  "items": [
    {
      "element_id": "0ee55f099828817da5485796b339aeab",
      "element_type": "Title",
      "text": "Attention Is All You Need",
      "markdown": "## Attention Is All You Need",
      "html": "<h2>Attention Is All You Need</h2>",
      "img_base64": null,
      "position": 5,
      "page_number": 1,
      "bounding_box": {
        "left": 211.488,
        "top": 148.43,
        "width": 188.41,
        "height": 17.22
      },
      "page_layout": { "width": 612.0, "height": 792.0 },
      "page_annotation": null,
      "page_keywords": null,
      "page_topics": null,
      "metadata": {}
    }
  ],
  "total": 393,
  "page": 1,
  "page_size": 10,
  "total_pages": 40
}

Pagination fields

| Field | Type | Description |
| --- | --- | --- |
| items | array | Elements in the current page (or all elements if pagination not used) |
| total | integer | Total number of elements (matching filters) |
| page | integer \| null | Current page (1-based), or null when no pagination |
| page_size | integer \| null | Elements per page, or null when no pagination |
| total_pages | integer \| null | Total pages, or null when no pagination |
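
To show how these fields drive a client loop, here is a minimal sketch that walks every page until total_pages is reached. The names iter_elements, fetch_page, and fake_fetch are hypothetical: fetch_page stands in for a real HTTP call, and the in-memory fake exists only so the sketch runs without network access.

```python
def iter_elements(fetch_page, page_size=100):
    """Yield every element by walking pages 1..total_pages.

    fetch_page is any callable (page, page_size) -> response dict; it is a
    hypothetical stand-in for a real request to Get elements.
    """
    page = 1
    while True:
        data = fetch_page(page, page_size)
        yield from data["items"]
        total_pages = data.get("total_pages")
        # total_pages is null when the server did not paginate, in which case
        # everything arrived in this single response.
        if total_pages is None or page >= total_pages:
            break
        page += 1


# In-memory stand-in for the endpoint, holding 25 fake elements.
FAKE_ELEMENTS = [{"element_id": str(i)} for i in range(25)]


def fake_fetch(page, page_size):
    start = (page - 1) * page_size
    return {
        "items": FAKE_ELEMENTS[start:start + page_size],
        "total": len(FAKE_ELEMENTS),
        "page": page,
        "page_size": page_size,
        "total_pages": -(-len(FAKE_ELEMENTS) // page_size),  # ceiling division
    }


print(len(list(iter_elements(fake_fetch, page_size=10))))  # 25
```

The fake computes total_pages as the ceiling of total / page_size, which matches the sample response (393 elements at 10 per page gives 40 pages) but is otherwise an assumption about the server.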

Element fields (BuildStatusElement)

| Field | Type | Description |
| --- | --- | --- |
| element_id | string \| null | Unique identifier for the element |
| element_type | string \| null | Type: e.g. Title, NarrativeText, Table, Image |
| text | string | Plain text content |
| markdown | string \| null | Markdown representation when available |
| html | string \| null | HTML representation when available |
| img_base64 | string \| null | Base64-encoded image data (omitted when suppress_img_base64=true) |
| position | integer \| null | Order/position within the document |
| page_number | integer \| null | Page number (1-based) where the element appears |
| bounding_box | object \| null | Bounding box (e.g. left, top, width, height) when available |
| page_layout | object \| null | Page dimensions (width, height) when available |
| page_annotation | string \| null | Annotation/summary for the page |
| page_keywords | array \| null | Keywords extracted for the page |
| page_topics | array \| null | Topics extracted for the page |
| metadata | object | Additional metadata |
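
Since bounding_box and page_layout are both nullable, client code has to guard before combining them. The sketch below converts a box into fractions of the page, which can be handy for drawing overlays at any zoom level. It assumes that both objects use the same units and a shared origin, which the field descriptions do not state; the function name normalized_box is hypothetical.

```python
def normalized_box(element):
    """Return (left, top, right, bottom) as fractions of the page, or None.

    Assumes bounding_box and page_layout share units and origin; that is an
    assumption of this sketch, not something the API documents.
    """
    box = element.get("bounding_box")
    layout = element.get("page_layout")
    if not box or not layout:
        return None  # either field may be null
    width, height = layout.get("width"), layout.get("height")
    if not width or not height:
        return None  # avoid dividing by a missing or zero dimension
    return (
        box["left"] / width,
        box["top"] / height,
        (box["left"] + box["width"]) / width,
        (box["top"] + box["height"]) / height,
    )


# Values taken from the sample response above.
sample = {
    "bounding_box": {"left": 211.488, "top": 148.43, "width": 188.41, "height": 17.22},
    "page_layout": {"width": 612.0, "height": 792.0},
}
print(normalized_box(sample))
print(normalized_box({"bounding_box": None, "page_layout": None}))  # None
```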

Element types

| Type | Description |
| --- | --- |
| Title | Document and section titles |
| NarrativeText | Main body paragraphs |
| ListItem | Bullet or numbered list items |
| Table | Data tables |
| TableRow | Rows within tables |
| Image | Pictures or graphics |
| Header | Header content |
| Footer | Footer content |
| Formula | Mathematical formulas |
| FigureCaption | Captions for figures |
| PageNumber | Page numbering |
| CodeSnippet | Code segments |
| Link | Hyperlinks |
| UncategorizedText | Text that doesn’t fit other categories |
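
When elements have already been fetched, the same types can be used to filter them client-side. The following is a minimal sketch: the choice of Header, Footer, and PageNumber as "boilerplate" is illustrative rather than an API recommendation, the helper name content_elements is hypothetical, and sorting by position assumes that position orders elements across the whole document.

```python
from collections import Counter

# Types treated as page furniture in this sketch (an illustrative choice).
BOILERPLATE_TYPES = {"Header", "Footer", "PageNumber"}


def content_elements(items):
    """Drop boilerplate types and return the rest ordered by position."""
    kept = [el for el in items if el.get("element_type") not in BOILERPLATE_TYPES]
    # position can be null, so those elements are sorted last; sorted() is
    # stable, which keeps their relative order.
    return sorted(
        kept,
        key=lambda el: (el.get("position") is None, el.get("position") or 0),
    )


items = [
    {"element_type": "Footer", "position": 3, "text": "Page 1"},
    {"element_type": "NarrativeText", "position": 2, "text": "Body"},
    {"element_type": "Title", "position": 1, "text": "Heading"},
    {"element_type": None, "position": None, "text": "?"},
]
print([el["text"] for el in content_elements(items)])  # ['Heading', 'Body', '?']
print(Counter(el["element_type"] for el in items))
```

For new requests, the elementsToRemove query parameter achieves the same exclusion server-side and avoids transferring the unwanted elements at all.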

Code examples

JavaScript/Node.js

const getElements = async (apiToken, fileId, options = {}) => {
  const params = new URLSearchParams({ file_id: fileId });
  if (options.page != null) params.set("page", String(options.page));
  if (options.pageSize != null) params.set("page_size", String(options.pageSize));
  if (options.suppressImgBase64) params.set("suppress_img_base64", "true");
  if (options.type) params.set("type", options.type);
  if (options.pageNumbers?.length) options.pageNumbers.forEach((n) => params.append("page_numbers", String(n)));
  if (options.elementsToRemove?.length) options.elementsToRemove.forEach((t) => params.append("elementsToRemove", t));

  const url = `https://sources.graphorlm.com/get-elements?${params}`;
  const response = await fetch(url, {
    method: "GET",
    headers: { Authorization: `Bearer ${apiToken}` },
  });
  if (!response.ok) throw new Error(`Failed: ${response.status} ${await response.text()}`);
  return response.json();
};

// Usage: first page of titles
getElements("grlm_your_api_token_here", "file_abc123", { page: 1, pageSize: 10, type: "Title" })
  .then((data) => data.items.forEach((el) => console.log(`${el.element_type}: ${el.text}`)))
  .catch(console.error);

Python

import requests

def get_elements(api_token, file_id, page=None, page_size=None, suppress_img_base64=False,
                 element_type=None, page_numbers=None, elements_to_remove=None):
    url = "https://sources.graphorlm.com/get-elements"
    headers = {"Authorization": f"Bearer {api_token}"}
    params = {"file_id": file_id}
    if page is not None:
        params["page"] = page
    if page_size is not None:
        params["page_size"] = min(100, max(1, page_size))
    if suppress_img_base64:
        params["suppress_img_base64"] = "true"
    if element_type:
        params["type"] = element_type
    if page_numbers:
        params["page_numbers"] = page_numbers
    if elements_to_remove:
        params["elementsToRemove"] = elements_to_remove

    response = requests.get(url, headers=headers, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

# Usage: tables from pages 2–4
data = get_elements(
    "grlm_your_api_token_here",
    "file_abc123",
    page=1,
    page_size=50,
    element_type="Table",
    page_numbers=[2, 3, 4],
)
for el in data["items"]:
    print(f"Page {el['page_number']}: {el['text'][:100]}...")

cURL

# First page, 10 elements
curl -X GET "https://sources.graphorlm.com/get-elements?file_id=file_abc123&page=1&page_size=10" \
  -H "Authorization: Bearer grlm_your_api_token_here"
# Only NarrativeText, exclude images from payload
curl -X GET "https://sources.graphorlm.com/get-elements?file_id=file_abc123&type=NarrativeText&suppress_img_base64=true" \
  -H "Authorization: Bearer grlm_your_api_token_here"
# Filter by page numbers and exclude some types
curl -X GET "https://sources.graphorlm.com/get-elements?file_id=file_abc123&page_numbers=1&page_numbers=2&elementsToRemove=PageNumber&elementsToRemove=Footer" \
  -H "Authorization: Bearer grlm_your_api_token_here"

Error responses

| Status code | Description |
| --- | --- |
| 400 | Invalid input (e.g. missing file_id) |
| 404 | Source file not found |
| 500 | Internal server error while loading elements |

Example bodies, one per status code above:
{ "detail": "file_id is required" }
{ "detail": "File not found" }
{ "detail": "Internal server error occurred while loading file elements" }
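
A client can surface the detail field rather than the raw body. The sketch below is one possible approach, not the behavior of any official SDK: GetElementsError and error_from_response are hypothetical names, and treating only 5xx responses as retryable is an assumption of the sketch.

```python
import json


class GetElementsError(Exception):
    """Hypothetical exception carrying the status code and detail message."""

    def __init__(self, status_code, detail):
        super().__init__(f"{status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail
        # Assumption of this sketch: only server-side failures merit a retry.
        self.retryable = status_code >= 500


def error_from_response(status_code, body_text):
    """Build a GetElementsError from a status code and raw response body."""
    try:
        detail = json.loads(body_text).get("detail", body_text)
    except (ValueError, AttributeError, TypeError):
        detail = body_text  # body was not a JSON object; keep it verbatim
    return GetElementsError(status_code, detail)


err = error_from_response(404, '{ "detail": "File not found" }')
print(err, err.retryable)  # 404: File not found False
```

With the requests example above, this could be called as error_from_response(response.status_code, response.text) whenever response.ok is false.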

Best practices

  • Use file_id: Obtain it from List sources or Get build status after ingestion.
  • Reduce payload: Set suppress_img_base64=true when you don’t need image data.
  • Filter server-side: Use type, page_numbers, and elementsToRemove to limit results.
  • Pagination: Use page and page_size (max 100) for large documents to avoid large responses.

Next steps

  • Get build status: Poll build status and optionally get elements for an async ingestion.
  • List sources: List all sources and their file_ids.
  • Upload sources: Ingest files, URLs, GitHub, or YouTube (async).
  • Reprocess source: Re-process a source with a different partition method (async).