Dataset endpoints allow you to manage the data components of your flows in GraphorLM. Dataset nodes are the entry points that connect your uploaded documents to the RAG pipeline.

What are Dataset Nodes?

Dataset nodes are fundamental components in GraphorLM flows that:
  • Connect Sources to Flows: Link your uploaded documents to processing pipelines
  • Control Data Input: Determine which files are included in each dataset
  • Enable Configuration: Allow customization of how documents are processed
  • Support Flow Logic: Act as the starting point for RAG pipelines

Available Endpoints

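GraphorLM provides REST API endpoints for dataset management:
  • List Dataset Nodes: GET /datasets returns the dataset nodes in a flow
  • Update Dataset: PATCH /datasets/{node_id} changes the configuration of a dataset node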

Dataset Node Structure

Each dataset node contains:
{
  "id": "dataset-1748287628684",
  "type": "dataset",
  "position": { "x": 100, "y": 200 },
  "data": {
    "name": "Research Papers",
    "config": {
      "files": [
        "attention.pdf",
        "transformer_architecture.pdf",
        "bert.pdf"
      ]
    },
    "result": {
      "updated": false,
      "lastRun": "2024-01-15T10:30:00Z"
    }
  }
}
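To make the structure concrete, the following minimal sketch reads the fields shown in the sample from a node object. The helper name `summarizeDatasetNode` is hypothetical, and the field layout is assumed to match the sample exactly; the real API may return additional or differently shaped fields.

```javascript
// Hypothetical helper (not part of the GraphorLM API): summarizes a dataset
// node shaped like the sample above. Missing optional fields fall back to
// null or an empty list instead of throwing.
function summarizeDatasetNode(node) {
  if (!node || node.type !== 'dataset') {
    throw new TypeError('Expected a node with type "dataset"');
  }
  const data = node.data ?? {};
  const files = data.config?.files ?? [];
  return {
    id: node.id,
    name: data.name ?? null,
    fileCount: files.length,
    files: [...files], // copy so callers cannot mutate the node
    lastRun: data.result?.lastRun ?? null,
  };
}

// The sample node from this page.
const sampleNode = {
  id: 'dataset-1748287628684',
  type: 'dataset',
  position: { x: 100, y: 200 },
  data: {
    name: 'Research Papers',
    config: {
      files: ['attention.pdf', 'transformer_architecture.pdf', 'bert.pdf'],
    },
    result: { updated: false, lastRun: '2024-01-15T10:30:00Z' },
  },
};

const summary = summarizeDatasetNode(sampleNode);
console.log(`${summary.name}: ${summary.fileCount} file(s)`);
// prints "Research Papers: 3 file(s)"
```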

Key Components

Component   Description
ID          Unique identifier for the dataset node
Config      File selection and processing settings
Result      Status and metadata from last processing
Position    Visual placement in the flow editor

Authentication

All dataset endpoints require authentication via API tokens:
Authorization: Bearer YOUR_API_TOKEN
Learn how to generate and manage API tokens in the API Tokens guide.

URL Structure

Dataset endpoints follow a consistent URL pattern:
https://{flow_name}.flows.graphorlm.com/datasets[/{node_id}]
Where:
  • {flow_name}: The name of your deployed flow
  • {node_id}: The specific dataset node identifier (for update operations)
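As an illustration of the pattern, this short sketch builds both URL forms. The function name `datasetUrl` and the flow name `my-flow` are made up for the example; only the URL pattern itself comes from this page.

```javascript
// Illustrative helper (not part of the GraphorLM API): builds a dataset
// endpoint URL from the documented pattern. Omitting nodeId yields the
// collection URL; passing it yields the URL of a single node.
function datasetUrl(flowName, nodeId) {
  if (!flowName) {
    throw new Error('flowName is required');
  }
  const base = `https://${flowName}.flows.graphorlm.com/datasets`;
  return nodeId === undefined
    ? base
    : `${base}/${encodeURIComponent(nodeId)}`;
}

console.log(datasetUrl('my-flow'));
// prints "https://my-flow.flows.graphorlm.com/datasets"
console.log(datasetUrl('my-flow', 'dataset-1748287628684'));
// prints "https://my-flow.flows.graphorlm.com/datasets/dataset-1748287628684"
```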

Basic Usage Example

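// flowName, apiToken, and nodeId are placeholders; define them before running.

// List all dataset nodes
const response = await fetch(`https://${flowName}.flows.graphorlm.com/datasets`, {
  headers: { 'Authorization': `Bearer ${apiToken}` }
});
if (!response.ok) {
  throw new Error(`Listing dataset nodes failed: HTTP ${response.status}`);
}
const datasets = await response.json();

// Update a dataset with new files
const updateResponse = await fetch(`https://${flowName}.flows.graphorlm.com/datasets/${nodeId}`, {
  method: 'PATCH',
  headers: {
    'Authorization': `Bearer ${apiToken}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    config: { files: ['new_file.pdf'] }
  })
});
if (!updateResponse.ok) {
  throw new Error(`Updating dataset failed: HTTP ${updateResponse.status}`);
}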

Common Workflow

  1. List Available Nodes: Use the List Dataset Nodes endpoint
  2. Update Configuration: Use the Update Dataset endpoint to configure files
  3. Deploy Flow: Deploy the flow to apply changes
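Steps 1 and 2 above can be sketched as a single function. This is a rough outline under stated assumptions, not a reference implementation: the name `updateFirstDataset` is invented, the list endpoint is assumed to return a JSON array of node objects with an `id` field, and step 3 is left out because this page does not describe how a deployment is triggered.

```javascript
// Sketch of workflow steps 1 and 2. `fetchImpl` defaults to the global fetch
// and can be swapped for a stub when testing.
async function updateFirstDataset(flowName, apiToken, files, fetchImpl = fetch) {
  const base = `https://${flowName}.flows.graphorlm.com/datasets`;
  const auth = { Authorization: `Bearer ${apiToken}` };

  // Step 1: list the dataset nodes in the flow.
  const listResponse = await fetchImpl(base, { headers: auth });
  if (!listResponse.ok) {
    throw new Error(`List failed with HTTP ${listResponse.status}`);
  }
  // Assumption: the body is a JSON array of nodes, each with an `id`.
  const nodes = await listResponse.json();
  if (!Array.isArray(nodes) || nodes.length === 0) {
    throw new Error('No dataset nodes found');
  }

  // Step 2: update the first node's file selection.
  const nodeId = nodes[0].id;
  const updateResponse = await fetchImpl(`${base}/${nodeId}`, {
    method: 'PATCH',
    headers: { ...auth, 'Content-Type': 'application/json' },
    body: JSON.stringify({ config: { files } }),
  });
  if (!updateResponse.ok) {
    throw new Error(`Update failed with HTTP ${updateResponse.status}`);
  }

  // Step 3 (deploying the flow) is intentionally not shown here.
  return nodeId;
}
```

Passing the fetch implementation as a parameter keeps the function testable without network access; in normal use the fourth argument can be omitted.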

Error Handling

Common error responses:
Error Type        HTTP Status   Description
Authentication    401           Invalid or missing API token
Flow Not Found    404           Flow doesn’t exist or isn’t accessible
Node Not Found    404           Dataset node doesn’t exist in the flow
Files Not Found   400           Specified files don’t exist as sources
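One way to act on these statuses in client code is a small lookup, sketched below. The names `DATASET_ERRORS` and `describeDatasetError` are hypothetical. Because the table lists 404 for both a missing flow and a missing node, the sketch assumes the status code alone cannot tell the two apart; the response body may carry more detail, but its format is not documented here.

```javascript
// Hypothetical lookup of the documented HTTP statuses. 404 covers two cases
// in the table, so its message names both.
const DATASET_ERRORS = new Map([
  [400, 'Specified files do not exist as sources'],
  [401, 'Invalid or missing API token'],
  [404, 'Flow or dataset node not found'],
]);

// Returns a readable description for a failed response status.
function describeDatasetError(status) {
  return DATASET_ERRORS.get(status) ?? `Unexpected HTTP status ${status}`;
}

console.log(describeDatasetError(401));
// prints "Invalid or missing API token"
```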

Next Steps