Items

Items are the actual data records collected from your sources. Each time a source runs, it extracts items and stores them in your aggregator.

An item is a single data record that:

  • Belongs to an aggregator
  • Came from a specific source
  • Conforms to a specific schema version
  • Has an identity hash (for deduplication)
  • Is immutable (never updated, only inserted)

When you fetch items via the API, each item includes:

{
  "id": "clx1abc123",
  "source_id": "clx2def456",
  "schema_version": 2,
  "created_at": "2026-01-20T10:30:00Z",
  "data": {
    "title": "Senior DevOps Engineer",
    "company": "Acme Corp",
    "location": "Remote",
    "url": "https://acme.com/jobs/123",
    "tags": ["devops", "kubernetes", "aws"]
  }
}

Field            Description
id               Unique item identifier
source_id        Which source produced this item
schema_version   Schema version the item conforms to
created_at       When the item was first collected
data             The actual extracted data
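
As a rough illustration, the fields above could be mapped onto a small typed record on the client side. The Item class and parse_item helper are hypothetical names, not part of any official SDK, and the sketch assumes created_at is always a UTC timestamp ending in Z, as in the sample item.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)  # frozen mirrors the fact that items are immutable
class Item:
    id: str
    source_id: str
    schema_version: int
    created_at: datetime
    data: dict[str, Any]


def parse_item(raw: dict[str, Any]) -> Item:
    """Convert one decoded JSON item into an Item (hypothetical helper)."""
    # Assumption: timestamps look like "2026-01-20T10:30:00Z".
    created = datetime.strptime(raw["created_at"], "%Y-%m-%dT%H:%M:%SZ")
    return Item(
        id=raw["id"],
        source_id=raw["source_id"],
        schema_version=raw["schema_version"],
        created_at=created.replace(tzinfo=timezone.utc),
        data=raw["data"],
    )


sample = json.loads("""
{
  "id": "clx1abc123",
  "source_id": "clx2def456",
  "schema_version": 2,
  "created_at": "2026-01-20T10:30:00Z",
  "data": {"title": "Senior DevOps Engineer", "company": "Acme Corp"}
}
""")
item = parse_item(sample)
print(item.data["title"], item.created_at.isoformat())
```

Keeping data as a plain dictionary is deliberate here: its shape depends on the schema version, so a client would likely branch on schema_version before reading individual fields.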

Items are never updated, only inserted. This means:

  • The created_at timestamp represents when we first saw this item
  • You can reliably query “what’s new since X” using the since parameter
  • Historical data is preserved
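
Because created_at never changes after insertion, a client can keep a cursor and ask only for newer items on each poll. The following is a minimal sketch, assuming created_at uses the timestamp format from the sample item and that since accepts the same format. The next_since helper is hypothetical. Whether since is inclusive or exclusive is not stated here, so a caller should tolerate seeing the boundary item twice.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed format, taken from the sample item


def next_since(items: list[dict], previous: str) -> str:
    """Return the newest created_at among items, or previous if none is newer."""
    latest = datetime.strptime(previous, TS_FORMAT)
    for item in items:
        seen = datetime.strptime(item["created_at"], TS_FORMAT)
        if seen > latest:
            latest = seen
    return latest.strftime(TS_FORMAT)


# Example: two items fetched with a cursor of midnight on 2026-01-20.
batch = [
    {"id": "clx1abc123", "created_at": "2026-01-20T10:30:00Z"},
    {"id": "clx1abc124", "created_at": "2026-01-20T09:15:00Z"},
]
cursor = next_since(batch, "2026-01-20T00:00:00Z")
print(cursor)  # 2026-01-20T10:30:00Z
```

The returned value would be passed as the since parameter on the next request.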

Duplicates are identified by an identity hash computed from an item's identity fields. Deduplication happens at query time, not at write time: if a source extracts an item whose identity hash matches an existing item, the new item is still stored (storage is append-only). Use the dedupe query parameter to filter out duplicates when you fetch items.
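
To make the idea concrete, here is a sketch of what identity-based deduplication at read time might look like on the client. How the service actually computes the identity hash is not described here. The identity_hash function, the choice of SHA-256 over sorted JSON, and the identity fields (company and title) are all assumptions for illustration. Which copy the API returns when dedupe is set is also not stated; this sketch keeps the earliest.

```python
import hashlib
import json


def identity_hash(data: dict, identity_fields: list[str]) -> str:
    """Hash only the identity fields (hypothetical scheme, not the service's)."""
    subset = {name: data.get(name) for name in identity_fields}
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def dedupe(items: list[dict], identity_fields: list[str]) -> list[dict]:
    """Keep the earliest-collected item for each identity hash."""
    kept: dict[str, dict] = {}
    # UTC timestamps in one fixed format sort correctly as plain strings.
    for item in sorted(items, key=lambda i: i["created_at"]):
        key = identity_hash(item["data"], identity_fields)
        kept.setdefault(key, item)
    return list(kept.values())


items = [
    {"id": "b", "created_at": "2026-01-21T10:30:00Z",
     "data": {"title": "Senior DevOps Engineer", "company": "Acme Corp", "location": "Hybrid"}},
    {"id": "a", "created_at": "2026-01-20T10:30:00Z",
     "data": {"title": "Senior DevOps Engineer", "company": "Acme Corp", "location": "Remote"}},
    {"id": "c", "created_at": "2026-01-22T08:00:00Z",
     "data": {"title": "Data Engineer", "company": "Acme Corp", "location": "Remote"}},
]
unique = dedupe(items, ["company", "title"])
print([item["id"] for item in unique])  # ['a', 'c']
```

Items a and b differ only in location, which is not an identity field in this example, so they hash to the same value and only the earlier one survives.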

The number of items you can store depends on your plan:

Plan      Item limit
Free      1,000
Starter   25,000
Pro       100,000

As you approach and reach your limit:

  1. You’ll be notified at 80% capacity
  2. Sources pause automatically at 100%
  3. Delete old items or upgrade to resume
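
The thresholds above can be expressed as a small check, for example to monitor usage from your own tooling. The plan limits come from the table; the capacity_status function and its return labels are hypothetical, and treating exactly 80% as the notification point is an assumption.

```python
PLAN_LIMITS = {"free": 1_000, "starter": 25_000, "pro": 100_000}


def capacity_status(plan: str, item_count: int) -> str:
    """Classify usage as 'ok', 'warning' (80% or more), or 'paused' (at the limit)."""
    limit = PLAN_LIMITS[plan]
    if item_count >= limit:
        return "paused"
    # Integer arithmetic avoids floating-point rounding at the 80% boundary.
    if item_count * 100 >= limit * 80:
        return "warning"
    return "ok"


print(capacity_status("free", 799))        # ok
print(capacity_status("starter", 20_000))  # warning
print(capacity_status("pro", 100_000))     # paused
```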

Fetch items via the API with various filters:

# Get all items
curl "https://api.fetchosaurus.com/api/v1/aggregators/{id}/items"
# Get items since a timestamp
curl "https://api.fetchosaurus.com/api/v1/aggregators/{id}/items?since=2026-01-20T00:00:00Z"
# Get items from a specific source
curl "https://api.fetchosaurus.com/api/v1/aggregators/{id}/items?source_id=clx2def456"
# Get deduplicated items
curl "https://api.fetchosaurus.com/api/v1/aggregators/{id}/items?dedupe=true"
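
The same queries can be built programmatically. The sketch below only constructs the request URL using Python's standard library. The items_url helper is hypothetical, the aggregator ID clx0agg000 is a made-up placeholder, and authentication is omitted because it is not covered in this section. Combining several filters in one request is also an assumption, since the examples above show each filter on its own.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.fetchosaurus.com/api/v1"


def items_url(aggregator_id: str, since: Optional[str] = None,
              source_id: Optional[str] = None, dedupe: bool = False) -> str:
    """Build an items URL from the filters shown in the curl examples."""
    params = {}
    if since is not None:
        params["since"] = since
    if source_id is not None:
        params["source_id"] = source_id
    if dedupe:
        params["dedupe"] = "true"
    url = f"{BASE_URL}/aggregators/{aggregator_id}/items"
    # urlencode percent-encodes reserved characters such as ":" in timestamps.
    return f"{url}?{urlencode(params)}" if params else url


print(items_url("clx0agg000", source_id="clx2def456", dedupe=True))
```

The resulting string can be passed to any HTTP client, including curl.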