Sources
A Source is a website or URL that feeds data into an aggregator. Each source has its own extraction configuration and schedule.
What is a Source?
A source represents a single data feed. For example, if you’re building a job board aggregator, each company’s career page would be a separate source:
- https://jobs.lever.co/acme → Source 1
- https://boards.greenhouse.io/acme → Source 2
- https://weworkremotely.com/categories/devops → Source 3
All three sources feed into the same aggregator and produce data in the same schema format.
Source Configuration
Each source has:
| Property | Description |
|---|---|
| URL | The target page to scrape |
| Extractor Config | Rules for extracting data (CSS selectors, field mappings) |
| Schedule | How often to run (hourly, daily, weekly, or manual) |
| Schema Version | Which schema version this source uses |
| Status | Active, paused, or needs correction |
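As a rough illustration, the properties in the table could be modeled as a small record. This is only a sketch: the class name, field names, enum values, and the use of an integer for the schema version are all hypothetical and are not taken from the product's API.

```python
from dataclasses import dataclass
from enum import Enum


class Schedule(Enum):
    HOURLY = "hourly"
    DAILY = "daily"
    WEEKLY = "weekly"
    MANUAL = "manual"


class Status(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    NEEDS_CORRECTION = "needs_correction"


@dataclass
class Source:
    """Hypothetical model of one source; names are illustrative only."""
    url: str                        # the target page to scrape
    extractor_config: dict          # CSS selectors and field mappings
    schedule: Schedule              # how often the source runs
    schema_version: int             # assumed to be an integer version number
    status: Status = Status.ACTIVE  # assumed default for a new source


source = Source(
    url="https://jobs.lever.co/acme",
    extractor_config={"container": ".job-listing", "fields": {}},
    schedule=Schedule.DAILY,
    schema_version=1,
)
```

Using enums for the schedule and status keeps the allowed values from the table in one place, so a typo such as `"dayly"` fails early instead of reaching the scheduler.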
Adding a Source
When you add a source to an aggregator:
- Paste the target URL
- Describe what to extract (or write the extractor config manually)
- The AI generates an extractor config mapped to your schema
- Preview the extracted data
- Run a test flight to verify
- Choose a schedule and save
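If you write the extractor config by hand in the second step, a quick structural check before previewing can catch simple mistakes. The sketch below assumes the config shape used in this page's extractor example (a `container` selector plus `fields`, each with a `selector` and a `type` of `text` or `attribute`). `validate_extractor_config` is a hypothetical helper, not part of the product, and the real config format may accept more types than these two.

```python
def validate_extractor_config(config: dict) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks OK.

    Hypothetical helper: it only checks the keys used in this page's example.
    """
    problems = []
    if not config.get("container"):
        problems.append("missing 'container' selector")
    fields = config.get("fields")
    if not isinstance(fields, dict) or not fields:
        problems.append("'fields' must be a non-empty object")
        return problems
    for name, rule in fields.items():
        if not rule.get("selector"):
            problems.append(f"field '{name}': missing 'selector'")
        kind = rule.get("type")
        if kind not in ("text", "attribute"):
            problems.append(f"field '{name}': unknown type {kind!r}")
        elif kind == "attribute" and not rule.get("attribute"):
            problems.append(f"field '{name}': type 'attribute' needs an 'attribute' name")
    return problems


config = {
    "container": ".job-listing",
    "fields": {
        "title": {"selector": "h2.job-title", "type": "text"},
        # The 'attribute' key is left out on purpose to trigger a problem.
        "url": {"selector": "a.job-link", "type": "attribute"},
    },
}
print(validate_extractor_config(config))
```

A check like this only confirms the config is well formed. Whether the selectors actually match the page is what the preview and test flight steps are for.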
Extractor Configuration
The extractor config defines how to pull data from the HTML. Example:
```json
{
  "container": ".job-listing",
  "fields": {
    "title": { "selector": "h2.job-title", "type": "text" },
    "company": { "selector": ".company-name", "type": "text" },
    "url": { "selector": "a.job-link", "type": "attribute", "attribute": "href" }
  }
}
```
Source Status
| Status | Meaning |
|---|---|
| Active | Running normally on schedule |
| Paused | Manually paused or limit reached |
| Needs Correction | Failed multiple times, requires attention |
When a source fails 3 consecutive times, it’s marked as “needs correction” and you’ll be notified.
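The three-strikes rule can be sketched as a small counter. This is an illustration under stated assumptions, not the product's implementation: `SourceHealth` and `record_run` are hypothetical names, the sketch assumes any successful run resets the failure streak, and it does not model notifications or how a source returns to active after correction.

```python
FAILURE_THRESHOLD = 3  # from the text: 3 consecutive failures


class SourceHealth:
    """Hypothetical tracker for one source's run results."""

    def __init__(self) -> None:
        self.status = "active"
        self.consecutive_failures = 0

    def record_run(self, succeeded: bool) -> None:
        if succeeded:
            # Assumption: a success resets the streak, so only
            # back-to-back failures count toward the threshold.
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= FAILURE_THRESHOLD:
            self.status = "needs correction"


health = SourceHealth()
# Two failures, a success, then three failures in a row.
for ok in [False, False, True, False, False, False]:
    health.record_run(ok)
print(health.status)
```

In this run the first two failures do not change the status because the success in between resets the count. Only the final three failures in a row flip the source to "needs correction".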
Limits
The number of sources you can create depends on your plan:
| Plan | Sources |
|---|---|
| Free | 3 |
| Starter | 25 |
| Pro | 100 |
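A limit check against this table might look like the following. The numbers come from the table above. The function name is hypothetical, and the sketch assumes the limit applies to the total number of sources you have created, which this page does not spell out.

```python
# Limits copied from the plan table above.
PLAN_SOURCE_LIMITS = {"Free": 3, "Starter": 25, "Pro": 100}


def can_add_source(plan: str, current_source_count: int) -> bool:
    """Hypothetical check: True if the plan still has room for one more source."""
    return current_source_count < PLAN_SOURCE_LIMITS[plan]


print(can_add_source("Free", 2))  # room for a third source
print(can_add_source("Free", 3))  # already at the Free limit
```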
Related Concepts
- Aggregators - The container for sources
- Schemas - How extracted data is structured