civicworks.io

For site operators

Bot policy

This page exists so that anyone seeing CivicWorks traffic in their logs can verify who we are, what we collect, and how to reach a real person.

How we identify ourselves

Every request our crawler sends includes a full User-Agent string with a project URL and a real contact email:

User-Agent: CivicWorksBot/1.0 (+https://civicworks.io/about; contact: david.yokum@unc.edu)
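
For operators who want to pick this traffic out of their logs, here is a minimal Python sketch. The regular expression and the function name civicworks_version are illustrative assumptions, not something CivicWorks publishes. Keep in mind that any client can send any User-Agent value, so a match identifies a claim, not a proof.

```python
import re
from typing import Optional

# Hypothetical pattern: matches the product token "CivicWorksBot/<version>"
# anywhere in a User-Agent value.
CIVICWORKS_UA = re.compile(r"\bCivicWorksBot/(\d+(?:\.\d+)*)")

def civicworks_version(user_agent: str) -> Optional[str]:
    """Return the bot version if the User-Agent names CivicWorksBot, else None."""
    match = CIVICWORKS_UA.search(user_agent)
    return match.group(1) if match else None

ua = "CivicWorksBot/1.0 (+https://civicworks.io/about; contact: david.yokum@unc.edu)"
print(civicworks_version(ua))
```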

What we collect

Public job postings only. We read the same pages a visitor looking for a job would read. We do not:

  • Bypass authentication, paywalls, or access controls.
  • Collect applicant data or anything behind a login.
  • Resell data. CivicWorks is free to use and free to link to.

Rate-limit commitment

We keep our footprint light. The defaults we commit to, per site:

  • At most one request every 2 seconds from a single IP. In practice, the rate is usually lower.
  • No parallel crawling of the same host.
  • We honor robots.txt and the Crawl-delay directive.
  • We back off on errors — 429, 503, or repeated 5xx responses pause the crawl for that host.
  • We prefer official data feeds — open data portals, APIs, and RSS — wherever a state publishes them, and only fall back to page-level crawling when they don’t exist.
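
The commitments above can be sketched in Python as follows. This is an illustration under stated assumptions, not the crawler's actual code: the helper names, the sample robots.txt, the example.org URLs, and the threshold of three consecutive 5xx responses standing in for "repeated" are all hypothetical. Only the 2-second floor, the robots.txt and Crawl-delay handling, and the 429/503 pause come from the list itself.

```python
import urllib.robotparser

BOT_NAME = "CivicWorksBot"
MIN_DELAY_SECONDS = 2.0        # stated floor: at most one request per 2 seconds
PAUSE_STATUSES = {429, 503}    # stated: these responses pause the crawl for a host
REPEATED_5XX_THRESHOLD = 3     # assumption: what counts as "repeated" 5xx

def load_robots(robots_txt: str) -> urllib.robotparser.RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def delay_for_host(parser: urllib.robotparser.RobotFileParser) -> float:
    """Wait time between requests: the larger of our floor and Crawl-delay."""
    crawl_delay = parser.crawl_delay(BOT_NAME)  # None when no directive applies
    return max(MIN_DELAY_SECONDS, float(crawl_delay or 0))

def should_pause(status: int, consecutive_5xx: int) -> bool:
    """True when a response should pause crawling for that host."""
    if status in PAUSE_STATUSES:
        return True
    return 500 <= status < 600 and consecutive_5xx >= REPEATED_5XX_THRESHOLD

# Hypothetical robots.txt a site operator might publish.
SAMPLE_ROBOTS = """\
User-agent: CivicWorksBot
Crawl-delay: 10
Disallow: /apply/
"""

robots = load_robots(SAMPLE_ROBOTS)
print(delay_for_host(robots))
print(robots.can_fetch(BOT_NAME, "https://jobs.example.org/postings/1"))
print(robots.can_fetch(BOT_NAME, "https://jobs.example.org/apply/1"))
```

Taking the maximum of the floor and Crawl-delay means a site can ask for a slower crawl than the default, but never accidentally request a faster one.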

Requesting changes or opting out

If you run a state job-board site and want us to slow down, change behavior, or stop crawling your site entirely, email david.yokum@unc.edu. We commit to:

  • Acknowledging your request within two business days.
  • Implementing opt-out requests immediately on acknowledgment — no justification required.
  • Working with you on an alternative (a feed, an API, a scheduled export) if you’d rather partner than block.

See also

About CivicWorks: who runs the project, which institution backs it, and the full data-sources methodology.