Datasets · v2026.05

Cross-Agency Federal Violations

U.S. employers cited by two or more federal agencies — OSHA, WHD, MSHA, EPA, and NLRB joined into a single per-employer view of multi-agency workplace, wage, environmental, and labor-relations violations.

Published Apr 2026 · Refreshed May 2026 · Covers 2005–present · 76,310 employers · CC-BY-4.0

Also available on: Hugging Face · Kaggle · Zenodo

Overview

This dataset is the per-employer cross-agency join that powers most of FastDOL's public reporting. Each row is one employer with federal enforcement activity in two or more of: OSHA, WHD, MSHA, EPA, and NLRB. Cross-agency presence is a strong signal — employers with enforcement records at two or more agencies look materially different on workplace fatality and labor-conflict measures than employers cited by only one.

Data sources

  • OSHA — Federal OSHA inspection and violation records (Establishment Search / OIS extracts).
  • WHD — Wage and Hour Division Compliance Action Database.
  • MSHA — Mine Safety and Health Administration Mine Data Retrieval System.
  • EPA — Enforcement and Compliance History Online (ECHO) facility-level enforcement and Quarterly Non-Compliance Reports.
  • NLRB — National Labor Relations Board case filings and decisions.
  • SAM.gov — System for Award Management federal exclusions list.
  • SEC EDGAR — Exhibit 21 subsidiary disclosures, used as one input to corporate-parent resolution.

All sources are public-records data refreshed at FastDOL on a monthly cadence.

Methodology

Each agency's records are first normalized in its own silver-layer table (*_norm), then resolved to a stable employer identity (employer_id) via the FastDOL clustering pipeline. Per-agency aggregates are then joined to produce the gold-layer employer_profile_latest row. This dataset is the subset of that table where agency_violation_count >= 2 (published in this dataset as agencies_with_violations).
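
The join-and-filter step can be sketched roughly as follows. This is a minimal illustration, not FastDOL's actual pipeline: the per-agency dictionaries, the build_profiles and cross_agency_subset helpers, and the sample employer_id values are all hypothetical. Only the employer_id key and the two-or-more-agencies threshold come from the description above.

```python
from collections import defaultdict

# Hypothetical per-agency aggregates keyed by employer_id (values are record
# counts). In the real pipeline these would come from the *_norm tables.
AGENCY_AGGREGATES = {
    "osha": {"E1": 200, "E2": 3},
    "whd": {"E2": 1, "E3": 4},
    "msha": {"E1": 1},
    "epa": {},
    "nlrb": {"E2": 2},
}


def build_profiles(aggregates):
    """Join per-agency counts into one row per employer_id."""
    profiles = defaultdict(dict)
    for agency, counts in aggregates.items():
        for employer_id, count in counts.items():
            profiles[employer_id][agency] = count
    return dict(profiles)


def cross_agency_subset(profiles, min_agencies=2):
    """Keep employers with records at min_agencies or more agencies."""
    subset = {}
    for employer_id, per_agency in profiles.items():
        # Presence-based count: an agency counts once if it has any record.
        agency_count = sum(1 for count in per_agency.values() if count > 0)
        if agency_count >= min_agencies:
            subset[employer_id] = {**per_agency, "agency_count": agency_count}
    return subset


profiles = build_profiles(AGENCY_AGGREGATES)
subset = cross_agency_subset(profiles)
print(sorted(subset))  # ['E1', 'E2']
```

In this toy input, E3 has records at WHD only and is dropped, while E1 qualifies with 200 OSHA records and a single MSHA record.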

Only federal-OSHA inspections are counted in osha_violations. Twenty-eight states operate state-plan OSHA programs that report independently and are not reflected here. State-plan coverage is a known limitation called out explicitly so downstream users can scope claims accordingly.

The risk_score column is a FastDOL composite measure that blends per-agency activity with industry peer comparisons. Its definition is documented in detail at fastdol.com/methodology.

Known limitations

  • Cross-agency joins rely on employer-name matching; entity resolution is high-confidence but imperfect, especially for generic LLC names and DBAs.
  • Federal-OSHA-only coverage means employers operating exclusively in state-plan states will appear under-counted on OSHA columns.
  • Enforcement records reflect what an agency cited, not what was upheld on appeal. Some violations are reduced, withdrawn, or settled on terms not visible in the public extract.
  • "0" in any column means no record at that agency for the matched employer identity — not an affirmation that no relevant activity occurred.
  • The agencies_with_violations count uses agencies where the employer has any cited record; a single MSHA violation and 200 OSHA violations both contribute "1" each.
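
To make the last two points concrete, the sketch below lists which agencies would contribute to the count for a single published row. The agency-to-column mapping (PRESENCE_COLUMNS) and the agencies_present helper are assumptions for illustration; this page does not state which columns drive agencies_with_violations, and EPA presence in particular may be defined differently.

```python
# Assumed mapping from agency to the column treated as its presence indicator.
# Illustrative only; the dataset documentation does not specify this mapping.
PRESENCE_COLUMNS = {
    "OSHA": "osha_violations",
    "WHD": "whd_cases",
    "MSHA": "msha_violations",
    "EPA": "epa_noncompliance_quarters",
    "NLRB": "nlrb_cases",
}


def agencies_present(row):
    """Return the agencies with a non-zero value in their presence column."""
    return [
        agency
        for agency, column in PRESENCE_COLUMNS.items()
        if (row.get(column) or 0) > 0
    ]


# Hypothetical row: heavy OSHA history, a single MSHA violation, nothing else.
row = {
    "osha_violations": 200,
    "whd_cases": 0,
    "msha_violations": 1,
    "epa_noncompliance_quarters": 0,
    "nlrb_cases": 0,
}

present = agencies_present(row)
print(present, len(present))  # ['OSHA', 'MSHA'] 2
```

The zeros for WHD, EPA, and NLRB mean only that no record at those agencies was matched to this employer identity.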

Use cases

  • Investigative journalism — surfacing employers with broad-but-thin enforcement histories that a single-agency lookup would miss.
  • Cross-agency policy research on the relationship between safety, wage-and-hour, environmental, and labor-relations enforcement.
  • Underwriting and procurement risk diligence at the enterprise level.
  • Academic study of regulatory complementarity and substitution across federal agencies.
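
As a rough sketch of the first use case, a "broad-but-thin" filter might look like the following. The is_broad_but_thin helper, its thresholds (three or more agencies, ten or fewer total records), and the sample rows are hypothetical choices for illustration, not FastDOL definitions. Summing violations and cases across agencies is a crude proxy, since the units differ.

```python
# Count columns summed as a rough measure of enforcement volume.
# EPA quarters are left out because they measure duration, not records.
VOLUME_COLUMNS = ["osha_violations", "whd_cases", "msha_violations", "nlrb_cases"]


def is_broad_but_thin(row, min_agencies=3, max_total_records=10):
    """True when an employer spans many agencies but has few records overall."""
    total = sum(row.get(column) or 0 for column in VOLUME_COLUMNS)
    return (
        row["agencies_with_violations"] >= min_agencies
        and total <= max_total_records
    )


# Hypothetical rows using column names from the schema.
rows = [
    {"employer_name": "Example A", "agencies_with_violations": 3,
     "osha_violations": 2, "whd_cases": 1, "msha_violations": 0, "nlrb_cases": 1},
    {"employer_name": "Example B", "agencies_with_violations": 2,
     "osha_violations": 150, "whd_cases": 4, "msha_violations": 0, "nlrb_cases": 0},
    {"employer_name": "Example C", "agencies_with_violations": 4,
     "osha_violations": 40, "whd_cases": 3, "msha_violations": 12, "nlrb_cases": 5},
]

matches = [row["employer_name"] for row in rows if is_broad_but_thin(row)]
print(matches)  # ['Example A']
```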

Schema

20 columns. Types as serialized in the Parquet file.

Column | Type | Description
employer_name | string | Legal name of the employer as recorded by federal agencies.
city | string | City of the establishment associated with enforcement.
state | string | USPS two-letter state code (includes territories).
zip | integer | ZIP code of the primary establishment.
naics_code | number | NAICS industry classification code, stored as float64; nulls preserved.
naics_description | string | Human-readable NAICS industry description.
parent_name | string | Resolved corporate parent (FastDOL entity resolution); null when no parent is identified.
agencies_with_violations | integer | Number of distinct federal agencies with enforcement activity for this employer (>= 2).
osha_violations | integer | OSHA violations on record across all inspections.
osha_penalties | integer | OSHA penalties assessed, in U.S. dollars.
osha_fatalities | integer | OSHA-investigated workplace fatalities on record.
whd_cases | integer | Wage and Hour Division enforcement cases on record.
whd_backwages | number | Total WHD-assessed back-wages, in U.S. dollars.
msha_violations | integer | Mine Safety and Health Administration violations on record.
msha_penalties | number | MSHA-assessed penalties, in U.S. dollars.
epa_noncompliance_quarters | integer | Number of quarters in EPA non-compliance status (ECHO QNCR).
epa_penalties | integer | EPA-assessed penalties, in U.S. dollars.
nlrb_cases | integer | National Labor Relations Board case filings on record.
debarred | boolean | Whether the employer appears on the SAM.gov federal exclusions list.
risk_score | number | FastDOL composite risk score (0–100, continuous).
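
Two of the serialized types above need care when displaying or joining on the values: an integer zip cannot carry a leading zero, and a float64 naics_code reads back with a trailing ".0" or as NaN when null. The helpers below are a minimal sketch of one way to normalize both to strings. The names format_zip and format_naics are hypothetical, and the sketch assumes 5-digit ZIP codes.

```python
import math


def format_zip(zip_value):
    """Render an integer ZIP as a zero-padded 5-character string."""
    if zip_value is None:
        return None
    return f"{int(zip_value):05d}"


def format_naics(naics_value):
    """Render a float64 NAICS code as a digit string; None for null or NaN."""
    if naics_value is None:
        return None
    if isinstance(naics_value, float) and math.isnan(naics_value):
        return None
    return str(int(naics_value))


print(format_zip(2134))            # 02134
print(format_naics(236220.0))      # 236220
print(format_naics(float("nan")))  # None
```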

Cite this dataset

Plain text

Turner, Ben (2026). Cross-Agency Federal Violations (Version 2026.05) [Data set]. FastDOL. https://doi.org/10.5281/zenodo.20031853

BibTeX

@dataset{turner_federalenforcement_2026,
  author    = {Turner, Ben},
  title     = {Cross-Agency Federal Violations},
  year      = {2026},
  version   = {2026.05},
  publisher = {FastDOL},
  doi       = {10.5281/zenodo.20031853},
  url       = {https://www.fastdol.com/datasets/cross-agency-federal-violations}
}

Changelog

2026.05 — 2026-05-01

  • Initial public release on FastDOL.
  • Mirrored to Hugging Face at FastDOLz/cross-agency-federal-violations.
  • Source data refreshed through 2026-04-30 across all contributing sources.
  • Companion analysis published at /blog/what-combined-federal-enforcement-data-reveals.

Related datasets