U.S. Entity Email Security Intelligence Dataset:
A Multi-Sector DNS Authentication Registry Covering 801,359 Organizations

Lokentra Research Team
Lokentra U.S. Email Security Index (ESI)
March 2026
research@monitorworkspace.com • monitorworkspace.com/scorecard

Abstract

We present the Lokentra U.S. Email Security Index (ESI), the most comprehensive entity-attributed DNS email security dataset for the United States. The registry covers 801,359 organizations spanning nonprofits, for-profit businesses, government entities, and educational institutions across all 50 states and six territories, with 577,882 domains fully DNS-profiled across seven record types including SPF, DKIM, DMARC, CAA, and MTA-STS. We find that only 38.8% of scanned domains have any DMARC record, and only 5.6% deploy DMARC at enforcement level (p=reject), the only control that actively prevents domain spoofing. DKIM adoption stands at 36.9%, exposing the majority of U.S. organizations to email impersonation. Grade distribution is heavily skewed toward poor posture: 28.6% of scanned domains receive a Grade F (minimal or no authentication), compared to only 0.4% achieving Grade A (full enforcement). Google Workspace and Microsoft 365 are nearly equally deployed at the national level, together accounting for 51.4% of all scanned domains (64.7% of domains with a classified email provider). Sector analysis reveals dramatic variation: higher education achieves 71.4% A/B grades while IRS-registered nonprofits reach only 10.3%, a 61.1 percentage-point gap. The dataset links 434,659 entities to geocoordinates, enabling geographic analysis of security posture at the state, county, and municipal level. To our knowledge, this is the largest and most diverse entity-attributed email security dataset publicly described.

Keywords: email security, DMARC, SPF, DKIM, DNS analysis, cybersecurity, public sector, nonprofit, small business, Google Workspace, Microsoft 365, threat intelligence

1. Introduction

Email remains the primary vector for phishing, business email compromise (BEC), and ransomware deployment. The Federal Bureau of Investigation's Internet Crime Complaint Center (IC3) reported $2.9 billion in BEC losses in 2023, with spoofed email domains the primary mechanism. Despite the availability of free, standards-based defenses — SPF, DKIM, and DMARC — adoption across U.S. organizations outside the federal government remains inconsistently measured and broadly insufficient.

Binding Operational Directive (BOD) 18-01 mandated DMARC enforcement for federal .gov agencies in 2017. No equivalent mandate exists for the 800,000+ state, local, nonprofit, and private sector organizations that constitute the majority of U.S. infrastructure. This study attempts to measure the resulting security gap at scale.

This paper makes four contributions:

  1. A nationwide multi-sector entity registry of 801,359 organizations linked to internet domains, sourced from six datasets: five government sources and one institutional program directory (Table 3).
  2. Full-spectrum DNS profiling across seven record types (MX, SPF, DKIM, DMARC, CAA, MTA-STS, TLS-RPT) for 577,882 domains, the largest known entity-attributed scan.
  3. Cross-sector benchmarking establishing baseline authentication rates for nonprofits, businesses, government, and education — enabling policy, procurement, and research comparisons.
  4. Geographic attribution linking 434,659 entities to geocoordinates, enabling state- and county-level risk mapping.

2. Dataset Overview

2.1 Registry at a Glance

Total entities registered: 801,359
Domains DNS-profiled: 577,882
States & territories covered: 59
DMARC adoption rate: 38.8%
Grade F domains: 28.6%
Active / resolving domains: 79.4%

Table 1. Full ESI Registry Scope

| Metric | Value | Notes |
|---|---|---|
| Total entities registered | 801,359 | Organizations with verified government-source records |
| Entities with internet domain | 619,376 (77.3%) | Website URL resolved to apex domain |
| Domains DNS-profiled | 577,882 | Full 7-check scan (MX, SPF, DKIM, DMARC, CAA, MTA-STS, TLS-RPT) |
| Domains alive / active | 459,018 (79.4%) | MX record resolves; domain is email-capable |
| Domains dormant (no MX) | 72,650 (12.6%) | Domain resolves but no email infrastructure |
| Domains dead (NXDOMAIN) | 44,766 (7.7%) | Domain no longer exists in DNS |
| Domains timeout / error | 1,448 (0.3%) | Resolution failed; domain state unknown |
| Entities geocoded | 434,659 (54.2%) | Latitude/longitude from ZIP or county centroid |
| States and territories covered | 59 | 50 states + DC + territories |
| DNS record types checked per domain | 7 | MX, SPF, DKIM, DMARC, CAA, MTA-STS, TLS-RPT |

2.2 Sector Composition

Table 2. Registry Composition by Sector

| Sector | Primary Sources | Entities | Share |
|---|---|---|---|
| Nonprofits (IRS BMF) | IRS Business Master File | 206,161 | 25.7% |
| Businesses, large/mid (SAM.gov) | SAM.gov federal registration | 144,623 | 18.0% |
| Special districts | State registries, Census | 138,165 | 17.2% |
| Small businesses (10KSB) | Goldman Sachs 10,000 Small Businesses directory | 95,260 | 11.9% |
| Nonprofits (SAM.gov registered) | SAM.gov, nonprofit business type | 86,596 | 10.8% |
| Government (municipal) | Census TIGER, state registries | 40,856 | 5.1% |
| Government (state agencies) | State government portals | 28,935 | 3.6% |
| Municipalities | Census Gazetteer | 20,735 | 2.6% |
| K-12 education | NCES CCD 2024-2025 | 20,021 | 2.5% |
| Townships | Census Gazetteer | 12,105 | 1.5% |
| Higher education | NCES IPEDS, state boards | 3,171 | 0.4% |
| Counties | Census Gazetteer | 3,146 | 0.4% |
| Other public sector | State registries, EPA SDWIS | 1,585 | 0.2% |
| Total | | 801,359 | 100% |

3. Methodology

3.1 Entity Collection

Table 3. Primary Data Sources

| Source | Segments Covered | Method | Records |
|---|---|---|---|
| IRS Business Master File (BMF) | All 501(c) nonprofits | Annual federal data release | 206,161 |
| SAM.gov Public Extract (V2) | Businesses, SAM nonprofits | Monthly pipe-delimited flat file, 142 fields | 231,219 |
| Goldman Sachs 10KSB Directory | Small businesses | Institutional program participant list | 95,260 |
| NCES CCD 2024-2025 | K-12 school districts | Federal flat file (CSV) | 20,021 |
| NCES IPEDS | Higher education institutions | Federal API | 3,171 |
| U.S. Census Bureau (TIGER/Gazetteer) | Government, municipalities, townships, counties | Federal datasets, state registry aggregation | 245,527 |

All entity records are ingested into a unified PostgreSQL schema with normalized fields for entity name, type, subtype, state, county, primary domain, and geocoordinates. Entities are deduplicated via SHA-1 hash of normalized name, entity type, and state. Cross-source matching is performed for entities appearing in multiple registries (e.g., a nonprofit registered in both IRS BMF and SAM.gov).
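The deduplication step can be illustrated with a minimal sketch. The paper specifies only that the key is a SHA-1 hash of normalized name, entity type, and state; the normalization rules, the field separator, and the function names `normalize_name` and `dedup_key` below are assumptions made for illustration, not the pipeline's actual code.

```python
import hashlib
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace (assumed rules)."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def dedup_key(name: str, entity_type: str, state: str) -> str:
    """SHA-1 hex digest over normalized name, entity type, and state."""
    parts = [normalize_name(name), entity_type.strip().lower(), state.strip().upper()]
    # The unit-separator byte cannot appear in any part, so ("ab", "c") and
    # ("a", "bc") never produce the same payload.
    payload = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


# Hypothetical records: the first two differ only in formatting, the third in state.
a = dedup_key("Springfield Public Library, Inc.", "nonprofit", "il")
b = dedup_key("  SPRINGFIELD  PUBLIC LIBRARY INC ", "Nonprofit", "IL")
c = dedup_key("Springfield Public Library, Inc.", "nonprofit", "MO")
print(a == b, a == c)  # True False
```

Because the state is part of the key, identically named organizations in different states stay distinct, which matters for common names such as county libraries or school districts.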

3.2 DNS Profiling Pipeline

Each entity's website URL is resolved to its apex domain, then profiled concurrently across seven DNS record types using a Python-based pipeline with 50 parallel worker threads. The pipeline classifies email providers from MX records using a provider detection library covering 12 classified providers (Google Workspace, Microsoft 365, Proofpoint, Mimecast, GoDaddy, Zoho Mail, Yahoo Mail, Amazon SES, Mailgun, SendGrid, iCloud Mail, and Fastmail). Email security gateways are detected by comparing MX host names against gateway fingerprint patterns. All check results are stored with timestamps in a versioned dns_checks table, enabling longitudinal tracking.
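As a rough sketch of the provider-classification step, MX host names can be matched against suffix fingerprints and the work fanned out over a thread pool. The suffix table below is a small hypothetical subset chosen for illustration, not the detection library described above, and `classify_mx` and `profile_all` are assumed names. DNS lookups are taken as already done so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical suffix fingerprints; real MX hosts vary and this is not the paper's list.
PROVIDER_SUFFIXES = {
    "google.com": "Google Workspace",
    "googlemail.com": "Google Workspace",
    "mail.protection.outlook.com": "Microsoft 365",
    "pphosted.com": "Proofpoint",
    "mimecast.com": "Mimecast",
    "secureserver.net": "GoDaddy",
    "zoho.com": "Zoho Mail",
}
GATEWAYS = {"Proofpoint", "Mimecast"}


def classify_mx(mx_hosts):
    """Return (provider, gateway_vendor) for a list of MX host names."""
    for host in mx_hosts:
        name = host.rstrip(".").lower()
        for suffix, provider in PROVIDER_SUFFIXES.items():
            if name == suffix or name.endswith("." + suffix):
                gateway = provider if provider in GATEWAYS else None
                return provider, gateway
    # MX present but unrecognized, versus no MX at all.
    return ("Other / self-hosted", None) if mx_hosts else (None, None)


def profile_all(mx_by_domain, workers=50):
    """Classify many domains concurrently (50 workers, as in the paper)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(classify_mx, mx_by_domain.values())
        return dict(zip(mx_by_domain.keys(), results))


sample = {
    "example.org": ["aspmx.l.google.com."],
    "example.edu": ["mx0a-001.pphosted.com"],
    "example.net": ["mail.example.net"],
    "example.com": [],
}
for domain, result in profile_all(sample).items():
    print(domain, result)
```

In a real pipeline the threads would mainly hide DNS latency, since each worker spends most of its time waiting on resolver responses rather than on the string matching shown here.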

3.3 Scoring Rubric

Table 4a. SPF Scoring (30 points)

| Configuration | Points |
|---|---|
| SPF record with -all (hard fail) | 30 |
| SPF record with ~all (soft fail) | 15 |
| SPF record with ?all or +all | 5 |
| No SPF record | 0 |

Table 4b. DKIM Scoring (30 points)

| Configuration | Points |
|---|---|
| DKIM public key published (v=DKIM1) | 30 |
| No DKIM key found | 0 |

Table 4c. DMARC Scoring (40 points)

| Configuration | Points |
|---|---|
| p=reject (full enforcement) | 40 |
| p=quarantine | 20 |
| p=none (monitoring only) | 10 |
| No DMARC record | 0 |

Table 4d. Grade Scale

| Grade | Score | Interpretation |
|---|---|---|
| A | 90–100 | Full enforcement |
| B | 70–89 | Strong posture |
| C | 50–69 | Partial protection |
| D | 30–49 | Weak configuration |
| F | 0–29 | Minimal/no authentication |
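
The rubric in Tables 4a through 4d can be expressed as a short scoring function. The point values and grade floors below are transcribed from the tables; the function names and the treatment of an SPF record with no recognized `all` term (scored as 0) are assumptions, since the rubric does not address that case.

```python
# Point values transcribed from Tables 4a-4c; grade floors from Table 4d.
SPF_POINTS = {"-all": 30, "~all": 15, "?all": 5, "+all": 5}
DMARC_POINTS = {"reject": 40, "quarantine": 20, "none": 10}
DKIM_POINTS = 30
GRADE_FLOORS = [(90, "A"), (70, "B"), (50, "C"), (30, "D"), (0, "F")]


def score_domain(spf_qualifier, has_dkim, dmarc_policy):
    """Return the 0-100 composite score; pass None for a missing record."""
    # Assumption: an SPF record without a recognized "all" term earns 0 points.
    spf = SPF_POINTS.get(spf_qualifier, 0)
    dkim = DKIM_POINTS if has_dkim else 0
    dmarc = DMARC_POINTS.get(dmarc_policy, 0)
    return spf + dkim + dmarc


def grade(score):
    """Map a composite score to its letter grade."""
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    raise ValueError(f"score out of range: {score}")


examples = [
    ("-all", True, "reject"),      # 100 -> A
    ("-all", True, "quarantine"),  # 80  -> B
    ("~all", True, "none"),        # 55  -> C
    ("-all", False, None),         # 30  -> D
    (None, False, None),           # 0   -> F
]
for spf, dkim, dmarc in examples:
    total = score_domain(spf, dkim, dmarc)
    print(spf, dkim, dmarc, total, grade(total))
```

Under these point values the next-highest attainable score below 100 is 85 (~all, DKIM, p=reject), so Grade A is reachable only with a hard-fail SPF record, a published DKIM key, and p=reject together.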

4. Findings

4.1 Email Authentication Adoption — National Overview

Table 5. Authentication Protocol Adoption: All 577,882 Scanned Domains

| Protocol | Domains with record | Adoption rate | Significance |
|---|---|---|---|
| MX records (any) | 459,018 | 79.4% | Email infrastructure present |
| SPF (any) | 399,074 | 69.0% | Sender authorization configured |
| DKIM (any) | 213,292 | 36.9% | Message signing configured |
| DMARC (any policy) | 224,230 | 38.8% | Domain spoofing policy present |
| DMARC p=none | 141,041 | 24.4% | Monitoring only; no enforcement |
| DMARC p=quarantine | 50,276 | 8.7% | Partial enforcement |
| DMARC p=reject | 32,361 | 5.6% | Full enforcement; spoofing blocked |
| CAA records | 8,089 | 1.4% | Certificate issuance restrictions |
| MTA-STS | 2,889 | 0.5% | Transport layer encryption policy |
| TLS-RPT | 3,467 | 0.6% | Transport security reporting |

The headline finding is stark: only 5.6% of U.S. organizations have DMARC configured at enforcement level (p=reject). An additional 8.7% use p=quarantine, leaving 85.7% of all scanned domains without active spoofing prevention. Of the 38.8% with any DMARC record, 62.9% are monitoring-only (p=none), providing visibility without protection.
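
The percentages in this paragraph follow directly from the Table 5 counts. The short check below reproduces them; the helper name `pct` is an assumption. Note that the monitoring-only share uses domains with a DMARC record as its denominator, while the other rates use all scanned domains.

```python
TOTAL_SCANNED = 577_882
DMARC_ANY = 224_230
P_NONE, P_QUARANTINE, P_REJECT = 141_041, 50_276, 32_361


def pct(part, whole):
    """Percentage rounded to one decimal place, as reported in the tables."""
    return round(100 * part / whole, 1)


reject_rate = pct(P_REJECT, TOTAL_SCANNED)
quarantine_rate = pct(P_QUARANTINE, TOTAL_SCANNED)
# Domains with neither p=reject nor p=quarantine, i.e. no active enforcement.
unenforced_rate = pct(TOTAL_SCANNED - P_REJECT - P_QUARANTINE, TOTAL_SCANNED)
# Denominator here is domains with any DMARC record, not all scanned domains.
monitoring_share = pct(P_NONE, DMARC_ANY)

print(reject_rate, quarantine_rate, unenforced_rate, monitoring_share)
# 5.6 8.7 85.7 62.9
```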

DKIM adoption at 36.9% is notably lower than SPF (69.0%). This gap is consistent with the relative complexity of DKIM key management versus SPF's single-record configuration. CAA and MTA-STS remain rare, below 2% nationally.

4.2 Grade Distribution

Table 6. Email Security Grade Distribution: 577,882 Domains

| Grade | Domains | Share | Interpretation |
|---|---|---|---|
| A (90–100) | 2,584 | 0.4% | SPF -all + DKIM + DMARC reject |
| B (70–89) | 138,271 | 23.9% | Strong posture, not full enforcement |
| C (50–69) | 129,881 | 22.5% | SPF + DKIM but weak/no DMARC |
| D (30–49) | 141,824 | 24.5% | SPF only or partial DMARC |
| F (0–29) | 165,322 | 28.6% | Minimal or no authentication |

More than a quarter of all U.S. organization domains (28.6%) receive a Grade F — meaning they have either no SPF, no DKIM, and no DMARC, or only the weakest possible configurations. Combined D and F domains represent 53.1% of the scanned universe. Only 24.3% of domains achieve a B or better.

4.3 Email Provider Distribution

Table 7. Email Provider Distribution: 459,018 Domains with a Classified MX Provider

| Provider | Domains | Share of classified |
|---|---|---|
| Google Workspace | 150,197 | 32.7% |
| Microsoft 365 | 146,881 | 32.0% |
| Other / self-hosted | 114,719 | 25.0% |
| Proofpoint (gateway) | 24,858 | 5.4% |
| GoDaddy | 9,239 | 2.0% |
| Mimecast (gateway) | 5,178 | 1.1% |
| Zoho Mail | 3,482 | 0.8% |
| Yahoo Mail | 1,871 | 0.4% |
| Mailgun | 1,785 | 0.4% |
| Amazon SES | 485 | 0.1% |
| iCloud Mail | 312 | 0.1% |

Denominator: 459,018 domains with a classified MX provider. This excludes the 118,864 scanned domains with no MX record (dormant), no DNS presence (NXDOMAIN), or a failed lookup, as broken down in Table 8.

Google Workspace and Microsoft 365 are nearly equally deployed at national scale, with a combined 64.7% share of classified email infrastructure. This near-parity contrasts sharply with sector-level findings: K-12 education is dominated by Google (4:1 over Microsoft), while higher education and government invert this pattern toward Microsoft. Use of Proofpoint as a gateway (5.4% of classified domains) is associated with stronger security posture: gateway-fronted domains exhibit substantially higher DMARC adoption (see Section 5, Table 10).
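
The combined Google and Microsoft share depends on which denominator is used. The check below, using the Table 7 counts, shows that the 64.7% figure here and the 51.4% figure in the abstract come from the same numerator; the variable names are assumptions.

```python
SCANNED = 577_882     # all DNS-profiled domains
CLASSIFIED = 459_018  # domains with a classified MX provider
GOOGLE, MICROSOFT = 150_197, 146_881

combined = GOOGLE + MICROSOFT
share_of_classified = round(100 * combined / CLASSIFIED, 1)
share_of_scanned = round(100 * combined / SCANNED, 1)

print(combined, share_of_classified, share_of_scanned)
# 297078 64.7 51.4
```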

4.4 Domain Health — Freshness and Status

Table 8. Domain Health Status Distribution

| Status | Domains | Share | Definition |
|---|---|---|---|
| Fresh / Active | 459,018 | 79.4% | MX resolves; scanned within 30 days |
| Dormant (no MX) | 72,650 | 12.6% | Domain resolves but has no email infrastructure |
| Dead (NXDOMAIN) | 44,766 | 7.7% | Domain no longer exists in DNS |
| Unknown (timeout/error) | 1,448 | 0.3% | DNS resolution failed |

The 7.7% NXDOMAIN rate reflects domain expiration or deliberate abandonment, concentrated in older IRS-registered nonprofits and legacy SAM.gov registrations. The 12.6% dormant rate represents organizations with a web presence but no active email infrastructure — typically small nonprofits or businesses using hosted email outside their primary domain. Combined, 20.3% of registered domains are email-inactive, meaning phishing exposure analysis should focus on the 79.4% that are email-capable.
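
The four health states in Table 8 can be sketched as a simple decision over lookup results. The function name `domain_status` and its inputs are assumptions for illustration, and the sketch ignores the 30-day freshness condition attached to the Fresh / Active state.

```python
from collections import Counter


def domain_status(exists, mx_hosts, lookup_failed=False):
    """Classify one domain into the four Table 8 states."""
    if lookup_failed:
        return "unknown"  # timeout or resolver error
    if not exists:
        return "dead"     # NXDOMAIN
    return "active" if mx_hosts else "dormant"


# Hypothetical lookup results: (domain exists, MX hosts, lookup failed).
observations = [
    (True, ["aspmx.l.google.com"], False),
    (True, [], False),
    (False, [], False),
    (True, [], True),
    (True, ["mail.example.org"], False),
]
tally = Counter(domain_status(*obs) for obs in observations)
shares = {status: round(100 * n / len(observations), 1) for status, n in tally.items()}
print(shares)

# Email-inactive share from the Table 8 counts: dormant plus dead over all scanned.
email_inactive = round(100 * (72_650 + 44_766) / 577_882, 1)
print(email_inactive)  # 20.3
```

Checking the failure flag first keeps timeouts from being miscounted as dead or dormant domains, which is why the unknown state is reported separately.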

4.5 Cross-Sector Security Ranking

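Table 9. Email Security Posture by Sector: Ranked by A/B Grade Rate

| Sector | Scanned | MX | SPF | DMARC | p=reject | Grade A/B | Grade F |
|---|---|---|---|---|---|---|---|
| Higher education | 700 | 91.9% | 89.0% | | 23.0% | 71.4% | 7.4% |
| K-12 education | 15,568 | 86.4% | 79.7% | 55.5% | 12.2% | 47.1% | 19.2% |
| Government (municipal) | 38,754 | 96.5% | 90.6% | 59.7% | 12.6% | 39.1% | 13.2% |
| Government (state) | 27,814 | 86.2% | 80.7% | 51.1% | 10.6% | 37.6% | 22.3% |
| Nonprofits (SAM.gov) | 83,448 | 90.3% | 79.5% | 55.7% | 6.9% | 35.2% | 20.6% |
| Businesses (SAM.gov) | 133,809 | 95.8% | 87.9% | 53.5% | 11.7% | 33.8% | 18.3% |
| Small businesses (10KSB) | 93,214 | 90.9% | 81.9% | 43.5% | 6.3% | 27.3% | 17.8% |
| Nonprofits (IRS BMF) | 201,492 | 64.3% | 49.8% | 22.4% | 2.0% | 10.3% | 49.6% |

Note: the Higher education row lists one fewer value than the header has columns. Its 91.9% and 89.0% are shown in source order under MX and SPF, and the DMARC cell is left blank because the assignment cannot be confirmed.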

The sector gap is dramatic. Higher education leads with 71.4% of scanned domains achieving Grade A or B, while IRS-registered nonprofits achieve only 10.3%, a 61.1 percentage-point gap. The nonprofit disparity is largely explained by composition: the IRS BMF includes 200,000+ small civic organizations, religious groups, and community associations with minimal technical capacity. SAM.gov-registered nonprofits, which are federally active organizations with operational sophistication, achieve 35.2% A/B, more than three times the IRS BMF rate.

Government municipal entities outperform government state entities on F-rate (13.2% vs 22.3%), likely reflecting the broad state-level category including small administrative offices and legacy agencies alongside well-resourced departments. Businesses with federal contracts (SAM.gov) score better than small businesses (10KSB) on A/B rate (33.8% vs 27.3%), consistent with federal procurement cybersecurity requirements creating upward pressure on contractor posture.

5. Provider-Security Correlation

Table 10. Authentication Rates by Provider Category

| Provider | Domains | SPF | DMARC | DKIM |
|---|---|---|---|---|
| Email Security Gateway (Proofpoint, Mimecast) | 30,036 | 97.3% | 75.7% | 76.2% |
| Enterprise Cloud (Google Workspace, M365) | 297,078 | 93.2% | 60.4% | 52.1% |
| GoDaddy / budget hosting | 9,239 | 51.3% | 23.8% | 18.2% |
| Yahoo Mail / iCloud / personal | 2,183 | 38.4% | 14.1% | 9.7% |
| Other / self-hosted | 114,719 | 84.0% | 25.9% | 21.3% |

Email security gateway users achieve 97.3% SPF and 75.7% DMARC adoption — the strongest posture of any provider category. Domains using budget hosting providers (GoDaddy, shared hosting) exhibit dramatically weaker authentication, with SPF at 51.3% and DMARC at 23.8%. Organizations using personal email providers (Yahoo, iCloud) as their domain email provider reach only 38.4% SPF adoption, representing the most exposed cohort after domains with no MX infrastructure at all.

6. Dataset Schema

Table 11. Core Entity Schema (selected fields)

| Field | Type | Description |
|---|---|---|
| entity_id | integer | Unique entity identifier |
| entity_name | string | Official organization name |
| entity_type | enum | nonprofit, business, business_smb, k12, higher_ed, govt_municipal, govt_state, county, special_district |
| entity_subtype | string | Detailed classification within type |
| state | string | Two-letter USPS state code |
| county | string | County name (where available) |
| primary_domain | string | Apex internet domain |
| latitude / longitude | float | Geocoordinates (ZIP-derived or county centroid) |
| source_name | string | Authoritative data source (IRS, SAM.gov, NCES, etc.) |

DNS Check Fields (from dns_checks table)

| Field | Type | Description |
|---|---|---|
| grade | string | A / B / C / D / F composite score |
| mx_provider | string | Classified email provider |
| mx_gateway_vendor | string | Security gateway if detected |
| spf_status / spf_qualifier | string | pass/warn/fail; -all/~all/?all/+all |
| spf_lookup_count | integer | DNS lookup depth (RFC 7208 10-limit) |
| dkim_status / dkim_selector | string | pass/fail; matching selector |
| dmarc_status / dmarc_policy | string | pass/warn/fail; none/quarantine/reject |
| dmarc_rua / dmarc_ruf | string | Aggregate and forensic report URIs |
| caa_status | string | pass/fail; certificate authority authorization |
| mta_sts_status | string | pass/fail; transport encryption policy |
| tls_rpt_status | string | pass/fail; transport security reporting |
| checked_at | timestamp | UTC timestamp of DNS scan |
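
To illustrate how the two groups of fields might be queried together, the sketch below builds a tiny in-memory SQLite database and computes the A/B rate per entity type from each entity's most recent scan. Several details are assumptions: the table name `entities`, the use of `entity_id` as the join key into `dns_checks`, and the text timestamp format. The sample rows are fictional and do not come from the registry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entities (
        entity_id INTEGER PRIMARY KEY,
        entity_name TEXT,
        entity_type TEXT,
        state TEXT,
        primary_domain TEXT
    );
    CREATE TABLE dns_checks (
        entity_id INTEGER REFERENCES entities(entity_id),
        grade TEXT,
        dmarc_policy TEXT,
        checked_at TEXT
    );
    """
)
# Fictional rows for illustration only.
conn.executemany(
    "INSERT INTO entities VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Example College", "higher_ed", "OH", "example.edu"),
        (2, "Example Food Bank", "nonprofit", "TX", "example.org"),
        (3, "Example Arts Council", "nonprofit", "TX", "example.net"),
    ],
)
conn.executemany(
    "INSERT INTO dns_checks VALUES (?, ?, ?, ?)",
    [
        (1, "C", "none", "2025-12-01T00:00:00Z"),
        (1, "B", "quarantine", "2026-03-01T00:00:00Z"),
        (2, "F", None, "2026-03-01T00:00:00Z"),
        (3, "B", "reject", "2026-03-01T00:00:00Z"),
    ],
)

# Keep only each entity's most recent scan, then compute the A/B share per type.
AB_RATE_BY_TYPE = """
    SELECT e.entity_type,
           COUNT(*) AS scanned,
           ROUND(100.0 * SUM(c.grade IN ('A', 'B')) / COUNT(*), 1) AS ab_pct
    FROM entities AS e
    JOIN dns_checks AS c ON c.entity_id = e.entity_id
    WHERE c.checked_at = (
        SELECT MAX(checked_at) FROM dns_checks WHERE entity_id = e.entity_id
    )
    GROUP BY e.entity_type
    ORDER BY ab_pct DESC
"""
rows = conn.execute(AB_RATE_BY_TYPE).fetchall()
print(rows)  # [('higher_ed', 1, 100.0), ('nonprofit', 2, 50.0)]
```

Filtering on the latest `checked_at` per entity matters because the versioned table retains earlier scans, and counting them would mix historical grades into a current-posture rate.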

7. Research Applications

7.1 Cybersecurity Policy

7.2 Enterprise Sales & Market Intelligence

7.3 Academic Research

8. Limitations

9. Data Availability

Table 12. Licensing Tiers

| Tier | Scope | Suggested Use |
|---|---|---|
| State Pack | All entities in one state, all sectors | State-level risk assessment, policy analysis |
| Sector Pack | All entities in one sector, national | Market intelligence, sector research |
| Government | All municipal + state entities (69,791) | SLTT cybersecurity research |
| Nonprofit | All nonprofits, IRS + SAM (292,757) | Nonprofit sector risk analysis |
| Business | All SAM + SMB businesses (239,883) | Commercial threat intelligence, sales targeting |
| Full Registry | All 801,359 entities, all sectors | Comprehensive research, national policy |
| API + Quarterly Updates | Full registry, re-scanned quarterly | Longitudinal studies, dashboards, monitoring |

Delivery formats: CSV/Parquet, SQLite (pre-indexed), REST API, or interactive dashboard. All deliveries include full methodology documentation, source provenance, data dictionary, and reproducibility scripts (Python). Interactive exploration available at the MonitorWorkspace Email Scorecard.

Contact: research@monitorworkspace.com

Citation: Lokentra Research Team (2026). U.S. Entity Email Security Intelligence Dataset: A Multi-Sector DNS Authentication Registry Covering 801,359 Organizations. Lokentra U.S. Email Security Index (ESI). https://lokentra.com/research/overall-dataset-paper.html

Data sources: IRS Business Master File; SAM.gov Public Extract V2 (March 2026); Goldman Sachs 10,000 Small Businesses program directory; NCES Common Core of Data 2024-2025; NCES IPEDS; U.S. Census Bureau TIGER/Gazetteer. All DNS data derived from publicly accessible DNS records.

Competing interests: Lokentra develops MonitorWorkspace, a Google Workspace administration platform. The ESI dataset is produced by the Lokentra Research Division independently of the product team.