U.S. Education Sector Email Security Intelligence Dataset:
A Multi-Source Entity Registry with DNS Authentication Profiling

Lokentra Research Team
Lokentra U.S. Email Security Index (ESI)
March 2026
research@monitorworkspace.com • monitorworkspace.com/scorecard

Abstract

We present the Lokentra U.S. Email Security Index (ESI), a comprehensive multi-sector entity registry with DNS email security profiling covering 577,882 domains across the United States. This paper describes the education subset: 20,021 K-12 school districts and charter LEAs and 3,171 higher education institutions spanning all 50 states and the District of Columbia, sourced from the NCES Common Core of Data (CCD) for the 2024-2025 school year, IPEDS, and state education agency directories. Each entity is linked to its internet domain, email provider, and a full DNS authentication profile (SPF, DKIM, DMARC) scored on a 100-point rubric. We find that K-12 DMARC adoption stands at 55.5%, compared to 89.0% for higher education. We attribute this 33.5 percentage-point gap to differences in institutional IT capacity, vendor ecosystem effects, and the absence of federal email security mandates for sub-federal entities. Google Workspace leads Microsoft 365 in K-12 email infrastructure at a 4:1 ratio, while higher education inverts this pattern at 1:2.9. Among domains with MX records, only 8.2% of K-12 districts route mail through an email security gateway, versus 20.4% for higher education. The dataset includes phone numbers, physical addresses, grade spans, NCES LEAIDs, and operational school counts for 19,281 matched entities, enabling direct linkage to NCES enrollment, finance, and demographic datasets. To our knowledge, this is the largest entity-attributed email security dataset for the U.S. education sector.

Keywords: email security, DMARC, SPF, K-12, higher education, DNS analysis, public sector cybersecurity, Google Workspace, education technology

1. Introduction

Email remains the primary attack vector against educational institutions. Phishing campaigns targeting school districts have increased substantially, with the FBI's Internet Crime Complaint Center (IC3) reporting education as the most-targeted sector for ransomware incidents in 2023-2024. Despite growing federal attention to government cybersecurity, Binding Operational Directive (BOD) 18-01 — which mandated DMARC at enforcement level for federal agencies — does not extend to the approximately 130,000 K-12 schools and 6,000+ postsecondary institutions operating across the United States.

The security posture of these education entities is largely unmeasured. Existing studies focus on federal .gov domains or narrow state-level samples. No prior work has attempted a comprehensive, entity-attributed DNS analysis covering all U.S. K-12 districts and accredited institutions with email provider classification, security gateway detection, and authentication scoring.

This paper makes three contributions:

  1. A nationwide education entity registry linking 23,192 education entities to their internet domains, sourced from NCES CCD (2024-2025), IPEDS, and state education agency directories.
  2. Full-spectrum DNS profiling of education domains across seven record types, with email provider classification, gateway detection, and a 100-point authentication score.
  3. Empirical findings on the K-12/higher-ed security divide, the Google-Microsoft bifurcation, and the gateway adoption gap — with implications for federal cybersecurity policy.

2. Dataset Overview

2.1 Full Lokentra ESI Registry

Table 1. Full ESI Registry Scope
| Metric | Value |
| --- | --- |
| Total domains DNS-profiled | 577,882 |
| Domains alive (resolving) | 459,018 (79.4%) |
| Public sector entities | 268,719 |
| For-profit businesses (SAM.gov) | 239,883 |
| Nonprofits (IRS BMF) | 292,757 |
| States and territories covered | 50 + DC |
| DNS record types per domain | 7 |
| Entity schema fields | 40 |

2.2 Education Subset

Table 2. Education Segment Summary
| Metric | Value |
| --- | --- |
| K-12 school districts and charter LEAs | 20,021 |
| Higher education institutions | 3,171 |
| Total education entities | 23,192 |
| K-12 entities with website / domain | 15,568 |
| K-12 entities with phone + address | 19,281 (100% of CCD-matched) |
| K-12 entities with NCES LEAID | 19,281 |
| K-12 data freshness | 2024-2025 school year |

3. Methodology

3.1 Entity Collection

Education entities were collected from three primary source categories:

Table 3. Data Sources
| Source | Coverage | Method |
| --- | --- | --- |
| NCES Common Core of Data (CCD) 2024-2025 | K-12 districts, all 50 states + territories | Federal flat file (CSV) |
| NCES IPEDS | Higher education, all 50 states | Federal API |
| State Departments of Education | K-12 districts, charter organizations | HTML scrape, CSV, API |
| State Higher Education Coordinating Boards | Colleges, universities | HTML scrape, API |
| U.S. Census Bureau Gazetteer | Entity coordinates, county attribution | Federal dataset |
| SAM.gov Entity Registration | Federal registration cross-reference | Federal API |

The CCD 2024-2025 file (ccd_lea_029_2425_w_1a_073025.csv) provides 19,630 LEA records, of which 19,281 have SY_STATUS=1 (Open). These were ingested via a deterministic pipeline that matches existing entities by NCES LEAID and inserts unmatched entities as new records. The CCD file provides phone numbers, physical and mailing addresses, grade span, operational school count, charter status, and LEA type classification for 100% of open LEAs.
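The match-or-insert step described above can be sketched as follows. This is a minimal illustration, not the production pipeline: the in-memory `registry` dict, the function name `ingest_open_leas`, the column names `LEA_NAME` and `PHONE`, and the sample rows are all assumptions made for the example. Only `SY_STATUS=1` and matching by LEAID come from the text.

```python
import csv
import io

def ingest_open_leas(ccd_rows, registry):
    """Upsert open LEAs into `registry` (a dict keyed by LEAID).

    Returns (matched, inserted). Rows are processed in file order, so the
    same input always produces the same registry.
    """
    matched = inserted = 0
    for row in ccd_rows:
        if row["SY_STATUS"].strip() != "1":   # keep only LEAs with status Open
            continue
        leaid = row["LEAID"].strip()          # kept as text to preserve leading zeros
        fields = {
            "entity_name": row["LEA_NAME"].strip(),  # assumed column name
            "phone": row["PHONE"].strip(),           # assumed column name
        }
        if leaid in registry:
            registry[leaid].update(fields)    # enrich the existing entity
            matched += 1
        else:
            registry[leaid] = {"nces_leaid": leaid, **fields}  # new record
            inserted += 1
    return matched, inserted

# Made-up sample rows, for illustration only.
SAMPLE = """LEAID,LEA_NAME,PHONE,SY_STATUS
0100001,Example County Schools,(555) 010-0001,1
0100002,Closed Example District,(555) 010-0002,2
0100003,Sample Charter LEA,(555) 010-0003,1
"""

registry = {
    "0100001": {"nces_leaid": "0100001",
                "entity_name": "Example Co. Schools",
                "primary_domain": "example.org"},
}
matched, inserted = ingest_open_leas(csv.DictReader(io.StringIO(SAMPLE)), registry)
print(matched, inserted, sorted(registry))
```

Fields already attached to a matched entity, such as `primary_domain` here, survive the update because only the CCD-supplied fields are overwritten.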

3.2 DNS Profiling Pipeline

Each entity's website URL is resolved to its apex domain, which is then profiled across seven DNS record types (A, CNAME, MX, NS, SOA, SPF, and DMARC; SPF and DMARC policies are published as TXT records) using parallel resolution. The pipeline classifies email providers from MX records, detects third-party email security gateways by comparing MX and SPF records, infers the underlying mailbox platform behind proxies, and scores each domain on a 100-point authentication rubric. Entities are deduplicated via a SHA-1 hash of normalized name, type, and county.
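Two of the steps above lend themselves to a short sketch: the deduplication key and the MX/SPF comparison used to spot a gateway. The code below is an illustration under stated assumptions, not the ESI implementation. The normalization rule, the suffix tables, and the function names `entity_key`, `first_match`, and `classify_mail_path` are hypothetical; the suffix tables list only a few well-known hostnames and are far from complete.

```python
import hashlib
import re

def entity_key(name, entity_type, county):
    """SHA-1 over normalized name, type, and county (normalization is assumed)."""
    parts = [re.sub(r"[^a-z0-9]+", " ", p.lower()).strip()
             for p in (name, entity_type, county)]
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()

# Illustrative suffix tables; a real classifier would need many more entries.
GATEWAY_MX = {"pphosted.com": "Proofpoint", "mimecast.com": "Mimecast",
              "barracudanetworks.com": "Barracuda"}
MAILBOX_MX = {"google.com": "Google Workspace",
              "mail.protection.outlook.com": "Microsoft 365"}
MAILBOX_SPF = {"_spf.google.com": "Google Workspace",
               "spf.protection.outlook.com": "Microsoft 365"}

def first_match(hosts, table):
    """Return the label of the first host that equals or ends with a known suffix."""
    for host in hosts:
        host = host.lower().rstrip(".")
        for suffix, label in table.items():
            if host == suffix or host.endswith("." + suffix):
                return label
    return None

def classify_mail_path(mx_hosts, spf_record):
    """If MX points at a gateway, infer the mailbox platform from SPF includes."""
    gateway = first_match(mx_hosts, GATEWAY_MX)
    if gateway is None:
        return {"email_proxy": None,
                "underlying_provider": first_match(mx_hosts, MAILBOX_MX)}
    includes = re.findall(r"include:(\S+)", spf_record or "")
    return {"email_proxy": gateway,
            "underlying_provider": first_match(includes, MAILBOX_SPF)}

print(classify_mail_path(
    ["mx1.example.pphosted.com."],
    "v=spf1 include:_spf.google.com include:spf.example.pphosted.com ~all"))
```

The idea is that a gateway hides the mailbox platform at the MX layer, but the domain's SPF record usually still authorizes that platform to send, so the `include:` terms reveal what sits behind the proxy.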

3.3 Scoring Rubric

Table 4a. SPF Scoring (30 points)
| Configuration | Points |
| --- | --- |
| SPF record with -all (hard fail) | 30 |
| SPF record with ~all (soft fail) | 15 |
| SPF record with ?all or +all | 5 |
| No SPF record | 0 |

Table 4b. DKIM Scoring (30 points)
| Configuration | Points |
| --- | --- |
| DKIM public key published | 30 |
| No DKIM key found | 0 |

Selectors checked: google, mail, selector1, selector2, s1, s2, k1
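DKIM keys cannot be discovered from the domain name alone; a resolver has to ask for a specific selector under `_domainkey`. The sketch below builds the query names that a check against the selector list above would use, alongside the standard locations for SPF (the domain itself) and DMARC (`_dmarc.<domain>`). The function name `auth_query_names` is hypothetical, and no DNS lookups are performed.

```python
SELECTORS = ["google", "mail", "selector1", "selector2", "s1", "s2", "k1"]

def auth_query_names(domain):
    """Return the DNS names to query (as TXT) for SPF, DMARC, and DKIM."""
    domain = domain.strip().rstrip(".").lower()
    return {
        "spf": domain,                       # SPF lives in a TXT record at the domain
        "dmarc": f"_dmarc.{domain}",         # DMARC policy location
        "dkim": [f"{s}._domainkey.{domain}" for s in SELECTORS],
    }

names = auth_query_names("Example.org.")
print(names["dmarc"])
print(names["dkim"])
```

One consequence of a fixed selector list is that a domain signing with a selector outside the list would be recorded as having no DKIM key.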

Table 4c. DMARC Scoring (40 points)
| Configuration | Points |
| --- | --- |
| p=reject (full enforcement) | 40 |
| p=quarantine (partial) | 20 |
| p=none (monitoring only) | 10 |
| No DMARC record | 0 |

Table 4d. Composite Grade Scale
| Grade | Score | Interpretation |
| --- | --- | --- |
| A | 90–100 | Full enforcement |
| B | 70–89 | Strong posture |
| C | 50–69 | Partial protection |
| D | 30–49 | Weak configuration |
| F | 0–29 | Minimal/no authentication |

DMARC receives the highest weight (40%) because it is the only protocol that directly prevents domain spoofing in the From: header — the field end users see and trust.
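The rubric in Tables 4a through 4d can be expressed as a small scoring function. This is a sketch of the published point values, not the ESI code itself; the function names are hypothetical. The rubric does not say how to score an SPF record that has no `all` term, so the 5 points assigned in that case is an assumption, as is treating a bare `all` as `+all` (the SPF default qualifier).

```python
import re

SPF_POINTS = {"-": 30, "~": 15, "?": 5, "+": 5}              # Table 4a
DMARC_POINTS = {"reject": 40, "quarantine": 20, "none": 10}  # Table 4c
GRADES = [(90, "A"), (70, "B"), (50, "C"), (30, "D"), (0, "F")]  # Table 4d

def spf_points(record):
    """Points for an SPF TXT record, or 0 when no record is published."""
    if not record:
        return 0
    m = re.search(r"(?:^|\s)([-~?+]?)all(?:\s|$)", record.lower())
    if m is None:
        return 5  # ASSUMPTION: the rubric is silent on records without an `all` term
    return SPF_POINTS[m.group(1) or "+"]  # bare `all` behaves like `+all`

def score_domain(spf_record, has_dkim, dmarc_policy):
    """Return (score, grade) on the 100-point rubric."""
    total = (spf_points(spf_record)
             + (30 if has_dkim else 0)                 # Table 4b
             + DMARC_POINTS.get(dmarc_policy or "", 0))
    grade = next(g for floor, g in GRADES if total >= floor)
    return total, grade

print(score_domain("v=spf1 include:_spf.google.com -all", True, "reject"))
print(score_domain("v=spf1 include:_spf.google.com ~all", True, None))
print(score_domain(None, False, None))
```

Under this rubric a domain with no DMARC record can reach at most 60 points, so it can never grade above C.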

4. The Human Impact: Populations at Risk

Email security gaps are not abstract technical metrics. To estimate the human exposure, we combined operational school counts from the CCD 2024-2025 file (100,081 schools across 19,281 matched K-12 districts) with NCES national averages: 381 students per school, 51 staff and educators per school, and 2 parents or guardians per student.

Table 5. Estimated Population Exposure from K-12 Email Security Gaps
| Population | Total in Dataset | At Risk (No DMARC, 44.5%) | Critically Exposed (Grade F, 19.2%) |
| --- | --- | --- | --- |
| Students | 38.1 million | 17.0 million | 7.3 million |
| Parents and guardians | 76.3 million | 33.9 million | 14.6 million |
| Staff and educators | 5.1 million | 2.3 million | 980,000 |
| Total people | 119.5 million | 53.2 million | 22.9 million |

More than 53 million people (students, parents, and educators) are associated with K-12 districts that lack DMARC email authentication. Their districts can be impersonated by anyone sending a spoofed email. Some 22.9 million are in Grade F districts with minimal or no email authentication at all; the two groups are not mutually exclusive, so the figures should not be added together. These are children, families, and teachers whose school communications can be forged without detection.

Estimates use NCES 2023-2024 national averages applied to operational school counts from CCD 2024-2025. Individual district enrollment data can be linked via NCES LEAID for precise counts.
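The arithmetic behind Table 5 can be reproduced directly from the figures quoted in this section. The sketch below multiplies the school count by the stated per-school averages and applies the two shares uniformly to every population group, which is how the table's numbers work out; the variable and function names are ours.

```python
SCHOOLS = 100_081            # operational schools, CCD 2024-2025
STUDENTS_PER_SCHOOL = 381    # NCES national average
STAFF_PER_SCHOOL = 51        # NCES national average
PARENTS_PER_STUDENT = 2
NO_DMARC_SHARE = 0.445       # K-12 domains without a DMARC record
GRADE_F_SHARE = 0.192        # K-12 domains graded F

students = SCHOOLS * STUDENTS_PER_SCHOOL
parents = students * PARENTS_PER_STUDENT
staff = SCHOOLS * STAFF_PER_SCHOOL
totals = {"students": students, "parents": parents, "staff": staff,
          "total": students + parents + staff}

def in_millions(share=1.0):
    """Scale each population by `share` and express it in millions (1 decimal)."""
    return {group: round(count * share / 1e6, 1) for group, count in totals.items()}

print("all     ", in_millions())
print("no DMARC", in_millions(NO_DMARC_SHARE))
print("grade F ", in_millions(GRADE_F_SHARE))
```

Because the shares are percentages of districts rather than of enrollment, applying them to population totals implicitly assumes that districts without DMARC are about the same size, on average, as districts with it.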

5. Findings

5.1 Email Provider Bifurcation

Table 6. Email Provider Distribution by Education Segment
| Segment | Google Workspace | Microsoft 365 | Ratio (Google : Microsoft) |
| --- | --- | --- | --- |
| K-12 school districts | 8,611 | 2,159 | 4.0 : 1 |
| Higher education | 579 | 1,680 | 1 : 2.9 |

K-12 districts overwhelmingly use Google Workspace, consistent with Google Workspace for Education's free or heavily discounted licensing for K-12. Higher education institutions invert this pattern, favoring Microsoft 365 at a 2.9:1 ratio. To our knowledge, this is the first empirical quantification of this bifurcation at national scale.

5.2 Email Authentication Adoption

Table 7. Authentication Protocol Adoption Rates
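| Protocol | K-12 | Higher Ed | National Avg | K-12 vs Higher Ed |
| --- | --- | --- | --- | --- |
| MX records (any) | 86.4% | 93.0% | 79.4% | -6.6pp |
| SPF (any) | 79.7% | 91.9% | 69.0% | -12.2pp |
| DMARC (any) | 55.5% | 89.0% | 38.8% | -33.5pp |
| DMARC at reject | ~12.2% | ~23.0% | ~5.6% | -10.8pp |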

44.5% of K-12 domains have no DMARC record. These districts publish no policy telling receiving servers how to handle mail that spoofs their domain: an attacker can send email appearing to come from superintendent@district.org, and no DMARC check exists to quarantine or reject it before it reaches recipients' inboxes. Higher education achieves 89.0% DMARC adoption, a 33.5 percentage-point lead likely attributable to larger IT teams, dedicated security staff, and higher cybersecurity awareness.
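Whether a domain counts as "DMARC (any)" or "DMARC at reject" comes down to the `p=` tag in the TXT record published at `_dmarc.<domain>`. The function below is a simplified parser written for illustration; it is not the ESI pipeline's code, the name `parse_dmarc_policy` is hypothetical, and it ignores parts of the DMARC specification such as `sp=` and `pct=`.

```python
VALID_POLICIES = {"none", "quarantine", "reject"}

def parse_dmarc_policy(txt_records):
    """Return 'none', 'quarantine', or 'reject', or None if no usable DMARC record.

    `txt_records` is the list of TXT strings returned for _dmarc.<domain>.
    """
    for txt in txt_records:
        tags = [t.strip() for t in txt.split(";") if t.strip()]
        # A DMARC record must begin with the version tag.
        if not tags or tags[0].replace(" ", "") != "v=DMARC1":
            continue
        for tag in tags[1:]:
            key, _, value = tag.partition("=")
            if key.strip().lower() == "p":
                value = value.strip().lower()
                return value if value in VALID_POLICIES else None
        return None  # version tag present but no policy tag
    return None

print(parse_dmarc_policy(["v=DMARC1; p=reject; rua=mailto:dmarc@example.org"]))
print(parse_dmarc_policy(["google-site-verification=abc123"]))
```

A result of `None` corresponds to the "No DMARC record" row of the rubric, while `"none"` is the monitoring-only policy that earns 10 points but does not stop spoofed mail.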

5.3 Email Security Gateway Adoption

Table 8. Email Security Gateway Adoption
| Entity Type | Domains with MX | Using Gateway | Adoption Rate |
| --- | --- | --- | --- |
| Higher education | 2,767 | 564 | 20.4% |
| K-12 | 12,607 | 1,033 | 8.2% |

Only 8.2% of K-12 districts route mail through a third-party email security gateway (Barracuda, Proofpoint, Mimecast, etc.), the lowest rate of any public-sector entity type. Domains using gateways exhibit substantially stronger authentication posture:

Table 9. Gateway Effect on Authentication
| Cohort | SPF | DMARC |
| --- | --- | --- |
| Gateway-proxied domains | 97.3% | 75.7% |
| Non-proxied domains | 89.2% | 53.0% |
| Uplift | +8.1pp | +22.7pp |

5.4 Provider-Security Correlation

Table 10. Authentication Adoption by Provider Category
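| Provider Category | Domains | SPF | DMARC | SPF-DMARC Gap |
| --- | --- | --- | --- | --- |
| Email Security Proxy | 4,111 | 97.3% | 75.7% | 21.6pp |
| Enterprise Cloud (Google/M365) | 19,190 | 93.2% | 60.4% | 32.8pp |
| Government/Education Self-hosted | 563 | 90.2% | 55.1% | 35.1pp |
| Other/Self-hosted | 3,607 | 84.0% | 25.9% | 58.1pp |
| Budget Hosting (GoDaddy, IONOS) | 1,522 | 51.3% | 23.8% | 27.5pp |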

Email infrastructure choice is strongly predictive of security posture. Districts using budget hosting providers exhibit SPF rates as low as 7.3% (GoDaddy), rendering them effectively unprotected. The vendor ecosystem a district selects determines its security floor.

6. Dataset Schema

Table 11. Entity Schema (Education-Relevant Fields, 40 columns total)
| Field | Type | Description |
| --- | --- | --- |
| entity_name | string | Official district or institution name |
| entity_type | enum | k12 or higher_ed |
| entity_subtype | string | school_district, charter_district, community_college, university, etc. |
| state | string | Two-letter USPS code |
| county | string | County name |
| primary_domain | string | Apex internet domain |
| mx_provider | string | Classified email provider (Google, M365, etc.) |
| has_spf / has_dkim / has_dmarc | boolean | Protocol presence flags |
| dmarc_policy | string | none, quarantine, reject |
| email_proxy | string | Security gateway service name |
| underlying_provider | string | Real mailbox platform behind gateway |
| dns_score | integer | 0–100 composite security score |
| grade | string | A / B / C / D / F |

CCD 2024-2025 Enrichment Fields
| Field | Type | Description |
| --- | --- | --- |
| nces_leaid | string | NCES LEA ID, the canonical join key to enrollment, finance, demographics |
| phone | string | District main phone number |
| physical_address | string | Physical street address |
| mailing_address | string | Mailing address |
| grade_low / grade_high | string | Grade span (e.g., PK–12) |
| operational_schools | integer | Number of operational schools in district |
| lea_type | string | LEA type (regular, charter agency, regional, specialized) |
| charter_flag | string | Charter status code |
| source_name / source_url | string | Authoritative collection source with URL |
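Since `nces_leaid` is described as the join key to NCES enrollment data, a linkage query might look like the sketch below. It uses an in-memory SQLite database with made-up rows; the table names `entities` and `enrollment`, the `students` column, and every value shown are hypothetical and do not describe the delivered SQLite file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (
    nces_leaid  TEXT PRIMARY KEY,   -- text, so leading zeros are preserved
    entity_name TEXT,
    state       TEXT,
    has_dmarc   INTEGER,            -- boolean flag stored as 0/1
    grade       TEXT
);
CREATE TABLE enrollment (           -- stand-in for an NCES enrollment extract
    nces_leaid TEXT PRIMARY KEY,
    students   INTEGER
);
""")

# Made-up rows, for illustration only.
conn.executemany("INSERT INTO entities VALUES (?, ?, ?, ?, ?)", [
    ("0100001", "Example County Schools", "AL", 0, "F"),
    ("0100003", "Sample Charter LEA", "AL", 1, "B"),
    ("0200001", "Placeholder Borough Schools", "AK", 0, "D"),
])
conn.executemany("INSERT INTO enrollment VALUES (?, ?)", [
    ("0100001", 1200), ("0100003", 300), ("0200001", 450),
])

# Students enrolled in districts without DMARC, by state.
rows = conn.execute("""
    SELECT e.state, SUM(n.students)
    FROM entities AS e
    JOIN enrollment AS n USING (nces_leaid)
    WHERE e.has_dmarc = 0
    GROUP BY e.state
    ORDER BY e.state
""").fetchall()
print(rows)
```

Joining on actual enrollment in this way would replace the national per-school averages used in Section 4 with district-level counts.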

7. Research Applications

7.1 Education Policy

7.2 EdTech and Procurement

7.3 Cybersecurity Research

8. Cross-Sector Benchmarking

The ESI scoring rubric is applied uniformly across the full entity registry, enabling cross-sector comparisons:

Table 12. Available Sector Datasets
| Sector | Source | Entities |
| --- | --- | --- |
| Education (K-12 + Higher Ed) | NCES CCD 2024-2025, IPEDS, state agencies | 23,192 |
| For-profit businesses | SAM.gov federal entity registrations | 239,883 |
| Nonprofits | IRS Business Master File (BMF) | 292,757 |
| Public sector (non-education) | State registries, Census, EPA SDWIS | 245,527 |
| Total | | 801,359 |

9. Limitations

10. Data Availability

Table 13. Licensing Tiers
| Tier | Scope | Suggested Use |
| --- | --- | --- |
| State Pack | Single state, all education entities | Regional studies, state policy |
| K-12 National | 20,021 K-12 districts + charters | National K-12 cybersecurity research |
| Higher Ed National | 3,171 institutions | Higher ed technology adoption |
| Full Education | K-12 + Higher Ed combined | Cross-segment comparative research |
| Full Registry | All 801,359 entities | Comprehensive multi-sector research |
| API + Updates | Quarterly re-scan, REST API | Longitudinal studies, dashboards |

Delivery formats: CSV/Parquet, SQLite (pre-indexed), REST API, or interactive dashboard. All deliveries include full methodology documentation, source provenance, data dictionary, and reproducibility scripts (Python).

Contact: research@monitorworkspace.com

Free demo: monitorworkspace.com/scorecard

Citation: Lokentra Research Team (2026). U.S. Education Sector Email Security Intelligence Dataset: A Multi-Source Entity Registry with DNS Authentication Profiling. Lokentra U.S. Email Security Index (ESI). https://lokentra-site.web.app/research/education-dataset-paper.html

Data sources: NCES Common Core of Data (CCD) 2024-2025; NCES IPEDS; State education agency directories; U.S. Census Bureau Gazetteer; SAM.gov; IRS Business Master File. All data derived from publicly accessible DNS records and government-published registries.

Competing interests: Lokentra develops MonitorWorkspace, a Google Workspace administration platform. The ESI dataset is produced by the Lokentra Research Division independently of the product team.