Survey data is not like other data. Every number carries a weight—literally. The people in the sample represent different numbers of people in the population, and ignoring that produces estimates that are precise but wrong. Getting this right is the foundation of everything else.
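A toy sketch of the point, with made-up numbers: two groups, one oversampled fourfold relative to its population share, and an outcome that differs between them. The unweighted mean is tightly estimated and badly biased; the weighted mean recovers the population value.

```python
import numpy as np

# Made-up illustration: group A is half the population but 80% of the
# sample; group B is half the population but only 20% of the sample.
rng = np.random.default_rng(0)
y_a = rng.normal(50, 5, size=800)  # oversampled group, lower outcome
y_b = rng.normal(70, 5, size=200)  # undersampled group, higher outcome
y = np.concatenate([y_a, y_b])

# Weights rescale each respondent to the population share they represent.
w = np.concatenate([np.full(800, 0.5 / 800), np.full(200, 0.5 / 200)])

print(y.mean())                  # ~54: precise, but wrong for the population
print(np.average(y, weights=w))  # ~60: the representative estimate
```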
Survey design is not something I learned in econometrics or statistics classes in graduate school. I learned it on the job at Cultivate Learning, and the first time I was called upon to design a sampling strategy was on the PPI project (Partnership for Pre-K Improvement), a multi-state partnership funded by the Bill & Melinda Gates Foundation.
The Gates Foundation wanted each state in the partnership to provide CLASS and ECERS data representative of all of that state’s preschool classrooms and programs. The challenge was considerable: designing a proper sampling strategy requires a master list of all programs and classrooms (a sampling frame), which I was not given. The research partners in each state had their own ideas about data collection and didn’t much appreciate being told what to do, and statistical expertise for this kind of work was largely missing in those states.
So I wrote a 64-page sampling plan that functioned as both a technical guide and a diplomatic document: it preserved each state partner’s flexibility in how data were collected while ensuring the results were representative enough to inform the overall Gates Foundation strategy. The plan covered appropriate sampling strategies, selection bias, stratification definitions, power analyses for each focus state (Tennessee, Oregon, Washington), and CLASS rating protocols, and included a full survey sampling tutorial as an appendix.
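As an illustration of the power-analysis piece, the core computation is a sample-size solve of roughly this shape; the effect size and targets here are my assumptions, not figures from the plan.

```python
from statsmodels.stats.power import TTestIndPower

# Illustrative only: classrooms per group needed to detect a standardized
# difference of 0.3 in mean CLASS scores at the usual 5% / 80% targets.
n_per_group = TTestIndPower().solve_power(effect_size=0.3, alpha=0.05,
                                          power=0.8)
print(f"~{n_per_group:.0f} classrooms per group")
```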
Two years later, I designed the sampling strategy for the Washington Early Childhood Workforce Survey. This time I had more control. The challenge was to produce representative estimates for the entire state’s early childhood workforce across 39 counties that range from King County (Seattle, 31.6% of the population) to Wahkiakum County (0.01%). I developed the stratified sampling frame, calculated design weights with adjustments for differential response rates across strata, and produced population-level estimates with proper standard errors.
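A minimal sketch of that weighting logic, with hypothetical strata and made-up counts: the base design weight is the stratum’s population over its sample size, and dividing by the stratum’s response rate folds the nonresponse adjustment into the same factor.

```python
import pandas as pd

# Hypothetical strata and counts; only the arithmetic is the point.
strata = pd.DataFrame({
    "stratum":     ["King", "Spokane", "rural_west"],
    "N_pop":       [12000, 2500, 800],    # frame counts per stratum
    "n_sampled":   [600, 250, 160],       # invited to respond
    "n_responded": [270, 140, 104],       # completed surveys
})

strata["response_rate"] = strata["n_responded"] / strata["n_sampled"]
strata["weight"] = (strata["N_pop"] / strata["n_sampled"]) / strata["response_rate"]

# Sanity check: respondents' weights should sum back to the population.
total = (strata["weight"] * strata["n_responded"]).sum()
assert abs(total - strata["N_pop"].sum()) < 1e-6
print(strata[["stratum", "response_rate", "weight"]])
```

The check at the end is the habit worth keeping: if the weighted respondent counts don’t reproduce the frame totals, the weights are wrong.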
The result was a dataset that could answer questions about the entire workforce—their qualifications, working conditions, and employers—not just the subset who happened to respond. The difference between a convenience sample and a probability sample is the difference between anecdote and evidence.
The workforce survey data also opened an unexpected research direction. The survey, fielded for Washington’s Department of Children, Youth, and Families (DCYF), came out in December 2019, just before the first cases of COVID-19. A follow-up survey went out in April 2020, four months into the pandemic. Having representative data from both sides of a historic shock gave us a natural experiment. I led a team of researchers to analyze the impact on early childhood educators’ mental health, finding that the mean incidence of depression symptoms had increased by 35% and that the probability of crossing the diagnostic threshold for depression had increased by 114%. The study, grounded in Deci and Ryan’s Self-Determination Theory, was published as a book chapter in 2023. None of it would have been possible without the sampling design that made the data representative in the first place.
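The two headline numbers are ratios of weighted estimates across the waves. A hypothetical sketch of that computation, with made-up scores, weights, and cutoff standing in for the real survey data:

```python
import numpy as np

# Made-up stand-ins for the two waves: symptom scores plus survey weights.
rng = np.random.default_rng(1)
pre_y, pre_w = rng.normal(8, 4, size=500).clip(0), np.ones(500)
post_y, post_w = rng.normal(11, 5, size=450).clip(0), np.ones(450)
CUTOFF = 16  # illustrative diagnostic threshold

def wmean(y, w):
    """Weighted mean; a boolean array gives a weighted share."""
    return np.average(y, weights=w)

mean_change = wmean(post_y, post_w) / wmean(pre_y, pre_w) - 1
prob_change = (wmean(post_y >= CUTOFF, post_w)
               / wmean(pre_y >= CUTOFF, pre_w) - 1)
print(f"symptom mean: {mean_change:+.0%}; above threshold: {prob_change:+.0%}")
```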
As part of the PPI project I was assigned to the Finance strand of the Gates Foundation strategy. Most of the project was based on qualitative research: interviews with state preschool program executives, research-practice partners, and state politicians. I was a one-man quantitative analysis team. My boss gave me free rein (or, more accurately, a lack of direction), so I wasn’t sure how to fit in: finance data does not obviously lend itself to qualitative analysis, and no one gave me any quantitative data to work with.
So I started looking for datasets on my own. I began mapping publicly available early childhood education datasets, and this eventually crystallized into the state factsheets. In the process, I came across the National Survey of Early Care and Education (NSECE), a rich, nationally representative dataset on the ECE workforce. As I familiarized myself with it, I combined mentorship with my knowledge of the data and guided my junior colleague Liu Liu to publish “Early childhood educators’ pay equity: A dream deferred”, a paper that addressed a topic my boss was interested in but had never formulated as a specific research question.
Much of my work uses federal datasets that come with their own complex sampling designs: the Census Bureau’s American Community Survey (ACS PUMS), the Administration for Children and Families (ACF) CCDF enrollment data, and NIEER state pre-K yearbook data. Each requires careful handling of design weights, strata, and primary sampling units to produce valid inference.
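Concretely, “careful handling” means passing the design through to the variance. A minimal sketch (with hypothetical column names) of the with-replacement Taylor-linearization estimator that survey software applies to a weighted mean, taking variation between PSU totals within each stratum:

```python
import numpy as np
import pandas as pd

def svy_mean(df, y, w, stratum, psu):
    """Weighted mean with a Taylor-linearized (with-replacement) SE."""
    W = df[w].sum()
    mean = (df[w] * df[y]).sum() / W
    # Linearized residuals for a ratio-type mean, totaled by PSU.
    z = (df[w] * (df[y] - mean) / W).groupby([df[stratum], df[psu]]).sum()
    var = 0.0
    for _, zh in z.groupby(level=0):   # loop over strata
        n_h = len(zh)
        if n_h > 1:                    # single-PSU strata contribute nothing
            var += n_h / (n_h - 1) * ((zh - zh.mean()) ** 2).sum()
    return mean, np.sqrt(var)
```

For ACS PUMS specifically, the published replicate weights (PWGTP1–PWGTP80) are the sanctioned route to the same standard errors; the sketch above is the generic strata-and-PSU version.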
For the state factsheets, I used ACS PUMS microdata to produce population estimates of children aged 3–4 by income group, ethnicity, and metro status—then compared these to preschool enrollment patterns. The analysis revealed a consistent finding across states: ethnic minorities are disproportionately represented in poverty, but their representation in preschool enrollment does not reflect those poverty rates. Rural children face a similar gap. These are not findings you can see without properly weighted data and the right comparisons.
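A sketch of that tabulation, assuming a Washington person-level PUMS extract with the standard AGEP (age) and PWGTP (person weight) variables; the income, ethnicity, and metro columns are hypothetical derived variables, not raw PUMS fields.

```python
import pandas as pd

pums = pd.read_csv("psam_p53.csv")   # WA person file; path illustrative
kids = pums[pums["AGEP"].isin([3, 4])]

# Summing PWGTP over a cell estimates the population count in that cell.
est = (kids.groupby(["income_group", "ethnicity", "metro"])["PWGTP"]
           .sum()
           .rename("children_3_4"))
print(est)
```

The data sources at a glance: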
- ACS PUMS (Census) — Microdata with person-level design weights for demographic and enrollment analysis across income, ethnicity, and geography
- ACF CCDF — Federal childcare subsidy enrollment data with state-level trends over time
- NIEER — State pre-K enrollment and funding data for cross-state comparisons
- USDA ERS — Rural-urban classification for geographic stratification
- NSECE — National Survey of Early Care and Education, nationally representative workforce data used for the pay equity study
- Washington Workforce Survey — Original survey instrument with self-designed stratified sampling and design weights
The factsheets were designed for policy audiences—state legislators, agency heads, program administrators—who need to see the story in the data without wading through regression tables. Each factsheet follows a consistent visual grammar: enrollment trends on the left, equity breakdowns in the center, and preschool access on the right. The side-by-side structure makes the gaps visible at a glance.
I produced these for multiple states, adapting the analysis to each state’s specific program landscape—Head Start, ECEAP in Washington, Oregon Pre-K/Oregon Head Start, Tennessee VPK—while maintaining a consistent analytical framework that makes cross-state comparison possible.