Data Appendix

Online Appendix:  Robustness Checks

The following is a supplemental appendix for the paper, “I Hope to Hell Nothing Goes Back to The Way It Was Before”: COVID-19, Marginalization, and Native Nations by Raymond Foxworth, Laura Evans, Gabriel Sanchez, Cheryl Ellenwood and Carmela Roybal.

We include multiple robustness checks in our analysis.  Across these checks, our results remain stable.

Exclusions and Inclusions

We begin with tests of whether exclusions or inclusions alter results. First, we consider whether two of the most publicized instances of the COVID-19 outbreak—the pandemic on the Navajo Nation and on tribal lands in South Dakota—are outliers that skew the results. We run the models excluding these tribes and the results are stable.

Next, we consider whether population density on tribal lands affects results. When we control for population density, the effect of plumbing no longer meets the conventional threshold for statistical significance (p=.103).  Also, population density has a statistically significant and negative effect. We believe these findings indicate that where tribal populations are more dispersed, it is more difficult for tribes and other governments to provide needed services, including infrastructure like plumbing. We emphasize, however, that higher costs of serving dispersed populations do not absolve the federal government of its treaty and trust responsibilities to protect health and safety.

Finally, we evaluate the effects of tribal health administrative capacity by introducing two additional measures. To begin, we include the date of the creation of the Tribal Epidemiology Center (TEC) to which the tribe belongs. The IHS has created regional TECs to help tribes manage health data. TECs were phased in over time and some are newer than others.  Perhaps tribes with longer-standing TECs have had more time to maximize the advantages of this resource and therefore produce better public health data. We find statistically insignificant effects.  Also, we consider a tribe’s degree of autonomy in managing health programs. Some tribes have agreements whereby the IHS delegates to the tribe both funding and management for federal health programs. These agreements are of two types: self-government compacts and PL-638 contracts. Perhaps tribes that manage their own programs have developed more capacity to manage tribe-specific data. Indeed, we find positive and statistically significant effects from compacting programs.  Results on our main variables of interest are stable. These findings suggest that there are merits to subsequent, thorough investigations of tribes’ relations with IHS.

Appendix Table I: Exclusions and Inclusions

Statistical significance at the 1%, 5%, and 10% level is indicated by ***, **, and *, respectively.  Models use robust standard errors.

Models also include the controls present in the models in the main text:  Cases per 100K in state by June 11; Median age on tribe’s lands; Median household income on tribe’s lands (in thousands); Total population on tribe’s lands (in thousands); Percent American Indian or Alaska Native on tribe’s lands; Any health facility on tribal lands operated by IHS or by tribe (an indicator of ease of diagnosis); Rates of state racial misclassification in health records; Tribe belongs to the Phoenix Indian Health Board (stronger network connections to data collectors); and State-recognized tribe (weaker network connections to data collectors).

Functional Form

We consider alternate functional forms as well.  Across these specifications, our results are stable.

To begin, we use logistic regression to answer a basic and perhaps more straightforward question:  which tribes have outbreaks and which don’t? We include two different specifications of an outbreak, both of which we find reasonable. In one model, we measure whether or not a tribe has any cases. In a second model, we measure whether or not a tribe has at least five cases. We find the results are stable. A simpler specification is also less nuanced, and it is unsurprising that the correlation between the percent of households with plumbing and the percent of households that speak English (r=0.41) becomes consequential. As a result, we model separately the effects of plumbing and of speaking English.
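As a minimal sketch of the two outbreak definitions described above, the following shows how a count of cases could be recoded into the two binary outcomes. The function name and the case counts are hypothetical and purely illustrative; they are not the paper's data or code.

```python
def outbreak_indicators(case_counts, threshold=5):
    """Recode case counts into two 0/1 outcomes:
    any cases at all, and at least `threshold` cases."""
    any_cases = [1 if c > 0 else 0 for c in case_counts]
    at_least = [1 if c >= threshold else 0 for c in case_counts]
    return any_cases, at_least


# Hypothetical case counts for six tribes (illustrative values only).
example_counts = [0, 1, 4, 5, 37, 0]
any_cases, five_plus = outbreak_indicators(example_counts)

print(any_cases)   # [0, 1, 1, 1, 1, 0]
print(five_plus)   # [0, 0, 0, 1, 1, 0]
```

Each of these indicators would then serve as the dependent variable in its own logistic regression.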

Next, we consider whether a zero-inflated negative binomial model changes the results.  This model uses a first-stage model to account for alternate sources of zero observations of cases, followed by a second-stage model that accounts for the total number of cases.  In this specification, results of interest are stable. For our first stage, we include 1) whether there is a health facility on tribal lands operated by IHS or by the tribe, an indicator of ease of diagnosis; 2) rates of state racial misclassification in health records; 3) whether the tribe belongs to the Phoenix Indian Health Board, an indicator of whether a tribe has stronger network connections to data collectors; and 4) whether the tribe is state-recognized but not federally recognized, an indicator of weaker network connections to data collectors.  Note that a variety of other specifications of the first stage generate models that fail to converge.
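To illustrate how the two stages combine, the sketch below computes the probability of observing a given number of cases under a zero-inflated negative binomial distribution. It assumes the common NB2 parameterization (mean mu, dispersion alpha) and a fixed probability of a structural zero, pi_zero; in an estimated model, pi_zero would vary with the first-stage covariates and mu with the second-stage covariates. The function names and parameter values are hypothetical and do not come from the paper's estimates.

```python
import math


def nb_pmf(k, mu, alpha):
    """Negative binomial (NB2) probability of k cases, given mean mu > 0
    and dispersion alpha > 0."""
    r = 1.0 / alpha          # NB "size" parameter
    p = r / (r + mu)         # equals 1 / (1 + alpha * mu)
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)


def zinb_pmf(k, pi_zero, mu, alpha):
    """Zero-inflated NB: with probability pi_zero the observation is a
    structural zero; otherwise it is drawn from the negative binomial."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, alpha)
    return pi_zero + base if k == 0 else base


# Hypothetical parameter values, chosen only for illustration.
p_zero = zinb_pmf(0, pi_zero=0.3, mu=4.0, alpha=1.5)
total = sum(zinb_pmf(k, 0.3, 4.0, 1.5) for k in range(2000))
```

Under this setup, zeros arise from two sources: structural zeros (for example, cases that occur but are never recorded for the tribe) and ordinary zeros from the count process, which is why the probability of a zero exceeds pi_zero.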

Finally, we consider whether results change when we use the COVID case rate as the dependent variable, rather than using COVID cases as the dependent variable and then controlling for population on tribal lands and the percent of that population that is Native American. We find that results are stable.  Given that we are analyzing an infrequent condition in small populations, however, this approach creates massive heteroskedasticity.  A quarter of the sample consists of tribes with fewer than 100 members on tribal lands.  For these tribes, a single case causes the rate to skyrocket to the top of the distribution. Of course, we should doubt whether a single case indicates hugely different public health circumstances:  any individual medical diagnosis can produce a false positive or false negative result. To mitigate the higher variation among small tribes, for both conceptual and practical reasons, we log the dependent variable:  specifically, we model it as ln(1 + rate per 100,000). To double-check this specification, we also model the dependent variable as ln(.01 + rate per 100,000) and ln(10 + rate per 100,000) to see if the results change; they do not.
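The arithmetic behind this concern can be sketched briefly. The example below computes the rate per 100,000 and the logged dependent variable for two hypothetical tribes, each with a single case; the populations and function names are illustrative assumptions, not values from the data.

```python
import math


def case_rate_per_100k(cases, population):
    """Cases per 100,000 residents on tribal lands."""
    return 100_000 * cases / population


def log_rate(cases, population, offset=1.0):
    """Logged dependent variable: ln(offset + rate per 100,000)."""
    return math.log(offset + case_rate_per_100k(cases, population))


# Two hypothetical tribes, each with one case (illustrative values only).
small = case_rate_per_100k(1, 80)        # 1250.0
large = case_rate_per_100k(1, 20_000)    # 5.0

# The three offsets used to check the specification.
logged = {offset: log_rate(1, 80, offset) for offset in (0.01, 1.0, 10.0)}
```

A single case thus yields a rate 250 times higher in the smaller tribe than in the larger one. The offset is needed because ln(0) is undefined for tribes with no cases, and varying it from .01 to 10 checks that the results do not depend on where zero-case tribes are placed on the log scale.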

Appendix Table II: Alternate functional forms

Native on tribe’s lands, Any health facility on tribal lands operated by IHS or by tribe: an indicator of ease of diagnosis, Rates of state racial misclassification in health records, Tribe belongs to the Phoenix Indian Health Board: stronger network connections to data collectors, State-recognized tribe: weaker network connections to data collectors

See text for further discussion of the operationalization of the zero-inflated negative binomial and of the COVID case rate.