nsacpi
Expects Yuge Games
The article does discuss the data used and possible issues:
Let’s pause a minute to talk about where exactly this data comes from. Ideally you would want it to be from something like a random-digit-dial survey, the type typically used in public opinion polling, which with enough participants would produce a sample of each state that’s representative of its population and demographics. But the cost of running one such survey for all 50 states plus D.C. would be enormously prohibitive — to say nothing of doing so on a daily basis, which is necessary to produce the kind of real-time data of interest to epidemiologists.
So the CovidCast team partnered with Facebook, which is used by 70 percent of U.S. adults and has the ability to survey tens of thousands of them every day at relatively low cost. While the resulting state-level samples aren’t perfect representations of the general population, the researchers weight the responses using Census Bureau demographic data to ensure they’re a good approximation.
“If Facebook’s users are different from the U.S. population generally in a way that the survey weighting process doesn’t account for, then our estimates could be biased,” cautioned Alex Reinhart, a Carnegie Mellon professor of statistics and data science who works on CovidCast and wrote a book on statistical methods. “But if that bias doesn’t change much over time, then we can still use the survey to detect trends and changes.”
I think this last sentence is important. If their plan is to create a time series, then presumably the bias from using FB data doesn't change much from one month to the next. So if by December there is a change in mask wearing in certain states and a corresponding change in infections, then that is useful information.
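To make the point concrete, here is a toy sketch in Python. All the numbers are made up for illustration and are not from the CovidCast survey; the constant 5-point bias is an assumption standing in for "Facebook users differ from the general population in a stable way." Under that assumption the levels are off, but the month-to-month changes come out the same.

```python
# Hypothetical values in percentage points; not real survey data.
true_rate = [60, 65, 72, 80]   # true share of people wearing masks, by month
bias = 5                       # assumed constant over-estimate from the sample

# What the biased survey would report each month.
observed = [r + bias for r in true_rate]

def month_to_month_change(series):
    """Return the difference between each month and the one before it."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

true_change = month_to_month_change(true_rate)
observed_change = month_to_month_change(observed)

print("observed levels:", observed)
print("true changes:    ", true_change)
print("observed changes:", observed_change)
```

If the bias drifted over time (say, the mix of Facebook users shifted), the two sets of changes would no longer match, which is the caveat in Reinhart's quote.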