The case of the disappearing Statistics Canada data
2014/08/15
Good piece on Statistics Canada and the impact of some of the changes made to reduce long-standing data series:
Last year, Stephen Gordon railed against StatsCan’s attention deficit disorder, and its habit of arbitrarily terminating long-standing series and replacing them with new data that are not easily comparable.
For what appears to be no reason whatsoever, StatsCan has taken a data table that went back to 1991 and split it up into two tables that span 1991-2001 and 2001-present. Even worse, the older data have been tossed into the vast and rapidly expanding swamp of terminated data tables that threatens to swallow the entire CANSIM site. A few months ago, someone looking for SEPH wage data would get the whole series. Now, you’ll get data going back to 2001 and have to already know (StatsCan won’t tell you) that there are older data hidden behind the “Beware of the Leopard” sign.…
Statistics Canada must be the only statistical agency in the world where the average length of a data series gets shorter with the passage of time. Its habit of killing off time series, replacing them with new, “improved” definitions and not revising the old numbers is a continual source of frustration to Canadian macroeconomists.
Others are keeping tabs on the vanishing data. The Canadian Social Research Newsletter for March 2 referred to the cuts as the CANSIM Crash Diet and tallied some of the terminations:
- For the category “Aboriginal peoples” : 4 tables terminated out of a total of 7
- For the category “Children and youth” : 89 tables terminated out of a total of 130
- For the category “Families, households and housing” : 67 tables terminated out of a total of 112
- For the category “Government” : 62 tables terminated out of a total of 141
- For the category “Income, pensions, spending and wealth” : 41 tables terminated out of a total of 167
- For the category “Seniors” : 13 tables terminated out of a total of 30
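To make the tallies above easier to compare, here is a minimal sketch that turns each pair of counts into a percentage of tables terminated. The counts are copied from the list; the names `tallies` and `terminated_share` are illustrative only and do not come from the newsletter or from StatsCan.

```python
# Counts copied from the list above: (tables terminated, total tables).
tallies = {
    "Aboriginal peoples": (4, 7),
    "Children and youth": (89, 130),
    "Families, households and housing": (67, 112),
    "Government": (62, 141),
    "Income, pensions, spending and wealth": (41, 167),
    "Seniors": (13, 30),
}


def terminated_share(terminated, total):
    """Return the percentage of tables in a category that were terminated."""
    return 100 * terminated / total


# Percentage terminated per category, rounded to one decimal place.
shares = {
    category: round(terminated_share(terminated, total), 1)
    for category, (terminated, total) in tallies.items()
}

# Print categories from the highest share of terminations to the lowest.
for category, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{category}: {share}% terminated")
```

On these figures, “Children and youth” is hit hardest at roughly 68 percent, while “Income, pensions, spending and wealth” is lowest at roughly 25 percent. The categories are not summed into an overall figure here, since nothing in the list says whether a table can appear under more than one category.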
As far as Statistics Canada’s troubles go, this will never get the same level of attention as the mystery of the 200 jobs. But, as it relates to the long-term reliability of Canadian data, it’s just as serious.
Given my work using NHS data, particularly on ethnic origin, visible minority and religion, linked to social and economic outcomes, I am still in the exploration stage of finding out what data and linkages are available, or not.
