Garbage in, garbage out: Canada’s big data problem

A reminder that despite the restoration of the Census, there still remain significant gaps in the collection, methodologies and dissemination of statistical data by the government:

In a recent article in the Toronto Star, Paul Wells lays out what he sees as Prime Minister Trudeau’s game plan for slowing Canada’s brain drain and making science pay. “Over the next year,” he writes, “the Trudeau government will seek to reinforce or shore up Canada’s advantage in three emerging fields: quantum tech, artificial intelligence and big data and analytics.”

As he should. If that’s the plan, it’s a good one. Canada’s future prosperity depends on our ability to innovate and retain the best talent in those three fields.

What we call “big data analytics” works by finding previously unknown patterns in the huge blocks of data that very large organizations — governments, for example — grow around themselves constantly, like coral. Finding those patterns can point the way to new efficiencies, new ways to fight crime and disease, new trends in business. But as with any complex system, what you get depends on what you put in. If the inputs aren’t accurate, the results won’t be, either. So before we embrace the “big data revolution”, we may want to look first at the worsening quality of the data our federal government produces, and that businesses, activists and social planners use.

Take something as basic as divorce. Statistics Canada first started reporting marriage rates in 1921, divorce rates in 1972; it stopped collecting both data streams in 2011, citing “cost” concerns.

Marriage and divorce rates are exactly the kinds of data streams consumers of big data want collected, because they affect so many things: government policies, job markets, the service sector, housing starts, you name it. Because StatsCan abandoned the field five years ago, its data on marital status isn’t nearly as useful as it might have been.

Take wildlife conservation. Recently an Ontario provincial backbencher proposed a private member’s bill to allow for unlimited hunting of cormorants. The bill’s proponent says the species is experiencing a population explosion. And we don’t know if he’s right or wrong, because the feds stopped collecting that data in 2011.

Canada used to publish statistical reports that were every bit as good as the Americans’ (in some cases, better). Then we stopped.

Here’s another big data blind spot: gasoline imports. After having reported data on gasoline imports regularly since 1973, StatsCan has been suppressing the numbers since 2013 due to what it calls “privacy” concerns. In the last reporting year, 2012, a staggering amount of imported gasoline came into the country — almost 4 billion litres.

Now, if you were thinking of expanding your oil refinery, or wanted to know more about how dependent this country is on foreign fuel, this would be pretty precious data — the kind you’d probably pay for. But the data aren’t reliable — any more than the StatsCan data on gasoline demand by province, which we use to work out whether carbon taxes are actually reducing demand for gasoline. It’s bad data; it has been for years. You’d think someone in the higher echelons of the federal or provincial governments would get annoyed.

Combing through StatsCan’s archive of reports can be a bewildering experience, even for experts. Its online database, CANSIM, is easy enough to use. It’s the reports themselves that sometimes fail you.

Say you want to understand trends in Ontario’s demand for natural gas. You’d start by looking at CANSIM table 129-0003, which shows an increase in sales of natural gas in 2007 over 2006 of 85 per cent. “Ah,” you think to yourself, “that must be because of the conversion of coal-burning plants to gas.” But no, that change occurred years later. Ask StatsCan and they’ll tell you that they changed their methodology that year — but didn’t bother re-stating the previous years’ numbers under the same methodology. Individually, the numbers are accurate — but the trend stops making sense.

StatsCan changed its methodology again this year; it now warns researchers to take care when comparing current and historical data. That’s an improvement over changing the methodology without telling anyone, but it isn’t very helpful for understanding long-term trends.
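To make the natural gas example concrete, here is a minimal sketch of how an analyst might flag a possible methodology break in an annual series before trusting the trend. It is an illustration only, not anything StatsCan or the article's author uses: the function name `flag_series_breaks`, the 50 per cent threshold and the index values are all assumptions, and only the 85 per cent jump between 2006 and 2007 mirrors the figure cited above.

```python
def flag_series_breaks(series, threshold=0.5):
    """Return (year, change) pairs where the change from the previous
    observation exceeds the threshold in absolute value.

    `series` maps year -> value and is assumed to hold consecutive years.
    """
    years = sorted(series)
    breaks = []
    for prev, curr in zip(years, years[1:]):
        # Relative change from the previous observation to the current one.
        change = (series[curr] - series[prev]) / series[prev]
        if abs(change) > threshold:
            breaks.append((curr, round(change, 2)))
    return breaks


# Hypothetical index values (2006 = 100). Only the 85 per cent rise
# in 2007 reflects the article; the other years are made up.
sales_index = {2004: 97, 2005: 99, 2006: 100, 2007: 185, 2008: 188, 2009: 183}

print(flag_series_breaks(sales_index))  # [(2007, 0.85)]
```

A flagged year is only a prompt to check the table's notes or ask the agency; the jump could equally reflect a real change in demand.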

And this isn’t just StatsCan’s problem. The National Energy Board published an excellent report showing where Canada’s crude ends up in the United States. Industry analysts use the numbers to understand the reasons why light and heavy crude are selling for what they’re selling for south of the border.

The NEB stopped reporting the data after September 2015. Ask why, and this is the response you get: “The Board has decided to discontinue publication of this data while we re-evaluate our statistical products.” That, of course, was a year ago.

Source: Garbage in, garbage out: Canada’s big data problem

Canada’s top general launches push to recruit women

The Forces have struggled with increasing diversity for some time, as has the RCMP. The target of a one percent increase per year is ambitious; their annual employment equity report (available from the Library of Parliament) will allow public tracking of progress over the next few years:

Canada’s top general has set out to transform the military with a new effort to boost the number of women in the ranks.

Gen. Jonathan Vance, the chief of defence staff, revealed on Friday that he has given a directive to do what good intentions have so far failed to accomplish — get more women into the Canadian Armed Forces.

Vance said he has tasked Lt.-Gen. Christine Whitecross, the chief of military personnel, to boost the number of women in uniform by 1 per cent a year over the coming decade.

That would allow the military to meet its long-standing goal of having women make up 25 per cent of its members.

“I have asked Gen. Whitecross to increase the percentage, through retention and recruiting, . . . of women in the armed forces by 1 per cent a year over the next 10 years,” Vance told a defence conference on Friday.

“If we don’t make it a task, if I don’t give an order, it’s not going to get done. We can’t just hope that it happens. We’re going to try hard to meet our diversity targets the same way.”

Officials said later that Vance had given the directive on Wednesday during a meeting with Whitecross.

…But meeting the goal could be a challenge. There are some 15,000 women in uniform, making up 15 per cent of the regular and reserve forces.

In all, the defence department has about 66,000 full-time soldiers, short of its approved staffing level of 68,000, and about 21,000 reservists, well below its target of 27,000.

In the past, many women who joined the military were familiar with the organization, thanks either to family connections or past involvement with cadets, Leuprecht said. As the military now looks to recruit more women, it will have to broaden its appeal, he said.

Leuprecht also said that the armed forces must work to have women better represented in trades across the organization, rather than concentrated in areas such as logistics and medicine.

Vance made clear Friday that his efforts to diversify the ranks won’t stop with boosting the number of women.

“I’m also wanting to increase all manner of diversity in the armed forces to better reflect the Canadian public. It’s important. We are of the public,” Vance said.

Visible minorities currently make up 6.5 per cent of the armed forces, short of the goal of 11.8 per cent. Aboriginal peoples represent 2.5 per cent of those in uniform, shy of the goal of 3.4 per cent.