Governments have undercounted the COVID-19 death toll by millions, the WHO says

Similar analysis to that of The Economist. Biggest surprise to me is the inclusion of the USA in the top 10 – but perhaps I shouldn’t be surprised:

The COVID-19 pandemic directly or indirectly caused 14.9 million deaths in 2020 and 2021, the World Health Organization said on Thursday, in its newest attempt to quantify the outbreak’s terrible toll.

That’s around 2.7 times the 5.42 million COVID-19 deaths the WHO says were previously reported through official channels in the same two-year period.

Here’s a rundown of four main points in WHO’s report:

Overall, deaths are far higher than those in official reports

In its tally, WHO aims to quantify “excess mortality,” accounting for people who lost their lives either directly, because of contracting COVID-19, or indirectly, because they weren’t able to get treatment or preventive care for other health conditions. The figure also takes into account the deaths that analysts say were prevented because of the pandemic’s wide-ranging effects, such as curtailing traffic and travel.
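The arithmetic behind an excess-mortality estimate is simple, even if the modelling behind the baseline is not. A minimal sketch, with invented numbers (not WHO figures):

```python
# Minimal sketch of the excess-mortality calculation described above,
# using invented numbers (not WHO figures).

expected = 1_000_000  # all-cause deaths projected from pre-pandemic trends
observed = 1_150_000  # all-cause deaths actually recorded for the period

# The difference nets together deaths caused directly (COVID-19 itself),
# deaths caused indirectly (missed treatment or preventive care), and
# deaths avoided (e.g., curtailed traffic and travel).
excess_mortality = observed - expected
print(excess_mortality)  # 150000
```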

The pandemic’s current reported death toll is 6.2 million, according to Johns Hopkins University’s COVID-19 tracker.

India is estimated to have suffered a much deeper loss than reported — a finding that India disputes

In some cases, WHO’s figures depict a shockingly wide gulf between official figures and its experts’ findings. That’s particularly true for India, where WHO says millions more people died because of the pandemic than has been officially reported.

India reported 481,000 COVID-19 deaths in 2020 and 2021. But William Msemburi, technical officer for WHO’s department of data and analytics, said on Thursday that the toll is vastly higher, with 4.74 million deaths either directly or indirectly attributable to the pandemic — although Msemburi said that figure has a wide “uncertainty interval,” ranging from as low as 3.3 million to as high as 6.5 million.

The data behind the staggering figures promise to expand the understanding of the pandemic’s true effects. But the findings are also a flashpoint in debates over how to account for unreported coronavirus deaths. India, for instance, is rejecting WHO’s findings.

India “strongly objects to use of mathematical models for projecting excess mortality estimates,” the country’s health ministry said on Thursday, insisting that WHO should instead rely on “authentic data” it has provided.

10 countries accounted for a large share of deaths

Deaths were not evenly distributed around the world. The WHO says about 84% of the excess deaths were concentrated in three regions: Southeast Asia, Europe and the Americas.

And about 68% of the excess deaths were identified in just 10 countries. WHO listed them in alphabetical order: Brazil, Egypt, India, Indonesia, Mexico, Peru, Russia, South Africa, Turkey and the United States.

Overall, WHO found the number of excess deaths was much closer to reported COVID-19 deaths in high-income countries than in lower-income countries.

Many countries still lack reliable health statistics

The WHO says it relied on statistical models to derive its estimates, looking to fill in gaps in official data.

“Prior to the pandemic, we estimate that 6 out of every 10 deaths were unregistered” worldwide, said Stephen MacFeely, director of WHO’s department of data and analytics. “In fact, more than 70 countries do not produce any cause of death statistics. In the 21st century, this is a shocking statistic.”

By creating its report on excess mortality, WHO is pursuing several goals, such as urging governments to improve their health-care interventions for vulnerable populations and to adopt more rigorous and transparent reporting standards.

“Knowing how many people died due to the pandemic will help us to be better prepared for the next,” said Samira Asma, WHO’s assistant director-general for data and analytics.

Source: Governments have undercounted the COVID-19 death toll by millions, the WHO says

Cuts in Britain Could Cause a Covid Data Drought

Unfortunately, many governments are short-sighted.

Canada did the same when it disbanded the Global Public Health Intelligence Network (GPHIN) the year before the pandemic; many provinces are no longer carrying out regular testing and have reduced the frequency of reporting, etc.

Interesting example of South Africa and how it is able to maintain monitoring at a reasonable cost:

The British government on Friday shut down or scaled back a number of its Covid surveillance programs, curtailing the collection of data that the United States and many other countries had come to rely on to understand the threat posed by emerging variants and the effectiveness of vaccines. Denmark, too, renowned for insights from its comprehensive tests, has drastically cut back on its virus tracking efforts in recent months.

As more countries loosen their policies toward living with Covid rather than snuffing it out, health experts worry that monitoring systems will become weaker, making it more difficult to predict new surges and to make sense of emerging variants.

“Things are going to get harder now,” Samuel Scarpino, a managing director at the Rockefeller Foundation’s Pandemic Prevention Institute, said. “And right as things get hard, we’re dialing back the data systems.”

Since the Alpha variant emerged in the fall of 2020, Britain has served as a bellwether, tracking that variant as well as Delta and Omicron before they arrived in the United States. After a slow start, American genomic surveillance efforts have steadily improved with a modest increase in funding.

“This might actually put the U.S. in more of a leadership position,” said Kristian Andersen, a virologist at Scripps Research Institute in La Jolla, Calif.

At the start of the pandemic, Britain was especially well prepared to set up a world-class virus tracking program. The country was already home to many experts on virus evolution, it had large labs ready to sequence viral genes, and it could link that sequencing to electronic records from its National Health Service.

In March 2020, British researchers created a consortium to sequence as many viral genomes as they could lay hands on. Some samples came from tests that people took when they felt ill, others came from hospitals, and still others came from national surveys.

That last category was especially important, experts said. By testing hundreds of thousands of people at random each month, the researchers could detect new variants and outbreaks among people who didn’t even know they were sick, rather than waiting for tests to come from clinics or hospitals.

“The community testing has been the most rapid indicator of changes to the epidemic, and it’s also been the most rapid indicator of the appearance of new variants,” said Christophe Fraser, an epidemiologist at the University of Oxford. “It’s really the key tool.”

By late 2020, Britain was performing genomic sequencing on thousands of virus samples a week from surveys and tests, supplying online databases with more than half of the world’s coronavirus genomes. That December, this data allowed researchers to identify Alpha, the first coronavirus variant, in an outbreak in southeastern England.

A few other countries stood out for their efforts to track the virus’s evolution. Denmark set up an ambitious system for sequencing most of its positive coronavirus tests. Israel combined viral tracking with aggressive vaccination, quickly producing evidence last summer that the vaccines were becoming less effective — data that other countries leaned on in their decision to approve boosters.

But Britain remained the exemplar in not only sequencing viral genomes, but combining that information with medical records and epidemiology to make sense of the variants.

“The U.K. really set itself up to give information to the whole world,” said Jeffrey Barrett, the former director of the Covid-19 Genomics Initiative at the Wellcome Sanger Institute in Britain.

Even in the past few weeks, Britain’s surveillance systems were giving the world crucial information about the BA.2 subvariant of Omicron. British researchers established that the variant does not pose a greater risk of hospitalization than other forms of Omicron but is more transmissible.

On Friday, two of the country’s routine virus surveys were shut down and a third was scaled back, baffling Dr. Fraser and many other researchers, particularly since those surveys now show that Britain’s Covid infection rate is estimated to have reached a record high: one in 13 people. The government also stopped paying for free tests, and either canceled or paused contact-tracing apps and sewage sampling programs.

“I don’t understand what the strategy is, to put together these very large instruments and then dismantle them,” Dr. Fraser said.

The cuts have come as Prime Minister Boris Johnson has called for Britain to “learn to live with this virus.” When the government released its plans in February, it pointed to the success of the country’s vaccination program and the high costs of various virus programs. Although it would be scaling back surveillance, it said, “the government will continue to monitor cases, in hospital settings in particular, including using genomic sequencing, which will allow some insights into the evolution of the virus.”

It’s true that life with Covid is different now than it was back in the spring of 2020. Vaccines drastically reduce the risk of hospitalization and death — at least in countries that have vaccinated enough people. Antiviral pills and other treatments can further blunt Covid’s devastation, although they’re still in short supply in much of the world.

Supplying free tests and running large-scale surveys is expensive, Dr. Barrett acknowledged, and after two years, it made sense that countries would look for ways to curb spending. “I do understand it’s a tricky position for governments,” he said.

But he expressed worry that cutting back too far on genomic surveillance would leave Britain unprepared for a new variant. “You don’t want to be blind on that,” he said.

Steven Paterson, a geneticist at the University of Liverpool, pointed out that with the reduction in testing, Britain will have fewer viruses to sequence. He estimated the sequencing output could drop by 80 percent.

“Whichever way you look at it, it’s going to lead very much to a degradation of the insight that we can have, either into the numbers of infections, or our ability to spot new variants as they come through,” Dr. Paterson said.

Experts warned that it will be difficult to restart surveillance programs of the coronavirus, known formally as SARS-CoV-2, when a new variant emerges.

“If there’s one thing we know about SARS-CoV-2, it’s that it always surprises us,” said Paul Elliott, an epidemiologist at Imperial College London and a lead investigator on one of the community surveys being cut. “Things can change really, really quickly.”

Other countries are also applying a live-with-Covid philosophy to their surveillance. Denmark’s testing rate has dropped nearly 90 percent from its January peak. The Danish government announced on March 10 that tests would be required only for certain medical reasons, such as pregnancy.

Astrid Iversen, an Oxford virologist who has consulted for the Danish government, expressed worry that the country was trying to convince itself the pandemic was over. “The virus hasn’t gotten the email,” she said.

With the drop in testing, she said, the daily case count in Denmark doesn’t reflect the true state of the pandemic as well as before. But the country is ramping up widespread testing of wastewater, which might work well enough to monitor new variants. If the wastewater revealed an alarming spike, the country could start its testing again.

“I feel confident that Denmark will be able to scale up,” she said.

Israel has also seen a drastic drop in testing, but Ran Balicer, the director of the Clalit Research Institute, said the country’s health care systems will continue to track variants and monitor the effectiveness of vaccines. “For us, living with Covid does not mean ignoring Covid,” he said.

While Britain and Denmark have been cutting back on surveillance, one country offers a model of robust-yet-affordable virus monitoring: South Africa.

South Africa rose to prominence in November, when researchers there first discovered Omicron. The feat was all the more impressive given that the country sequences only a few hundred virus genomes a week.

Tulio de Oliveira, the director of South Africa’s Centre for Epidemic Response & Innovation, credited the design of the survey for its success. He and his colleagues randomly pick out test results from every province across the country to sequence. That method ensures that a bias in their survey doesn’t lead them to miss something important.

It also means that they run much leaner operations than those of richer countries. Since its start in early 2020, the survey has cost just $2.1 million. “It’s much more sustainable,” Dr. de Oliveira said.
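The sampling design de Oliveira describes is cheap to implement, which is part of the point. A minimal sketch, with invented provinces, test IDs and batch sizes:

```python
import random

# Minimal sketch of the stratified sampling described above: draw a fixed
# random sample of positive tests from every province, so no region's
# variants get missed. All data here are invented.

positive_tests_by_province = {
    "Gauteng": ["t01", "t02", "t03", "t04"],
    "Western Cape": ["t05", "t06", "t07"],
    "KwaZulu-Natal": ["t08", "t09", "t10"],
}

def weekly_sequencing_batch(tests_by_province, per_province=2, seed=42):
    rng = random.Random(seed)  # seeded for reproducibility
    batch = []
    for province, tests in tests_by_province.items():
        batch += rng.sample(tests, min(per_province, len(tests)))
    return batch

print(weekly_sequencing_batch(positive_tests_by_province))
```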

In contrast, many countries in Africa and Asia have yet to start any substantial sequencing. “We are blind to many parts of the world,” said Elodie Ghedin, a viral genomics expert at the U.S. National Institute of Allergy and Infectious Diseases.

The United States has traveled a course of its own. In early 2021, when the Alpha variant swept across the country, American researchers were sequencing only a tiny fraction of positive Covid tests. “We were far behind Britain,” Dr. Ghedin said.

Since then, the Centers for Disease Control and Prevention has helped state and local public health departments start doing their own sequencing of virus genomes. While countries like Britain and Denmark pull back on surveillance, the United States is still ramping up its efforts. Last month, the C.D.C. announced a $185 million initiative to support sequencing centers at universities.

Still, budget fights in Washington are bringing uncertainty to the country’s long-term surveillance. And the United States faces obstacles that other wealthy countries don’t.

Without a national health care system, the country cannot link each virus sample with a person’s medical records. And the United States has not set up a regularly updated national survey of the sort that has served the United Kingdom and South Africa so well.

“All scientists would love it if we had something like that,” Dr. Ghedin said. “But we have to work with the confines of our system.”

Source: Cuts in Britain Could Cause a Covid Data Drought

Canadians’ health data are in a shambles

Unfortunately, all too true, with too few exceptions, based upon my admittedly anecdotal experience in Ottawa:

Canadians see new and increasingly powerful computerization in almost every facet of their day-to-day lives – everywhere, that is, except for something as fundamental as our health care, where systems are too often stuck in the past.

When we go to the doctor, we get prescriptions printed on paper; lab results are sent via fax; and typically, medical offices have no direct links to any patient hospitalization data. And while the pandemic sparked a mad scramble to set up many new data systems – to track who was infected, where there were ventilators, who has been vaccinated and with which vaccine – this has happened in a largely unco-ordinated way, with Ottawa and provincial governments each developing systems separately.

As a result, even these newest computer systems are duplicative, and they do not communicate across provincial boundaries, or even within some provinces – not even, for example, to connect vaccinations, infections, the genotype of the virus, hospitalizations, other diseases and deaths so they are centrally accessible. And so Canada’s recent health-data efforts have wasted millions of dollars while failing to provide the evidence base needed for real-time effective responses to the fluctuating waves of COVID-19 infections.

This kind of failure is not new. Even before the pandemic, key kinds of data have long been imprisoned by data custodians who are excessively fearful of privacy breaches, even though the data are generally collected and stored in secure computer databases. A broad range of critical health care data remains unavailable – not only for patients’ direct clinical care, research and quality control, but also for tracking adverse drug reactions, showing unnecessary diagnostic imaging and drug over-prescribing. The result is that major inefficiencies in the systems remain hidden – and may actually cause health problems, and even deaths by medical misadventure.

There are many directions one could point the finger of blame, but as a new report from the Expert Advisory Committee to the Public Health Agency of Canada found, the root cause is a failure of governance. Federal and provincial governments have failed to agree on strong enforcement of common data standards and interoperability, though this is not only a problem of federalism. Health-data governance problems are also evident within provinces where one health agency’s data system is not connected to others within the same province.

What Canada and the provinces have now is essentially provider-centric health-data systems – not just one but many kinds for hospitals, others for primary care, and yet others still for public health. What Canadians want and need is patient- or person-centric health data. That way, no matter where you are in the country, your allergies, chronic diseases and prescriptions can be known instantly by care providers.
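A purely hypothetical sketch of the contrast being drawn here, with invented fields:

```python
# Entirely hypothetical illustration of provider-centric vs. patient-centric
# health data; field names and values are invented.

# Provider-centric: each system keys records to itself, so nothing links up.
hospital_record = {"hospital_chart": "H-9913", "admissions": ["2021-11-02"]}
clinic_record = {"clinic_chart": "C-7731", "prescriptions": ["lisinopril"]}

# Patient-centric: one portable record keyed to the person, readable by any
# care provider anywhere in the country.
patient_record = {
    "patient_id": "CA-000123",
    "allergies": ["penicillin"],
    "chronic_conditions": ["type 2 diabetes"],
    "prescriptions": ["lisinopril"],
    "vaccinations": [{"vaccine": "COVID-19 mRNA", "dose": 2, "date": "2021-06-01"}],
}
```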

Private vendor-centric health-data software also poses a threat, as do data collected by powerful tech companies from new wearable technologies that offer to collect your health data for you. If Canada does not act swiftly and decisively to establish the needed governance, competing vendor software and individual data will continue to feed the rapidly growing cacophony of proprietary standards. This trend is raising new concerns about privacy, along with untracked increases in health care costs.

The fundamental importance of standardized, interoperable, securely protected health data has been known for decades. There have been repeated efforts to achieve a modern effective health-data system for Canada. But federal cajoling and even financial incentives have failed. Much stronger governance mechanisms are required, and urgently, as the global pandemic has revealed.

The federal government has the constitutional authority to play a much stronger role, given its powers in spending, public health, statistics, as well as “peace, order and good government.” It also has readily available regulatory powers under the Canada Health Act.

Of course, high-quality data collection and data software have costs. But given the tens of billions of health care dollars the federal government is providing to the provinces through fiscal transfers, it is long past time they leveraged this clout – using both carrots and sticks – so Canadians can finally have informed, accessible health data when and where they need it most.

Michael Wolfson is a former assistant chief statistician at Statistics Canada, and a current member of the University of Ottawa’s Centre for Health Law, Policy and Ethics. Bartha Maria Knoppers is a professor, the Canada Research Chair in Law and Medicine, and director of the Centre of Genomics and Policy at McGill University’s Faculty of Medicine. They are both members of the Expert Advisory Group for the Pan-Canadian Health Data Strategy.

Source: https://www.theglobeandmail.com/opinion/article-canadians-health-data-are-in-a-shambles/

Sullivan: When The Narrative Replaces The News

Valid points by Sullivan on the media’s responsibility to provide context and background to hate crimes and incidents, including comparative data between groups and perpetrators:

There’s a reason for this shift. Treating the individual as unique, granting him or her rights, defending the presumption of innocence, relying on provable, objective evidence: these core liberal principles are precisely what critical theory aims to deconstruct. And the elite media is in the vanguard of this war on liberalism. 

This isn’t in any way to deny increasing bias against Asian-Americans. It’s real and it’s awful. Asians are targeted by elite leftists, who actively discriminate against them in higher education, and attempt to dismantle the merit-based schools where Asian-American students succeed — precisely and only because too many Asians are attending. And Asian-Americans are also often targeted by envious or opportunistic criminal non-whites in their neighborhoods. For Trump to give these forces a top-spin with the “China virus” made things even worse, of course. For a firsthand account of a Chinese family’s experience of violence and harassment, check out this piece.

The more Asian-Americans succeed, the deeper the envy and hostility that can be directed toward them. The National Crime Victimization Survey notes that “the rate of violent crime committed against Asians increased from 8.2 to 16.2 per 1000 persons age 12 or older from 2015 to 2018.” Hate crimes? “Hate crime incidents against Asian Americans had an annual rate of increase of approximately 12% from 2012 to 2014. Although there was a temporary decrease from 2014 to 2015, anti-Asian bias crimes had increased again from 2015 to 2018.”

Asians are different from other groups in this respect. “Comparing with Black and Hispanic victims, Asian Americans have relatively higher chance to be victimized by non-White offenders (25.5% vs. 1.0% for African Americans and 18.9% for Hispanics). … Asian Americans have higher risk to be persecuted by strangers … are less likely to be offended in their residence … and are more likely to be targeted at school/college.” Of those committing violence against Asians, you discover that 24 percent of such attacks are committed by whites; 24 percent are committed by fellow Asians; 7 percent by Hispanics; and 27.5 percent by African-Americans. Do the Kendi math, and you can see why Kendi’s “White Supremacist domestic terror” is not that useful a term for describing anti-Asian violence.

But what about hate crimes specifically? In general, the group disproportionately most likely to commit hate crimes in the US are African-Americans. At 13 percent of the population, African Americans commit 23.9 percent of hate crimes. But hate specifically against Asian-Americans in the era of Trump and Covid? Solid numbers are not yet available for 2020, which is the year that matters here. There’s data, from 1994 to 2014, that finds little racial skew among those committing anti-Asian hate crimes. Hostility comes from every other community pretty equally. 

The best data I’ve found for 2020, the salient period for this discussion, are provisional data on complaints and arrests for hate crimes against Asians in New York City, one of two cities which seem to have been most affected. They record 20 such arrests in 2020. Of those 20 offenders, 11 were African-American, two Black-Hispanic, two white, and five white Hispanics. Of the black offenders, a majority were women. The bulk happened last March, and they petered out soon after. If you drill down on some recent incidents in the news in California, and get past the media gloss to the actual mugshots, you also find as many black as white offenders.

This doesn’t prove much either, of course. Anti-Asian bias, like all biases, can infect anyone of any race, and the sample size is small and in one place. But it sure complicates the “white supremacy” case that the mainstream media simply assert as fact. 

And, given the headlines, the other thing missing is a little perspective. Here’s a word cloud of the victims of hate crimes in NYC in 2020. You can see that anti-Asian hate crimes are dwarfed by those against Jews, and many other minorities. And when you hear about a 150 percent rise in one year, it’s worth noting that this means a total of 122 such incidents in a country of 330 million, of which 19 million are Asian. Even if we bring this number up to more than 3,000 incidents from unreported and far less grave cases, including “shunning”, it’s small in an aggregate sense. A 50 percent increase in San Francisco from 2019 to 2020, for example, means the number of actual crimes went from 6 to 9.

Is it worse than ever? No. 2020 saw 122 such hate incidents. In 1996, the number was 350. Many incidents go unreported, of course, and hideous comments, slurs and abuse don’t count as hate “crimes” as such. I’m not discounting the emotional scars of the kind of harassment this report cites. I’m sure they’ve increased. They’re awful. Despicable. Disgusting.

But the theory behind hate crimes law is that these crimes matter more because they terrify so many beyond the actual victim. And so it seems to me that the media’s primary role in cases like these is providing some data and perspective on what’s actually happening, to allay irrational fear. Instead they contribute to the distortion by breathlessly hyping one incident without a single provable link to any of this — and scare the bejeezus out of people unnecessarily.

The media is supposed to subject easy, convenient rush-to-judgment narratives to ruthless empirical testing. Now, for purely ideological reasons, they are rushing to promote ready-made narratives, which actually point away from the empirical facts. To run sixteen separate pieces on anti-Asian white supremacist misogynist hate based on one possibly completely unrelated incident is not journalism. It’s fanning irrational fear in the cause of ideological indoctrination. And it appears to be where all elite media is headed.

Source: https://andrewsullivan.substack.com/p/when-the-narrative-replaces-the-news-9ea

The ideologies of Canadian economists, according to Twitter – Macleans.ca

Interesting analysis of Twitter use and followers to indicate ideological leanings, by Stephen Tapp:

Four additional results are worth highlighting. First, there are indeed many Canadian think tanks: these results include 44. Having such a crowded playing field may explain much of the general public’s confusion about which think tank fits in where ideologically.

Second, according to my ideology measure, Canadian think tanks seem to be about evenly split on the left-right continuum: there are 21 think tanks to the left of centre and 23 to the right.

Third, the smile isn’t exactly symmetric. In this sample, and with this measure, the average “right-wing” think tank appears to be a bit more “ideological” than the average “left-wing” think tank. That said, the difference is not that large and may simply reflect what Halberstam and Knight found in the US: that conservatives are actually more tightly connected on social media than liberals.

Fourth, my preliminary analysis did not suggest any systematic relationship between ideology and Twitter followers. In other words, it does not appear that more extreme ideologies on their own are associated with a larger Twitter following.

… That said, we should always be careful when reducing a complex issue to a single number along a single dimension. The concept of ideology is inevitably problematic. Moreover, think tank ideologies are not uniform within a given organization and they change over time. Finally, of course, readers should not use these results to prejudge, discredit or approve of research by any of these organizations without a thorough reading of that research. I emphasize that these simple results are preliminary and just a first step; much more work is needed to better understand these complex issues.
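Tapp’s measure isn’t spelled out in this excerpt. As a purely illustrative sketch of the general follower-based approach (in the spirit of the Halberstam and Knight work he cites), with invented accounts and weights:

```python
# Hypothetical sketch of a follower-based ideology score. Reference accounts,
# ideal points and counts are invented; this is NOT Tapp's actual method.

# Ideal points for accounts with known leanings (-1 = left, +1 = right).
reference_points = {"@left_party": -1.0, "@centre_party": 0.0, "@right_party": 1.0}

def ideology_score(follower_overlap):
    """Weighted mean of reference ideal points, weighted by how many of a
    think tank's followers also follow each reference account."""
    total = sum(follower_overlap.values())
    return sum(reference_points[acct] * n
               for acct, n in follower_overlap.items()) / total

# A think tank whose followers skew toward right-leaning accounts:
print(ideology_score({"@left_party": 100, "@centre_party": 300, "@right_party": 600}))
# 0.5 -> placed right of centre on the left-right continuum
```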

The ideologies of Canadian economists, according to Twitter – Macleans.ca.

The Best Infographics of the Year: Nate Silver on the 3 Keys to Great Information Design and the Line Between Editing and Censorship

Some neat examples, and the principles are well articulated. Some of the uses of graphics in the Globe, National Post, NY Times etc. buttress his points:

Great works of information design are also great works of journalism.

[…] At the core of journalism is the mission of making sense of our complex world to a broad audience. Newsrooms … place emphasis on gathering information. But they’re also in the business of organizing that information into forms like stories. Visual approaches to organizing information also tell stories, but have a number of potential advantages against purely verbal ones:

Approachability. Human beings have strong visual acuity. Furthermore, our visual language is often more universal than our words. Data presented in the form of an infographic can transcend barriers of class and culture. This is just as important for experts as for laypersons: a 2012 study of academic economists found that they made much more accurate statistical inferences from a graphic presentation of data than when the same information was in tabular form.

Transparency. The community of information designers has an ethos toward sharing their data and their code — both with one another and with readers. Well-executed examples of information design show the viewer something rather than telling her something. They can peel away the onion, build trust, and let the reader see how the conclusions are drawn.

Efficiency. I will not attempt to tell you how many words a picture is worth. But surely visualization is the superior medium in some cases. In trying to figure out how to get from King’s Cross to Heathrow Airport on the London Tube, would you rather listen to a fifteen-minute soliloquy from the bloke at the pub — or take a fifteen-second glance at Beck’s map?

But alongside the tremendous power of information design in making sense of the world is also a dark side of potentially equal magnitude, which Silver captures elegantly:

That information design is part and parcel of journalism also means that it inherits journalism’s burdens. If it’s sometimes easier to reveal information by means of data visualization, that can make it easier to deceive… What one journalist thinks of as organizing information, the next one might call censorship.

But it’s long past time to give information designers their place at the journalistic table. The ones you’ll see in this book are pointing the way forward and helping the rest of us see the world a little more clearly.

The Best Infographics of the Year: Nate Silver on the 3 Keys to Great Information Design and the Line Between Editing and Censorship | Brain Pickings.

How StatsCan lost 42,000 jobs with the stroke of a key – Macleans.ca

Ouch. More a management than a technical issue, in terms of the lack of communication and risk analysis. And possibly in part a result of reduced management and quality-control capacity following funding cuts:

Fast forward to July. StatsCan technicians were updating the Labour Force Survey computer systems. They were changing a field in the survey’s vast collection of databases called the “dwelling identification number.” The report doesn’t explain what this is, but it’s likely a unique code assigned to each of the 56,000 households in the survey so that analysts can easily track their answers over time. They assumed they only needed to make this change to some of the computer programs that crunch the employment data, but not all of them.

The changes themselves were happening piecemeal, rather than all at once, because the system that collects and analyzes the labour force survey is big, complicated and old; it was first developed in 1997. Despite being a pretty major overhaul of the computer system, the report makes it clear that the agency considered the changes to be nothing but minor routine maintenance. After updating the system, no one bothered to test the changes to see if they had worked properly before the agency decided to release the data to the public, in large part because they considered it too minor to need testing.

One of the programs that was supposed to be updated — but wasn’t — was the program that fills in the blanks when people don’t answer all the survey questions. But since technicians had changed the identification code for households in some parts of the system, but not others, the program couldn’t match all the people in the July survey to all the people in the June survey. The result was that instead of using the June survey results to update the July answers, all those households who didn’t answer the questions about being employed in July were essentially labelled as not in the labour force. With the push of a button, nearly 42,000 jobs disappeared.
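The report contains no code, but the failure mode is a classic record-linkage bug: a key changes format on one side of a merge, nothing matches, and unmatched records silently fall through to a default. A hypothetical pandas sketch of the pattern (column names and values invented):

```python
import pandas as pd

# Hypothetical illustration of the failure described above: the household
# key changes format in July, the June-July match fails, and non-respondents
# default to "not in labour force" instead of carrying forward June's answer.

june = pd.DataFrame({"dwelling_id": ["A001", "A002"],
                     "status": ["employed", "employed"]})
july = pd.DataFrame({"dwelling_id": ["HH-A001", "HH-A002"],  # new ID format
                     "status": [None, None]})                # non-respondents

merged = july.merge(june, on="dwelling_id", how="left", suffixes=("", "_june"))

# Imputation: carry June's answer forward, defaulting when there is no match.
merged["status"] = merged["status"].fillna(
    merged["status_june"].fillna("not in labour force"))

print(merged[["dwelling_id", "status"]])
# Both households land in "not in labour force" -- the jobs simply vanish.
```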

… There is a particularly illuminating passage in the report that speaks to problems of miscommunication and misunderstanding at the agency:

“Based on the facts that we have gathered, we conclude that several factors contributed to the error in the July 2014 LFS results. There was an incomplete understanding of the LFS processing system on the part of the team implementing and testing the change to the TABS file. This change was perceived as systems maintenance and the oversight and governance were not commensurate with the potential risk. The systems documentation was out of date, inaccurate and erroneously supported the team’s assumptions about the system. The testing conducted was not sufficiently comprehensive and operations diagnostics to catch this type of error were not present. As well, roles and responsibilities within the team were not as clearly defined as they should have been. Communications among the team, labour analysts and senior management around this particular issue were inadequate.”

How StatsCan lost 42,000 jobs with the stroke of a key – Macleans.ca.

Charts, Colour Palettes, and Design

[Chart: Ethnic origin based charts, NHS 2011]

As some of you may know, I have been working fairly intensely on analyzing and charting Canadian multiculturalism as seen through the 2011 National Household Survey data (not as reliable as the Census, but what we have).

In looking at how to make charts as simple and clear as possible, I came across some good design and related sites.

The above sample is illustrative of the work I am doing.

Starting with Perceptual Edge on data visualization and the advantages of simplicity: a short, clear article outlining good design principles, with some suggested colour palettes:

Practical Rules for Using Color in Charts – Perceptual Edge

For a wider choice of colour palettes, see Every ColorBrewer Scale.
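For anyone charting in Python rather than iWork, the ColorBrewer palettes also ship with matplotlib as named colormaps; a quick sketch with invented data:

```python
import matplotlib.pyplot as plt
from matplotlib import colormaps

# ColorBrewer's qualitative palettes are built into matplotlib as named
# colormaps (e.g. "Set2", "Dark2", "Paired"). The data here are invented.

categories = ["British", "French", "Chinese", "South Asian"]
values = [35, 22, 15, 28]

palette = colormaps["Set2"].colors[:len(categories)]  # ColorBrewer Set2
plt.bar(categories, values, color=palette)
plt.ylabel("Share of respondents (%)")
plt.title("Illustrative chart using a ColorBrewer palette")
plt.show()
```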

And for users of iWork, this nifty and easy to follow tutorial on how to use the “Colour Picker” effectively and create customized palettes:

Using Apple’s “Color Picker” in Pages 5, Numbers 3, & Keynote 6 (iWork 2013)

Any feedback or suggestions always welcome.

Don’t beat up Statscan for one data error – Cross

More on StatsCan from Phillip Cross, former chief economic analyst. Worth reading for some of the history and how the agency reacted to previous cuts:

People should get agitated about Statscan over substantive issues. Wring your hands that the CPI over-states price changes over long periods. Write your MP complaining that the Labour Force Survey doesn’t follow the U.S. practice and exclude 15 year olds. Take to the barricades that for an energy superpower like Canada, measuring energy exports has become a monthly adventure, routinely revised by $1-billion a month. But don’t use the July employment incident to evaluate how the statistical system is functioning overall. They messed up one data point in one series. Big deal. Anyone who lets one data point affect their view of the economy should not be doing analysis. Move along folks, nothing to see here.

Don’t beat up Statscan for one data error – The Globe and Mail.

The case of the disappearing Statistics Canada data

Good piece on Statistics Canada and the impact of some of the changes made to reduce long-standing data series:

Last year, Stephen Gordon railed against StatsCan’s attention deficit disorder, and its habit of arbitrarily terminating long-standing series and replacing them with new data that are not easily comparable.

For what appears to be no reason whatsoever, StatsCan has taken a data table that went back to 1991 and split it up into two tables that span 1991-2001 and 2001-present. Even worse, the older data have been tossed into the vast and rapidly expanding swamp of terminated data tables that threatens to swallow the entire CANSIM site. A few months ago, someone looking for SEPH wage data would get the whole series. Now, you’ll get data going back to 2001 and have to already know – StatsCan won’t tell you – that there are older data hidden behind the “Beware of the Leopard” sign. …

Statistics Canada must be the only statistical agency in the world where the average length of a data series gets shorter with the passage of time. Its habit of killing off time series, replacing them with new, “improved” definitions and not revising the old numbers is a continual source of frustration to Canadian macroeconomists.

Others are keeping tabs on the vanishing data. The Canadian Social Research Newsletter for March 2 referred to the cuts as the CANSIM Crash Diet and tallied some of the terminations:

  • For the category “Aboriginal peoples” : 4 tables terminated out of a total of 7
  • For the category “Children and youth” : 89 tables terminated out of a total of 130
  • For the category “Families, households and housing” : 67 tables terminated out of a total of 112
  • For the category “Government” : 62 tables terminated out of a total of 141
  • For the category “Income, pensions, spending and wealth” : 41 tables terminated out of a total of 167
  • For the category “Seniors” : 13 tables terminated out of a total of 30

As far as Statistics Canada’s troubles go, this will never get the same level of attention as the mystery of the 200 jobs. But, as it relates to the long-term reliability of Canadian data, it’s just as serious.
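The practical cost of the splits is that analysts must stitch the series back together themselves. A hedged sketch of that chore, with invented values (and only approximate if definitions changed between tables):

```python
import pandas as pd

# Hypothetical sketch of rejoining a series split across two tables
# (1991-2001 and 2001-present). All values are invented.

old = pd.DataFrame({"ref_date": pd.to_datetime(["1999-01-01", "2001-01-01"]),
                    "avg_weekly_wage": [600.0, 650.0]})   # 1991-2001 table
new = pd.DataFrame({"ref_date": pd.to_datetime(["2001-01-01", "2014-01-01"]),
                    "avg_weekly_wage": [655.0, 900.0]})   # 2001-present table

# Drop the overlap from the old table, then append the new one. Note the
# mismatched 2001 values: a hint that definitions may have changed.
stitched = pd.concat([old[old["ref_date"] < new["ref_date"].min()], new],
                     ignore_index=True).sort_values("ref_date")
print(stitched)
```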

Given my work using NHS data, particularly ethnic origin, visible minority and religion linked to social and economic outcomes, I am still exploring what data and linkages are available – or not.

The case of the disappearing Statistics Canada data