‘Hawking index’ charts which bestsellers are the ones people never read

Fun example of innovative analysis (and for all those of you who claim to read Piketty or other similar tomes):

Jordan Ellenberg, a mathematician at the University of Wisconsin, Madison, has just about proved this suspicion correct.

In a cheeky analysis of data from Kindle e-readers, Mr. Piketty’s daunting 700-page doorstopper emerged as the least read book of the summer, according to Prof. Ellenberg, who calls his ranking the Hawking Index in honour of Mr. Hawking’s tome, famous as the most unread book of all time.

As a result, he is tempted to rename it the “Piketty Index,” because Mr. Piketty scored even worse than Mr. Hawking.

As such, both stand as extreme case studies in aspirational reading. Like the Economist magazine’s Big Mac index of hamburger prices around the world, which is both silly and serious, Prof. Ellenberg’s Hawking Index is funny, in that it reveals the vanity of many book choices. But it also offers an interesting psychological perspective on reading that is born of good intentions, and dies of boredom on the dock or beach.

The calculation is simple, and as Prof. Ellenberg says, “quick and dirty.” It exploits a feature of Kindle that allows readers to highlight favourite quotes. It averages the page number of the five most highlighted passages in Kindle versions, and ranks that as a percentage of the total page count. Although it does not measure how far people read into a book, it makes a decent proxy for it.
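As a rough illustration of the calculation described above, here is a minimal Python sketch. The function name `hawking_index`, its parameters and the input format (a list of page numbers ordered from most to least highlighted) are assumptions made for this example, not Prof. Ellenberg's actual code, and the numbers in the usage lines are invented.

```python
from statistics import mean
from typing import Optional, Sequence


def hawking_index(highlight_pages: Sequence[int],
                  total_pages: int,
                  top_n: int = 5) -> Optional[float]:
    """Average page of the most-highlighted passages, as a percentage of book length.

    highlight_pages: page numbers of highlighted passages, ordered from most
    to least popular (a hypothetical input format chosen for this sketch).
    Returns None when there are no highlights, since the index is then undefined.
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    top = list(highlight_pages[:top_n])  # keep only the top_n most-highlighted passages
    if not top:
        return None
    return 100.0 * mean(top) / total_pages


# Made-up book: 700 pages, with its five top highlights all near the start.
print(round(hawking_index([3, 8, 12, 20, 26], 700), 1))  # 2.0
# A book with no highlighted passages cannot be ranked.
print(hawking_index([], 300))  # None
```

A low score only says that the popular highlights cluster near the front of the book. As the article notes, that is a proxy for how far people read, not a direct measurement.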

“Why do you buy a book? One reason is because you know you’re going to like it,” Prof. Ellenberg said. “Another reason might be, ‘Oh, I think this book will be good for me to read.’”

….. He said his formula illustrates what mathematicians call the problem of inference, meaning he cannot say for sure these books are going unread, just that he has strong evidence for it.

“You can make some observation about the world, but there’s some underlying fact about the world that you’d like to know, and you want to kind of reverse engineer. You want to go backwards from what you observed to what you think is producing the data you see,” he said.

Other books reveal different insights into why people buy books they start but do not finish. Michael Ignatieff’s political memoir Fire And Ashes, for example, scores comparatively well for non-fiction at 44%, far better than Hillary Clinton’s Hard Choices, which barely cracked 2%. Lean In, the self-help book by Facebook executive Sheryl Sandberg, scored 12.3%.

In fiction, The Luminaries, by Canadian-born New Zealand author Eleanor Catton, which won last year’s Man Booker Prize, scores a mere 19%, and would score a lot lower if not for one highlighted quote near the end.

Prime Minister Stephen Harper’s book on hockey, A Great Game, curiously has no highlighted passages, so cannot be ranked on the Hawking Index (or, equivalently, ranks as low as is theoretically possible).

Fiction tended to score higher, likely reflecting the tendency for non-fiction authors to put quotable thesis statements in the introduction. The only novel that was down in the range of the non-fiction books was Infinite Jest by David Foster Wallace.

Prof. Ellenberg does not mean to disparage the low ranking books, he said, noting that the reason people buy them in the first place is that they are rich in content.

“I think it’s good to do back of the envelope computations as long as you do them with the appropriate degree of humility, and understand what it is that they’re saying,” he said. “I think any statistical measure you make up, you take it as seriously as it deserves to be taken.”

Better data alone won’t fix Canada’s economy – The Globe and Mail

Good piece on the need for a broader and more thoughtful approach to the use of data in a “big data” environment:

The bottom line is that being data-dependent doesn’t mean responding to every wiggle in data. Nor does it mean basing our decisions solely on data or models and nothing else. Yes, we need better data.

But that’s only a start. We also need to ask precise and well-posed questions – of ourselves in our analysis and of our policy makers in their choices – particularly as “big data” increases the availability of non-conventional data sources. In addition, we need to bring new approaches to bear on data, and clearly explain the results to non-specialists.

After concerns about jobs estimates during the Ontario election, let’s hope that the lesson learned by our politicians is not to withhold economic analysis in future campaigns. Instead, let’s hope it causes them to raise their game by presenting more credible analyses.

At the same time, let’s be realistic about what better data can accomplish. This means acknowledging that data give us imprecise measurements of reality, but when used responsibly and creatively, they help us make better choices and hold governments to account for their policy decisions.

Why Canada has a serious data deficit

More on the importance of good data by Barrie McKenna in the Globe’s Report on Business:

Prof. Gross [the C.D. Howe Institute researcher responsible for their study on Temporary Foreign Workers and their effect on increasing unemployment in AB and BC] acknowledged that perfect data is “very costly.”

So is bad data.

Employment Minister Jason Kenney recently imposed a moratorium on the use of temporary foreign workers in the restaurant industry, following embarrassing allegations of misuse by some McDonald’s franchises and other employers. And he has promised more reforms to come.

But who is to say that restaurants need imported foreign labour any less than hotels or coal mines, which are unaffected by the moratorium? And without better information, Mr. Kenney may compound his earlier decision to expand the program with an equally ill-considered move to shrink it.

The government’s troubles with the temporary foreign workers program are a classic case of bad data leading to dubious decision-making. Until recently, the government has relied on inflated Finance department job vacancy data, compiled in part by tracking job postings on Kijiji, a free classified-ad website. Statscan, meanwhile, was reporting that the national job vacancy rate was much smaller, and falling.

The problem goes way beyond temporary foreign workers. And it’s a data problem of the government’s own making. Ottawa has cut funds from important labour market research, slashed Statscan’s budget more savagely than many other departments, and scrapped a mandatory national census in favour of a less-accurate voluntary survey.

The Canadian government has demonstrated “a lack of commitment” to evidence-based decision-making and producing high-quality data, according to a global report on governance released last week by the Bertelsmann Foundation, a leading German think tank. The report ranked Canada in the middle of the pack and sliding on key measures of good governance compared with 40 other developed countries.

One of the disadvantages of being in government for almost 10 years is that decisions which may have appeared to be cost-free can come back and haunt you.

Why Canada has a serious data deficit – The Globe and Mail.

Meet Joe/Jose/Youssef Canada

A good overview of Canada from the voluntary National Household Survey. While the survey is not as accurate as the Census cancelled by the current government (higher cost for poorer quality data, less comparability with previous data), it captures the major trends at the national and provincial levels.

Advice for Policy Makers and Researchers

While these lists were written to help government scientists and policy makers better understand each other, both are very good in their own right. Compiled by British and Australian policy makers and researchers, they capture well the dynamic between the technical expert and the more general policy advisor roles and perspectives, and tap into the themes of ideology, evidence and risk explored in Policy Arrogance or Innocent Bias: Resetting Citizenship and Multiculturalism.

Good reading both within the public service and at the political level, given some of the ongoing tensions regarding evidence and anecdote and how the different perspectives play out.

Top 20 things scientists need to know about policy-making

  1. Making policy is really difficult
  2. No policy will ever be perfect
  3. Policy makers can be expert too
  4. Policy makers are not a homogenous group
  5. Policy makers are people too
  6. Policy decisions are subject to extensive scrutiny
  7. Starting policies from scratch is very rarely an option
  8. There is more to policy than scientific evidence
  9. Public opinion matters
  10. Economics and law are top dogs in policy advice
  11. Policy makers do understand uncertainty
  12. Parliament and government are different
  13. Policy and politics are not the same thing
  14. The UK has a brilliant science advisory system
  15. Policy and science operate on different timescales
  16. There is no such thing as a policy cycle
  17. The art of making policy is a developing science
  18. ‘Science policy’ isn’t a thing
  19. Policy makers aren’t interested in science per se
  20. ‘We need more research’ is the wrong answer

Top 20 things politicians need to know about science

  1. Differences and chance cause variation
  2. No measurement is exact
  3. Bias is rife
  4. Bigger is usually better for sample size
  5. Correlation does not imply causation
  6. Regression to the mean can mislead
  7. Extrapolating beyond the data is risky
  8. Beware the base-rate fallacy
  9. Controls are important
  10. Randomisation avoids bias
  11. Seek replication, not pseudoreplication
  12. Scientists are human
  13. Significance is significant
  14. Separate no effect from non-significance
  15. Effect size matters
  16. Data can be dredged or cherry picked
  17. Extreme measurements may mislead
  18. Study relevance limits generalisations
  19. Feelings influence risk perception
  20. Dependencies change the risks


1921 census provides a glimpse into Toronto’s multicultural past

A reminder that Canada’s diversity has a long history, and of the value of a consistent national census.

1921 census provides a glimpse into Toronto’s multicultural past | Toronto Star.

To restore faith in Statscan, free the Chief Statistician

Munir Sheikh, the former Chief Statistician of Canada, on the case for a more independent Statistics Canada to help improve trust in the quality of their reports.

To restore faith in Statscan, free the Chief Statistician – The Globe and Mail.

Canada’s voluntary census is worthless. Here’s why – The Globe and Mail

Another illustration of the effects of the move to a voluntary census.

The cost of scrapping the long-form census – Beyond The Commons, Capital Read – Macleans.ca

National Household Survey: Canada’s immigrant population surges | Canada | News | National Post

One of the better infographics on the 2011 National Household Survey.
