‘Anything would be better:’ Critics warn Ottawa’s family-reunification lottery is flawed, open to manipulation – The Globe and Mail

Almost comical if it were not for the impact on people. And it should not be surprising, given our immigration system’s emphasis on high skilled economic immigrants, that some of them should have the mathematical and technical smarts to point out the lack of randomness:

Canada’s family-reunification program is using a common spreadsheet application to select candidates as part of a process critics say is flawed and open to manipulation.

As the first step in the program, the federal government uses Microsoft Excel to randomly pick applications in its lottery, The Globe and Mail has learned. Experts have warned that using Excel to conduct such a sensitive lottery could be problematic, and that the lottery process itself may make the system less fair over all.

The Parents and Grandparents Program allows Canadians to sponsor family members for permanent-resident status. The Liberals introduced a lottery in 2017 in an effort to make the system fairer – previously, applications were accepted on a first-come, first-served basis. The program receives roughly 100,000 applications each year and selects 10,000.

Details on the lottery, obtained through an Access to Information request shared with The Globe, show a procedure carried out in just a few steps: Immigration, Refugees and Citizenship Canada (IRCC) uses Excel to assign each application a random number, then takes the first 10,000 numbers.

Excel’s method for generating random numbers is “very bad,” according to Université de Montréal computer-science professor Pierre L’Ecuyer, an expert in random-number generation. “It’s a very old generator, and it’s really not state-of-the-art.” Prof. L’Ecuyer’s research has shown that Excel’s random-number generator doesn’t pass certain statistical tests, meaning it’s less random than it appears. Under the current system, “it may be that not everybody has exactly the same chance,” Prof. L’Ecuyer said.

Excel uses pseudo-random number generators, a class of algorithms that rely on formulas to generate numbers. These generators have a key flaw – they rely on a “seed” number to kick off the mathematical process. In the case of Excel, this seed is generated automatically by the application. “If you know one number at one step,” Prof. L’Ecuyer explained, “you can compute all the numbers that will follow.”

This means the process could be exploited by someone with the right skills. It’s happened before: In 1994, IT consultant Daniel Corriveau discovered a pattern in a keno game – which uses a random numbering system – at the Casino de Montréal and won $620,000 in a single evening. An investigation later determined the game was using the same seed number at the start of each day.

Using more robust generators, such as the ones used for cryptography, may not cost the government much, either. “Cryptographic generators are free. They are on the internet,” Prof. L’Ecuyer said. “Just pick one, you need to know about it and that’s all. It’s not complicated.

“Anything would be better.”

For its part, IRCC is satisfied with its use of Excel, spokeswoman Shannon Ker said in an e-mailed statement. “We stand by this randomized selection process as a sufficient means of equal opportunity for all who look to express an interest in sponsoring their parents and grandparents.”

Others would rather see the lottery scrapped altogether. For the past two years, Igor Wolford, a data-analytics manager at Loblaws, has applied to sponsor his parents in Russia. He hasn’t made it past the lottery stage, and recently started a website to petition the federal government to abandon the system.

Mr. Wolford has corresponded with members of Parliament about his concerns. “I actually prepared an Excel sheet showing how random processes work,” Mr. Wolford said. “After 10 years of selection, only half of people who were eligible 10 years ago would be selected.”

Number of people from an original pool of 95,000 applicants who haven‘t made it past the lottery stage
Assuming 20,000 new applicants each year and 10,000 applicants selected each year

https://s3.amazonaws.com/chartprod/cZByyEpAhKuPixSBk/thumbnail.png 

Although the lottery selects roughly one in 10 applications, the number of people who pass additional vetting and ultimately make it into the program is far lower.

“Last year, they selected the original 10,000 people [during the lottery], but only 6,000 people actually [made it into the program],” Mr. Wolford said. This is partly be cause the lottery is the first step in the process, meaning anyone can fill out the form.

IRCC responded to these complaints in 2018 by including a self-assessment screening for applicants. However, the questions are still optional, as one Twitter user noted.

When told the lottery was conducted in Excel, Mr. Wolford wasn’t surprised. “That’s a very sad process. It’s easily manipulatable,” he warned. According to IRCC, the process is double-blind, and to date there is no indication the system has been manipulated.

“The process has become unpredictable,” Mr. Wolford said. “Before, you knew that it would take seven years from start to finish, and you could plan your life. Right now, you don’t know if it will happen this year, in five years, in 15 years.”

“Because it’s a lottery, you might never be selected.”

via ‘Anything would be better:’ Critics warn Ottawa’s family-reunification lottery is flawed, open to manipulation – The Globe and Mail

Nine things everyone should know how to do with a spreadsheet | Macworld

As I am starting to use spreadsheets to analyze demographic and related data, my basic knowledge of spreadsheets is being challenged. Another primer from Macworld (but applies to Excel and Google’s Sheets as well).

As I have been only using sum and average functions, these examples of other functions caught my eye:

=MAXRANGE and =MINRANGE: Return the largest and smallest values in a range. Related to these two, I also often use =RANKCELL,RANGE, which returns the rank of a given cell within the specified range.

=NOW: Inserts the current date and time, which is then updated each time the spreadsheet recalculates. In both Excel and Sheets, you need to add a set of parentheses: =NOW.

=TRIMCELL: If you work with text that you copy and paste from other sources, there’s a good chance you’ll find extra spaces at the beginning or end of some lines of text. The TRIM function removes all those leading and trailing spaces but leaves the spaces between words.

Nine things everyone should know how to do with a spreadsheet | Macworld.