Google Interview Questions 1

In this series I will be attempting to answer current and retired Google interview questions. As outlined in the books How Google Works and Work Rules!, Google has found that “boring,” non-riddle questions are best at predicting future performance, but some of the older questions I’ll be answering are riddles.

Question:

You are at a party with a friend and 10 people are present (including you and the friend). Your friend makes you a wager that for every person you find who has the same birthday as you, you get $1; for every person he finds who does not have the same birthday as you, he gets $2. Would you accept the wager?

Possible Answers:

(1) Trick question, the only parties I go to are ten-way birthday celebrations. I get $9 total.

(2) “Bob, I’ve been meaning to talk to you about your gambling addiction. Even at this glorious party you’re making bets. Bob, don’t you see? This isn’t a party at all, it’s an intervention.”

(3) If you’re familiar with the “birthday problem,” a popular problem given in probability classes, you’ll know that it takes 23 people to have a 50-50 chance of two people sharing a birthday. (The usual purpose of the exercise is to surprise people with the low figure of 23, which most find unintuitive.) So with 23 people the expected value of the bet is ($1*1 person - $2*21 people)*50% – ($2*22 people)*50%, or a $42.50 loss for you. There are even fewer people at this party. The chance that any two people among 10 will share a birthday is only 11.7% (1 - 365!/((365-10)! * 365^10)). Since we would expect to lose money in the case with 23 people, we would definitely expect to lose money in the case of only 10 people. (This assumes it’s possible your friend has the same birthday as you; otherwise the chances diminish further.)
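As a rough check on answer (3), here is a minimal Python sketch. It assumes 365 equally likely birthdays and that the friend counts as one of the 9 other guests. Note that the wager pays on people who share your birthday specifically, which is a slightly different (and smaller) quantity than the classic any-two-people figure, so the sketch computes both. The function names are my own.

```python
from fractions import Fraction

DAYS = 365    # assumption: birthdays are uniform over 365 days (no leap day)
OTHERS = 9    # 10 guests minus you; the friend is counted as a guest


def p_any_shared(n, days=DAYS):
    """Classic birthday problem: chance that at least two of n people match."""
    p_all_distinct = Fraction(1)
    for k in range(n):
        p_all_distinct *= Fraction(days - k, days)
    return 1 - p_all_distinct


def p_someone_matches_you(others=OTHERS, days=DAYS):
    """Chance that at least one of the other guests shares YOUR birthday."""
    return 1 - Fraction(days - 1, days) ** others


def expected_payoff(others=OTHERS, days=DAYS, win=1, lose=2):
    """Expected dollars for you: +win per match, -lose per non-match."""
    expected_matches = Fraction(others, days)
    return win * expected_matches - lose * (others - expected_matches)


print(f"any shared birthday, 10 people: {float(p_any_shared(10)):.3f}")
print(f"any shared birthday, 23 people: {float(p_any_shared(23)):.3f}")
print(f"someone shares your birthday:   {float(p_someone_matches_you()):.3f}")
print(f"expected payoff of the wager:   {float(expected_payoff()):.2f}")
```

Under these assumptions the expected payoff is a loss of a little under $18, so the conclusion stands: don’t take the bet.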

How delicate are relationships?

Recently I have been thinking of dating analytically as a stream of dates where one must build up “relationship capital” before certain pieces of information are revealed or certain negative events occur. The stream must more or less occur in a particular order. That is, if you examined a series of weekly dates over the course of a year and scrambled that order – maybe the 37th date came first, the 5th date came 9th, etc. – the outcome wouldn’t necessarily be the same.

Many of us believe that there are instances when a piece of personal information is revealed “too soon.” For example, if on a first date your companion tells you that he or she recently filed for bankruptcy it may be a “deal breaker” as you assume this reflects negatively on their level of responsibility (as well as being an “overshare”). However, if the same piece of information is revealed after you’ve been dating, say, three months you can weigh the strength of that assumption against the person you’ve come to know. Likewise, some intense external shocks can prematurely end a relationship if they occur too close to its beginning, whereas if that same event occurred after sufficient time had passed both partners would have built up enough “relationship capital” to weather the storm.

I decided to expand on this idea by creating a simple model (which I plan on elaborating over time) by assuming a couple goes on a date once a week for a year (or 52 dates over whatever time period you like). Three pieces of information or events must be revealed only after enough capital has accrued. Event 1 is low level and can only occur after Date 5, Event 2 must occur after Date 20, and Event 3, the most serious, must occur only after Date 40.

What if we kept the concept of “deal breakers” and randomly scrambled the order of the dates? How many relationships would still last a year or more (by chance) simply because events happened after enough capital was built up? It turns out that scrambling the order of the stream of dates results in a failed relationship about 88% of the time.
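The scrambling experiment can be sketched in a few lines of Python. This is one possible reading of the model, not necessarily the exact setup behind the figure above: I assume scrambling drops each of the three events on a distinct, uniformly random date slot, and that the relationship fails if any event lands on or before its threshold date. All names are hypothetical.

```python
import random

N_DATES = 52
THRESHOLDS = (5, 20, 40)  # Event k may only fall AFTER this date number


def survives(rng):
    """One scrambled year: does every event land after its threshold?"""
    # Assumption: each event ends up on a distinct, uniformly random slot.
    slots = rng.sample(range(1, N_DATES + 1), len(THRESHOLDS))
    return all(slot > limit for slot, limit in zip(slots, THRESHOLDS))


def simulate_failure_rate(trials=100_000, seed=0):
    """Monte Carlo estimate of the share of scrambled years that fail."""
    rng = random.Random(seed)
    failures = sum(not survives(rng) for _ in range(trials))
    return failures / trials


def exact_failure_rate():
    """Same quantity by counting, placing the most constrained event first."""
    ok, total = 1, 1
    for taken, limit in enumerate(sorted(THRESHOLDS, reverse=True)):
        ok *= (N_DATES - limit) - taken   # allowed slots still free
        total *= N_DATES - taken          # any slots still free
    return 1 - ok / total


print(f"simulated failure rate: {simulate_failure_rate():.3f}")
print(f"exact failure rate:     {exact_failure_rate():.3f}")
```

Under these assumptions the exact failure rate works out to roughly 87%, in the neighborhood of the figure quoted above. Different rules for how events attach to dates would shift the number somewhat.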

Of course, this is a theoretical exercise, not just because it’s impossible to scramble the order of dates, but because in practice it is us who decide if we want to be with a person despite difficult times or questionable personal sharing.

 

Steph Curry is Awesome

There have been many, many blog posts about Steph Curry’s dominance this year but let me add one more just for fun. Using data from dougstats.com, I looked into Curry’s dominance in 3-point shooting.

The average number of 3-pointers made so far this season (excluding Steph Curry) is 42. However, this isn’t exactly the group we want to compare Steph against, since we wouldn’t consider, say, DeAndre Jordan his peer in terms of 3-point responsibility. Instead, I considered guards that have played in 60+ games so far this season. After doing so the average number of 3-pointers made about doubles to 87 (again, excluding Curry). Meanwhile, Steph Curry has 343 threes and is on track to finish the season with 398, more than one hundred more than his own single-season record. Here’s how the outlier that is Steph Curry looks graphically.

Four years ago Klay Thompson would’ve been on track to beat Ray Allen’s single-season record, but because of Curry, Thompson has to settle for a distant second.

Screen Shot 2016-03-24 at 5.37.27 PM.png

Curry has been ahead of everyone else all season long and this distance only grows larger with each game as this chart shows (made with data from Basketball-Reference.com):

Screen Shot 2016-03-25 at 4.07.50 PM.png

How rare is Steph Curry’s season? If we think of creating an NBA guard as a random normal process and give it this season’s mean and standard deviation of three-pointers made, we can get a rough idea. As it turns out the distribution of threes is skewed (as you might expect), but if you squint a bit you can see the distribution of the square root of each player’s three-pointers made is approximately normal (this was revealed by using the powerTransform function in R’s car package).

Assuming the parameters above we would expect to see a “2015-2016 Steph Curry” about once every 200 seasons.
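For readers who want to see the shape of that calculation, here is a rough Python sketch. It is not the analysis behind the number above: the mean, standard deviation, and guard count below are hypothetical placeholders (the fitted values are not reported in this post), and I am assuming “once every N seasons” means at least one qualifying guard in a given season.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def seasons_per_occurrence(threes_made, mean_sqrt, sd_sqrt, guards_per_season):
    """Expected seasons between guard-seasons at least this extreme.

    Models sqrt(threes made) as normal(mean_sqrt, sd_sqrt) and treats the
    guards in a season as independent draws (both are assumptions).
    """
    z = (math.sqrt(threes_made) - mean_sqrt) / sd_sqrt
    p_single = normal_upper_tail(z)
    if p_single >= 1.0:
        return 1.0
    # P(at least one of the guards clears the bar), computed stably.
    p_any = -math.expm1(guards_per_season * math.log1p(-p_single))
    return math.inf if p_any == 0.0 else 1.0 / p_any


# Hypothetical placeholder inputs, NOT the values fitted from dougstats.com.
HYPOTHETICAL_MEAN_SQRT = 9.0
HYPOTHETICAL_SD_SQRT = 3.0
HYPOTHETICAL_GUARDS = 60

print(seasons_per_occurrence(398, HYPOTHETICAL_MEAN_SQRT,
                             HYPOTHETICAL_SD_SQRT, HYPOTHETICAL_GUARDS))
```

Plugging in the actual mean and standard deviation of the square-root-transformed data would be needed to reproduce the once-in-200-seasons figure.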

Screen Shot 2016-03-25 at 7.33.10 PM.png

Here is Curry’s shot chart so far this season (using data from stats.NBA.com and this tutorial from The Data Game):

Screen Shot 2016-03-25 at 10.14.03 PM.png

Will a 16 seed ever beat a 1 seed?

That question arose during a recent dinner with my friend Graham at a popular pizza restaurant in Seattle. Tony Kornheiser of PTI said last week that it will never happen. Graham agrees. I think it will happen in our lifetime. It has almost happened a number of times already.

It seemed to me from casual observation like 16 seeds are getting closer to winning on average, but I decided to check this by plotting the point differential of higher-seeded teams during the first round of the NCAA tournament (in the first round there are four 1-vs-16 games). Indeed, compared to many other matchups the 1-vs-16 matchup has shifted greatly over time.

Screen Shot 2016-03-19 at 12.03.18 AM.png

The margin of victory is still substantial, between 10 and 15 points so far this decade, but I remain confident that, let’s say, sometime in the next 40 years it will happen.  There have already been eight 15 seed victories over number 2 seeds and twenty-one 14 seed wins over 3 seeds. The situation looks even better when you consider the closest 1-vs-16 game during each tournament (since we only need one 16 seed to win).

Screen Shot 2016-03-22 at 1.36.38 PM.png

It seems that about two to three times every decade there are relatively close 1-vs-16 games and about once a decade there is an extremely close game (decided by just a couple of points). The 2000s did not fare well for 16 seeds.

I think the outcomes of close games are more stochastic than most. Leadership attribution bias seems to turn these stochastic events into narratives of late-game heroics and we’re prone to say that the 1 seed is more poised and resistant to pressure than players at smaller schools. Of course the history of the tournament has shown us many, many exceptions to this rule (if it’s a rule at all). At tension with this narrative is the story of the underdog that is just happy to be at the tournament and has nothing to lose, playing loose, having fun, and playing “to win” while the nervous champion is playing “not to lose.” So to me many of these games are closer to a coin flip than we like to think and given enough coin flips a tail is bound to come up eventually.

Also, think about it this way: How much difference is there between a 1 seed and a 2 seed, and between a 15 seed and a 16 seed? As you know if you ever watch the tournament seeding show there isn’t much of a difference. The lowest ranked number 1 seed isn’t much better – and could be worse – than some of the number 2 seeds, and eight times has a two seed lost in the first round. Of course you could argue the 2 seeds that lost should have actually been 3 seeds, although this year Michigan State lost as a number 2 seed and many considered them to be a favorite in the entire tournament (by some measures this is the biggest tournament upset ever). The larger point is that seeding is also somewhat stochastic and the question “can a 16 seed beat a 1 seed” is really the question of whether an overmatched team can exhibit a one-time victory over an opponent that is much more dominant on average. And we already know the answer to this question is “yes.”

To give the other side its due, since 1979 when seeding began, 22 of the 37 NCAA Tournament winners have been 1 seeds, so at least some of the top seeds are properly ranked and 16 seeds will have a tough time beating them when ranking is accurate.

It’s also a question as to why the decrease in point margin has occurred. Keep in mind the plot above is just a second-order trend line, although if you plot the underlying year-by-year margin it does follow the trend on average (of course). It’s interesting that the late 1980s and early 1990s were a time of lower point differential for 16-vs-1 games and that this period also correlated with three losses for 2 seeds in the first round (1991, 1993, 1997). Likewise the past few years have seen another decrease in 16-vs-1 game point differential and another string of 2-seed losses, two in 2012 and one each in 2013 and 2016. Similarly, between 1986 and 1999 there were thirteen times that a 14 beat a 3, and since 2010 this has occurred another six times (the intervening years saw only two instances of this in 2005 and 2006). These two periods also correspond to the highly touted recruiting classes of Michigan in 1991 (famously nicknamed the “Fab Five”) and Kentucky’s 2013 “mega class.” It may be that there are episodic shifts in recruiting that systematically leave certain types of talent on the table for smaller schools to cull and develop. (Of course, it may be I’m just seeing patterns where none exist.)

My recent memory is that although there are a few good schools that still get the best players, smaller schools have seen that if they recruit good players (particularly good shooters or traditional big men) and play as a team they have a chance at beating anyone in the first rounds and perhaps going deep into the tournament. This increases their confidence and performance. This is another reason I think a 16 seed will eventually beat a 1, because the recent years of the tournament have expanded our imagination of what is possible. Think about the well-known phenomenon of a 12 seed always beating a 5 seed. How nervous do you think 5 seeds are every year? How much of this consistency in a 12 beating a 5 is self-reinforcing and due to the 5 seeds having “the jitters” and the 12 seeds being relatively overconfident?

S.L.U.T. Shaming

Just today I found this 2014 Matthew Yglesias article from Vox that mentions the dismal ROI of streetcars. It turns out Matt has written lots of articles on streetcars and public transportation more broadly (see this and this, for example), often arguing that buses are a better solution (partly because of cost). That in turn led me to relisten to this EconTalk episode with Bent Flyvbjerg on megaprojects, which I highly recommend.

In recent years Seattle has opened two new streetcars, which appear to me to be totally useless, and I was happy to see Matt agreed in general. One of the strangest things is that the streetcars run on normal surface streets for portions of their routes, which means they’re subject to the same traffic delays as cars…and buses.

It led me to look up some data on Seattle’s South Lake Union Trolley (affectionately known in Seattle as the S.L.U.T.).

Even a glance at the Wikipedia page presents trouble. Built in 2008, the S.L.U.T only carries 2,200 people per day, but has a capacity of 12,600, meaning daily ridership is less than 20 percent of capacity. Second, ridership has gone down (!) several hundred riders from the peak years 2011-2013. The South Lake Union area of Seattle is home to Amazon and many other companies and is being built up extremely quickly (new condos all around the area). Traffic there is simply horrific, which makes the ridership numbers even worse since it means people are using public transportation less as traffic worsens.

Matt suggested buses are a better alternative, which seems intuitive after you see how these trolleys operate. The S.L.U.T ended up costing $56 million (it’s just 1.3 miles long), and again as Matt suggests seems much more geared toward making the area seem hip and cool and shuttling around Amazonians than it does toward decreasing congestion. Indeed, this Seattle PI article details some of the political aspects of the plan, notably that “Paul Allen’s Vulcan owns 60 acres in the neighborhood, much of it along the streetcar line.” Vulcan was a major proponent of the project. A prominent selling point to local businesses was that property values would go up $100,000 or more.

I ran some simple numbers to compare the S.L.U.T. with a comparable investment in buses. Out of the $56 million spent on the S.L.U.T., $25 million was fronted by South Lake Union property owners (again, suggestive of why it was built in the first place), money that certainly wouldn’t have been available had buses been purchased instead. This leaves a potential $31 million budget. Buses cost around $500,000 for diesel options and as much as $1 million for newer battery-powered coaches.

Running some numbers from King County Metro gives an average of 210 passengers per bus during weekdays. Even if Seattle went with the more expensive battery-powered buses and left $16 million for operating costs (meaning the city purchased 15 buses), daily capacity would still be 1,000 passengers more than ride the S.L.U.T. daily. And even this figure is misleading because the 210 number is based on people that actually ride the bus, not total capacity, which is likely much higher.
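The back-of-the-envelope comparison can be written out as a short Python sketch. All inputs are the figures quoted in this post; the 15-bus scenario and the $1 million battery-bus price are the assumptions stated above, and the variable names are my own.

```python
# Figures quoted in the post
TOTAL_COST = 56_000_000            # S.L.U.T. construction cost
PROPERTY_OWNER_SHARE = 25_000_000  # fronted by South Lake Union owners
BATTERY_BUS_PRICE = 1_000_000      # upper-end price per bus
RIDERS_PER_BUS = 210               # average weekday riders per Metro bus
STREETCAR_DAILY_RIDERS = 2_200
STREETCAR_CAPACITY = 12_600


def bus_scenario(n_buses):
    """Compare n battery buses against the streetcar's daily ridership."""
    budget = TOTAL_COST - PROPERTY_OWNER_SHARE
    purchase_cost = n_buses * BATTERY_BUS_PRICE
    daily_riders = n_buses * RIDERS_PER_BUS
    return {
        "budget": budget,
        "left_for_operations": budget - purchase_cost,
        "daily_riders": daily_riders,
        "extra_riders_vs_streetcar": daily_riders - STREETCAR_DAILY_RIDERS,
    }


scenario = bus_scenario(15)
utilization = STREETCAR_DAILY_RIDERS / STREETCAR_CAPACITY

print(scenario)
print(f"streetcar utilization: {utilization:.1%}")
```

With 15 buses the sketch gives 3,150 daily riders, or 950 more than the streetcar carries, which is the “1,000 passengers more” rounded.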

What’s more, buses have other advantages. For example, you don’t have to block roads for months at a time to construct tracks within the road. And since they don’t run on tracks, buses are obviously more geographically flexible. Additionally, the average bus route is much more than 1.3 miles in length and buses serve poorer areas, which South Lake Union certainly is not.

Politically, though, this was never going to happen. Seattle buses are run by King County Metro Transit, but the S.L.U.T. was paid for by the Seattle Department of Transportation.

Why I’m not voting for president, but plan on complaining anyway

“If you don’t vote you don’t have the right to complain,” the saying goes. Alas, I don’t plan on voting, but I have no problem complaining anyway.

Of course the word “right” here is not used to denote a legal construct, but rather a cosmic one. A sort of you-got-what’s-coming-to-you quid pro quo. You didn’t marry LeeAnne so now you have no right to complain about being 35 and alone, or so my mother tells me.

But just as surely as cosmic rights exist, we’re all guilty of violating them every day, especially when it comes to complaining when perhaps we know better. I might, for instance, procrastinate on the job by harassing my friends to “get out the vote” only to later complain about having to work late. Or I might refuse to take public transportation and then complain about all the traffic I have to sit in. And don’t get me started on all the wasted time we spend lamenting our relationship misadventures, so often the result of our own design.

So for starters even if I have no right to complain about who’s president if I didn’t vote, I’m going to complain anyway because that’s what we humans do. It happens in all kinds of settings, why should politics be any different? Indeed, it’s notable that you never hear the phrase “If you don’t vote you have no right to celebrate.” It seems there is something deeply human and seductive about complaint.

But the example above regarding public transportation points out another flaw in the not-voting-equals-no-complaining calculus. Even if you decided to take the bus instead of driving it wouldn’t really do anything to mitigate the amount of traffic in your city. It’s not your single car that’s causing congestion after all. You should feel guiltless when complaining about traffic because – unless you happen to be the head of your city’s transportation department – there is quite literally nothing you as an individual can do to reduce traffic even if, paradoxically, you happen to be part of the problem.

By now you have likely unveiled my public-transportation-is-really-voting allusion. I’m sure you’ve heard it before, but stick with me for a moment. That’s right, I’m sorry people but your vote just doesn’t matter. Let me clarify. Your vote actually does matter in all kinds of important ways. It allows you to express your preferences through our democratic process, to align yourself with politicians you believe to be sensible if not always wholly upstanding, to signal to the world how civic minded you are as you stroll into the after-work cocktail party with an “I voted” sticker affixed to your lapel. It’s just that your vote doesn’t matter for the outcome of the election itself.

I hesitate to call this position anything other than fact. It has been shown both by mathematical calculations and by historical evidence. In truth the probability of an election being decided by your vote alone is not absolutely zero. The probability that you’ll be struck by lightning isn’t zero either, but you should probably go about living as if it were.

Why should my right to complain hinge on something so superfluous as a vote?

“But what about Florida?” you ask. Yes, in 2000 the U.S. presidential election was decided by a mere 537 votes. These five hundred votes might as well have been five million though because both numbers are larger than zero, the count difference it would take for your vote to decide the election.

“But what if everyone thought the way you do?” you retort. Well, in that case we’d be in trouble. But everyone doesn’t think the way I do, which is why this piece is likely to draw your ire. If you’re the type of person that organizes the masses to get out and vote then you might matter a lot for an election, but your vote matters very little.

And while we’re at it — no, not voting is not the same thing as voting for Donald Trump. You can bet if Drumpf becomes president I’ll do plenty of complaining, not least because I don’t want my next trip to the White House to involve being blinded by the sun’s reflection off a gold-plated North Lawn.

The situation is even rosier for the would-be kvetch though because not only does voting not matter, but the president doesn’t matter that much either. Now comes the exciting part because I get to reference my favorite kind of bias, aptly-titled “leadership attribution bias.” In short, the president is a manager like any other: they get all of the credit when things go well and none of the blame when things go poorly. A cheap shot I know.

When you credit (or blame) the president you’re really referencing U.S. political institutions more broadly, and you have even less control over those than you have over who the next president is. The president is buffeted by all kinds of institutional and political forces: House and Senate constituencies, tit-for-tat political horse trading, the actions of both rogue and friendly nations, state and local policy, regulatory agencies, the judiciary, and the vacillating will of the American public to name a few.

The average American political scientist thinks the president matters much less than the average American citizen. Maybe they’re out of touch or overly wonkish, or maybe they’re better at understanding the complexities and constraints of the modern American presidency.

I haven’t even mentioned the fact that one might be disinclined to vote simply because none of the candidates in our not-so-diverse, two-party system fit the bill. Now that’s something to complain about.

Nor am I fond of the idea of absorbing the marginal voter into the presidential election decision simply because it’s everyone’s civic duty to vote. If someone is ignorant, let them abstain. It’s probably better than tackling a crash course in U.S. politics days before an election. And while you’re at it, when they’re forced to switch healthcare providers let them complain. The distance between their abstention and healthcare troubles is lightyears.

There are plenty of reasons not to vote. And there are certainly plenty of reasons to complain about policy outcomes. Abstention may seem foolish because it puts a decision that could be ours in the hands of another. But if we have a cosmic right to anything, it’s to complain despite our own foolishness.

[Relax. It’s intentionally incendiary people.]

How much disagreement is there about statistics?

So much that just this year the American Statistical Association put out a 12-page manuscript about p-values and it took them a year of discussion(!) before the manuscript was complete.

See also this very short 2006 article by Andrew Gelman and Hal Stern, The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant:

The error we describe is conceptually different from other oft-cited problems—that statistical significance is not the same as practical importance, that dichotomization into significant and nonsignificant results encourages the dismissal of observed differences in favor of the usually less interesting null hypothesis of no difference, and that any particular threshold for declaring significance is arbitrary…

In making a comparison between two treatments, one should look at the statistical significance of the difference rather than the difference between their significance levels. [Emphasis added].

And this related 2011 paper by Nieuwenhuis, Forstmann, and Wagenmakers, Erroneous analyses of interactions in neuroscience: a problem of significance, which found that half of the 160 papers reviewed, which all appear in top academic journals, used the wrong statistical procedure when evaluating p-values.
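The point in the Gelman and Stern passage can be illustrated with a small Python sketch. The estimates and standard errors below are illustrative numbers chosen for the example, and the calculation assumes the two estimates are independent and approximately normal.

```python
import math


def two_sided_p(estimate, std_err):
    """Two-sided p-value for a normal z-test of estimate against zero."""
    z = abs(estimate) / std_err
    return math.erfc(z / math.sqrt(2))


# Illustrative numbers: two treatment effects with equal standard errors.
effect_a, se_a = 25.0, 10.0
effect_b, se_b = 10.0, 10.0

p_a = two_sided_p(effect_a, se_a)
p_b = two_sided_p(effect_b, se_b)

# The comparison that matters: the difference and ITS standard error,
# assuming the two estimates are independent.
diff = effect_a - effect_b
se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
p_diff = two_sided_p(diff, se_diff)

print(f"A vs zero: p = {p_a:.3f}")
print(f"B vs zero: p = {p_b:.3f}")
print(f"A vs B:    p = {p_diff:.3f}")
```

Here A clears the conventional 0.05 threshold and B does not, yet the test of A against B comes nowhere near it. Concluding that A “works” and B “doesn’t” would be exactly the error the quote describes.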

One-Sentence Reviews

Bosch
Slowly absorbing the cast of The Wire.

Bridge of Spies
For some reason I thought: Mr. Holland’s Opus meets the Cold War.

11.22.63
So far the show is horrible and much worse than the book, which I would only describe thus far as “fine.”

The 100 (Season 3)
What is even happening and who the f cares?

The Everything Store
Amazon employees used to go to Toys “R” Us during the holiday season, stock up on sold-out items, and then resell them through Amazon.com. Lolz.

Love
Liked it, but didn’t LOVE it. Get it?

Why foreign policy is difficult

Dear Excellency and friend,

I thank you very sincerely for your letter and for your offer to transport me towards freedom. I cannot, alas, leave in such a cowardly fashion.

As for you and in particular for your great country, I never believed for a moment that you would have this sentiment of abandoning a people which has chosen liberty. You have refused us your protection and we can do nothing about it. You leave us and it is my wish that you and your country will find happiness under the sky.

But mark it well that, if I shall die here on the spot and in my country that I love, it is too bad because we are all born and must die one day. I have only committed the mistake of believing in you, the Americans.

Please accept, Excellency, my dear friend, my faithful and friendly sentiments. Sirik Matak.

I was made aware of that letter by the movie Don’t Think I’ve Forgotten: Cambodia’s Lost Rock and Roll, which I saw last night at the Seattle Asian American Film Festival. As the title suggests the film focuses on music in Cambodia before and during the Vietnam War, and its loss after the rise of the Khmer Rouge.

The South Vietnamese and Cambodians were in an impossible position at that time. American opposition to the Vietnam War forced a withdrawal in 1973, though indirect US intervention lasted until 1975. In Cambodia the Khmer Rouge took over and exacted an unspeakable toll. This was especially hard for many Cambodian people because it was a civil war: Cambodian-on-Cambodian violence. During a private and especially emotional Thanksgiving dinner I heard a close friend’s mother recount her life under the Khmer Rouge. I don’t know how anyone had the strength to survive.

The situation is even more complicated, because the rise of the Khmer Rouge itself was a response to American bombing, which itself was a response to fear that communists from Vietnam would overtake part or all of Cambodia. There are many other twists and turns in this story that underscore the complexity and tragedy of foreign intervention and of not intervening.