Nothing you put on the Internet is a secret. Ever. Despite what assurances are made to you by YouTube, Facebook, Gmail, Amazon, or in this case, Netflix, there’s a very real possibility your personal data will leak, and it will embarrass you. That’s the hard lesson being learned by a closeted lesbian mom, who is suing the movies-by-mail company for revealing her sexuality. Except she’s got a great case, given that video rental records are actually among the most federally protected records in all the land.
In trying to create a smarter recommendation algorithm, Netflix launched a $1 million prize project and opened up its data records, “anonymously,” to contest entrants. Except that supposedly unidentifiable data of nearly a half million customers — including the rating you gave a movie, when, and the reviews you’ve posted about it — was easily matched up with other data, like that of IMDB, to reveal an anonymous Netflix user’s real identity. Reports Wired:
So it wasn’t surprising that just weeks after the contest began, two University of Texas researchers, Arvind Narayanan and Vitaly Shmatikov, identified several Netflix users by comparing their “anonymous” reviews in the Netflix data to ones posted on the Internet Movie Database website. Revelations included identifying their political leanings and sexual orientation.
Does this lesbian mom — identified only as Jane Doe in Doe v. Netflix, which is seeking $2,500 in damages for every Netflix customer — have a case? Very possibly.
But video records count among the most privacy protected records in the U.S. — a reaction to a reporter getting Supreme Court–nominee Robert Bork’s records from a video store. The lead attorney on the new suit, Joseph Malley, recently reached a multimillion-dollar settlement with Facebook over its failed Beacon program, which drew fire in part for sharing users’ Blockbuster rentals with their friends.
Meanwhile, in its mission to improve its recommendation engine, Netflix is working on a second contest. Except this time it would also make available members’ ZIP codes, ages, and genders, all but guaranteeing an increase in the number of recognizable “anonymous” customers. And all but guaranteeing more privacy violations, which is why Jane Doe’s lawsuit is asking a court to nix the second contest.
It’s understandable why your movie rental records are considered so private: Your viewing habits, like your library book check-out data, tell an awful lot about you. And because these things (movies, books) are consumed in the privacy of your home, there’s a reasonable expectation of privacy. Not only does federal statute (the Video Privacy Protection Act) make it unlawful to release such data (and some states, like Michigan, go even further), but Netflix’s own privacy policy says it won’t tell anyone if you like Brokeback Mountain or Eating Out 3.
What’s shocking, however, is that your Netflix movie rentals are given broader federal privacy protections than the arguably more personal data you upload to Facebook — all of which is very, very identifiable. And even more capable of outing you.
terrwill
The only way to write anything “anonymous” today is to break out the old pen and paper and book of stamps
dizzy spins
I’m sorry–could someone please paraphrase this post in English? I don’t understand what this contest is or how it revealed who this woman was or to whom it was revealed. Or is this a “theoretical” outing?
Robert
So wait… these people compared the ‘anonymous’ Netflix reviews to those reviews posted by (apparently) the same people on ANOTHER website that requires them to register.
It KIND of sounds like Netflix didn’t ‘accidentally’ do anything. Their data was anonymous. These groups just matched up the reviews customers copy / pasted.
Yeah – Netflix releasing MORE info might create some legitimate reason to be upset… but really Queerty; your headline places the blame on Netflix here, and they did nothing wrong.
Mike in Asheville, nee "in Brooklyn"
@No. 3 Robert:
ditto
Alexa
Much as I love to be critical of ridiculous stuff in Queerty’s writings recently, in this case it’s the lesbian who is blaming Netflix, because she’s suing them, so the headline is fine.
alex
I’m all about pointing out the “ridiculous stuff” at Queerty, too. But, I agree with Alexa in that the headline is fine.
As a doctoral student that works with anonymous data (mostly Census data), I’m appalled at the University of Texas researchers. Granted, I’m not familiar with this case; but if these two actually took the data and attempted to discover the identity of the subject, U of T should fire them or expel them from any degree program.
RomanHans
Netflix didn’t “accidentally” do anything. They handed out all their user information, withholding names and addresses.
So, I’m with #2 here: is the real story that two University of Texas researchers correlated that data with IMDB users, discovering real names and sexual orientations and announcing these to the world?
Alan
Closet cases shouldn’t be allowed to pretend that they’re hero activists. They’re parasites. Fuck ’em.
Asop
What’s a lesbian?
Paul
I’m with #8 Alan…I happen to be an “online buddy” with a gay guy who is married with 6 kids. He wants to find “love” with another man. I told him to get a divorce and come out. He said it would hurt his family. I told him it would hurt them more when they found out he was having affairs with men. He is obviously desperate for some kind of gay life.
My point: come out, join the ranks, and help us eradicate our image as being nefarious, clandestine creatures! We need everyone!
Strepsi
I agree Queerty, #2 has a good point.
Furthermore, not just the researchers are off. Also the “private” people like this mom, who posted the same review anonymously and then identifiably on other sites, no?
Isn’t this the flip side of all this “sharing” of opinions that’s going on? Once you “share” an opinion, you are no longer private, no?
MikenStL
The problem is this: let’s say she watched a porn movie and a Disney movie, then posted a comment about the Disney movie on the Internet Movie Database. The researchers were then able to use the Disney comment to figure out that she had also rented the porn. That is how Netflix inadvertently released the ‘private’ info. This would be even more problematic in the 2nd contest with the addition of zip code data.
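To make the matching step above concrete, here is a minimal sketch of that kind of cross-referencing. Everything in it is an assumption for illustration: the titles, names, IDs, dates, and the `link_reviews` helper are hypothetical toy data, not records from the real contest, and the exact-match rule is far cruder than anything the researchers actually used.

```python
# Hypothetical toy data; none of these records come from the real contest release.
# "Anonymized" release: a numeric ID replaces the name, but every rating stays attached.
released = {
    101: {"Disney Movie": (5, "2005-03-02"), "Adult Title": (4, "2005-03-09")},
    102: {"Disney Movie": (2, "2005-06-15"), "Casablanca": (5, "2005-06-20")},
}

# A review posted under an identifiable name on another public site.
public_reviews = [
    {"name": "Jane Q. Public", "title": "Disney Movie", "stars": 5, "date": "2005-03-02"},
]


def link_reviews(released, public_reviews):
    """Map each public name to the released IDs consistent with all of its reviews."""
    matches = {}
    for review in public_reviews:
        key = (review["stars"], review["date"])
        # IDs whose released rating for this title has the same stars and date.
        candidates = {
            user_id
            for user_id, ratings in released.items()
            if ratings.get(review["title"]) == key
        }
        name = review["name"]
        # Each additional review by the same name narrows the candidate set.
        matches[name] = matches[name] & candidates if name in matches else candidates
    return matches


matches = link_reviews(released, public_reviews)
for name, ids in matches.items():
    if len(ids) == 1:
        (user_id,) = ids
        # One harmless public review now exposes the whole private list.
        print(name, "->", sorted(released[user_id]))
```

In this toy case a single innocuous review is enough to tie the name to ID 101, and with it to every other title on that record.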
Jon
I don’t understand what the problem is by the way this article is written. If she posted the same review on IMDB and it wasn’t anonymous on IMDB then how did Netflix or the researchers out her? Furthermore, how does identifying the writer of an anonymous review reveal sexual orientation? Did she say in the review “I’m a big ‘ol dyke.”? Were they lesbian movies? I’m sure there are some “straight” people who watch lesbian movies. Simply watching a lesbian movie is not enough to identify someone as a lesbian.
There’s not enough information in this article for me to make an informed decision on which side I’m on in this debate.
Robert
Let’s say she posted the same item on IMDB and Netflix. Her Netflix information is still 100% anonymous. No data given to the researchers can in any way give the researchers her REAL name or any REAL information needed to identify her in a legal sense.
They can however, cross-reference Anonymous Post by Netflix User #231 with identical reviews posted to IMDB.com.
And let’s see… does IMDB.com list the real First and Last name of its posters along with home addresses, telephone numbers, etc? Nope… didn’t think so.
So what do they have? They know Netflix User #231 and “FilmMuffDiver7” are the same person. Oh yeah… that’s a lawsuit that is just destined to be thrown out.
Jon
I’m also confused by why she’s suing. Were there damages? How did the researchers publish this information and how did she find out about it? Did they actually publish her name, and were there real life consequences for her? Did she get fired? Did her parents find out? What were the real life damages she suffered as a result of this entire matter? Is simply outing someone a reason to sue? Don’t blogs do that all the time?
Keith Kimmel
Well, Netflix deserves this. AOL did exactly this same thing a few years ago and the reaction of the Internet community as a whole was very negative. I don’t know of any lawsuits filed in that incident, but Netflix should have wanted to avoid the negative publicity that AOL got.
Those who fail to learn from the mistakes of the past are condemned to repeat them. Smile and pay up, Netflix. This isn’t the first time you have been sued for doing dirty, evil shit. Hopefully the fucking courts don’t let you settle this lawsuit by mailing a pile of free month of service coupons out.
Tara
Really, they had no business distributing this data. Any developer could easily have just taken the database tables and created fake data. Algorithms are complex equations. They work with data but do not require real data. Netflix was lazy and irresponsible.
SteamPunk
When was her user data decoded? The “anonymous” Netflix user data was decoded 2 years ago. Was hers one of them or more recently? The lady seems correct in saying that Netflix shouldn’t distribute her data publicly. However, if Netflix never stopped distributing the data after it was decoded/hacked back in 2007, then the case gets more interesting to me.
Secondly, we should never, ever assume that the information we post online is anonymous. That is not true.
B
No. 6 · alex wrote, “As a doctoral student that works with anonymous data (mostly Census data), I’m appalled at the University of Texas researchers. Granted, I’m not familiar with this case; but if these two actually took the data and attempted to discover the identity of the subject, U of T should fire them or expel them from any degree program.”
I would presume the researchers did not publicize anyone’s names and were studying security issues – showing how releasing what seems to be innocuous information can lead to people being identified. If you show the risks are real, something is more likely to be done to mitigate them.
Marc
This is ridiculous. The researchers only identified a Netflix user whose ratings were similar to an IMDB user’s ratings. The only thing that proved was that the two users had similar tastes. It didn’t prove that both users were the same person, nor did it prove WHO the users actually were.
Of all the privacy issues on the web, this one is below tens of thousands of more important and egregious violations.
ossurworld
Let me get this straight (pardon the expression): if you reveal your identity through a contest or voluntarily putting information on a site, you can sue anybody who puts two and two together?
Serena
The real problem is that Netflix just announced a new Prize in August where they will be releasing ages and zip codes. This is basically an effort to stop them in their tracks as that is pretty private stuff and the grad students have shown that the data can be de-anonymized.
Brian NJ
This is a frivolous lawsuit. Now our Netflix fees have to go up to pay for this no damages boo-hoo my embarrassment lawsuit?? What are her damages? Her neighbor saw that she watches lesbian movies? How does that cost her any money! And a closet case who marries a guy is not a sainted being whose privacy is precious as jade from a mine. She may be piling up victims, and Netflix might just be the latest in a series of people whom she wants to bear the cost of her failure to accept herself.
I don’t want the courts clogged up with this nonsense. She should dump the lawyer, and buy a therapist instead who will tell her to be proud of herself, and tell her neighbor that Bound is great and she should rent it. Not every mistake is a federal offense.
Dan
A lot of people seem to have found this article difficult to understand. You do have to read between the lines a little.
The linked article says that the researchers determined several people’s sexual orientations from the information Netflix revealed. It seems pointless to argue that this couldn’t possibly happen, since apparently it already did.
What’s more, the Supreme Court has already ruled that sexual orientation is covered by the right to privacy in the famous case that knocked down the remaining antigay sodomy laws. My guess is that that ruling will influence the outcome of this case.
It isn’t correct that Jane Doe is suing because she revealed her own personal information publicly on the site. According to the suit, Netflix disclosed information that was provided to the company for its recommendation algorithm. The company then violated its own privacy policy by posting the information online.
In my experience, website movie ratings only appear in the aggregate: “This movie has received an average rating of four stars,” for example. By contrast, the information Netflix revealed was associated with individual customers, complete with a unique ID number.
This looks pretty damning: “The Plaintiffs’ and class members’ movie data and ratings, which were released without authorization or consent, have now become a permanent, public record on the Internet, free to be manipulated and exposed at the whim of those who have the Database.”
Finally, the same attorney just won a very similar suit against Facebook’s program. It sounds to me like Ms. Doe has a case.
PootieTang
whats netflix?
B
In No. 24, Dan wrote, “According to the suit, Netflix disclosed information that was provided to the company for its recommendation algorithm. The company then violated its own privacy policy by posting the information online.”
It’s a bit more complicated than that. Netflix apparently “anonymized” or “sanitized” the data it released (for example, by giving the users ids like user-1, user-2, …, which have nothing to do at all with their login names). Netflix assumed that this was sufficient to protect a user’s privacy. It turns out it isn’t. So Netflix may have violated users’ privacy, not through malice but through incompetence.
Read http://ebiquity.umbc.edu/blogger/2009/09/22/privacy-concerns-about-new-netflix-prize-data/ for a short ‘one pager’ and http://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf for a technical paper on how it works. If you look at the paper, it will be quite clear as to why Netflix marketing types could be clueless about the risks. If you aren’t mathematically inclined, you may find the paper a bit daunting.
Being clueless is, of course, not exactly the most compelling defense to use in court, so Netflix has every reason to be worried.
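For anyone who would rather see the idea than read the paper, here is a rough sketch of why swapping login names for ids does not anonymize much. It is not the method from the paper; the login names, titles, ratings, and both helper functions (`pseudonymize`, `unique_fingerprints`) are made up for illustration.

```python
from collections import Counter


def pseudonymize(records):
    """Replace login names with sequential ids (user-1, user-2, ...); ratings are untouched."""
    ordered = sorted(records.items())
    return {f"user-{i}": ratings for i, (_, ratings) in enumerate(ordered, start=1)}


def unique_fingerprints(release):
    """Count pseudonymous users whose exact set of ratings nobody else shares."""
    counts = Counter(frozenset(r.items()) for r in release.values())
    return sum(1 for r in release.values() if counts[frozenset(r.items())] == 1)


# Hypothetical records keyed by login name.
records = {
    "alice_login": {"Casablanca": 2, "Airplane II": 5, "Bound": 4},
    "bob_login": {"Casablanca": 5, "Airplane II": 1},
    "carol_login": {"Casablanca": 5, "Airplane II": 1},
}

release = pseudonymize(records)
print(sorted(release))               # the login names are gone
print(unique_fingerprints(release))  # but some rating patterns are still one of a kind
```

The names disappear, yet any user whose combination of ratings is unique can still be singled out by someone who knows a few of those ratings from elsewhere.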
Schteve
Not only is Netflix not to blame for anything here (which most readers seem to agree with), but the researchers are completely innocent as well.
The only information being used here is either completely anonymous stuff (Netflix ratings) or publicly available info (IMDb ratings). If you don’t want anything about yourself becoming public as well (such as your sexual orientation), then don’t do things like make IMDb comments on the same movies you watch from Netflix.
Babs DAngelo
Sorry–I think this person is just trying to squeeze some holiday $$$ out of Netflix. If you’re paranoid, just don’t give personal reviews. I’m straight, but i like many of the movies with “gay content”…so what? I don’t care if someone has so much time on his hands that he analyzes my reviews for whatever purpose. People also think “the government” is spying on their phone conversations. Sounds to me like an egocentric fantasy. What, “the govt.” has millions on its payroll to listen in on all our foolish conversations? Get real, folks.
hyhybt
#27: The trouble is it’s *not* just ratings they released. As I understand it, they released lists of every movie each user ever rented, which they are not supposed to do under any circumstances whatever; if any item on your list can be tied to any comment you’ve made anywhere else on the internet, you can be connected to your entire list. Do you really think that people ought to expect that by, say, rating “Casablanca” two stars and “Airplane II” five, you’re telling the world you also watched “Movie You’d Never Want Anyone To Know About, Ever”?
B
No. 20 · Marc wrote, “This is ridiculous. The researchers only identified a Netflix user whose ratings were similar to an IMDB user’s ratings. The only thing that proved was that the two users had similar tastes. It didn’t prove that both users were the same person, nor did it prove WHO the users actually were.”
Suppose you are a straight person applying for a job and the person you would be working for is a rabid homophobe. You don’t get the job because said manager checked the Netflix and IMDB databases and determined that there was an 80 percent chance that you were gay given the match. I can assure you that you would not be a happy camper, and proving what happened would not be easy. Someone ought to compensate a person put in that position.