Google Copyright Transparency Report

Google timed a nice Friday evening release for an update to their policy toward copyright infringement.

Starting next week, we will begin taking into account a new signal in our rankings: the number of valid copyright removal notices we receive for any given site. Sites with high numbers of removal notices may appear lower in our results.

Wow. Sounds like trouble. Surely that means that YouTube's rankings are about to get torched.

Oh, nope. One quick exemption for the video king:

This data presents information specified in requests we received from copyright owners through our web form to remove search results that link to allegedly infringing content. It is a partial historical record that includes more than 95% of the volume of copyright removal requests that we have received for Search since July 2011. It does not include:

  • requests submitted by means other than our web form, such as fax or written letter
  • requests for products other than Google Search (e.g, requests directed at YouTube or Blogger)
  • requests sent to Google Search for content appearing in other Google products (e.g., requests for Search, but specifying YouTube or Blogger URLs).

Google does not state where the thresholds will be set & grants blanket immunity for themselves, yet they (illegitimately) emphasize that they are being transparent.

Only copyright holders know if something is authorized, and only courts can decide if a copyright has been infringed; Google cannot determine whether a particular webpage does or does not violate copyright law. So while this new signal will influence the ranking of some search results, we won’t be removing any pages from search results unless we receive a valid copyright removal notice from the rights owner. And we’ll continue to provide "counter-notice" tools so that those who believe their content has been wrongly removed can get it reinstated. We’ll also continue to be transparent about copyright removals.

YouTube vs Sites Cleaner Than YouTube

Courts have ruled that embedding a YouTube video is not copyright infringement. The EFF has mentioned that embedding a video is simply a link.

And yet, a UK student faces up to 10 years in jail in the US for founding a crowdsourced site which links to sites that allow you to watch TV online.

Kim DotCom suffered a militant raid on his house & had his assets frozen for running MegaUpload, which was a tiny speck of dirt compared to the size of YouTube.

On the copyright front YouTube was rotten from the start:

  • "In a July 19, 2005 e-mail to YouTube co-founders Chad Hurley and Jawed Karim, YouTube co-founder Steve Chen wrote: 'jawed, please stop putting stolen videos on the site. We’re going to have a tough time defending the fact that we’re not liable for the copyrighted material on the site because we didn’t put it up when one of the co-founders is blatantly stealing content from other sites and trying to get everyone to see it.'"
  • "Chen twice wrote that 80 percent of user traffic depended on pirated videos. He opposed removing infringing videos on the ground that 'if you remove the potential copyright infringements... site traffic and virality will drop to maybe 20 percent of what it is.' Karim proposed they 'just remove the obviously copyright infringing stuff.' But Chen again insisted that even if they removed only such obviously infringing clips, site traffic would drop at least 80 percent. ('if [we] remove all that content[,] we go from 100,000 views a day down to about 20,000 views or maybe even lower')."
  • "In response to YouTube co-founder Chad Hurley’s August 9, 2005 e-mail, YouTube co-founder Steve Chen stated: 'but we should just keep that stuff on the site. I really don’t see what will happen. what? someone from cnn sees it? he happens to be someone with power? he happens to want to take it down right away. he get in touch with cnn legal. 2 weeks later, we get a cease & desist letter. we take the video down.'"
  • "A true smoking gun is a memorandum personally distributed by founder Karim to YouTube’s entire board of directors at a March 22, 2006 board meeting. Its words are pointed, powerful, and unambiguous. Karim told the YouTube board point-blank:
    'As of today episodes and clips of the following well-known shows can still be found: Family Guy, South Park, MTV Cribs, Daily Show, Reno 911, Dave Chapelle. This content is an easy target for critics who claim that copyrighted content is entirely responsible for YouTube’s popularity. Although YouTube is not legally required to monitor content (as we have explained in the press) and complies with DMCA takedown requests, we would benefit from preemptively removing content that is blatantly illegal and likely to attract criticism.'"
  • "A month later, [YouTube manager Maryrose] Dunton told another senior YouTube employee in an instant message that 'the truth of the matter is probably 75-80 percent of our views come from copyrighted material.' She agreed with the other employee that YouTube has some 'good original content' but 'it’s just such a small percentage.'"
  • "In a September 1, 2005 email to YouTube co-founder Steve Chen and all YouTube employees, YouTube co-founder Jawed Karim stated, 'well, we SHOULD take down any: 1) movies 2) TV shows. we should KEEP: 1) news clips 2) comedy clips (Conan, Leno, etc) 3) music videos. In the future, I’d also reject these last three but not yet.'"

Broader Copyright Questions

There still are a lot of murky questions in Google's "transparency."

  • If a person embeds an image from Imgur, ImageShack, TinyPic, PhotoBucket or elsewhere & the page that has the hotlink gets a DMCA, how does that count?
  • If a brand is large enough does it take many DMCAs to get hit?
  • Is there any analysis of the underlying business model of the site? What happens to document storage sites like DocStoc & Scribd, or even image sites like Pinterest?
  • What happens to sites that link at penalized sites too frequently?
  • What happens to ad networks that frequently fund such copyright violations?

HUGE Impact on the Web

Has anyone registered & yet? ;)

In terms of impact on the web for publishers, this change is every bit as big as Florida, Panda & Penguin. It may not seem so at first (as it will take time for market participants to consider the uses) but this is a huge deal. Consider some of the following scenarios...

  • You try to create something like YouTube for another form of content (Pinterest?) and it gets hit as spam for following Google's lead.
  • You offer a free blogging platform that competes with Blogspot, but it gets hit as spam for following Google's lead.
  • You decide to create a project like Google's book scanning project & you get hit as spam for following Google's lead.
  • You run an ad network & start growing quickly. As you grow some sketchier publishers enter your ad network. Like Google AdSense, a large portion of your ad network is filled with sites that have copyright violations on them. Suddenly working with your ad network gets people hit as spam because your business model is too similar to Google's.
  • You create a new social network & are struggling to compete with Google's preferential ranking & hard coded placements of their own network. You make your network more open to encourage growth & you get hit as spam.
  • If you are Amazon or eBay you can afford premium featured content to pull up your other listings. But if you can't afford their cost structure, you hire freelance writers or work with outsourced workers to create some of your content, and they may use some copyrighted work without you knowing. And does Amazon now have to vigilantly review their reviews for plagiarism?
  • A competitor licenses some of their content as Creative Commons for years & doesn't mind wide use of it. Then you use it & one day they see you as a competitive threat and remove their Creative Commons license & bulk DMCA you. Or you have a lifetime syndication deal with a company, they later change the policy & claim that your documents are forged.
  • Getty Images presumes you didn't license an image that you did & files a DMCA. At some point there is no purpose in targeting the webmaster or host...just go direct to Google knowing that you can create the equivalent of a "patent trolling" styled business model, where it is cheaper for people to pay to have the issue resolved the quick way before a formal complaint is lodged. Some organizations might even have a subscription service set up where you pre-pay for immunity.
  • A former employee who wrote content for you claims you used it without permission. Or that same former employee used pirated images & longish quotes from other sources that they didn't disclose to you that they now highlight via DMCA.
  • You license data from a source & they do a mid-contract change leveraging the small print & have a bot lined up to send 40,000 DMCAs against you if you do not agree to the higher pricepoint.
  • Google is considering making an investment in your site & you want too much money. As an edge case near the threshold of this copyright limit you know you have immunity if you join the borg, but lack it if you don't work with them.
  • Big media players that play in the gray area will be fine, but smaller sites that try a similar model will be sunk by DMCAs and/or legal fees.
  • Your leading competitor realizes that your blog publishes comments by default without editorial review (and that even the later review is lax) and then they file DMCA reports against you. Or they could just grab chunks of content from Google's leaderboard of complainers and post them into your web forum, knowing that those companies will file a DMCA report against you.
  • A site has some content public & some behind a paywall. With a page partially indexed, how does Google respond to DMCA requests when the alleged infraction is behind a registration wall or paywall?
  • A competitor (inspired by Google no doubt) hires offshore "contractors" to copy your site & then file DMCA reports against you in bulk. How long until people start uploading their own content to file their own DMCAs against certain sites with user generated content?
  • Even if your site is 100% legal, a combination of ignorance & crowd-driven vigilante justice can still take you down.
  • Any site that offers interactive features & has user generated content is at risk of being labeled as spam unless they have tight editorial control over user generated content. And at the same time, Google can enter vertical after vertical with scrape & displace garbage knowing that they don't have those editorial costs due to their self-granted blanket immunity.
  • If you do not register your sites with Google & counter claims (even bogus ones) then you are seen as being a spammer. And if you register with Google then when they don't like something one site does they can hit other sites all at the same time. No point going to the host or registrar, go direct to Google & start building up negative karma.

Why did Google feel the need to grant themselves blanket immunity from the policy?

That question was largely missing among the fanboi blogs & journalists who were encouraged by Google's "transparency."

24 Karat Pyrite On Sale for Only $100 an Ounce

If YouTube is going to win big, then that's a great place to invest, right?

Maybe not.

Some venture capitalists are investing in YouTube channels, but that is a fool's game.

  • Google is also investing in select channels (like Machinima). It is quite hard to outperform Google in returns while investing into a platform that they control & thus have better data on than you ever could.
  • As YouTube's dominance increases (and it will now that competing platforms with a similar business model will be smeared as spam), you can count on them offering premium partners crappier revenue share deals in years to come. They will offer nice deals to Warner Bros. & such, but the independent smaller players will get cut out of the ecosystem in much the same way as they did in Google's organic search results.
  • Google, prince of transparency (for everyone but Google), requires that premium publishers *not* disclose the terms of their deals: "The Partner Program forbids participants to reveal specifics about their ad-share revenue. Rates can vary depending on the size and demographics of the partner’s audience and an array of other metrics."

Note that I don't claim YouTube is a bad host for your own content, but that I am skeptical in applying the VC model to it with a belief that you can out-invest Google on their own site; particularly when they own the dominant platform, control the non-public revenue share rates, invest in competing channels & can offer free promotion + higher rates to anyone they invest into in order to dominate the category.

And the issue isn't just video either. The same dynamic can apply to just about any other infrastructural layer. For instance, Google could buy out a torrent site (say like uTorrent) and have that site immediately gain immunity for being part of the borg, while other sites that compete now absorb both greater editorial filtering costs & greater risks that destroy their ROI.

As Google continues to lock down search, you can expect more smart publishers to hedge investments in search and YouTube with investments in proprietary non-search applications that Google can't take away.

The Devil is in the Details

"We are optimistic that Google’s actions will help steer consumers to the myriad legitimate ways for them to access movies and TV shows online, and away from the rogue cyberlockers, peer-to-peer sites, and other outlaw enterprises that steal the hard work of creators across the globe. We will be watching this development closely — the devil is always in the details — and look forward to Google taking further steps to ensure that its services favor legitimate businesses and creators, not thieves." - Michael O’Leary, Senior Executive Vice President for Global Policy and External Affairs of the Motion Picture Association of America, Inc.

The concern with Google pitching themselves as the preeminent authority on copyright is that they have consistently played both sides of the fence.

When Google was competing against YouTube, this was how they viewed copyright internally.

Business Objectives Drive "Relevancy" Signals

Google is a big player in business online and off. They can sell private data exclusively & their online profits are so huge that they are now buying auto loan bonds.

Now that Google wants to sell premium content they (sort of) respect copyright (& are willing to hold the rest of the web to a higher standard than themselves to create this impression).

I have long believed that relevancy signals were often politically driven & that internal business development goals often lead or create various signals. Certainly that was obvious when Google+ was hardcoded in the search results. It was equally true when Knol outranked the original content sources. Google frequently pretends to be (belligerently) unaware of externalities, but when the issues impact their own business they gain an elevated sense of importance.

And these business objectives not only influence the relevancy algorithms, but also the editorial guidelines.

And even while Google is rolling out this "copyright violators are spammers" algorithm (which they are exempt from) they still chug on with their ebook offering:

"They posted several of my 41 books up as free downloads (some were missing a few pages at most a single chapter) It took several e-mails from me pointing out that they were infringing copyright before they took them down. During the time my books were free on Google my sales of e-books fell dramatically." - K C Watkins

When Google started scanning books an internal document stated: “[we want web searchers interested in book content to come to Google not Amazon” ... or, as put another way, in that same document, “[e]verything else is secondary … but make money.”

Published: August 13, 2012 by Aaron Wall in publishing & media


August 12, 2012 - 2:00pm

Great post Aaron. Two things come to mind:
- what's the use doing this if the actual offending content is already deindexed as soon as a DMCA notice is sent to Google? and won't it lead to the rise of fake DMCA complaints (had a client who faced this, good job we managed to explain them it was fake)? Google has never been particularly good at attributing the content to its initial source
- so is the ideal get-rich-quick scheme, err, business model to set up a bunch of blogs on Blogger, scrape tons of content off elsewhere, slap AdSense here and there and watch the money roll into the bank account? Feels like 2005 again...

August 13, 2012 - 12:33am

"what's the use doing this if the actual offending content is already deindexed as soon as a DMCA notice is sent to Google? and won't it lead to the rise of fake DMCA complaints (had a client who faced this, good job we managed to explain them it was fake)? Google has never been particularly good at attributing the content to its initial source"

The issue is not just that the pages themselves can be taken down, but that they can accrue negative karma for the whole site. If you hire a "contractor" to do some dirty work (like Google has done against Mocality & OpenStreetMap) you can also have another "contractor" then file the DMCA stuff. And if the hatchet job is bad enough it could tank the site.

"so is the ideal get-rich-quick scheme, err, business model to set up a bunch of blogs on Blogger, scrape tons of content off elsewhere, slap AdSense here and there and watch the money roll into the bank account? Feels like 2005 again..."

I am not saying that is the business model (the above post was mostly highlighting examples and direction things could go). Obviously Google wouldn't share any information with me on the topic, but they would share it with Danny. He wrote today:

Google told me today that the new penalty will look beyond just the number of notices. It will also take into account other factors, specifics that Google won’t reveal, but with the end result that YouTube — as well as other popular sites beyond YouTube — aren’t expected to be hit.

What other sites? Examples Google gave me include Facebook, IMDB, Tumblr and Twitter. But it’s not that there’s some type of “whitelist” of sites. Rather, Google says the algorithm automatically assesses various factors or signals to decide if a site with a high number of copyright infringement notices against it should also face a penalty.

So if you are large & lawyered up already (or operate on platforms that fit those criteria, owned by companies valued in the billions of dollars) the gray area stuff will fly (at least for a while). But now that they listed public exceptions, you can expect those sites to get abused, maybe enough so that eventually some of the exceptions need to be exceptions to the exception list. :D

August 12, 2012 - 4:57pm

Aaron you need to find out about the threatened lawsuits, negotiations, and how this new "ranking" factor is a compromise settlement with big media. Google loves money, and trades power for cash. As a settlement, this move is brilliant. Complaintiffs sign off because they think Google is partnering with them to deter copyright infringement (we know Google touts its manipulation of SEOs to media as a power... I imagine they showed that and book deals as evidence that they can do more good moving markets than paying cash). Later, as you probably know, they'll screw these guys, too.

The complaintiffs have been arguing for years since Youtube was acquired. And the set-asides were huge then, but not by today's standards?

I doubt this is a real ranking factor...more an excuse to attack competitors as you note, but not just competitors of Google...competitors of the noisiest complaintiffs that threaten Google's freedoms to manipulate various markets.

I seriously wonder where the brain drain is... I never thought so many gifted technologists would remain with Google when it got so abusive, so greedy, and so douchbaggish. The social stigma of being part of the modern Google is very real. Just a few years after "you work for Google?" was a compliment, "you work for Google?" is now a character-challenging inquiry. I simply can't believe it isn't hurting their recruitment and retention more than we see. The money can only do so much, and as the ranks fill with foreign born, bureaucracy-savvy and politically motivated money whores, the stresses of actually trying to contribute to something great will push the "local" innovation/achievement/make things awesome people out. It always has...

December 5, 2012 - 7:42am

is already being abused.

If the dinosaurs move that fast on something like this, then imagine all the link poisoning that is going on across the web due to Google's Penguin + splog network takedown + link warning messages + tightened anchor filters + "now you own it" because we have a Disavow tool. Yuck.

February 19, 2013 - 11:21am

... a move to de-fund competition:

The web search giant, which is embroiled in a long-running row over the way it deals with pirated material, is considering the radical measure so that it can get rid of the root cause instead of having to change its own search results.

Executives want to stop websites more or less dedicated to offering links to pirated films, music and books from making money out of the illegal material. The plans, still in discussion, would also block funding to websites that do not respond to legal challenges, for example because they are offshore.

August 12, 2012 - 9:06pm

Just like how Google will deem a site "low quality" for Adwords even if you are ranking #1 for the same terms you want to advertise on organically. Their excuse is that organic results are not used to determine Adwords quality.

Then, they will find some shady website that scraped your RSS feed and jumbled your content in with their spam and use that to accuse you of copyright infringement and exclude you from using Adwords.

Oh, I guess organic results CAN be used to determine Adwords quality...but only if it negatively affects the advertiser.

Just more of the same.

Just like all the unjust bannings of Adwords, Adsense, Youtube accts. etc....Google does whatever they want as long as it turns a profit.

Every once in awhile Google gets a slap on the wrist or a token fine, but overall Google is a large pseudo-government utility company that is allowed to completely monopolize their industry and destroy competition while harming the average consumer.

If you declare bankruptcy or don't pay your credit cards you will destroy your credit but even those black marks are eventually required by law to be removed after a time. Google gives no such mercy...they will mark you, your business and your household for life without any opportunity for reinstatement in any form.

August 13, 2012 - 6:30am

The DMCA takedown is just another reason to sift out smaller sites.

Sorting out and filtering genuinely low quality/spam sites from small legitimate business sites is hard. Google's equivalent of "let's go shopping" is to simply focus on showing mostly big brands in their commercial results, with a token sample of smaller sites to look like they're showing diverse results. I think they've basically given up trying to truly give you the actual BEST results because a) it's hard to do, and b) more compellingly for Google, the harder option is the least lucrative to them. Is it really the case that a bunch of big brands represent the best of the web? "Good enough" results happen to make Google the most revenue. When option A is easier to do AND makes you the most revenue, and option B is far more difficult and will make you less revenue, which option do you choose?

Every update they've applied this year has the subtext of "too big to be penalised" - if you're big enough, you'll be OK. It coincides with making their job easier, while forcing more and more businesses onto Adwords just to have a presence on Google.

August 14, 2012 - 3:19am

Google's offloading all the work to its users but takes no responsibility.

They really are getting worse and so are the results.

Lost count of how many times I have seen the same domain's URLs take up most if not all of a results page.

They're clearly favoring U.S.-owned sites and the big U.S. brands.

Time to start using and writing about Bing.

Sanket Patel
August 18, 2012 - 5:20am

It is really shocking that Google considers YouTube to be spam.
