Google Spans Entire Planet With GPS-Powered Database

Written by Unknown on Wednesday, 19 September 2012 | 03.49

Three years ago, a top Google engineer named Vijay Gill was asked what he would do if someone gave him a magic wand.

At the time, Gill helped run the massive network of data centers that underpins Google's online empire, and he was sitting on stage at a conference in downtown San Francisco, discussing the unique challenges facing this globe-spanning operation. Jonathan Heiliger, the man who oversaw Facebook's data centers, sat a few seats away, and it was Heiliger who asked Gill what he would add to Google's data centers if he had a magic wand.

Gill hesitated before answering. And when he did answer, he was coy. But he seemed to say he would use that magic wand to build a single system that could automatically and instantly juggle information across all of Google's data centers. Then he indicated that Google had already built one. "How do you manage the system and optimize it on a global level?" he said. "That is the interesting part."

It was little more than a teaser. But about four months later, Google dropped another hint. At a symposium in the mountains of Montana, Jeff Dean — one of Google's most important engineers — revealed that the web giant was working on something called Spanner, describing it as a "storage and computation system that spans all our data centers." He said the plan was to eventually juggle data across as many as 10 million servers sitting in "hundreds to thousands" of data centers across the globe.

The scope of the project was mind-boggling. But Dean provided few details, and it wasn't clear whether Google was actually using the platform in its live data centers. Then, on Tuesday, the paper hit the web.

This week, as reported by GigaOm and ZDNet, Google published a research paper detailing the ins and outs of Spanner. According to Google, it's the first database that can quickly store and retrieve information across a worldwide network of data centers while keeping that information "consistent" (meaning all users see the same collection of information at all times), and it's been driving the company's ad system and various other web services for years.

Spanner borrows techniques from some of the other massive software platforms Google built for its data centers, but at its heart is something completely new. Spanner plugs into a network of servers equipped with super-precise atomic clocks or GPS antennas akin to the one in your smartphone, using these timekeepers to more accurately synchronize the distribution of data across such a vast network. That's right, Google attaches GPS antennas and honest-to-goodness atomic clocks to its servers.

"It's a big deal — and it's really novel," says Andy Gross, the principal architect of Basho, an outfit that builds an open source database called Riak that runs across thousands of servers — though not nearly as many as Spanner. "The conventional wisdom — at least among people with modest resources — is that time synchronization like that, on a global scale, that is accurate enough for such a big distributed database … just isn't practical."

Spanner may seem like an extreme undertaking, and certainly, it tackles an unusual problem. Few other companies on Earth are forced to deal with so much data so quickly. But Google's massive data center creations have a way of trickling down to the rest of the tech world. The prime example is Hadoop, a widely used number-crunching platform that mimics technologies originally built at Google, and this trend will likely continue.

"If you want to know what the large-scale, high-performance data processing infrastructure of the future looks like, my advice would be to read the Google research papers that are coming out right now," Mike Olson, the CEO of Hadoop specialist Cloudera, said at a recent event in Silicon Valley. According to Charles Zedlewski, vice president of products at Cloudera, the company was already aware of Spanner, after recruiting some ex-Google engineers, and it may eventually incorporate ideas from the paper into its software.

Facebook is already building a system that's somewhat similar to Spanner, in that it aims to juggle information across multiple data centers. Judging from our discussions with Facebook about this system — known as Prism — it's quite different from Google's creation. But it shows that other outfits are now staring down many of the same data problems Google first faced in years past.

Source: http://feeds.wired.com/~r/wired/index/~3/9WauAepy69k/