How About Accessing Hadoop Data through Microsoft Excel?

Friday, March 2nd, 2012

Until now, Hadoop has seemed to be the red-hot storage and processing platform for unstructured data that Microsoft had missed out on. But things are changing: the company has teamed up with Hortonworks to make Hadoop data analyzable via both a JavaScript framework and Microsoft Excel.

Yes, it means that a large number of developers and business people will now be able to access Hadoop data using Microsoft Excel, their favorite tool by far. Of course, we have had Datameer and others trying to make Hadoop accessible to business users, but nothing compares to the Excel connector; we might soon witness Hadoop going mainstream.

In an attempt to get Hadoop to work nicely with Windows, Microsoft has moved on from its own computing tool, Dryad. The new strategy includes Hadoop distributions for Windows Server and the Windows Azure cloud computing platform, as well as a connection to SQL Server.

Hadoop on Azure is already available in developer preview mode, and we hear it will be generally available by the end of the quarter. Microsoft also says the SQL Server distribution will be available in preview mode this quarter and generally available by June this year.

Microsoft is visibly committed to Hadoop, meaning it will help make Hadoop a million-dollar market in a couple of years, as Hortonworks CEO Rob Bearden said the other day.

Source: http://www.gizmocrave.com/11191-how-about-accessing-hadoop-data-through-microsoft-excel/

Meta Tags and Web 3.0

Friday, March 4th, 2011

In 2008, Google spidered its trillionth web page. That sounds impressive, but as LISNews, the Librarian And Information Science News, recently pointed out, that figure represents but a tiny fraction of the information on the web. How so, you ask? Well, think of all those ecommerce databases, library catalogs, transport system fares and timetables… There are billions of pages that are only ever revealed to individual users when they access them with particular information requests. These pages are effectively invisible to search engine spiders and as such are known as the invisible web.

There have been search engines created that are capable of trawling catalogs by simulating user searches, but these only scratch the surface. Google itself recognizes the problem and has repeatedly announced efforts to reveal the invisible. But, as yet, there is no search engine that can answer a question such as, “what’s the best and most inexpensive way to get me from a hotel near Mornington Crescent tube station in London to Los Angeles International Airport with a stop-off in New York City?”

Of course, such questions could have myriad answers and maybe there never will be a way to uncover enough of the invisible web without the intervention of expert human intermediaries, such as travel agents well versed in the London Underground and American Airlines flight paths and timetables.

However, you may have heard the notion of web 3.0 being bandied about during the last year or more. Web 1.0 was, of course, the static, flat web of hyperlinks and no interaction. Web 2.0 (ignoring the glossy mirrored logos and missing vowels [flickr etc]) is what we currently have. It’s the interactive web of comments on blogs, social bookmarking sites like del.icio.us, social networking sites such as LinkedIn and Facebook, microblogging (Plurk, Twitter, and the late Pownce), and all kinds of tools that converted the static flatland of HTML into the scrubbed dynamic web we all know and love(?) today.

Web 3.0 takes all this a step further, adding machine-readable meaning to the packets of information. It is thus known to the technically minded as the semantic web. Once it is manifest, the semantic web will take us to within a gnat’s whisker of that utopia in which you have the exact change for a trip from Mornington Crescent to LAX via JFK.

Before we get there though, there is the not-so-simple matter of enabling meaning within information sources. This concept brings us full circle to the early days of web design when every tool stressed the importance of meta tags. Meta tags were meant to provide the fledgling search engines way back in the 1990s with the means to extract significance and context – meaning in other words – from web pages.

Almost as soon as the first spiders read those meta tags, which may have included keywords, a description, the name of the page author, and more, the so-called black hats of the search engine optimization (SEO) world began to game the system. They would stuff keywords into their sites’ meta tags that may or may not have been related to the actual content of the site. The aim was to fool the search engines into ranking the site highly for particular keywords and so gain more traffic through this spammy technique than the site was naturally due.
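To make the idea concrete, here is a minimal sketch in Python of how a simple spider might read keywords, a description, and an author name out of a page's meta tags. It uses only the standard library's html.parser module. The sample page, the class name MetaTagParser, and the helper extract_meta are all invented for illustration; this is not how any particular search engine worked.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags carrying both a name and a content attribute matter here.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.meta[name.lower()] = content


# Hypothetical page, invented for illustration only.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Cheap flights from London</title>
    <meta name="keywords" content="flights, London, New York, Los Angeles">
    <meta name="description" content="Compare fares from London to LAX via JFK.">
    <meta name="author" content="A. Webmaster">
  </head>
  <body><p>Fares and timetables.</p></body>
</html>
"""


def extract_meta(html):
    """Return a dict mapping each meta tag name to its content."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


if __name__ == "__main__":
    for name, content in extract_meta(SAMPLE_PAGE).items():
        print(f"{name}: {content}")
```

Nothing in this sketch checks that the keywords match the body text, which is exactly the weakness that keyword stuffing exploited.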

Then, once the search engines recognized what was happening, they downgraded the relevance of meta tags in the algorithms they used to generate the search engine results pages (SERPs). As such, meta tags have fallen out of favor. They still have relevance in a few of the simpler and less well-known search engines, and they are often used to display key text in the SERPs. It is not only black hats who have abandoned meta tags to some degree; generalist webmasters also often ignore their latent potency and simply do not include them in the pages they publish.

This could be a major blow to the emergence of the semantic web, the advent of web 3.0. Websites need their metadata; they need to be able to explain themselves to machines in an understandable way. Badawia Albassuny at the Department of Library and Information Science, King Abdulaziz University, Jeddah, Saudi Arabia, certainly recognizes this. She has recently surveyed the automatic metadata generation applications on the web, with a view to raising awareness of the possibilities.

If you use WordPress or other blogging tools and content management systems (CMS), you may have plugins installed that automatically add meta tags. If you use the Zemanta system and have customized your settings, you may also have noticed that it has a built-in system for adding semantics to links you include in your posts. I discussed Zemanta in a little more detail in a post entitled Free Blog Content recently.
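As a rough illustration of what automatic meta tag generation can look like, the Python sketch below derives keywords from word frequency and builds three meta tags for a post. It is a toy under stated assumptions, not the method used by WordPress plugins, Zemanta, or the applications in Albassuny's survey. The function names, the stop-word list, and the 150-character description limit are all hypothetical choices made for this example.

```python
import re
from collections import Counter
from html import escape

# Small illustrative stop-word list; a real tool would use a fuller one.
STOP_WORDS = {"about", "from", "have", "that", "this", "with", "will", "your"}


def top_keywords(text, limit=5):
    """Return the most frequent words longer than three letters."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if len(w) > 3 and w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(limit)]


def build_meta_tags(title, body, author, limit=5):
    """Build keyword, description, and author meta tags for a post."""
    fields = {
        "keywords": ", ".join(top_keywords(f"{title} {body}", limit)),
        # Collapse whitespace and truncate; 150 characters is an arbitrary cut-off.
        "description": " ".join(body.split())[:150],
        "author": author,
    }
    # Escape attribute values so quotes and ampersands cannot break the markup.
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in fields.items()
    )


if __name__ == "__main__":
    print(build_meta_tags(
        title="Hadoop meets Excel",
        body="Hadoop data in Excel. Excel users can query Hadoop on Azure.",
        author="A. Blogger",
    ))
```

Because the keywords here come from the post's own text, they at least reflect the actual content, unlike hand-stuffed keyword lists.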

source: http://www.sciencetext.com/meta-tags.html
