Jython UDFs were added to Pig in version 0.8 and are pretty stable in the current version, 0.9.2. They are highly convenient and a major timesaver.
tags:pig,hadoop,jython
via NoSQL databases
As a follow-up to the "Hadoop tools ecosystem" and "The components and their functions in the Hadoop ecosystem", Prashanth Babu sent the following Hadoop ecosystem map:
tags:hadoop
via NoSQL databases
One of the best presentations I've seen: concise, covering the topic from different angles, providing useful information, pitching a product and company in non-obtrusive ways.
tags:cassandra,datastax,hadoop,bigdata
via NoSQL databases
In the data deluge faced by businesses, there is also an increasing need to store and analyze vast amounts of unstructured data including data from sensors, devices, bots and crawlers. By many accounts, almost 80% of what businesses store is unstructured data — and this volume is predicted to grow exponentially over the next decade. We have entered the age of Big Data. Our customers have been asking us to help store, manage, and analyze both structured and unstructured data — in particular, data stored in Hadoop environments. As a first step, we will soon release a Community Technology Preview (CTP) of two new Hadoop connectors — one for SQL Server and one for PDW. The connectors provide interoperability between SQL Server/PDW and Hadoop environments, enabling customers to transfer data between Hadoop and SQL Server/PDW. With these connectors, customers can more easily integrate Hadoop with their Microsoft Enterprise Data Warehouses and Business Intelligence solutions to gain deeper business insights from both structured and unstructured data.
tags:hadoop,sql server,parallel data warehouse,pdw,microsoft
via NoSQL databases