...

  • Cooliris - Cooliris transforms your browser into a lightning fast, cinematic way to browse photos and videos, both online and on your hard drive.
    • We have a 15-node Hadoop cluster where each machine has 8 cores, 8 GB RAM, and 3-4 TB of storage.
    • We use Hadoop for all of our analytics, and we use Pig to allow PMs and non-engineers the freedom to query the data in an ad-hoc manner.
  • Dataium
    • We use Pig to sort and prep our data before it is handed off to our Java Map/Reduce jobs (see the sketch after this list).
  • DropFire
    • We generate Pig Latin scripts that describe structural and semantic conversions between data contexts
    • We use Hadoop to execute these scripts for production-level deployments
    • Eliminates the need for explicit data and schema mappings during database integration
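
For illustration, a minimal Pig Latin sketch of the kind of sort-and-prep pass Dataium describes; the paths, schema, and field names here are hypothetical:

    -- load raw events (hypothetical path and schema)
    raw = LOAD '/data/incoming/events' USING PigStorage('\t')
          AS (session_id:chararray, ts:long, payload:chararray);
    clean = FILTER raw BY session_id IS NOT NULL;  -- drop malformed records
    sorted = ORDER clean BY ts;                    -- sort before the hand-off
    -- the stored output becomes the input of the downstream Java Map/Reduce jobs
    STORE sorted INTO '/data/prepped/events' USING PigStorage('\t');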

...

  • Mendeley
    • We are creating a platform for researchers to collaborate and share their research online
    • We moved all our catalogue stats and analysis to HBase and Pig
    • We are using Scribe in combination with Pig for all our server, application and user log processing.
    • Pig helps us get business analytics, user experience evaluation, feature feedback, and more out of these logs.
    • You can find more on how we use Pig and HBase on these slides: http://www.slideshare.net/danharvey/hbase-at-mendeley
  • Mortar Data
    • We provide an open-source development framework and Hadoop Platform-as-a-Service
    • Our service is powered by Pig, which we run on private, ephemeral clusters in Amazon Web Services
  • Ning
    • We use Hadoop to store and process our log files
    • We rely on Apache Pig for reporting, analytics, Cascading for machine learning, and on a proprietary JavaScript API for ad-hoc queries
    • We use commodity hardware, with 8 cores and 16 GB of RAM per machine
  • Nokia | Ovi
    • We use Pig for exploring unstructured datasets coming from logs, database dumps, data feeds, etc.
    • Several data pipelines that build product datasets and feed further analysis use Pig, tied to other jobs with Oozie
    • We have multiple Hadoop clusters, some for R&D and some for production jobs
    • In R&D we run on very commodity hardware: 8-core, 16GB RAM, 4x 1TB disk per data node
  • PayPal
    • We use Pig to analyze transaction data in order to prevent fraud.
    • We are the main contributors to the Pig-Eclipse project.
  • Realweb - an Internet advertising company based in Russia.
    • We are using Pig over Hadoop to compute statistics on banner views, clicks, post-click user behavior on target websites, etc. (a sketch of this kind of job appears after this list)
    • We've chosen Cloudera Hadoop (http://www.cloudera.com/hadoop/) packages on Ubuntu 10.04 servers. Each machine has 2 or 4 cores, 4 GB RAM, and 1 TB of storage.
    • All jobs are written in Pig Latin, and only a few user-defined functions were needed to meet our needs.
  • Salesforce.com
    • We have multiple clusters in production, as well as 10-node and 20-node development clusters
    • Hadoop (native Java MapReduce) is used for Search and Recommendations
    • We are using Apache Pig for log processing and Search, and to generate usage reports for several products and features at SFDC
    • Pig makes it easy to develop custom UDFs. We developed our own library of UDFs and loaders and are actively contributing back to the community (an example of registering such a UDF appears after this list)
    • The goal is to allow Hadoop/Pig to be used across Data Warehouse, Analytics and other teams, making it easier for folks outside engineering to use data
  • SARA Computing and Networking Services
    • We provide a Hadoop service for scientific computing in The Netherlands
    • Pig is being used by a number of scientists for fast exploration of large datasets
    • Fields that use Pig extensively include Information Retrieval and Natural Language Processing
    • Read more on our use of Hadoop in this presentation
    • Read about selected Hadoop use cases in this blog post
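
For illustration, a minimal Pig Latin sketch of the kind of banner-click statistics Realweb describes; the log path, schema, and field names are hypothetical:

    -- load banner events (hypothetical path and schema)
    events = LOAD '/logs/banners' USING PigStorage('\t')
             AS (banner_id:chararray, event:chararray, user_id:chararray);
    clicks = FILTER events BY event == 'click';
    by_banner = GROUP clicks BY banner_id;
    -- count clicks per banner
    stats = FOREACH by_banner GENERATE group AS banner_id, COUNT(clicks) AS num_clicks;
    STORE stats INTO '/reports/banner_clicks';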
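And a sketch of how a custom UDF of the kind Salesforce.com mentions is registered and invoked from Pig Latin; the jar name and UDF class are hypothetical:

    REGISTER sfdc-pig-udfs.jar;                          -- hypothetical UDF library
    DEFINE NormalizeUrl com.example.pig.NormalizeUrl();  -- hypothetical UDF class
    logs = LOAD '/logs/search' USING PigStorage('\t') AS (ts:long, url:chararray);
    normalized = FOREACH logs GENERATE ts, NormalizeUrl(url) AS url;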

...

  • Tynt
    • We use Hadoop to assemble web publishers' summaries of what users are copying from their websites, and to analyze user engagement on the web.
    • We use Pig and custom Java map-reduce code, as well as Chukwa (a sketch appears after this list).
    • We have 94 nodes (752 cores) in our clusters, as of July 2010, but the number grows regularly.
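
For illustration, a minimal Pig Latin sketch of the per-publisher copy summaries Tynt describes; the input path and schema are hypothetical:

    -- load copy events (hypothetical path and schema)
    copies = LOAD '/logs/copy_events' USING PigStorage('\t')
             AS (publisher:chararray, page_url:chararray, snippet:chararray);
    by_pub = GROUP copies BY publisher;
    -- summarize how often content was copied from each publisher's site
    summary = FOREACH by_pub GENERATE group AS publisher, COUNT(copies) AS num_copies;
    STORE summary INTO '/reports/publisher_copy_summary';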

...