...
- Ning
- We use Hadoop to store and process our log files
- We rely on Apache Pig for reporting and analytics, on Cascading for machine learning, and on a proprietary JavaScript API for ad-hoc queries
- We use commodity hardware, with 8 cores and 16 GB of RAM per machine
...
- Realweb - an Internet advertising company based in Russia.
- We are using Pig over Hadoop to compute statistics on banner views, clicks, user behavior on target websites after a click, etc.
- We've chosen Cloudera's Hadoop packages (http://www.cloudera.com/hadoop/) on Ubuntu 10.04 servers. Each machine has 2 to 4 cores, 4 GB of RAM, and 1 TB of storage.
- All jobs are written in Pig Latin, and only a few user-defined functions were needed to meet our needs (a minimal sketch of this style of job follows this entry).
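A minimal sketch of such a pure-Pig statistics job; the input path, field names, and schema below are hypothetical and stand in for whatever the real click logs look like:

```
-- Hypothetical input path and schema; counts clicks per banner.
clicks    = LOAD '/logs/clicks' USING PigStorage('\t')
            AS (ts:chararray, banner_id:chararray, user_id:chararray);
by_banner = GROUP clicks BY banner_id;
stats     = FOREACH by_banner GENERATE group AS banner_id, COUNT(clicks) AS num_clicks;
STORE stats INTO '/reports/clicks_per_banner';
```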
- Salesforce.com
- We have multiple 20-node clusters in production, plus 10-node and 20-node development clusters
- Hadoop (native Java MapReduce) is used for Search and Recommendations
- We are using Apache Pig for log processing and Search, and to generate usage reports for several products and features at SFDC
- Pig makes it easy to develop custom UDFs. We developed our own library of UDFs and loaders and are actively contributing back to the community (see the sketch after this entry)
- The goal is to allow Hadoop/Pig to be used across Data Warehouse, Analytics, and other teams, making it easier for folks outside engineering to use data
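A minimal sketch of the register-and-use pattern for a custom Pig UDF; the jar name, UDF class, paths, and fields are hypothetical, not SFDC's actual library:

```
-- Hypothetical jar, UDF class, and schema; shows how a custom eval UDF is wired into a Pig script.
REGISTER 'my-pig-udfs.jar';
DEFINE ParseUserAgent com.example.pig.ParseUserAgent();

logs    = LOAD '/logs/search' USING PigStorage('\t')
          AS (ts:chararray, user_agent:chararray, query:chararray);
parsed  = FOREACH logs GENERATE query, ParseUserAgent(user_agent) AS browser;
grouped = GROUP parsed BY browser;
report  = FOREACH grouped GENERATE group AS browser, COUNT(parsed) AS searches;
STORE report INTO '/reports/searches_by_browser';
```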
- SARA Computing and Networking Services
- We provide a Hadoop service for scientific computing in The Netherlands
- Pig is being used by a number of scientists for fast exploration of large datasets
- Fields making extensive use of Pig include Information Retrieval and Natural Language Processing
- Read more on our use of Hadoop in this presentation
- Read about selected use cases on Hadoop in this blog post
...
- Twitter
- We use Pig extensively to process usage logs, mine tweet data, and more.
- We maintain Elephant Bird, a set of libraries for working with Pig, LZO compression, protocol buffers, and more.
- More details can be seen in this presentation: http://www.slideshare.net/kevinweil/nosql-at-twitter-nosql-eu-2010
- Tynt
- We use Hadoop to assemble web publishers' summaries of what users are copying from their websites, and to analyze user engagement on the web.
- We use Pig and custom Java map-reduce code, as well as Chukwa.
- We have 94 nodes (752 cores) in our clusters, as of July 2010, but the number grows regularly.
...