Data science Software Course Training in Ameerpet Hyderabad

Wednesday, 3 May 2017

Pig: Word Count Using Pig Data Flow

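The sample file is first copied from the local file system into the piglab directory in HDFS, then processed from the Grunt shell: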

[cloudera@quickstart ~]$ cat comment
hadoop is great
spark is great
hadoop and spark combination is great
[cloudera@quickstart ~]$ hadoop fs -copyFromLocal comment piglab
[cloudera@quickstart ~]$

grunt> ls piglab
hdfs://quickstart.cloudera:8020/user/cloudera/piglab/comment<r 1> 69
hdfs://quickstart.cloudera:8020/user/cloudera/piglab/emp<r 1> 158
hdfs://quickstart.cloudera:8020/user/cloudera/piglab/results1 <dir>
hdfs://quickstart.cloudera:8020/user/cloudera/piglab/results2 <dir>
grunt> cat piglab/comment
hadoop is great
spark is great
hadoop and spark combination is great
grunt>


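 -- Load each line of the file as a single chararray field.
 lines = load 'piglab/comment' as (line:chararray);

 -- Split each line into words and flatten the bag into one word per row.
 words = foreach lines generate
         FLATTEN(TOKENIZE(line)) as word;
 -- Group identical words, then count the rows in each group.
 gwords = group words by word;
 wcnt = foreach gwords generate
        group as word, COUNT(words) as cnt;

 -- Print the result.
 dump wcnt;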

(is,3)
(and,1)
(great,3)
(spark,2)
(hadoop,2)
(combination,1)
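
To see what each relation holds at every step, the same data flow can be mimicked in plain Python. This is only an illustrative sketch, not Pig: the function name word_count is hypothetical, and str.split() on whitespace only approximates TOKENIZE, which also splits on a few punctuation characters. The comments map each Python step back to the Pig statement it imitates.

```python
from collections import defaultdict


def word_count(lines):
    """Hypothetical helper that mirrors the Pig word count step by step."""
    # words = foreach lines generate FLATTEN(TOKENIZE(line)) as word;
    # One output row per word, across all input lines.
    words = [word for line in lines for word in line.split()]

    # gwords = group words by word;
    # Each key plays the role of "group"; each list plays the role of the bag "words".
    gwords = defaultdict(list)
    for word in words:
        gwords[word].append(word)

    # wcnt = foreach gwords generate group as word, COUNT(words) as cnt;
    return {word: len(bag) for word, bag in gwords.items()}


# Same three lines as the piglab/comment file.
lines = [
    "hadoop is great",
    "spark is great",
    "hadoop and spark combination is great",
]

wcnt = word_count(lines)
for word, cnt in sorted(wcnt.items()):
    print((word, cnt))
```

The counts agree with the Pig output above. The row order may differ, since neither Pig's dump nor this sketch promises a particular order unless the result is sorted explicitly.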
