Data science Software Course Training in Ameerpet Hyderabad

Tuesday 16 May 2017

Pig : UDFs


Pig UDFs
----------

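  UDF ---> user-defined function.
 
   Advantages:
       i)  custom functionality that the built-in functions do not provide.
      ii)  reusability: the same jar can be registered and used in any script.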

 Pig UDFs can be written in:
    Java
    Python (via Jython)
    JavaScript
    Ruby
    Groovy

step1:
   Develop the UDF code.

step2:
   Export it into a jar file.
   ex: /home/cloudera/Desktop/pigs.jar

step3:
   Register the jar file in Pig.
   (The path is relative to the directory grunt was started from.)
 grunt> register Desktop/pigs.jar;

step4:
   Create a temporary function name (alias) for the UDF class.

 grunt> define ucase pig.analytics.ConvertUpper();

step5:
   Call the function:

 grunt> e = foreach emp generate
      id, ucase(name) as name, sal,
        ucase(sex) as sex, dno;
 
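// ucase(name) ---> converts the field to upper case

package pig.analytics;

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class ConvertUpper extends EvalFunc<String>
  {
     @Override
     public String exec(Tuple v)
      throws IOException
     {
        // A missing field arrives as null; return null instead of failing.
        if (v == null || v.size() == 0 || v.get(0) == null) return null;
        String str = (String)v.get(0);
        String res = str.toUpperCase();
        return res;
     }
 }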
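
The conversion done inside exec can be tried without a Pig installation. What follows is a minimal plain-Java sketch, not the UDF itself: the names UpperSketch and upper are made up for illustration, and the field is passed as a plain Object instead of a Pig Tuple. It assumes a missing field arrives as null, and it uses Locale.ROOT so the result does not depend on the machine's default locale.

```java
import java.util.Locale;

// Plain-Java sketch of the ucase logic; no Pig classes are needed to run it.
// UpperSketch and upper are hypothetical names, not part of pigs.jar.
final class UpperSketch {

    // Returns the upper-case form of a field, or null when the field is missing.
    static String upper(Object field) {
        if (field == null) {
            return null;
        }
        return field.toString().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(upper("abc"));
        System.out.println(upper(null));
    }
}
```

Returning null for a null input keeps the row in the output with an empty field, which is usually easier to trace than a failed job.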
--------------------------
$ cat > samp
100,230,400
123,100,90
140,560,430

$ hadoop fs -copyFromLocal samp piglab

grunt> s = load 'piglab/samp'
            using PigStorage(',')
          as (a:int, b:int, c:int);



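package pig.analytics;

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// rmax(a, b, c) ---> largest of the three int fields in a row
public class RMax extends EvalFunc<Integer>
{
    @Override
    public Integer exec(Tuple v)
     throws IOException
    {
      int a = (Integer) v.get(0);
      int b = (Integer) v.get(1);
      int c = (Integer) v.get(2);

      int big = a; // start from the first value, e.g. 10,20,3
      if (b > big) big = b;
      if (c > big) big = c;
      return big;
    }
 }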

export into jar : Desktop/pigs.jar

grunt> register Desktop/pigs.jar;

grunt> define rmax pig.analytics.RMax();

grunt> res = foreach s generate *,
                   rmax(*) as max;
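
To check what the max column should contain for the three rows of samp, the comparison can be run on plain ints. This is only a sketch: RMaxSketch and max3 are hypothetical names, and the rows are hard-coded from the samp file above instead of being read through Pig.

```java
// Plain-Java check of the row maximum for the three rows of samp.
// RMaxSketch and max3 are hypothetical names used only for this illustration.
final class RMaxSketch {

    // Largest of three ints: start from the first and compare the other two.
    static int max3(int a, int b, int c) {
        int big = a;
        if (b > big) big = b;
        if (c > big) big = c;
        return big;
    }

    public static void main(String[] args) {
        int[][] samp = { {100, 230, 400}, {123, 100, 90}, {140, 560, 430} };
        for (int[] row : samp) {
            // Prints the row followed by its maximum, like "generate *, rmax(*)".
            System.out.println(row[0] + "," + row[1] + "," + row[2]
                    + "," + max3(row[0], row[1], row[2]));
        }
    }
}
```

For the sample data the last column comes out as 400, 123 and 560.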

--------------------------------

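 package pig.analytics;

 import java.io.IOException;
 import java.util.List;

 import org.apache.pig.EvalFunc;
 import org.apache.pig.data.Tuple;

 // dynmax(*) ---> largest int field in a row, for any number of fields
 public class RowMax
   extends EvalFunc<Integer>
 {
    @Override
    public Integer exec(Tuple v) throws IOException
    {
     List<Object> lobs = v.getAll();
     int max = 0;
     int cnt = 0;
     // e.g. -20,-3,-40: max must start from the first value, not from 0
     for (Object o : lobs)
     {
       cnt++;
       int val = (Integer) o;
       if (cnt == 1) max = val;
       max = Math.max(max, val);
     }
     return max;
    }
 }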

export into jar : Desktop/pigs.jar
grunt> register Desktop/pigs.jar;
grunt> define dynmax pig.analytics.RowMax();
grunt> r = foreach s generate *, dynmax(*) as m;
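
RowMax casts every field straight to Integer, so a null field in a row would fail with a NullPointerException. One possible variation, sketched in plain Java on a List<Object> (the same type Tuple.getAll() returns), skips nulls and starts from "no value yet" instead of 0. RowMaxSketch and rowMax are hypothetical names, and returning null for a row with no usable value is an assumption, not something the notes specify.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of a null-tolerant row maximum.
// RowMaxSketch and rowMax are hypothetical names, not part of pigs.jar.
final class RowMaxSketch {

    // Largest Integer in the row; null entries are skipped.
    // Returns null when the row holds no usable value (an assumed convention).
    static Integer rowMax(List<Object> row) {
        Integer max = null;
        for (Object o : row) {
            if (o == null) {
                continue;
            }
            int val = (Integer) o;
            if (max == null || val > max) {
                max = val;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // All-negative row: the answer is -3, which a start value of 0 would hide.
        System.out.println(rowMax(Arrays.<Object>asList(-20, -3, -40)));
        // Row with a missing field.
        System.out.println(rowMax(Arrays.<Object>asList(7, null, 12, 5)));
    }
}
```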
-----------------------------------------

emp = load 'piglab/emp' using PigStorage(',')
   as (id:int, name:chararray, sal:int,
     sex:chararray, dno:int);

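UDFs to develop for emp:
grade()   ---> salary grade from sal
dept()    ---> department name from dno
gender()  ---> full gender label from sex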


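package pig.analytics;

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class Gender extends EvalFunc<String>
{
 @Override
 public String exec(Tuple v) throws IOException
 {
     String s = (String) v.get(0);
     s = s.toUpperCase();
     // "F" ---> Female; every other value is treated as Male
     if (s.equals("F"))
       s = "Female";
     else
       s = "Male";
     return s;
 }
}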
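
Gender labels every value other than F as Male, including blanks and typos. If that is not wanted, a stricter mapping could look like the plain-Java sketch below. GenderSketch and label are hypothetical names, and the "Unknown" label and the trimming of spaces are assumptions added for illustration; they are not part of the notes.

```java
import java.util.Locale;

// Plain-Java sketch of a stricter gender mapping.
// GenderSketch, label and the "Unknown" label are illustrative assumptions.
final class GenderSketch {

    // Maps "F"/"M" (any case, surrounding spaces ignored) to a full label.
    static String label(String code) {
        if (code == null) {
            return null;
        }
        switch (code.trim().toUpperCase(Locale.ROOT)) {
            case "F":
                return "Female";
            case "M":
                return "Male";
            default:
                return "Unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(label("f"));
        System.out.println(label(" M "));
        System.out.println(label("x"));
    }
}
```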
-----------------

package pig.analytics;

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class Grade extends EvalFunc<String>
{
 @Override
 public String exec(Tuple v) throws IOException
 {
     int sal = (Integer) v.get(0);
     String grade;
     if (sal >= 70000)
       grade = "A";
     else if (sal >= 50000)
       grade = "B";
     else if (sal >= 30000)
       grade = "C";
     else
       grade = "D";
     return grade;
 }
}
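
The grade boundaries are inclusive (>=), which is easy to get wrong at the edges. A small plain-Java sketch of the same thresholds makes the boundary cases visible; GradeSketch is a hypothetical name and the salaries in main are made-up test values, not rows from emp.

```java
// Plain-Java sketch of the salary thresholds used by the Grade UDF.
// GradeSketch is a hypothetical name; the salaries below are made-up values.
final class GradeSketch {

    // 70000 and above -> A, 50000 and above -> B, 30000 and above -> C, else D.
    static String grade(int sal) {
        if (sal >= 70000) return "A";
        if (sal >= 50000) return "B";
        if (sal >= 30000) return "C";
        return "D";
    }

    public static void main(String[] args) {
        int[] salaries = { 70000, 69999, 50000, 30000, 29999 };
        for (int sal : salaries) {
            System.out.println(sal + " -> " + grade(sal));
        }
    }
}
```

A salary of exactly 70000 gets A, while 69999 drops to B.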
------
package pig.analytics;

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class DeptName extends EvalFunc<String>
{
 @Override
 public String exec(Tuple v) throws IOException
 {
    int dno = (Integer) v.get(0);
    String dname;
    switch (dno) {
    case 11:
          dname = "Marketing";
          break;
    case 12:
          dname = "HR";
          break;
    case 13:
          dname = "Finance";
          break;
    default:
          dname = "Others";
    }
    return dname;
 }
}
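
When the list of departments grows, a lookup table can be easier to maintain than a long switch. The plain-Java sketch below shows the idea with the same three codes; DeptLookupSketch, NAMES and deptName are hypothetical names, and this is an alternative to the switch above, not the code used in the notes.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of a table-driven department lookup.
// DeptLookupSketch, NAMES and deptName are hypothetical names.
final class DeptLookupSketch {

    private static final Map<Integer, String> NAMES = new HashMap<>();
    static {
        NAMES.put(11, "Marketing");
        NAMES.put(12, "HR");
        NAMES.put(13, "Finance");
    }

    // Returns the department name, or "Others" for any code not in the table.
    static String deptName(int dno) {
        return NAMES.getOrDefault(dno, "Others");
    }

    public static void main(String[] args) {
        System.out.println(deptName(12));
        System.out.println(deptName(99));
    }
}
```

Adding a department then means adding one NAMES.put line instead of a new case block.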
---------------------------------
export into jar : Desktop/pigs.jar
grunt> register Desktop/pigs.jar;
grunt> define gender pig.analytics.Gender();
grunt> define grade pig.analytics.Grade();
grunt> define dept pig.analytics.DeptName();

grunt> res = foreach emp generate
    id, ucase(name) as name,
     sal, grade(sal) as grade,
    gender(sex) as sex,
    dept(dno) as dname ;
---------------------------------