Data science Software Course Training in Ameerpet Hyderabad

Tuesday, 1 August 2017

Hive (10 AM to 1:00 PM) Lab 1 notes: Hive Inner and External Tables

hive> create table samp1(line string);
-- Here we did not select a database, so Hive uses its default database,
   which is named "default".

   The HDFS location of the default database is
   /user/hive/warehouse

-- When you create a table in the default database, a directory with the
   table's name is created under the warehouse location.

   In HDFS, the directory
   /user/hive/warehouse/samp1 is created.

hive> create database mydb;

  When a database is created, a directory with the database's name and the extension ".db" is created under the warehouse location (here, /user/hive/warehouse/mydb.db).

 How to select a database:

hive> use mydb;

hive> create table test1(line string);

  Under the mydb.db directory, the test1 table directory is created:

  /user/hive/warehouse/mydb.db/test1
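
The path rules above can be summarised in a small Python sketch. It only models the naming convention described in these notes and is not Hive code: the helper name `table_dir` and the constant `WAREHOUSE` are hypothetical, and the warehouse root is assumed to be the /user/hive/warehouse location used in this lab.

```python
import posixpath

# Assumed warehouse root, taken from the notes above.
WAREHOUSE = "/user/hive/warehouse"


def table_dir(table, database="default"):
    """Return the HDFS directory these notes describe for a table.

    Hypothetical helper: it models the naming convention only.
    """
    if database == "default":
        # Tables in the default database sit directly under the warehouse.
        return posixpath.join(WAREHOUSE, table)
    # Any other database gets its own "<name>.db" directory.
    return posixpath.join(WAREHOUSE, database + ".db", table)


print(table_dir("samp1"))          # /user/hive/warehouse/samp1
print(table_dir("test1", "mydb"))  # /user/hive/warehouse/mydb.db/test1
```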

[cloudera@quickstart ~]$ ls file*
file1  file2  file3
[cloudera@quickstart ~]$ cat file1
aaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
[cloudera@quickstart ~]$ cat file2
bbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbb
[cloudera@quickstart ~]$ cat file3
cccccccccccccccccccc
ccccccccccccccccccc
ccccccccccccc
[cloudera@quickstart ~]$

hive> use default;
hive> load data local inpath 'file1'
      into table samp1;
-- When you load a local file into a table,
    the file is copied into the table's backend directory.

   In HDFS:
  /user/hive/warehouse/samp1/file1
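
The copy step can be pictured with a local stand-in. This is a minimal sketch, assuming a temporary local folder plays the role of HDFS; the helper name `load_local` and the shortened file contents are made up for illustration, and this is not how Hive itself is implemented.

```python
import os
import shutil
import tempfile


def load_local(src_file, table_dir):
    """Copy a local file into the table directory, keeping the original."""
    os.makedirs(table_dir, exist_ok=True)
    # shutil.copy returns the path of the newly created copy.
    return shutil.copy(src_file, table_dir)


work = tempfile.mkdtemp()
src = os.path.join(work, "file1")
with open(src, "w") as f:
    f.write("aaa\n")  # shortened stand-in for the real file1

samp1_dir = os.path.join(work, "warehouse", "samp1")
copied = load_local(src, samp1_dir)

print(os.path.basename(copied))  # file1
print(os.path.exists(src))       # True: the local source is still there
```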

hive> load data local inpath 'file2'
   into table samp1;

  Now the table directory has two files,
     file1 and file2.

hive> select * from samp1;
Output:
aaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbb


-- Hive reads all rows of all files in the
   table directory.
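
A rough Python model of that behaviour, assuming a local temporary folder stands in for the table directory: the function name `read_table` and the shortened rows are illustrative only. The files are sorted purely to make the sketch deterministic; it says nothing about the order Hive itself returns rows in.

```python
import os
import tempfile


def read_table(table_dir):
    """Return every line of every file in table_dir, file by file."""
    rows = []
    # Sorted only so this sketch gives the same result on every run.
    for name in sorted(os.listdir(table_dir)):
        with open(os.path.join(table_dir, name)) as f:
            rows.extend(line.rstrip("\n") for line in f)
    return rows


# Stand-in for the samp1 table directory, using a local temp folder.
samp1 = tempfile.mkdtemp()
with open(os.path.join(samp1, "file1"), "w") as f:
    f.write("aaa\naaaa\n")
with open(os.path.join(samp1, "file2"), "w") as f:
    f.write("bbb\n")

print(read_table(samp1))  # ['aaa', 'aaaa', 'bbb']
```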

Another way of loading a file into a table:

$ hadoop fs -copyFromLocal file3 \
          /user/hive/warehouse/samp1



hive> select * from samp1;
OK
aaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbb
cccccccccccccccccccc
ccccccccccccccccccc
ccccccccccccc

hive> use mydb;
hive> show tables;
test1
hive> load data local inpath 'file1' into table test1;
In HDFS:
 /user/hive/warehouse/mydb.db/test1/file1

===============================

  Hive tables are basically of two types:

 1) Inner tables [managed tables: Hive manages both metadata and data]
 2) External tables

 When an inner table is dropped,
   both the metadata and the data (in HDFS) are deleted.

 When an external table is dropped,
   only the metadata is deleted;
   the data is still safely available in the table's backend location in HDFS,

  so you can reuse the data in the future.
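
The difference on drop can be modelled in a few lines of Python. This is a toy sketch, assuming a dictionary stands in for the metastore and local temporary folders stand in for HDFS directories; the class `MiniMetastore` and the table names are hypothetical and are not part of Hive.

```python
import os
import shutil
import tempfile


class MiniMetastore:
    """Toy stand-in for the metastore: table name -> (location, external?)."""

    def __init__(self):
        self.tables = {}

    def create_table(self, name, location, external=False):
        os.makedirs(location, exist_ok=True)
        self.tables[name] = (location, external)

    def drop_table(self, name):
        # The metadata entry is removed for both kinds of table.
        location, external = self.tables.pop(name)
        if not external:
            # Inner (managed) table: the data directory is removed too.
            shutil.rmtree(location)


root = tempfile.mkdtemp()
store = MiniMetastore()
store.create_table("inner_tab", os.path.join(root, "inner_tab"))
store.create_table("ext_tab", os.path.join(root, "ext_tab"), external=True)

store.drop_table("inner_tab")
store.drop_table("ext_tab")

print(os.path.exists(os.path.join(root, "inner_tab")))  # False
print(os.path.exists(os.path.join(root, "ext_tab")))    # True
```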


Where is Hive table metadata stored?
  -- In an RDBMS,
    under the metastore database.

 When you submit a query in Hive,
   Hive contacts the metastore, identifies the table's backend HDFS location, and reads the data.
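
The lookup step can be pictured as a simple mapping from table to location. This is a sketch under stated assumptions: the dictionary `METASTORE` and the function `locate` are hypothetical, the entries mirror the tables created earlier in these notes, and the real metastore schema is much richer than this.

```python
# Toy metastore: (database, table) -> HDFS location. Entries mirror the notes.
METASTORE = {
    ("default", "samp1"): "/user/hive/warehouse/samp1",
    ("mydb", "test1"): "/user/hive/warehouse/mydb.db/test1",
}


def locate(database, table):
    """Return the recorded location, or fail if the table is unknown."""
    try:
        return METASTORE[(database, table)]
    except KeyError:
        raise LookupError(f"Table not found: {database}.{table}") from None


print(locate("mydb", "test1"))  # /user/hive/warehouse/mydb.db/test1
```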

By default, every table is an inner table [managed table].

 To create an external table:

hive> create external table etab1(line string);

hive> load data local inpath 'file1'
      into table etab1;

hive> load data local inpath 'file2'
    into table etab1;

 Now etab1 is created under the mydb database, and
  its table directory, etab1, holds 2 files (file1 and file2).

   The table's location is recorded in the Hive metastore (in the RDBMS).

 When this table is dropped from Hive:

hive> drop table etab1;

  -- The metadata of this table is deleted from the RDBMS.
 -- In HDFS, the table directory and its files are still available.
  [data is not lost]

 So, in the future, Hive or another ecosystem tool can use this data. [advantage: reusability]

How to reuse it.
----------------

hive> use mydb;
hive> create table etab1(line string);
hive> select * from etab1;
aaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbb

-- When you create etab1,
  a directory should be created in HDFS under the database location.
  But under /user/hive/warehouse/mydb.db,
   etab1 already exists with two files.

 If the directory exists, Hive uses it;
   if it does not exist, Hive creates it.
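
The "use it if it exists, otherwise create it" rule looks like this in a local model. It is a minimal sketch, assuming a temporary local folder stands in for the database directory; the helper `ensure_table_dir` is hypothetical and only illustrates the rule stated above.

```python
import os
import tempfile


def ensure_table_dir(path):
    """Create the table directory if missing; otherwise leave it untouched.

    Returns True when the directory was already there.
    """
    existed = os.path.isdir(path)
    os.makedirs(path, exist_ok=True)
    return existed


warehouse = tempfile.mkdtemp()
etab1 = os.path.join(warehouse, "etab1")

# First CREATE: the directory does not exist yet, so it is created.
first = ensure_table_dir(etab1)
# Stand-in for a loaded file that survives the drop of an external table.
with open(os.path.join(etab1, "file1"), "w") as f:
    f.write("aaa\n")

# Second CREATE with the same name: the directory and its file are reused.
second = ensure_table_dir(etab1)
print(first, second, os.listdir(etab1))  # False True ['file1']
```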

============

hive> drop table etab1;
 -- Both data and metadata will be deleted,
  because this time we created etab1 as an "inner" table.


=================

Both inner and external tables can use
   custom hdfs locations.

hive> create table mytab(line string)
      location '/user/mydata';

 In HDFS,
   the /user/mydata directory will be created.
hive> load data local inpath 'file1'
      into table mytab;

  Now file1 is copied into /user/mydata.


hive> drop table mytab;
 Here mytab was created as an inner table,
  so both the metadata and the data (/user/mydata)
   are deleted.

hive> create external table urtab(line string)
   location '/user/urdata';

 Now in HDFS,
   the /user/urdata directory will be created.


hive> load data local inpath 'file1'
     into table urtab;

hive> load data local inpath 'file2'
     into table urtab;

hive> load data local inpath 'file3'
     into table urtab;

Now file1, file2, and file3 are copied into
   the /user/urdata directory in HDFS.

hive> drop table urtab;
 -- Only the metadata is deleted from the RDBMS. The /user/urdata directory is still available with file1, file2, and file3.


Reusing it next time:

hive> create table ourtab(line string)
     location '/user/urdata';

  Here /user/urdata already exists in HDFS, so Hive uses it. If it did not exist, Hive would create it.
  Note that ourtab is created as an inner table, so dropping it would also delete /user/urdata.

=====================================