
(1) Create the student and student1 tables (Hive-managed tables):
create table student(id INT, age INT, name STRING)
partitioned by(stat_date STRING)
clustered by(id) sorted by(age) into 4 buckets
row format delimited fields terminated by ',';

create table studentrc(id INT, age INT, name STRING)
partitioned by(stat_date STRING)
clustered by(id) sorted by(age) into 4 buckets
row format delimited fields terminated by ',' stored as rcfile;

create table studentlzo(id INT, age INT, name STRING)
partitioned by(stat_date STRING)
clustered by(id) sorted by(age) into 4 buckets
row format delimited fields terminated by ',' stored as rcfile;

File formats: textfile, sequencefile, rcfile
(2) Set the Hive configuration variable:
set hive.enforce.bucketing = true;
(3) Insert data:
LOAD DATA local INPATH '/home/hadoop/hivetest1.txt' OVERWRITE INTO TABLE student partition(stat_date="20120802");

(CPU usage is very high during this step)
from student
insert overwrite table student1 partition(stat_date="20120802")
select id,age,name where stat_date="20120802" sort by age;

View the data
-- distribute by plays the role of the key in MapReduce
select id, age, name from student distribute by id;
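To make the MapReduce analogy concrete, here is a minimal Python sketch of how distribute by id routes rows: every row with the same id ends up on the same reducer. The function name distribute_by, the sample rows, and the reducer count are made up for illustration, and the hash rule (an INT key hashes to its own value, taken modulo the number of reducers) is an assumption about the default hash partitioner rather than something stated in this article.

```python
from collections import defaultdict


def distribute_by(rows, key, num_reducers):
    """Group rows by reducer index, mimicking a hash partitioner (illustrative only)."""
    reducers = defaultdict(list)
    for row in rows:
        # Assumption: an INT key hashes to its own value; the mask keeps it non-negative.
        reducer = (row[key] & 0x7FFFFFFF) % num_reducers
        reducers[reducer].append(row)
    return dict(reducers)


# Hypothetical sample rows shaped like the student table (id, age, name).
rows = [
    {"id": 1, "age": 20, "name": "a"},
    {"id": 2, "age": 21, "name": "b"},
    {"id": 3, "age": 19, "name": "c"},
    {"id": 1, "age": 22, "name": "d"},
    {"id": 5, "age": 20, "name": "e"},
]

for reducer, assigned in sorted(distribute_by(rows, "id", 2).items()):
    print(reducer, [r["id"] for r in assigned])
```

Both rows with id 1 land on the same reducer, which is what lets a later sort by or aggregation see all rows for a key together.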

Sample the data (generally used for testing)
select * from student tablesample(bucket 1 out of 2 on id);
TABLESAMPLE(BUCKET x OUT OF y)
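Here x must not be greater than y, and y must be a factor or a multiple of the number of buckets specified when the table was created. Hive decides how much data to sample based on y. For example, if the table was split into 32 buckets and y=16, then 32/16=2 buckets are sampled, so TABLESAMPLE(BUCKET 3 OUT OF 16) reads the 3rd bucket and the 19th (16+3) bucket. If y=64, then 32/64=1/2 of a bucket is sampled, so TABLESAMPLE(BUCKET 3 OUT OF 64) reads half of the 3rd bucket.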
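The arithmetic in this paragraph can be checked with a short Python sketch. sampled_buckets is a hypothetical helper, not a Hive API; it only reproduces the rule described here, assuming buckets are numbered from 1 and y is a factor or a multiple of the bucket count.

```python
from fractions import Fraction


def sampled_buckets(num_buckets, x, y):
    """Return (bucket numbers, fraction of each bucket) read by
    TABLESAMPLE(BUCKET x OUT OF y) on a table with num_buckets buckets."""
    if not 1 <= x <= y:
        raise ValueError("x must be between 1 and y")
    if num_buckets % y == 0:
        # y is a factor of the bucket count: read num_buckets / y whole
        # buckets, starting at bucket x and stepping by y.
        return list(range(x, num_buckets + 1, y)), Fraction(1)
    if y % num_buckets == 0:
        # y is a multiple of the bucket count: read a num_buckets / y
        # share of one bucket. Assumption: x wraps around modulo num_buckets.
        return [(x - 1) % num_buckets + 1], Fraction(num_buckets, y)
    raise ValueError("y must be a factor or a multiple of num_buckets")


print(sampled_buckets(32, 3, 16))  # buckets 3 and 19, read in full
print(sampled_buckets(32, 3, 64))  # half of bucket 3
print(sampled_buckets(4, 1, 2))    # the query above on the 4-bucket student table
```

For the student table (4 buckets), BUCKET 1 OUT OF 2 therefore reads buckets 1 and 3, that is, half of the table.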

rcfile operations

// Import (gzip compression)
set hive.enforce.bucketing=true;
set hive.exec.compress.output=true;
set mapred.output.compress=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec;
from student
insert overwrite table studentrc partition(stat_date="20120802")
select id,age,name where stat_date="20120802" sort by age;

// lzo compression
-- rcfile record buffer size: 16 * 1024 * 1024
set hive.io.rcfile.record.buffer.size = 16777216;
-- I/O buffer size: 128 * 1024
set io.file.buffer.size = 131072;

set hive.enforce.bucketing=true;
set hive.exec.compress.output=true;
set mapred.output.compress=true;
set mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
set io.compression.codecs=com.hadoop.compression.lzo.LzoCodec;
from student
insert overwrite table studentlzo partition(stat_date="20120802")
select id,age,name where stat_date="20120802" sort by age;

// sequencefile import
set hive.exec.compress.output=true;
set mapred.output.compress=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec;
insert overwrite table studentseq select * from student;

Using rcfile in Hive
