hadoop – PIG中GROUP和COGROUP有什么区别?

我理解Group没有使用多个元组,因此我们在PIG中使用了COGROUP.但是,今天检查GROUP命令对我有用.我使用的是PIG-0.12.0.
我的命令和输出如下.

grunt> grpvar = GROUP C by $2, B by $2;
grunt> cogrpvar = COGROUP C by $2, B by $2;
grunt> describe grpvar;

grpvar: {group: chararray,C: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)},B: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)}}

grunt> describe cogrpvar;

cogrpvar: {group: chararray,C: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)},B: {(pid: int,pname: chararray,drug: chararray,gender: chararray,tot_amt: int)}}

GROUP预计会像这样工作吗?
GROUP和COGROUP有什么区别?

是的组应该像那样工作!

根据文档(http://pig.apache.org/docs/r0.12.0/basic.html#group):

Note: The GROUP and COGROUP operators are identical. Both operators
work with one or more relations. For readability GROUP is used in
statements involving one relation and COGROUP is used in statements
involving two or more relations. You can COGROUP up to but no more
than 127 relations at a time.

所以它只是为了可读性,两者之间没有区别.

相关文章
相关标签/搜索