[Solved-1 Solution] Group the key values of a map in Pig?
What is group by?
The GroupByKey core transform is a parallel reduction operation used to process collections of key/value pairs.
- We use GroupByKey with an input PCollection of key/value pairs that represents a multimap, where the collection contains multiple pairs that have the same key but different values.
- The GroupByKey transform lets you gather together all of the values in the multimap that share the same key.
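The grouping described above can be illustrated with a minimal plain-Python sketch. It is not Pig or pipeline code: the function name `group_by_key` and the sample pairs are made up for illustration, and the sketch only shows how values that share a key end up collected together.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Gather all values that share a key, keeping keys in first-seen order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# A multimap written as (key, value) pairs: keys repeat with different values.
pairs = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
result = group_by_key(pairs)
print(result)  # {'a': [1, 3], 'b': [2, 5], 'c': [4]}
```

Each key appears once in the result, paired with every value that was attached to it in the input.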
Here we have a file
We know that we can look up a value in a map by supplying its key. In the above example we took the values stored in the map under the key "a". Now assume that we don't know the keys in advance: we need to group the values by key across the relation and dump the result.
Does Pig allow such operations, or do we need to write a UDF?
- We can create a custom UDF which converts the map to a bag (using Pig v0.10.0):
Then use the code below:
- Now group by key and use a nested foreach:
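As a rough illustration of the two steps above (map to bag, then group by key), here is a plain-Python sketch. It is not the original UDF or Pig script: the helper names `map_to_bag` and `group_values_by_key` and the sample rows are hypothetical. In Pig itself, the first step would presumably be a Java UDF (for example a class extending `EvalFunc`) that returns a bag, and the second step would use `FLATTEN`, `GROUP ... BY`, and a nested `FOREACH`.

```python
from collections import defaultdict


def map_to_bag(record_map):
    """Turn one map into a bag (list) of (key, value) tuples."""
    return [(key, value) for key, value in record_map.items()]


def group_values_by_key(records):
    """Flatten every record's bag, then collect the values under each key."""
    grouped = defaultdict(list)
    for record_map in records:
        for key, value in map_to_bag(record_map):
            grouped[key].append(value)
    return dict(grouped)


# Hypothetical input: one map per row; the keys are not known in advance.
rows = [
    {"a": "apple", "b": "banana"},
    {"a": "avocado", "c": "cherry"},
    {"b": "blueberry"},
]

grouped = group_values_by_key(rows)
for key, values in grouped.items():
    print(key, values)
```

Because the map is first turned into (key, value) tuples, the grouping step never needs to know the key names ahead of time, which is the point of converting the map to a bag.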