pig tutorial - apache pig tutorial - apache pig testing - pig latin - apache pig - pig hadoop



Apache PIG Testing and Diagnostics

  • Diagnostics operators
    • DESCRIBE - returns the schema of a relation
    • DUMP - displays results to screen
    • EXPLAIN - displays execution plans
    • ILLUSTRATE - displays a step-by-step execution of a sequence of statements
  • PigUnit
    • helps to perform unit testing and regression testing
  • Penny - framework for creating Pig monitoring and debugging tools
    • https://cwiki.apache.org/confluence/display/PIG/Penny
    • crash investigation – why did my pig script crash?
    • data histograms - print a histogram of a particular field of a particular intermediate data set.
    • record tracing – trace a particular record through the steps of a pig script, and show how it gets transformed
    • integrity alerts – send an alert if a record passing between steps i and j fails to satisfy a particular predicate
    • Overhead profiling – which steps in the pig script dominate the running time?

  • Related Searches to Apache Pig Overview