sqoop - sqoop Import New as well as Updated Data - lastmodified mode - append mode - apache sqoop - sqoop tutorial - sqoop hadoop



Use Sqoop to make incremental copy of a Oracle table to Hadoop

  • Step 1) Create a sqoop job
  • sqoop job \
    --create alarms \
    -- \
    import \
    --connect jdbc:oracle:thin:@devdb11-s.cern.ch:10121/devdb11_s.cern.ch \
    --username hadoop_tutorial \
    -P \
    --num-mappers 1 \
    --target-dir lemontest_alarms_i \
    --table LEMONTEST.ALARMS \
    --incremental append \
    --check-column alarm_id \
    --last-value 0 \
    Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team
  • Step 2) Run the sqoop job
  • sqoop job --exec alarms
    
    Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team
  • Step 3) Run sqoop in incremental mode
  • sqoop import \
    --connect jdbc:oracle:thin:@devdb11-s.cern.ch:10121/devdb11_s.cern.ch \
    --username hadoop_tutorial \
    -P \
    --num-mappers 1 \
    --table LEMONTEST.ALARMS \
    --target-dir lemontest_alarms_i \
    --incremental append \
    --check-column alarm_id \
    --last-value 47354 \
    
    hdfs dfs -ls lemontest_alarms_i/
    Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team

    sqoop Import New as well as Updated Data - lastmodified mode

  • If you are adding new rows and updating existing data
  • You need two additional parameters
    • --check-column : A column name that should be checked for newly appended and updated data. date, time, datetime and timestamp are suitable data types for this column
    • --last-value : The last value that successfully imported into Hadoop. All the newly added and updated data after this value will be imported.
    sqoop import \
    --connect jdbc:mysql://mysql.example.com/testdb \
    --username sqoop \
    --password sqoop \
    --table employee \
    --incremental lastmodified \
    --check-column last_update_date \
    --last-value "2015-10-20 06:00:01"
    Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team

    Import New as well as Updated Data - Append Mode

  • If you are only adding new rows in your RDBMS (not updating existing data)
  • You need two additional parameters:
    • --check-column : A column name that should be checked for newly appended data. Integer would be a suitable data type for this column.
    • --last-value : The last value that successfully imported into Hadoop. All the newly added data after this value will be imported.
  • sqoop import \
    --connect jdbc:mysql://mysql.example.com/testdb \
    --username sqoop \
    --password sqoop \
    --table employee \
    --incremental append \
    --check-column id \
    --last-value 100
    Click "Copy code" button to copy into clipboard - By wikitechy - sqoop tutorial - team

    SQOOP forum

    Sqoop Installation and Download :

    Connecting Sqoop to other databases/datastores :

  • Import data directly into Hive Warehouse
  • Import data from RDBMS to HBase table
  • Import data to new catalog
  • Import the results of a query from a relational database into HDFS
  • Load JDBC Driver
  • Sqoop Export :

  • Sqoop Export basic example
  • Sqoop Import RDBMS Table to HDFS :

  • Sqoop Import RDBMS Table to HDFS
  • Merge data-sets imported via incremental import using Sqoop :

  • Import New as well as Updated Data - lastmodified mode - Append Mode
  • Hadoop with Kite SDK :

  • Kite SDK to demonstrate copying of various file formats to Hadoop
  • Sqoop Advanced :

  • Use Sqoop to copy an Oracle table to Hadoop

  • Related Searches to sqoop Import New as well as Updated Data - lastmodified mode - append mode