Some of the well-known quirks of the Sqoop2 integration test suite are documented here so that developers know what to expect when running it.

How to run the integration tests?

We recommend not running integration tests from your IDE - there can be some strange and unexpected errors there.

Use the command line to run the tests instead.

You can run the entire integration test suite from the root directory with:

Code Block
mvn clean integration-test

However, this also runs the unit tests, so it takes some time. If you want to iteratively run only the integration tests (all of them or just a subset), you first need to install the Sqoop artifacts into your local Maven cache:

Code Block
mvn clean install -DskipTests

Then you can run just the integration tests, skipping the unit tests, with:

Code Block
mvn clean integration-test -pl test

To run a specific test, assuming that you've installed the Sqoop artifacts into your local Maven cache, use the following command (notice that for a single test we're using the target test rather than integration-test):

No Format
mvn clean test -pl test -Dtest=org.apache.sqoop.integration.connector.kafka.FromRDBMSToKafkaTest -DfailIfNoTests=false verify
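The single-test command above can be wrapped in a small shell function so that only the test class name changes between runs. This is a minimal sketch, not part of Sqoop: the function name `run_one_it` and the `DRY_RUN` variable are hypothetical, and the Maven flags are copied unchanged from the command above.

```shell
# Hypothetical helper (not part of Sqoop): run a single integration test class.
# Assumes `mvn clean install -DskipTests` has already been run from the root directory.
run_one_it() {
  if [ -z "${1:-}" ]; then
    echo "usage: run_one_it <fully.qualified.TestClass>" >&2
    return 1
  fi
  # Rebuild the positional parameters as the full mvn command line;
  # the flags are the same as in the command above, only the class varies.
  set -- mvn clean test -pl test "-Dtest=$1" -DfailIfNoTests=false verify
  if [ -n "${DRY_RUN:-}" ]; then
    # Print the command instead of running it.
    echo "$*"
  else
    "$@"
  fi
}
```

Setting `DRY_RUN=1` before calling `run_one_it org.apache.sqoop.integration.connector.kafka.FromRDBMSToKafkaTest` prints the command without starting Maven, which is handy for checking the flags first.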

I'm running tests on a Mac computer

If you see new Java processes popping up as UI applications and stealing the focus while the tests run, you can disable this by exporting the following property before using mvn:

No Format
export _JAVA_OPTIONS=-Djava.awt.headless=true

How to run the integration tests on LocalJobRunner instead of MiniCluster?

To run with local MapReduce (faster, and in theory you should be able to attach a debugger):

Warning

HadoopLocalRunner may have some quirks and is not always recommended, but it is much faster than the default MiniCluster option.

No Format
mvn clean integration-test -pl test -Dsqoop.hadoop.runner.class=org.apache.sqoop.test.hadoop.HadoopLocalRunner -Dtest=org.apache.sqoop.integration.connector.kafka.FromRDBMSToKafkaTest -DfailIfNoTests=false verify

How does the integration test suite work? 

MiniCluster (pseudo-distributed mode)

...

A good blog post explaining the modes of testing in Hadoop.

How to debug the integration tests?

 

No Format
//todo:VB

 

What DB do the integration tests use today for storing the Sqoop entities?

By default it is embedded Derby.

 

Code Block
public class DerbyProvider extends DatabaseProvider {
  @Override
  public void start() {
    // Start embedded server
    try {
      port = NetworkUtils.findAvailablePort();
      LOG.info("Will bind to port " + port);
      server = new NetworkServerControl(InetAddress.getByName("localhost"), port);
      server.start(new LoggerWriter(LOG, Level.INFO));
      // start() won't throw an exception if the server fails to start; one
      // has to explicitly call ping() in order to verify that the server is
      // up. Check DERBY-1465 for more details.
      server.ping();
    } catch (Exception e) {
      LOG.error("Can't start Derby network server", e);
      throw new RuntimeException("Can't derby server", e);
    }
    super.start();
  }

  // Remaining members omitted from this excerpt.
}

NOTE: Even though there are other providers such as MySQLProvider and PostgreSQLProvider, they are not used in any of the tests.

What are the datasets we use in some of the integration tests?

Anything that extends the following base class:

Code Block
public abstract class DataSet { ..}

 

Where to look for MR Job related logs in the integration tests?

Look under /path/to/sqoop2/test/target in your source folder. Inside each of the MiniMRCluster_XXXX folders there will be subfolders and logs.

...

Code Block
/path/to/sqoop2/test/target/MiniMRCluster_96106422

MiniMRCluster_96106422-localDir-nm-0_0	MiniMRCluster_96106422-localDir-nm-0_2	MiniMRCluster_96106422-logDir-nm-0_0	MiniMRCluster_96106422-logDir-nm-0_2

MiniMRCluster_96106422-localDir-nm-0_1	MiniMRCluster_96106422-localDir-nm-0_3	MiniMRCluster_96106422-logDir-nm-0_1	MiniMRCluster_96106422-logDir-nm-0_3
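To avoid clicking through each logDir folder by hand, the files under them can be listed in one go. The sketch below assumes only the folder naming shown in the listing above (`MiniMRCluster_<id>-logDir-...`). The function name `list_minimr_logs` is hypothetical, and nothing is assumed about what the files inside are called.

```shell
# Hypothetical helper (not part of Sqoop): list every regular file that lives
# under a MiniMRCluster_*-logDir-* folder. $1 is the test/target directory.
list_minimr_logs() {
  target_dir="${1:-test/target}"
  # -path matches against the whole path, so files below the logDir
  # folders match while files under the localDir folders do not.
  find "$target_dir" -type f -path '*MiniMRCluster_*-logDir-*' | sort
}
```

For example, `list_minimr_logs /path/to/sqoop2/test/target | xargs grep -l Exception` would narrow the list down to files that mention an exception.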

What happens when integration tests are abruptly terminated due to CTRL + C or failures?

 

Please look for zombie Java processes and kill them all before running the integration tests again. Currently the cluster does not shut down cleanly.

...

Code Block
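# List the Java processes, then kill them all
ps -ef | grep java
killall -9 java

# Or, more selectively, kill only the YarnChild processes
# (replace <username> with your own user name)
for p in $(ps aux | grep java | grep YarnChild | sed -E "s/^<username> +([0-9]+) .*/\1/"); do echo "$p"; kill -9 "$p"; done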
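If editing the user name into the sed expression is inconvenient, the PID column can be picked out with awk instead. This is a sketch under the assumption that the input is in `ps aux` format, where the PID is the second column. The function name `yarnchild_pids` is hypothetical.

```shell
# Hypothetical helper (not part of Sqoop): read `ps aux` style output on stdin
# and print the PID (second column) of every line that mentions YarnChild.
# Lines belonging to grep or awk themselves are skipped.
yarnchild_pids() {
  awk '/YarnChild/ && !/grep/ && !/awk/ { print $2 }'
}
```

It could then be used as `for p in $(ps aux | yarnchild_pids); do echo "$p"; kill -9 "$p"; done`, which kills nothing when no YarnChild process is left.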

Unusual "Tomcat failed to start" issue?

First check the tomcat.log of the failing test, for example /path/to/sqoop/test/target/sqoop-cargo-tests/org.apache.sqoop.integration.connector.jdbc.generic.FromRDBMSToHDFSTest/testBasic/log/tomcat.log

...

Solution: Nuke the cargo directory under your temporary folder, for example /var/folders/l8/hyl1hnqj3vq57gdf8f9nb0740000gp/T/cargo
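Since the /var/folders/... prefix differs from machine to machine, the cleanup can be written against the temporary directory instead of a hard-coded path. This is a minimal sketch. It assumes the cargo directory sits directly under the temporary folder, as in the path above, and that `TMPDIR` points at that folder (which is normally the case on a Mac). The function name `clean_cargo_dir` is hypothetical.

```shell
# Hypothetical helper (not part of Sqoop): remove <tmpdir>/cargo.
# Assumption: the cargo directory lives directly under the temporary folder.
# $1 overrides the temporary folder; otherwise TMPDIR is used, then /tmp.
clean_cargo_dir() {
  tmp_root="${1:-${TMPDIR:-/tmp}}"
  # Strip one trailing slash so the printed path has no double slash.
  cargo_dir="${tmp_root%/}/cargo"
  if [ -d "$cargo_dir" ]; then
    echo "Removing $cargo_dir"
    rm -rf "$cargo_dir"
  else
    echo "Nothing to remove at $cargo_dir"
  fi
}
```

Run `clean_cargo_dir` with no arguments before restarting the tests. The function prints the path it acts on, so it is easy to confirm that the right directory was targeted.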

 

Some related tickets that are in place to fix some of these quirks:

 

...