...
How does the integration test suite work?
Minicluster (pseudo-distributed mode)
- They use the Hadoop Minicluster behind the scenes to simulate the MR execution engine environment.
- Read more about Minicluster here
- http://gdfm.me/2010/08/03/how-to-run-a-minicluster-based-junit-test-with-eclipse/
- The integration tests are tightly tied to the MR execution engine at this point. Some rework will be needed to get them working in a Spark execution engine context.
The Minicluster runs the tests as if on a real distributed cluster; the only difference is that everything happens in the same JVM. Hence it is also referred to as pseudo-distributed mode.
LocalMode (localRunner mode)
- When the option -Dsqoop.hadoop.runner.class=org.apache.sqoop.test.hadoop.HadoopLocalRunner is used, the tests do not use the Minicluster and
...
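As a rough illustration of how a harness can honor such a property, here is a self-contained sketch that instantiates a runner class named by a system property via reflection. The `HadoopRunner` interface and its method are assumptions for illustration only; the real runner classes live in `org.apache.sqoop.test.hadoop`.

```java
// Hypothetical sketch: selecting a runner implementation from the
// -Dsqoop.hadoop.runner.class system property. The interface and class
// names below are illustrative stand-ins, not the real Sqoop API.
public class RunnerSelector {

  /** Minimal stand-in for the real runner abstraction. */
  public interface HadoopRunner {
    String describe();
  }

  /** Default: Minicluster-backed runner. */
  public static class MiniclusterRunner implements HadoopRunner {
    public String describe() { return "minicluster"; }
  }

  /** Local-mode runner, selected via the system property. */
  public static class LocalRunner implements HadoopRunner {
    public String describe() { return "local"; }
  }

  /** Instantiate the runner named by the property, falling back to the default. */
  public static HadoopRunner createRunner(String className) throws Exception {
    if (className == null || className.isEmpty()) {
      return new MiniclusterRunner();
    }
    return (HadoopRunner) Class.forName(className)
        .getDeclaredConstructor().newInstance();
  }

  public static void main(String[] args) throws Exception {
    String prop = System.getProperty("sqoop.hadoop.runner.class");
    System.out.println(createRunner(prop).describe()); // prints "minicluster" when unset
  }
}
```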
In our code, this is how we detect that the localRunner is being used:
Code Block:
/**
 * Detect MapReduce local mode.
 *
 * @return True if we're running in local mode
 */
private boolean isLocal() {
  // If framework is set to YARN, then we can't be running in local mode
  if ("yarn".equals(globalConfiguration.get("mapreduce.framework.name"))) {
    return false;
  }
  // If job tracker address is "local" then we're running in local mode
  return "local".equals(globalConfiguration.get("mapreduce.jobtracker.address"))
      || "local".equals(globalConfiguration.get("mapred.job.tracker"));
}
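The detection logic above can be exercised in isolation. Here is a minimal self-contained sketch that mirrors it, using java.util.Properties in place of Hadoop's Configuration object (an assumption made purely so the example runs without Hadoop on the classpath):

```java
import java.util.Properties;

// Self-contained mirror of the local-mode detection shown above, with
// java.util.Properties standing in for Hadoop's Configuration.
public class LocalModeCheck {

  static boolean isLocal(Properties conf) {
    // If the framework is set to YARN, we can't be running in local mode.
    if ("yarn".equals(conf.getProperty("mapreduce.framework.name"))) {
      return false;
    }
    // If the job tracker address is "local", we're running in local mode.
    // Both the new and the deprecated property names are checked.
    return "local".equals(conf.getProperty("mapreduce.jobtracker.address"))
        || "local".equals(conf.getProperty("mapred.job.tracker"));
  }

  public static void main(String[] args) {
    Properties local = new Properties();
    local.setProperty("mapreduce.jobtracker.address", "local");
    System.out.println(isLocal(local)); // prints "true"

    Properties yarn = new Properties();
    yarn.setProperty("mapreduce.framework.name", "yarn");
    System.out.println(isLocal(yarn)); // prints "false"
  }
}
```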
www.lopakalogic.com/articles/hadoop-articles/hadoop-testing-with-minicluster/
What DB do the integration tests use today for storing the Sqoop entities?
By default, it is embedded Derby.
Code Block:
public class DerbyProvider extends DatabaseProvider {
  @Override
  public void start() {
    // Start embedded server
    try {
      port = NetworkUtils.findAvailablePort();
      LOG.info("Will bind to port " + port);

      server = new NetworkServerControl(InetAddress.getByName("localhost"), port);
      server.start(new LoggerWriter(LOG, Level.INFO));

      // Start won't throw an exception in case it fails to start; one
      // has to explicitly call ping() in order to verify if the server is
      // up. Check DERBY-1465 for more details.
      server.ping();
    } catch (Exception e) {
      LOG.error("Can't start Derby network server", e);
      throw new RuntimeException("Can't start Derby server", e);
    }

    super.start();
  }
}
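The interesting step above is picking a free port before binding the Derby network server. A common way to implement a helper like NetworkUtils.findAvailablePort (the following is a sketch of the usual technique, not necessarily the actual Sqoop implementation) is to bind a ServerSocket to port 0 so the OS assigns a free ephemeral port, then release it:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Sketch of a findAvailablePort helper: binding to port 0 asks the OS for a
// free ephemeral port; closing the socket releases it for the real server.
public class PortFinder {

  static int findAvailablePort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("Found free port: " + findAvailablePort());
  }
}
```

Note there is an inherent race: another process could grab the port between close() and the server's bind, which is why the Derby snippet still calls ping() to verify the server actually came up.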
NOTE: Even though there are other providers such as MySQLProvider and PostgreSQLProvider, they are not used in any of the tests.
What datasets do we use in some of the integration tests?
Anything that extends the following base class:
Code Block:
public abstract class DataSet { .. }
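To illustrate the pattern, here is a hypothetical sketch of what a DataSet subclass could look like. The method names (getColumns, getRows) and the CitiesDataSet example are assumptions for illustration only; consult the Sqoop source for the real DataSet contract.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the DataSet pattern: a base class defines the
// contract and concrete subclasses supply the test data. None of these names
// are the real Sqoop API.
public class DataSetExample {

  /** Illustrative stand-in for the real DataSet base class. */
  public static abstract class DataSet {
    public abstract List<String> getColumns();
    public abstract List<List<Object>> getRows();
  }

  /** A concrete dataset a test could load into the database under test. */
  public static class CitiesDataSet extends DataSet {
    @Override
    public List<String> getColumns() {
      return Arrays.asList("id", "country", "city");
    }

    @Override
    public List<List<Object>> getRows() {
      return Arrays.asList(
          Arrays.asList((Object) 1, "USA", "San Francisco"),
          Arrays.asList((Object) 2, "Czech Republic", "Brumovice"));
    }
  }

  public static void main(String[] args) {
    DataSet ds = new CitiesDataSet();
    System.out.println(ds.getColumns()); // prints "[id, country, city]"
  }
}
```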
...