Note: the features described on this page are still in progress and may not be fully implemented yet.
Building from sources
Check out the sources:
$ svn co https://svn.apache.org/repos/asf/sqoop/branches/sqoop2
Then change to the Sqoop2 source directory and build the sources:
$ cd sqoop2
$ mvn install
Creating binaries
Now build and package Sqoop2 as a distribution:
$ mvn package -Pdist
This process creates a directory and a tarball under the dist/target directory. The directory (named sqoop-2.0.0-SNAPSHOT as of this writing) contains the binaries necessary to run Sqoop2, and its structure looks something like this:
--+ bin --+ sqoop
  |
  + conf --+ sqoop_bootstrap.properties
  |        |
  |        + sqoop.properties
  |
  + client --+ lib --+ sqoop-common.jar
  |                  |
  |                  + sqoop-client.jar
  |                  |
  |                  + (3rd-party client dependency jars)
  |
  + server --+ bin --+ setenv.sh
  |          |
  |          + webapps --+ sqoop.war
  |
  + ...
As part of this process, a copy of the Tomcat server is also downloaded and placed under the server directory in the above structure.
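After packaging, a quick sanity check can confirm that the expected files are in place. The following is a minimal sketch and not part of Sqoop2 itself: check_dist is a hypothetical helper name, and the list of paths is simply taken from the structure shown above. The client jars are left out on the assumption that their file names may carry a version suffix.

```shell
# Hypothetical helper (not shipped with Sqoop2): report which of the files
# named in the layout above are missing from a distribution directory.
# Returns 0 when everything is present, 1 otherwise.
check_dist() {
  dist_dir="$1"
  missing=0
  for path in \
    bin/sqoop \
    conf/sqoop_bootstrap.properties \
    conf/sqoop.properties \
    server/bin/setenv.sh \
    server/webapps/sqoop.war
  do
    if [ ! -e "$dist_dir/$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}

# Example call; the directory name is the one quoted above and may differ in your build.
# check_dist dist/target/sqoop-2.0.0-SNAPSHOT
```

The function prints one line per missing path, so an empty output together with a zero exit status suggests the layout matches the one described here.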
Starting/Stopping Sqoop2 server
To start the Sqoop2 server, change to the Sqoop2 distribution directory and invoke the sqoop shell script:
cd dist/target/sqoop-2.0.0-SNAPSHOT
bin/sqoop server start
The Sqoop2 server is then running as a web application within the Tomcat server.
Similarly, to stop the Sqoop2 server, run:
bin/sqoop server stop
Starting/Running Sqoop2 client
To start an interactive shell, run:
bin/sqoop client
This will bring up an interactive client ready for input commands:
Sqoop Shell: Type 'help' or '\h' for help.
sqoop:000>
Alternatively, the shell client can be run in script mode. For example, a command script can be created with:
echo "set server --host localhost --port 8080 --webapp sqoop" > sqoop.script
echo "show version --all" >> sqoop.script
Then, the command script can be run with
bin/sqoop client sqoop.script
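The command script can also be written as a single here-document, which is easier to extend as the script grows. This is a sketch using the same example values as above (server on localhost, port 8080, webapp sqoop). The guard around bin/sqoop is an addition for illustration, so that the snippet only reports a message when run outside the distribution directory.

```shell
# Write the two client commands to sqoop.script using a here-document.
cat > sqoop.script <<'EOF'
set server --host localhost --port 8080 --webapp sqoop
show version --all
EOF

# Run the script only if the launcher is present in the current directory.
if [ -x bin/sqoop ]; then
  bin/sqoop client sqoop.script
else
  echo "bin/sqoop not found; run this from the Sqoop2 distribution directory" >&2
fi
```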
The command for the shell client looks something like <command> <function> <options>:
- set
  - set server
    - set server --host <host>
    - set server --port <port>
    - set server --webapp <webapp>
- show
  - show version
    - show version --all
    - show version --server
    - show version --client
    - show version --protocol
  - show connector
  - show connection
  - show job
- create
  - create connection --id <connector id> <more connection options>
  - create job --cid <connection id> <more job options>
- start
  - start job --jid <job id>
- stop
  - stop job --jid <job id>
Modifying configuration
Both the default bootstrap configuration sqoop_bootstrap.properties and the main configuration sqoop.properties are located under the conf directory in the Sqoop2 distribution directory.
The bootstrap configuration sqoop_bootstrap.properties controls which mechanism provides the configuration:
sqoop.config.provider=org.apache.sqoop.core.PropertiesConfigurationProvider
The main configuration sqoop.properties controls the repository mechanism, the location of the log files, the logging levels, and so on:
# Log4J system
org.apache.sqoop.log4j.appender.file=org.apache.log4j.RollingFileAppender
org.apache.sqoop.log4j.appender.file.File=logs/sqoop.log
org.apache.sqoop.log4j.appender.file.MaxFileSize=25MB
org.apache.sqoop.log4j.appender.file.MaxBackupIndex=5
org.apache.sqoop.log4j.appender.file.layout=org.apache.log4j.PatternLayout
org.apache.sqoop.log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} [%l] %m%n
org.apache.sqoop.log4j.debug=true
org.apache.sqoop.log4j.rootCategory=WARN, file
org.apache.sqoop.log4j.category.org.apache.sqoop=DEBUG
org.apache.sqoop.log4j.category.org.apache.derby=INFO

# Repository
org.apache.sqoop.repository.provider=org.apache.sqoop.repository.JdbcRepositoryProvider
org.apache.sqoop.repository.jdbc.handler=org.apache.sqoop.repository.derby.DerbyRepositoryHandler
org.apache.sqoop.repository.jdbc.transaction.isolation=READ_COMMITTED
org.apache.sqoop.repository.jdbc.maximum.connections=10
org.apache.sqoop.repository.jdbc.url=jdbc:derby:repository/db;create=true
org.apache.sqoop.repository.jdbc.create.schema=true
org.apache.sqoop.repository.jdbc.driver=org.apache.derby.jdbc.EmbeddedDriver
org.apache.sqoop.repository.jdbc.user=sa
org.apache.sqoop.repository.jdbc.password=
org.apache.sqoop.repository.sysprop.derby.stream.error.file=logs/derbyrepo.log
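When checking which value is currently in effect, it can help to read a single key from the file rather than scanning it by eye. The following is a minimal sketch and not part of Sqoop2: get_prop is a hypothetical helper that handles only plain key=value lines (no line continuations or escaped characters), and the sample file below just repeats two entries from the listing above.

```shell
# Hypothetical helper: print the value of one key from a properties file.
# $1 = properties file, $2 = key. Only simple "key=value" lines are handled.
get_prop() {
  awk -v key="$2" '
    # Match lines that start with "key=" and print everything after the "=".
    index($0, key "=") == 1 { print substr($0, length(key) + 2); exit }
  ' "$1"
}

# Self-contained demonstration on a temporary copy of two entries.
sample=$(mktemp)
cat > "$sample" <<'EOF'
org.apache.sqoop.log4j.appender.file.File=logs/sqoop.log
org.apache.sqoop.repository.jdbc.url=jdbc:derby:repository/db;create=true
EOF

get_prop "$sample" org.apache.sqoop.repository.jdbc.url
```

Against a real distribution, the first argument would be conf/sqoop.properties. Values that themselves contain an equals sign, such as the JDBC URL, are returned whole because only the first separator after the key is skipped.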
Debugging information
The logs of the Tomcat server are located under the server/logs directory in the Sqoop2 distribution directory.
The logs of the Sqoop2 server and the Derby repository are written to sqoop.log and derbyrepo.log, respectively, under the logs directory in the Sqoop2 distribution directory. These are the default locations and apply unless changed in the configuration above.
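To look at the most recent entries of both logs in one step, something like the following could be used. It is a sketch only: show_logs is a hypothetical helper, and it assumes the default log locations described above have not been changed in sqoop.properties.

```shell
# Hypothetical helper: print the last lines of the Sqoop2 and Derby logs.
# $1 = Sqoop2 distribution directory, $2 = number of lines (default 20).
show_logs() {
  dist_dir="$1"
  lines="${2:-20}"
  for log in logs/sqoop.log logs/derbyrepo.log; do
    if [ -f "$dist_dir/$log" ]; then
      echo "== $log"
      tail -n "$lines" "$dist_dir/$log"
    else
      echo "== $log (not found)"
    fi
  done
}

# Example call; the directory name may differ in your build.
# show_logs dist/target/sqoop-2.0.0-SNAPSHOT 50
```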