When an Impala minicluster is started on a local machine with '$IMPALA_HOME/bin/start-impala-cluster.py', no authorization service (e.g., Sentry or Ranger) is enabled by default. The following steps enable the Ranger service on an Impala minicluster.

  1. Execute '$IMPALA_HOME/testdata/bin/kill-all.sh'
  2. Execute 'source $IMPALA_HOME/bin/impala-config.sh'
  3. Execute '$IMPALA_HOME/buildall.sh -noclean -notests -ninja'
  4. Execute '$IMPALA_HOME/bin/create-test-configuration.sh -create_ranger_policy_db'
  5. Execute '$IMPALA_HOME/testdata/bin/run-all.sh'
  6. Execute '$IMPALA_HOME/testdata/bin/create-load-data.sh'
  7. Execute the following command. The same arguments can also be found in '$IMPALA_HOME/tests/authorization/test_ranger.py'. The first 4 arguments are consumed by 'start-impala-cluster.py' itself; the three arguments wrapped in single quotation marks are forwarded to statestore, impalad, and catalogd, respectively.

$IMPALA_HOME/bin/start-impala-cluster.py \
--cluster_size=3 \
--num_coordinators=3 \
--log_dir=/tmp/ \
--log_level=1 \
'--state_store_args=--statestore_update_frequency_ms=50 --statestore_priority_update_frequency_ms=50 --statestore_heartbeat_frequency_ms=50' \
'--impalad_args=--server-name=server1 --ranger_service_type=hive --ranger_app_id=impala --authorization_provider=ranger' \
'--catalogd_args=--server-name=server1 --ranger_service_type=hive --ranger_app_id=impala --authorization_provider=ranger'
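
Steps 1 to 6 can also be collected into one helper function so that a failure stops the sequence early. The following is a minimal sketch, not a script shipped with Impala: the names 'run_step', 'prepare_ranger_minicluster', and 'DRY_RUN' are hypothetical, and only the commands listed in the steps above are invoked.

```shell
# Sketch only: run_step, prepare_ranger_minicluster and DRY_RUN are
# hypothetical names, not part of the Impala source tree.

# Run a command, or only print it when DRY_RUN=1.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

prepare_ranger_minicluster() {
  if [ -z "${IMPALA_HOME:-}" ]; then
    echo "IMPALA_HOME must be set" >&2
    return 1
  fi
  # Step 1: stop anything left over from a previous minicluster.
  run_step "$IMPALA_HOME/testdata/bin/kill-all.sh" || return 1
  # Step 2: must be sourced so its exports reach the later steps.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ . $IMPALA_HOME/bin/impala-config.sh"
  else
    . "$IMPALA_HOME/bin/impala-config.sh" || return 1
  fi
  # Steps 3 to 6: build, create the Ranger policy DB, start services, load data.
  run_step "$IMPALA_HOME/buildall.sh" -noclean -notests -ninja || return 1
  run_step "$IMPALA_HOME/bin/create-test-configuration.sh" -create_ranger_policy_db || return 1
  run_step "$IMPALA_HOME/testdata/bin/run-all.sh" || return 1
  run_step "$IMPALA_HOME/testdata/bin/create-load-data.sh" || return 1
}
```

Calling 'DRY_RUN=1 prepare_ranger_minicluster' prints the six commands without running them, which is a cheap way to check the paths before committing to a full build and data load.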

Once the minicluster has started, we can log into the Impala shell as the administrator 'admin' by executing '$IMPALA_HOME/bin/impala-shell -u admin'. The 'admin' account was added by the function 'setup-ranger' in 'create-load-data.sh' above. To check whether Impala is Ranger-enabled, execute 'refresh authorization' in the Impala shell. If Ranger is enabled in Impala, the output is similar to the following.

[localhost:21000] default> refresh authorization;
Query: refresh authorization
Query submitted at: 2019-08-29 15:17:37 (Coordinator: http://fangyurao-OptiPlex-9020:25000)
Query progress can be monitored at: http://fangyurao-OptiPlex-9020:25000/query_plan?query_id=374567f0bf4ca48b:0769d23700000000
Fetched 0 row(s) in 0.02s

If the Ranger service is not correctly enabled, then executing '$IMPALA_HOME/bin/impala-shell -u admin' followed by 'refresh authorization' in the Impala shell may produce the following error message instead.

[localhost:21000] default> refresh authorization;
Query: refresh authorization
Query submitted at: 2019-08-29 15:23:37 (Coordinator: http://fangyurao-OptiPlex-9020:25000)
ERROR: AnalysisException: Authorization is not enabled. To enable authorization restart Impala with the --server_name=<name> flag.
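
The two outcomes above can be told apart mechanically. The following is a minimal sketch under stated assumptions: 'classify_refresh_output' is a hypothetical helper name, and the match strings are taken only from the two sample outputs shown on this page, so other Impala versions may word them differently.

```shell
# Sketch only: classify_refresh_output is a hypothetical helper name.
# Reads impala-shell output on stdin and prints enabled, disabled, or unknown.
classify_refresh_output() {
  case "$(cat)" in
    *"Authorization is not enabled"*) echo "disabled" ;;
    *"Fetched 0 row(s)"*) echo "enabled" ;;
    *) echo "unknown" ;;
  esac
}

# Possible usage (assumes the shell's -q option for running a single query,
# and merges stderr because status lines may be written there):
#   "$IMPALA_HOME/bin/impala-shell" -u admin -q 'refresh authorization' 2>&1 \
#     | classify_refresh_output
```

The error pattern is checked first so that output containing the AnalysisException is never reported as enabled.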

We also note that the 'admin' account currently exists in the Ranger service only after '$IMPALA_HOME/testdata/bin/create-load-data.sh' has run (the 6th step above to start a Ranger-enabled Impala minicluster). That script does much more than is needed for this purpose, because it also loads the entire test dataset, which is time-consuming. A better approach would be to call only the function 'setup-ranger' from 'create-load-data.sh'. To achieve this, we may consider moving 'setup-ranger' out of 'create-load-data.sh' and having 'create-load-data.sh' call it.

Troubleshooting

1. Tests fail with errors like "AuthorizationException: User 'admin' does not have privileges to execute ..."

Ranger may not be configured correctly. Check the log at ${IMPALA_HOME}/logs/cluster/ranger/ranger-admin-${HOSTNAME}-${USER}.log, which may contain errors like

2019-10-10 02:29:41,007 [http-bio-6080-exec-2] ERROR org.apache.ranger.common.ServiceUtil (ServiceUtil.java:1359) - Requested Service not found. serviceName=test_impala
2019-10-10 02:29:41,008 [http-bio-6080-exec-2] INFO  org.apache.ranger.common.RESTErrorUtil (RESTErrorUtil.java:345) - Request failed. loginId=null, logMessage="RANGER_ERROR_SERVICE_NOT_FOUND: ServiceName=test_impala"
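
To avoid scanning the log by eye, the missing service names can be extracted with grep. This is a minimal sketch: 'missing_ranger_services' is a hypothetical helper name, and the pattern assumes the log message format shown in the two sample lines above.

```shell
# Sketch only: missing_ranger_services is a hypothetical helper name.
# Prints each service name reported as RANGER_ERROR_SERVICE_NOT_FOUND
# in the given Ranger admin log, one per line, without duplicates.
missing_ranger_services() {
  grep -o 'RANGER_ERROR_SERVICE_NOT_FOUND: ServiceName=[^"[:space:]]*' "$1" |
    sed 's/.*ServiceName=//' |
    sort -u
}

# Possible usage:
#   missing_ranger_services \
#     "${IMPALA_HOME}/logs/cluster/ranger/ranger-admin-${HOSTNAME}-${USER}.log"
```

An empty result means no such error was logged, in which case the cause of the AuthorizationException probably lies elsewhere.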

If the log shows that the service is missing, the fastest fix is to create it manually. Run these commands (taken from setup-ranger() in testdata/bin/create-load-data.sh) in your shell:

RANGER_SETUP_DIR="${IMPALA_HOME}/testdata/cluster/ranger/setup"

perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \
  "${RANGER_SETUP_DIR}/impala_group.json.template" > \
  "${RANGER_SETUP_DIR}/impala_group.json"

export GROUP_ID=$(wget -qO - --auth-no-challenge --user=admin --password=admin \
  --post-file="${RANGER_SETUP_DIR}/impala_group.json" \
  --header="accept:application/json" \
  --header="Content-Type:application/json" \
  http://localhost:6080/service/xusers/secure/groups |
  python -c "import sys, json; print(json.load(sys.stdin)['id'])")

perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \
  "${RANGER_SETUP_DIR}/impala_user.json.template" > \
  "${RANGER_SETUP_DIR}/impala_user.json"

wget -O /dev/null --auth-no-challenge --user=admin --password=admin \
  --post-file="${RANGER_SETUP_DIR}/impala_user.json" \
  --header="Content-Type:application/json" \
  http://localhost:6080/service/xusers/secure/users

wget -O /dev/null --auth-no-challenge --user=admin --password=admin \
  --post-file="${RANGER_SETUP_DIR}/impala_service.json" \
  --header="Content-Type:application/json" \
  http://localhost:6080/service/public/v2/api/service

Then you should be able to see the "test_impala" service in your Ranger portal (by default at http://localhost:6080).
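
The presence of the service can also be checked from the command line instead of the portal. The following is a minimal sketch under stated assumptions: the function names and the 'RANGER_URL' variable are hypothetical, and the endpoint is assumed to be the "get service by name" call of Ranger's public v2 REST API, which may differ between Ranger versions.

```shell
# Sketch only: ranger_service_url, ranger_service_exists and RANGER_URL are
# hypothetical names; the endpoint path is an assumption about Ranger's
# public v2 REST API.

# Build the lookup URL for a service name.
ranger_service_url() {
  echo "${RANGER_URL:-http://localhost:6080}/service/public/v2/api/service/name/$1"
}

# Succeeds (exit status 0) only if Ranger answers the lookup successfully.
ranger_service_exists() {
  wget -qO /dev/null --auth-no-challenge --user=admin --password=admin \
    "$(ranger_service_url "$1")"
}

# Possible usage:
#   ranger_service_exists test_impala && echo "test_impala is registered"
```

The credentials and wget options mirror the ones used by the commands above, so the check fails in the same way they do when Ranger is not running.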

If the wget commands fail, try restarting Ranger with testdata/bin/run-ranger-server.sh. If Ranger fails to start, try recreating the Ranger policy database with "bin/create-test-configuration.sh -create_ranger_policy_db".


