Comment: Added section on running automated code checks locally.

...

We use the Google Logging Library. See https://github.com/google/glog/blob/master/README.rst

The library defines logging levels of ERROR, INFO and WARNING. We also use verbose logging, which can be turned on with environment variables, e.g.:

...

Code Block
export GLOG_logtostderr=1

Call Trace

Sometimes you may want to know how execution reaches a particular function. In the backend, you can log the output of GetStackTrace():

Code Block
VLOG_QUERY << "args: " << your_interested_var << std::endl << GetStackTrace();

Make sure "util/debug-util.h" is included in your file.

In the frontend, you can log a newly created Exception, so that its stack trace is written along with the message:

Code Block
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class YourInterestedClass {
  private static final Logger LOG = LoggerFactory.getLogger(YourInterestedClass.class);
  public void yourInterestedFunc() {
    ...
    // Passing an Exception as the last argument makes the logger print its stack trace.
    LOG.info("some message", new Exception("call trace"));
    ...
  }
}

Existing Hadoop Installations

If you can avoid it, don't install or run another Hadoop on your system. Doing so is an easy source of problems when developing, because out-of-date binaries, headers and configuration files can get silently picked up.

 

Useful Bookmarklets for Your Browser

Mirror upstream JIRA to downstream

We track all upstream P1 and P2 issues in our downstream JIRA. Replication happens automatically, but can also be triggered by running the jira-mirror Jenkins job. To trigger this job, you can use this bookmarklet on any public JIRA page:

Code Block
javascript:location.href='http://golden.jenkins.cloudera.com/view/Impala/job/jira-mirror/parambuild/?UPSTREAM_ISSUE='+document.location.href;

Navigate from upstream JIRA to downstream

Often you find yourself navigating between upstream and downstream JIRA pages related to the same issue. All downstream JIRAs should have a link to the upstream JIRA. To find the corresponding downstream JIRA from an upstream page, you can use this bookmarklet:

Code Block
javascript:location.href='https://jira.cloudera.com/issues/?jql=text%20~%20%22'+document.location["pathname"].split('/')[2]+'%22';

It works on pages like https://issues.cloudera.org/browse/IMPALA-3641.
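To see what the bookmarklet computes, the same string manipulation can be run outside the browser. This is only a sketch: the function name downstreamSearchUrl is hypothetical, and it assumes an environment with the standard URL class (Node.js or a browser console).

```javascript
// Hypothetical helper that mirrors the bookmarklet; the name is illustrative only.
function downstreamSearchUrl(upstreamPageUrl) {
  // Same extraction the bookmarklet performs on document.location["pathname"]:
  // "/browse/IMPALA-3641".split('/') yields ["", "browse", "IMPALA-3641"].
  const issueKey = new URL(upstreamPageUrl).pathname.split('/')[2];
  // Builds a JQL full-text search for the quoted issue key: text ~ "IMPALA-3641".
  return 'https://jira.cloudera.com/issues/?jql=text%20~%20%22' + issueKey + '%22';
}

console.log(downstreamSearchUrl('https://issues.cloudera.org/browse/IMPALA-3641'));
// prints https://jira.cloudera.com/issues/?jql=text%20~%20%22IMPALA-3641%22
```

Because the issue key is taken from the third path segment, the bookmarklet only works on pages whose path has the form /browse/<ISSUE-KEY>.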

Navigate to latest version of docs

Google search will often send you to documentation of older releases. To navigate to the latest version of a documentation page, you can use this bookmarklet:

Code Block
javascript:location.href='http://www.cloudera.com/documentation/enterprise/latest/topics/' + document.location["pathname"].substring(document.location["pathname"].lastIndexOf('/') + 1);

It works on pages like https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_parquet.html.
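The rewrite this bookmarklet performs can be checked with a small standalone sketch. The function name latestDocsUrl is hypothetical, and the snippet assumes the standard URL class is available (Node.js or a browser console).

```javascript
// Hypothetical helper that mirrors the bookmarklet; the name is illustrative only.
function latestDocsUrl(docsPageUrl) {
  const pathname = new URL(docsPageUrl).pathname;
  // Keep only the file name after the last '/', e.g. "impala_parquet.html".
  const topicFile = pathname.substring(pathname.lastIndexOf('/') + 1);
  // Re-root the topic under the "latest" documentation tree.
  return 'http://www.cloudera.com/documentation/enterprise/latest/topics/' + topicFile;
}

console.log(latestDocsUrl(
  'https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_parquet.html'));
// prints http://www.cloudera.com/documentation/enterprise/latest/topics/impala_parquet.html
```

Only the path is used, so any query string or fragment on the original page is dropped from the resulting URL.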

Developer Tooling

Impala developers use the following tooling to work on the code base:


Starting Minicluster with SSL

To start the minicluster with SSL, you need an SSL certificate/key pair. It can be self-signed:

Code Block
# Make sure you specify your Common Name as your host's FQDN
openssl req -newkey rsa:2048 -nodes -keyout key.pem -x509 -days 365 -out certificate.pem
 
# After building, you can start your Impala cluster with the same flags as documented in
# http://impala.apache.org/docs/build/html/topics/impala_ssl.html
# Note that we set --catalog_service_host and --state_store_host so that they do not default to localhost.
# SSL will not tolerate a mismatched Common Name.
 
$IMPALA_HOME/bin/start-impala-cluster.py --impalad_args='--backend_client_rpc_timeout_ms=10000 --catalog_service_host=$(hostname -f) --state_store_host=$(hostname -f) --ssl_server_certificate=$IMPALA_HOME/certificate.pem --ssl_private_key=$IMPALA_HOME/key.pem --ssl_client_ca_certificate=$IMPALA_HOME/certificate.pem' --catalogd_args='--catalog_service_host=$(hostname -f) --state_store_host=$(hostname -f) --ssl_server_certificate=$IMPALA_HOME/certificate.pem --ssl_private_key=$IMPALA_HOME/key.pem --ssl_client_ca_certificate=$IMPALA_HOME/certificate.pem' --state_store_args='--catalog_service_host=$(hostname -f) --state_store_host=$(hostname -f) --ssl_server_certificate=$IMPALA_HOME/certificate.pem --ssl_private_key=$IMPALA_HOME/key.pem --ssl_client_ca_certificate=$IMPALA_HOME/certificate.pem'


Running Automated Code Quality Checks Locally

When a patchset is published in Gerrit, automated code checks are run. To run these checks on local code before pushing, follow these steps.

Python

From the Impala home directory, run:

./bin/jenkins/critique-gerrit-review.py --dryrun 

C++

From the Impala home directory, run clang-tidy (note: this runs a full build and therefore takes a few minutes):

./bin/run_clang_tidy.sh

...