...

Code Block
git clone https://github.com/apache/orc.git
cd orc
mkdir build && cd build

# Export CC and CXX to let cmake use Impala's gcc
# IMPALA-9760 changed the toolchain location in Impala 4.0. For earlier versions, use:
#   export CC="${IMPALA_HOME}/toolchain/gcc-${IMPALA_GCC_VERSION}/bin/gcc"
#   export CXX="${IMPALA_HOME}/toolchain/gcc-${IMPALA_GCC_VERSION}/bin/g++"
export CC="${IMPALA_TOOLCHAIN_PACKAGES_HOME}/gcc-${IMPALA_GCC_VERSION}/bin/gcc"
export CXX="${IMPALA_TOOLCHAIN_PACKAGES_HOME}/gcc-${IMPALA_GCC_VERSION}/bin/g++"

# Use Impala's cmake. Don't build the Java library or libhdfspp.
# See https://github.com/cloudera/native-toolchain/blob/master/source/orc/build.sh for the latest command.
${IMPALA_TOOLCHAIN_PACKAGES_HOME}/cmake-${IMPALA_CMAKE_VERSION}/bin/cmake .. -DBUILD_JAVA=OFF -DBUILD_LIBHDFSPP=OFF -DINSTALL_VENDORED_LIBS=OFF -DBUILD_SHARED_LIBS=ON
# Then compile in parallel. $(nproc) is the number of available CPU cores.
CFLAGS="-fPIC -DPIC" make -j $(nproc)

# If the build succeeds, the library is at c++/src/liborc.a
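
The comments above describe two toolchain locations, depending on whether the Impala checkout includes IMPALA-9760. A minimal sketch of choosing between them follows. The function name toolchain_packages_dir is hypothetical and not part of Impala, and the sketch assumes IMPALA_TOOLCHAIN_PACKAGES_HOME is only set in checkouts that include IMPALA-9760.

```shell
# Hypothetical helper (not part of Impala): print the directory that holds
# the toolchain packages (gcc-*, cmake-*, orc-*).
toolchain_packages_dir() {
  if [ -n "${IMPALA_TOOLCHAIN_PACKAGES_HOME:-}" ]; then
    # Impala 4.0 and later (after IMPALA-9760).
    printf '%s\n' "${IMPALA_TOOLCHAIN_PACKAGES_HOME}"
  elif [ -n "${IMPALA_HOME:-}" ]; then
    # Before IMPALA-9760 the packages live directly under toolchain/.
    printf '%s\n' "${IMPALA_HOME}/toolchain"
  else
    echo "Neither IMPALA_TOOLCHAIN_PACKAGES_HOME nor IMPALA_HOME is set" >&2
    return 1
  fi
}

# Example use (commented out; requires Impala's environment to be sourced):
#   export CC="$(toolchain_packages_dir)/gcc-${IMPALA_GCC_VERSION}/bin/gcc"
#   export CXX="$(toolchain_packages_dir)/gcc-${IMPALA_GCC_VERSION}/bin/g++"
```

With a helper like this, the same commands can be reused on checkouts from before and after the toolchain move.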

Link Impala with your customized ORC library

Manually replace the ORC library in Impala's toolchain directory with your customized one, then recompile Impala. In the commands below, ${ORC_HOME} is the directory where you cloned the ORC repo.

Code Block
# Before IMPALA-9760, the location is $IMPALA_HOME/toolchain instead.
cd $IMPALA_TOOLCHAIN_PACKAGES_HOME

# Back up the existing library
cp -r orc-${IMPALA_ORC_VERSION} orc-${IMPALA_ORC_VERSION}-bak
cd orc-${IMPALA_ORC_VERSION}

# Replace the library
cp ${ORC_HOME}/build/c++/src/liborc.a lib/liborc.a
# Replace the header files
rm include/orc/*
cp ${ORC_HOME}/build/c++/include/orc/orc-config.hh include/orc/
cp ${ORC_HOME}/c++/include/orc/*.hh include/orc/
# ORC-751 adds another header subdir 'sargs'. Copy it as well.
cp -r ${ORC_HOME}/c++/include/orc/sargs include/orc/

# Recompile Impala
cd $IMPALA_HOME
make -j $(nproc) impalad
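
Because the steps above keep a backup copy in orc-${IMPALA_ORC_VERSION}-bak, reverting to the original library is a matter of copying it back. A minimal sketch, assuming the backup was made as shown above; restore_orc_backup is a hypothetical helper name, not an Impala script.

```shell
# Hypothetical helper: restore orc-<version> from the orc-<version>-bak copy.
# Arguments: <toolchain packages dir> <ORC version>
restore_orc_backup() {
  packages_dir=${1:?usage: restore_orc_backup PACKAGES_DIR ORC_VERSION}
  orc_version=${2:?usage: restore_orc_backup PACKAGES_DIR ORC_VERSION}
  orc_dir="${packages_dir}/orc-${orc_version}"

  # Refuse to delete anything unless the backup is really there.
  if [ ! -d "${orc_dir}-bak" ]; then
    echo "No backup found at ${orc_dir}-bak" >&2
    return 1
  fi

  # Drop the customized copy and put the backup in its place.
  # The backup itself is kept so the swap can be repeated.
  rm -rf "${orc_dir}"
  cp -r "${orc_dir}-bak" "${orc_dir}"
}

# Example use (commented out; requires Impala's environment to be sourced):
#   restore_orc_backup "${IMPALA_TOOLCHAIN_PACKAGES_HOME}" "${IMPALA_ORC_VERSION}"
```

After restoring, Impala needs to be recompiled again for the original library to take effect.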

Troubleshooting

1. version GLIBCXX not found

Code Block
CMake Error at /root/orc/build/protobuf_ep-prefix/src/protobuf_ep-stamp/protobuf_ep-build-RELWITHDEBINFO.cmake:49 (message):
  Command failed: 2

   'make'

  See also

    /root/orc/build/protobuf_ep-prefix/src/protobuf_ep-stamp/protobuf_ep-build-*.log


make[2]: *** [protobuf_ep-prefix/src/protobuf_ep-stamp/protobuf_ep-build] Error 1
make[1]: *** [CMakeFiles/protobuf_ep.dir/all] Error 2
make: *** [all] Error 2

$ cat protobuf_ep-prefix/src/protobuf_ep-stamp/protobuf_ep-build-err.log
./js_embed: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by ./js_embed)
make[5]: *** [/root/orc/build/protobuf_ep-prefix/src/protobuf_ep/src/google/protobuf/compiler/js/well_known_types_embed.cc] Error 1
make[5]: *** Deleting file `/root/orc/build/protobuf_ep-prefix/src/protobuf_ep/src/google/protobuf/compiler/js/well_known_types_embed.cc'
make[4]: *** [CMakeFiles/libprotoc.dir/all] Error 2
make[3]: *** [all] Error 2

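The build is picking up the system-provided libstdc++.so, which does not provide the required GLIBCXX version. It should use the one shipped in the Impala toolchain instead. Fix this by adding the toolchain library path to LD_LIBRARY_PATH, e.g.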

Code Block
export LD_LIBRARY_PATH=/root/Impala/toolchain/toolchain-packages-gcc7.5.0/kudu-f486f0813a/debug/lib
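
To see which libstdc++.so actually provides the version named in the error, one option is to search the library file for the version string. A minimal sketch follows. provides_glibcxx is a hypothetical helper name, and it relies only on tr and grep rather than on any Impala tooling.

```shell
# Hypothetical helper: succeed if the given library file contains the exact
# version string GLIBCXX_<version>.
# Arguments: <path to libstdc++.so> <version, e.g. 3.4.21>
provides_glibcxx() {
  lib=$1
  version=$2
  # Turn every non-printable byte into a newline so each embedded string
  # ends up on its own line, then look for an exact whole-line match.
  LC_ALL=C tr -c '[:print:]' '\n' < "${lib}" \
    | grep -x -F "GLIBCXX_${version}" > /dev/null
}

# Example use (commented out; paths depend on your machine):
#   provides_glibcxx /lib64/libstdc++.so.6 3.4.21 || echo "system libstdc++ lacks it"
```

The whole-line match keeps a query for one version from matching a longer version string that merely starts with the same characters.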