Release: none (trunk) as of 2012-05-18

Flume 1.x supports secure communication with Hadoop using Kerberos.

In FLUME-1196, support was added for secure impersonation of Hadoop users. The implementation is similar to how Oozie handles secure user impersonation.

Setting up secure impersonation from Flume to Hadoop takes a few steps. The steps below assume you are using Kerberos. Impersonation also works on clusters that are not secured with Kerberos; in that case, omit the Kerberos-specific parts.

  1. Hadoop must be configured to allow impersonation.
  2. Set up a Kerberos keytab for the Kerberos principal and the host from which Flume connects to HDFS. This user must match the Hadoop configuration in Step 1 above.
    • Instructions for configuring Hadoop security, including how to create a keytab file, can be found online.
  3. Configure the HDFS sink with the following configuration options:
    • hdfs.kerberosPrincipal - fully-qualified principal. Note: _HOST is replaced by the hostname of the local machine, but only when it appears between the / and @ characters.
    • hdfs.kerberosKeytab - location on the local machine of the keytab containing the user and host keys for the above principal
    • hdfs.proxyUser - "proxy" user to impersonate

Example snippet (the majority of the HDFS sink configuration options have been omitted here):

agent.sinks.sink-1.type = hdfs
agent.sinks.sink-1.hdfs.kerberosPrincipal = flume/_HOST@EXAMPLE-REALM.COM
agent.sinks.sink-1.hdfs.kerberosKeytab = /home/mpercy/flume.keytab
agent.sinks.sink-1.hdfs.proxyUser = will

In the above example, the flume user impersonates the user will. This is allowed only if the KDC authenticates the principal and the NameNode authorizes the provided principal to impersonate the specified proxy user.
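
The NameNode-side authorization from Step 1 is typically expressed through Hadoop's proxy-user properties in core-site.xml. The fragment below is a minimal sketch that matches the example above, not a definitive configuration: it assumes the short name of the Flume principal is flume, and the host name flume-agent.example.com and the group name flume-users are hypothetical placeholders for your own values.

```xml
<!-- core-site.xml on the Hadoop cluster (sketch; host and group values are hypothetical) -->
<configuration>
  <!-- Hosts from which the flume user may impersonate other users -->
  <property>
    <name>hadoop.proxyuser.flume.hosts</name>
    <value>flume-agent.example.com</value>
  </property>
  <!-- Groups whose members the flume user may impersonate;
       the proxy user "will" would need to belong to one of them -->
  <property>
    <name>hadoop.proxyuser.flume.groups</name>
    <value>flume-users</value>
  </property>
</configuration>
```

Listing specific hosts and groups, rather than a wildcard, limits which users the flume principal can impersonate and from where.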