  • The directory can be a full URI. If the scheme or authority is not specified, Hive uses the scheme and authority from the Hadoop configuration variable fs.default.name, which specifies the Namenode URI.
  • If the LOCAL keyword is used, Hive writes the data to a directory on the local file system.
  • Data written to the filesystem is serialized as text, with columns separated by ^A and rows separated by newlines. Any column that is not of a primitive type is serialized to JSON format.
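
The output format described above can be read back with a few lines of code. The following is a minimal Python sketch, not part of Hive: the function name parse_row, the complex_columns parameter, and the sample row are all hypothetical, and the caller is assumed to know which column positions hold non-primitive values.

```python
import json

FIELD_SEP = "\x01"  # Ctrl-A (^A), the column separator described above


def parse_row(line, complex_columns=()):
    """Split one output line into columns.

    complex_columns lists the (hypothetical, caller-supplied) indexes of
    columns that hold non-primitive values, which are JSON-decoded.
    All other columns are returned as plain strings.
    """
    fields = line.rstrip("\n").split(FIELD_SEP)
    for i in complex_columns:
        fields[i] = json.loads(fields[i])
    return fields


# Made-up sample row: two primitive columns and one JSON-serialized column.
sample = '42\x01alice\x01{"city":"Oslo"}\n'
row = parse_row(sample, complex_columns=(2,))
print(row)
```

Primitive columns come back as strings, so any conversion to numeric or date types is left to the caller.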
Notes
  • INSERT OVERWRITE statements to directories, local directories, and tables (or partitions) can all be used together within the same query.
  • INSERT OVERWRITE statements to HDFS filesystem directories are the best way to extract large amounts of data from Hive. Hive can write to HDFS directories in parallel from within a map-reduce job.
  • As the name suggests, the directory is overwritten: if the specified path exists, it is clobbered and replaced with the output.