...
The File component provides access to file systems, allowing files to be processed by any other Camel Components or messages from other components to be saved to disk.
URI format
...
or
...
...
Where directoryName
represents the underlying file directory.
You can append query options to the URI in the following format, ?option=value&option=value&...
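As a rough illustration of the option format, the sketch below assembles an endpoint URI string from a map of options. It assumes the file:directoryName form of the URI; the class FileUriSketch and its build method are hypothetical helpers for this page, not part of Camel, and values are appended as-is without any URL encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not a Camel API): builds a file endpoint URI string.
class FileUriSketch {

    // Appends options as ?option=value&option=value, in insertion order.
    static String build(String directoryName, Map<String, String> options) {
        String base = "file:" + directoryName;
        if (options.isEmpty()) {
            return base;
        }
        StringJoiner query = new StringJoiner("&", "?", "");
        for (Map.Entry<String, String> option : options.entrySet()) {
            query.add(option.getKey() + "=" + option.getValue());
        }
        return base + query;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("fileName", "thefilename");
        options.put("noop", "true");
        System.out.println(build("inbox", options));
    }
}
```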
...
...
Camel supports only endpoints configured with a starting directory, so the directoryName must be a directory. If you want to consume a single file only, you can use the fileName option, e.g., by setting fileName=thefilename. Also, the starting directory must not contain dynamic expressions with ${} placeholders. Again, use the fileName option to specify the dynamic part of the filename.
...
...
Beware that the JDK File IO API is somewhat limited in detecting whether another application is currently writing or copying a file, and the implementation can differ depending on the OS platform. This could lead Camel to think that the file is not locked by another process and start consuming it. Therefore you have to do your own investigation into what suits your environment. To help with this, Camel provides different readLock options and the doneFileName option that you can use. See also the section Consuming files from folders where others drop files directly.
URI Options
Common
...
Name | Default Value | Description |
---|---|---|
|
| Automatically create missing directories in the file's path name. For the file consumer, that means creating the starting directory. For the file producer, it means the directory the files should be written to. |
|
| Write buffer sized in bytes. |
|
| Use Expression such as File Language to dynamically set the filename. For consumers, it's used as a filename filter. For producers, it's used to evaluate the filename to write. If an expression is set, it takes precedence over the For the consumer, you can use it to filter filenames, so you can for instance consume today's file using the File Language syntax: |
|
| Flatten is used to flatten the file name path to strip any leading paths, so it's just the file name. This allows you to consume recursively into sub-directories, but when you, e.g., write the files to another directory, they will be written in a single directory. Setting this to |
|
| Camel 2.9.3: this option is used to specify the encoding of the file. You can use this on the consumer to specify the encoding of the files, which allows Camel to know which charset to use when loading the file content, in case the file content is accessed. Likewise, when writing a file, you can use this option to specify which charset to write the file in. See further below for examples and more important details. |
|
| Camel 2.9: whether to fallback and do a copy and delete file, in case the file could not be renamed directly. This option is not available for the FTP component. |
|
| Camel 2.13.1: Perform rename operations using a copy and delete strategy. This is primarily used in environments where the regular rename operation is unreliable e.g., across different file systems or networks. This option takes precedence over the |
Consumer
...
...
Name | Default Value | Description |
---|---|---|
|
| Milliseconds before polling the file/directory starts. |
|
| Milliseconds before the next poll of the file/directory. |
|
| Controls if fixed delay or fixed rate is used. See ScheduledExecutorService in JDK for details. In Camel 2.7.x or older the default value is From Camel 2.8 onward the default value is |
|
| Camel 2.8: The consumer logs a start/complete log line when it polls. This option allows you to configure the logging level for that. |
|
| If a directory, will look for files in all the sub-directories as well. |
|
| If |
|
| If |
|
| Expression (such as File Language) used to dynamically set the filename when moving it before processing. For example to move in-progress files into the |
|
| Expression (such as File Language) used to dynamically set the filename when moving it after processing. To move files into a |
|
| Expression (such as File Language) used to dynamically set a different target directory when moving files in case of processing (configured via For example, to move files into a Note: When moving the files to the “fail” location Camel will handle the error and will not pick up the file again. |
|
| Is used to include files, if filename matches the regex pattern (matching is case in-sensitive from Camel 2.17 onward). |
|
| Is used to exclude files, if filename matches the regex pattern (matching is case in-sensitive from Camel 2.17 onward). |
|
| Camel 2.10: Ant style filter inclusion, for example |
|
| Camel 2.10: Ant style filter exclusion. If both |
|
| Camel 2.11: Ant style filter which is case sensitive or not. |
|
| Option to use the Idempotent Consumer EIP pattern to let Camel skip already processed files. Will by default use a memory based LRUCache that holds 1000 entries. If |
|
| Camel 2.11: To use a custom idempotent key. By default the absolute path of the file is used. You can use the File Language, for example to use the file name and file size, you can do: |
...
|
| A pluggable repository |
|
| A pluggable in-progress repository |
|
| Pluggable filter as a |
|
| Camel 2.18: Filters the directory based on Simple language. For example to filter on current date, you can use a simple date pattern such as |
|
| Camel 2.18: Filters the file based on Simple language. For example to filter on file size, you can use |
|
| Camel 2.16: To shuffle the list of files (sort in random order). |
|
| Pluggable sorter as a |
|
| Built-in sort using the File Language. Supports nested sorts, so you can have a sort by file name and as a 2nd group sort by modified date. See sorting section below for details. |
|
| Used by consumer, to only poll the files if it has exclusive read-lock on the file e.g., the file is not in-progress or being written. Camel will wait until the file lock is granted. This option provides the built-in strategies:
Warning: most of the read lock strategies are not suitable for use in clustered mode. That is, you cannot have multiple consumers attempting to read the same file in the same directory. In this case, the read locks will not function reliably. The idempotent read lock supports clustering reliably if you use a cluster-aware idempotent repository implementation, such as those from the Hazelcast Component or Infinispan. |
|
| Optional timeout in milliseconds for the Note: for FTP the default |
|
| Camel 2.6: Interval in milliseconds for the read-lock, if supported by the read lock. This interval is used for sleeping between attempts to acquire the read lock. For example when using the |
|
| Camel 2.10.1: This option applied only for |
|
| Camel 2.15: This option applies only to |
|
| Camel 2.12: Logging level used when a read lock could not be acquired. By default a This option is only applicable for the
|
|
| Camel 2.14: Whether to use marker file with the |
|
| Camel 2.16: This option applied only for |
readLockRemoveOnCommit |
| Camel 2.16: This option applied only for |
|
| Camel 2.16: Whether or not read lock with marker files should, upon startup, delete any orphan read lock files that may have been left on the file system if Camel was not properly shut down (such as after a JVM crash). If this option is set to false, then any orphaned lock file will cause Camel not to attempt to pick up that file; this could also be due to another node concurrently reading files from the same shared directory. |
|
| Camel 2.5: Similar to |
|
| Camel 2.6: If provided, Camel will only consume files if a done file exists. This option configures what file name to use. Either you can specify a fixed name. Or you can use dynamic placeholders. The done file is always expected in the same folder as the original file. See using done file and writing done file sections for examples. |
|
| Pluggable read-lock as a |
|
| An integer to define a maximum messages to gather per poll. By default no maximum is set. Can be used to set a limit of e.g. Notice: If this option is in use then the File and FTP components will limit before any sorting. For example if you have 100000 files and use |
|
| Camel 2.9.3: Allows for controlling whether the limit from |
| 0 | Camel 2.8: The minimum depth to start processing when recursively processing a directory. Using This option is supported by FTP consumer from Camel 2.8.2, 2.9 onward. |
|
| Camel 2.8: The maximum depth to traverse when recursively processing a directory. This option is supported by FTP consumer from Camel 2.8.2, 2.9 onward. |
|
| A pluggable |
|
| Camel 2.5: Whether the starting directory must exist. Mind that the |
|
| A pluggable The default implementation will log the caused exception at |
|
| Camel 2.9: If the polling consumer did not poll any files, you can enable this option to send an empty message (no body) instead. |
|
| Camel 2.10: Allows for bridging the consumer to the Camel routing Error Handler, which means that any exceptions occurring while trying to pick up files, or the like, will now be processed as a message and handled by the routing Error Handler. By default the consumer will use the |
|
| Camel 2.10: Allows for configuring a custom/shared thread pool to use for the consumer. By default each consumer has its own single threaded thread pool. This option allows you to share a thread pool among multiple file consumers. |
|
| Camel 2.12: To use a custom scheduler to trigger the consumer to run. See more details at Polling Consumer, for example there is a Quartz2, and Spring based scheduler that supports CRON expressions. |
|
| Camel 2.12: To let the scheduled polling consumer backoff if there has been a number of subsequent idles/errors in a row. The multiplier is then the number of polls that will be skipped before the next actual attempt is happening again. When this option is in use then For more details see: Polling Consumer. |
|
| Camel 2.12: The number of subsequent idle polls that should happen before the |
|
| Camel 2.12: The number of subsequent error polls (failed due to some error) that should happen before the |
| Camel 2.16: To use a custom | |
|
| Camel 2.17: Whether to enable probing of the content type. If enabled then the consumer uses Camel 2.15-2.16.x the default is true. |
extendedAttributes | null | Camel 2.17: To enable gathering extended file attributes through |
Default behavior for file consumer
By default the file is not locked for the duration of the processing.
After the route has completed, files are moved into the .camel subdirectory, so that they appear to be deleted.
The File Consumer will always skip any file whose name starts with a dot, such as ., .camel, .m2 or .groovy.
Only files (not directories) are matched for valid filename, if options such as include or exclude are used.
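The skipping and matching rules above can be sketched with plain JDK regular expressions. This is only an illustration of the described behavior, not Camel's implementation: the class FileNameFilterSketch is hypothetical, the order in which exclude and include are checked is an assumption, and the case-insensitive matching follows what the option table states for Camel 2.17 onward.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the filename rules; not Camel's own code.
class FileNameFilterSketch {
    private final Pattern include; // null means "no include filter"
    private final Pattern exclude; // null means "no exclude filter"

    FileNameFilterSketch(String includeRegex, String excludeRegex) {
        this.include = compileOrNull(includeRegex);
        this.exclude = compileOrNull(excludeRegex);
    }

    private static Pattern compileOrNull(String regex) {
        // Case-insensitive matching, as described for Camel 2.17 onward.
        return regex == null ? null : Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
    }

    // Decides whether a plain file name (no directories) would be accepted.
    boolean accepts(String fileName) {
        if (fileName.startsWith(".")) {
            return false; // names starting with a dot are always skipped
        }
        if (exclude != null && exclude.matcher(fileName).matches()) {
            return false; // assumption: exclude is checked before include
        }
        return include == null || include.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        FileNameFilterSketch filter = new FileNameFilterSketch(".*\\.txt", "temp.*");
        for (String name : new String[] {"REPORT.TXT", ".camel", "temp1.txt", "data.csv"}) {
            System.out.println(name + " accepted: " + filter.accepts(name));
        }
    }
}
```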
Producer
...
...
Name | Default Value | Description |
---|---|---|
|
| What to do if a file already exists with the same name. The following values can be specified:
|
|
| This option is used to write the file using a temporary name and then, after the write is complete, rename it to the real name. Can be used to identify files being written to and also avoid consumers (not using exclusive read locks) reading in progress files. Is often used by FTP when uploading big files. |
|
| Camel 2.1: The same as |
|
| Camel 2.10.1: Expression (such as File Language) used to compute file name to use when This option only supports the following File Language tokens:
Note: the |
|
| Camel 2.2: Will keep the last modified timestamp from the source file (if any). Will use the Note: This option only applies to the file producer. It cannot be used with any of the FTP producers. |
|
| Camel 2.3: Whether or not to eagerly delete any existing target file. This option only applies when you use From Camel 2.10.1 onward this option is also used to control whether to delete any existing files when |
|
| Camel 2.6: If provided, then Camel will write a second file (called done file) when the original file has been written. The done file will be empty. This option configures what file name to use. You can either specify a fixed name, or you can use dynamic placeholders. The done file will always be written in the same folder as the original file. See writing done file section for examples. |
|
| Camel 2.10.1: Used to specify if a null body is allowed during file writing. If set to true then an empty file will be created, when set to false, and attempting to send a null body to the file component, a If |
|
| Camel 2.10.5/2.11: Whether to force syncing writes to the file system. You can turn this off if you do not want this level of guarantee, for example if writing to logs / audit logs etc; this would yield better performance. |
|
| Camel 2.15.0: Specify the file permissions which is sent by the producer, the chmod value must be between |
|
| Camel 2.17.0: Specify the directory permissions used when the producer creates missing directories, the chmod value must be between |
Default behavior for file producer
...
Let's illustrate this with an example:
...
When a file is dropped in the inbox
folder, the file consumer notices this and creates a new FileExchange
that is routed to the handleOrder
bean. The bean then processes the File
object. At this point in time the file is still located in the inbox
folder. After the bean completes, and thus the route is completed, the file consumer will perform the move operation and move the file to the .done
sub-folder.
The move and the preMove options are considered to be directory names. However, if you use an expression such as File Language or Simple, then the result of the expression evaluation is the file name to be used, e.g., if you set
...
...
then that uses the File Language to return the file name to be used, which can be either relative or absolute. If relative, the directory is created as a sub-folder within the folder where the file was consumed.
...
If you want to delete the file after processing, the route should be:
...
We have introduced a pre move operation to move files before they are processed. This allows you to mark which files have been scanned as they are moved to this sub folder before being processed.
...
...
You can combine the pre move and the regular move:
...
...
So in this situation, the file is in the inprogress
folder when being processed and after it's processed, it's moved to the .done
folder.
...
So if we want to move the file into a backup folder with today's date as the pattern, we can do:
...
...
About moveFailed
The moveFailed option allows you to move files that could not be processed successfully to another location, such as an error folder of your choice. For example, to move the files into an error folder with a timestamp, you can use moveFailed=/error/${file:name.noext}-${date:now:yyyyMMddHHmmssSSS}.${file:ext}.
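To make the resulting name concrete, the following sketch computes by hand what such an expression would evaluate to for a given file name and timestamp. It does not use the File Language; the class MoveFailedNameSketch is hypothetical, and splitting the name at the last dot is an assumption about how name.noext and ext behave.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration (not a Camel API) of the moveFailed target name.
class MoveFailedNameSketch {
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmssSSS");

    // Mirrors /error/<name.noext>-<timestamp>.<ext> for the given file name.
    static String failedName(String fileName, LocalDateTime now) {
        // Assumption: the extension is whatever follows the last dot.
        int dot = fileName.lastIndexOf('.');
        String noext = dot > 0 ? fileName.substring(0, dot) : fileName;
        String ext = dot > 0 ? fileName.substring(dot + 1) : "";
        return "/error/" + noext + "-" + STAMP.format(now) + "." + ext;
    }

    public static void main(String[] args) {
        System.out.println(failedName("order.xml", LocalDateTime.now()));
    }
}
```

Because the timestamp includes milliseconds, repeated failures of a file with the same name would normally end up under distinct names in the error folder.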
...
File producer only
...
Header | Description |
---|---|
| Specifies the name of the file to write (relative to the endpoint directory). This name can be a |
| The absolute file path (path + name) for the output file that was written. This header is set by Camel and its purpose is providing end-users with the name of the file that was written. |
| Camel 2.11: Is used for overruling |
File consumer only
...
...
Header | Description |
---|---|
| Name of the consumed file as a relative file path with offset from the starting directory configured on the endpoint. |
| Only the file name (the name with no leading paths). |
| A |
| The absolute path to the file. For relative files this path holds the relative path instead. |
| The file path. For relative files this is the starting directory + the relative filename. For absolute files this is the absolute path. |
| The relative path. |
| The parent path. |
| A |
| A |
Batch Consumer
This component implements the Batch Consumer.
...
As the file consumer implements the BatchConsumer
it supports batching the files it polls. By batching we mean that Camel will add the following additional properties to the Exchange, so you know the number of files polled, the current index, and whether the batch is already completed.
...
...
Property | Description |
---|---|
| The total number of files that was polled in this batch. |
| The current index of the batch. Starts from 0. |
| A |
This allows you, for instance, to know how many files exist in this batch and to let the Aggregator2 aggregate this number of files.
...
Available as of Camel 2.9.3
The charset
option allows for configuring an encoding of the files on both the consumer and producer endpoints. For example if you read utf-8 files, and want to convert the files to iso-8859-1, you can do:
...
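Independently of Camel, the conversion described here amounts to decoding the bytes with one charset and encoding them with another. The sketch below shows that step with plain JDK classes; the class CharsetTranscodeSketch is hypothetical and only illustrates what specifying a charset on each side implies.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration (not a Camel API) of a utf-8 to iso-8859-1 conversion.
class CharsetTranscodeSketch {

    // Decodes the input as UTF-8 and re-encodes the text as ISO-8859-1.
    static byte[] utf8ToLatin1(byte[] utf8Bytes) {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] input = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] output = utf8ToLatin1(input);
        System.out.println(input.length + " bytes in, " + output.length + " bytes out");
    }
}
```

Characters that have no representation in the target charset are replaced during encoding, so a conversion like this is only lossless when the content fits the target charset.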
You can also use convertBodyTo in the route. In the example below the input files are still in utf-8 format, but we want to convert the file content to a byte array in iso-8859-1 format and then let a bean process the data, before writing the content to the outbox folder using the current charset.
...
...
If you omit the charset on the consumer endpoint, then Camel does not know the charset of the file and uses "UTF-8" by default. However, you can configure a JVM system property with the key org.apache.camel.default.charset to override this and use a different default encoding.
In the example below this could be a problem if the files are not in UTF-8 encoding, which would be the default encoding for reading the files.
In this example, when writing the files, the content has already been converted to a byte array and is thus written directly as is (without any further encoding).
...
You can also override and control the encoding dynamically when writing files, by setting a property on the exchange with the key Exchange.CHARSET_NAME. For example, in the route below we set the property with a value from a message header.
...
We suggest keeping things simple: if you pick up files with the same encoding and want to write the files in a specific encoding, then prefer the charset option on the endpoints.
...
If you have issues, you can enable DEBUG logging on org.apache.camel.component.file, and Camel logs when it reads or writes a file using a specific charset.
For example the route below will log the following:
...
...
And the logs:
...
...
Common gotchas with folder and filenames
...
The sample code below produces files using the message ID as the filename:
...
To use report.txt
as the filename you have to do:
...
...
... the same as above, but with CamelFileName
:
...
...
And a syntax where we set the filename on the endpoint with the fileName
URI option.
...
...
Filename Expression
Filename can be set either using the expression option or as a string-based File Language expression in the CamelFileName
header. See the File Language for syntax and samples.
...
If you want only to consume files when a done file exists, then you can use the doneFileName
option on the endpoint.
...
...
Will only consume files from the bar folder, if a done file exists in the same directory as the target files. Camel will automatically delete the done file when it's done consuming the files. From Camel 2.9.3 onward Camel will not automatically delete the done file if noop=true
is configured.
However it is more common to have one done file per target file. This means there is a 1:1 correlation. To do this you must use dynamic placeholders in the doneFileName option. Currently Camel supports the following two dynamic tokens: file:name and file:name.noext, which must be enclosed in ${}. The consumer only supports the static part of the done file name as either prefix or suffix (not both).
...
In this example a file is polled only if a done file exists whose name is the file name followed by .done. For example
...
You can also use a prefix for the done file, such as:
...
...
hello.txt - is the file to be consumed
ready-hello.txt - is the associated done file
...
After you have written a file you may want to write an additional done file as a kind of marker, to indicate to others that the file is finished and has been written. To do that you can use the doneFileName
option on the file producer endpoint.
...
...
Will simply create a file named done
in the same directory as the target file.
However it is more common to have one done file per target file. This means there is a 1:1 correlation. To do this you must use dynamic placeholders in the doneFileName option. Currently Camel supports the following two dynamic tokens: file:name and file:name.noext, which must be enclosed in ${}.
...
...
Will for example create a file named done-foo.txt
if the target file was foo.txt
in the same directory as the target file.
...
Will for example create a file named foo.txt.done
if the target file was foo.txt
in the same directory as the target file.
...
...
Will for example create a file named foo.done
if the target file was foo.txt
in the same directory as the target file.
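The three naming results above can be reproduced with a small token substitution. The sketch below is not Camel's implementation; the class DoneFileNameSketch is hypothetical, the patterns passed to it are inferred from the resulting names listed above, and stripping the extension at the last dot is an assumption.

```java
// Hypothetical illustration (not a Camel API) of resolving a done file name.
class DoneFileNameSketch {

    // Replaces the two supported tokens with values derived from the target file name.
    static String resolve(String pattern, String fileName) {
        // Assumption: name.noext drops everything from the last dot onward.
        int dot = fileName.lastIndexOf('.');
        String noext = dot > 0 ? fileName.substring(0, dot) : fileName;
        return pattern
                .replace("${file:name.noext}", noext)
                .replace("${file:name}", fileName);
    }

    public static void main(String[] args) {
        String target = "foo.txt";
        System.out.println(resolve("done", target));
        System.out.println(resolve("done-${file:name}", target));
        System.out.println(resolve("${file:name}.done", target));
        System.out.println(resolve("${file:name.noext}.done", target));
    }
}
```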
...
Read from a directory and write to another directory
...
...
Read from a directory and write to another directory using an overruling dynamic name
...
...
Listen on a directory and create a message for each file dropped there. Copy the contents to the outputdir
and delete the file in the inputdir
.
Reading recursively from a directory and writing to another
...
...
Listen on a directory and create a message for each file dropped there. Copy the contents to the outputdir
and delete the file in the inputdir
. Will scan recursively into sub-directories. Will lay out the files in the same directory structure in the outputdir
as the inputdir
, including any sub-directories.
...
Will result in the following output layout:
...
Using flatten
If you want to store the files in the outputdir
directory in the same directory, disregarding the source directory layout e.g., to flatten out the path, you just add the flatten=true
option on the file producer side:
...
...
Will result in the following output layout:
...
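What flattening does to a relative file name can be sketched with the JDK path API. The class FlattenSketch below is hypothetical and only illustrates the effect of stripping leading paths; it is not how Camel implements the option.

```java
import java.nio.file.Paths;

// Hypothetical illustration (not a Camel API) of stripping leading paths.
class FlattenSketch {

    // Keeps only the last path element, i.e. the plain file name.
    static String flatten(String relativeName) {
        return Paths.get(relativeName).getFileName().toString();
    }

    public static void main(String[] args) {
        System.out.println(flatten("sub/dir/report.txt"));
        System.out.println(flatten("other/report.txt"));
    }
}
```

Both inputs in the usage example map to the same target name, which is worth keeping in mind when files in different sub-directories share a name.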
Reading from a directory and the default move operation
Camel will by default move any processed file into a .camel
subdirectory in the directory the file was consumed from.
...
Affects the layout as follows:
before
...
...
after
...
Read from a directory and process the message in java
...
The body will be a File
object that points to the file that was just dropped into the inputdir
directory.
...
Camel is of course also able to write files, i.e. produce files. In the sample below we receive some reports on the SEDA queue that we process before they are being written to a directory.
Write to subdirectory using Exchange.FILE_NAME
Using a single route, it is possible to write a file to any number of subdirectories. If you have a route setup as such:
...
...
You can have myBean
set the header Exchange.FILE_NAME
to values such as:
...
...
This allows you to have a single route to write files to multiple destinations.
...
Sometimes you need to temporarily write the files to some directory relative to the destination directory. Such a situation usually arises when an external process with limited filtering capabilities is reading from the directory you are writing to. In the example below, files will be written to the /var/myapp/filesInProgress directory and, after the data transfer is done, they will be atomically moved to the /var/myapp/finalDirectory directory.
...
...
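The underlying idea, writing the complete content in one directory and only then moving the file into the directory that others read from, can be sketched with plain JDK file operations. The class TempThenMoveSketch is hypothetical and is not how the file producer is implemented; whether the move is truly atomic depends on the file system, and both directories are assumed to be on the same one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration (not a Camel API) of write-then-move.
class TempThenMoveSketch {

    // Writes the content in inProgressDir, then moves the finished file to finalDir.
    static Path writeThenMove(Path inProgressDir, Path finalDir, String fileName, String content) {
        try {
            Files.createDirectories(inProgressDir);
            Files.createDirectories(finalDir);
            Path inProgress = inProgressDir.resolve(fileName);
            Files.write(inProgress, content.getBytes(StandardCharsets.UTF_8));
            // Readers of finalDir never observe a partially written file.
            return Files.move(inProgress, finalDir.resolve(fileName),
                    StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("myapp");
        Path written = writeThenMove(root.resolve("filesInProgress"),
                root.resolve("finalDirectory"), "report.txt", "hello");
        System.out.println("written to " + written);
    }
}
```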
Using Expressions for Filenames
In this sample we want to move consumed files to a backup folder using today's date as a sub-folder name:
...
See File Language for more samples.
...
Camel supports Idempotent Consumer directly within the component so it will skip already processed files. This feature can be enabled by setting the idempotent=true
option.
...
...
Camel uses the absolute file name as the idempotent key, to detect duplicate files. From Camel 2.11 onward you can customize this key by using an expression in the idempotentKey
option. For example to use both the name and the file size as the key
...
By default Camel uses an in-memory store for keeping track of consumed files; it uses a least recently used cache holding up to 1000 entries. You can plug in your own implementation of this store by using the idempotentRepository option, using the # sign in the value to indicate that it refers to a bean in the Registry with the specified id.
...
...
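The behavior of a bounded, least recently used store of keys can be approximated with a LinkedHashMap in access order. This is a sketch of the idea only, not Camel's repository implementation: the class LruIdempotentSketch and its add method are hypothetical names, and the capacity is passed in so the eviction can be observed with small numbers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration (not a Camel API) of an LRU-bounded key store.
class LruIdempotentSketch {
    private final Map<String, Boolean> seen;

    LruIdempotentSketch(final int maxEntries) {
        // accessOrder=true: iteration order runs from least to most recently used.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries; // evict once the bound is exceeded
            }
        };
    }

    // Returns true if the key is new (process the file), false if it is a duplicate.
    synchronized boolean add(String key) {
        if (seen.get(key) != null) {
            return false; // get() also marks the key as recently used
        }
        seen.put(key, Boolean.TRUE);
        return true;
    }

    public static void main(String[] args) {
        LruIdempotentSketch store = new LruIdempotentSketch(1000);
        System.out.println(store.add("/inbox/report.txt"));
        System.out.println(store.add("/inbox/report.txt"));
    }
}
```

A consequence of the bound is that a key evicted from the cache is treated as new again, which is one reason to plug in a persistent repository when many files are involved.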
Camel will log at DEBUG
level if it skips a file because it has been consumed before:
...
...
Using a file based idempotent repository
...
We configure our repository using Spring XML, creating our file idempotent repository, and define our file consumer to use our repository with the idempotentRepository option, using the # sign to indicate a Registry lookup:
Using a JPA based idempotent repository
...
First we need a persistence-unit in META-INF/persistence.xml where we need to use the class org.apache.camel.processor.idempotent.jpa.MessageProcessed as model. The file consumer endpoint then refers to the repository through the idempotentRepository option, using the # syntax:
...
Filter using org.apache.camel.component.file.GenericFileFilter
...
In the sample we have built our own filter that skips files starting with skip in the filename. The route then references the filter (using the # notation) that we have defined in the Spring XML file:
...
...
Filtering using ANT path matcher
...
There are now antInclude
and antExclude
options to make it easy to specify ANT style include/exclude without having to define the filter. See the URI options above for more information.
...
The sample below demonstrates how to use it:
Sorting using Comparator
Camel supports pluggable sorting strategies. This strategy is to use the built-in java.util.Comparator in Java. You can then configure the endpoint with such a comparator and have Camel sort the files before they are processed.
In the sample we have built our own comparator that just sorts by file name. The route then references the sorter (mySorter) that we have defined in the Spring XML file:
...
In the Spring DSL route above notice that we can refer to beans in the Registry by prefixing the id with #
. So writing sorter=#mySorter
, will instruct Camel to go look in the Registry for a bean with the ID, mySorter
.
...
Camel supports pluggable sorting strategies. This strategy is to use the File Language to configure the sorting. The sortBy option is configured as follows:
...
...
Where each group is separated with a semicolon. In simple situations you just use one group, so a simple example could be:
...
...
This will sort by file name, you can reverse the order by prefixing reverse:
to the group, so the sorting is now Z..A:
...
...
As we have the full power of File Language we can use some of the other parameters, so if we want to sort by file size we do:
...
...
You can configure to ignore the case, using ignoreCase:
for string comparison, so if you want to use file name sorting but to ignore the case then we do:
...
You can combine ignore case and reverse, however reverse must be specified first:
...
In the sample below we want to sort by last modified file, so we do:
...
And then we want to group by name as a 2nd option, so files with the same modification time are sorted by name:
...
...
Now there is an issue here, can you spot it? Well the modified timestamp of the file is too fine as it will be in milliseconds, but what if we want to sort by date only and then subgroup by name?
Well as we have the true power of File Language we can use its date command that supports patterns. So this can be solved as:
...
...
Yeah, that is pretty powerful, oh by the way you can also use reverse per group, so we could reverse the file names:
...
...
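The grouping, reverse and ignore-case behavior described in this section corresponds closely to chaining comparators in plain Java. The sketch below is an analogy rather than the sortBy implementation: the class SortBySketch and its FileInfo type are hypothetical, and the modified date is held as a yyyyMMdd string to mirror sorting by date only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration (not a Camel API) of multi-group sorting.
class SortBySketch {

    // Minimal stand-in for a polled file: a name and a date in yyyyMMdd form.
    static final class FileInfo {
        final String name;
        final String modifiedDate;

        FileInfo(String name, String modifiedDate) {
            this.name = name;
            this.modifiedDate = modifiedDate;
        }
    }

    // First group: date. Second group: name.
    static final Comparator<FileInfo> BY_DATE_THEN_NAME =
            Comparator.comparing((FileInfo f) -> f.modifiedDate)
                    .thenComparing(f -> f.name);

    // Same first group, but the second group is reversed on its own.
    static final Comparator<FileInfo> BY_DATE_THEN_REVERSE_NAME =
            Comparator.comparing((FileInfo f) -> f.modifiedDate)
                    .thenComparing(Comparator.comparing((FileInfo f) -> f.name).reversed());

    // Single group: reversed, case-insensitive name ordering.
    static final Comparator<FileInfo> BY_NAME_IGNORE_CASE_REVERSED =
            Comparator.comparing((FileInfo f) -> f.name, String.CASE_INSENSITIVE_ORDER)
                    .reversed();

    // Returns the file names in the order given by the comparator.
    static List<String> sortedNames(List<FileInfo> files, Comparator<FileInfo> order) {
        List<FileInfo> copy = new ArrayList<>(files);
        copy.sort(order);
        List<String> names = new ArrayList<>();
        for (FileInfo f : copy) {
            names.add(f.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<FileInfo> files = new ArrayList<>();
        files.add(new FileInfo("b.txt", "20240102"));
        files.add(new FileInfo("a.txt", "20240102"));
        files.add(new FileInfo("c.txt", "20240101"));
        System.out.println(sortedNames(files, BY_DATE_THEN_NAME));
        System.out.println(sortedNames(files, BY_DATE_THEN_REVERSE_NAME));
    }
}
```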
Using GenericFileProcessStrategy
...
For example, skipping any directories which start with "skip" in the name can be implemented as follows:
How to use the Camel error handler to deal with exceptions triggered outside the routing engine
...
For the file and ftp components this would be the case. However if you want to bridge the ExceptionHandler
so it uses the Camel Error Handling, then you need to implement a custom ExceptionHandler
that will handle the exception by creating a Camel Exchange and send it to the routing engine; then the error handling of the routing engine can get triggered.
...
...
The new option consumer.bridgeErrorHandler
can be set to true, to make this even easier. See further below for more details.
Here is such an example based upon a unit test.
First we have a custom ExceptionHandler where you can see we deal with the exception by sending it to a Camel Endpoint named direct:file-error:
Then we have a Camel route that uses the Camel routing error handler, which is the onException where we handle any IOException being thrown. We then send the message to the same direct:file-error endpoint, where we handle it by transforming it to a message, which is then sent to a Mock endpoint. This is just for testing purposes. You can handle the exception in any custom way you want, such as using a Bean or sending an email, etc.
Notice how we configure our custom MyExceptionHandler by using the consumer.exceptionHandler option to refer to #myExceptionHandler, which is the id of the bean registered in the Registry. If using Spring XML or OSGi Blueprint, then that would be a <bean id="myExceptionHandler" class="com.foo.MyExceptionHandler"/>:
The source code for this example can be seen here
...
If you want to use the Camel Error Handler to deal with any exception occurring in the file consumer, then you can enable the consumer.bridgeErrorHandler option as shown below:
...
...
When using consumer.bridgeErrorHandler, interceptors and OnCompletions do not apply. The Exchange is processed directly by the Camel Error Handler, which does not allow prior actions such as interceptors or onCompletion to take action.
Debug logging
This component has log level TRACE that can be helpful if you have problems.