File Component
The File component provides access to file systems, allowing files to be processed by any other Camel components and messages from other components to be saved to disk.
URI format
file:fileOrDirectoryName[?options]
or
file://fileOrDirectoryName[?options]
Where fileOrDirectoryName represents the underlying file name. Camel will determine if fileOrDirectoryName is a file or directory.
URI Options
Name | Default Value | Description |
---|---|---|
initialDelay | 1000 | Milliseconds before polling of the file/directory starts. |
delay | 500 | Milliseconds before the next poll of the file/directory. |
useFixedDelay | false | Set to true to use a fixed delay between polls; otherwise a fixed rate is used. See ScheduledExecutorService in the JDK for details. |
directory | true | TODO: |
recursive | false | If a directory, Camel will also look for changes in files in all the subdirectories. |
lock | true | If true, the file is locked for the duration of the processing. |
delete | false | If true, the file will be deleted when it is processed (the default is to move it, see below). |
noop | false | If true, the file is not moved or deleted in any way (see below). This option is good for read-only data, or for ETL type requirements. If noop=true then Camel will set idempotent=true as well, to avoid consuming the same files over and over again. |
moveNamePrefix | .camel/ | The prefix String prepended to the filename when moving it. For example, to move processed files into the done directory, set this value to 'done/'. |
moveNamePostfix | null | The postfix String appended to the filename when moving it. For example, to rename processed files from foo to foo.old, set this value to '.old'. |
append | true | When writing, whether to append to the end of the file or replace it. |
autoCreate | true | If set to true, Camel will create the directory for the file if the file path does not exist. Uses File#mkdirs(). |
bufferSize | 128kb | Write buffer size in bytes. Camel uses a default of 128 * 1024 bytes. |
regexPattern | null | Will only fire an exchange for a file that matches the regex pattern. |
includeNamePrefix | null | Used to include files whose filename starts with the given prefix. Files not matching will be excluded. Case sensitive. |
includeNamePostfix | null | Used to include files whose filename ends with the given postfix. Files not matching will be excluded. Case sensitive. |
excludeNamePrefix | null | Used to exclude files whose filename starts with the given prefix. If both include and exclude are used, exclude takes precedence. Case sensitive. |
excludeNamePostfix | null | Used to exclude files whose filename ends with the given postfix. If both include and exclude are used, exclude takes precedence. Case sensitive. |
expression | null | Use expression to dynamically set the filename. This allows you to very easily set dynamic pattern style filenames. If an expression is set it takes precedence over the |
tempPrefix | null | Option for the file producer only. This option is used to write the file using a temporary name and then, after the write is complete, rename it to the real name. Can be used to identify files being written and also to prevent consumers (not using exclusive read locks) from reading in-progress files. |
idempotent | false | Option to use the Idempotent Consumer EIP pattern to let Camel skip already processed files. By default uses a memory based LRUCache that holds 1000 entries. If noop=true then idempotent will be enabled as well, to avoid consuming the same files over and over again. |
idempotentRepository | null | Pluggable repository as an org.apache.camel.processor.idempotent.MessageIdRepository class. Defaults to MemoryMessageIdRepository if none is specified and idempotent is true. |
filter | null | Pluggable filter as a java.io.FileFilter class. Files are skipped if the filter returns false in its accept method. Camel ships with an ANT path matcher filter in the camel-spring component. See below for a sample. |
sorter | null | Pluggable sorter as a java.util.Comparator<File> class. |
sortBy | null | Built-in sort using the File Language. Supports nested sorts, so you can sort by file name and, as a second group, sort by modified date. See the sorting section below for details. |
preMoveNamePrefix | null | The prefix String prepended to the filename when moving it before processing. For example, to move in-progress files into the inprogress directory, set this value to 'inprogress/'. |
preMoveNamePostfix | null | The postfix String appended to the filename when moving it before processing. For example, to rename in-progress files from foo to foo.inprogress, set this value to '.inprogress'. |
preMoveExpression | null | Use expression to dynamically set the filename when moving it before processing. For example, to move in-progress files into the order directory and use .bak as the extension, set this value to 'order/${file:name.noext}.bak'. |
readLock | fileLock | Used by FileConsumer to only poll files if it has an exclusive read lock on the file (i.e. the file is not in the process of being written). Camel will wait until the file lock is granted. This option provides the built-in strategies: fileLock, rename, markerFile and none. fileLock is for using |
readLockTimeout | 0 | Optional timeout in milliseconds for the read lock, if supported by the read lock. If the read lock could not be granted and the timeout is triggered, Camel will skip the file. At the next poll Camel will try the file again, and this time the read lock may be granted. |
exclusiveReadLockStrategy | null | Pluggable read lock as a |
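The initialDelay, delay and useFixedDelay options correspond closely to the two scheduling methods of the JDK's ScheduledExecutorService that the table refers to. The following is a minimal sketch of that mapping, not Camel's actual implementation; the class PollScheduler and its method are hypothetical names used only for illustration.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Camel: shows how the three polling
// options could be handed to the JDK scheduler.
class PollScheduler {

    static ScheduledFuture<?> schedule(ScheduledExecutorService executor, Runnable poll,
                                       long initialDelay, long delay, boolean useFixedDelay) {
        if (useFixedDelay) {
            // Fixed delay: 'delay' ms pass between the end of one poll
            // and the start of the next.
            return executor.scheduleWithFixedDelay(poll, initialDelay, delay, TimeUnit.MILLISECONDS);
        }
        // Fixed rate: polls are started every 'delay' ms, measured from
        // the start of the previous poll.
        return executor.scheduleAtFixedRate(poll, initialDelay, delay, TimeUnit.MILLISECONDS);
    }
}
```

With the defaults from the table this would be called with initialDelay=1000, delay=500 and useFixedDelay=false. The practical difference shows up when a single poll takes longer than the delay: with fixed rate the next poll starts as soon as the previous one finishes, whereas fixed delay always waits the full delay first.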
Default behavior for file consumer
- By default the file is locked for the duration of the processing.
- After the route has completed, files are moved into the .camel subdirectory, so that they appear to be deleted.
- The File Consumer will always skip any file whose name starts with a dot, such as ".", ".camel", ".m2" or ".groovy".
- Only files (not directories) are matched for a valid filename if options such as consumer.regexPattern, excludeNamePrefix or excludeNamePostfix are used. Notice: this only works properly in Camel 1.5.0, due to issue CAMEL-920.
Move and Delete operations
Any move or delete operation is executed after (post command) the routing has completed. So during processing of the Exchange the file is still located in the inbox folder.
Let's illustrate this with an example:
from("file://inbox?moveNamePrefix=done/").to("bean:handleOrder");
When a file is dropped in the inbox folder, the file consumer notices this and creates a new FileExchange that is routed to the handleOrder bean. The bean then processes the File. At this point in time the File is still located in the inbox folder. After the bean completes, and thus the route is completed, the file consumer will perform the move operation and move the file to the done subfolder.
By default Camel will move consumed files to the subfolder .camel relative to where the file was consumed.
We have introduced a pre move operation to move files before they are processed. This allows you to mark which files have been scanned, as they are moved to this subfolder before being processed.
The following options support pre move:
- preMoveNamePrefix
- preMoveNamePostfix
- preMoveExpression
from("file://inbox?preMoveNamePrefix=inprogress/").to("bean:handleOrder");
You can combine the pre move and the regular move:
from("file://inbox?preMoveNamePrefix=inprogress/&moveNamePrefix=../done/").to("bean:handleOrder");
So in this situation the file is in the inprogress folder when being processed, and after it's processed it's moved to the done folder.
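To see why the combined example uses ../done/ rather than done/, it can help to work through the path arithmetic. The sketch below assumes that a prefix and postfix are simply concatenated around the file name and resolved against the file's current directory; that is an assumption made for illustration, and MoveNames is a hypothetical helper, not a Camel class.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Camel: computes where a file would end up
// if prefix + name + postfix is resolved against the file's current directory.
class MoveNames {

    static Path renamed(Path file, String prefix, String postfix) {
        String name = file.getFileName().toString();
        String newName = (prefix == null ? "" : prefix) + name + (postfix == null ? "" : postfix);
        Path parent = file.getParent();
        Path target = (parent == null) ? Paths.get(newName) : parent.resolve(newName);
        // normalize() collapses "inprogress/../done" into "done"
        return target.normalize();
    }
}
```

Under these assumptions, inbox/order.xml with the pre move prefix inprogress/ becomes inbox/inprogress/order.xml. Applying the move prefix ../done/ to that location then yields inbox/done/order.xml, whereas a plain done/ would have produced inbox/inprogress/done/order.xml.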
Message Headers
The following message headers can be used to affect the behavior of the component
Header | Description |
---|---|
CamelFileName | Specifies the output file name (relative to the endpoint directory) to be used for the output message when sending to the endpoint. If this header is not present and no expression is set either, a generated message id is used as the filename instead. |
CamelFileNameProduced | The actual absolute file path (path + name) of the output file that was written. This header is set by Camel; its purpose is to provide end users with the name of the file that was written. |
CamelFileBatchTotal | Total number of files being consumed in this batch. |
CamelFileBatchIndex | Current index out of the total number of files being consumed in this batch. |
Common gotchas with folder and filenames
When Camel is producing files (writing files) there are a few gotchas regarding how to set a filename of your choice. By default Camel will use the message id as the filename, and since the message id is normally a unique generated id you will end up with filenames such as: ID-MACHINENAME-2443-1211718892437-1-0. If such a filename is not desired, then a filename must be provided in the message header "CamelFileName". The constant FileComponent.HEADER_FILE_NAME can also be used.
The sample code below produces files using the message id as the filename:
from("direct:report").to("file:target/reports");
To use report.txt as the filename you have to do:
from("direct:report").setHeader(FileComponent.HEADER_FILE_NAME, constant("report.txt")).to("file:target/reports");
... the same as above, but with "CamelFileName":
from("direct:report").setHeader("CamelFileName", constant("report.txt")).to("file:target/reports");
By default Camel will try to auto-create the folder if it does not exist, and this is a bad combination with the generated message id filename from above. So if you have:
from("direct:report").to("file:target/reports/report.txt");
And you want Camel to store the output in the file report.txt, then you need to tell Camel that it's not a directory. This is done by setting the option directory to false:
from("direct:report").to("file:target/reports/report.txt?directory=false");
Then Camel will store the report in report.txt as expected.
Filename Expression
The filename can be set either using the expression option or as a string based File Language expression in the CamelFileName header. See the File Language for some samples.
Samples
Read from a directory and write to another directory
from("file://inputdir/?delete=true").to("file://outputdir")
Listen on a directory and create a message for each file dropped there. Copy the contents to the outputdir and delete the file in the inputdir.
Read from a directory and process the message in java
from("file://inputdir/").process(new Processor() {
    public void process(Exchange exchange) throws Exception {
        Object body = exchange.getIn().getBody();
        // do some business logic with the input body
    }
});
The body will be a File object pointing to the file that was just dropped into the inputdir directory.
Read files from a directory and send the content to a jms queue
from("file://inputdir/").convertBodyTo(String.class).to("jms:test.queue")
By default the file endpoint sends a FileMessage which contains a File as body. If you send this directly to the jms component, the jms message will only contain the File object but not the content. By converting the File to a String, the message will contain the file contents, which is probably what you want.
The route above using Spring DSL:
<route>
    <from uri="file://inputdir/"/>
    <convertBodyTo type="java.lang.String"/>
    <to uri="jms:test.queue"/>
</route>
Writing to files
Camel is of course also able to write files, i.e. produce files. In the sample below we receive some reports on the SEDA queue that we process before they are written to a directory.
Write to subdirectory using FileComponent.HEADER_FILE_NAME
Using a single route, it is possible to write a file to any number of subdirectories. If you have a route set up as follows:
<route>
    <from uri="bean:myBean"/>
    <to uri="file:/rootDirectory"/>
</route>
You can have myBean set the header FileComponent.HEADER_FILE_NAME to values such as:
FileComponent.HEADER_FILE_NAME = hello.txt => /rootDirectory/hello.txt
FileComponent.HEADER_FILE_NAME = foo/bye.txt => /rootDirectory/foo/bye.txt
This allows you to have a single route to write files to multiple destinations.
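The mapping from header value to target file can be pictured as a plain path resolution against the endpoint directory, with the message id as the fallback described in the Message Headers section. The following sketch is an illustration under that assumption only: TargetFile is a hypothetical helper, it ignores the expression option, and it is not how Camel is actually implemented.

```java
import java.nio.file.Path;

// Hypothetical helper, not part of Camel: resolves the output file from the
// endpoint directory, the file name header (may be null) and the message id.
class TargetFile {

    static Path resolve(Path endpointDir, String fileNameHeader, String messageId) {
        // Without a file name header, the generated message id is the file name.
        String name = (fileNameHeader != null) ? fileNameHeader : messageId;
        // A relative name such as "foo/bye.txt" ends up in a subdirectory.
        return endpointDir.resolve(name);
    }
}
```

For example, resolving "foo/bye.txt" against /rootDirectory gives /rootDirectory/foo/bye.txt, while a missing header gives a file named after the message id directly under /rootDirectory.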
Using expression for filenames
In this sample we want to move consumed files to a backup folder, using today's date as the subfolder name:
from("file://inbox?expression=backup/${date:now:yyyyMMdd}/${file:name}").to("...");
See File Language for more samples.
Avoiding reading the same file more than once (idempotent consumer)
Camel supports the Idempotent Consumer directly within the component, so it will skip already processed files. This feature can be enabled by setting the idempotent=true option.
from("file://inbox?idempotent=true").to("...");
By default Camel uses an in-memory store for keeping track of consumed files: a least recently used (LRU) cache holding up to 1000 entries. You can plug in your own implementation of this store with the idempotentRepository option, using the # sign in the value to indicate that it refers to a bean in the Registry with this id.
<!-- define our store as a plain spring bean -->
<bean id="myStore" class="com.mycompany.MyIdempotentStore"/>
<route>
    <from uri="file://inbox?idempotent=true&amp;idempotentRepository=#myStore"/>
    <to uri="bean:processInbox"/>
</route>
Camel will log at DEBUG level if it skips a file because it has been consumed before:
DEBUG FileConsumer is idempotent and the file has been consumed before. Will skip this file: target\idempotent\report.txt
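The behavior of the default in-memory store, a least recently used cache with a fixed capacity, can be sketched with a LinkedHashMap from the JDK. This is an illustration of the idea only; LruIdempotentStore is a hypothetical class and does not implement Camel's MessageIdRepository interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not Camel's implementation: a bounded store
// that forgets the least recently used key once it is full.
class LruIdempotentStore {

    private final Map<String, Boolean> seen;

    LruIdempotentStore(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns true if the key is new (process the file), false if it was seen before (skip it). */
    synchronized boolean add(String key) {
        if (seen.get(key) != null) {
            return false; // get() also marks the key as recently used
        }
        seen.put(key, Boolean.TRUE);
        return true;
    }
}
```

A consequence of the bounded size is that a file can be consumed again once more than the capacity (1000 by default, according to the text above) of other files have been seen in the meantime.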
Using a File based idempotent repository
In this section we will use the file based idempotent repository org.apache.camel.processor.idempotent.FileIdempotentRepository instead of the in-memory one that is used by default.
This repository uses a first-level cache to avoid reading the file repository. The file repository is only used to store the content of the first-level cache, so the repository can survive server restarts: on startup it loads the content of the file into the first-level cache. The file structure is very simple, as it stores each key on a separate line in the file. By default the file store has a size limit of 1 MB. When the file grows larger, Camel truncates the file store and rebuilds its content by flushing the first-level cache into a fresh, empty file.
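The combination of an in-memory cache and a one-key-per-line file can be sketched in a few lines of plain Java. The class below is a hypothetical illustration of that idea, not the source of FileIdempotentRepository; in particular it leaves out the size limit and the truncate-and-rebuild step.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration, not Camel's implementation.
class FileBackedStore {

    private final Path storeFile;
    private final Set<String> cache = new LinkedHashSet<>();

    FileBackedStore(Path storeFile) {
        this.storeFile = storeFile;
        try {
            // On startup, load previously stored keys into the cache.
            if (Files.exists(storeFile)) {
                cache.addAll(Files.readAllLines(storeFile, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if the key is new, false if it was already stored. */
    synchronized boolean add(String key) {
        if (!cache.add(key)) {
            return false; // answered from the cache, the file is not read
        }
        try {
            // Append the new key as its own line so it survives a restart.
            Files.write(storeFile, Collections.singletonList(key), StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }
}
```

Creating a second FileBackedStore over the same file simulates a restart: keys added through the first instance are reported as duplicates by the second.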
We configure our repository using Spring XML, creating our file idempotent repository and defining our file consumer to use it via the idempotentRepository option, using the # sign to indicate a Registry lookup:
Using a JPA based idempotent repository
In this section we will use the JPA based idempotent repository instead of the in-memory one that is used by default.
First we need a persistence-unit in META-INF/persistence.xml where we use the class org.apache.camel.processor.idempotent.jpa.MessageProcessed as the model.
Then we need to set up a Spring jpaTemplate in the Spring XML file:
And finally we can create our JPA idempotent repository in the spring XML file as well:
Then we just need to refer to the jpaStore bean in the file consumer endpoint, using the idempotentRepository option with the # syntax:
<route>
    <from uri="file://inbox?idempotent=true&amp;idempotentRepository=#jpaStore"/>
    <to uri="bean:processInbox"/>
</route>
Filter using java.io.FileFilter
Camel supports pluggable filtering strategies. This strategy uses the built-in java.io.FileFilter in Java. You can then configure the endpoint with such a filter to skip certain files before they are processed.
In the sample we have built our own filter that skips files starting with skip in the filename:
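A minimal sketch of such a filter, using only java.io.FileFilter from the JDK, could look like the following. The class name MyFileFilter is an assumption chosen for illustration.

```java
import java.io.File;
import java.io.FileFilter;

// Hypothetical example filter: accepts every file except those whose
// name starts with "skip".
class MyFileFilter implements FileFilter {

    @Override
    public boolean accept(File pathname) {
        // Returning false tells the caller to skip this file.
        return !pathname.getName().startsWith("skip");
    }
}
```

Only the file name is inspected, so a file such as inbox/skip-me.txt is rejected while inbox/report.txt is accepted.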
And then we can configure our route using the filter option to reference our filter (using # notation) that we have defined in the Spring XML file:
<!-- define our filter as a plain spring bean -->
<bean id="myFilter" class="com.mycompany.MyFileFilter"/>
<route>
    <from uri="file://inbox?filter=#myFilter"/>
    <to uri="bean:processInbox"/>
</route>
Filtering using ANT path matcher
The ANT path matcher is a java.io.FileFilter that is shipped out-of-the-box in the camel-spring jar. So you need to depend on camel-spring if you are using Maven.
The reason is that we leverage Spring's AntPathMatcher to do the actual matching.
The file paths are matched with the following rules:
- ? matches one character
- * matches zero or more characters
- ** matches zero or more directories in a path
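To get a feel for the three wildcard rules, they can be translated into a regular expression. The sketch below is an illustration only: AntStyleSketch is a hypothetical class, it assumes '/' as the path separator, and it is neither Spring's AntPathMatcher nor a complete reimplementation of it.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the wildcard rules, not Spring's AntPathMatcher.
class AntStyleSketch {

    static Pattern toRegex(String antPattern) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < antPattern.length()) {
            if (antPattern.startsWith("**/", i)) {
                regex.append("(?:.*/)?");   // zero or more directories
                i += 3;
            } else if (antPattern.startsWith("**", i)) {
                regex.append(".*");         // anything, including '/'
                i += 2;
            } else if (antPattern.charAt(i) == '*') {
                regex.append("[^/]*");      // zero or more characters within one path segment
                i++;
            } else if (antPattern.charAt(i) == '?') {
                regex.append("[^/]");       // exactly one character
                i++;
            } else {
                // every other character must match literally
                regex.append(Pattern.quote(String.valueOf(antPattern.charAt(i))));
                i++;
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String antPattern, String path) {
        return toRegex(antPattern).matcher(path).matches();
    }
}
```

With this translation, the pattern **/*.txt matches both report.txt and a/b/report.txt, while *.txt matches report.txt but not a/report.txt, because a single * does not cross a directory boundary.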
The sample below demonstrates how to use it:
Sorting using Comparator
Camel supports pluggable sorting strategies. This strategy uses the built-in java.util.Comparator in Java. You can then configure the endpoint with such a comparator and have Camel sort the files before they are processed.
In the sample we have built our own comparator that just sorts by file name:
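A minimal sketch of such a comparator, using only the JDK, could look like the following. The class name MyFileSorter is taken from the bean definition used in this section; the body is an assumption for illustration.

```java
import java.io.File;
import java.util.Comparator;

// Hypothetical example sorter: orders files by their name (case sensitive),
// ignoring the directory part of the path.
class MyFileSorter implements Comparator<File> {

    @Override
    public int compare(File first, File second) {
        return first.getName().compareTo(second.getName());
    }
}
```

Replacing compareTo with compareToIgnoreCase would give a case-insensitive ordering instead.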
And then we can configure our route using the sorter option to reference our sorter (using # notation) that we have defined in the Spring XML file:
<!-- define our sorter as a plain spring bean -->
<bean id="mySorter" class="com.mycompany.MyFileSorter"/>
<route>
    <from uri="file://inbox?sorter=#mySorter"/>
    <to uri="bean:processInbox"/>
</route>
Sorting using sortBy
Camel supports pluggable sorting strategies. This strategy uses the File Language to configure the sorting. The sortBy option is configured as:
sortBy=group 1;group 2;group 3;...
Where each group is separated by a semicolon. In simple situations you just use one group, so a simple example could be:
sortBy=file:name
This will sort by file name. You can reverse the order by prefixing reverse: to the group, so the sorting is now Z..A:
sortBy=reverse:file:name
As we have the full power of File Language we can use some of the other parameters, so if we want to sort by file size we do:
sortBy=file:size
You can ignore case in string comparisons by using ignoreCase:, so if you want to sort by file name but ignore the case, use:
sortBy=ignoreCase:file:name
You can combine ignore case and reverse, however reverse must be specified first:
sortBy=reverse:ignoreCase:file:name
In the sample below we want to sort by last modified file, so we do:
sortBy=file:modified
And then we want to group by name as a second option, so files with the same modification timestamp are sorted by name:
sortBy=file:modified;file:name
Now there is an issue here, can you spot it? The modified timestamp of the file is too fine-grained, as it is in milliseconds. But what if we want to sort by date only and then sub-group by name?
As we have the full power of the File Language, we can use its date command, which supports patterns. So this can be solved as:
sortBy=date:file:yyyyMMdd;file:name
That is pretty powerful. By the way, you can also use reverse per group, so we could reverse the file names:
sortBy=date:file:yyyyMMdd;reverse:file:name
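The grouping semantics of this last expression can be expressed with chained comparators from the JDK: the first group decides the order, and the second group only breaks ties within it. The sketch below illustrates the ordering and is not how Camel evaluates sortBy; SortBySketch and its nested Entry class are hypothetical, and the day is computed in UTC for simplicity.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Comparator;

// Hypothetical illustration of sortBy=date:file:yyyyMMdd;reverse:file:name
class SortBySketch {

    // Stand-in for a file: just a name and a last modified timestamp in millis.
    static final class Entry {
        final String name;
        final long modified;

        Entry(String name, long modified) {
            this.name = name;
            this.modified = modified;
        }
    }

    // Reduces the timestamp to day precision (assumption: UTC).
    static LocalDate day(Entry entry) {
        return Instant.ofEpochMilli(entry.modified).atZone(ZoneOffset.UTC).toLocalDate();
    }

    static Comparator<Entry> byDayThenReverseName() {
        Comparator<Entry> byDay = Comparator.comparing(SortBySketch::day);   // group 1
        Comparator<Entry> byName = Comparator.comparing((Entry e) -> e.name);
        return byDay.thenComparing(byName.reversed());                        // group 2, reversed
    }
}
```

Two files modified at different times on the same day compare as equal in the first group, so their relative order is decided entirely by the reversed name comparison.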