The following mock file demonstrates every available option.

<?xml version="1.0" encoding="UTF-8" ?>

<mock>
    <!-- This file lists all of the options as documentation.
    Parsing will stop at the IOException thrown below, of course.
    -->

    <!-- action can be "add" or "set" -->
    <metadata action="add" name="author">Nikolai Lobachevsky</metadata>

    <!-- element is the name of the SAX element to write, e.g. p = paragraph;
        if element is not specified, the default is <p> -->

    <write element="p">some content</write>

    <!-- write something to System.out -->
    <print_out>writing to System.out</print_out>

    <!-- write something to System.err -->
    <print_err>writing to System.err</print_err>

    <!-- hang
        millis: how many milliseconds to pause.  The actual hang time will probably
            be a bit longer than the value specified.        
        heavy: whether or not the hang should do something computationally expensive.
            If the value is false, this just does a Thread.sleep(millis).
            This attribute is optional, with default of heavy=false.
        pulse_millis: (required if "heavy" is true) how often to check whether
            the thread was interrupted or the total hang time has exceeded millis
        interruptible: whether or not the parser will check to see if its thread
            has been interrupted; this attribute is optional with default of true
    -->
    <hang millis="100" heavy="true" pulse_millis="10" interruptible="true" />

    <!-- As of Tika 1.27/2.0, we've integrated FakeLoad (https://github.com/msigwart/fakeload),
         which allows much more precise control of resource consumption than "hang" does.

         millis: how many milliseconds to run
         cpu: integer percentage of the CPU to peg
         mb: integer amount of memory to consume, in megabytes -->
    <fakeload millis="100" cpu="10" mb="10"/>
    
    <!-- throw an exception or error; the message is optional -->
    <throw class="java.io.IOException">not another IOException</throw>
    
    <!-- trigger a genuine OutOfMemoryError -->
    <oom/>

    <!-- perform a system exit...what parser would do that?!  We had one once... -->
    <system_exit/>

    <!-- interrupt the thread -->
    <thread_interrupt/>

    <!-- print to stdout -->
    <print_out>some junk</print_out>

    <!-- print to stderr -->
    <print_err>some junk</print_err>

</mock>
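
A mock file does not need to use every option. As a minimal sketch, built only from the elements and attributes documented above, the following file simulates a parser that extracts some content, stalls briefly, and then fails. The particular combination and ordering are illustrative assumptions, not a file taken from the Tika distribution.

```xml
<?xml version="1.0" encoding="UTF-8" ?>

<mock>
    <!-- record a metadata value, then emit one paragraph -->
    <metadata action="add" name="author">Nikolai Lobachevsky</metadata>
    <write element="p">some content</write>

    <!-- light hang: heavy defaults to false, so pulse_millis is not needed -->
    <hang millis="100"/>

    <!-- end the parse with an exception that carries a message -->
    <throw class="java.io.IOException">not another IOException</throw>
</mock>
```

Because the hang omits `heavy`, it should behave as a plain `Thread.sleep(millis)` according to the attribute description above.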
