Distributed Entity Cache

...

Clear (DCC) Mechanism

Why and when use it?

The distributed cache clearing (DCC) mechanism is needed when you have multiple OFBiz servers in a cluster sharing a single database. When a create, update or delete operation goes through one entity engine, that engine clears its own caches. It can also send out a message to the other servers in the pool so that they clear theirs.

This feature runs through the service engine, which operates on top of the entity engine. A distributed cache clear results in service calls to the other OFBiz servers in the cluster. In most cases you will use the Java Message Service (JMS): a message is sent to a JMS server, which then distributes it to the other servers in the cluster.

How to set it?

To keep it simple we will only set the mandatory values. There are other options which are covered by defaults.

Info
RMI deactivated since OFBIZ-6942

Because of the infamous Java serialization vulnerability, the RMI container has been disabled in the default configuration of OFBiz, and hence so has JNDI (which relies on RMI in OFBiz). So if you want to use the DCC mechanism you will first need to uncomment it, as explained in OFBIZ-6942 (ASF JIRA).

(Note: I'm not quite sure this is required. JNDI relies on the RMI registry service provider, but I don't think the RMI loader is required for the DCC, and OFBIZ-6942 is only about disabling the RMI loader. To be checked and updated...)

The Entity Engine

Info

entityengine.xml

This is the easiest part: for a given delegator you only have to set its distributed-cache-clear-enabled attribute to "true" (it is "false" by default). As an example:

Code Block
    <delegator name="default" entity-model-reader="main" entity-group-reader="main" entity-eca-reader="main" distributed-cache-clear-enabled="true">
        <group-map group-name="org.ofbiz" datasource-name="localderby"/>
        <group-map group-name="org.ofbiz.olap" datasource-name="localderbyolap"/>
        <group-map group-name="org.ofbiz.tenant" datasource-name="localderbytenant"/>
    </delegator>


The Service Engine

The JMS definition is located in the framework/service/config/serviceengine.xml file. By default you set a jms-service named "serviceMessenger". There you define a JMS server with its name, a JNDI name and a topic name. To make as few changes as possible, we use "default" for the server name and set the values in the jndi.properties file. I could also have set a server name in jndiservers.xml, but my catch phrase is "the less changes the better". This is the serviceengine.xml setting, the same on each server:

Code Block

<!-- JMS Service Active MQ Topic Configuration (set as default in jndi.properties, the less changes the better) -->
<jms-service name="serviceMessenger" send-mode="all">
    <server jndi-server-name="default"
        jndi-name="topicConnectionFactory"
        topic-queue="OFBTopic"
        type="topic"
        listen="true"/>
</jms-service>

...

I decided to use Apache ActiveMQ as the JMS server and to simply set these properties in the jndi.properties files (commenting out the OOTB default):

Code Block

java.naming.factory.initial=org.apache.activemq.jndi.ActiveMQInitialContextFactory
java.naming.provider.url=tcp://<AMQ-IP-SERVER1>:61616
topic.OFBTopic=OFBTopic
connectionFactoryNames=connectionFactory, queueConnectionFactory, topicConnectionFactory
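
Since the names in serviceengine.xml and jndi.properties have to line up, a small sanity check can catch typos before starting OFBiz. The following is only a sketch under assumptions: the class JndiPropsCheck and its methods are hypothetical helpers (not part of OFBiz or ActiveMQ), and it only checks that the jndi-name appears in connectionFactoryNames, that a topic.<name> entry exists for the topic-queue value, and that a provider URL is set. The broker host in the sample is a placeholder.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration; not part of OFBiz or ActiveMQ.
class JndiPropsCheck {

    // Parses jndi.properties content supplied as plain text.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns a list of human-readable problems; an empty list means the
    // properties are consistent with the given jndi-name and topic-queue.
    static List<String> check(Properties props, String jndiName, String topicName) {
        List<String> problems = new ArrayList<>();
        if (props.getProperty("java.naming.provider.url") == null) {
            problems.add("java.naming.provider.url is not set");
        }
        String names = props.getProperty("connectionFactoryNames", "");
        boolean factoryListed = Arrays.stream(names.split(","))
                .map(String::trim)
                .anyMatch(jndiName::equals);
        if (!factoryListed) {
            problems.add("connectionFactoryNames does not list " + jndiName);
        }
        if (props.getProperty("topic." + topicName) == null) {
            problems.add("no topic." + topicName + " entry");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Sample content; the broker host is an assumed placeholder.
        String text = String.join("\n",
                "java.naming.provider.url=tcp://amq1.example.org:61616",
                "topic.OFBTopic=OFBTopic",
                "connectionFactoryNames=connectionFactory, queueConnectionFactory, topicConnectionFactory");
        // Values taken from the jms-service definition in serviceengine.xml.
        System.out.println(check(parse(text), "topicConnectionFactory", "OFBTopic"));
    }
}
```

The two string arguments mirror the jndi-name and topic-queue attributes of the jms-service shown earlier, so a mismatch between the two files shows up as an entry in the returned list.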

...

At this stage you need to install an ActiveMQ server somewhere. Initially, I decided to install the latest available release of ActiveMQ, 5.5.0, but it turned out that there are some known issues in this release, so I finally took the 5.4.2 release. To test, I installed it on my XP development machine and on the cluster. It can also be embedded in OFBiz, but I decided to simply run it as an external broker. I don't think it's interesting to have it embedded in OFBiz: you just install it, run it and forget about it (it sets /etc/ for you). For testing I used the ActiveMQ recommended default setting. For production you will want to run it as a Unix daemon (or Windows service).

...

You can then monitor ActiveMQ using the Web Console by pointing your browser at http://localhost:8161/admin/ and then opening the Topics page.

Single point of failure

The setting above is sufficient in a staging environment but is a single point of failure in a production environment, so we need to create a cluster of ActiveMQ brokers. Since they should not consume many resources (256MB of memory at most and few CPU cycles), we can put each instance on the same machines as the OFBiz instances.

...

It's fairly simple to set up through JNDI. We only need to replace, in the jndi.properties files,

Code Block

java.naming.provider.url=tcp://<AMQ-IP-SERVER1>:61616

by

Code Block

java.naming.provider.url=failover:(tcp://<AMQ-IP-SERVER1>:61616?soTimeout=60000,tcp://<AMQ-IP-SERVER2>:61616?soTimeout=60000)?randomize=false&backup=true&trackMessages=true

You may add as many AMQ instances as you want to the failover/tcp chain. The soTimeout=60000 parameter prevents too many useless connections from being kept open. On the broker side you need to use transport.soTimeout=60000 in activemq.xml. Currently, the number of connections keeps increasing. I hope to fix that soon... (see note below)

I tried to add &backup=true&trackMessages=true at the end of the failover chain (i.e. after ?randomize=false), but the connections created are held and there seems to be no means to close them. It's weird, so I asked on the ActiveMQ user ML; no answers yet...
See Transport Options for details on these last 2 parameters. There is also a link at the bottom of this page if you ever need to handle dynamic failover settings more smoothly. But it would need more work in OFBiz...
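
When the list of brokers changes often, the failover URL can be generated rather than typed by hand. This is a minimal sketch, assuming a hypothetical FailoverUrl helper (not an OFBiz or ActiveMQ class) and placeholder host names. It only appends randomize=false, leaving out backup and trackMessages given the held-connections problem described above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of OFBiz or ActiveMQ.
class FailoverUrl {

    // Builds a failover URL with one tcp:// entry per broker host,
    // each carrying the same soTimeout (in milliseconds).
    static String build(List<String> hosts, int port, int soTimeoutMs) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one broker host is required");
        }
        String chain = hosts.stream()
                .map(h -> "tcp://" + h + ":" + port + "?soTimeout=" + soTimeoutMs)
                .collect(Collectors.joining(","));
        return "failover:(" + chain + ")?randomize=false";
    }

    public static void main(String[] args) {
        // Host names are assumed placeholders for the two AMQ servers.
        List<String> hosts = Arrays.asList("amq1.example.org", "amq2.example.org");
        System.out.println("java.naming.provider.url=" + build(hosts, 61616, 60000));
    }
}
```

The printed line can be pasted into jndi.properties on each OFBiz server, so every instance uses the same broker order.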

See

Notes

If you get a "Too many open files" error: on Unix-like systems, network connections are backed by file descriptors, so edit /etc/security/limits.conf to increase nofile for the users running ActiveMQ, like:

Code Block

user - nofile 10000
root - nofile 10000

While at it, I also recommend increasing the max heap used by ActiveMQ. For instance, we increased the ActiveMQ max heap from the default 256MB to 512MB and 1GB on the staging and production clusters respectively; it can't hurt...

We also increased the max number of file descriptors for root, which runs ActiveMQ at startup.

Some possible pitfalls

This section can be skipped but might help in case of trouble.
I first installed ActiveMQ 5.5.0 on my development machine on XP. When I ran OFBiz I got this non-blocking error:

Code Block

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

This led to this tip:

*Failed to load class org.slf4j.impl.StaticLoggerBinder*

This error is reported when the org.slf4j.impl.StaticLoggerBinder class could not be loaded into memory. This happens when no appropriate SLF4J binding could be found on the class path. Placing one (and only one) of slf4j-nop.jar, slf4j-simple.jar, slf4j-log4j12.jar, slf4j-jdk14.jar or logback-classic.jar on the class path should solve the problem. As of SLF4J version 1.6, in the absence of a binding, SLF4J will default to a no-operation (NOP) logger implementation. You can download SLF4J bindings from the project download page.

...

Then you should not get issues with held connections or too many open connections

On the broker side (in activemq.xml) you need to

  1. set advisorySupport="false" for the broker (except if you want to use advisory messages)
  2. use transport.soTimeout=60000 and set enableStatusMonitor="true" for the OpenWire connector

    Code Block
    <broker xmlns="http://activemq.apache.org/schema/core" brokerName="localhost" dataDirectory="${activemq.base}/data" destroyApplicationContextOnStop="true" advisorySupport="false">
    ....
    <transportConnector name="openwire" uri="tcp://0.0.0.0:61616?transport.soTimeout=60000" enableStatusMonitor="true"/>