This page describes hints for developing Apache Ignite tests and researching their failures.

 


Ignite Specific Test Frameworks

Test compatibility with older versions

Ignite has a built-in framework for testing compatibility. This framework makes it possible to start Ignite instances of previously released versions.


The entire module is built on top of the Ignite Testing Framework, especially on the multi-JVM mode classes. The class IgniteCompatibilityAbstractTest provides methods that start Ignite nodes of versions previously released to the Maven repository in a separate JVM and allow them to join the topology.

The framework looks for artifacts of a specific version in the local Maven repository, and if they don’t exist there, they are downloaded and stored via Maven.


The main implemented API:

Code Block
startGrid(name, version, configurationClosure);
startGrid(name, version, configurationClosure, postStartupClosure);

You can simply specify the version of Ignite which you want to start, define the configuration in the configurationClosure, and set the actions to perform on the started node in the postStartupClosure.

It’s straightforward to use for writing unit tests; here is a simple example which demonstrates the main functions.

Test of .NET API parity with Java API

This test checks that everything in the public API of the Java configuration is also present in .NET, unless specified otherwise. Exceptions are:

  • "it's not necessary in .NET"
  • "it's not yet supported in .NET".

If a property is in the public API but is not in the .NET class, not in the list of unnecessary properties, and not in the list of known unsupported properties, then the test fails. The fix is to explicitly mark the property as not yet implemented in the class

modules/platforms/dotnet/Apache.Ignite.Core.Tests/ApiParity/IgniteConfigurationParityTest.cs

This class contains a String array MissingProperties, which stores the properties that are missing on the .NET side. Adding a property to this list stops the parity test from failing, but it is reasonable to add properties only after creating a corresponding issue. The issue number can be added as a comment.
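
The rule applied by the parity test can be pictured as a set difference. The snippet below is only an illustrative Java sketch, not the actual C# test: the class name, the method name, and the property names are hypothetical, and it does not reproduce how the real test collects property names.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: models the parity rule with plain sets. All names are hypothetical. */
class ParityRuleSketch {
    /**
     * Returns the Java properties that would make the parity test fail:
     * those absent from .NET and from both exception lists.
     */
    static Set<String> unexpectedlyMissing(Set<String> javaProps, Set<String> dotNetProps,
        Set<String> unneeded, Set<String> knownMissing) {
        Set<String> res = new TreeSet<>(javaProps);

        res.removeAll(dotNetProps);   // Already present in .NET.
        res.removeAll(unneeded);      // "It's not necessary in .NET".
        res.removeAll(knownMissing);  // "It's not yet supported in .NET" (MissingProperties).

        return res;
    }

    public static void main(String[] args) {
        Set<String> java = new TreeSet<>(Arrays.asList("propA", "propB", "propC", "propD"));
        Set<String> dotNet = new TreeSet<>(Arrays.asList("propA"));
        Set<String> unneeded = new TreeSet<>(Arrays.asList("propB"));
        Set<String> knownMissing = new TreeSet<>(Arrays.asList("propC"));

        // propD is reported: it is neither present in .NET nor listed as an exception.
        System.out.println(unexpectedlyMissing(java, dotNet, unneeded, knownMissing));
    }
}
```

In this model, adding propD to the known-missing set makes the result empty, which corresponds to adding a property to MissingProperties.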

Test configurations

Since OptimizedMarshaller was removed from the public API in Ignite 2.0, several unnecessary test suites were removed from the build plan starting with Ignite 2.0.

For Ignite 2.0+ tests, please use the appropriate run configs from the Ignite 2.0 project, which is 14 test suites shorter than the previous plan.

Use  -> Run All to run all suites for changes. Select your PR in branch selection.


Misc

Locate run configuration which runs test case

Usually it is clear from the test suite name which run config it belongs to.

But if it is not clear where a test is executed on TeamCity, it is possible to do the following.

Way 1: Using code

  • Step 1) For a TestCase, it is possible to find usages in IDEA. A TestSuite which includes this case may be found. If it corresponds to some run config name, the suite is found.
  • Step 2) Find usages may be repeated to find the grouping TestSuite.
  • Double-check the search result: the TEST_SUITE parameter in the run configuration should include the full class name of the suite found in step 2.

Way 2: Use the search in the top right corner of TeamCity

Make sure to select the 'Ignite 2.0 Tests' group if Ignite 2.0+ tests are required.


Enable Test Debug

To enable debug messages for a test, it is possible to set the debug level in

incubator-ignite/modules/core/src/test/config/log4j-test.xml

This XML contains commented-out examples of enabling debug for particular packages:

No Format
<category name="org.apache.ignite.cache.query"> <!-- Uncomment to enable Ignite query execution debugging. -->
 <level value="DEBUG"/>
</category>

For example, to debug Exchange messages, the following XML may be inserted into the test config:

Code Block
<category name="org.apache.ignite.internal.processors.cache.distributed.dht">
    <level value="DEBUG"/>
</category>


Info

Be careful about committing a log configuration with debug enabled: it may generate a huge amount of messages in continuous integration.


Test timeout

Fast run config timed out

If a relatively fast run configuration timed out:

Check the time after which the test timed out (or the timeout set on the run configuration). If it is relatively low (e.g. 10 minutes) and other successful runs required 3-9 minutes, consider increasing the timeout.

Check the agent type: some Windows agents work slower than Linux ones.

Check the thread dump. If the build is still running (tests have not even started), consider increasing the timeout.

No Format
"main" prio=6 tid=0x0000000001798000 nid=0x188c runnable [0x000000000168d000]
  java.lang.Thread.State: RUNNABLE
    at java.io.WinNTFileSystem.getBooleanAttributes(Native Method)
    at java.io.File.exists(File.java:813)
    at org.apache.maven.plugin.compiler.AbstractCompilerMojo.hasNewFile(AbstractCompilerMojo.java:1185)


Timed out suite with sufficient timeout

If the timeout is already high, e.g. 2h or more, a timeout probably indicates a problem in the code. To find out the reason:

1) Download the full build log from TC (it is faster to download the compressed build log).

2) Search for 'timed out' or 'Test has been timed out' to find out which test failed.

No Format
[19:24:43]W:		 [org.apache.ignite:ignite-core] [2017-06-19 16:24:43,353][ERROR][main][root] Test has been timed out and will be interrupted (threads dump will be taken before interruption) [test=testPutAllAsyncFailover, timeout=120000]

This line is logged at the end of test execution. 

3) Search backwards for 'Starting test'.

No Format
[19:22:43] :	 [Step 4/5] [2017-06-19 16:22:43,352][INFO ][main][root] >>> Starting test: CacheAsyncOperationsFailoverTxTest#testPutAllAsyncFailover <<<

This line is logged at the beginning of test execution. 

Most likely there is some exception or assertion error between these 2 logged messages.

It is now also possible to run the test locally to check whether it hangs.
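
Steps 2 and 3 can also be scripted. The following Java sketch is an illustration under stated assumptions: the class and method names are hypothetical, and the regular expressions are derived only from the two sample log lines shown above, so they may need adjusting for other log formats. It finds the first 'Test has been timed out' message and returns the lines from the matching 'Starting test' message up to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the log window of the first timed out test. */
class TimedOutTestFinder {
    /** Matches the message logged at the end of a timed out test. */
    private static final Pattern TIMED_OUT =
        Pattern.compile("Test has been timed out.*\\[test=([^,\\]]+), timeout=(\\d+)\\]");

    /** Matches the message logged at the beginning of each test. */
    private static final Pattern STARTING = Pattern.compile(">>> Starting test: (\\S+) <<<");

    /** Returns lines from 'Starting test' to the timeout message, or an empty list. */
    static List<String> failedTestWindow(List<String> log) {
        for (int end = 0; end < log.size(); end++) {
            Matcher timedOut = TIMED_OUT.matcher(log.get(end));

            if (!timedOut.find())
                continue;

            String test = timedOut.group(1);

            // Search backwards for the start message of the same test method.
            for (int start = end - 1; start >= 0; start--) {
                Matcher starting = STARTING.matcher(log.get(start));

                if (starting.find() && starting.group(1).endsWith("#" + test))
                    return new ArrayList<>(log.subList(start, end + 1));
            }

            // Start message not found: return only the timeout line.
            return new ArrayList<>(log.subList(end, end + 1));
        }

        return new ArrayList<>();
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
            "[INFO ][main][root] >>> Starting test: CacheAsyncOperationsFailoverTxTest#testPutAllAsyncFailover <<<",
            "[ERROR][main][root] Test has been timed out and will be interrupted"
                + " (threads dump will be taken before interruption)"
                + " [test=testPutAllAsyncFailover, timeout=120000]");

        for (String line : failedTestWindow(log))
            System.out.println(line);
    }
}
```

Any exception or assertion error should then be visible in the returned window.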

4) Thread dump analysis

After a test times out, a thread dump is also logged. To find abnormal activity in this dump, it is useful to take the following information into account:

 - pool type (included in the pool name)

 - node name (for a test, it may include the test name)

Normal thread execution examples

Name | Description | Normal trace
sys

System execution pool, responsible for processing internal system messages.

See also message flow section from Ignite Tests How To

Waiting for a task to execute

No Format
state=TIMED_WAITING
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)     
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)


ttl-cleanup-worker

Entry cleanup worker. Provides functionality of expiration for cache entries

Periodic sleep and wakeup

No Format
state=TIMED_WAITING 
at java.lang.Thread.sleep(Native Method)
at o.a.i.i.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:137)


exchange-worker

One thread per node. Partition maps exchange. Usage of one thread for exchange provides strict actions order.

See also "Partition Map Exchange" section from Ignite Tests How To

If there is no exchange, waits on the queue

grid-nio-worker-tcp-comm

nio-acceptor

grid-timeout-worker

sys-stripe

See also 'Striped pool' section from Part 2

Waiting on queue

No Format
state=WAITING
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:315)
at o.a.i.i.util.StripedExecutor$StripeConcurrentQueue.take(StripedExecutor.java:581)


tcp-disco-sock-reader

Reads socket

No Format
at SocketInputStream.socketRead0(Native Method)


tcp-disco-ip-finder

tcp-disco-msg-worker

Waiting on queue

No Format
at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:682)
at o.a.i.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6565)


update-thread

restart-thread

test-runner

Runs the test itself

Test method, e.g. CacheAsyncOperationsFailoverAbstractTest.testPutAllAsyncFailover()

disco-event-worker

Waiting on queue

No Format
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at o.a.i.i.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:2448)









main

Starts up the test runner thread and waits for it to complete within getTestTimeout()

ThreadImpl.dumpThreads0 - this thread checks whether a timeout occurred and initiates the thread dump
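
When a dump contains many threads, it can help to group them by pool before comparing their traces with the normal ones listed above. The Java sketch below is a minimal illustration: the class and method names are hypothetical, and it assumes that a thread name either equals a pool name from the table or starts with that pool name followed by "-", which is not confirmed by this page and may need adjusting for real dumps.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: maps a thread name to one of the pool names from the table above. */
class ThreadPoolClassifier {
    /** Pool names taken from the table of normal thread execution examples. */
    static final List<String> KNOWN_POOLS = Arrays.asList(
        "sys", "ttl-cleanup-worker", "exchange-worker", "grid-nio-worker-tcp-comm",
        "nio-acceptor", "grid-timeout-worker", "sys-stripe", "tcp-disco-sock-reader",
        "tcp-disco-ip-finder", "tcp-disco-msg-worker", "update-thread", "restart-thread",
        "test-runner", "disco-event-worker", "main");

    /** Returns the longest matching pool name, or "unknown" if none matches. */
    static String poolOf(String threadName) {
        String best = null;

        for (String pool : KNOWN_POOLS) {
            // Assumption: the name is the pool name itself or starts with "<pool>-".
            boolean matches = threadName.equals(pool) || threadName.startsWith(pool + "-");

            // Prefer the longest match so that "sys-stripe" wins over "sys".
            if (matches && (best == null || pool.length() > best.length()))
                best = pool;
        }

        return best == null ? "unknown" : best;
    }

    public static void main(String[] args) {
        // The thread names below are made up for illustration.
        for (String name : Arrays.asList("sys-stripe-3", "sys-7", "main", "my-custom-thread"))
            System.out.println(name + " -> " + poolOf(name));
    }
}
```

Threads classified as "unknown", or threads of a known pool whose trace differs from the normal one, are candidates for a closer look.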