
Excerpt

This page describes hints for developing Apache Ignite tests and investigating their failures.

 



Ignite Specific Test Frameworks

...

there is a String array MissingProperties. This array stores properties that are missing on the .NET side. Adding a property to this list stops the Parity test from failing, but a property should be added only after the corresponding issue has been created. The issue number can be added as a comment.

Test configurations

Because OptimizedMarshaller was removed from the public API in Ignite 2.0, several test suites that became unnecessary were removed from the Ignite 2.0 build plan.

For Ignite 2.0+ tests, please use the appropriate run configs from the Ignite 2.0 project, which is 14 test suites shorter than the previous plan.

Use  -> Run All to run all suites for your changes. Select your PR in the branch selection.


Misc

Locate the run configuration which runs a test case

Usually it is clear from the test suite name which run config it belongs to.

If it is not clear where a test is executed on TeamCity, it is possible to do the following.

Way 1: Using code

  • Step 1) For a TestCase, use Find Usages in IDEA to find a TestSuite that includes this case. If its name corresponds to some Run Config name, the suite is found.
  • Step 2) Find Usages may be repeated to find the grouping TestSuite.
  • Double-check the search result: the TEST_SUITE parameter in the run configuration includes the full class name of the suite found in step 2.

Way 2: Use the search in the top right corner of TeamCity


Make sure to select the 'Ignite 2.0 Tests' group if 2.0+ tests are required.


Enable Test Debug

To enable debug messages for a test, set the logging level in

incubator-ignite/modules/core/src/test/config/log4j-test.xml

This XML contains commented-out examples of enabling debug for particular packages:

No Format
<category name="org.apache.ignite.cache.query"> <!-- Uncomment to enable Ignite query execution debugging. -->
 <level value="DEBUG"/>
</category>

For example, to debug Exchange messages, the following XML may be inserted into the test config:

Code Block
<category name="org.apache.ignite.internal.processors.cache.distributed.dht">
    <level value="DEBUG"/>
</category>


Info

Be careful about committing a log configuration with debug enabled: it may generate a huge amount of messages on continuous integration.


Test timeout

Fast run config timed out

If a relatively fast run configuration timed out:

Check the time after which the run timed out (or the timeout set on the run configuration). If it is relatively low (e.g. 10 minutes) and other successful runs required 3-9 minutes, consider increasing the timeout.

Check the agent type: some Windows agents work slower than Linux ones.

Check the thread dump: if the build is still running (tests have not even started), consider increasing the timeout.

No Format
"main" prio=6 tid=0x0000000001798000 nid=0x188c runnable [0x000000000168d000]
  java.lang.Thread.State: RUNNABLE
    at java.io.WinNTFileSystem.getBooleanAttributes(Native Method)
    at java.io.File.exists(File.java:813)
    at org.apache.maven.plugin.compiler.AbstractCompilerMojo.hasNewFile(AbstractCompilerMojo.java:1185)
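The check on the dump above can be sketched as a small helper. This is only a heuristic under a stated assumption: frames from the org.apache.maven.plugin.compiler package in the "main" thread are taken to mean that compilation is still in progress and tests have not started. The function name build_still_compiling is hypothetical and is not part of Ignite or TeamCity.

```python
# Assumption: a compiler-plugin frame in the "main" thread means the build
# is still compiling, as in the dump shown above.
COMPILER_PLUGIN_PREFIX = "org.apache.maven.plugin.compiler."


def build_still_compiling(main_thread_frames):
    """Return True if any stack frame belongs to the Maven compiler plugin."""
    for frame in main_thread_frames:
        # Frames look like "at package.Class.method(File.java:123)".
        text = frame.strip()
        if text.startswith("at "):
            text = text[3:]
        if text.startswith(COMPILER_PLUGIN_PREFIX):
            return True
    return False
```

If the helper returns True for the "main" thread of a timed out build, a timeout increase is the likely fix rather than a code change.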


Timed out suite with sufficient timeout

If the timeout is already high, e.g. 2h or more, the timeout probably indicates a problem in the code. To find out the reason:

1) Download the full build log from TC (it is faster to download the compressed build log).

2) Search for 'timed out' or 'Test has been timed out' to find out which test failed.

No Format
[19:24:43]W:		 [org.apache.ignite:ignite-core] [2017-06-19 16:24:43,353][ERROR][main][root] Test has been timed out and will be interrupted (threads dump will be taken before interruption) [test=testPutAllAsyncFailover, timeout=120000]

This line is logged at the end of test execution. 

3) Search backwards for 'Starting test'.

No Format
[19:22:43] :	 [Step 4/5] [2017-06-19 16:22:43,352][INFO ][main][root] >>> Starting test: CacheAsyncOperationsFailoverTxTest#testPutAllAsyncFailover <<<

This line is logged at the beginning of test execution. 

Most likely there is some exception or assertion error between these 2 logged messages.
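Steps 2 and 3 can be automated for a downloaded build log. The following is a minimal sketch, assuming the log lines have the same shape as the two samples above; the function name find_timed_out_tests and the returned dictionary keys are hypothetical, and matching on "Exception" or "AssertionError" is only a rough filter for suspicious lines.

```python
import re

# Patterns derived from the two sample log lines above.
TIMEOUT_RE = re.compile(
    r"Test has been timed out.*\[test=(?P<test>\w+), timeout=(?P<timeout>\d+)\]")
START_RE = re.compile(r">>> Starting test: (?P<cls>[\w.$]+)#(?P<test>\w+) <<<")
SUSPECT_RE = re.compile(r"Exception|AssertionError")


def find_timed_out_tests(lines):
    """For each timed out test, locate its start line and suspicious lines between."""
    results = []
    for end_idx, line in enumerate(lines):
        timed_out = TIMEOUT_RE.search(line)
        if not timed_out:
            continue
        test = timed_out.group("test")
        start_idx, cls = None, None
        # Step 3: search backwards for the matching 'Starting test' line.
        for i in range(end_idx - 1, -1, -1):
            started = START_RE.search(lines[i])
            if started and started.group("test") == test:
                start_idx, cls = i, started.group("cls")
                break
        first = 0 if start_idx is None else start_idx + 1
        suspects = [l for l in lines[first:end_idx] if SUSPECT_RE.search(l)]
        results.append({
            "test": test,
            "class": cls,
            "timeout_ms": int(timed_out.group("timeout")),
            "start_line": start_idx,
            "end_line": end_idx,
            "suspects": suspects,
        })
    return results
```

A possible usage is to read the unpacked build log with open(path, errors="replace").read().splitlines() and print the "suspects" list for every returned entry.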


It is now also possible to run the test locally to check whether it hangs.

4) Thread dump analysis

After a test times out, a thread dump is also logged. To find abnormal activity in this dump, it is useful to take into account the following information:

 - pool type (included into pool name)

 - node name (for tests it may include the test name)
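Both pieces of information can be extracted from a thread name with a short helper. This is a sketch under assumptions: the pool names are the ones listed in the table on this page, and the node name is assumed to be enclosed in percent signs inside the thread name (for example sys-#12%nodeName%). That naming pattern is not stated on this page, so verify it against a real dump. The function name classify_thread is hypothetical.

```python
import re

# Pool names taken from the "Normal thread execution examples" table.
KNOWN_POOLS = [
    "sys-stripe", "sys", "ttl-cleanup-worker", "exchange-worker",
    "grid-nio-worker-tcp-comm", "nio-acceptor", "grid-timeout-worker",
    "tcp-disco-sock-reader", "tcp-disco-ip-finder", "tcp-disco-msg-worker",
    "update-thread", "restart-thread", "test-runner", "disco-event-worker",
    "main",
]

# Assumption: the node name is wrapped in percent signs inside the thread name.
NODE_RE = re.compile(r"%([^%]*)%")


def classify_thread(name):
    """Return (pool, node) for a thread name; either part may be None."""
    candidates = [p for p in KNOWN_POOLS
                  if name == p or name.startswith(p + "-")]
    # Prefer the longest match so "sys-stripe-..." is not reported as "sys".
    pool = max(candidates, key=len, default=None)
    node = NODE_RE.search(name)
    return pool, (node.group(1) if node else None)
```

Grouping the threads of a dump by the returned pool makes it easier to compare each group with the normal traces listed below.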

Normal thread execution examples
Name: sys
Description: System execution pool, responsible for processing internal system messages. See also the message flow section from Ignite Tests How To.
Normal trace: Waiting for a task to execute

No Format
state=TIMED_WAITING
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)


Name: ttl-cleanup-worker
Description: Entry cleanup worker. Provides expiration functionality for cache entries.
Normal trace: Periodic sleep and wakeup

No Format
state=TIMED_WAITING
at java.lang.Thread.sleep(Native Method)
at o.a.i.i.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:137)


Name: exchange-worker
Description: One thread per node. Performs partition map exchange. Using a single thread for exchange provides a strict order of actions. See also the "Partition Map Exchange" section from Ignite Tests How To.
Normal trace: If there is no exchange, waits on the queue


Name: grid-nio-worker-tcp-comm

Name: nio-acceptor

Name: grid-timeout-worker


Name: sys-stripe
Description: See also the 'Striped pool' section from Part 2.
Normal trace: Waiting on queue

No Format
state=WAITING
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:315)
at o.a.i.i.util.StripedExecutor$StripeConcurrentQueue.take(StripedExecutor.java:581)


Name: tcp-disco-sock-reader
Normal trace: Reads socket

No Format
at SocketInputStream.socketRead0(Native Method)


Name: tcp-disco-ip-finder


Name: tcp-disco-msg-worker
Normal trace: Waiting on queue

No Format
at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:682)
at o.a.i.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6565)


Name: update-thread

Name: restart-thread


Name: test-runner
Description: Runs the test itself.
Normal trace: Test method, e.g. CacheAsyncOperationsFailoverAbstractTest.testPutAllAsyncFailover()


Name: disco-event-worker
Normal trace: Waiting on queue

No Format
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at o.a.i.i.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:2448)


Name: main
Description: Starts up the test runner thread and waits for it to complete within getTestTimeout().
Normal trace: ThreadImpl.dumpThreads0 - this thread checks whether the timeout occurred and initiates the thread dump.