Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • luceneQueryReturnsCorrectResultsAfterClientAndServersAreRolledOver
  • luceneQueryReturnsCorrectResultAfterTwoLocatorsWithTwoServersAreRolled
  • luceneQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion

Code Block
collapsetrue
org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit > luceneQueryReturnsCorrectResultsAfterClientAndServersAreRolledOver[0] FAILED
    org.apache.geode.test.dunit.RMIException: While invoking org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit$$Lambda$204/1077331008.run in VM 2 running on Host 192.168.2.6 with 4 VMs
        at org.apache.geode.test.dunit.VM.invoke(VM.java:393)
        at org.apache.geode.test.dunit.VM.invoke(VM.java:363)
        at org.apache.geode.test.dunit.VM.invoke(VM.java:308)
        at org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.verifyLuceneQueryResultInEachVM(LuceneSearchWithRollingUpgradeDUnit.java:553)
        at org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.putSerializableObjectAndVerifyLuceneQueryResult(LuceneSearchWithRollingUpgradeDUnit.java:500)
        at org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.luceneQueryReturnsCorrectResultsAfterClientAndServersAreRolledOver(LuceneSearchWithRollingUpgradeDUnit.java:264)
        Caused by:
        java.lang.reflect.InvocationTargetException
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:497)
            at org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.verifyLuceneQueryResults(LuceneSearchWithRollingUpgradeDUnit.java:536)
            at org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.lambda$verifyLuceneQueryResultInEachVM$b83f705c$2(LuceneSearchWithRollingUpgradeDUnit.java:553)
            Caused by:
            java.lang.IllegalArgumentException: Could not load codec 'Lucene62'.  Did you forget to add lucene-backward-codecs.jar?
                Caused by:
                java.lang.IllegalArgumentException: An SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene62' does not exist.  You need to add the corresponding JAR file supporting this SPI to your classpath.  The current classpath supports the following names: [Lucene70, Asserting, CheapBastard, FastCompressingStoredFields, FastDecompressionCompressingStoredFields, HighCompressionCompressingStoredFields, DummyCompressingStoredFields, SimpleText]

...

org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit luceneQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion failed with:

Code Block
collapsetrue
[info 2017/10/18 16:14:35.167 PDT <RMI TCP Connection(1)-192.168.2.6> tid=0x13] Got result: EXCEPTION_OCCURRED
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.verifyLuceneQueryResults(LuceneSearchWithRollingUpgradeDUnit.java:536)
	at org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeDUnit.lambda$verifyLuceneQueryResultInEachVM$b83f705c$2(LuceneSearchWithRollingUpgradeDUnit.java:553)
Caused by: org.apache.geode.InternalGemFireError: Unable to create index repository
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.lambda$computeRepository$0(AbstractPartitionedRepositoryManager.java:118)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager$$Lambda$35/248730969.apply(Unknown Source)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.computeRepository(AbstractPartitionedRepositoryManager.java:108)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepository(AbstractPartitionedRepositoryManager.java:137)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepositories(AbstractPartitionedRepositoryManager.java:91)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.distributed.LuceneQueryFunction.execute(LuceneQueryFunction.java:105)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.internal.cache.PartitionedRegionDataStore.executeOnDataStore(PartitionedRegionDataStore.java:2911)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.internal.cache.partitioned.PartitionedRegionFunctionStreamingMessage.operateOnPartitionedRegion(PartitionedRegionFunctionStreamingMessage.java:98)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:332)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:374)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:440)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:662)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.distributed.internal.DistributionManager$9$1.run(DistributionManager.java:1108)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in java.lang.Thread.run(Thread.java:745)
	at org.apache.geode.distributed.internal.ReplyException.handleAsUnexpected(ReplyException.java:89)
	at org.apache.geode.internal.cache.execute.FunctionStreamingResultCollector.waitForCacheOrFunctionException(FunctionStreamingResultCollector.java:465)
	at org.apache.geode.internal.cache.partitioned.PRFunctionStreamingResultCollector.getResult(PRFunctionStreamingResultCollector.java:91)
	at org.apache.geode.cache.lucene.internal.LuceneQueryImpl.findTopEntries(LuceneQueryImpl.java:121)
	at org.apache.geode.cache.lucene.internal.LuceneQueryImpl.findKeys(LuceneQueryImpl.java:69)
	... 31 more
Caused by: org.apache.lucene.index.IndexFormatTooNewException: Format version is not supported (resource BufferedChecksumIndexInput(segments_2)): 7 (needs to be between 4 and 6)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:216)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:302)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:286)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:938)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.computeIndexRepository(IndexRepositoryFactory.java:84)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.computeRepository(PartitionedRepositoryManager.java:42)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.lambda$computeRepository$0(AbstractPartitionedRepositoryManager.java:116)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager$$Lambda$35/248730969.apply(Unknown Source)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.computeRepository(AbstractPartitionedRepositoryManager.java:108)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepository(AbstractPartitionedRepositoryManager.java:137)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.getRepositories(AbstractPartitionedRepositoryManager.java:91)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.cache.lucene.internal.distributed.LuceneQueryFunction.execute(LuceneQueryFunction.java:105)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.internal.cache.PartitionedRegionDataStore.executeOnDataStore(PartitionedRegionDataStore.java:2911)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.internal.cache.partitioned.PartitionedRegionFunctionStreamingMessage.operateOnPartitionedRegion(PartitionedRegionFunctionStreamingMessage.java:98)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.internal.cache.partitioned.PartitionMessage.process(PartitionMessage.java:332)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:374)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:440)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:662)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in org.apache.geode.distributed.internal.DistributionManager$9$1.run(DistributionManager.java:1108)
	at Remote Member '192.168.2.6(57006)<v3>:32773(version:GEODE 1.2.0)' in java.lang.Thread.run(Thread.java:745)

Analysis

TBD - More work hereThis test does not fail every time.

The test:

  • creates a LuceneIndex in 3 servers on an older version servers
  • rolls each server one at a time and:
    • creates the region
    • executes a query
    • verifies the results

The LuceneQueryImpl.findKeys method causes LuceneQueryFunction to be invoked in an older member. That older member tries to create an IndexWriter and fails.

The exception in the older member is:

Code Block
[vm2_v120] [info 2017/10/25 16:33:49.576 PDT <Pooled Waiting Message Processor 7> tid=0x78] Exception thrown while constructing Lucene Index for bucket:53 for file region:/__PR/_B__index#__aRegion.files_53
[vm2_v120] [warn 2017/10/25 16:33:49.577 PDT <Pooled Waiting Message Processor 7> tid=0x78] Task failed with exception
[vm2_v120] org.apache.geode.InternalGemFireError: Unable to create index repository
[vm2_v120] 	at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.lambda$computeRepository$0(AbstractPartitionedRepositoryManager.java:118)
[vm2_v120] 	at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager$$Lambda$35/1749028657.apply(Unknown Source)
[vm2_v120] 	at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
[vm2_v120] 	at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.computeRepository(AbstractPartitionedRepositoryManager.java:108)
[vm2_v120] 	at org.apache.geode.cache.lucene.internal.LuceneBucketListener.lambda$afterPrimary$0(LuceneBucketListener.java:40)
[vm2_v120] 	at org.apache.geode.cache.lucene.internal.LuceneBucketListener$$Lambda$34/1771106111.run(Unknown Source)
[vm2_v120] 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[vm2_v120] 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[vm2_v120] 	at org.apache.geode.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:662)
[vm2_v120] 	at org.apache.geode.distributed.internal.DistributionManager$6$1.run(DistributionManager.java:989)
[vm2_v120] 	at java.lang.Thread.run(Thread.java:745)
[vm2_v120] Caused by: org.apache.lucene.index.IndexFormatTooNewException: Format version is not supported (resource BufferedChecksumIndexInput(segments_2)): 7 (needs to be between 4 and 6)
[vm2_v120] 	at org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:216)
[vm2_v120] 	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:302)
[vm2_v120] 	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:286)
[vm2_v120] 	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:938)
[vm2_v120] 	at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.computeIndexRepository(IndexRepositoryFactory.java:84)
[vm2_v120] 	at org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.computeRepository(PartitionedRepositoryManager.java:42)
[vm2_v120] 	at org.apache.geode.cache.lucene.internal.AbstractPartitionedRepositoryManager.lambda$computeRepository$0(AbstractPartitionedRepositoryManager.java:116)
[vm2_v120] 	... 10 more

The SegmentInfos.readCommit method is reading an int from a FileIndexInput which is gotten from RegionDirectory.openInput:

Code Block
public IndexInput openInput(String name, IOContext context) throws IOException {
	ensureOpen();
	final File file = fs.getFile(name);
	return new FileIndexInput(name, file);
}

The FileSystem.getFile method does a get on the fileAndChunkRegion:

Code Block
final File file = (File) fileAndChunkRegion.get(name);

I'm not sure which File is being requested but all of the activity on that bucket is from current version vm0:

Code Block
[vm0] [warn 2017/10/25 16:33:37.613 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1.fdx
[vm0] [warn 2017/10/25 16:33:37.620 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1.fdt
[vm0] [warn 2017/10/25 16:33:37.672 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1.nvd
[vm0] [warn 2017/10/25 16:33:37.682 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1.nvm
[vm0] [warn 2017/10/25 16:33:37.697 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1.nvd
[vm0] [warn 2017/10/25 16:33:37.699 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1.nvm
[vm0] [warn 2017/10/25 16:33:37.703 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1.fdt
[vm0] [warn 2017/10/25 16:33:37.706 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1.fdx
[vm0] [warn 2017/10/25 16:33:37.708 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1_Lucene50_0.doc
[vm0] [warn 2017/10/25 16:33:37.711 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1_Lucene50_0.pos
[vm0] [warn 2017/10/25 16:33:37.714 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1_Lucene50_0.tim
[vm0] [warn 2017/10/25 16:33:37.717 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1_Lucene50_0.tip
[vm0] [warn 2017/10/25 16:33:37.721 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1_Lucene50_0.tim
[vm0] [warn 2017/10/25 16:33:37.724 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1_Lucene50_0.tip
[vm0] [warn 2017/10/25 16:33:37.725 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1_Lucene50_0.doc
[vm0] [warn 2017/10/25 16:33:37.729 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1_Lucene50_0.pos
[vm0] [warn 2017/10/25 16:33:37.730 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1.fnm
[vm0] [warn 2017/10/25 16:33:37.733 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1.fnm
[vm0] [warn 2017/10/25 16:33:37.737 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1.cfs
[vm0] [warn 2017/10/25 16:33:37.739 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1.cfe
[vm0] [warn 2017/10/25 16:33:37.750 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1.cfe
[vm0] [warn 2017/10/25 16:33:37.752 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1.cfs
[vm0] [warn 2017/10/25 16:33:37.754 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=_1.si
[vm0] [warn 2017/10/25 16:33:37.758 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=_1.si
[vm0] [warn 2017/10/25 16:33:37.778 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.createFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=pending_segments_2
[vm0] [warn 2017/10/25 16:33:37.788 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=pending_segments_2
[vm0] [warn 2017/10/25 16:33:37.789 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.renameFile fileAndChunkRegion=_B__index#__aRegion.files_53; source=pending_segments_2
[vm0] [warn 2017/10/25 16:33:37.790 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.updateFile fileAndChunkRegion=_B__index#__aRegion.files_53; key=pending_segments_2
[vm0] [warn 2017/10/25 16:33:37.792 PDT <Event Processor for GatewaySender_AsyncEventQueue_index#_aRegion_3> tid=0x5a] FileSystem.putIfAbsentFile fileAndChunkRegion=_B__index#__aRegion.files_53; name=segments_2

So, it looks like a newer version server with Lucene 7 puts a File into the fileAndChunkRegion, and an older version server with Lucene 6 gets it out and attempts to read it unsuccessfully.

I added the luceneQueryReturnsCorrectResultsAfterClientAndServersAreRolledOverAllBucketsCreated test to LuceneSearchWithRollingUpgradeDUnit that reproduces this issue every time.

The main sequence in this test that causes the failure is:

  1. start two servers with old version using Lucene 6
  2. roll one server to new version server using Lucene 7
  3. do puts into primary buckets in new server which creates entries in the fileAndChunk region with Lucene 7 format
  4. stop the new version server which causes the old version server to become primary for those buckets
  5. do a query which causes the IndexFormatTooNewException to be thrown

Hydra Test Execution

The lucene.bt and backwardCompatibilityLucene.bt had only 1 hung test which I don't think was related to Lucene 7.x.

Code Block
lucene/serialParReg.conf                                       P    00:11:14    closed/pivotalgf-assembly/build/hydraTest/serialParReg-1023-192645
lucene/concParReg.conf                                         P    00:24:33    closed/pivotalgf-assembly/build/hydraTest/concParReg-1023-193804
lucene/serialParRegPersist.conf                                P    00:15:20    closed/pivotalgf-assembly/build/hydraTest/serialParRegPersist-1023-200244
lucene/serialParRegIncremental.conf                            P    00:10:49    closed/pivotalgf-assembly/build/hydraTest/serialParRegIncremental-1023-201809
lucene/concParRegPersist.conf                                  P    00:22:47    closed/pivotalgf-assembly/build/hydraTest/concParRegPersist-1023-202902
lucene/concParRegIncremental.conf                              P    00:12:30    closed/pivotalgf-assembly/build/hydraTest/concParRegIncremental-1023-205153
lucene/concParRegBridge.conf                                   P    00:25:06    closed/pivotalgf-assembly/build/hydraTest/concParRegBridge-1023-210430
lucene/concParRegBridge.conf                                   P    00:25:05    closed/pivotalgf-assembly/build/hydraTest/concParRegBridge-1023-212942
lucene/concParRegBridge.conf                                   P    00:25:46    closed/pivotalgf-assembly/build/hydraTest/concParRegBridge-1023-215452
lucene/concParRegBridge.conf                                   P    00:23:51    closed/pivotalgf-assembly/build/hydraTest/concParRegBridge-1023-222043
lucene/concParRegBridgePersist.conf                            P    00:24:18    closed/pivotalgf-assembly/build/hydraTest/concParRegBridgePersist-1023-224441
lucene/concParRegBridgePersist.conf                            P    00:23:09    closed/pivotalgf-assembly/build/hydraTest/concParRegBridgePersist-1023-230905
lucene/concParRegBridgePersist.conf                            P    00:22:24    closed/pivotalgf-assembly/build/hydraTest/concParRegBridgePersist-1023-233220
lucene/concParRegBridgePersist.conf                            P    00:22:15    closed/pivotalgf-assembly/build/hydraTest/concParRegBridgePersist-1023-235449
lucene/serialParRegHA.conf                                     P    00:10:56    closed/pivotalgf-assembly/build/hydraTest/serialParRegHA-1024-001710
lucene/concParRegHA.conf                                       P    00:32:02    closed/pivotalgf-assembly/build/hydraTest/concParRegHA-1024-002811
lucene/serialParRegHA.conf                                     P    00:11:25    closed/pivotalgf-assembly/build/hydraTest/serialParRegHA-1024-010017
lucene/concParRegHA.conf                                       P    00:32:37    closed/pivotalgf-assembly/build/hydraTest/concParRegHA-1024-011147
lucene/serialParRegHAPersist.conf                              P    00:13:38    closed/pivotalgf-assembly/build/hydraTest/serialParRegHAPersist-1024-014428
lucene/serialParRegHAPersist.conf                              P    00:13:36    closed/pivotalgf-assembly/build/hydraTest/serialParRegHAPersist-1024-015811
lucene/concParRegHAPersist.conf                                P    00:19:42    closed/pivotalgf-assembly/build/hydraTest/concParRegHAPersist-1024-021152
lucene/concParRegHAPersist.conf                                P    00:20:06    closed/pivotalgf-assembly/build/hydraTest/concParRegHAPersist-1024-023138
lucene/concParRegHABridge.conf                                 P    00:34:45    closed/pivotalgf-assembly/build/hydraTest/concParRegHABridge-1024-025150
lucene/concParRegHABridge.conf                                 P    00:33:29    closed/pivotalgf-assembly/build/hydraTest/concParRegHABridge-1024-032640
lucene/concParRegHABridge.conf                                 P    00:33:07    closed/pivotalgf-assembly/build/hydraTest/concParRegHABridge-1024-040015
lucene/concParRegHABridge.conf                                 P    00:33:54    closed/pivotalgf-assembly/build/hydraTest/concParRegHABridge-1024-043328
lucene/concParRegHABridgePersist.conf                          P    00:21:58    closed/pivotalgf-assembly/build/hydraTest/concParRegHABridgePersist-1024-050729
lucene/concParRegHABridgePersist.conf                          P    00:21:55    closed/pivotalgf-assembly/build/hydraTest/concParRegHABridgePersist-1024-052931
lucene/concParRegRecover.conf                                  P    00:13:42    closed/pivotalgf-assembly/build/hydraTest/concParRegRecover-1024-055135
lucene/concParRegRecover.conf                                  P    00:15:49    closed/pivotalgf-assembly/build/hydraTest/concParRegRecover-1024-060521
lucene/concParRegRecover.conf                                  P    00:14:59    closed/pivotalgf-assembly/build/hydraTest/concParRegRecover-1024-062115
lucene/luceneQueriesWithRebalance.conf                         P    00:06:02    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithRebalance-1024-063618
lucene/luceneQueriesWithRebalance.conf                         P    00:06:53    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithRebalance-1024-064226
lucene/luceneQueriesWithRebalancePersist.conf                  P    00:11:01    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithRebalancePersist-1024-064923
lucene/luceneQueriesWithRebalancePersist.conf                  P    00:09:16    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithRebalancePersist-1024-070030
lucene/luceneQueriesWithHA.conf                                P    00:12:03    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithHA-1024-070951
lucene/luceneQueriesWithHA.conf                                H    00:16:37    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithHA-1024-072200
lucene/luceneQueriesWithHA.conf                                P    00:12:38    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithHA-1024-073844
lucene/luceneQueriesWithHA.conf                                P    00:12:56    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithHA-1024-075128
lucene/luceneQueriesWithHA.conf                                P    00:13:05    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithHA-1024-080428
lucene/luceneQueriesWithHAPersist.conf                         P    00:14:38    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithHAPersist-1024-081740
lucene/luceneQueriesWithHAPersist.conf                         P    00:15:38    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithHAPersist-1024-083223
lucene/luceneQueriesWithHAPersist.conf                         P    00:15:51    closed/pivotalgf-assembly/build/hydraTest/luceneQueriesWithHAPersist-1024-084807
lucene/concDestroyIndex.conf                                   P    00:12:10    closed/pivotalgf-assembly/build/hydraTest/concDestroyIndex-1024-090403
lucene/backwardCompatibility/concParRegBridge.conf             P    00:25:50    closed/pivotalgf-assembly/build/hydraTest/concParRegBridge-1024-091618
lucene/backwardCompatibility/concParRegBridge.conf             P    00:25:09    closed/pivotalgf-assembly/build/hydraTest/concParRegBridge-1024-094213
lucene/backwardCompatibility/concParRegBridgePersist.conf      P    00:23:35    closed/pivotalgf-assembly/build/hydraTest/concParRegBridgePersist-1024-100726
lucene/backwardCompatibility/concParRegBridgePersist.conf      P    00:24:43    closed/pivotalgf-assembly/build/hydraTest/concParRegBridgePersist-1024-103106
lucene/backwardCompatibility/concParRegHABridge.conf           P    00:33:06    closed/pivotalgf-assembly/build/hydraTest/concParRegHABridge-1024-105556
lucene/backwardCompatibility/concParRegHABridgePersist.conf    P    00:21:18    closed/pivotalgf-assembly/build/hydraTest/concParRegHABridgePersist-1024-112907

The errors.txt file contained:

Code Block
THREAD vm_0_thr_7_accessor1_mclaren_11724 Subthread Dynamic Client VM Stopper
HANG Timeout during dynamic action: Failed to stop vm_3 within 300 seconds
hydra.HydraTimeoutException: Failed to stop vm_3 within 300 seconds
	at hydra.ClientMgr.killClientVm(ClientMgr.java:969)
	at hydra.ClientMgr.stopClientVm(ClientMgr.java:819)
	at hydra.ClientMgr._stopClientVm(ClientMgr.java:761)
	at hydra.ClientMgr$2.run(ClientMgr.java:720)
	at java.lang.Thread.run(Thread.java:745)
--------------------------------------------------------------------------------

...