...
- Running System Test
- Specify what test cases to run
- Test Report Interpretation
- List of test cases and their descriptions
Quick Start
Running System Test
1. Check out the Kafka codebase:

```shell
~ $ git clone https://git-wip-us.apache.org/repos/asf/kafka.git
~ $ cd <kafka>
<kafka> $ git checkout 0.8
```

2. Under `<kafka>`, build Kafka:

```shell
<kafka> $ ./sbt package assembly-package-dependency
```

(or, on builds that use Gradle: `<kafka> $ ./gradlew jar`)

3. Edit `<kafka>/config/log4j.properties` to uncomment the following 2 lines:

```
#log4j.logger.kafka.perf=DEBUG, kafkaAppender
#log4j.logger.kafka.perf.ProducerPerformance$ProducerThread=DEBUG, kafkaAppender
```

These 2 lines make ProducerPerformance print debugging messages (shown below) that System Test needs in order to validate data loss:

```
[2013-07-08 09:32:51,933] DEBUG Topic:test_1:ThreadID:0:MessageID:0000000000:xxxxxxxxxxx
[2013-07-08 09:32:51,933] DEBUG Topic:test_1:ThreadID:1:MessageID:0000000100:xxxxxxxxxxx
```

4. Set the JAVA_HOME environment variable (optional, but recommended):

```shell
export JAVA_HOME=/usr/java/jdk1.7.0_67
```

5. Make sure that you can ssh to localhost without a password.

6. Under `<kafka>/system_test`, execute the following command to start System Test:

```shell
$ python -u -B system_test_runner.py 2>&1 | tee system_test_output.log
```
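The data-loss validation relies on the `MessageID` debug lines shown in step 3. As an illustration only (the real validation lives in `utils/kafka_system_test_utils.py`, and this is a simplified sketch, not that code), such lines can be parsed to count unique messages per topic:

```python
import re

# Matches the ProducerPerformance debug lines enabled in log4j.properties above.
LINE_RE = re.compile(r"DEBUG Topic:(\S+?):ThreadID:(\d+):MessageID:(\d+)")

def unique_message_ids(log_lines):
    """Return the set of unique (topic, MessageID) pairs seen in the log."""
    ids = set()
    for line in log_lines:
        m = LINE_RE.search(line)
        if m:
            topic, _thread_id, msg_id = m.groups()
            ids.add((topic, msg_id))
    return ids

lines = [
    "[2013-07-08 09:32:51,933] DEBUG Topic:test_1:ThreadID:0:MessageID:0000000000:xxx",
    "[2013-07-08 09:32:51,933] DEBUG Topic:test_1:ThreadID:1:MessageID:0000000100:xxx",
]
print(len(unique_message_ids(lines)))  # 2 unique messages
```

Comparing the unique IDs seen by producers and consumers is what the "Unique messages from ..." counts in the test report below are based on.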
...
```
# ================================================
#   Test results interpretations
# ================================================
```

1. PASSED case - a PASSED test case should have a test result similar to the following:

```
_test_case_name  :  testcase_0201
_test_class_name :  ReplicaBasicTest
arg : bounce_broker  :  true
arg : broker_type  :  controller
arg : message_producing_free_time_sec  :  15
arg : num_iteration  :  3
arg : num_messages_to_produce_per_producer_call  :  50
arg : num_partition  :  3
arg : replica_factor  :  3
arg : signal_type  :  SIGTERM
arg : sleep_seconds_between_producer_calls  :  1
validation_status  :
     No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r1.log  :  711
     No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r2.log  :  711
     No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r3.log  :  711
     No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r1.log  :  700
     No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r2.log  :  700
     No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r3.log  :  700
     No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r1.log  :  604
     No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r2.log  :  604
     No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r3.log  :  604
     Unique messages from consumer on [test_1]  :  2000
     Unique messages from producer on [test_1]  :  2000
     Validate for data matched on topic [test_1]  :  PASSED                    <----+
     Validate for data matched on topic [test_1] across replicas  :  PASSED    <----+-- All validations
     Validate for merged log segment checksum in cluster [source]  :  PASSED   <----+   PASSED
     Validate index log in cluster [source]  :  PASSED                         <----+
```

2. FAILED case - a FAILED test case is shown below, with data loss in topic test_1:

```
_test_case_name  :  testcase_5005
_test_class_name :  MirrorMakerTest
arg : bounce_leader  :  false
arg : bounce_mirror_maker  :  true
arg : bounced_entity_downtime_sec  :  30
arg : message_producing_free_time_sec  :  15
arg : num_iteration  :  1
arg : num_messages_to_produce_per_producer_call  :  50
arg : num_partition  :  2
arg : replica_factor  :  3
arg : sleep_seconds_between_producer_calls  :  1
validation_status  :
     Unique messages from consumer on [test_1]  :  1392        <------+
     Unique messages from consumer on [test_2]  :  1400               |
     Unique messages from producer on [test_1]  :  1400               |
     Unique messages from producer on [test_2]  :  1400               |
     Validate for data matched on topic [test_1]  :  FAILED    <------+-- FAILED because of a data
     Validate for data matched on topic [test_2]  :  PASSED               mismatch on topic "test_1"
     Validate for merged log segment checksum in cluster [source]  :  PASSED
     Validate for merged log segment checksum in cluster [target]  :  PASSED
```

3. A skipped test case will have a result similar to the following (no validation status details):

```
_test_case_name  :  testcase_0201
_test_class_name :  ReplicaBasicTest
arg : bounce_broker  :  true
arg : broker_type  :  controller
arg : message_producing_free_time_sec  :  15
arg : num_iteration  :  3
arg : num_messages_to_produce_per_producer_call  :  50
arg : num_partition  :  3
arg : replica_factor  :  3
arg : signal_type  :  SIGTERM
arg : sleep_seconds_between_producer_calls  :  1
validation_status  :
```
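The core of the PASSED/FAILED decision for "data matched" is a comparison of the unique message sets from producer and consumer. A minimal illustrative sketch (a hypothetical helper, not the actual validation code in the utils scripts):

```python
def validate_data_matched(producer_ids, consumer_ids):
    """Sketch of the data-matched check: every unique message the producer
    sent must also appear on the consumer side, otherwise data was lost."""
    lost = producer_ids - consumer_ids
    return ("PASSED" if not lost else "FAILED", lost)

# Mirrors the FAILED case above: 1400 unique messages produced on test_1,
# but only 1392 unique messages consumed.
produced = {f"{i:010d}" for i in range(1400)}
consumed = produced - {f"{i:010d}" for i in range(8)}  # simulate 8 lost messages

status, lost = validate_data_matched(produced, consumed)
print(status, len(lost))  # FAILED 8
```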
Test Case Description
See Kafka System Tests for the list of test cases and their descriptions.
Misc
Directory Structure Overview
...
```
<kafka>
 |- /bin
 |- /config
 |- /contrib
 |- /core
 |- /lib
 |  .
 |  .
 |  .
 |- /system_test
 |   |- system_test_runner.py
 |   |     # system_test_runner.py is the main script that starts System Test as follows:
 |   |     #
 |   |     # 1. for each test suite directory (XXXX_testsuite) under
 |   |     #    systemTestEnv.SYSTEM_TEST_BASE_DIR (<kafka>/system_test):
 |   |     # 2.     get a list of module scripts (*.py) in the test suite directory
 |   |     # 3.     for each file in the list of module scripts:
 |   |     # 4.         get the class name from the Python module script
 |   |     # 5.         retrieve the corresponding suite & module names from the class name
 |   |     # 6.         dynamically load the module and start the test class
 |   |     # 7.         save each test case result in systemTestEnv.systemTestResultsList
 |   |     # 8. for each result in systemTestEnv.systemTestResultsList:
 |   |     # 9.     print the result
 |   |
 |   |- /utils   # contains all helper classes / util functions for System Test
 |   |   |- kafka_system_test_utils.py  # utilities specific to Kafka system testing
 |   |   |                              # (e.g. Kafka test case data-loss validation)
 |   |   |- replication_utils.py        # utilities specific to replication testing
 |   |   |                              # (e.g. leader election log message pattern)
 |   |   |- setup_utils.py              # generic helpers for System Test setup
 |   |   |                              # (e.g. System Test environment setup)
 |   |   |- system_test_utils.py        # utilities for generic testing purposes
 |   |   |                              # (e.g. reading a JSON data file)
 |   |   |- testcase_env.py             # test case environment setup (e.g. data structure
 |   |   |                              # initialization such as broker-pid mapping)
 |   |   |- metrics.py                  # utilities for metrics collection
 |   |   |- pyh.py                      # from http://code.google.com/p/pyh (open source)
 |   |
 |   |- cluster_config.json
 |   |     # cluster_config.json specifies the logical machine configuration:
 |   |     # 1. what entities (producer, consumer, ...) should be running
 |   |     # 2. in which cluster (source or target)
 |   |     # 3. where they should be running (physical nodes)
 |   |     #
 |   |     # entity_id    : In each test case there may be zookeeper(s), broker(s),
 |   |     #                producer(s) and consumer(s) involved. "entity_id" uniquely
 |   |     #                identifies each component inside the system test.
 |   |     # hostname     : The name of the machine in the distributed environment.
 |   |     #                "localhost" is used by default.
 |   |     # role         : Supported values: "zookeeper", "broker", "mirror_maker",
 |   |     #                "migration_tool", "producer", "consumer".
 |   |     # cluster_name : Supported values: "source", "target".
 |   |     # kafka_home   : The Kafka installation directory of each machine in a
 |   |     #                distributed environment. "default" is used by default,
 |   |     #                and ../ is assumed to be "kafka_home".
 |   |     # java_home    : The JAVA_HOME of each machine in a distributed environment.
 |   |     #                1. "default" is used by default. If JAVA_HOME is set in the
 |   |     #                   environment, that value is used. Otherwise, System Test
 |   |     #                   executes the shell command "which java" to find the java
 |   |     #                   bin dir and sets JAVA_HOME accordingly. If no java binary
 |   |     #                   can be found, it throws an exception and exits.
 |   |     #                2. If a path other than "default" is specified, System Test
 |   |     #                   verifies the java binary there; otherwise it throws an
 |   |     #                   exception and exits.
 |   |     # jmx_port     : A JMX port for each component. It must be unique inside
 |   |     #                cluster_config.json.
 |   |     #
 |   |     # {
 |   |     #   "cluster_config": [
 |   |     #     {
 |   |     #       "entity_id"    : "0",
 |   |     #       "hostname"     : "localhost",
 |   |     #       "role"         : "zookeeper",
 |   |     #       "cluster_name" : "source",
 |   |     #       "kafka_home"   : "default",
 |   |     #       "java_home"    : "default",
 |   |     #       "jmx_port"     : "9990"
 |   |     #     },
 |   |     #     .
 |   |     #     .
 |   |     #   ]
 |   |     # }
 |   |     #
 |   |     # Note: each "entity_id" in cluster_config.json must match the corresponding
 |   |     # "entity_id" in testcase_NNNN_properties.json, shown further below.
 |   |
 |   |- /XXXX_testsuite
 |       |     # XXXX_testsuite is a directory that contains the Python scripts and test
 |       |     # case directories for a specific group of functional tests. For example,
 |       |     # mirror_maker_testsuite contains mirror_maker_test.py, which starts the
 |       |     # mirror maker instances that are not required by the test cases in
 |       |     # migration_tool_testsuite.
 |       |     #
 |       |     # The existing test suites are:
 |       |     #   migration_tool_testsuite
 |       |     #   mirror_maker_testsuite
 |       |     #   replication_testsuite
 |       |
 |       |- <testsuite>.py
 |       |     # <testsuite>.py implements the test logic for a group of test scenarios.
 |       |     # For example, in mirror_maker_testsuite, mirror_maker_test.py is the
 |       |     # testsuite script, which is implemented with the following steps:
 |       |     #   1. start zookeeper(s) in the source cluster
 |       |     #   2. start broker(s) in the source cluster
 |       |     #   3. start zookeeper(s) in the target cluster
 |       |     #   4. start broker(s) in the target cluster
 |       |     #   5. start mirror maker(s)
 |       |     #   6. start producer
 |       |     #   7. start consumer
 |       |     #   8. stop all entities
 |       |     #   9. validate no data loss
 |       |     #
 |       |     # The test logic above is implemented in a generalized fashion; the
 |       |     # following test case arguments are read from testcase_NNNN_properties.json:
 |       |     #   1. replication factor (broker)
 |       |     #   2. no. of partitions (broker)
 |       |     #   3. log segment bytes (broker)
 |       |     #   4. topics (producer / consumer)
 |       |     #   .
 |       |     #   .
 |       |     #   .
 |       |     #
 |       |     # Therefore, each testsuite can be thought of as a functional / feature
 |       |     # test group that varies a combination of settings, for easier maintenance.
 |       |
 |       |- /config
 |       |   |     # This config directory contains the TEMPLATE properties files for the
 |       |   |     # testsuite. For replication_testsuite, only server.properties and
 |       |   |     # zookeeper.properties are required. mirror_maker_testsuite, however,
 |       |   |     # has mirror_consumer.properties and mirror_producer.properties as well.
 |       |   |     #
 |       |   |     # System Test reads the properties from each file in this directory as
 |       |   |     # a TEMPLATE and overrides the values with those from
 |       |   |     # testcase_NNNN_properties.json accordingly.
 |       |   |
 |       |   |- server.properties
 |       |   |- zookeeper.properties
 |       |   |- (migration_consumer.properties)     # only in migration_tool_testsuite
 |       |   |- (migration_producer.properties)     # only in migration_tool_testsuite
 |       |   |- (mirror_maker_consumer.properties)  # only in mirror_maker_testsuite
 |       |   |- (mirror_maker_producer.properties)  # only in mirror_maker_testsuite
 |       |
 |       |- /testcase_NNNN
 |           |     # testcase_NNNN is a directory that contains a json file specifying the
 |           |     # arguments required by the <testsuite>.py script. The main arguments are:
 |           |     #   1. testcase_args : arguments specific to that test case, such as
 |           |     #                      "replica_factor", "num_partition",
 |           |     #                      "bounce_broker", ... etc.
 |           |     #   2. entities      : arguments required to start a certain entity such
 |           |     #                      as a broker: port, replication factor, log file
 |           |     #                      directory, ... etc.
 |           |
 |           |- (config)  # generated when this test case is executed in system test runtime
 |           |- (logs)    # generated when this test case is executed in system test runtime
 |           |- testcase_NNNN_properties.json
 |                 # this file contains new values that override the default settings of
 |                 # the various entities (e.g. ZK, broker, producer, mirror maker, ...)
 |                 # such as "log.segment.bytes", "num.partitions", "broker.id", ...
 |                 # {
 |                 #   "description": {"01":"Replication Basic : Base Test",
 |                 #                   "02":"Produce and consume messages to a single topic - single partition.",
 |                 #                   "03":"This test sends messages to 3 replicas",
 |                 #                   .
 |                 #                   .
 |                 #                  },
 |                 #   "testcase_args": {
 |                 #     "broker_type"    : "leader",
 |                 #     "bounce_broker"  : "false",
 |                 #     "replica_factor" : "3",
 |                 #     "num_partition"  : "1",
 |                 #     "num_iteration"  : "1",
 |                 #     .
 |                 #     .
 |                 #   },
 |                 #   "entities": [
 |                 #     {
 |                 #       "entity_id"       : "0",
 |                 #       "clientPort"      : "2188",
 |                 #       "dataDir"         : "/tmp/zookeeper_0",
 |                 #       "log_filename"    : "zookeeper_2188.log",
 |                 #       "config_filename" : "zookeeper_2188.properties"
 |                 #     },
 |                 #     .
 |                 #     .
 |                 #   ]
 |                 # }
 |                 #
 |                 # Notes:
 |                 # 1. All attributes defined in an entity must be valid attributes for
 |                 #    the "role" that cluster_config.json assigns to the same "entity_id".
 |                 # 2. In this case, the attributes are all zookeeper related, as
 |                 #    cluster_config.json has already defined entity_id 0 with role
 |                 #    "zookeeper" above.
```
...
The following describes the steps to troubleshoot a failing test case running on a local machine. To troubleshoot failures in a distributed environment (e.g. Hudson build failures), please refer to Kafka System Tests.
Refer to Running System Test above for how to quick-start System Test.
...