
...

Quick Start

Running System Test

Code Block
1. Check out kafka codebase:

   a. ~ $ git clone https://git-wip-us.apache.org/repos/asf/kafka.git
   b. ~ $ cd <kafka>
   c. <kafka> $ git checkout 0.8

2. Under <kafka>, build kafka:

   a. <kafka> $ ./sbt package assembly-package-dependency
      (newer, gradle-based branches build with: <kafka> $ ./gradlew jar)

3. Edit <kafka>/config/log4j.properties to uncomment the following 2 lines:
 
   #log4j.logger.kafka.perf=DEBUG, kafkaAppender
   #log4j.logger.kafka.perf.ProducerPerformance$ProducerThread=DEBUG, kafkaAppender

   The above 2 lines make ProducerPerformance print debugging messages (shown below) that System Test needs in order to validate data loss (see the sketch after this list).

   [2013-07-08 09:32:51,933] DEBUG Topic:test_1:ThreadID:0:MessageID:0000000000:xxxxxxxxxxx
   [2013-07-08 09:32:51,933] DEBUG Topic:test_1:ThreadID:1:MessageID:0000000100:xxxxxxxxxxx

4. Set the JAVA_HOME environment variable (optional, but recommended):

   export JAVA_HOME=/usr/java/jdk1.7.0_67
 
5. Make sure that you can ssh to localhost without a password.
6. Under <kafka>/system_test, execute the following command to start System Test:

   $ python -u -B system_test_runner.py 2>&1 | tee system_test_output.log
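
The MessageID embedded in each payload line above is what System Test compares between the producer side and the consumer side. The following is only a minimal sketch of that kind of data-loss check, not the actual validation code in kafka_system_test_utils.py; the log file names are illustrative, and it assumes the consumer log contains the same payload lines.

Code Block
import re

# Pattern for the DEBUG payload lines shown above; captures the topic and the zero-padded MessageID.
MSG_RE = re.compile(r"DEBUG Topic:(\S+?):ThreadID:\d+:MessageID:(\d+):")

def unique_message_ids(log_path, topic):
    """Return the set of unique MessageIDs logged for one topic in one log file."""
    ids = set()
    with open(log_path) as f:
        for line in f:
            m = MSG_RE.search(line)
            if m and m.group(1) == topic:
                ids.add(m.group(2))
    return ids

# Illustrative file names -- the real logs are generated per test case under <testsuite>/testcase_NNNN/logs.
produced = unique_message_ids("producer_performance.log", "test_1")
consumed = unique_message_ids("console_consumer.log", "test_1")

lost = produced - consumed
print("Unique messages from producer on [test_1] : %d" % len(produced))
print("Unique messages from consumer on [test_1] : %d" % len(consumed))
print("Validate for data matched on topic [test_1] : %s" % ("PASSED" if not lost else "FAILED"))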

...

Code Block
# ================================================
#
#         Test results interpretations
#
# ================================================

1. PASSED case - A PASSED test case should have a test result similar to the following :

_test_case_name  :  testcase_0201
_test_class_name  :  ReplicaBasicTest
arg : bounce_broker  :  true
arg : broker_type  :  controller
arg : message_producing_free_time_sec  :  15
arg : num_iteration  :  3
arg : num_messages_to_produce_per_producer_call  :  50
arg : num_partition  :  3
arg : replica_factor  :  3
arg : signal_type  :  SIGTERM
arg : sleep_seconds_between_producer_calls  :  1
validation_status  :
     No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r1.log  :  711
     No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r2.log  :  711
     No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r3.log  :  711
     No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r1.log  :  700
     No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r2.log  :  700
     No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r3.log  :  700
     No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r1.log  :  604
     No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r2.log  :  604
     No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r3.log  :  604
     Unique messages from consumer on [test_1]  :  2000
     Unique messages from producer on [test_1]  :  2000
     Validate for data matched on topic [test_1]  :  PASSED                               <----------
     Validate for data matched on topic [test_1] across replicas  :  PASSED               <---------- All validations
     Validate for merged log segment checksum in cluster [source]  :  PASSED              <----------   PASSED
     Validate index log in cluster [source]  :  PASSED                                    <----------


2. FAILED case - A FAILED test case is shown below with data loss in topic test_1 :

_test_case_name  :  testcase_5005
_test_class_name  :  MirrorMakerTest
arg : bounce_leader  :  false
arg : bounce_mirror_maker  :  true
arg : bounced_entity_downtime_sec  :  30
arg : message_producing_free_time_sec  :  15
arg : num_iteration  :  1
arg : num_messages_to_produce_per_producer_call  :  50
arg : num_partition  :  2
arg : replica_factor  :  3
arg : sleep_seconds_between_producer_calls  :  1
validation_status  :
     Unique messages from consumer on [test_1]  :  1392                                   <------
     Unique messages from consumer on [test_2]  :  1400                                          |
     Unique messages from producer on [test_1]  :  1400                                          |
     Unique messages from producer on [test_2]  :  1400                                          |
     Validate for data matched on topic [test_1]  :  FAILED                               <--------- FAILED because of data matched issue on topic "test_1"
     Validate for data matched on topic [test_2]  :  PASSED
     Validate for merged log segment checksum in cluster [source]  :  PASSED
     Validate for merged log segment checksum in cluster [target]  :  PASSED


3. SKIPPED case - A SKIPPED test case will have a result similar to the following (no validation status details) :

_test_case_name  :  testcase_0201
_test_class_name  :  ReplicaBasicTest
arg : bounce_broker  :  true
arg : broker_type  :  controller
arg : message_producing_free_time_sec  :  15
arg : num_iteration  :  3
arg : num_messages_to_produce_per_producer_call  :  50
arg : num_partition  :  3
arg : replica_factor  :  3
arg : signal_type  :  SIGTERM
arg : sleep_seconds_between_producer_calls  :  1
validation_status  :
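
A quick way to skim a long report for failures is to scan the captured output for test case names and FAILED validation lines. This is only a minimal sketch, assuming the output was captured with "tee system_test_output.log" as in the Quick Start; it is not part of the System Test framework itself.

Code Block
# Print every test case whose validation_status contains a FAILED entry.
current_case = None
failed_cases = {}

with open("system_test_output.log") as report:
    for line in report:
        if "_test_case_name" in line:
            current_case = line.split(":")[-1].strip()
        elif "Validate" in line and "FAILED" in line and current_case:
            failed_cases.setdefault(current_case, []).append(line.strip())

for case, failures in failed_cases.items():
    print(case)
    for failure in failures:
        print("    " + failure)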

Test Case Description

Kafka System Tests testcase description

Misc

Directory Structure Overview

...

Code Block
<kafka>
  |- /bin
  |- /config
  |- /contrib
  |- /core
  |- /lib
  |.
  |.
  |.
  |- /system_test
       |
       |- system_test_runner.py
       |
       |  # system_test_runner.py is the main script that starts system test; it works as follows (see the sketch after this directory structure overview) :
       |  #
       |  #   1. for each test suite directory (XXXX_testsuite) under systemTestEnv.SYSTEM_TEST_BASE_DIR (<kafka>/system_test) :
       |  #   2.     get a list of module scripts (*.py) in test suite directory
       |  #   3.     for each file in the list of module scripts :
       |  #   4.         get class name from the Python module script
       |  #   5.         retrieve corresponding suite & module names from class name
       |  #   6.         dynamically load the module and start the test class
       |  #   7.         save each test case result in systemTestEnv.systemTestResultsList
       |  #   8. for each result in systemTestEnv.systemTestResultsList :
       |  #   9.     print result
       |
       |- /utils      # This is a directory that contains all helper classes / util functions for system test
       |
       |    |- kafka_system_test_utils.py    # utilities specific to Kafka system testing   (e.g. Kafka test cases data loss validation)
       |    |- replication_utils.py          # utilities specific to replication testing    (e.g. Leader election log message pattern)
       |    |- setup_utils.py                # generic helper for system test setup         (e.g. System Test environment setup)
       |    |- system_test_utils.py          # utilities for generic testing purposes       (e.g. reading JSON data file)
       |    |- testcase_env.py               # testcase environment setup                   (e.g. data structure initialization such as brokers-pid mapping)
       |    |- metrics.py                    # utilities for metrics collection             
       |    |- pyh.py                        # from http://code.google.com/p/pyh            (open source)
       |
       |- cluster_config.json                # this file contains the following properties:
       |                                     #   1. what entities (Producer, Consumer) should be running
       |                                     #   2. which cluster (source or target)
       |                                     #   3. where they should be running (physical nodes)
       |
       |  # cluster_config.json is used to specify the logical machine configuration :
       |  #
       |  #   entity_id    : In each testcase, there may be zookeeper(s), broker(s), producer(s) and
       |  #                  consumer(s) involved. "entity_id" is used to uniquely identify each
       |  #                  component inside the system test.
       |  #   hostname     : It is used to specify the name of the machine in the distributed environment.
       |  #                  "localhost" is used by default.
       |  #   role         : The supported values are "zookeeper", "broker", "mirror_maker", "migration_tool",
       |  #                  "producer", "consumer".
       |  #   cluster_name : The supported values are "source", "target"
       |  #   kafka_home   : Specify the Kafka installation directory of each machine in a distributed environment.
       |  #                  "default" is used by default and ../ is assumed to be "kafka_home".
       |  #   java_home    : Specify the JAVA_HOME of each machine in a distributed environment.
       |  #                  1. "default" is used by default. If JAVA_HOME is specified in the environment, 
       |  #                     this value will be used. Otherwise, System Test executes a shell command "which java"
       |  #                     to find java bin dir and set JAVA_HOME accordingly. If no java binary can be found,
       |  #                     it throws an Exception and exits.
       |  #                  2. If a path other than "default" is specified, System Test will verify the java binary
       |  #                     under that path; if it cannot be verified, it throws an Exception and exits.
       |  #   jmx_port     : Specify a JMX_PORT for each component. It must be unique inside the cluster_config.json.
       |  #
       |  # {
       |  #   "cluster_config": [
       |  #   {
       |  #     "entity_id"    : "0",                  <----------------------------------------------------------------------------------|
       |  #     "hostname"     : "localhost",             "entity_id" in cluster_config must match the corresponding "entity id" in       |
       |  #     "role"         : "zookeeper",             testcase_NNNN.properties as shown below                                         |
       |  #     "cluster_name" : "source",                                                                                                |
       |  #     "kafka_home"   : "default",                                                                                               |
       |  #     "java_home"    : "default",                                                                                               |
       |  #     "jmx_port"     : "9990"                                                                                                   |
       |  #   },                                                                                                                          |
       |  #   .                                                                                                                           |
       |  #   .                                                                                                                           .
       |  #   .                                                                                                                           .
       |  # }                                                                                                                             .
       |
       |- /XXXX_testsuite
       |
       |  # XXXX_testsuite is a directory which contains Python scripts and test case directories for a specific group of functional tests.
       |  # For example, mirror_maker_testsuite contains mirror_maker_test.py, which starts mirror maker instances that are not
       |  # required for the test cases in migration_tool_testsuite.
       |  #
       |  # The following are the existing test suites:
       |  #   migration_tool_testsuite
       |  #   mirror_maker_testsuite
       |  #   replication_testsuite
       |
            |- <testsuite>.py
            |
             |  #   <testsuite>.py : This Python script implements the test logic for a group of test scenarios.
             |  #                    For example, in the "mirror_maker_testsuite", mirror_maker_test.py is the testsuite script
             |  #                    that implements the following :
            |  #                    1. start zookeeper(s) in source cluster
            |  #                    2. start broker(s) in source cluster
            |  #                    3. start zookeeper(s) in target cluster
            |  #                    4. start broker(s) in target cluster
            |  #                    5. start mirror maker(s)
            |  #                    6. start producer
            |  #                    7. start consumer
            |  #                    8. stop all entities
            |  #                    9. validate no data loss
            |  #
             |  #                    The above test logic is implemented in a generalized fashion and reads the
             |  #                    following test case arguments from testcase_NNNN_properties.json :
            |  #                    1. replication factor (broker)
            |  #                    2. no. of partitions  (broker)
            |  #                    3. log segment bytes  (broker)
            |  #                    4. topics             (producer / consumer)
            |  #                    .
            |  #                    .
            |  #                    .
            |  #
             |  #                    Therefore, each testsuite can be thought of as a functional / feature test group
             |  #                    that varies a combination of settings, for easier maintenance.
            |
            |- /config
            |
            |  #   config         : This config directory contains the TEMPLATE properties files for this testsuite.
            |  #                    For "replication_testsuite", only server.properties and zookeeper.properties are
            |  #                    required. However, "mirror_maker_testsuite" has mirror_consumer.properties and
            |  #                    mirror_producer.properties as well.
            |  #
             |  #                    System Test reads the properties from each file in this directory as a TEMPLATE
             |  #                    and overrides the values with those from testcase_NNNN_properties.json accordingly.
            |
                 |- server.properties
                 |- zookeeper.properties
                 |- (migration_consumer.properties)        # only in migration_tool_testsuite
                 |- (migration_producer.properties)        # only in migration_tool_testsuite
                 |- (mirror_maker_consumer.properties)     # only in mirror_maker_testsuite
                 |- (mirror_maker_producer.properties)     # only in mirror_maker_testsuite
                 |
                 |- /testcase_NNNN
                      |
                      |  # testcase_NNNN is a directory which contains a json file to specify the arguments
                      |  # required for the <testsuite>.py script to execute.
                      |  #
                      |  # The main arguments are :
                      |  # 1. testcase_args : These are the arguments specific for that test case such as
                      |  #    "replica_factor", "num_partition", "bounce_broker", ... etc.
                      |  # 2. entities      : These are the arguments required to start running a certain
                      |  #                    entity such as a broker : port, replication factor, log file
                      |  #                    directory, ... etc.
                      |
                      |- (config)    # generated when this testcase is executed in system test runtime
                      |
                      |- (logs)      # generated when this testcase is executed in system test runtime
                      |
                      |- testcase_NNNN_properties.json     # this file contains new values to override the default settings of
                      |                                    #   various entities (e.g. ZK, Broker, Producer, Mirror Maker, ...)
                      |                                    #   such as "log.segment.bytes", "num.partitions", "broker.id", ...
                      |   # {
                      |   #   "description": {"01":"Replication Basic : Base Test",
                      |   #                   "02":"Produce and consume messages to a single topic - single partition.",
                      |   #                   "03":"This test sends messages to 3 replicas",
                      |   #                   .
                      |   #                   .
                      |   #                   .
                      |   #   },
                      |   #   "testcase_args": {
                      |   #     "broker_type"    : "leader",
                      |   #     "bounce_broker"  : "false",
                      |   #     "replica_factor" : "3",
                      |   #     "num_partition"  : "1",
                      |   #     "num_iteration"  : "1",
                      |   #     .                                                                                                          .
                      |   #     .                                                                                                          .
                      |   #     .                                                                                                          .
                      |   #   },                                                                                                           |
                      |   #   "entities": [                                                                                                |
                      |   #    {                                                                                                           |
                      |   #       "entity_id"       : "0",                 <---------------------------------------------------------------|
                      |   #       "clientPort"      : "2188",                        1. All attributes defined in this entity must be the
                      |   #       "dataDir"         : "/tmp/zookeeper_0",               corresponding attributes for the "role" specified
                      |   #       "log_filename"    : "zookeeper_2188.log",             by "entity_id" defined above in cluster_config
                      |   #       "config_filename" : "zookeeper_2188.properties"    2. In this case, the attributes are all zookeeper
                      |   #    },                                                       related as the cluster_config has already defined
                      |   #    .                                                        entity_id 0 as role "zookeeper" above
                      |   #    .
                      |   #    .
                      |   # }
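
The suite-discovery loop described in the system_test_runner.py comments above boils down to: find the *_testsuite directories, derive a class name from each module script's file name, then load the module and run the test class dynamically. The following is only a minimal sketch of that pattern, not the actual runner code; the constructor arguments and the runTest() entry point are illustrative assumptions.

Code Block
import glob
import importlib
import os
import sys

# e.g. <kafka>/system_test
base_dir = os.path.dirname(os.path.abspath(__file__))
results = []

for suite_dir in sorted(glob.glob(os.path.join(base_dir, "*_testsuite"))):
    sys.path.insert(0, suite_dir)
    for module_path in sorted(glob.glob(os.path.join(suite_dir, "*_test.py"))):
        module_name = os.path.splitext(os.path.basename(module_path))[0]   # e.g. "mirror_maker_test"
        # derive the class name from the module name: "mirror_maker_test" -> "MirrorMakerTest"
        class_name = "".join(part.capitalize() for part in module_name.split("_"))
        module = importlib.import_module(module_name)
        test_class = getattr(module, class_name)
        test_instance = test_class()          # the real test classes take environment / setup arguments
        test_instance.runTest()               # illustrative entry point
        results.append(class_name)            # stand-in for systemTestEnv.systemTestResultsList

for result in results:
    print(result)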


...

The following describes the steps to troubleshoot a failing case running on a local machine. To troubleshoot failures in a distributed environment (e.g. Hudson build failures), please refer to Kafka System Tests.

Refer to Running System Test above for how to quick-start System Test.

...