(This document is still Work In Progress!)
Performance test tool design
The general goal is to have an extensible performance test framework that drives the Knox Gateway through different, configurable use cases.
Some initial requirements of this framework:
- easy to extend with new use-cases
- has the ability to execute long-running jobs
- generates meaningful and easy-to-read reports of the collected metrics automatically (at configurable intervals)
In the first phase, the task was to conduct and codify performance testing for scalability and performance benchmarking with concurrent clients, with long-running jobs that stress the backend of the token state server store:
- use different token state server implementations
- turn on/off the token state service mechanism
- use concurrent clients that work with tokens (making sure they actually use the tokens and periodically renew them)
The tool itself is fed by its own configuration file located in $YOUR_KNOX_PROJECT_ROOT/gateway-performance-test/src/test/resources/performance.test.configuration.properties:

    # Gateway connection related properties
    perf.test.gateway.url.protocol=https
    perf.test.gateway.url.host=localhost
    perf.test.gateway.url.port=8443
    perf.test.gateway.jmx.port=8888

    # report generation related properties
    perf.test.report.generation.periodInSecs=30
    perf.test.report.generation.json.enabled=true
    perf.test.report.generation.yaml.enabled=true

    # Knox Token use case related properties
    perf.test.usecase.knoxtoken.enabled=true
    perf.test.usecase.knoxtoken.topology.gateway=sandbox
    perf.test.usecase.knoxtoken.topology.tokenbased=tokenbased
    perf.test.usecase.knoxtoken.numOfThreads=3
    perf.test.usecase.knoxtoken.testDurationInSecs=60
    perf.test.usecase.knoxtoken.requestDelayLowerBoundInSecs=5
    perf.test.usecase.knoxtoken.requestDelayUpperBoundInSecs=10
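To illustrate how such a file can be interpreted, here is a minimal Python sketch (not the tool's Java implementation) that parses the properties and lists the enabled use cases. The function names are hypothetical, the parser ignores Java properties features such as line continuations and escapes, and the way the three URL properties are joined into a base URL is an assumption.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def enabled_use_cases(props):
    """Return the <name> of every perf.test.usecase.<name>.enabled=true entry."""
    prefix, suffix = "perf.test.usecase.", ".enabled"
    return sorted(
        key[len(prefix):-len(suffix)]
        for key, value in props.items()
        if key.startswith(prefix) and key.endswith(suffix) and value.lower() == "true"
    )


def gateway_base_url(props):
    """Join protocol, host and port (assumed layout: protocol://host:port)."""
    return "{}://{}:{}".format(
        props["perf.test.gateway.url.protocol"],
        props["perf.test.gateway.url.host"],
        props["perf.test.gateway.url.port"],
    )


SAMPLE = """
# Gateway connection related properties
perf.test.gateway.url.protocol=https
perf.test.gateway.url.host=localhost
perf.test.gateway.url.port=8443
# Knox Token use case related properties
perf.test.usecase.knoxtoken.enabled=true
perf.test.usecase.knoxtoken.numOfThreads=3
"""

if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    print(enabled_use_cases(props))
    print(gateway_base_url(props))
```

With the sample values above, the only enabled use case is knoxtoken and the base URL resolves to https://localhost:8443.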
As of today (17 Aug 2020), only one use-case implementation exists; it addresses the acquire/renew/use Knox delegation token (DT) case described above. The related resources (Java classes, properties files) are located in the gateway-performance-test Maven module. Here is the list of the most relevant resources:
src/main/java
- o.a.k.g.p.t.PerformanceTestRunner - this is the entry point of the tool. This class comes with a main method that reads the given configuration file and executes all enabled use-case runners.
- o.a.k.g.p.t.ResponseTimeCache - this class holds the response times and is shared between the worker threads (which write into the cache) and the report generation threads (which read from it).
- o.a.k.g.p.t.reporting.GatewayMetricsReporter - this class generates the human-readable reports in JSON and YAML format on a fixed schedule set by perf.test.report.generation.periodInSecs.
- o.a.k.g.p.t.knoxtoken.KnoxTokenUseCaseRunner - this class is responsible for starting
  - N worker threads that acquire Knox DTs in parallel (N is set by perf.test.usecase.knoxtoken.numOfThreads)
  - and 2 more threads to
    - renew an already acquired Knox DT
    - do an HDFS ls command using an already acquired Knox DT
- o.a.k.g.p.t.knoxtoken.KnoxTokenWorkerThread - this represents the job that actually acquires/renews/uses Knox DTs. The renew and use actions each run on a single thread, and they wait twice as long between two subsequent calls as the acquire action does. In other words, by default, a worker thread acquires a Knox DT every 5 to 10 seconds (on N threads), whereas a worker thread which renews/uses a previously acquired Knox token waits between 10 and 20 seconds.
- o.a.k.g.p.t.knoxtoken.KnoxTokenCache - stores the already acquired Knox DTs (if the number of DTs reaches 500, the cache is cleaned automatically).
src/test/resources
- performance.test.configuration.properties - contains the above-described configuration file.
- performanceTest-log4j.properties - the Log4j configuration of the tool. By default, it prints log messages on STDOUT and also writes them into target/logs/performanceTest.log.
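The wait times described for KnoxTokenWorkerThread can be sketched as follows. This is an illustrative Python sketch, not the Java code of the tool: the function names are hypothetical, and drawing the delay uniformly between the two bounds is an assumption (the text above only states the bounds).

```python
import random


def delay_window(lower_secs, upper_secs, action):
    """Return the (low, high) wait bounds in seconds for the given action.

    The acquire action uses the configured bounds as they are; the renew and
    use actions wait twice as long, as described for KnoxTokenWorkerThread.
    """
    factor = 1 if action == "acquire" else 2
    return lower_secs * factor, upper_secs * factor


def next_delay(lower_secs, upper_secs, action, rng=random):
    """Pick the next wait time; the uniform distribution is an assumption."""
    low, high = delay_window(lower_secs, upper_secs, action)
    return rng.uniform(low, high)


if __name__ == "__main__":
    # Defaults: requestDelayLowerBoundInSecs=5, requestDelayUpperBoundInSecs=10
    for action in ("acquire", "renew", "use"):
        print(action, delay_window(5, 10, action))
```

With the default bounds of 5 and 10 seconds, this yields a 5 to 10 second window for acquire and a 10 to 20 second window for renew and use.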
Knox gateway requisites
The performance test tool tries to connect to a Knox instance to
- acquire/renew/use Knox delegation tokens using its token API (/knoxtoken/api/v1/token)
- fetch useful metrics via JMX
To make the JMX metrics available, apply the following configuration before starting the Knox gateway you are testing against:
1. Set the KNOX_GATEWAY_DBG_OPTS environment variable as follows:

    export KNOX_GATEWAY_DBG_OPTS="$KNOX_GATEWAY_DBG_OPTS -Dcom.sun.management.jmxremote.port=8888 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"

2. Enable JMX reporting in gateway-site.xml:

    <property>
        <name>gateway.metrics.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>gateway.jmx.metrics.reporting.enabled</name>
        <value>true</value>
    </property>
Once you have opened the necessary JMX port, you also need to make sure you have at least one topology with the KNOXTOKEN service. During my tests, I extended the sandbox topology with the following service configuration:

    <service>
        <role>KNOXTOKEN</role>
        <param>
            <name>knox.token.ttl</name>
            <value>36000000</value>
        </param>
        <param>
            <name>knox.token.audiences</name>
            <value>tokenbased</value>
        </param>
        <param>
            <name>knox.token.target.url</name>
            <value>https://localhost:8443/gateway/tokenbased</value>
        </param>
        <param>
            <name>knox.token.exp.server-managed</name>
            <value>true</value>
        </param>
        <param>
            <name>knox.token.renewer.whitelist</name>
            <value>guest</value>
        </param>
    </service>
If you plan to create a new topology for this purpose, please change the perf.test.usecase.knoxtoken.topology.gateway configuration accordingly.
As you can see, the newly added service references another topology called tokenbased. As its name suggests, that particular topology uses JWT authentication and is configured as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <topology>
        <name>tokenbased</name>
        <gateway>
            <provider>
                <role>federation</role>
                <name>JWTProvider</name>
                <enabled>true</enabled>
                <param>
                    <name>knox.token.audiences</name>
                    <value>tokenbased</value>
                </param>
            </provider>
        </gateway>
        <service>
            <role>WEBHDFS</role>
            <url>http://YOUR_HDFS_SERVICE_HOST:20101/webhdfs</url>
        </service>
    </topology>
Since the 'KnoxToken Use Case' needs to run an action with an already acquired Knox DT, I chose to keep this as simple as possible: using KnoxShell, we issue an ls command with a KnoxShell session that uses a Knox DT. For this to work, it is very important that the tokenbased topology includes the WEBHDFS service.
If you plan to have this topology with a different name (or you already have one that uses JWT and has WEBHDFS), please update the perf.test.usecase.knoxtoken.topology.tokenbased configuration accordingly.
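To show how the two topology names end up in the request URLs, here is a small Python sketch that only builds the HTTP requests and never sends them. Note that the tool itself goes through KnoxShell; plain HTTP is swapped in here purely for illustration. The function names are hypothetical, and HTTP Basic authentication on the gateway topology is an assumption about your setup. The Bearer header is the usual way to present a token to a JWT-protected topology, and LISTSTATUS is the WebHDFS operation behind an ls.

```python
import base64
import urllib.request


def token_request(base_url, gateway_topology, user, password):
    """Build (not send) a GET request to the Knox token API.

    Assumes the gateway topology accepts HTTP Basic authentication.
    """
    url = f"{base_url}/gateway/{gateway_topology}/knoxtoken/api/v1/token"
    credentials = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {credentials}"})


def webhdfs_ls_request(base_url, token_topology, path, access_token):
    """Build (not send) a WebHDFS LISTSTATUS request authenticated by a Knox DT.

    `path` is an absolute HDFS path such as "/tmp".
    """
    url = f"{base_url}/gateway/{token_topology}/webhdfs/v1{path}?op=LISTSTATUS"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {access_token}"})


if __name__ == "__main__":
    base = "https://localhost:8443"
    # Topology names come from perf.test.usecase.knoxtoken.topology.gateway
    # and perf.test.usecase.knoxtoken.topology.tokenbased respectively.
    print(token_request(base, "sandbox", "guest", "changeit").full_url)
    print(webhdfs_ls_request(base, "tokenbased", "/tmp", "<token>").full_url)
```

If you rename either topology, only the second path segment of these URLs changes, which is why the two topology properties must stay in sync with your Knox deployment.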
How to run
Running the performance tool is as simple as running the following Maven command in the project root:
mvn -DskipTests -Dcheckstyle.skip=true -Dfindbugs.skip=true -Dpmd.skip=true -Drat.skip -Pgateway-performance-test package -am -pl gateway-performance-test
The tool will pick up the above-mentioned configuration file and execute all enabled use-case runners (currently there is only one implementation). You can change that properties file to match your requirements before executing your performance test rounds. For instance, to increase the number of parallel threads to 10 and the test duration to 6 hours, update the following properties:

    perf.test.usecase.knoxtoken.numOfThreads=10
    perf.test.usecase.knoxtoken.testDurationInSecs=21600
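Since the duration is configured in seconds, a tiny helper can avoid conversion mistakes when planning longer runs. This is a hypothetical Python convenience sketch, not part of the tool; only the two property names come from the configuration file.

```python
def knoxtoken_overrides(num_threads, duration_hours):
    """Return the two property lines for a run of the given size and length."""
    duration_secs = int(duration_hours * 3600)  # the tool expects seconds
    return [
        f"perf.test.usecase.knoxtoken.numOfThreads={num_threads}",
        f"perf.test.usecase.knoxtoken.testDurationInSecs={duration_secs}",
    ]


if __name__ == "__main__":
    # 10 threads for 6 hours
    print("\n".join(knoxtoken_overrides(10, 6)))
```

For 10 threads and 6 hours, this produces exactly the two property lines of the example above (6 * 3600 = 21600 seconds).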
The JSON/YAML test results are generated under target/testResults/[json|yaml]:
heapGauges.YYYY-MM-DD.[json|yaml]. Sample:
    ---
    metrics:name=heap.init,type=gauges:
      Number: 268435456
      Value: 268435456
    metrics:name=heap.usage,type=gauges:
      Number: 0.04556474316352607
      Value: 0.04556474316352607
    metrics:name=heap.committed,type=gauges:
      Number: 537919488
      Value: 537919488
    metrics:name=heap.max,type=gauges:
      Number: 3817865216
      Value: 3817865216
    metrics:name=heap.used,type=gauges:
      Number: 174097040
      Value: 174094952
responseTimes.YYYY-MM-DD.[json|yaml]. Sample:
    ---
    acquireResponseTimes:
      _data:
      - 204
      - 204
      - 204
      - 18
      - 19
      - 21
      - 19
      - 16
      - 13
      - 11
      - 10
      - 14
      - 13
      - 11
      - 12
      - 11
      - 14
      - 10
      mode:
      - 11.0
      - 204.0
      min: 10.0
      max: 204.0
      mean: 45.777777777777786
      geometricMean: 21.52096561302238
    renewResponseTimes:
      _data:
      - 449
      mode:
      - 449.0
      min: 449.0
      max: 449.0
      mean: 449.0
      geometricMean: 449.0000000000001
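The summary fields in this report can be re-derived from the raw _data values, which is handy when post-processing the generated files. Below is a minimal Python sketch using only the standard statistics module (Python 3.8+). It is not how the tool computes them; the function name summarize is hypothetical, and the last digits may differ slightly from the tool's output due to floating-point rounding.

```python
import statistics


def summarize(response_times_ms):
    """Compute the same summary fields the report lists next to _data."""
    data = [float(v) for v in response_times_ms]
    return {
        "mode": sorted(statistics.multimode(data)),  # all most-frequent values
        "min": min(data),
        "max": max(data),
        "mean": statistics.fmean(data),
        "geometricMean": statistics.geometric_mean(data),
    }


# The acquireResponseTimes _data values from the sample report.
ACQUIRE = [204, 204, 204, 18, 19, 21, 19, 16, 13,
           11, 10, 14, 13, 11, 12, 11, 14, 10]

if __name__ == "__main__":
    for name, value in summarize(ACQUIRE).items():
        print(name, value)
```

For the sample data, this reproduces the two modes (11.0 and 204.0, each occurring three times), the min and max, and the mean and geometric mean of the report. The large gap between the mean (about 45.8) and the geometric mean (about 21.5) reflects the three slow 204 ms requests at the start of the run.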
timers.YYYY-MM-DD.[json|yaml]. Sample:
    ---
    metrics:name=client./gateway/sandbox/knoxtoken/api/.POST-requests,type=timers:
      Mean: 402.24274066722455
      StdDev: 91.5058294023066
      "75thPercentile": 418.193352
      "98thPercentile": 628.076518
      RateUnit: "events/second"
      "95thPercentile": 628.076518
      "99thPercentile": 628.076518
      Max: 628.076518
      Count: 9
      FiveMinuteRate: 0.00943518979245522
      "50thPercentile": 404.177556
      MeanRate: 0.0036809848722434983
      Min: 115.866694
      OneMinuteRate: 0.01908228359419756
      DurationUnit: "milliseconds"
      "999thPercentile": 628.076518
      FifteenMinuteRate: 0.017525836234274554
    metrics:name=client./gateway/sandbox/knoxtoken/api/.GET-requests,type=timers:
      Mean: 11.77009795297069
      StdDev: 8.295945542138911
      "75thPercentile": 12.517118
      "98thPercentile": 39.362919999999995
      RateUnit: "events/second"
      "95thPercentile": 31.40917
      "99thPercentile": 39.362919999999995
      Max: 1183.7718479999999
      Count: 77
      FiveMinuteRate: 0.11133648462048056
      "50thPercentile": 8.21257
      MeanRate: 0.03134296740421139
      Min: 6.442307
      OneMinuteRate: 0.2513294281670072
      DurationUnit: "milliseconds"
      "999thPercentile": 39.362919999999995
      FifteenMinuteRate: 0.12650178822825908
    metrics:name=client./gateway/tokenbased/webhdfs/v1.GET-requests,type=timers:
      Mean: 529.0935038293519
      StdDev: 37.26100859914753
      "75thPercentile": 540.324314
      "98thPercentile": 682.462197
      RateUnit: "events/second"
      "95thPercentile": 540.324314
      "99thPercentile": 682.462197
      Max: 4007.184354
      Count: 8
      FiveMinuteRate: 0.012584932473001562
      "50thPercentile": 520.5765289999999
      MeanRate: 0.003278659499845765
      Min: 462.182908
      OneMinuteRate: 0.033547665513349575
      DurationUnit: "milliseconds"
      "999thPercentile": 682.462197
      FifteenMinuteRate: 0.018536358285136553
tokenStateStatistics.YYYY-MM-DD.[json|yaml]. Sample:
    ---
    metrics:name=TokenStateService,type=Statistics:
      KeystoreInteractions:
        removeAlias: 11
        saveAlias: 25
        getAlias: 41
      GatewayCredentialsFileSize: 89299
      NumberOfTokensAdded: 89
      NumberOfTokensRenewed: 11