Apache GitHub Pull Request: https://github.com/apache/knox/pull/365
(This document is still a work in progress!)
Performance test tool design
The general goal is to have an extensible performance test framework that drives the Knox Gateway according to different, configurable use cases.
...
- easy to extend with new use-cases
- has the ability to execute long-running jobs
- has the ability to connect to it and query different metrics
- generates meaningful and easy-to-read reports of those metrics automatically (at configurable intervals)
...
Code Block |
---|
# Gateway connection related properties
perf.test.gateway.url.protocol=https
perf.test.gateway.url.host=localhost
perf.test.gateway.url.port=8443
perf.test.gateway.jmx.port=8888
# report generation related properties
perf.test.report.generation.periodInSecs=30
perf.test.report.generation.json.enabled=true
perf.test.report.generation.yaml.enabled=true
# Knox Token use case related properties
perf.test.usecase.knoxtoken.enabled=true
perf.test.usecase.knoxtoken.topology.gateway=sandbox
perf.test.usecase.knoxtoken.topology.tokenbased=tokenbased
perf.test.usecase.knoxtoken.numOfThreads=3
perf.test.usecase.knoxtoken.testDurationInSecs=60
perf.test.usecase.knoxtoken.requestDelayLowerBoundInSecs=5
perf.test.usecase.knoxtoken.requestDelayUpperBoundInSecs=10 |
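To make the role of these properties more concrete, here is a minimal sketch (in Python, purely for illustration; the tool itself is written in Java) of reading such a file and deriving the gateway base URL and the list of enabled use cases. The helper names `parse_properties`, `gateway_base_url` and `enabled_use_cases` are hypothetical and are not part of the tool.

```python
# Illustrative sketch only: helper names are hypothetical, not the tool's API.
SAMPLE = """
# Gateway connection related properties
perf.test.gateway.url.protocol=https
perf.test.gateway.url.host=localhost
perf.test.gateway.url.port=8443
perf.test.gateway.jmx.port=8888
perf.test.report.generation.json.enabled=true
perf.test.usecase.knoxtoken.enabled=true
perf.test.usecase.knoxtoken.numOfThreads=3
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def gateway_base_url(props):
    """Compose protocol://host:port from the gateway connection properties."""
    return "{}://{}:{}".format(
        props["perf.test.gateway.url.protocol"],
        props["perf.test.gateway.url.host"],
        props["perf.test.gateway.url.port"],
    )


def enabled_use_cases(props):
    """Return the names of use cases whose '.enabled' flag is set to true."""
    prefix, suffix = "perf.test.usecase.", ".enabled"
    return sorted(
        key[len(prefix):-len(suffix)]
        for key, value in props.items()
        if key.startswith(prefix) and key.endswith(suffix) and value.lower() == "true"
    )


props = parse_properties(SAMPLE)
print(gateway_base_url(props))
print(enabled_use_cases(props))
```

Note that the report generation flags also end in `.enabled`, which is why the sketch filters on the `perf.test.usecase.` prefix as well.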
As of today (17 Aug 2020), only one use-case implementation exists; it addresses the acquire/renew/use Knox Delegation token case described above. The related resources (Java classes, properties files) are located in the gateway-performance-test Maven module. Here is the list of the most relevant resources:
src/main/java/org.apache.knox.gateway.performance.test
- PerformanceTestRunner - this is the entry point of the tool. This class comes with a main method that reads the given configuration file and executes all enabled use-case runners
- ResponseTimeCache - this class acts as a holder of response times and is shared between the worker threads (which write into the cache) and the report generation threads (which read data from it)
- reporting.GatewayMetricsReporter - this class generates the human-readable reports in JSON and YAML format on a fixed schedule set by perf.test.report.generation.periodInSecs
- knoxtoken.KnoxTokenUseCaseRunner - this class is responsible for starting
  - N worker threads that acquire Knox DTs in parallel (N is set by perf.test.usecase.knoxtoken.numOfThreads)
  - 2 more threads to
    - renew an already acquired Knox DT
    - do an HDFS ls command using an already acquired Knox DT
- knoxtoken.KnoxTokenWorkerThread - this represents the job that actually acquires/renews/uses Knox DTs. The renew and use actions each run on a single thread, and they wait twice as long between two subsequent calls as the acquire action does. In other words, by default, a worker thread acquires a Knox DT every 5 to 10 seconds (on N threads), whereas a worker thread that renews/uses a previously acquired Knox token waits 10 to 20 seconds between calls.
- knoxtoken.KnoxTokenCache - stores the already acquired Knox DTs (if the number of DTs reaches 500, the cache is cleaned automatically)
src/test/resources
- performance.test.configuration.properties - contains the configuration file described above
- performanceTest-log4j.properties - the Log4j configuration of the tool. By default, it prints log messages to STDOUT and also writes them into target/logs/performanceTest.log
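The thread layout and the waiting rule described for KnoxTokenUseCaseRunner and KnoxTokenWorkerThread can be sketched roughly as follows. This is an illustration in Python, not the Java implementation: the function names are hypothetical, and picking the delay uniformly at random between the bounds is my assumption (the description above only says the delay falls between the lower and upper bound).

```python
import random


def plan_threads(num_acquire_threads):
    """N acquire workers plus one renew thread and one use thread."""
    return ["acquire"] * num_acquire_threads + ["renew", "use"]


def delay_bounds(action, lower_secs, upper_secs):
    """Renew/use wait twice as long as acquire between two calls."""
    factor = 1 if action == "acquire" else 2
    return lower_secs * factor, upper_secs * factor


def next_delay(action, lower_secs, upper_secs, rng=random):
    """Pick the next waiting time (assumption: uniform between the bounds)."""
    low, high = delay_bounds(action, lower_secs, upper_secs)
    return rng.uniform(low, high)


# Defaults from the sample configuration: 3 threads, delay between 5 and 10 seconds.
threads = plan_threads(3)
print(len(threads), threads)
for action in ("acquire", "renew", "use"):
    print(action, delay_bounds(action, 5, 10))
```

With the default bounds this gives 5 to 10 seconds for acquire and 10 to 20 seconds for renew and use, matching the numbers quoted above.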
Knox gateway requisites
The performance test tool tries to connect to a Knox instance to
- acquire/renew/use Knox delegation tokens using its token API (/knoxtoken/api/v1/token)
- fetch useful metrics via JMX
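For orientation, a token acquisition request against that API could look like the sketch below. It only builds the request and does not send it. Several things here are assumptions on my side and not taken from the tool: the use of HTTP Basic authentication, the GET method, the placeholder credentials, and the helper name `build_acquire_request`. The tool itself may well go through other client code.

```python
import base64
import urllib.request


def build_acquire_request(base_url, topology, user, password):
    """Build (but do not send) a token request for the given topology."""
    url = "{}/gateway/{}/knoxtoken/api/v1/token".format(base_url, topology)
    # Assumption: the topology accepts HTTP Basic authentication.
    raw = "{}:{}".format(user, password).encode("utf-8")
    credentials = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(
        url,
        headers={"Authorization": "Basic " + credentials},
        method="GET",
    )


# Placeholder credentials, for illustration only.
request = build_acquire_request("https://localhost:8443", "sandbox", "guest", "guest-password")
print(request.get_method(), request.full_url)
```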
...
Code Block |
---|
<property>
<name>gateway.metrics.enabled</name>
<value>true</value>
</property>
<property>
<name>gateway.jmx.metrics.reporting.enabled</name>
<value>true</value>
</property> |
Once you have opened the necessary JMX port, you also need to make sure you have at least one topology with the KNOXTOKEN service. During my tests, I extended the sandbox topology with the following service configuration:
Code Block |
---|
<service>
<role>KNOXTOKEN</role>
<param>
<name>knox.token.ttl</name>
<value>36000000</value>
</param>
<param>
<name>knox.token.audiences</name>
<value>tokenbased</value>
</param>
<param>
<name>knox.token.target.url</name>
<value>https://localhost:8443/gateway/tokenbased</value>
</param>
<param>
<name>knox.token.exp.server-managed</name>
<value>true</value>
</param>
<param>
<name>knox.token.renewer.whitelist</name>
<value>guest</value>
</param>
</service> |
If you plan to create a new topology for this purpose, please change the perf.test.usecase.knoxtoken.topology.gateway configuration accordingly.
As you can see, the newly added service references another topology called tokenbased. As its name suggests, that particular topology uses JWT authentication and is configured as follows:
Code Block |
---|
<?xml version="1.0" encoding="UTF-8"?>
<topology>
<name>tokenbased</name>
<gateway>
<provider>
<role>federation</role>
<name>JWTProvider</name>
<enabled>true</enabled>
<param>
<name>knox.token.audiences</name>
<value>tokenbased</value>
</param>
</provider>
</gateway>
<service>
<role>WEBHDFS</role>
<url>http://YOUR_HDFS_SERVICE_HOST:20101/webhdfs</url>
</service>
</topology> |
Since the 'KnoxToken Use Case' tries to use an already acquired Knox DT to run an action, I chose to keep this as simple as possible: using KnoxShell's class, we issue an ls command with a KnoxShell session that uses a Knox DT. It is very important that the tokenbased topology comes with the WEBHDFS service for this purpose.
If you plan to have this topology with a different name (or you already have one that uses JWT and has WEBHDFS), please update the perf.test.usecase.knoxtoken.topology.tokenbased configuration accordingly.
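Because the two topologies have to agree with each other, a quick sanity check can help before starting a long run. The sketch below (Python, for illustration only) parses trimmed copies of the two snippets above and checks that the audience issued by the KNOXTOKEN service is accepted by the JWTProvider. It also converts knox.token.ttl to hours, assuming the value is given in milliseconds and that knox.token.audiences is a comma-separated list; please verify both against the Knox documentation for your version. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the snippets shown above.
SERVICE_XML = """
<service>
  <role>KNOXTOKEN</role>
  <param><name>knox.token.ttl</name><value>36000000</value></param>
  <param><name>knox.token.audiences</name><value>tokenbased</value></param>
</service>
"""

TOPOLOGY_XML = """
<topology>
  <name>tokenbased</name>
  <gateway>
    <provider>
      <role>federation</role>
      <name>JWTProvider</name>
      <enabled>true</enabled>
      <param><name>knox.token.audiences</name><value>tokenbased</value></param>
    </provider>
  </gateway>
</topology>
"""


def params(element):
    """Collect the <param> children of an element into a name -> value dict."""
    return {p.findtext("name"): p.findtext("value") for p in element.findall("param")}


def audiences(value):
    """Split an audience list (assumed to be comma-separated) into a set."""
    return {item.strip() for item in value.split(",") if item.strip()}


service = params(ET.fromstring(SERVICE_XML))
provider = params(ET.fromstring(TOPOLOGY_XML).find("gateway/provider"))

issued = audiences(service["knox.token.audiences"])
accepted = audiences(provider["knox.token.audiences"])
print("shared audiences:", sorted(issued & accepted))

# Assumption: knox.token.ttl is expressed in milliseconds.
ttl_hours = int(service["knox.token.ttl"]) / (1000 * 60 * 60)
print("token TTL in hours:", ttl_hours)
```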
How to run
Running the performance tool is as simple as running the following Maven command in the project root:
...
Code Block |
---|
perf.test.usecase.knoxtoken.numOfThreads=10
perf.test.usecase.knoxtoken.testDurationInSecs=21600 |
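To get a feel for what a run with these two values means, here is a rough back-of-the-envelope estimate in Python. It assumes the request delay bounds (5 to 10 seconds) and the report period (30 seconds) from the sample configuration are left unchanged, that the average delay is the midpoint of the bounds, and that response times are negligible compared to the delay, so the request count is an optimistic upper estimate and not a measured figure. The function names are hypothetical.

```python
def estimate_acquire_requests(num_threads, duration_secs, delay_low_secs, delay_high_secs):
    """Rough upper estimate: each thread issues one request per mean delay."""
    mean_delay = (delay_low_secs + delay_high_secs) / 2
    return int(num_threads * duration_secs / mean_delay)


def report_cycles(duration_secs, period_secs):
    """Number of full report generation periods that fit into the run."""
    return duration_secs // period_secs


duration = 21600
print("duration in hours:", duration / 3600)
print("estimated acquire requests:", estimate_acquire_requests(10, duration, 5, 10))
print("report generation cycles:", report_cycles(duration, 30))
```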
Test results
The JSON/YAML test results are generated under target/testResults/[json|yaml]:
heapGauges.YYYY-MM-DD.[json|yaml]. Sample:
Code Block |
---|
metrics:name=heap.init,type=gauges:
  Number: 268435456
  Value: 268435456
metrics:name=heap.usage,type=gauges:
  Number: 0.04556474316352607
  Value: 0.04556474316352607
metrics:name=heap.committed,type=gauges:
  Number: 537919488
  Value: 537919488
metrics:name=heap.max,type=gauges:
  Number: 3817865216
  Value: 3817865216
metrics:name=heap.used,type=gauges:
  Number: 174097040
  Value: 174094952 |
responseTimes.YYYY-MM-DD.[json|yaml]. Sample:
Code Block |
---|
acquireResponseTimes:
  _data:
  - 204
  - 204
  - 204
  - 18
  - 19
  - 21
  - 19
  - 16
  - 13
  - 11
  - 10
  - 14
  - 13
  - 11
  - 12
  - 11
  - 14
  - 10
  mode:
  - 11.0
  - 204.0
  min: 10.0
  max: 204.0
  mean: 45.777777777777786
  geometricMean: 21.52096561302238
renewResponseTimes:
  _data:
  - 449
  mode:
  - 449.0
  min: 449.0
  max: 449.0
  mean: 449.0
  geometricMean: 449.0000000000001 |
timers.YYYY-MM-DD.[json|yaml]. Sample:
Code Block |
---|
metrics:name=client./gateway/sandbox/knoxtoken/api/.POST-requests,type=timers:
  Mean: 402.24274066722455
  StdDev: 91.5058294023066
  "75thPercentile": 418.193352
  "98thPercentile": 628.076518
  RateUnit: "events/second"
  "95thPercentile": 628.076518
  "99thPercentile": 628.076518
  Max: 628.076518
  Count: 9
  FiveMinuteRate: 0.00943518979245522
  "50thPercentile": 404.177556
  MeanRate: 0.0036809848722434983
  Min: 115.866694
  OneMinuteRate: 0.01908228359419756
  DurationUnit: "milliseconds"
  "999thPercentile": 628.076518
  FifteenMinuteRate: 0.017525836234274554
metrics:name=client./gateway/sandbox/knoxtoken/api/.GET-requests,type=timers:
  Mean: 11.77009795297069
  StdDev: 8.295945542138911
  "75thPercentile": 12.517118
  "98thPercentile": 39.362919999999995
  RateUnit: "events/second"
  "95thPercentile": 31.40917
  "99thPercentile": 39.362919999999995
  Max: 1183.7718479999999
  Count: 77
  FiveMinuteRate: 0.11133648462048056
  "50thPercentile": 8.21257
  MeanRate: 0.03134296740421139
  Min: 6.442307
  OneMinuteRate: 0.2513294281670072
  DurationUnit: "milliseconds"
  "999thPercentile": 39.362919999999995
  FifteenMinuteRate: 0.12650178822825908
metrics:name=client./gateway/tokenbased/webhdfs/v1.GET-requests,type=timers:
  Mean: 529.0935038293519
  StdDev: 37.26100859914753
  "75thPercentile": 540.324314
  "98thPercentile": 682.462197
  RateUnit: "events/second"
  "95thPercentile": 540.324314
  "99thPercentile": 682.462197
  Max: 4007.184354
  Count: 8
  FiveMinuteRate: 0.012584932473001562
  "50thPercentile": 520.5765289999999
  MeanRate: 0.003278659499845765
  Min: 462.182908
  OneMinuteRate: 0.033547665513349575
  DurationUnit: "milliseconds"
  "999thPercentile": 682.462197
  FifteenMinuteRate: 0.018536358285136553 |
tokenStateStatistics.YYYY-MM-DD.[json|yaml]. Sample:
Code Block |
---|
metrics:name=TokenStateService,type=Statistics:
  KeystoreInteractions:
    removeAlias: 11
    saveAlias: 25
    getAlias: 41
  GatewayCredentialsFileSize: 89299
  NumberOfTokensAdded: 89
  NumberOfTokensRenewed: 11 |
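As a cross-check of the responseTimes report, the summary values in the acquireResponseTimes sample can be recomputed from its _data list. The sketch below uses Python's standard statistics module for this; it is an independent recalculation for illustration, not the code the tool uses to produce the report.

```python
import statistics

# The _data list from the acquireResponseTimes sample above.
acquire_times = [204, 204, 204, 18, 19, 21, 19, 16, 13, 11, 10, 14, 13, 11, 12, 11, 14, 10]

summary = {
    # multimode returns every value that shares the highest frequency
    "mode": sorted(statistics.multimode(acquire_times)),
    "min": min(acquire_times),
    "max": max(acquire_times),
    "mean": statistics.mean(acquire_times),
    "geometricMean": statistics.geometric_mean(acquire_times),
}

for name, value in summary.items():
    print(name, value)
```

The three slow first requests (204) pull the arithmetic mean up to about 45.8, while the geometric mean stays near 21.5, which is why it is useful to have both in the report.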