Rewrite rules in Apache Knox can be difficult to follow if you are just starting to use it. This blog covers the basics of Apache Knox rewrite rules and then goes in depth on more advanced rules and how to use them. It builds upon the blog Adding a service to Apache Knox by Kevin Minder.
...
- As a matter of convention this should match the directory beneath the service implementation name.
- The topology file can optionally contain <topology><service><version> but usually doesn’t. This would be used to select a specific version of an implementation if there were multiple. This can be important if the protocols for a service evolve over time.
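A minimal sketch of a topology service entry that pins a version. The WEATHER role follows the running example; the version number and the backend URL are placeholder assumptions, not values from a real deployment.

```xml
<topology>
    <!-- gateway providers and other services omitted -->
    <service>
        <role>WEATHER</role>
        <!-- Optional: selects one implementation when several versions exist (placeholder value) -->
        <version>0.0.1</version>
        <!-- Placeholder backend URL, assumed for illustration -->
        <url>http://weather.example.com:80</url>
    </service>
</topology>
```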
...
Code Block |
---|
<service><routes><route path="/weather/**"/></routes></service> |
- This tells the gateway that all requests starting with /weather/ are handled by this service.
- Due to a limitation this will not include requests to /weather (i.e. no trailing /)
- The ** means zero or more paths, similar to Ant.
- The scheme, host, port, gateway and topology components are not included (e.g. https://localhost:8443/gateway/sandbox)
- Routes can, but typically don’t, take query parameters into account.
- In this simple form there is no direct relationship between the route path and the rewrite rules!
Simple rewrite rules
...
Code Block |
---|
<rules><rule pattern="*://*:*/**/weather/{path=**}?{**}"/></rules> |
...
- Defines how the URL matched by the rule will be rewritten.
- The {$serviceUrl[WEATHER]} looks up the <service><url> for the <service><role>WEATHER. This is implemented as a rewrite function and is another custom extension point.
- The {path=**} extracts zero or more values for the 'path' parameter from the matched URL.
- The {**} extracts any “unused” parameters and uses them as query parameters.
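Putting the pieces above together, an inbound rule for the weather example might look like the sketch below. The rule name is an assumed naming convention and the dir attribute marks the rule as applying to incoming requests; the pattern and the {$serviceUrl[WEATHER]} function are the ones described above.

```xml
<rules>
    <!-- Rule name is hypothetical -->
    <rule dir="IN" name="WEATHER/weather/inbound" pattern="*://*:*/**/weather/{path=**}?{**}">
        <!-- Backend URL from the topology, then the extracted path and any remaining query parameters -->
        <rewrite template="{$serviceUrl[WEATHER]}/{path=**}?{**}"/>
    </rule>
</rules>
```

Assuming the topology maps WEATHER to http://weather.example.com, a request to https://localhost:8443/gateway/sandbox/weather/data/2.5?q=Palo+Alto would be expected to be dispatched to http://weather.example.com/data/2.5?q=Palo+Alto.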
...
Scope
Rewrite rules can be global or local to the service they are defined in. After Apache Knox 0.6.0 all rewrite rules are local unless they are explicitly defined as global.
To define global rules, use the property 'gateway.global.rules.services' in 'gateway-site.xml', which takes a list of services whose rewrite rules are made global, e.g.
Code Block |
---|
<property>
<name>gateway.global.rules.services</name>
<value>"NAMENODE","JOBTRACKER", "WEBHDFS", "WEBHCAT", "OOZIE", "WEBHBASE", "HIVE", "RESOURCEMANAGER"</value>
</property> |
Note: Rewrite rules for these services "NAMENODE","JOBTRACKER", "WEBHDFS", "WEBHCAT", "OOZIE", "WEBHBASE", "HIVE", "RESOURCEMANAGER" are global by default.
If you want to limit the scope of a single rule within a set of global rewrite rules, you can do so using the 'scope' attribute, e.g.
Code Block |
---|
<!-- Limit the scope of this rule just to WEBHDFS service -->
<rule dir="OUT" scope="WEBHDFS" name="WEBHDFS/webhdfs/outbound" pattern="hdfs://*:*/{path=**}?{**}">
<rewrite template="{$frontend[url]}/webhdfs/v1/{path=**}?{**}"/>
</rule> |
Direction
Rewrite rules can be applied to inbound requests (going to the Gateway from a browser, curl, etc.) or to outbound responses (going from the Gateway towards the browser). The direction is indicated by the "dir" attribute.
Code Block |
---|
<rule dir="IN"> |
The possible values are IN and OUT for inbound and outbound requests.
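As a sketch, a service would typically pair an inbound rule with an outbound one. The rule names, the WEATHER role and the /static/ path below are assumptions for illustration; {$frontend[url]} is the function used in the scoped WEBHDFS example above.

```xml
<rules>
    <!-- Inbound: applied to the request URL on its way to the backend service -->
    <rule dir="IN" name="WEATHER/weather/inbound" pattern="*://*:*/**/weather/{path=**}?{**}">
        <rewrite template="{$serviceUrl[WEATHER]}/{path=**}?{**}"/>
    </rule>
    <!-- Outbound: applied to URLs found in the response on their way back to the client -->
    <rule dir="OUT" name="WEATHER/weather/outbound/static" pattern="/static/{**}">
        <rewrite template="{$frontend[url]}/weather/static/{**}"/>
    </rule>
</rules>
```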
Flow
Flows are the logical AND, OR, ALL operators on the rules. So, a rewrite rule could match a pattern A OR pattern B, a rule could match a pattern A AND pattern B, a rule could match ALL the given patterns.
Valid flow values are:
- OR
- AND
- ALL
For example, a rule using the OR flow:
Code Block |
---|
<rule name="test-rule-with-complex-flow" flow="OR">
<match pattern="*://*:*/~/{path=**}?{**}">
<rewrite template="test-scheme-output://test-host-output:777/test-path-output/test-home/{path}?{**}"/>
</match>
<match pattern="*://*:*/{path=**}?{**}">
<rewrite template="test-scheme-output://test-host-output:42/test-path-output/{path}?{**}"/>
</match>
</rule> |
Rewrite Variables
These variables can be used with the rewrite function.
$username
Username of authenticated user
Code Block |
---|
<rule name="OOZIE/oozie/user-name">
<rewrite template="{$username}"/>
</rule> |
$inboundurl
Code Block |
---|
<rule dir="OUT" name="NODEUI/node/static" pattern="/static/{**}">
<rewrite template="{$frontend[url]}/node/static/{**}?host={$inboundurl[host]}"/>
</rule> |
$serviceAddr
Code Block |
---|
<rule name="hdfs-addr">
<rewrite template="hdfs://{$serviceAddr[NAMENODE]}"/>
</rule> |
$serviceHost
Code Block |
---|
<rule name="nn-host">
<rewrite template="{$serviceHost[NAMENODE]}"/>
</rule> |
$serviceMappedAddr
Code Block |
---|
<rule name="OOZIE/oozie/name-node-url">
<rewrite template="hdfs://{$serviceMappedAddr[NAMENODE]}"/>
</rule> |
$serviceMappedHost
Code Block |
---|
$serviceMappedUrl
Code Block |
---|
<match pattern="{path=**}">
<rewrite template="{$serviceMappedUrl[NAMENODE]}/{path=**}"/>
</match> |
$servicePath
Code Block |
---|
<rule name="nn-path">
<rewrite template="{$servicePath[NAMENODE]}"/>
</rule> |
$servicePort
Code Block |
---|
<rule name="hdfs-path">
    <match pattern="{path=**}"/>
    <rewrite template="hdfs://{$serviceHost[NAMENODE]}:{$servicePort[NAMENODE]}/{path=**}"/>
</rule> |
$serviceScheme
Code Block |
---|
<rule dir="IN" name="NODEUI/logs" pattern="*://*:*/**/node/logs/?{host}?{port}">
    <rewrite template="{$serviceScheme[NODEUI]}://{host}:{port}/logs/"/>
</rule> |
$serviceUrl
- $serviceUrl[SERVICE_NAME] - looks up the <service><url> for the <service><role>SERVICE_NAME
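For illustration, a rule using this function might look like the following sketch. WEBHDFS is one of the roles listed earlier; the rule name and pattern are assumptions.

```xml
<rule dir="IN" name="WEBHDFS/webhdfs/inbound/example" pattern="*://*:*/**/webhdfs/{path=**}?{**}">
    <!-- {$serviceUrl[WEBHDFS]} resolves to the <url> of the topology service whose <role> is WEBHDFS -->
    <rewrite template="{$serviceUrl[WEBHDFS]}/{path=**}?{**}"/>
</rule>
```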
...
$import - This function enhances the $frontend function by adding an '@import' prefix to the $frontend path, e.g.
Code Block |
---|
<rewrite template="{$import[&quot;, url]}/stylesheets/pretty.css&quot;;"/> |
It takes the following parameters as options:
- $import - Adds @import as a prefix to the frontend url, e.g. @import https://localhost:8443/gateway/sandbox
- $import[", url] - Adds " as a prefix along with @import; the rewritten frontend url will be '@import "https://localhost:8443/gateway/sandbox'
$username
$username - This variable is used when we need to get the impersonated principal name (primary principal in case impersonated principal is absent).
Code Block |
---|
<rewrite template="test-output-scheme://{host}:{port}/test-output-path/{path=**}?user.name={$username}?{**}?test-query-output-name=test-query-output-value"/> |
...
Code Block |
---|
<topology>
    <gateway>
        ...
        <provider>
            <role>hostmap</role>
            <name>static</name>
            <enabled>true</enabled>
            <param><name>external-host-name</name><value>internal-host-name</value></param>
        </provider>
        ...
    </gateway>
    ...
</topology> |
$inboundurl
Only used by outbound rules
Code Block |
---|
<rewrite template="{gateway.url}/datanode/static/{**}?host={$inboundurl[host]}"/> |
Rules Filter
Sometimes you want the ability to rewrite *.js, *.css and other non-HTML pages. Filters are a way to rewrite these non-HTML files. Filters are based on the content type of the page.
...
Uses Content-Type "application/xml", "text/xml", "*/xml"
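A rough sketch of what a filter for XML content could look like. The filter name, the XPath expression and the referenced rule are hypothetical, and the <filter>, <content> and <apply> elements are assumed to follow the Knox rewrite descriptor format; check them against the Knox documentation before use.

```xml
<rules>
    <!-- Hypothetical outbound rule that the filter applies to selected values -->
    <rule dir="OUT" name="EXAMPLE/example/outbound/url" pattern="*://*:*/{path=**}">
        <rewrite template="{$frontend[url]}/example/{path=**}"/>
    </rule>
    <filter name="EXAMPLE/example/outbound/xml">
        <!-- Only responses whose Content-Type matches */xml are processed -->
        <content type="*/xml">
            <!-- The XPath selects the values to rewrite with the named rule -->
            <apply path="/root/url" rule="EXAMPLE/example/outbound/url"/>
        </content>
    </filter>
</rules>
```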
Pattern Matching
Pattern matching in Knox unfortunately does not follow the standard regex format. The following shows how pattern matching works in some cases.
...
query = $7
fragment = $9
JSON Parsing
For parsing JSON documents Knox uses JSONPath.
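As a sketch, a filter that applies a rule to values selected with a JSONPath expression might look like this. The filter name, the JSON field names and the rule name are hypothetical, and the element names are assumed to follow the Knox rewrite descriptor format.

```xml
<filter name="EXAMPLE/example/outbound/json">
    <content type="*/json">
        <!-- $.items[*].href selects the href field of every element of the items array -->
        <apply path="$.items[*].href" rule="EXAMPLE/example/outbound/url"/>
    </content>
</filter>
```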
...
* http://www.ics.uci.edu/pub/ietf/uri/#Related