UTF-8 is a compact, efficient multi-byte encoding of Unicode in which each character is encoded in one to four bytes. Unicode is the only practical character set for applications that support multilingual documents.

UTF-8 is an important encoding because, unlike other multi-byte encodings, it is ASCII-compatible and easy to process.
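
As a quick illustration of the one-to-four-byte range and of ASCII compatibility, here is a minimal Python sketch. The sample characters and the helper name `utf8_len` are chosen for illustration only and are not part of any NetScaler configuration.

```python
# Sample characters, written as escapes so the code points are unambiguous.
samples = {
    "A": "ASCII letter",
    "\u03cd": "Greek letter (ύ)",
    "\u65e5": "CJK ideograph (日)",
    "\U0001F600": "emoji outside the Basic Multilingual Plane",
}


def utf8_len(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


# Map each description to its encoded size in bytes.
sizes = {desc: utf8_len(ch) for ch, desc in samples.items()}
for desc, size in sizes.items():
    print(f"{desc}: {size} byte(s)")

# ASCII compatibility: pure ASCII text yields identical bytes in both encodings.
ascii_text = "HTTP/1.1 200 OK"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print("ASCII and UTF-8 bytes identical:", same_bytes)
```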

NetScaler handles multilingual applications that serve ASCII as well as UTF-8 content. The SET_CHAR_SET(UTF_8) method enables NetScaler to make policy-based decisions on UTF-8 content using AppExpert expressions.
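
To see why the character set matters for text operations in general, consider the sketch below. It is plain Python and says nothing about how NetScaler works internally; it only shows that the same bytes behave differently when treated as raw bytes versus UTF-8 characters.

```python
# Two CJK characters separated by a space: 3 + 1 + 3 = 7 bytes in UTF-8.
body = "\u65e5 \u6708".encode("utf-8")  # 日 月

# Character-aware view: decode as UTF-8 first, then index by character.
chars = body.decode("utf-8")
first_char = chars[0]  # the whole first character, 日

# Byte-oriented view: the first byte on its own is only a fragment
# of a three-byte sequence, so it cannot be decoded as a character.
try:
    body[:1].decode("utf-8")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False

print(len(body), "bytes,", len(chars), "characters")
print("first byte decodes on its own:", split_ok)
```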

Let’s look at this in depth with an example:

USE CASE:

Consider a case where the response page contains text in three different languages: Chinese, Japanese, and Greek. NetScaler needs to recognize the text in all three languages and replace it with its English equivalent.

  • add policy patset Sample_Patset
  • bind policy patset Sample_Patset "日"
  • bind policy patset Sample_Patset "月"
  • bind policy patset Sample_Patset "Ιούλιος"

With these commands, we have created the pattern set Sample_Patset shown below, which contains the patterns that stand for July in Chinese (日), Japanese (月), and Greek (Ιούλιος).

Sample_Patset

Index  Pattern
1      日
2      月
3      Ιούλιος


  • add rewrite action Act1 replace_all HTTP.RES.BODY(10000) '"July"' -search patset("Sample_Patset")

Action Act1 searches the response data for any of the patterns in Sample_Patset and replaces each match with July.

  • add rewrite policy Pol1 HTTP.RES.BODY(10000).SET_CHAR_SET(UTF_8).CONTAINS_ANY("Sample_Patset") Act1

Policy Pol1 uses the SET_CHAR_SET(UTF_8) method to set the character set to UTF-8 for the subsequent functions that operate on the given text, in this case CONTAINS_ANY matching against the Sample_Patset patterns.
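
The combined effect of Pol1 and Act1 can be approximated in a few lines of Python. This is only a sketch of the matching and replacement logic, not NetScaler's implementation: the function names `contains_any` and `replace_all` are hypothetical stand-ins, the sample body is shortened, and the 10000-byte evaluation window of HTTP.RES.BODY(10000) is ignored.

```python
# Patterns bound to Sample_Patset, written as escapes so the
# code points are unambiguous (precomposed Greek letters assumed).
GREEK_JULY = "\u0399\u03bf\u03cd\u03bb\u03b9\u03bf\u03c2"  # Ιούλιος
SAMPLE_PATSET = ["\u65e5", "\u6708", GREEK_JULY]           # 日, 月, Ιούλιος


def contains_any(body: bytes, patterns: list[str]) -> bool:
    """Hypothetical stand-in for the policy rule: decode as UTF-8, then match."""
    text = body.decode("utf-8")
    return any(p in text for p in patterns)


def replace_all(body: bytes, patterns: list[str], replacement: str) -> bytes:
    """Hypothetical stand-in for the rewrite action: replace every match."""
    text = body.decode("utf-8")
    for p in patterns:
        text = text.replace(p, replacement)
    return text.encode("utf-8")


# Shortened sample body containing all three patterns.
sample_body = f"<body>\n\u65e5 \u6708 {GREEK_JULY}\n</body>".encode("utf-8")

# The policy decides whether the action runs at all.
if contains_any(sample_body, SAMPLE_PATSET):
    rewritten = replace_all(sample_body, SAMPLE_PATSET, "July")
else:
    rewritten = sample_body

print(rewritten.decode("utf-8"))
```

Decoding before matching is the point of the sketch: the patterns are compared as characters rather than as arbitrary byte runs.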

Now, when NetScaler receives the HTTP response data shown below from the backend servers, it replaces the strings that match the defined pattern set and sends the result to the client.

HTTP Response Data received from the Backend Servers

< HTTP/1.1 200 OK

< Date: Wed, 20 Jul 2011 09:33:06 GMT

< Server: Apache/2.2.6 (Fedora)

< Accept-Ranges: bytes

< Content-Length: 149

< Connection: close

< Content-Type: text/html; charset=UTF-8

<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"></head><head></head><body>

日 月 Ιούλιος

</body></html>

Response after it is rewritten by the configured action on NetScaler

< HTTP/1.1 200 OK

< Date: Wed, 20 Jul 2011 09:35:13 GMT

< Server: Apache/2.2.6 (Fedora)

< Accept-Ranges: bytes

< Content-Length: 141

< Connection: close

< Content-Type: text/html; charset=UTF-8

<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"></head><head></head><body>

July July July

</body></html>
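
The drop in Content-Length from 149 to 141 is consistent with the replacement when sizes are counted in UTF-8 bytes. The short Python check below assumes the rest of the body is unchanged and that the Greek word uses precomposed characters of two bytes each.

```python
# Text that changes between the two responses.
GREEK_JULY = "\u0399\u03bf\u03cd\u03bb\u03b9\u03bf\u03c2"  # Ιούλιος
body_before = f"\u65e5 \u6708 {GREEK_JULY}"                # 日 月 Ιούλιος
body_after = "July July July"

# Sizes in bytes as sent on the wire (UTF-8).
bytes_before = len(body_before.encode("utf-8"))  # 3 + 1 + 3 + 1 + 14
bytes_after = len(body_after.encode("utf-8"))    # 4 + 1 + 4 + 1 + 4
bytes_saved = bytes_before - bytes_after

# Difference between the two Content-Length headers shown above.
header_delta = 149 - 141

print(bytes_before, bytes_after, bytes_saved, header_delta)
```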