Amazon S3 is a “Simple Storage Service” offered by Amazon Web Services (AWS) that provides object storage through web services interfaces (REST, SOAP, and BitTorrent), as well as a secure method of storing files.

But reaching S3 from an EC2 instance that has no direct internet access can be difficult. First, you have to configure Internet Gateways and NAT Gateways and manage route tables so the EC2 instance can reach public resources. Then, you need to send “signed” requests to access non-public data in S3 buckets.

There’s an easier way.

The solution I detail below provides a way to convert unauthenticated requests from an EC2 instance to authenticated requests using Citrix ADC as a Secure S3 Proxy without editing the routes for the EC2 instance.

Authenticating Requests (AWS Signature Version 4)

The basic storage units of Amazon S3 are objects which are organized into buckets. Buckets and objects can be created, listed, and retrieved using REST APIs. These buckets can be made public or accessible only to particular users. If the bucket is made non-public, then the HTTP requests need to be signed.

At a high level, Signature Version 4 signing reduces the request to a canonical form, builds a “string to sign” from it, and signs that string with a key derived from the secret access key, the date, the region, and the service.
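
The key derivation itself is a chain of HMAC-SHA256 operations; below is a minimal Python sketch that mirrors the getSignatureKey function in the script in the Annexure (the function name derive_signing_key is illustrative):

import hashlib, hmac

def sign(key, msg):
    # one link in the chain: HMAC-SHA256 of msg, keyed with the previous result
    return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()

def derive_signing_key(secret_key, date_stamp, region, service):
    # "AWS4" + secret key -> date -> region -> service -> "aws4_request"
    k_date = sign(('AWS4' + secret_key).encode('utf-8'), date_stamp)
    k_region = sign(k_date, region)
    k_service = sign(k_region, service)
    return sign(k_service, 'aws4_request')

# The final signature is HMAC-SHA256(signing key, string to sign), hex-encoded,
# and goes into the Authorization header shown below.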

A sample GET request for a file, Test.txt, from an S3 bucket, TestBucket, looks like this:

GET /TestBucket/Test.txt HTTP/1.1
User-Agent: curl/7.13.1 (x86_64-unknown-freebsd6.3) libcurl/7.13.1 OpenSSL/1.0.1p zlib/1.2.3
Host: 11.12.13.14
Accept: */*

The S3 Proxy needs to rewrite the request transparently to the following:

GET /TestBucket/Test.txt HTTP/1.1
User-Agent: <user-agent>
Host: <hostname>
Accept: */*
x-amz-content-sha256: <hashed content>
Authorization: AWS4-HMAC-SHA256 Credential=<AccessID>/<Date>/<Region>/s3/aws4_request,SignedHeaders=host;x-amz-content-sha256;x-amz-date,Signature=<calculated signature>
x-amz-date:<date>

Signing the HTTP Request Using Citrix ADC

This solution uses a Python script, invoked through an HTTP callout, to construct the Authorization header. The ADC passes the request URL to the script in a custom url header, and the script can run on any external machine reachable from the ADC.

Once Citrix ADC receives the response from the Python script, it inserts all the required headers (x-amz-content-sha256, Authorization, and x-amz-date) into the request using an HTTP rewrite action.

Citrix ADC also rewrites the Host header to the S3 domain name.
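
Concretely, the body the script returns is the SHA-256 hash of the (empty) request payload on the first line, followed by the Authorization and x-amz-date headers, separated by CRLF. A callout response therefore looks roughly like this:

<hashed content>
Authorization: AWS4-HMAC-SHA256 Credential=<AccessID>/<Date>/<Region>/s3/aws4_request,SignedHeaders=host;x-amz-content-sha256;x-amz-date,Signature=<calculated signature>
x-amz-date: <date>

Because the rewrite action inserts this entire string as the value of x-amz-content-sha256, the embedded CRLFs turn the Authorization and x-amz-date lines into separate headers, so a single insert action covers all three.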

Configuring the Python Server

  1. Create a user on AWS and generate an access key for that user [you can also use an existing user]
  2. Update the bucket policy for the S3 bucket so that only this user can access it [see the Annexure]
  3. Copy the Python script [see the Annexure] to a Linux server in the same VPC [any server reachable from the ADC VPX can be used]
  4. SSH to the Linux server
  5. Edit the Python script to update the host and region
  6. Set the environment variables for the access key and secret key:

export ACCESS_KEY=<access_key>
export SECRET_KEY=<secret_key>

  7. Run the Python script in the background [a quick way to verify it responds is shown below]
  8. Exit the SSH session
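
Before wiring up the callout, you can sanity-check the signing service from the Linux server itself. A minimal check with Python 3, assuming the script is listening on port 8000 and, as in the callout configuration below, the object path is passed in a custom url header (the request path only has to end in .py):

import urllib.request

# Ask the signing service for the headers needed to fetch /TestBucket/Test.txt
req = urllib.request.Request('http://127.0.0.1:8000/calc_signature.py',
                             headers={'url': '/TestBucket/Test.txt'})
print(urllib.request.urlopen(req).read().decode())

The output should be the payload hash followed by the Authorization and x-amz-date headers.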

Configuring the Citrix ADC

  1. Configure a domain-based service group and bind it to an SSL LB vserver (HTTP can also be used)

add nameserver <nameserver>
add server server_s3 <S3 domain name>
add servicegroup sg_s3_aws SSL
bind servicegroup sg_s3_aws server_s3 443
add lb vserver vs_s3 SSL <VIP> 443
bind lb vserver vs_s3 sg_s3_aws
add ssl certkey key1 -cert <cert> -key <key>
bind ssl vserver vs_s3 -certkeyName key1

  2. Configure an HTTP callout to send requests to the Python server

add ns variable var1 -type text(64000) -scope global
add policy httpCallout hc1 -IPAddress <Python Server> -port 8000 -returnType TEXT -hostExpr HTTP.REQ.HOSTNAME -urlStemExpr "\"calc_signature.py\"" -headers url(HTTP.REQ.URL) -scheme http -resultExpr "HTTP.RES.BODY(2000)"
add ns assignment assign_var1 -variable "$var1" -set "sys.http_callout(hc1)"

  3. Configure rewrite policies to insert the HTTP headers and replace the Host header. The assignment policy (priority 10) runs the callout and stores its result in $var1 before the insert and host-replacement policies are evaluated.

add rewrite action rw_act_insert_s3_header insert_http_header x-amz-content-sha256 "$var1"
add rewrite action rw_act_hostname replace HTTP.REQ.HOSTNAME "\"<S3 domain name>\""
add rewrite policy rw_pol_assign_var1 TRUE assign_var1
add rewrite policy rw_pol_insert_s3_header TRUE rw_act_insert_s3_header
add rewrite policy rw_pol_replace_hostname TRUE rw_act_hostname
bind lb vserver vs_s3 -policyName rw_pol_assign_var1 -priority 10 -gotoPriorityExpression NEXT -type REQUEST
bind lb vserver vs_s3 -policyName rw_pol_insert_s3_header -priority 20 -gotoPriorityExpression NEXT -type REQUEST
bind lb vserver vs_s3 -policyName rw_pol_replace_hostname -priority 30 -gotoPriorityExpression END -type REQUEST

  4. Configure integrated caching to improve performance [OPTIONAL]

add cache contentGroup cache_cg1 -relExpiry 300
add cache policy cache_pol_s3 -rule TRUE -action CACHE -storeInGroup cache_cg1 -undefAction NOCACHE
bind cache global cache_pol_s3 -priority 100 -gotoPriorityExpression END -type REQ_OVERRIDE

Note: Internet access needs to be provided to the server-side subnet so that the ADC can reach the S3 endpoint. See https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Internet_Gateway.html for details.
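
With the configuration in place, an EC2 instance in the VPC can retrieve an object with a plain, unsigned request to the vserver VIP. A minimal check from the instance, using the <VIP> placeholder from the configuration above and skipping certificate verification in case the bound certificate is not trusted by the instance:

import ssl
import urllib.request

# Disable certificate checks only for this test; the vserver presents the
# certificate bound with "bind ssl vserver vs_s3 -certkeyName key1"
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

resp = urllib.request.urlopen('https://<VIP>/TestBucket/Test.txt', context=ctx)
print(resp.read().decode())

The response should be the contents of Test.txt, fetched from the non-public bucket without the instance ever holding AWS credentials.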

Annexure — Complete Citrix ADC Configuration

enable ns feature LB REWRITE IC
enable ns mode USNIP
set cache parameter -memLimit 1000

add nameserver <nameserver>
add server server_s3 s3.eu-central-1.amazonaws.com
add servicegroup sg_s3_aws SSL
bind servicegroup sg_s3_aws server_s3 443
add lb vserver vs_s3 SSL <VIP> 443
bind lb vserver vs_s3 sg_s3_aws
add ssl certkey key1 -cert <cert> -key <key>
bind ssl vserver vs_s3 -certkeyName key1

add ns variable var1 -type text(64000) -scope global
add policy httpCallout hc1 -IPAddress <Python Server> -port 8000 -returnType TEXT -hostExpr HTTP.REQ.HOSTNAME -urlStemExpr "\"calc_signature.py\"" -headers url(HTTP.REQ.URL) -scheme http -resultExpr "HTTP.RES.BODY(2000)"
add ns assignment assign_var1 -variable "$var1" -set "sys.http_callout(hc1)"
add rewrite action rw_act_insert_s3_header insert_http_header x-amz-content-sha256 "$var1"
add rewrite action rw_act_hostname replace HTTP.REQ.HOSTNAME "\"s3.eu-central-1.amazonaws.com\""
add rewrite policy rw_pol_assign_var1 TRUE assign_var1
add rewrite policy rw_pol_insert_s3_header TRUE rw_act_insert_s3_header
add rewrite policy rw_pol_replace_hostname TRUE rw_act_hostname
bind lb vserver vs_s3 -policyName rw_pol_assign_var1 -priority 10 -gotoPriorityExpression NEXT -type REQUEST
bind lb vserver vs_s3 -policyName rw_pol_insert_s3_header -priority 20 -gotoPriorityExpression NEXT -type REQUEST
bind lb vserver vs_s3 -policyName rw_pol_replace_hostname -priority 30 -gotoPriorityExpression END -type REQUEST

add cache contentGroup cache_cg1 -relExpiry 300
add cache policy cache_pol_s3 -rule TRUE -action CACHE -storeInGroup cache_cg1 -undefAction NOCACHE
bind cache global cache_pol_s3 -priority 100 -gotoPriorityExpression END -type REQ_OVERRIDE

Python Script

# Python 3: BaseHTTPServer was merged into http.server
from http.server import BaseHTTPRequestHandler, HTTPServer
import os, sys, datetime, hashlib, hmac

# ************* EDIT VALUES AS PER S3 REGION USED *************
method = 'GET'
service = 's3'
host = 's3.eu-central-1.amazonaws.com'
region = 'eu-central-1'
# ************************** END ******************************

# Read the credentials exported in the shell before the script was started
access_key = os.environ.get('ACCESS_KEY')
secret_key = os.environ.get('SECRET_KEY')
if access_key is None or secret_key is None:
    print('No access key is available.')
    sys.exit()

#Create custom HTTPRequestHandler class  
class KodeFunHTTPRequestHandler(BaseHTTPRequestHandler):

  #handle GET command  
  def do_GET(self):
    try:
      if self.path.endswith('.py'):
        #Get the URL
        url1 = self.headers['url']
        #Calculate headers
        head1 = calc_header(url1)
        #send code 200 response  
        self.send_response(200)
        #send header first  
        self.send_header('Content-type','text/html')
        self.end_headers()
        #send content to client  
        self.wfile.write(head1.encode('utf-8'))
      return

    except IOError:
      self.send_error(404, 'file not found')

def sign(key, msg):
    return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()

def getSignatureKey(key, dateStamp, regionName, serviceName):
    kDate = sign(('AWS4' + key).encode('utf-8'), dateStamp)
    kRegion = sign(kDate, regionName)
    kService = sign(kRegion, serviceName)
    kSigning = sign(kService, 'aws4_request')
    return kSigning
   
def calc_header(canonical_uri):
  t = datetime.datetime.utcnow()
  amzdate = t.strftime('%Y%m%dT%H%M%SZ')
  datestamp = t.strftime('%Y%m%d') # Date w/o time, used in credential scope
  canonical_querystring = ''  # no query string for a simple object GET
  payload_hash = hashlib.sha256(('').encode('utf-8')).hexdigest()
  canonical_headers = 'host:' + host + '\n' + 'x-amz-content-sha256:' + payload_hash \
  + '\n' + 'x-amz-date:' + amzdate + '\n'
  signed_headers = 'host;x-amz-content-sha256;x-amz-date'
  canonical_request = method + '\n' + canonical_uri + '\n' + canonical_querystring \
  + '\n' + canonical_headers + '\n' + signed_headers + '\n' + payload_hash
  algorithm = 'AWS4-HMAC-SHA256'
  credential_scope = datestamp + '/' + region + '/' + service + '/' + 'aws4_request'
  string_to_sign = algorithm + '\n' +  amzdate + '\n' +  credential_scope + '\n' \
  +  hashlib.sha256(canonical_request.encode('utf-8')).hexdigest()
  signing_key = getSignatureKey(secret_key, datestamp, region, service)
  signature = hmac.new(signing_key, (string_to_sign).encode('utf-8'), hashlib.sha256).hexdigest()
  authorization_header = algorithm + ' ' + 'Credential=' + access_key + '/' + credential_scope \
  + ',' +  'SignedHeaders=' + signed_headers + ',' + 'Signature=' + signature
  # Body returned to the ADC callout: the payload hash on the first line,
  # followed by the Authorization and x-amz-date headers, CRLF-separated
  header = payload_hash + '\r\n' + 'Authorization: ' + authorization_header + '\r\n' \
        + 'x-amz-date: ' + amzdate + '\r\n'
  return header
      
def run():
  # Listen on all interfaces so the Citrix ADC callout can reach the server;
  # port 8000 must match the -port value in the httpCallout configuration
  server_address = ('0.0.0.0', 8000)
  httpd = HTTPServer(server_address, KodeFunHTTPRequestHandler)
  httpd.serve_forever()

if __name__ == '__main__':
  run()

Sample Bucket Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::<account-id>:user/<user-name>"
            },
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::<bucket-name>/*"
        }
    ]
}

Accessing S3 from an EC2 instance doesn’t have to be complicated. Using Citrix ADC as a Secure S3 Proxy gives you a secure way to store and retrieve files without editing the instance’s routes.

Learn more about Citrix ADC on the product documentation page.