I'm doing something very similar to this; the setup I'm using is:<p>DNSMadeEasy has a global traffic director ( <a href="http://www.dnsmadeeasy.com/services/global-traffic-director/" rel="nofollow">http://www.dnsmadeeasy.com/services/global-traffic-director/</a> )<p>That sends each request to the closest Linode data center.<p>The Linode instances run nginx, which proxies to Varnish, and the Varnish backend is connected via VPN to the main app servers (based in the London data center, as the vast majority of my users are in London).<p>I use Varnish behind nginx to place an additional fast cache close to the edge and avoid unnecessary traffic over the VPN.<p>Example: USA-to-London traffic passes over the VPN running within Linode, and the SSL connection for an East Coast user only goes as far as Newark. If the request is for a static file that was recently requested (by some other user), it is served from Varnish and never even leaves the Newark data center.
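Purely as an illustration of that last point (not the actual nginx/Varnish setup), here's a toy edge cache sketched in Node/TypeScript: a static file recently requested by any user gets served locally, so the request never crosses the VPN. The VPN address 10.8.0.1 and the port numbers are placeholders.

```typescript
import * as http from "node:http";

// Toy edge cache: serve recently fetched static files locally instead of
// sending the request over the VPN to the main app servers.
// A real deployment (Varnish) would also handle TTLs, eviction, and Vary.
const cache = new Map<string, { headers: http.IncomingHttpHeaders; body: Buffer }>();

http.createServer((req, res) => {
  const key = req.url ?? "/";
  const hit = req.method === "GET" ? cache.get(key) : undefined;
  if (hit) {
    // Cache hit: the response never leaves this data center.
    res.writeHead(200, hit.headers);
    res.end(hit.body);
    return;
  }

  // Cache miss: forward over the VPN (10.8.0.1 is a placeholder address).
  const upstream = http.request(
    { host: "10.8.0.1", port: 80, path: key, method: req.method, headers: req.headers },
    (upRes) => {
      const chunks: Buffer[] = [];
      upRes.on("data", (c: Buffer) => chunks.push(c));
      upRes.on("end", () => {
        const body = Buffer.concat(chunks);
        if (req.method === "GET" && upRes.statusCode === 200) {
          cache.set(key, { headers: upRes.headers, body });
        }
        res.writeHead(upRes.statusCode ?? 502, upRes.headers);
        res.end(body);
      });
    }
  );
  req.pipe(upstream);
}).listen(8080);
```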
Maybe I don't understand the problem correctly, but why not just preflight an HTTPS request when your widget loads?<p>In the time it takes the user to pick their file(s) to upload, the initial SSL negotiation will most likely have finished. And if you upload multiple files serially, the browser should even reuse the current SSL context, so it wouldn't be ~300ms per file.
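A minimal sketch of that preflight, assuming a hypothetical cheap endpoint at https://upload.example.com/ping on the same host the uploads go to: fire a throwaway request as soon as the widget initializes, so the handshake is already done by the time the user has picked a file.

```typescript
// Warm up the TLS connection as soon as the upload widget loads, so the
// handshake has (most likely) completed before the user finishes picking files.
function preflightUploadConnection(): void {
  fetch("https://upload.example.com/ping", {
    method: "HEAD",     // no body needed; we only want the connection set up
    cache: "no-store",  // make sure the request actually hits the network
  }).catch(() => {
    // Ignore failures; this is purely a latency optimization.
  });
}

// Call during widget initialization, before the file picker is shown.
preflightUploadConnection();
```

As long as the uploads go to the same origin and the connection is kept alive, later requests should reuse it rather than negotiating SSL again.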
How do you manage the keepalive connection pool? Are you managing it in nginx (via HTTP 1.1 backend support) or using a different service?<p>We ran a test of this approach using a similar stack in 2010. We had Ireland, Singapore, and Sydney backhauling to Dallas, TX for a reasonably large population of users. Managing the backend pool was a bit of a challenge without custom code; nginx didn't yet support HTTP 1.1 backend connections. The two best options I could find at the time were Apache TrafficServer and perlbal. perlbal won and was pretty easy to set up with a stable warm connection pool.<p>Despite good performance gains we didn't put the system into production. The monitoring and maintenance burden was high, and at that time we lacked a homogeneous network -- I tested Singapore and Australia using VPS providers, as Amazon and SoftLayer (our vendors of choice) weren't there yet.<p>As a side effect of using those VPS vendors and trying to keep costs under control, we had to ratchet the TTL for this service down uncomfortably low to allow for cross-region failover. In Australia the additional DNS hit nearly wiped out the gains in SSL negotiation.<p>With today's increased geographical coverage and rich set of services from Amazon, this is a much less daunting project if you can stomach the operational overhead.<p>Note that the lack of sanely priced bandwidth and hosting providers in Australia is a huge problem. When Amazon lands EC2 there, it's going to really shake up that market.
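If you do end up managing the pool in your own code rather than in perlbal or nginx, here's a rough sketch of the idea in Node/TypeScript (not the stack described above); the origin host and pool sizes are placeholders. The shared keep-alive agent is the warm pool: idle connections to the distant origin stay open and get reused by later requests.

```typescript
import * as http from "node:http";
import * as https from "node:https";

// The shared agent holds warm TLS connections to the far-away origin.
// Host name and pool sizes are illustrative, not recommendations.
const originAgent = new https.Agent({
  keepAlive: true,
  keepAliveMsecs: 30_000, // how long idle sockets stay warm
  maxSockets: 50,         // max concurrent connections to the origin
  maxFreeSockets: 10,     // idle connections held open between requests
});

// Minimal edge proxy: accept the client request locally and forward it to the
// origin over the already-established pool. (A real proxy would also rewrite
// the Host header, add X-Forwarded-For, etc.)
http.createServer((clientReq, clientRes) => {
  const originReq = https.request(
    {
      host: "origin.example.com",
      path: clientReq.url ?? "/",
      method: clientReq.method,
      headers: clientReq.headers,
      agent: originAgent,
    },
    (originRes) => {
      clientRes.writeHead(originRes.statusCode ?? 502, originRes.headers);
      originRes.pipe(clientRes);
    }
  );
  originReq.on("error", () => clientRes.writeHead(502).end());
  clientReq.pipe(originReq);
}).listen(8080);
```

For what it's worth, current nginx versions can hold a warm backend pool natively via the upstream keepalive directive together with HTTP/1.1 proxying, which removes the need for perlbal in this role.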
So, if my understanding is correct, they are trading SSL handshake latency (which occurs once per connection) for the potential latency incurred by having traffic redirected from multiple servers around the world to a single set of application servers?<p>It seems like in the diagram, the West Coast client, instead of making a direct connection to the app servers on the right, makes a connection to the ELB on the left, which forwards the traffic to the nginx server, which forwards it to another ELB, which forwards it to the app servers.<p>If the client connected directly to the ELB in front of the app servers, it would incur the SSL handshake latency but would avoid the four extra hops (two per send and two per receive) through the ELB and nginx.<p>Over the lifetime of the connection, is it possible that this latency could add up to more than 200 ms?
Isn't this a "poor man's version" of what CloudFlare offers?<p>They even have an optimized version called Railgun (<a href="https://www.cloudflare.com/railgun" rel="nofollow">https://www.cloudflare.com/railgun</a>) that only ships the diff across the country.
The "pool of warm keep-alive connections to the main web servers" is still sending the traffic over HTTPS, then?<p>Edit: I'm clear that latency is reduced and how that's accomplished. I just wanted to get clarification that the connections between the early SSL termination and the web servers was also encrypted, too.
You can get this from a CDN like AWS CloudFront as well. CloudFront will keep a pool of persistent connections to the origin, whether it's S3 or a custom origin. You can also do HTTP or HTTPS over the port of your choice on the backend, enabling "mullet routing". The minimum TTL is 0, allowing you to vary content for each request.<p>One issue with CloudFront is that the POST, PUT, and DELETE verbs aren't currently supported, which is a kink for modifying data. You could use Route 53's LBR feature to route requests to nearby EC2 instances, then proxy back to your origin.
Would it be more effective to forward plain HTTP over a VPN instead? For example, you set up your servers in London, on the East Coast, and on the West Coast and configure a VPN between them. Users connect to their local server via HTTPS, and that server forwards the request to London over plain HTTP; the VPN encrypts it in transit. The advantage is that your proxy - nginx is good for this - can bring up additional connections more quickly.
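A minimal sketch of that arrangement (in Node/TypeScript purely for illustration; nginx would do the same job with proxy_pass): the edge node terminates TLS, then forwards plain HTTP over the tunnel. The VPN address 10.8.0.1 and the certificate paths are placeholders.

```typescript
import * as https from "node:https";
import * as http from "node:http";
import { readFileSync } from "node:fs";

// TLS is terminated close to the user; certificate paths are placeholders.
const tlsOptions = {
  key: readFileSync("/etc/ssl/private/edge.key"),
  cert: readFileSync("/etc/ssl/certs/edge.crt"),
};

// Forward every request as plain HTTP to the London app servers over the VPN;
// 10.8.0.1 stands in for their address on the tunnel.
https.createServer(tlsOptions, (clientReq, clientRes) => {
  const backendReq = http.request(
    {
      host: "10.8.0.1",
      port: 80,
      path: clientReq.url ?? "/",
      method: clientReq.method,
      headers: clientReq.headers,
    },
    (backendRes) => {
      clientRes.writeHead(backendRes.statusCode ?? 502, backendRes.headers);
      backendRes.pipe(clientRes);
    }
  );
  backendReq.on("error", () => clientRes.writeHead(502).end());
  clientReq.pipe(backendReq);
}).listen(443);
```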
So, the way I understand it, the load balancer <-> web server connection is over the private network, right? And with VPC, your private network is isolated and can't be snooped on by other Amazon customers?<p>Sounds cool, but this would only work on Amazon or in datacenters with cross-data-center private networks (SoftLayer has this, for example).
I think maintaining a pool of 'warm' https sessions between the nginx and the app server is not a very flexible approach. What happens when all of those are occupied?
Wouldn't it be nicer to have an IPsec tunnel between the nginx and the app server and open http sessions on demand?
Wait. "The actual HTTP request would then be sent to the intermediate instance which then forwards it on" are you forwarding this on in plain text ? Is the traffic at least traversing a VPN between the two locations ?
This post shows that engineers aren't always the best at presenting their work. I think if the author abstracted this post and didn't dive so far into the technical aspects of the problem, it could appeal to a much wider audience.<p>For example, the discussion of nginx could be abstracted into a discussion of graph theory, where a handshake has to occur with a secure cluster of nodes.<p>This is all just IMHO. Great post though!