Basically, it goes like this:
Very simple, right?
Well, we started to fill it up with more apps, and recently I realized we'd run out of memory sooner or later. So I changed the way Passenger was started: I reduced "--min-instances" to 0 and left it at the minimum (1) on one server only.
It worked... but not the way I wanted. While Nginx is load balancing incoming requests it naturally health-checks the nodes (server 1, 2, ...N), which is a good thing, but my "--min-instances 0" startup parameter had very little impact, because:
- Nginx doesn't know about my "min-instances" parameter and considers all the nodes to always "be ready to serve";
- when a new request gets routed (by Nginx) to a node with "min-instances 0", it might take quite a few moments for the first response to be spit out by the Rails app instance (i.e. a sort of "warming up"), so the whole thing started to feel much slower;
- since Nginx does round-robin load balancing, my Rails instances kept starting up and then getting shut down (by Phusion Passenger which is, by the way, doing a great job) because the activity, i.e. incoming requests, was actually not that high.
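To put rough numbers on that last point, here is a minimal sketch of the arithmetic. The function names and the idle timeout value are made up for illustration; check your own Passenger settings for the real idle time.

```python
def seconds_between_hits_per_node(requests_per_minute, nodes):
    """Average gap between two requests landing on the same node
    when a round-robin balancer spreads traffic evenly."""
    gap_overall = 60.0 / requests_per_minute
    return gap_overall * nodes


def every_hit_is_cold(requests_per_minute, nodes, idle_timeout_s):
    """True when a node's app instance would idle out before the
    next request reaches it (idle_timeout_s is an assumed value)."""
    return seconds_between_hits_per_node(requests_per_minute, nodes) > idle_timeout_s


# One request per minute spread over 4 nodes: each node is hit every 240 s.
print(seconds_between_hits_per_node(1, 4))
# With a hypothetical 120 s idle timeout every request finds a cold instance...
print(every_hit_is_cold(1, 4, 120))
# ...while the same traffic on a single node keeps it warm.
print(every_hit_is_cold(1, 1, 120))
```

The more nodes share a trickle of traffic, the longer each one sits idle, which is why concentrating low traffic on fewer nodes helps.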
So, I wanted to make something very simple, with a minimum of code writing and configuration.
Here's what I came up with:
Nothing fancy, you'll say. A "standard" stack... exactly! There's one little thing though: this "haproxy_autoscale" Python script I wrote. What it does is very simple:
- hook into the HAProxy log file (I actually use syslog and "option httplog")
- calculate the average response time over the latest 10-15 requests (i.e. how fast my Rails instances are responding back to the client/browser)
- if the average stays above a threshold for a while, start to scale up.
- when request activity goes down (e.g. no high load within 5 min), scale down.
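The scale-up decision in those steps can be sketched on its own, without any HAProxy involved. This is only an illustration under a few assumptions: the class name ScaleDecider and its parameters are hypothetical, and "for a while" is interpreted here as a number of consecutive above-threshold averages.

```python
from collections import deque


class ScaleDecider:
    """Rolling average of response times plus a consecutive-breach counter."""

    def __init__(self, threshold_ms=500, window=15, breaches_needed=3):
        self.threshold_ms = threshold_ms
        # deque with maxlen drops the oldest sample automatically
        self.samples = deque(maxlen=window)
        self.breaches_needed = breaches_needed
        self.breaches = 0

    def observe(self, response_ms):
        """Record one response time; return True when it is time to scale up."""
        self.samples.append(response_ms)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough data yet
        average = sum(self.samples) / len(self.samples)
        if average <= self.threshold_ms:
            self.breaches = 0  # load is back to normal
            return False
        self.breaches += 1
        if self.breaches < self.breaches_needed:
            return False  # above threshold, but not for long enough
        self.breaches = 0
        return True


# Small window so the behaviour is easy to follow by hand.
decider = ScaleDecider(threshold_ms=500, window=3, breaches_needed=2)
results = [decider.observe(ms) for ms in [100, 100, 100, 900, 900, 900]]
print(results)  # [False, False, False, False, False, True]
```

Requiring several breaches in a row keeps one slow request from triggering a scale-up.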
By "scale up" and "scale down" I simply mean taking a backend server out of the MAINT status or putting it into MAINT, using HAProxy socket commands:
scale up:
echo "enable server app1-backend/server1" | socat stdio /haproxy/socket
scale down:
echo "disable server app1-backend/server1" | socat stdio /haproxy/socket
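If you'd rather not shell out to socat, the same commands can be written to the socket straight from Python. A minimal sketch, assuming the stats socket lives at /haproxy/socket as above and is configured with enough privileges for enable/disable (admin level, as far as I know); the helper name haproxy_command is made up for this example.

```python
import socket


def haproxy_command(command, socket_path="/haproxy/socket", timeout=2.0):
    """Send one command to the HAProxy stats socket and return its reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        # commands are plain text terminated by a newline
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the other side closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


# Usage (needs a running HAProxy, so it is left commented out):
# haproxy_command("enable server app1-backend/server1")
# haproxy_command("disable server app1-backend/server1")
```

This avoids spawning a shell for every command and hands the reply back as a string, which makes errors easier to spot than with os.system.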
That's it. What happens is most of my app servers and rails instances do not get bothered anymore unless there really is a high load. That way I got my RAM back and can stuff up even more apps/hadoop map-red/whatever.
In case you're interested, here's that haproxy_autoscale.py script.
I have to warn you, though: by no means should you use it in a production environment as is. It's an ongoing experiment I'm running these days. This little script still needs quite a few touches, but it'll give you an idea.
import sys
import os
import time
import re
from threading import Timer
from datetime import datetime
import urllib.request
import urllib.error
import random

# response time threshold in milliseconds: when backend starts responding
# slower than the threshold we scale up, otherwise scale down.
THRESHOLD = 500

# num of requests to calc average
NUM_REQ = 15

# need this backend to set correct initial status of backend servers
BACKEND = "app1backend"

# any url that goes straight to the backend is fine as warmup_url
# active == True will initially set its status as UP, MAINT otherwise
# always_up - never set MAINT on that backend (leave at least one host as always_up)
SERVERS = {
    'server1': {
        'active': True,
        'always_up': True,
        'warmup_url': 'http://server1:1234/'
    },
    'server2': {
        'active': False,
        'always_up': False,
        'warmup_url': 'http://server2:1234/'
    },
    'server3': {
        'active': False,
        'always_up': False,
        'warmup_url': 'http://server3:1234/'
    },
    'server4': {
        'active': False,
        'always_up': False,
        'warmup_url': 'http://server4:1234/'
    },
}

# see http://code.google.com/p/haproxy-docs/wiki/UnixSocketCommands
CMD_DISABLE = 'echo "disable server b-%s/%s" | socat stdio /haproxy/socket'
CMD_ENABLE = 'echo "enable server b-%s/%s" | socat stdio /haproxy/socket'
CMD_SET_WEIGHT = 'echo "set weight b-%s/%s %d" | socat stdio /haproxy/socket'


def watch(thefile):
    """
    opens thefile and keeps reading new lines.
    this is supposed to be a syslog log file.
    """
    thefile.seek(0, 2)  # Go to the end of the file
    while True:
        line = thefile.readline()
        if not line:
            time.sleep(0.1)  # Sleep briefly
            continue
        yield line


def host_to_scaleup():
    """
    searches through the list of not yet active backends
    and returns a random choice, otherwise returns None
    """
    hosts = [h for h in SERVERS if not SERVERS[h]['active']]
    if hosts:
        return random.choice(hosts)
    # otherwise return None, nothing to scale up


def host_to_scaledown():
    """
    filters only active hosts and returns a random choice,
    None otherwise.
    """
    hosts = [h for h in SERVERS
             if SERVERS[h]['active'] and not SERVERS[h]['always_up']]
    if hosts:
        return random.choice(hosts)
    # otherwise return None, nothing to scale down


def scale_up(backend, host):
    """
    sends a 'warmup' request to the host in question and adds it
    to the HAProxy's active backend servers list, i.e. sets UP status
    """
    warmup_url = SERVERS[host]['warmup_url']
    print("%s: warming up at %s" % (datetime.now(), warmup_url))
    req = urllib.request.Request(warmup_url)
    req.add_header('User-Agent', 'haproxy_autoscale')
    try:
        urllib.request.urlopen(req)
    except urllib.error.HTTPError as e:
        print("*** didn't get a 200/OK response, sorry: ", e.code)
    except urllib.error.URLError as e:
        print("*** couldn't reach the backend server: ", e.reason)
    else:
        # send socket commands to (re-)enable the backend
        cmd1 = CMD_ENABLE % (backend, host)
        cmd2 = CMD_SET_WEIGHT % (backend, host, 10)
        os.system(cmd1)
        os.system(cmd2)
        SERVERS[host]['active'] = True


def scale_down(backend, host):
    """
    removes host from HAProxy active backend servers list,
    i.e. sets MAINT status
    """
    print("%s: turning DOWN b-%s/%s" % (datetime.now(), backend, host))
    cmd1 = CMD_SET_WEIGHT % (backend, host, 0)
    cmd2 = CMD_DISABLE % (backend, host)
    # for some reason "disable server" does not always work,
    # so we also set the weight to 0, just in case.
    os.system(cmd1)
    os.system(cmd2)
    SERVERS[host]['active'] = False


# this is where we store response times
resps = []


def avg_resp_time(new_val):
    """
    adds new_val to the resps list and returns the average
    over all requests in the list.
    """
    resps.append(new_val)
    if len(resps) > NUM_REQ:
        # keep list length up to the NUM_REQ maximum items
        del resps[0]
        return sum(resps) / len(resps)
    # otherwise we return None: not enough data


def random_scale_up(backend):
    """does the opposite of random_scale_down()"""
    h_up = host_to_scaleup()
    if h_up:
        scale_up(backend, h_up)
    reset_cooldown_timer(backend)


def random_scale_down(backend):
    """
    runs after about 5 mins of inactivity (e.g. no incoming requests)
    """
    h_down = host_to_scaledown()
    if h_down:
        print("%s: scaling down %s" % (datetime.now(), h_down))
        scale_down(backend, h_down)
    reset_cooldown_timer(backend)


# when no requests are coming in anymore we still
# want to scale down automatically, after some time.
cooldown_timer = None


def reset_cooldown_timer(backend):
    """
    creates new timer to scale down after 5 min of inactivity
    """
    global cooldown_timer
    if cooldown_timer:
        cooldown_timer.cancel()
    cooldown_timer = Timer(60 * 5, random_scale_down, [backend])
    cooldown_timer.start()


# set initial status of every backend server
for h in SERVERS:
    if SERVERS[h]['active']:
        scale_up(BACKEND, h)
    else:
        scale_down(BACKEND, h)

print("watching %s ..." % sys.argv[1])

# regexp to match against haproxy log file
p = re.compile(r'.*b-([a-zA-Z0-9\-_]+)/([a-zA-Z0-9\-_]+) \d+/\d+/\d+/(\d+)/.*')

# scale up/down count threshold
scale_threshold_count = 0

# endless loop
for line in watch(open(sys.argv[1])):
    r = p.match(line)
    if r:
        backend, host, rt = r.groups()
        if host in SERVERS:
            SERVERS[host]['active'] = True  # set as active since it's in the logs
        # calculate average response time
        resp_time = avg_resp_time(int(rt))
        # check whether we have enough data to reason
        if resp_time is None:
            continue
        elif resp_time > THRESHOLD:
            # scale up, if we can and need to
            scale_threshold_count += 1
            if scale_threshold_count < 3:
                # haven't reached count max
                continue
            # please, do scale
            print("%s: avg resp time: %d" % (datetime.now(), resp_time))
            random_scale_up(backend)
            scale_threshold_count = 0  # reset the counter
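To see what the regular expression in the script actually pulls out of a log line, here is a small standalone sketch. The sample line is made up: it only mimics the general shape of an "option httplog" entry with a "b-" prefixed backend name, and the parse_line helper is not part of the script.

```python
import re

# same pattern as in the script, written as a raw string
LOG_PATTERN = re.compile(
    r'.*b-([a-zA-Z0-9\-_]+)/([a-zA-Z0-9\-_]+) \d+/\d+/\d+/(\d+)/.*'
)

# hypothetical log line, for illustration only
SAMPLE_LINE = (
    'Feb  6 12:14:14 localhost haproxy[14389]: 10.0.1.2:33317 '
    '[06/Feb/2009:12:14:14.655] http-in b-app1backend/server1 '
    '10/0/30/69/109 200 2750 - - ---- 1/1/1/1/0 0/0 "GET /index.html HTTP/1.1"'
)


def parse_line(line):
    """Return (backend, host, response_ms) or None if the line doesn't match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    backend, host, response_ms = match.groups()
    return backend, host, int(response_ms)


print(parse_line(SAMPLE_LINE))  # ('app1backend', 'server1', 69)
```

The captured number is the fourth of the five slash-separated timers, which in the httplog format is the time the server took to respond. Also note that the pattern only matches backends whose names start with "b-".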
If you want to try it, just change /haproxy/socket in CMD_DISABLE, CMD_ENABLE and CMD_SET_WEIGHT (at the beginning of the script) to wherever your HAProxy socket is and run it like this:
python haproxy_autoscale.py /path/to/your/haproxy/httplog
Let me know what you think.