author | michael-grunder <michael.grunder@gmail.com> | 2014-12-02 08:06:31 +0300 |
---|---|---|
committer | michael-grunder <michael.grunder@gmail.com> | 2015-05-06 01:05:30 +0300 |
commit | 48e6e67a8286ac2aab95132763bc78af508b9e90 (patch) | |
tree | 31bd21f87dc95e80ad57666ae035eb5f05152d7c /cluster_library.c | |
parent | d804342a6f8258aa25147eeacbd4e3b22fa4faa6 (diff) |
Initial commit of formalized "redirection" timeout logic
When phpredis is communicating with a cluster, there are two
different kinds of timeout events.
The first is the standard read or write timeout, where the socket is
blocked either because of network issues or because Redis is taking
longer than the timeout to complete the request.
The second is unique to cluster. Because Redis Cluster attempts to
fail over automatically (in the case of replicas), phpredis cluster
will attempt to get data from the node where it thinks the key
lives and, upon a failure to connect, try a different node (at random).
This is because the cluster could be resharding and may point
the client to a new (now good) node. However, if the cluster has not yet
detected a failure, it will just bounce us back to the prior node (which
could actually be down or have just sputtered due to various issues).
So in this case, phpredis uses a second timeout mechanism where we keep
track (in milliseconds) of when we entered the query/response loop. Once
we've been unsuccessful for the length of this timeout, phpredis will
abort with a different (catchable) exception.
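
The deadline check described above can be sketched in isolation. This is a minimal standalone illustration, not the phpredis code: `now_ms`, `attempt_fn`, `retry_until_deadline`, `fail_n_times`, and `always_fail` are hypothetical names, and the callback stands in for one send/receive round trip. A `wait_ms` of 0 disables the deadline, which mirrors the `c->waitms` check in the patch.

```c
#include <stddef.h>
#include <sys/time.h>

/* Current wall-clock time in milliseconds (same idea as mstime() in the patch). */
static long long now_ms(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

/* Hypothetical callback: returns 0 on a valid reply, non-zero when the
 * attempt failed or was redirected and should be retried. */
typedef int (*attempt_fn)(void *ctx);

enum retry_result { RETRY_OK = 0, RETRY_TIMEOUT = -1 };

/* Retry `attempt` until it succeeds or `wait_ms` milliseconds have elapsed
 * since the loop was entered. wait_ms == 0 means "no deadline". */
static enum retry_result retry_until_deadline(attempt_fn attempt, void *ctx,
                                              long long wait_ms) {
    long long start = now_ms();
    int resp, timedout = 0;

    do {
        resp = attempt(ctx);
        /* Only consult the clock when the attempt failed and a deadline is set. */
        timedout = (resp != 0 && wait_ms > 0) ? (now_ms() - start >= wait_ms) : 0;
    } while (resp != 0 && !timedout);

    return resp == 0 ? RETRY_OK : RETRY_TIMEOUT;
}

/* Example callbacks (assumptions for illustration only). */
static int fail_n_times(void *ctx) {
    int *remaining = ctx;           /* number of failures still to simulate */
    if (*remaining > 0) {
        (*remaining)--;
        return 1;
    }
    return 0;
}

static int always_fail(void *ctx) {
    (void)ctx;
    return 1;
}
```

Calling `retry_until_deadline(always_fail, NULL, 20)` spins for roughly 20 ms and then reports `RETRY_TIMEOUT`, which is the point at which a caller would raise its own "timed out" error.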
TODO: It may be a good idea to implement some small delay, so we don't
hit the cluster with lots of requests over and over until the cluster
comes back.
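
One possible shape for the delay mentioned in this TODO is a capped exponential backoff between attempts. The sketch below is only an assumption about how that could look, not something this commit implements: `backoff_delay_ms` and `sleep_ms` are hypothetical helpers, and the base and cap values are arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <time.h>

/* Delay for the given (0-based) retry attempt: base_ms doubled once per
 * attempt, never exceeding cap_ms. Doubling stops as soon as the cap is
 * reached, so large attempt counts cannot overflow. */
static long backoff_delay_ms(unsigned attempt, long base_ms, long cap_ms) {
    long delay = base_ms;
    while (attempt-- > 0 && delay < cap_ms) {
        delay *= 2;
    }
    return delay < cap_ms ? delay : cap_ms;
}

/* Sleep for `ms` milliseconds using POSIX nanosleep(). */
static void sleep_ms(long ms) {
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}
```

A retry loop could call `sleep_ms(backoff_delay_ms(attempt, 10, 200))` after each failed attempt, so the first retries are nearly immediate and later ones back off to at most 200 ms while the cluster recovers.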
Diffstat (limited to 'cluster_library.c')
-rw-r--r-- | cluster_library.c | 30 |
1 files changed, 27 insertions, 3 deletions
diff --git a/cluster_library.c b/cluster_library.c
index 09b34fc3..ee2a61aa 100644
--- a/cluster_library.c
+++ b/cluster_library.c
@@ -430,6 +430,18 @@ unsigned short cluster_hash_key(const char *key, int len) {
     return crc16((char*)key+s+1,e-s-1) & REDIS_CLUSTER_MOD;
 }
 
+/* Grab the current time in milliseconds */
+long long mstime(void) {
+    struct timeval tv;
+    long long mst;
+
+    gettimeofday(&tv, NULL);
+    mst = ((long long)tv.tv_sec)*1000;
+    mst += tv.tv_usec/1000;
+
+    return mst;
+}
+
 /* Hash a key from a ZVAL */
 unsigned short cluster_hash_key_zval(zval *z_key) {
     const char *kptr;
@@ -1260,9 +1272,15 @@ PHPAPI int cluster_send_slot(redisCluster *c, short slot, char *cmd,
 PHPAPI short cluster_send_command(redisCluster *c, short slot, const char *cmd,
                                   int cmd_len TSRMLS_DC)
 {
-    int resp;
+    int resp, timedout=0;
+    long msstart;
 
-    // Issue commands until we find the right node or fail
+    /* Grab the current time in milliseconds */
+    msstart = mstime();
+
+    /* Our main cluster request/reply loop. This loop runs until we're able
+     * to get a valid reply from a node, hit our "request" timeout, or encounter
+     * a CLUSTERDOWN state from Redis cluster. */
     do {
         // Send MULTI to the node if we haven't yet.
         if(c->flags->mode == MULTI && SLOT_SOCK(c,slot)->mode != MULTI) {
@@ -1309,13 +1327,19 @@ PHPAPI short cluster_send_command(redisCluster *c, short slot, const char *cmd,
             }
             slot = c->redir_slot;
         }
-    } while(resp != 0 && !c->clusterdown);
+
+        /* If we didn't get a valid response and we do have a timeout check it */
+        timedout = resp && c->waitms ? mstime() - msstart >= c->waitms : 0;
+    } while(resp != 0 && !c->clusterdown && !timedout);
 
     // If we've detected the cluster is down, throw an exception
     if(c->clusterdown) {
         zend_throw_exception(redis_cluster_exception_ce,
             "The Redis Cluster is down (CLUSTERDOWN)", 0 TSRMLS_CC);
         return -1;
+    } else if (timedout) {
+        zend_throw_exception(redis_cluster_exception_ce,
+            "Timed out attempting to find data in the correct node!", 0 TSRMLS_CC);
     }
 
     // Inform the cluster where to read the rest of our response,