Initial commit of formalized "redirection" timeout logic

When phpredis is communicating with a cluster, there are two different kinds of timeout events.

The first is the standard read or write timeout, where the socket blocks either because of network issues or because Redis is taking longer than the timeout to complete the request.

The second is unique to cluster. Because Redis Cluster attempts to fail over automatically (in the case of replicas), phpredis cluster will try to get data from the node where it thinks the key lives and, upon a failure to connect, try a different node (at random). It does this because Redis could be resharding and may point the client to a new (now good) node. However, if the cluster has not yet detected a failure, it will just bounce us back to the prior node (which could actually be down, or could have just sputtered due to various issues).

So in this case, phpredis uses a second timeout mechanism: we keep track (in milliseconds) of when we entered the query/response loop. Once we have been unsuccessful for the length of this timeout, phpredis will abort with a different (catchable) exception.

TODO: It may be a good idea to implement some small delay between attempts, so we don't hit the cluster with requests over and over until it comes back.
author: michael-grunder <michael.grunder@gmail.com> 2014-12-02 08:06:31 +0300
committer: michael-grunder <michael.grunder@gmail.com> 2015-05-06 01:05:30 +0300
commit: 48e6e67a8286ac2aab95132763bc78af508b9e90 (patch)
tree: 31bd21f87dc95e80ad57666ae035eb5f05152d7c /cluster_library.c
parent: d804342a6f8258aa25147eeacbd4e3b22fa4faa6 (diff)
1 files changed, 27 insertions, 3 deletions
diff --git a/cluster_library.c b/cluster_library.c
index 09b34fc3..ee2a61aa 100644
--- a/cluster_library.c
+++ b/cluster_library.c
@@ -430,6 +430,18 @@ unsigned short cluster_hash_key(const char *key, int len) {
     return crc16((char*)key+s+1,e-s-1) & REDIS_CLUSTER_MOD;
 }
 
+/* Grab the current time in milliseconds */
+long long mstime(void) {
+    struct timeval tv;
+    long long mst;
+
+    gettimeofday(&tv, NULL);
+    mst = ((long long)tv.tv_sec)*1000;
+    mst += tv.tv_usec/1000;
+
+    return mst;
+}
+
 /* Hash a key from a ZVAL */
 unsigned short cluster_hash_key_zval(zval *z_key) {
     const char *kptr;
@@ -1260,9 +1272,15 @@ PHPAPI int cluster_send_slot(redisCluster *c, short slot, char *cmd,
 PHPAPI short cluster_send_command(redisCluster *c, short slot, const char *cmd, 
                                   int cmd_len TSRMLS_DC)
 {
-    int resp;
+    int resp, timedout=0;
+    long long msstart;
 
-    // Issue commands until we find the right node or fail
+    /* Grab the current time in milliseconds */
+    msstart = mstime();
+
+    /* Our main cluster request/reply loop.  This loop runs until we're able
+     * to get a valid reply from a node, hit our "request" timeout, or encounter
+     * a CLUSTERDOWN state from Redis cluster. */
     do {
         // Send MULTI to the node if we haven't yet.
         if(c->flags->mode == MULTI && SLOT_SOCK(c,slot)->mode != MULTI) {
@@ -1309,13 +1327,19 @@ PHPAPI short cluster_send_command(redisCluster *c, short slot, const char *cmd,
             }
             slot = c->redir_slot;
         }
-    } while(resp != 0 && !c->clusterdown);
+
+        /* If we didn't get a valid response and we do have a timeout check it */
+        timedout = resp && c->waitms ? mstime() - msstart >= c->waitms : 0;
+    } while(resp != 0 && !c->clusterdown && !timedout);
 
     // If we've detected the cluster is down, throw an exception
     if(c->clusterdown) {
         zend_throw_exception(redis_cluster_exception_ce,
             "The Redis Cluster is down (CLUSTERDOWN)", 0 TSRMLS_CC);
         return -1;
+    } else if (timedout) {
+        zend_throw_exception(redis_cluster_exception_ce,
+            "Timed out attempting to find data in the correct node!", 0 TSRMLS_CC);
     }
 
     // Inform the cluster where to read the rest of our response,
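
The TODO in the commit message suggests a small delay between attempts so the loop does not hammer the cluster while it recovers. The following is a minimal, standalone sketch of that idea and is not phpredis code: `retry_until_deadline`, `attempt_fn`, and `fail_n_times` are hypothetical names, the doubling backoff with a cap is an assumed policy, and the clock and sleep calls (`clock_gettime`, `nanosleep`) are POSIX stand-ins for whatever the extension would actually use. As in the diff, a wait of 0 disables the deadline.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns 0 on a valid reply, non-zero to retry. */
typedef int (*attempt_fn)(void *ctx);

/* Current time in milliseconds from a monotonic clock (assumption: POSIX). */
static long long now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Sleep for the given number of milliseconds. */
static void sleep_ms(long ms) {
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
}

/* Retry `attempt` until it succeeds or `waitms` milliseconds have elapsed.
 * Returns 0 on success and -1 on timeout.  A waitms of 0 means no deadline.
 * The delay doubles after each failure, capped at max_delay_ms; the final
 * sleep may overshoot the deadline by up to one delay. */
int retry_until_deadline(attempt_fn attempt, void *ctx, long long waitms,
                         long first_delay_ms, long max_delay_ms)
{
    long long start = now_ms();
    long delay = first_delay_ms;

    assert(attempt != NULL);

    while (attempt(ctx) != 0) {
        if (waitms > 0 && now_ms() - start >= waitms) {
            return -1;
        }
        sleep_ms(delay);
        delay = delay * 2 > max_delay_ms ? max_delay_ms : delay * 2;
    }
    return 0;
}

/* Demo callback: fails until the counter pointed to by ctx reaches zero. */
int fail_n_times(void *ctx) {
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return 1;
    }
    return 0;
}
```

Calling `retry_until_deadline(fail_n_times, &n, 1000, 1, 8)` with `n = 3` sleeps 1, 2, and 4 ms between the failed attempts and then succeeds. With a counter that never reaches zero inside the window, the call returns -1 once the deadline passes, which is the point where the diff's loop sets `timedout`.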
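
The `mstime()` helper in the diff reads wall-clock time through `gettimeofday()`, which can jump if the system clock is adjusted while the loop is running. A possible alternative, sketched here under the assumption of a POSIX system with `CLOCK_MONOTONIC`, measures elapsed time from a monotonic clock instead. `mstime_monotonic` and `deadline_passed` are hypothetical names and are not part of phpredis; `deadline_passed` mirrors the `c->waitms` check from the diff, where 0 disables the timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical alternative to mstime(): milliseconds from a monotonic clock.
 * The value is only meaningful as a difference between two readings.
 * Returns -1 if the clock cannot be read. */
long long mstime_monotonic(void) {
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
        return -1;
    }
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Returns 1 once at least waitms milliseconds separate start_ms and now_ms.
 * A waitms of 0 disables the check, so the result is always 0. */
int deadline_passed(long long start_ms, long long now_ms, long long waitms) {
    return waitms > 0 && now_ms - start_ms >= waitms;
}
```

Both timestamps are kept as `long long` so that the subtraction does not truncate on platforms where `long` is 32 bits wide.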