最近在研究"一致性HASH算法"(Consistent Hashing),用于解決memcached集群中當(dāng)服務(wù)器出現(xiàn)增減變動時對散列值的影響。后來 在JAVAEYE上的一篇文章中,找到了其中的 KetamaHash 算法的JAVA實(shí)現(xiàn)(一種基于虛擬結(jié)點(diǎn)的HASH算法),于是為了加深理解,對照 JAVA版本,用C#重寫了一個。放到這里,如果大家感興趣的話, 可以下載測試一下,如果發(fā)現(xiàn)寫法有問題請及時告之我,以便我及時修正。
????? 下面是對Ketama的介紹:
Ketama is an implementation of a consistent hashing algorithm, meaning you can add or remove servers from the memcached pool without causing a complete remap of all keys.
Here’s how it works:
* Take your list of servers (eg: 1.2.3.4:11211, 5.6.7.8:11211, 9.8.7.6:11211)
* Hash each server string to several (100-200) unsigned ints
* Conceptually, these numbers are placed on a circle called the continuum. (imagine a clock face that goes from 0 to 2^32)
* Each number links to the server it was hashed from, so servers appear at several points on the continuum, by each of the numbers they hashed to.
* To map a key->server, hash your key to a single unsigned int, and find the next biggest number on the continuum. The server linked to that number is the correct server for that key.
* If you hash your key to a value near 2^32 and there are no points on the continuum greater than your hash, return the first server in the continuum.
If you then add or remove a server from the list, only a small proportion of keys end up mapping to different servers.
????
????? 我的理解,這類方法其實(shí)有點(diǎn)像大學(xué)里的微積分的思想(通過連續(xù)函數(shù)的取值區(qū)間來計算圖形面積)。通過把已知的實(shí)結(jié)點(diǎn)(memcached服務(wù)IP端口)列表結(jié)成一個圓,然后在兩兩實(shí)結(jié)點(diǎn)之間“放置”盡可能多的虛擬節(jié)點(diǎn)(上面文中的unsigned ints), 假設(shè)用戶數(shù)據(jù)映射在虛擬節(jié)點(diǎn)上(用戶數(shù)據(jù)真正存儲位置是在該虛擬節(jié)點(diǎn)代表的實(shí)際物理服務(wù)器上),這樣就能抑制分布不均勻,最大限度地減小服務(wù)器增減時的緩存重新分布。如下圖:
???
????????
???? 下面是添加結(jié)點(diǎn)時:
???
????????
????????
???
????? 如這篇文章所說(總結(jié)一致性哈希(Consistent Hashing) ):
???
????? Consistent Hashing最大限度地抑制了hash鍵的重新分布。另外要取得比較好的負(fù)載均衡的效果,往往在服務(wù)器數(shù)量比較少的時候需要增加虛擬節(jié)點(diǎn)來保證服務(wù)器能均勻的分布在圓環(huán)上。因?yàn)槭褂靡话愕膆ash方法,服務(wù)器的映射地點(diǎn)的分布非常不均勻。使用虛擬節(jié)點(diǎn)的思想,為每個物理節(jié)點(diǎn)(服務(wù)器)在圓上分配100~200個點(diǎn)。這樣就能抑制分布不均勻,最大限度地減小服務(wù)器增減時的緩存重新分布。用戶數(shù)據(jù)映射在虛擬節(jié)點(diǎn)上,就表示用戶數(shù)據(jù)真正存儲位置是在該虛擬節(jié)點(diǎn)代表的實(shí)際物理服務(wù)器上。
?????? 了解了原理,下面看一下具體實(shí)現(xiàn)。
?????? JAVA實(shí)現(xiàn)代碼取自Spy Memcached client中的類,原文的作者也盡可能的對代碼進(jìn)行注釋說明,所以讓我剩了不少時間。
???
?????? 下面是相應(yīng)的.NET實(shí)現(xiàn)(注釋參考JAVA版本):???
public class KetamaNodeLocator
{
??? //原文中的JAVA類TreeMap實(shí)現(xiàn)了Comparator方法,這里我圖省事,直接用了net下的SortedList,其中Comparer接口方法)
??? private SortedList<long, string> ketamaNodes = new SortedList<long, string>();
??? private HashAlgorithm hashAlg;
??? private int numReps = 160;
??? //此處參數(shù)與JAVA版中有區(qū)別,因?yàn)槭褂玫撵o態(tài)方法,所以不再傳遞HashAlgorithm alg參數(shù)
??? public KetamaNodeLocator(List<string> nodes, int nodeCopies)
??? {
??????? ketamaNodes = new SortedList<long, string>();
??????? numReps = nodeCopies;
??????? //對所有節(jié)點(diǎn),生成nCopies個虛擬結(jié)點(diǎn)
??????? foreach (string node in nodes)
??????? {
??????????? //每四個虛擬結(jié)點(diǎn)為一組
??????????? for (int i = 0; i < numReps / 4; i++)
??????????? {
??????????????? //getKeyForNode方法為這組虛擬結(jié)點(diǎn)得到惟一名稱
??????????????? byte[] digest = HashAlgorithm.computeMd5(node + i);
??????????????? /** Md5是一個16字節(jié)長度的數(shù)組,將16字節(jié)的數(shù)組每四個字節(jié)一組,分別對應(yīng)一個虛擬結(jié)點(diǎn),這就是為什么上面把虛擬結(jié)點(diǎn)四個劃分一組的原因*/?
??????????????? for (int h = 0; h < 4; h++)
??????????????? {
??????????????????? long m = HashAlgorithm.hash(digest, h);
??????????????????? ketamaNodes[m] = node;
??????????????? }
??????????? }
??????? }
??? }
??? public string GetPrimary(string k)
??? {
??????? byte[] digest = HashAlgorithm.computeMd5(k);
??????? string rv = GetNodeForKey(HashAlgorithm.hash(digest, 0));
??????? return rv;
??? }
??? string GetNodeForKey(long hash)
??? {
??????? string rv;
??????? long key = hash;
??????? //如果找到這個節(jié)點(diǎn),直接取節(jié)點(diǎn),返回??
??????? if (!ketamaNodes.ContainsKey(key))
??????? {
??????????? //得到大于當(dāng)前key的那個子Map,然后從中取出第一個key,就是大于且離它最近的那個key 說明詳見: http://www.iteye.com/topic/684087
??????????? var tailMap = from coll in ketamaNodes
????????????????????????? where coll.Key > hash
????????????????????????? select new { coll.Key };
??????????? if (tailMap == null || tailMap.Count() == 0)
??????????????? key = ketamaNodes.FirstOrDefault().Key;
??????????? else
??????????????? key = tailMap.FirstOrDefault().Key;
??????? }
??????? rv = ketamaNodes[key];
??????? return rv;
??? }
}
???
???
????? 下面的代碼與JAVA中有所不同,它使用靜態(tài)方法實(shí)現(xiàn):???
public class HashAlgorithm
{
??? public static long hash(byte[] digest, int nTime)
??? {
??????? long rv = ((long)(digest[3 + nTime * 4] & 0xFF) << 24)
??????????????? | ((long)(digest[2 + nTime * 4] & 0xFF) << 16)
??????????????? | ((long)(digest[1 + nTime * 4] & 0xFF) <<
??????????????? | ((long)digest[0 + nTime * 4] & 0xFF);
??????? return rv & 0xffffffffL; /* Truncate to 32-bits */
??? }
??? /**
???? * Get the md5 of the given key.
???? */
??? public static byte[] computeMd5(string k)
??? {
??????? MD5 md5 = new MD5CryptoServiceProvider();
??????
??????? byte[] keyBytes = md5.ComputeHash(Encoding.UTF8.GetBytes(k));
??????? md5.Clear();
??????? //md5.update(keyBytes);
??????? //return md5.digest();
??????? return keyBytes;
??? }
}
???
???
?????? 下面是.net版本下的測試結(jié)果
???
?????? 分布平均性測試:測試隨機(jī)生成的眾多key是否會平均分布到各個結(jié)點(diǎn)上測試結(jié)果如下:????
???
????????
???
????? 最上面一行是參數(shù)說明,節(jié)點(diǎn)數(shù)目,總共有多少key,每個節(jié)點(diǎn)應(yīng)該分配key的比例是多少。下面是每個結(jié)點(diǎn)分配到key的數(shù)目和比例。 多次測試后發(fā)現(xiàn),這個Hash算法的節(jié)點(diǎn)分布都在標(biāo)準(zhǔn)比例左右徘徊。
????? 節(jié)點(diǎn)增刪測試:在環(huán)上插入N個結(jié)點(diǎn),每個節(jié)點(diǎn)nCopies個虛擬結(jié)點(diǎn)。隨機(jī)生成眾多key,在增刪節(jié)點(diǎn)時,測試同一個key選擇相同節(jié)點(diǎn)的概率,測試如果如下:
???????????
????? 上面三行分別是正常情況,節(jié)點(diǎn)增加,節(jié)點(diǎn)刪除情況下的節(jié)點(diǎn)數(shù)目。下面兩行表示在節(jié)點(diǎn)增加和刪除情況下,同一個key分配在相同節(jié)點(diǎn)上的比例(命中率)。
????? 后來我修改了幾次增刪結(jié)點(diǎn)的數(shù)量,基本驗(yàn)證了JAVA那位仁兄所說的那樣:????
????? 多次測試后發(fā)現(xiàn),命中率與結(jié)點(diǎn)數(shù)目和增減的節(jié)點(diǎn)數(shù)量有關(guān)。同樣增刪結(jié)點(diǎn)數(shù)目情況下,結(jié)點(diǎn)多時命中率高。同樣節(jié)點(diǎn)數(shù)目,增刪結(jié)點(diǎn)越少,命中率越高。這些都與實(shí)際情況相符。
???? 這里還有一些鏈接,都是介紹和討論Consistent Hashing的,有興趣的朋友可以看一下,呵呵:)
???
???? 總結(jié)一致性哈希(Consistent Hashing)
???
???? Ketama一致性Hash算法學(xué)習(xí)(含Java代碼)?????
????? 下面是對Ketama的介紹:
Ketama is an implementation of a consistent hashing algorithm, meaning you can add or remove servers from the memcached pool without causing a complete remap of all keys.
Here’s how it works:
* Take your list of servers (eg: 1.2.3.4:11211, 5.6.7.8:11211, 9.8.7.6:11211)
* Hash each server string to several (100-200) unsigned ints
* Conceptually, these numbers are placed on a circle called the continuum. (imagine a clock face that goes from 0 to 2^32)
* Each number links to the server it was hashed from, so servers appear at several points on the continuum, by each of the numbers they hashed to.
* To map a key->server, hash your key to a single unsigned int, and find the next biggest number on the continuum. The server linked to that number is the correct server for that key.
* If you hash your key to a value near 2^32 and there are no points on the continuum greater than your hash, return the first server in the continuum.
If you then add or remove a server from the list, only a small proportion of keys end up mapping to different servers.
????
????? 我的理解,這類方法其實(shí)有點(diǎn)像大學(xué)里的微積分的思想(通過連續(xù)函數(shù)的取值區(qū)間來計算圖形面積)。通過把已知的實(shí)結(jié)點(diǎn)(memcached服務(wù)IP端口)列表結(jié)成一個圓,然后在兩兩實(shí)結(jié)點(diǎn)之間“放置”盡可能多的虛擬節(jié)點(diǎn)(上面文中的unsigned ints), 假設(shè)用戶數(shù)據(jù)映射在虛擬節(jié)點(diǎn)上(用戶數(shù)據(jù)真正存儲位置是在該虛擬節(jié)點(diǎn)代表的實(shí)際物理服務(wù)器上),這樣就能抑制分布不均勻,最大限度地減小服務(wù)器增減時的緩存重新分布。如下圖:
???
????????
???? 下面是添加結(jié)點(diǎn)時:
???
????????
????????
???
????? 如這篇文章所說(總結(jié)一致性哈希(Consistent Hashing) ):
???
????? Consistent Hashing最大限度地抑制了hash鍵的重新分布。另外要取得比較好的負(fù)載均衡的效果,往往在服務(wù)器數(shù)量比較少的時候需要增加虛擬節(jié)點(diǎn)來保證服務(wù)器能均勻的分布在圓環(huán)上。因?yàn)槭褂靡话愕膆ash方法,服務(wù)器的映射地點(diǎn)的分布非常不均勻。使用虛擬節(jié)點(diǎn)的思想,為每個物理節(jié)點(diǎn)(服務(wù)器)在圓上分配100~200個點(diǎn)。這樣就能抑制分布不均勻,最大限度地減小服務(wù)器增減時的緩存重新分布。用戶數(shù)據(jù)映射在虛擬節(jié)點(diǎn)上,就表示用戶數(shù)據(jù)真正存儲位置是在該虛擬節(jié)點(diǎn)代表的實(shí)際物理服務(wù)器上。
?????? 了解了原理,下面看一下具體實(shí)現(xiàn)。
?????? JAVA實(shí)現(xiàn)代碼取自Spy Memcached client中的類,原文的作者也盡可能的對代碼進(jìn)行注釋說明,所以讓我剩了不少時間。
???
?????? 下面是相應(yīng)的.NET實(shí)現(xiàn)(注釋參考JAVA版本):???
public class KetamaNodeLocator
{
??? //原文中的JAVA類TreeMap實(shí)現(xiàn)了Comparator方法,這里我圖省事,直接用了net下的SortedList,其中Comparer接口方法)
??? private SortedList<long, string> ketamaNodes = new SortedList<long, string>();
??? private HashAlgorithm hashAlg;
??? private int numReps = 160;
??? //此處參數(shù)與JAVA版中有區(qū)別,因?yàn)槭褂玫撵o態(tài)方法,所以不再傳遞HashAlgorithm alg參數(shù)
??? public KetamaNodeLocator(List<string> nodes, int nodeCopies)
??? {
??????? ketamaNodes = new SortedList<long, string>();
??????? numReps = nodeCopies;
??????? //對所有節(jié)點(diǎn),生成nCopies個虛擬結(jié)點(diǎn)
??????? foreach (string node in nodes)
??????? {
??????????? //每四個虛擬結(jié)點(diǎn)為一組
??????????? for (int i = 0; i < numReps / 4; i++)
??????????? {
??????????????? //getKeyForNode方法為這組虛擬結(jié)點(diǎn)得到惟一名稱
??????????????? byte[] digest = HashAlgorithm.computeMd5(node + i);
??????????????? /** Md5是一個16字節(jié)長度的數(shù)組,將16字節(jié)的數(shù)組每四個字節(jié)一組,分別對應(yīng)一個虛擬結(jié)點(diǎn),這就是為什么上面把虛擬結(jié)點(diǎn)四個劃分一組的原因*/?
??????????????? for (int h = 0; h < 4; h++)
??????????????? {
??????????????????? long m = HashAlgorithm.hash(digest, h);
??????????????????? ketamaNodes[m] = node;
??????????????? }
??????????? }
??????? }
??? }
??? public string GetPrimary(string k)
??? {
??????? byte[] digest = HashAlgorithm.computeMd5(k);
??????? string rv = GetNodeForKey(HashAlgorithm.hash(digest, 0));
??????? return rv;
??? }
??? string GetNodeForKey(long hash)
??? {
??????? string rv;
??????? long key = hash;
??????? //如果找到這個節(jié)點(diǎn),直接取節(jié)點(diǎn),返回??
??????? if (!ketamaNodes.ContainsKey(key))
??????? {
??????????? //得到大于當(dāng)前key的那個子Map,然后從中取出第一個key,就是大于且離它最近的那個key 說明詳見: http://www.iteye.com/topic/684087
??????????? var tailMap = from coll in ketamaNodes
????????????????????????? where coll.Key > hash
????????????????????????? select new { coll.Key };
??????????? if (tailMap == null || tailMap.Count() == 0)
??????????????? key = ketamaNodes.FirstOrDefault().Key;
??????????? else
??????????????? key = tailMap.FirstOrDefault().Key;
??????? }
??????? rv = ketamaNodes[key];
??????? return rv;
??? }
}
???
???
????? 下面的代碼與JAVA中有所不同,它使用靜態(tài)方法實(shí)現(xiàn):???
public class HashAlgorithm
{
??? public static long hash(byte[] digest, int nTime)
??? {
??????? long rv = ((long)(digest[3 + nTime * 4] & 0xFF) << 24)
??????????????? | ((long)(digest[2 + nTime * 4] & 0xFF) << 16)
??????????????? | ((long)(digest[1 + nTime * 4] & 0xFF) <<

??????????????? | ((long)digest[0 + nTime * 4] & 0xFF);
??????? return rv & 0xffffffffL; /* Truncate to 32-bits */
??? }
??? /**
???? * Get the md5 of the given key.
???? */
??? public static byte[] computeMd5(string k)
??? {
??????? MD5 md5 = new MD5CryptoServiceProvider();
??????
??????? byte[] keyBytes = md5.ComputeHash(Encoding.UTF8.GetBytes(k));
??????? md5.Clear();
??????? //md5.update(keyBytes);
??????? //return md5.digest();
??????? return keyBytes;
??? }
}
???
???
?????? 下面是.net版本下的測試結(jié)果
???
?????? 分布平均性測試:測試隨機(jī)生成的眾多key是否會平均分布到各個結(jié)點(diǎn)上測試結(jié)果如下:????
???
????????
???
????? 最上面一行是參數(shù)說明,節(jié)點(diǎn)數(shù)目,總共有多少key,每個節(jié)點(diǎn)應(yīng)該分配key的比例是多少。下面是每個結(jié)點(diǎn)分配到key的數(shù)目和比例。 多次測試后發(fā)現(xiàn),這個Hash算法的節(jié)點(diǎn)分布都在標(biāo)準(zhǔn)比例左右徘徊。
????? 節(jié)點(diǎn)增刪測試:在環(huán)上插入N個結(jié)點(diǎn),每個節(jié)點(diǎn)nCopies個虛擬結(jié)點(diǎn)。隨機(jī)生成眾多key,在增刪節(jié)點(diǎn)時,測試同一個key選擇相同節(jié)點(diǎn)的概率,測試如果如下:
???????????
????? 上面三行分別是正常情況,節(jié)點(diǎn)增加,節(jié)點(diǎn)刪除情況下的節(jié)點(diǎn)數(shù)目。下面兩行表示在節(jié)點(diǎn)增加和刪除情況下,同一個key分配在相同節(jié)點(diǎn)上的比例(命中率)。
????? 后來我修改了幾次增刪結(jié)點(diǎn)的數(shù)量,基本驗(yàn)證了JAVA那位仁兄所說的那樣:????
????? 多次測試后發(fā)現(xiàn),命中率與結(jié)點(diǎn)數(shù)目和增減的節(jié)點(diǎn)數(shù)量有關(guān)。同樣增刪結(jié)點(diǎn)數(shù)目情況下,結(jié)點(diǎn)多時命中率高。同樣節(jié)點(diǎn)數(shù)目,增刪結(jié)點(diǎn)越少,命中率越高。這些都與實(shí)際情況相符。
???? 這里還有一些鏈接,都是介紹和討論Consistent Hashing的,有興趣的朋友可以看一下,呵呵:)
???
???? 總結(jié)一致性哈希(Consistent Hashing)
???
???? Ketama一致性Hash算法學(xué)習(xí)(含Java代碼)?????
更多文章、技術(shù)交流、商務(wù)合作、聯(lián)系博主
微信掃碼或搜索:z360901061

微信掃一掃加我為好友
QQ號聯(lián)系: 360901061
您的支持是博主寫作最大的動力,如果您喜歡我的文章,感覺我的文章對您有幫助,請用微信掃描下面二維碼支持博主2元、5元、10元、20元等您想捐的金額吧,狠狠點(diǎn)擊下面給點(diǎn)支持吧,站長非常感激您!手機(jī)微信長按不能支付解決辦法:請將微信支付二維碼保存到相冊,切換到微信,然后點(diǎn)擊微信右上角掃一掃功能,選擇支付二維碼完成支付。
【本文對您有幫助就好】元
