If keys are small integers, we can use an array to implement a symbol table, by interpreting the key as an array index so that we can store the value associated with key i in array position i.
Search algorithms that use hashing consist of two separate parts. The first step is to compute a hash function that transforms the search into an array index. Ideally, different keys would map to different indices. This ideal is generally beyond our reason, so we have to face the possibility that two or more different keys may hash to the same array index. Thus, the second part of a hashing search is a collision-resolution process that deals with this situation.
Hash Functions. If we have an array that can hold M key-value pairs, then we need a function that can transform any given key into an index into that array:an integer in the range[0,M-1]. We seek a hash function that is both easy to compute and uniformly distributes the keys.
The most commonly used method for hashing integers is called modular hashing. We choose the array size M to be prime, and for any positive integer key k, compute the remainder when diving k by M. This function is very easy to compute(k%M, in Java), and is effective in dispering the keys evenly between 0 and M-1.
Modular hashing works for long keys such as strings. too: we simply treat them as huge integers. For example, the code below computes a modular hash function for a String s.
Codes:
int hash = 0;
for (int i = 0; i < s.length(); i++)
{
hash = (R*h + s.chartAt(i)) %M;
}
Java Conventions. Java helps us address the basic problem that every type of data needs a hash function by every data type must implement a method called hashCode()(which returns a 32-bit integer). The implementation of hashCode() for an object must be consistent with equals. That is, is a.equals(b) is true, then a.hashCode() must have the same numerical value as b.hashCode(). If the hashCode() values are the same, the objects may or may not be equals, and we must use equals() to decide which condition holds.
Search algorithms that use hashing consist of two separate parts. The first step is to compute a hash function that transforms the search into an array index. Ideally, different keys would map to different indices. This ideal is generally beyond our reason, so we have to face the possibility that two or more different keys may hash to the same array index. Thus, the second part of a hashing search is a collision-resolution process that deals with this situation.
Hash Functions. If we have an array that can hold M key-value pairs, then we need a function that can transform any given key into an index into that array:an integer in the range[0,M-1]. We seek a hash function that is both easy to compute and uniformly distributes the keys.
The most commonly used method for hashing integers is called modular hashing. We choose the array size M to be prime, and for any positive integer key k, compute the remainder when diving k by M. This function is very easy to compute(k%M, in Java), and is effective in dispering the keys evenly between 0 and M-1.
Modular hashing works for long keys such as strings. too: we simply treat them as huge integers. For example, the code below computes a modular hash function for a String s.
Codes:
int hash = 0;
for (int i = 0; i < s.length(); i++)
{
hash = (R*h + s.chartAt(i)) %M;
}
Java Conventions. Java helps us address the basic problem that every type of data needs a hash function by every data type must implement a method called hashCode()(which returns a 32-bit integer). The implementation of hashCode() for an object must be consistent with equals. That is, is a.equals(b) is true, then a.hashCode() must have the same numerical value as b.hashCode(). If the hashCode() values are the same, the objects may or may not be equals, and we must use equals() to decide which condition holds.
Comments
Post a Comment