Java HashMap工作原理：不仅仅是HashMap

V2AS问路

Java HashMap工作原理：不仅仅是HashMap

阅读原文时间：2023年07月09日阅读：1

前言：

几乎所有java程序员都用过hashMap，但会用不一定会说。

近年来hashMap是非常常见的面试题，如何为自己的回答加分？需要从理解开始。

“你用过hashMap吗？”，“怎么用的？”，“为什么要用？”

这些问题是面试中经常碰到的，相信大部分的猿哥都能回答，“用过”，然后回答hashMap的一些特性，如HashMap是以键值对存储，可以接受null键值和值，很快但不安全等等。但是后面的问题来了：

“你知道HashMap的工作原理吗？”，“他的get()方法是怎么实现的？”

到这里有些猿哥可能说不出来，有些知道原理但不能自信的表达。

正题：

hashMap原理，为什么说不仅仅是hashMap，因为我们需要从其他的知识点切入。

（1）不仅仅是hashMap：==和equals

在 Java 中比较两个对象是否相等主要是通过 ==号，比较的是他们的内存地址。而equals()方法则是继承自Object类中的，默认也是通过==来实现的。所以在没有重写Object的equals()方法的前提下，比较两个对象使用==和equals将会返回同样的结果。

前面了解到==比较的是内存地址，那么我只是想比较值是否相同该怎么办？

这里我们用常用String类来举例，例如比较如下两个字符串是否相同String a = "Hello" 和 String b = new String("Hello")，这里的相同有两种情形，是要比较 a 和 b 是否是同一个对象（内存地址是否相同），还是比较它们的内容是否相等？这个具体需要怎么区分呢？

如果使用 == 那么就是比较它们在内存中是否是同一个对象，但是 String 对象的默认父类也是 Object，所以默认的equals方法比较的也是内存地址，所以我们要重写 equals方法，正如 String 源码中所写的那样。

public boolean equals(Object anObject) {
if (this == anObject) {
return true;
}
if (anObject instanceof String) {
String anotherString = (String)anObject;
int n = value.length;
if (n == anotherString.value.length) {
char v1[] = value;
char v2[] = anotherString.value;
int i = 0;
while (n-- != 0) {
if (v1[i] != v2[i])
return false;
i++;
}
return true;
}
}
return false;
}

这样当我们 a == b时是判断 a 和 b 是否是同一个对象，a.equals(b)则是比较 a 和 b 的内容是否相同，这应该很好理解。当然java中还有很多类都重写的equals()方法，这里不赘述。

（2）不仅仅是hashMap：hashCode()

简单了解：hashCode()方法返回对象的哈希码值，它的实现主要是为了提高哈希表(例如java.util.Hashtable提供的哈希表)的性能。Object类中默认定义了该方法为不同的对象返回不同的整形值，有人说这个值就是由类的内存地址转换而来，但是也有人证实它仅仅只是一个随机数，whatever！我们为什么要知道类的内存地址呢，是吧？

Object 类的 hashCode方法有一个通用约定，从注释中可以看出：

/**
* Returns a hash code value for the object. This method is
* supported for the benefit of hash tables such as those provided by
* {@link java.util.HashMap}.
*

* The general contract of {@code hashCode} is:
*

Whenever it is invoked on the same object more than once during * an execution of a Java application, the {@code hashCode} method * must consistently return the same integer, provided no information * used in {@code equals} comparisons on the object is modified. * This integer need not remain consistent from one execution of an * application to another execution of the same application. *
If two objects are equal according to the {@code equals(Object)} * method, then calling the {@code hashCode} method on each of * the two objects must produce the same integer result. *
It is not required that if two objects are unequal * according to the {@link java.lang.Object#equals(java.lang.Object)} * method, then calling the {@code hashCode} method on each of the * two objects must produce distinct integer results. However, the * programmer should be aware that producing distinct integer results * for unequal objects may improve the performance of hash tables. *

* As much as is reasonably practical, the hashCode method defined by
* class {@code Object} does return distinct integers for distinct
* objects. (This is typically implemented by converting the internal
* address of the object into an integer, but this implementation
* technique is not required by the
* Java™ programming language.)
*
* @return a hash code value for this object.
* @see java.lang.Object#equals(java.lang.Object)
* @see java.lang.System#identityHashCode
*/
public native int hashCode();

大概意思就是：

在应用程序的执行期间，只要对象的 equals 方法的比较操作所用到的信息没有被修改，那么对同一个对象的多次调用，hashCode 方法都必须始终返回同一个值。

如果两个对象根据 equals 方法比较是相等的，那么调用这两个对象中的 hashCode 方法都必须产生同样的整数结果。

如果两个对象根据 equals 方法比较是不相等的，那么调用者两个对象中的 hashCode 方法，则不一定要求 hashCode 方法必须产生不同的结果。但是给不相等的对象产生不同的整数散列值，是有可能提高散列表（hash table）的性能。

所以，为了满足上面第二点，当我们重写对象equals()方法时也需要重写 hashCode()方法？

重点了解：为什么重写对象equals()方法时也需要重写 hashCode()方法？举例！

我们自定义一个学生类并重写equals()方法，但不重写hashCode()方法，然后比较两个对象，如下代码

public class Student {

private String name;

private String age;

public Student(String name,String age){  
    this.name = name;  
    this.age = age;  
}

@Override  
public boolean equals(Object anObject) {  
    if (this == anObject) {  
        return true;  
    }  
    if (anObject instanceof Student) {  
        Student anotherStudent = (Student) anObject;

        if (this.getName() == anotherStudent.getName()  
                || this.getAge() == anotherStudent.getAge())  
            return true;  
    }  
    return false;  
}

public static void main(String[] args){
Student student1 = new Student("小明", "12岁");
Student student2 = new Student("小明", "12岁");

    System.out.println("equals结果：" + student1.equals(student2));  
    System.out.println("对象1的散列值：" + student1.hashCode() + "，对象2的散列值：" + student2.hashCode());  
}

equals结果：true
对象1的散列值：1555009629，对象2的散列值：41359092

Process finished with exit code 0

很明显，结果违背了第二条约定。试想String类如果只是重写equals方法而不重写hashCode方法会发生什么？在接下来的hashMap中回答这个问题。

HashMap的工作原理

HashMap基于hashing原理，我们通过put()和get()方法储存和获取对象。当我们将键值对传递给put()方法时，它调用键对象的hashCode()方法来计算hashcode，然后找到bucket位置来储存值对象。HashMap使用链表来解决碰撞问题，当发生碰撞了，对象将会储存在链表的下一个节点中。 HashMap在每个链表节点中储存键值对对象。

当两个不同的键对象的hashcode相同时会发生什么？它们会储存在同一个bucket位置的链表中。

上图便是我们通常说的hashMap是以数组+链表的形式存储数据，其中每一个方框代表一个Entry实体，Entry类实际上是一个单向的链表结构，它具有Next指针，可以连接下一个Entry实体，以此来解决Hash冲突的问题。所以在bucket中即存储了value也存储了key。

当我们用get()方法获取值时，首先调用key对象的hashCode方法找到bucket位置,如果bucket上存储了多个对象，则会遍历链表通过equals方法找到对应的key所对应的value返回。

如果String类只是重写equals方法而不重写hashCode方法

那么当使用new String("s") 作为 Key 然后 put 一个值，但是再根据 new String("s") 去 Get 的时候就会得到一个null的结果，因为他们的hashCode不相同，显然这不是我们想看到的，所以String对象重写equals()方法时也需要重写 hashCode()方法。同时这也是为什么我们经常用String作为key的原因，因为String类是final的，这样可以保证它的hashCode不会改变。

总结：

当我们理解了==，equals以及hashCode后，理解hashMap的get()方法的原理应该是分分钟的事了。而对于put()方法，我们需要理解它是通过计算key的hashCode将Entry实体存储到对应bucket位置，实体包括hashCode,key,value,next。如果同一个位置发生碰撞就存储在链表的下一个节点。

当然很多面试中还会涉及比如比较hashMap和hashTable之类的，这边简单说一个小技巧方便大家理解，像hashMap和hashTable的区别，ArrayList和Vector的区别的问题记住一点，常用的一般是快的，而快的原因是因为它是异步的，而异步就会有线程安全问题。所以hashMap和ArrayList效率相对高，异步且线程不安全。hashTable和Vector相率相对低，同步且线程安全。

好了，说到这里希望本文能让大家对java中的 ==，equals，hashCode，hashMap有一个比较全面的理解，在今后面试过程中能够从理解出发，自信表达。

手机扫一扫

移动阅读更方便

你可能感兴趣的文章

201971010131-张兴盼《面向对象程序设计（java）》第十二周学习总结

自己动手实现rpc框架(一) 实现点对点的rpc通信

Java stream 流

Lombok注解及其作用

Spring源码学习笔记13——总结篇, 从IOC到AOP