JVM source analysis: the Java object header
Date: July 18, 2021

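Anyone who has studied Java seriously knows that a Java object consists of three parts: the object header, the instance data, and alignment padding. The instance data is simply the data held in the object's fields. Alignment padding exists so that memory is allocated in regular units: the size of a Java object is always a multiple of 8 bytes, and since programmers can't be expected to manage that by hand, the JVM pads any object whose size is not a multiple of 8 bytes up to the next one. The object header records the hash value, the GC age, the lock state (a biased lock also records the thread ID), GC state and so on, and it also holds the pointer to the object's class, so it really is the core of the core. If you're interested, I wrote an introduction to objects here: https://www.cnblogs.com/gmt-hao/p/13817564.html. Next, let's dissect how the object header is implemented at the JVM level. As usual, code first.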
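The padding rule above can be sketched as a small rounding function. This is only an illustration, not HotSpot code: the class and method names are made up, and the 12-byte header assumes a 64-bit VM with compressed class pointers (8-byte mark word plus 4-byte klass pointer).

```java
// Sketch: rounding an object size up to the next multiple of 8 bytes.
// AlignSketch and alignUp are hypothetical names used only for this example.
final class AlignSketch {
    static final int ALIGNMENT = 8; // HotSpot's default object alignment, in bytes

    // Round size up to the next multiple of ALIGNMENT (which must be a power of two).
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) & -ALIGNMENT;
    }

    public static void main(String[] args) {
        // An empty object: 12 bytes of header, padded by 4 bytes.
        System.out.println(alignUp(12)); // 16
        System.out.println(alignUp(16)); // 16
        System.out.println(alignUp(17)); // 24
    }
}
```

This matches the JOL dumps later in the article, where an empty object reports 12 header bytes, 4 bytes lost to alignment, and an instance size of 16 bytes.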

Java is an object-oriented language, and the name of the VM's base object class is fittingly representative: oop. Let's open oop.hpp:

// oopDesc is the top baseclass for objects classes. The {name}Desc classes describe
// the format of Java objects so the fields can be accessed from C++.
// oopDesc is abstract.
// (see oopHierarchy for complete oop class hierarchy)
//
// no virtual functions allowed
… (omitted)
class oopDesc {
friend class VMStructs;
private:
volatile markOop _mark;
union _metadata {
Klass* _klass;
narrowKlass _compressed_klass;
} _metadata;

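Start with the comment: oopDesc is the top-level base class of all object classes. As for the next sentence, my reading is that the {name}Desc classes describe the layout of Java objects so that their fields can be accessed from C++. Now the fields defined below it: _mark is the mark word, and _metadata has two members, _klass and _compressed_klass. The former is an ordinary pointer and the latter a compressed one. Compressed class pointers are on by default on 64-bit JDK 1.8 and are controlled by -XX:+/-UseCompressedClassPointers; in JDK 8 they are also switched off when compressed oops are disabled with -XX:-UseCompressedOops. I won't go into detail here; just remember that either way it is the class pointer, pointing to the object's Klass. Now the comment on Klass: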

// A Klass provides:
// 1: language level class object (method dictionary etc.)
// 2: provide vm dispatch behavior for the object
// Both functions are combined into one C++ class.
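In other words, a Klass provides the language-level class object (method dictionary and so on) and the VM dispatch behavior for the object, and both roles are combined in one C++ class.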

// One reason for the oop/klass dichotomy in the implementation is
// that we don't want a C++ vtbl pointer in every object. Thus,
// normal oops don't have any virtual functions. Instead, they
// forward all "virtual" functions to their klass, which does have
// a vtbl and does the C++ dispatch depending on the object's
// actual type. (See oop.inline.hpp for some of the forwarding code.)
// ALL FUNCTIONS IMPLEMENTING THIS DISPATCH ARE PREFIXED WITH "oop_"!
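This passage roughly explains why the klass and the object itself are implemented as two separate parts: the authors don't want a C++ vtable pointer stored in every object, so ordinary oops have no virtual functions at all. Instead they forward every "virtual" call to their klass, which does have a vtable and performs the C++ dispatch according to the object's actual type.
Now it makes sense to me: this is polymorphism, and this is how it is played. At compile time the concrete function to call is not known; at run time the call goes through the klass, which knows the actual type and invokes the matching function.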
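The oop/klass split can be modeled with a toy example. Everything here is hypothetical (the class names, the string-keyed method table) and greatly simplified compared with HotSpot; the only point is that each object stores one reference to a shared per-class structure and forwards its calls there.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model only: these names are invented for illustration and are not HotSpot code.
final class OopKlassSketch {

    // One instance per class: holds the shared "method table".
    static final class Klass {
        final String name;
        final Map<String, Function<Obj, String>> methods = new HashMap<>();
        Klass(String name) { this.name = name; }
    }

    // One instance per object: carries no method table of its own.
    static final class Obj {
        final Klass klass; // the only per-object type information
        Obj(Klass klass) { this.klass = klass; }

        // Forward the "virtual" call to the klass, which dispatches on the actual type.
        String call(String method) {
            return klass.methods.get(method).apply(this);
        }
    }

    public static void main(String[] args) {
        Klass dog = new Klass("Dog");
        dog.methods.put("speak", self -> "woof");
        Klass cat = new Klass("Cat");
        cat.methods.put("speak", self -> "meow");

        System.out.println(new Obj(dog).call("speak")); // woof
        System.out.println(new Obj(cat).call("speak")); // meow
    }
}
```

However many Obj instances exist, the method table is stored once per Klass, which is the space saving the HotSpot comment is after.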
Next, the Klass subclass that is actually used for loaded classes, InstanceKlass:

class InstanceKlass: public Klass {
friend class VMStructs;
friend class ClassFileParser;
friend class CompileReplay;

protected:
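// Constructor
InstanceKlass(int vtable_len, // virtual method table length
int itable_len, // interface method table length
int static_field_size, // size of the static fields, in words
int nonstatic_oop_map_size, // size of the nonstatic oop map blocks, in words
ReferenceType rt, // reference type
AccessFlags access_flags, // access flags of this class (public, private, ...)
bool is_anonymous); // whether the class is anonymous
...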

// See "The Java Virtual Machine Specification" section 2.16.2-5 for a detailed description
// of the class loading & initialization procedure, and the use of the states.
enum ClassState {
allocated, // allocated (but not yet linked)
loaded, // loaded and inserted in class hierarchy (but not linked yet)
linked, // successfully linked/verified (but not initialized yet)
being_initialized, // currently running class initializer
fully_initialized, // initialized (successfull final state)
initialization_error // error happened during initialization
};

protected:
// Annotations for this class
Annotations* _annotations;
// Array classes holding elements of this class.
Klass* _array_klasses;
// Constant pool for this class.
ConstantPool* _constants;
// The InnerClasses attribute and EnclosingMethod attribute. The
// _inner_classes is an array of shorts. If the class has InnerClasses
// attribute, then the _inner_classes array begins with 4-tuples of shorts
// [inner_class_info_index, outer_class_info_index,
// inner_name_index, inner_class_access_flags] for the InnerClasses
// attribute. If the EnclosingMethod attribute exists, it occupies the
// last two shorts [class_index, method_index] of the array. If only
// the InnerClasses attribute exists, the _inner_classes array length is
// number_of_inner_classes * 4. If the class has both InnerClasses
// and EnclosingMethod attributes the _inner_classes array length is
// number_of_inner_classes * 4 + enclosing_method_attribute_size.
Array<jushort>* _inner_classes;

// the source debug extension for this klass, NULL if not specified.
// Specified as UTF-8 string without terminating zero byte in the classfile,
// it is stored in the instanceklass as a NULL-terminated UTF-8 string
char* _source_debug_extension;
// Array name derived from this class which needs unreferencing
// if this class is unloaded.
Symbol* _array_name;

// Number of heapOopSize words used by non-static fields in this klass
// (including inherited fields but after header_size()).
int _nonstatic_field_size;
int _static_field_size; // number words used by static fields (oop and non-oop) in this klass
// Constant pool index to the utf8 entry of the Generic signature,
// or 0 if none.
u2 _generic_signature_index;
// Constant pool index to the utf8 entry for the name of source file
// containing this klass, 0 if not specified.
u2 _source_file_name_index;
u2 _static_oop_field_count;// number of static oop fields in this klass
u2 _java_fields_count; // The number of declared Java fields
int _nonstatic_oop_map_size;// size in words of nonstatic oop map blocks

// _is_marked_dependent can be set concurrently, thus cannot be part of the
// _misc_flags.
bool _is_marked_dependent; // used for marking during flushing and deoptimization

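As you can see, the InstanceKlass constructor takes basic information such as the vtable length and the reference type. Further down, the fields add the class's annotations, the constant pool for this class, the inner classes, and so on.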

With klass covered, let's turn to today's main attraction, the mark word. As before, start with the authors' comments:

// The markOop describes the header of an object.

//
// Note that the mark is not a real oop but just a word.
// It is placed in the oop hierarchy for historical reasons.
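Note that the mark is just a word (32 bits, i.e. 4 bytes, on a 32-bit VM and 64 bits, i.e. 8 bytes, on a 64-bit VM), not a real object. It sits in the oop hierarchy for historical reasons.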

//
// Bit-format of an object header (most significant first, big endian layout below):
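// (note: the rows below are drawn with the most significant bits on the left; this describes the diagram, not how the bytes are laid out in memory)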
// 32 bits:
// --------
// hash:25 ------------>| age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:23 epoch:2 age:4 biased_lock:1 lock:2 (biased object)
// size:32 ------------------------------------------>| (CMS free block)
// PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)
//
// 64 bits:
// --------
// unused:25 hash:31 -->| unused:1 age:4 biased_lock:1 lock:2 (normal object)
// JavaThread*:54 epoch:2 unused:1 age:4 biased_lock:1 lock:2 (biased object)
// PromotedObject*:61 --------------------->| promo_bits:3 ----->| (CMS promoted object)
// size:64 ----------------------------------------------------->| (CMS free block)

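The very first sentence establishes the star of this chapter: markOop describes the header of an object. So strictly speaking this is the real object header. Most articles online describe the header as the mark word plus the klass pointer and so on, but that's fine; it is only a difference in definition.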

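Now the bit layout. Focusing on 64-bit systems, the comment above gives four cases:

1. Not locked, with the hash already computed (normal object):

unused:25 | hash:31 | unused:1 | age:4 | biased_lock:1 (0) | lock:2 (01)

2. Biased-locked and biased toward a specific thread:

JavaThread*:54 | epoch:2 | unused:1 | age:4 | biased_lock:1 (1) | lock:2 (01)

3. CMS promoted object:

PromotedObject*:61 | promo_bits:3

4. CMS free block (reclaimed memory): the word only holds the size of the free chunk (size:64), so I'll skip it.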
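The 64-bit "normal object" row can be read off with shifts and masks. The following is a minimal sketch, not HotSpot's own accessor code: the class and method names are made up, and the sample value is simply the eight header bytes 01 63 bb 3f 50 00 00 00 from the hashCode experiment later in this article, assembled with the first printed byte as the lowest byte.

```java
// Sketch: reading the fields of a 64-bit "normal object" mark word.
// Layout (high to low): unused:25 hash:31 unused:1 age:4 biased_lock:1 lock:2
// MarkWordFields and its methods are hypothetical names for this example.
final class MarkWordFields {
    static int lockBits(long mark)  { return (int) (mark & 0b11); }            // bits 0-1
    static int biasedBit(long mark) { return (int) ((mark >>> 2) & 0b1); }     // bit 2
    static int age(long mark)       { return (int) ((mark >>> 3) & 0b1111); }  // bits 3-6
    static int hash(long mark)      { return (int) ((mark >>> 8) & 0x7FFF_FFFFL); } // bits 8-38

    public static void main(String[] args) {
        long mark = 0x0000_0050_3fbb_6301L; // unlocked object with a stored hash
        System.out.println("lock        = " + lockBits(mark));   // 1, i.e. binary 01
        System.out.println("biased_lock = " + biasedBit(mark));  // 0
        System.out.println("age         = " + age(mark));        // 0
        System.out.println("hash        = 0x" + Integer.toHexString(hash(mark))); // 0x503fbb63
    }
}
```

The same masks do not apply to the other rows: once the object is biased or locked, the upper bits hold a thread pointer or a lock pointer instead of the hash.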

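There's a catch here. In the second case, the biased layout has no room left for the hash. Does that mean that once an object is biased-locked we can no longer get its hash? Of course not. To analyze this, let's start with some code: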

// Response.java
public class Response {
}

// TestHeader.java
import lombok.SneakyThrows;
import lombok.extern.slf4j.Slf4j;
import org.openjdk.jol.info.ClassLayout;

import static java.lang.Thread.sleep;

@Slf4j
public class TestHeader {

    static Response response = new Response();

    public static void aaa(Response response) throws InterruptedException {
        // Header before entering the synchronized block
        log.info(Thread.currentThread().getName() + "out" + ClassLayout.parseInstance(response).toPrintable());

        synchronized (response) {
            // Header while this thread holds the lock
            log.info(Thread.currentThread().getName() + ClassLayout.parseInstance(response).toPrintable());
            sleep(5000);
            log.info(Thread.currentThread().getName());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread("t1") {
            @SneakyThrows
            @Override
            public void run() {
                sleep(2000); // let t2 take the lock first
                aaa(response);
            }
        };
        Thread t2 = new Thread("t2") {
            @SneakyThrows
            @Override
            public void run() {
                aaa(response);
            }
        };
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }

}

Here Response is an empty class and no hash has been computed. The output:

16:03:40.326 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 05 00 00 00 (00000101 00000000 00000000 00000000) (5)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:03:40.330 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 05 b0 59 1f (00000101 10110000 01011001 00011111) (525971461)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:03:42.368 [t1] INFO com.example.demo.TestHeader - t1outcom.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 05 b0 59 1f (00000101 10110000 01011001 00011111) (525971461)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:03:45.331 [t2] INFO com.example.demo.TestHeader - t2
16:03:45.331 [t1] INFO com.example.demo.TestHeader - t1com.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) ba 16 ee 1c (10111010 00010110 11101110 00011100) (485365434)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059) // klass pointer
12 4 (loss due to the next object alignment) // alignment padding

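From the header layout above we know that the lock flag is the last two bits, and the third bit from the end is the biased_lock flag.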

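Now the meaning of the other fields. age records the GC age; since it has only 4 bits it can count no higher than 15, so the maximum GC age is 15. biased_lock is the biased-lock flag: 0 off, 1 on. lock is the lock state: 01 means unlocked or biased (the biased_lock bit tells the two apart), 00 a lightweight lock, 10 a heavyweight lock, and 11 means the object has been marked by GC. For a CMS promoted object the last three bits are the promo_bits.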
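The lock-state rules can be written down as a small classifier over the low three bits. This is a sketch for reading the dumps in this article, with made-up names and labels; it is not how HotSpot itself names or tests these states.

```java
// Sketch: classify a mark word by its low three bits (biased_lock:1, lock:2).
// LockStateSketch and the returned labels are hypothetical.
final class LockStateSketch {
    static String describe(long mark) {
        int low3 = (int) (mark & 0b111);
        switch (low3 & 0b11) {          // the two lock bits decide first
            case 0b00: return "lightweight locked";
            case 0b10: return "heavyweight locked";
            case 0b11: return "marked by GC";
            default:                    // 0b01: the biased_lock bit tells the rest
                return (low3 & 0b100) != 0 ? "biasable or biased" : "unlocked, not biasable";
        }
    }

    public static void main(String[] args) {
        // First header byte of several dumps shown in this article.
        System.out.println(describe(0x05)); // biasable or biased
        System.out.println(describe(0x01)); // unlocked, not biasable
        System.out.println(describe(0xf0)); // lightweight locked
        System.out.println(describe(0xba)); // heavyweight locked
    }
}
```

Only the lowest byte is needed as input, because all three flag bits live there. For lock values 00 and 10 the third bit belongs to a pointer, which is why the sketch looks at it only when the lock bits are 01.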

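Byte order makes the printed output look different from what you might expect. Apart from the alignment padding and the klass pointer, the rest is the mark word: 64 bits, 8 bytes. On a little-endian machine such as x86, JOL prints those 8 bytes in memory order, which is the reverse of the diagram: the first byte printed is the lowest byte of the word, the one drawn last in the layout. So to read the lock flags, just look at the last three bits of the first byte printed (highlighted in red in the original post).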
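To turn the bytes as printed back into the 64-bit word that the layout diagram describes, they can be read as a little-endian long. A minimal sketch, assuming the dump comes from a little-endian machine; the class and method names are invented for the example, and the sample bytes are the biased-lock header from the first experiment.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: rebuild the mark word from the 8 header bytes in the order JOL prints them.
// JolBytesSketch is a hypothetical helper, not part of JOL.
final class JolBytesSketch {
    static long markFromPrintedBytes(byte[] printed) {
        // The first printed byte is the least significant byte of the word.
        return ByteBuffer.wrap(printed).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        byte[] printed = {0x05, (byte) 0xb0, 0x59, 0x1f, 0, 0, 0, 0};
        long mark = markFromPrintedBytes(printed);
        System.out.println(Long.toHexString(mark));            // 1f59b005
        System.out.println(Long.toBinaryString(mark & 0b111)); // 101
    }
}
```

The low three bits come out as 101, i.e. biased_lock = 1 and lock = 01, which is the same answer as reading the last three bits of the first printed byte.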

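Now let's walk through the code. There are two threads, t1 and t2. t1 waits 2 seconds after starting, so t2 runs first, takes the lock and then sleeps for 5 seconds. t1 arrives after 2 seconds and has to compete for the lock. In the output, after t2 first takes the lock the header records its thread (a biased lock); once t1 comes along and contends, the lock goes from biased straight to heavyweight (10).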

Now remove the 5-second sleep and look at the result:

16:45:17.873 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 05 00 00 00 (00000101 00000000 00000000 00000000) (5)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:45:17.876 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 05 48 27 1f (00000101 01001000 00100111 00011111) (522668037)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:45:17.876 [t2] INFO com.example.demo.TestHeader - t2
16:45:19.843 [t1] INFO com.example.demo.TestHeader - t1outcom.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 05 48 27 1f (00000101 01001000 00100111 00011111) (522668037)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:45:19.844 [t1] INFO com.example.demo.TestHeader - t1com.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) f0 f3 ac 1f (11110000 11110011 10101100 00011111) (531428336)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
12 4 (loss due to the next object alignment)

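The first three dumps are the same as before. Because t2 no longer sleeps, it releases the lock right after taking it. When t1 shows up 2 seconds later to lock the object, the bias is revoked and the lock becomes a lightweight lock (00).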

Next, the hashCode case mentioned earlier:

import lombok.SneakyThrows;
import lombok.extern.slf4j.Slf4j;
import org.openjdk.jol.info.ClassLayout;

@Slf4j
public class TestHeader {

    static Response response = new Response();

    public static void aaa(Response response) throws InterruptedException {
        // Header before the hash is computed
        log.info(Thread.currentThread().getName() + "out" + ClassLayout.parseInstance(response).toPrintable());
        response.hashCode();
        // Header after the hash is computed
        log.info(Thread.currentThread().getName() + "hash" + ClassLayout.parseInstance(response).toPrintable());
        synchronized (response) {
            // Header while the lock is held
            log.info(Thread.currentThread().getName() + ClassLayout.parseInstance(response).toPrintable());
            // sleep(5000);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t2 = new Thread("t2") {
            @SneakyThrows
            @Override
            public void run() {
                aaa(response);
            }
        };
        t2.start();
        t2.join();
    }

}

Only one thread is started here, and the header is printed before the hash is computed, after it is computed, and after locking:

16:50:19.440 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 05 00 00 00 (00000101 00000000 00000000 00000000) (5)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:50:19.443 [t2] INFO com.example.demo.TestHeader - t2hashcom.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 63 bb 3f (00000001 01100011 10111011 00111111) (1069245185)
4 4 (object header) 50 00 00 00 (01010000 00000000 00000000 00000000) (80)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:50:19.444 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 10 ee 1e 1f (00010000 11101110 00011110 00011111) (522120720)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 05 c2 00 f8 (00000101 11000010 00000000 11111000) (-134168059)
12 4 (loss due to the next object alignment)

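The first dump is the usual anonymous biasable state. After the hash is computed the object is no longer biasable and the hash is stored in the header. After locking it is not a biased lock either: it goes straight to a lightweight lock (00), and the mark word now holds a pointer to the lock record on the owning thread's stack rather than a thread ID. Now let's see what happens when hashCode is called after the object is already biased toward a thread: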

import lombok.SneakyThrows;
import lombok.extern.slf4j.Slf4j;
import org.openjdk.jol.info.ClassLayout;

@Slf4j
public class TestHeader {

    static Response response = new Response();

    public static void aaa(Response response) throws InterruptedException {
        // Header before locking
        log.info(Thread.currentThread().getName() + "out" + ClassLayout.parseInstance(response).toPrintable());

        synchronized (response) {
            // Header while biased toward this thread
            log.info(Thread.currentThread().getName() + ClassLayout.parseInstance(response).toPrintable());
            response.hashCode();
            // Header after computing the hash inside the lock
            log.info(Thread.currentThread().getName() + "hash" + ClassLayout.parseInstance(response).toPrintable());
            // sleep(5000);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t2 = new Thread("t2") {
            @SneakyThrows
            @Override
            public void run() {
                aaa(response);
            }
        };
        t2.start();
        t2.join();
    }

}

Result:

16:59:12.601 [t2] INFO com.example.demo.TestHeader - t2outcom.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 05 00 00 00 (00000101 00000000 00000000 00000000) (5)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 9f c1 00 f8 (10011111 11000001 00000000 11111000) (-134168161)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:59:12.604 [t2] INFO com.example.demo.TestHeader - t2com.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 05 68 40 1f (00000101 01101000 01000000 00011111) (524314629)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 9f c1 00 f8 (10011111 11000001 00000000 11111000) (-134168161)
12 4 (loss due to the next object alignment)
Instance size: 16 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

16:59:12.604 [t2] INFO com.example.demo.TestHeader - t2hashcom.example.demo.Response object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 2a 12 d4 1c (00101010 00010010 11010100 00011100) (483660330)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) 9f c1 00 f8 (10011111 11000001 00000000 11111000) (-134168161)
12 4 (loss due to the next object alignment)

As you can see, the biased lock is upgraded straight to a heavyweight lock (10).
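Whatever shape the header takes, the identity hash seen from Java has to stay the same once it has been handed out, which is why the VM must move to a lock state that can keep the original mark word somewhere. The sketch below only checks the Java-visible side of that; the class and method names are made up for the example.

```java
// Sketch: the identity hash is stable before, inside and after a synchronized block,
// even though the mark word changes shape in between.
// StableHashSketch is a hypothetical name.
final class StableHashSketch {
    static boolean hashIsStable(Object o) {
        int before = System.identityHashCode(o);
        int inside;
        synchronized (o) {
            inside = System.identityHashCode(o);
        }
        int after = System.identityHashCode(o);
        return before == inside && inside == after;
    }

    public static void main(String[] args) {
        System.out.println(hashIsStable(new Object())); // true
    }
}
```

For a class that does not override hashCode, as with Response in this article, Object.hashCode() returns this same identity hash.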

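Summary:

To me the object header is a fairly static concept in itself; what matters more is how GC, locking, and possibly other operations in the future make use of it. In the JDK sources and in the JVM you often see an int or some other multi-byte field being split up into parts. The read-write lock in the JDK, for example, uses the high and low bits for different counts, and here a single word is made to express all these variations. I wasn't planning to write this post, but when I started on a source-level analysis of synchronized I got stuck after a short section and found there was no way around it. I suppose that also shows how important the object header is.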
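The read-write lock mentioned above packs two counters into one int in much the same spirit as the mark word. Here is a minimal sketch modeled on the constants in ReentrantReadWriteLock's internal Sync class (high 16 bits for read holds, low 16 bits for write holds); the wrapper class name is made up, and this only illustrates the bit arithmetic, not the lock itself.

```java
// Sketch: one int split into a high half and a low half.
// RwStateSketch is a hypothetical name; the constants mirror ReentrantReadWriteLock.Sync.
final class RwStateSketch {
    static final int SHARED_SHIFT   = 16;
    static final int SHARED_UNIT    = 1 << SHARED_SHIFT;        // adds one read hold
    static final int EXCLUSIVE_MASK = (1 << SHARED_SHIFT) - 1;  // low 16 bits

    static int sharedCount(int state)    { return state >>> SHARED_SHIFT; }   // read holds
    static int exclusiveCount(int state) { return state & EXCLUSIVE_MASK; }   // write holds

    public static void main(String[] args) {
        int state = 3 * SHARED_UNIT + 2; // three read holds, two reentrant write holds
        System.out.println(sharedCount(state));    // 3
        System.out.println(exclusiveCount(state)); // 2
    }
}
```

Packing both counts into one value lets a single compare-and-swap update the whole state atomically, which is the same reason the mark word keeps its flags in one machine word.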

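As for lock upgrading, the examples above show that by default a new object is anonymously biasable (this assumes the biased-locking startup delay has been removed, which you can do with -XX:BiasedLockingStartupDelay=0). When one thread locks it, the object becomes biased toward that thread. When several threads take turns (one finishes before the next starts, so there are never two threads at the lock at the same time), it is upgraded to a lightweight lock. When several threads compete (two or more want the critical section at the same time), it is upgraded to a heavyweight lock. With hash values in the picture: if the hash of an anonymously biasable object is computed before locking, locking gives a lightweight lock; if the hash is computed while the biased lock is held, the lock is upgraded straight to a heavyweight lock.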
