JVM Attach实现原理剖析
阅读原文时间:2023年07月09日阅读:3

本文转载自JVM Attach实现原理剖析

前言

本文旨在从理论上分析JVM 在 Linux 环境下 Attach 操作的前因后果,以及 JVM 为此而设计并实现的解决方案,通过本文,我希望能够讲述清楚如下三个主要方面的内容。

Attach 为什么而出现

Attach的出现究其根本原因,应该就是为了实现 Java 进程(A)与进程(B)之间的本地通信。一旦这个通信通道能够成功建立,那么进程 A 就能通知进程 B 去执行某些操作,从而达到监控进程 B 或者控制进程 B 的某些行为的目的。如 jstack、jmap等 JDK 自带的工具,基本都是通过 Attach 机制去达成各自想要的目的的。至于 jstack、jmap 能做什么、怎么做,就不再本文的讨论范围了,请自行百度或者 Google。

Attach 在 JVM 底层实现的根本原理是什么

Attach 实现的根本原理就是使用了 Linux 下是文件 Socket 通信(详情可以自行百度或 Google)。有人也许会问,为什么要采用文件 socket 而不采用网络 socket?我个人认为也许一方面是为了效率(避免了网络协议的解析、数据包的封装和解封装等),另一方面是为了减少对系统资源的占用(如网络端口占用)。采用文件 socket 通信,就好比两个进程通过事先约定好的协议,对同一个文件进行读写操作,以达到信息的交互和共享。简单理解成如下图所示的模型

通过/tmp/.java.pid2345这个文件,实现客户进程与目标进程2345的通信。

Attach 在 JVM 中实现的源码分析

源码的分析主要分三阶段进行,这里要达到的目的是,弄 Attach 的清楚来龙去脉,本文的所有源码都是基于 Open JDK 1.8的,大家可以自行去下载 Open JDK 1.8 的源码。

或许你会想,在最开始的时候,目标 JVM 是怎么知道有某个进程想 attach 它自己的?答案很简单,就是目标 JVM 在启动的时候,在 JVM 内部启动了一个监听线程,这个线程的名字叫“Signal Dispatcher”,该线程的作用是,监听并处理 OS 的信号。至于什么是 OS 的信号(可以自行百度或 Google),简单理解就是,Linux系统允许进程与进程之间通过过信号的方式进行通信,如触发某个操作(操作由接受到信号的进程自定义)。如平常我们用的最多的就是 kill -9 ${pid}来杀死某个进程,kill进程通过向${pid}的进程发送一个编号为“9”号的信号,来通知系统强制结束${pid}的生命周期。

接下来我们就通过源码截图的方式来呈现一下“Signal Dispatcher”线程的创建过程。

首先进入 JVM 的启动类:/jdk/src/share/bin/main.c

 1 int
 2 main(int argc, char **argv)
 3 {
 4     int margc;
 5     char** margv;
 6     const jboolean const_javaw = JNI_FALSE;
 7 #endif /* JAVAW */
 8 #ifdef _WIN32
 9     {
10         int i = 0;
11         if (getenv(JLDEBUG_ENV_ENTRY) != NULL) {
12             printf("Windows original main args:\n");
13             for (i = 0 ; i < __argc ; i++) {
14                 printf("wwwd_args[%d] = %s\n", i, __argv[i]);
15             }
16         }
17     }
18     JLI_CmdToArgs(GetCommandLine());
19     margc = JLI_GetStdArgc();
20     // add one more to mark the end
21     margv = (char **)JLI_MemAlloc((margc + 1) * (sizeof(char *)));
22     {
23         int i = 0;
24         StdArg *stdargs = JLI_GetStdArgs();
25         for (i = 0 ; i < margc ; i++) {
26             margv[i] = stdargs[i].arg;
27         }
28         margv[i] = NULL;
29     }
30 #else /* *NIXES */
31     margc = argc;
32     margv = argv;
33 #endif /* WIN32 */
34     return JLI_Launch(margc, margv,
35                    sizeof(const_jargs) / sizeof(char *), const_jargs,
36                    sizeof(const_appclasspath) / sizeof(char *), const_appclasspath,
37                    FULL_VERSION,
38                    DOT_VERSION,
39                    (const_progname != NULL) ? const_progname : *margv,
40                    (const_launcher != NULL) ? const_launcher : *margv,
41                    (const_jargs != NULL) ? JNI_TRUE : JNI_FALSE,
42                    const_cpwildcard, const_javaw, const_ergo_class);
43 }

这个类里边最重要的一个方法就是最后的JLI_Launch,这个方法的实现存在于jdk/src/share/bin/java.c 中(大家应该都不陌生平时我们运行 java 程序时,都是采用 java com.***.Main来启动的吧)。

  1 /*
  2  * Entry point.
  3  */
  4 int
  5 JLI_Launch(int argc, char ** argv,              /* main argc, argc */
  6         int jargc, const char** jargv,          /* java args */
  7         int appclassc, const char** appclassv,  /* app classpath */
  8         const char* fullversion,                /* full version defined */
  9         const char* dotversion,                 /* dot version defined */
 10         const char* pname,                      /* program name */
 11         const char* lname,                      /* launcher name */
 12         jboolean javaargs,                      /* JAVA_ARGS */
 13         jboolean cpwildcard,                    /* classpath wildcard*/
 14         jboolean javaw,                         /* windows-only javaw */
 15         jint ergo                               /* ergonomics class policy */
 16 )
 17 {
 18     int mode = LM_UNKNOWN;
 19     char *what = NULL;
 20     char *cpath = 0;
 21     char *main_class = NULL;
 22     int ret;
 23     InvocationFunctions ifn;
 24     jlong start, end;
 25     char jvmpath[MAXPATHLEN];
 26     char jrepath[MAXPATHLEN];
 27     char jvmcfg[MAXPATHLEN];
 28
 29     _fVersion = fullversion;
 30     _dVersion = dotversion;
 31     _launcher_name = lname;
 32     _program_name = pname;
 33     _is_java_args = javaargs;
 34     _wc_enabled = cpwildcard;
 35     _ergo_policy = ergo;
 36
 37     InitLauncher(javaw);
 38     DumpState();
 39     if (JLI_IsTraceLauncher()) {
 40         int i;
 41         printf("Command line args:\n");
 42         for (i = 0; i < argc ; i++) {
 43             printf("argv[%d] = %s\n", i, argv[i]);
 44         }
 45         AddOption("-Dsun.java.launcher.diag=true", NULL);
 46     }
 47
 48     /*
 49      * Make sure the specified version of the JRE is running.
 50      *
 51      * There are three things to note about the SelectVersion() routine:
 52      *  1) If the version running isn't correct, this routine doesn't
 53      *     return (either the correct version has been exec'd or an error
 54      *     was issued).
 55      *  2) Argc and Argv in this scope are *not* altered by this routine.
 56      *     It is the responsibility of subsequent code to ignore the
 57      *     arguments handled by this routine.
 58      *  3) As a side-effect, the variable "main_class" is guaranteed to
 59      *     be set (if it should ever be set).  This isn't exactly the
 60      *     poster child for structured programming, but it is a small
 61      *     price to pay for not processing a jar file operand twice.
 62      *     (Note: This side effect has been disabled.  See comment on
 63      *     bugid 5030265 below.)
 64      */
 65     SelectVersion(argc, argv, &main_class);
 66
 67     CreateExecutionEnvironment(&argc, &argv,
 68                                jrepath, sizeof(jrepath),
 69                                jvmpath, sizeof(jvmpath),
 70                                jvmcfg,  sizeof(jvmcfg));
 71
 72     ifn.CreateJavaVM = 0;
 73     ifn.GetDefaultJavaVMInitArgs = 0;
 74
 75     if (JLI_IsTraceLauncher()) {
 76         start = CounterGet();
 77     }
 78
 79     if (!LoadJavaVM(jvmpath, &ifn)) {
 80         return(6);
 81     }
 82
 83     if (JLI_IsTraceLauncher()) {
 84         end   = CounterGet();
 85     }
 86
 87     JLI_TraceLauncher("%ld micro seconds to LoadJavaVM\n",
 88              (long)(jint)Counter2Micros(end-start));
 89
 90     ++argv;
 91     --argc;
 92
 93     if (IsJavaArgs()) {
 94         /* Preprocess wrapper arguments */
 95         TranslateApplicationArgs(jargc, jargv, &argc, &argv);
 96         if (!AddApplicationOptions(appclassc, appclassv)) {
 97             return(1);
 98         }
 99     } else {
100         /* Set default CLASSPATH */
101         cpath = getenv("CLASSPATH");
102         if (cpath == NULL) {
103             cpath = ".";
104         }
105         SetClassPath(cpath);
106     }
107
108     /* Parse command line options; if the return value of
109      * ParseArguments is false, the program should exit.
110      */
111     if (!ParseArguments(&argc, &argv, &mode, &what, &ret, jrepath))
112     {
113         return(ret);
114     }
115
116     /* Override class path if -jar flag was specified */
117     if (mode == LM_JAR) {
118         SetClassPath(what);     /* Override class path */
119     }
120
121     /* set the -Dsun.java.command pseudo property */
122     SetJavaCommandLineProp(what, argc, argv);
123
124     /* Set the -Dsun.java.launcher pseudo property */
125     SetJavaLauncherProp();
126
127     /* set the -Dsun.java.launcher.* platform properties */
128     SetJavaLauncherPlatformProps();
129
130     return JVMInit(&ifn, threadStackSize, argc, argv, mode, what, ret);
131 }

这个方法中,进行了一系列必要的操作,如libjvm.so的加载、参数解析、Classpath 的获取和设置、系统属性的设置、JVM 初始化等等,不过和本文相关的主要是130行的 JVMInit 方法,接下来我们看下这个方法的实现(位于/jdk/src/solaris/bin/java_md_solinux.c)。

1 int
2 JVMInit(InvocationFunctions* ifn, jlong threadStackSize,
3         int argc, char **argv,
4         int mode, char *what, int ret)
5 {
6     ShowSplashScreen();
7     return ContinueInNewThread(ifn, threadStackSize, argc, argv, mode, what, ret);
8 }

这里请关注两个点,ContinueInNewThread方法ifn 入参。ContinueInNewThread位于 java.c中,而 ifn 则携带了libjvm.so中的几个非常重要的函数(CreateJavaVM/GetDefaultJavaVMInitArgs/GetCreatedJavaVMs),这里我们重点关注CreateJavaVM

 1 int
 2 ContinueInNewThread(InvocationFunctions* ifn, jlong threadStackSize,
 3                     int argc, char **argv,
 4                     int mode, char *what, int ret)
 5 {
 6
 7     /*
 8      * If user doesn't specify stack size, check if VM has a preference.
 9      * Note that HotSpot no longer supports JNI_VERSION_1_1 but it will
10      * return its default stack size through the init args structure.
11      */
12     if (threadStackSize == 0) {
13       struct JDK1_1InitArgs args1_1;
14       memset((void*)&args1_1, 0, sizeof(args1_1));
15       args1_1.version = JNI_VERSION_1_1;
16       ifn->GetDefaultJavaVMInitArgs(&args1_1);  /* ignore return value */
17       if (args1_1.javaStackSize > 0) {
18          threadStackSize = args1_1.javaStackSize;
19       }
20     }
21
22     { /* Create a new thread to create JVM and invoke main method */
23       JavaMainArgs args;
24       int rslt;
25
26       args.argc = argc;
27       args.argv = argv;
28       args.mode = mode;
29       args.what = what;
30       args.ifn = *ifn;
31
32       rslt = ContinueInNewThread0(JavaMain, threadStackSize, (void*)&args);
33       /* If the caller has deemed there is an error we
34        * simply return that, otherwise we return the value of
35        * the callee
36        */
37       return (ret != 0) ? ret : rslt;
38     }
39 }

可以看出,这里进行了 JavaMainArgs 参数设置,设置完成之后,在32行处调用了 ContinueInNewThread0 (位于/jdk/src/solaris/bin/java_md_solinux.c)方法,该方法中传入了 JavaMain 函数指针和 args 参数,这二者至关重要。接下来看下其源码

 1 /*
 2  * Block current thread and continue execution in a new thread
 3  */
 4 int
 5 ContinueInNewThread0(int (JNICALL *continuation)(void *), jlong stack_size, void * args) {
 6     int rslt;
 7 #ifdef __linux__
 8     pthread_t tid;
 9     pthread_attr_t attr;
10     pthread_attr_init(&attr);
11     pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
12
13     if (stack_size > 0) {
14       pthread_attr_setstacksize(&attr, stack_size);
15     }
16
17     if (pthread_create(&tid, &attr, (void *(*)(void*))continuation, (void*)args) == 0) {
18       void * tmp;
19       pthread_join(tid, &tmp);
20       rslt = (int)tmp;
21     } else {
22      /*
23       * Continue execution in current thread if for some reason (e.g. out of
24       * memory/LWP)  a new thread can't be created. This will likely fail
25       * later in continuation as JNI_CreateJavaVM needs to create quite a
26       * few new threads, anyway, just give it a try..
27       */
28       rslt = continuation(args);
29     }
30
31     pthread_attr_destroy(&attr);
32 #else /* ! __linux__ */
33     thread_t tid;
34     long flags = 0;
35     if (thr_create(NULL, stack_size, (void *(*)(void *))continuation, args, flags, &tid) == 0) {
36       void * tmp;
37       thr_join(tid, NULL, &tmp);
38       rslt = (int)tmp;
39     } else {
40       /* See above. Continue in current thread if thr_create() failed */
41       rslt = continuation(args);
42     }
43 #endif /* __linux__ */
44     return rslt;
45 }

这里最关键的点在于,如果是 linux 环境下,则创建了一个 pthread_t 的线程来运行传入的 JavaMain 函数,并且将 args 参数也一并传入了。这时候,我们唯一要关注的便是 JavaMain (在jdk/src/share/bin/java.c )函数,请看源码

  1 int JNICALL
  2 JavaMain(void * _args)
  3 {
  4     JavaMainArgs *args = (JavaMainArgs *)_args;
  5     int argc = args->argc;
  6     char **argv = args->argv;
  7     int mode = args->mode;
  8     char *what = args->what;
  9     InvocationFunctions ifn = args->ifn;
 10
 11     JavaVM *vm = 0;
 12     JNIEnv *env = 0;
 13     jclass mainClass = NULL;
 14     jclass appClass = NULL; // actual application class being launched
 15     jmethodID mainID;
 16     jobjectArray mainArgs;
 17     int ret = 0;
 18     jlong start, end;
 19
 20     RegisterThread();
 21
 22     /* Initialize the virtual machine */
 23     start = CounterGet();
 24     if (!InitializeJVM(&vm, &env, &ifn)) {
 25         JLI_ReportErrorMessage(JVM_ERROR1);
 26         exit(1);
 27     }
 28
 29     if (showSettings != NULL) {
 30         ShowSettings(env, showSettings);
 31         CHECK_EXCEPTION_LEAVE(1);
 32     }
 33
 34     if (printVersion || showVersion) {
 35         PrintJavaVersion(env, showVersion);
 36         CHECK_EXCEPTION_LEAVE(0);
 37         if (printVersion) {
 38             LEAVE();
 39         }
 40     }
 41
 42     /* If the user specified neither a class name nor a JAR file */
 43     if (printXUsage || printUsage || what == 0 || mode == LM_UNKNOWN) {
 44         PrintUsage(env, printXUsage);
 45         CHECK_EXCEPTION_LEAVE(1);
 46         LEAVE();
 47     }
 48
 49     FreeKnownVMs();  /* after last possible PrintUsage() */
 50
 51     if (JLI_IsTraceLauncher()) {
 52         end = CounterGet();
 53         JLI_TraceLauncher("%ld micro seconds to InitializeJVM\n",
 54                (long)(jint)Counter2Micros(end-start));
 55     }
 56
 57     /* At this stage, argc/argv have the application's arguments */
 58     if (JLI_IsTraceLauncher()){
 59         int i;
 60         printf("%s is '%s'\n", launchModeNames[mode], what);
 61         printf("App's argc is %d\n", argc);
 62         for (i=0; i < argc; i++) {
 63             printf("    argv[%2d] = '%s'\n", i, argv[i]);
 64         }
 65     }
 66
 67     ret = 1;
 68
 69     /*
 70      * Get the application's main class.
 71      *
 72      * See bugid 5030265.  The Main-Class name has already been parsed
 73      * from the manifest, but not parsed properly for UTF-8 support.
 74      * Hence the code here ignores the value previously extracted and
 75      * uses the pre-existing code to reextract the value.  This is
 76      * possibly an end of release cycle expedient.  However, it has
 77      * also been discovered that passing some character sets through
 78      * the environment has "strange" behavior on some variants of
 79      * Windows.  Hence, maybe the manifest parsing code local to the
 80      * launcher should never be enhanced.
 81      *
 82      * Hence, future work should either:
 83      *     1)   Correct the local parsing code and verify that the
 84      *          Main-Class attribute gets properly passed through
 85      *          all environments,
 86      *     2)   Remove the vestages of maintaining main_class through
 87      *          the environment (and remove these comments).
 88      *
 89      * This method also correctly handles launching existing JavaFX
 90      * applications that may or may not have a Main-Class manifest entry.
 91      */
 92     mainClass = LoadMainClass(env, mode, what);
 93     CHECK_EXCEPTION_NULL_LEAVE(mainClass);
 94     /*
 95      * In some cases when launching an application that needs a helper, e.g., a
 96      * JavaFX application with no main method, the mainClass will not be the
 97      * applications own main class but rather a helper class. To keep things
 98      * consistent in the UI we need to track and report the application main class.
 99      */
100     appClass = GetApplicationClass(env);
101     NULL_CHECK_RETURN_VALUE(appClass, -1);
102     /*
103      * PostJVMInit uses the class name as the application name for GUI purposes,
104      * for example, on OSX this sets the application name in the menu bar for
105      * both SWT and JavaFX. So we'll pass the actual application class here
106      * instead of mainClass as that may be a launcher or helper class instead
107      * of the application class.
108      */
109     PostJVMInit(env, appClass, vm);
110     /*
111      * The LoadMainClass not only loads the main class, it will also ensure
112      * that the main method's signature is correct, therefore further checking
113      * is not required. The main method is invoked here so that extraneous java
114      * stacks are not in the application stack trace.
115      */
116     mainID = (*env)->GetStaticMethodID(env, mainClass, "main",
117                                        "([Ljava/lang/String;)V");
118     CHECK_EXCEPTION_NULL_LEAVE(mainID);
119
120     /* Build platform specific argument array */
121     mainArgs = CreateApplicationArgs(env, argv, argc);
122     CHECK_EXCEPTION_NULL_LEAVE(mainArgs);
123
124     /* Invoke main method. */
125     (*env)->CallStaticVoidMethod(env, mainClass, mainID, mainArgs);
126
127     /*
128      * The launcher's exit code (in the absence of calls to
129      * System.exit) will be non-zero if main threw an exception.
130      */
131     ret = (*env)->ExceptionOccurred(env) == NULL ? 0 : 1;
132     LEAVE();
133 }

和本小节相关的函数为InitializeJVM函数,在这个函数中,调用CreateJavaVM方法,这个方法就是之前在加载 libjvm.so 的时候,从动态库中获取的,首先看InitializeJVM的源码

 1 /*
 2  * Initializes the Java Virtual Machine. Also frees options array when
 3  * finished.
 4  */
 5 static jboolean
 6 InitializeJVM(JavaVM **pvm, JNIEnv **penv, InvocationFunctions *ifn)
 7 {
 8     JavaVMInitArgs args;
 9     jint r;
10
11     memset(&args, 0, sizeof(args));
12     args.version  = JNI_VERSION_1_2;
13     args.nOptions = numOptions;
14     args.options  = options;
15     args.ignoreUnrecognized = JNI_FALSE;
16
17     if (JLI_IsTraceLauncher()) {
18         int i = 0;
19         printf("JavaVM args:\n    ");
20         printf("version 0x%08lx, ", (long)args.version);
21         printf("ignoreUnrecognized is %s, ",
22                args.ignoreUnrecognized ? "JNI_TRUE" : "JNI_FALSE");
23         printf("nOptions is %ld\n", (long)args.nOptions);
24         for (i = 0; i < numOptions; i++)
25             printf("    option[%2d] = '%s'\n",
26                    i, args.options[i].optionString);
27     }
28
29     r = ifn->CreateJavaVM(pvm, (void **)penv, &args);
30     JLI_MemFree(options);
31     return r == JNI_OK;
32 }

29行处,调用 CreateJavaVM(定义在hotspot/src/share/vm/prims/jni.cpp) 方法,来进行 JVM 虚拟机的真正创建过程,源码如下

  1 _JNI_IMPORT_OR_EXPORT_ jint JNICALL JNI_CreateJavaVM(JavaVM **vm, void **penv, void *args) {
  2 #ifndef USDT2
  3   HS_DTRACE_PROBE3(hotspot_jni, CreateJavaVM__entry, vm, penv, args);
  4 #else /* USDT2 */
  5   HOTSPOT_JNI_CREATEJAVAVM_ENTRY(
  6                                  (void **) vm, penv, args);
  7 #endif /* USDT2 */
  8
  9   jint result = JNI_ERR;
 10   DT_RETURN_MARK(CreateJavaVM, jint, (const jint&)result);
 11
 12   // We're about to use Atomic::xchg for synchronization.  Some Zero
 13   // platforms use the GCC builtin __sync_lock_test_and_set for this,
 14   // but __sync_lock_test_and_set is not guaranteed to do what we want
 15   // on all architectures.  So we check it works before relying on it.
 16 #if defined(ZERO) && defined(ASSERT)
 17   {
 18     jint a = 0xcafebabe;
 19     jint b = Atomic::xchg(0xdeadbeef, &a);
 20     void *c = &a;
 21     void *d = Atomic::xchg_ptr(&b, &c);
 22     assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, "Atomic::xchg() works");
 23     assert(c == &b && d == &a, "Atomic::xchg_ptr() works");
 24   }
 25 #endif // ZERO && ASSERT
 26
 27   // At the moment it's only possible to have one Java VM,
 28   // since some of the runtime state is in global variables.
 29
 30   // We cannot use our mutex locks here, since they only work on
 31   // Threads. We do an atomic compare and exchange to ensure only
 32   // one thread can call this method at a time
 33
 34   // We use Atomic::xchg rather than Atomic::add/dec since on some platforms
 35   // the add/dec implementations are dependent on whether we are running
 36   // on a multiprocessor, and at this stage of initialization the os::is_MP
 37   // function used to determine this will always return false. Atomic::xchg
 38   // does not have this problem.
 39   if (Atomic::xchg(1, &vm_created) == 1) {
 40     return JNI_EEXIST;   // already created, or create attempt in progress
 41   }
 42   if (Atomic::xchg(0, &safe_to_recreate_vm) == 0) {
 43     return JNI_ERR;  // someone tried and failed and retry not allowed.
 44   }
 45
 46   assert(vm_created == 1, "vm_created is true during the creation");
 47
 48   /**
 49    * Certain errors during initialization are recoverable and do not
 50    * prevent this method from being called again at a later time
 51    * (perhaps with different arguments).  However, at a certain
 52    * point during initialization if an error occurs we cannot allow
 53    * this function to be called again (or it will crash).  In those
 54    * situations, the 'canTryAgain' flag is set to false, which atomically
 55    * sets safe_to_recreate_vm to 1, such that any new call to
 56    * JNI_CreateJavaVM will immediately fail using the above logic.
 57    */
 58   bool can_try_again = true;
 59
 60   result = Threads::create_vm((JavaVMInitArgs*) args, &can_try_again);
 61   if (result == JNI_OK) {
 62     JavaThread *thread = JavaThread::current();
 63     /* thread is thread_in_vm here */
 64     *vm = (JavaVM *)(&main_vm);
 65     *(JNIEnv**)penv = thread->jni_environment();
 66
 67     // Tracks the time application was running before GC
 68     RuntimeService::record_application_start();
 69
 70     // Notify JVMTI
 71     if (JvmtiExport::should_post_thread_life()) {
 72        JvmtiExport::post_thread_start(thread);
 73     }
 74
 75     EventThreadStart event;
 76     if (event.should_commit()) {
 77       event.set_javalangthread(java_lang_Thread::thread_id(thread->threadObj()));
 78       event.commit();
 79     }
 80
 81 #ifndef PRODUCT
 82   #ifndef TARGET_OS_FAMILY_windows
 83     #define CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED(f) f()
 84   #endif
 85
 86     // Check if we should compile all classes on bootclasspath
 87     if (CompileTheWorld) ClassLoader::compile_the_world();
 88     if (ReplayCompiles) ciReplay::replay(thread);
 89
 90     // Some platforms (like Win*) need a wrapper around these test
 91     // functions in order to properly handle error conditions.
 92     CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED(test_error_handler);
 93     CALL_TEST_FUNC_WITH_WRAPPER_IF_NEEDED(execute_internal_vm_tests);
 94 #endif
 95
 96     // Since this is not a JVM_ENTRY we have to set the thread state manually before leaving.
 97     ThreadStateTransition::transition_and_fence(thread, _thread_in_vm, _thread_in_native);
 98   } else {
 99     if (can_try_again) {
100       // reset safe_to_recreate_vm to 1 so that retrial would be possible
101       safe_to_recreate_vm = 1;
102     }
103
104     // Creation failed. We must reset vm_created
105     *vm = 0;
106     *(JNIEnv**)penv = 0;
107     // reset vm_created last to avoid race condition. Use OrderAccess to
108     // control both compiler and architectural-based reordering.
109     OrderAccess::release_store(&vm_created, 0);
110   }
111
112   return result;
113 }

这里只关注最核心的方法是60行的Threads::create_vm(hotspot/src/share/vm/runtime/Thread.cpp) 方法,在这个方法中,进行了大量的初始化操作,不过,这里我们只关注其中的一个点,就是 os::signal_init() 方法的调用,这就是启动“Signal Dispatcher”线程的地方。先看 create_vm 的源码

  1 jint Threads::create_vm(JavaVMInitArgs* args, bool* canTryAgain) {
  2
  3   extern void JDK_Version_init();
  4
  5   // Check version
  6   if (!is_supported_jni_version(args->version)) return JNI_EVERSION;
  7
  8   // Initialize the output stream module
  9   ostream_init();
 10
 11   // Process java launcher properties.
 12   Arguments::process_sun_java_launcher_properties(args);
 13
 14   // Initialize the os module before using TLS
 15   os::init();
 16
 17   // Initialize system properties.
 18   Arguments::init_system_properties();
 19
 20   // So that JDK version can be used as a discrimintor when parsing arguments
 21   JDK_Version_init();
 22
 23   // Update/Initialize System properties after JDK version number is known
 24   Arguments::init_version_specific_system_properties();
 25
 26   // Parse arguments
 27   jint parse_result = Arguments::parse(args);
 28   if (parse_result != JNI_OK) return parse_result;
 29
 30   os::init_before_ergo();
 31
 32   jint ergo_result = Arguments::apply_ergo();
 33   if (ergo_result != JNI_OK) return ergo_result;
 34
 35   if (PauseAtStartup) {
 36     os::pause();
 37   }
 38
 39 #ifndef USDT2
 40   HS_DTRACE_PROBE(hotspot, vm__init__begin);
 41 #else /* USDT2 */
 42   HOTSPOT_VM_INIT_BEGIN();
 43 #endif /* USDT2 */
 44
 45   // Record VM creation timing statistics
 46   TraceVmCreationTime create_vm_timer;
 47   create_vm_timer.start();
 48
 49   // Timing (must come after argument parsing)
 50   TraceTime timer("Create VM", TraceStartupTime);
 51
 52   // Initialize the os module after parsing the args
 53   jint os_init_2_result = os::init_2();
 54   if (os_init_2_result != JNI_OK) return os_init_2_result;
 55
 56   jint adjust_after_os_result = Arguments::adjust_after_os();
 57   if (adjust_after_os_result != JNI_OK) return adjust_after_os_result;
 58
 59   // intialize TLS
 60   ThreadLocalStorage::init();
 61
 62   // Bootstrap native memory tracking, so it can start recording memory
 63   // activities before worker thread is started. This is the first phase
 64   // of bootstrapping, VM is currently running in single-thread mode.
 65   MemTracker::bootstrap_single_thread();
 66
 67   // Initialize output stream logging
 68   ostream_init_log();
 69
 70   // Convert -Xrun to -agentlib: if there is no JVM_OnLoad
 71   // Must be before create_vm_init_agents()
 72   if (Arguments::init_libraries_at_startup()) {
 73     convert_vm_init_libraries_to_agents();
 74   }
 75
 76   // Launch -agentlib/-agentpath and converted -Xrun agents
 77   if (Arguments::init_agents_at_startup()) {
 78     create_vm_init_agents();
 79   }
 80
 81   // Initialize Threads state
 82   _thread_list = NULL;
 83   _number_of_threads = 0;
 84   _number_of_non_daemon_threads = 0;
 85
 86   // Initialize global data structures and create system classes in heap
 87   vm_init_globals();
 88
 89   // Attach the main thread to this os thread
 90   JavaThread* main_thread = new JavaThread();
 91   main_thread->set_thread_state(_thread_in_vm);
 92   // must do this before set_active_handles and initialize_thread_local_storage
 93   // Note: on solaris initialize_thread_local_storage() will (indirectly)
 94   // change the stack size recorded here to one based on the java thread
 95   // stacksize. This adjusted size is what is used to figure the placement
 96   // of the guard pages.
 97   main_thread->record_stack_base_and_size();
 98   main_thread->initialize_thread_local_storage();
 99
100   main_thread->set_active_handles(JNIHandleBlock::allocate_block());
101
102   if (!main_thread->set_as_starting_thread()) {
103     vm_shutdown_during_initialization(
104       "Failed necessary internal allocation. Out of swap space");
105     delete main_thread;
106     *canTryAgain = false; // don't let caller call JNI_CreateJavaVM again
107     return JNI_ENOMEM;
108   }
109
110   // Enable guard page *after* os::create_main_thread(), otherwise it would
111   // crash Linux VM, see notes in os_linux.cpp.
112   main_thread->create_stack_guard_pages();
113
114   // Initialize Java-Level synchronization subsystem
115   ObjectMonitor::Initialize() ;
116
117   // Second phase of bootstrapping, VM is about entering multi-thread mode
118   MemTracker::bootstrap_multi_thread();
119
120   // Initialize global modules
121   jint status = init_globals();
122   if (status != JNI_OK) {
123     delete main_thread;
124     *canTryAgain = false; // don't let caller call JNI_CreateJavaVM again
125     return status;
126   }
127
128   // Should be done after the heap is fully created
129   main_thread->cache_global_variables();
130
131   HandleMark hm;
132
133   { MutexLocker mu(Threads_lock);
134     Threads::add(main_thread);
135   }
136
137   // Any JVMTI raw monitors entered in onload will transition into
138   // real raw monitor. VM is setup enough here for raw monitor enter.
139   JvmtiExport::transition_pending_onload_raw_monitors();
140
141   // Fully start NMT
142   MemTracker::start();
143
144   // Create the VMThread
145   { TraceTime timer("Start VMThread", TraceStartupTime);
146     VMThread::create();
147     Thread* vmthread = VMThread::vm_thread();
148
149     if (!os::create_thread(vmthread, os::vm_thread))
150       vm_exit_during_initialization("Cannot create VM thread. Out of system resources.");
151
152     // Wait for the VM thread to become ready, and VMThread::run to initialize
153     // Monitors can have spurious returns, must always check another state flag
154     {
155       MutexLocker ml(Notify_lock);
156       os::start_thread(vmthread);
157       while (vmthread->active_handles() == NULL) {
158         Notify_lock->wait();
159       }
160     }
161   }
162
163   assert (Universe::is_fully_initialized(), "not initialized");
164   if (VerifyDuringStartup) {
165     // Make sure we're starting with a clean slate.
166     VM_Verify verify_op;
167     VMThread::execute(&verify_op);
168   }
169
170   EXCEPTION_MARK;
171
172   // At this point, the Universe is initialized, but we have not executed
173   // any byte code.  Now is a good time (the only time) to dump out the
174   // internal state of the JVM for sharing.
175   if (DumpSharedSpaces) {
176     MetaspaceShared::preload_and_dump(CHECK_0);
177     ShouldNotReachHere();
178   }
179
180   // Always call even when there are not JVMTI environments yet, since environments
181   // may be attached late and JVMTI must track phases of VM execution
182   JvmtiExport::enter_start_phase();
183
184   // Notify JVMTI agents that VM has started (JNI is up) - nop if no agents.
185   JvmtiExport::post_vm_start();
186
187   {
188     TraceTime timer("Initialize java.lang classes", TraceStartupTime);
189
190     if (EagerXrunInit && Arguments::init_libraries_at_startup()) {
191       create_vm_init_libraries();
192     }
193
194     initialize_class(vmSymbols::java_lang_String(), CHECK_0);
195
196     // Initialize java_lang.System (needed before creating the thread)
197     initialize_class(vmSymbols::java_lang_System(), CHECK_0);
198     initialize_class(vmSymbols::java_lang_ThreadGroup(), CHECK_0);
199     Handle thread_group = create_initial_thread_group(CHECK_0);
200     Universe::set_main_thread_group(thread_group());
201     initialize_class(vmSymbols::java_lang_Thread(), CHECK_0);
202     oop thread_object = create_initial_thread(thread_group, main_thread, CHECK_0);
203     main_thread->set_threadObj(thread_object);
204     // Set thread status to running since main thread has
205     // been started and running.
206     java_lang_Thread::set_thread_status(thread_object,
207                                         java_lang_Thread::RUNNABLE);
208
209     // The VM creates & returns objects of this class. Make sure it's initialized.
210     initialize_class(vmSymbols::java_lang_Class(), CHECK_0);
211
212     // The VM preresolves methods to these classes. Make sure that they get initialized
213     initialize_class(vmSymbols::java_lang_reflect_Method(), CHECK_0);
214     initialize_class(vmSymbols::java_lang_ref_Finalizer(),  CHECK_0);
215     call_initializeSystemClass(CHECK_0);
216
217     // get the Java runtime name after java.lang.System is initialized
218     JDK_Version::set_runtime_name(get_java_runtime_name(THREAD));
219     JDK_Version::set_runtime_version(get_java_runtime_version(THREAD));
220
221     // an instance of OutOfMemory exception has been allocated earlier
222     initialize_class(vmSymbols::java_lang_OutOfMemoryError(), CHECK_0);
223     initialize_class(vmSymbols::java_lang_NullPointerException(), CHECK_0);
224     initialize_class(vmSymbols::java_lang_ClassCastException(), CHECK_0);
225     initialize_class(vmSymbols::java_lang_ArrayStoreException(), CHECK_0);
226     initialize_class(vmSymbols::java_lang_ArithmeticException(), CHECK_0);
227     initialize_class(vmSymbols::java_lang_StackOverflowError(), CHECK_0);
228     initialize_class(vmSymbols::java_lang_IllegalMonitorStateException(), CHECK_0);
229     initialize_class(vmSymbols::java_lang_IllegalArgumentException(), CHECK_0);
230   }
231
232   // See        : bugid 4211085.
233   // Background : the static initializer of java.lang.Compiler tries to read
234   //              property"java.compiler" and read & write property "java.vm.info".
235   //              When a security manager is installed through the command line
236   //              option "-Djava.security.manager", the above properties are not
237   //              readable and the static initializer for java.lang.Compiler fails
238   //              resulting in a NoClassDefFoundError.  This can happen in any
239   //              user code which calls methods in java.lang.Compiler.
240   // Hack :       the hack is to pre-load and initialize this class, so that only
241   //              system domains are on the stack when the properties are read.
242   //              Currently even the AWT code has calls to methods in java.lang.Compiler.
243   //              On the classic VM, java.lang.Compiler is loaded very early to load the JIT.
244   // Future Fix : the best fix is to grant everyone permissions to read "java.compiler" and
245   //              read and write"java.vm.info" in the default policy file. See bugid 4211383
246   //              Once that is done, we should remove this hack.
247   initialize_class(vmSymbols::java_lang_Compiler(), CHECK_0);
248
249   // More hackery - the static initializer of java.lang.Compiler adds the string "nojit" to
250   // the java.vm.info property if no jit gets loaded through java.lang.Compiler (the hotspot
251   // compiler does not get loaded through java.lang.Compiler).  "java -version" with the
252   // hotspot vm says "nojit" all the time which is confusing.  So, we reset it here.
253   // This should also be taken out as soon as 4211383 gets fixed.
254   reset_vm_info_property(CHECK_0);
255
256   quicken_jni_functions();
257
258   // Must be run after init_ft which initializes ft_enabled
259   if (TRACE_INITIALIZE() != JNI_OK) {
260     vm_exit_during_initialization("Failed to initialize tracing backend");
261   }
262
263   // Set flag that basic initialization has completed. Used by exceptions and various
264   // debug stuff, that does not work until all basic classes have been initialized.
265   set_init_completed();
266
267 #ifndef USDT2
268   HS_DTRACE_PROBE(hotspot, vm__init__end);
269 #else /* USDT2 */
270   HOTSPOT_VM_INIT_END();
271 #endif /* USDT2 */
272
273   // record VM initialization completion time
274 #if INCLUDE_MANAGEMENT
275   Management::record_vm_init_completed();
276 #endif // INCLUDE_MANAGEMENT
277
278   // Compute system loader. Note that this has to occur after set_init_completed, since
279   // valid exceptions may be thrown in the process.
280   // Note that we do not use CHECK_0 here since we are inside an EXCEPTION_MARK and
281   // set_init_completed has just been called, causing exceptions not to be shortcut
282   // anymore. We call vm_exit_during_initialization directly instead.
283   SystemDictionary::compute_java_system_loader(THREAD);
284   if (HAS_PENDING_EXCEPTION) {
285     vm_exit_during_initialization(Handle(THREAD, PENDING_EXCEPTION));
286   }
287
288 #if INCLUDE_ALL_GCS
289   // Support for ConcurrentMarkSweep. This should be cleaned up
290   // and better encapsulated. The ugly nested if test would go away
291   // once things are properly refactored. XXX YSR
292   if (UseConcMarkSweepGC || UseG1GC) {
293     if (UseConcMarkSweepGC) {
294       ConcurrentMarkSweepThread::makeSurrogateLockerThread(THREAD);
295     } else {
296       ConcurrentMarkThread::makeSurrogateLockerThread(THREAD);
297     }
298     if (HAS_PENDING_EXCEPTION) {
299       vm_exit_during_initialization(Handle(THREAD, PENDING_EXCEPTION));
300     }
301   }
302 #endif // INCLUDE_ALL_GCS
303
304   // Always call even when there are not JVMTI environments yet, since environments
305   // may be attached late and JVMTI must track phases of VM execution
306   JvmtiExport::enter_live_phase();
307
308   // Signal Dispatcher needs to be started before VMInit event is posted
309   os::signal_init();
310
311   // Start Attach Listener if +StartAttachListener or it can't be started lazily
312   if (!DisableAttachMechanism) {
313     AttachListener::vm_start();
314     if (StartAttachListener || AttachListener::init_at_startup()) {
315       AttachListener::init();
316     }
317   }
318
319   // Launch -Xrun agents
320   // Must be done in the JVMTI live phase so that for backward compatibility the JDWP
321   // back-end can launch with -Xdebug -Xrunjdwp.
322   if (!EagerXrunInit && Arguments::init_libraries_at_startup()) {
323     create_vm_init_libraries();
324   }
325
326   // Notify JVMTI agents that VM initialization is complete - nop if no agents.
327   JvmtiExport::post_vm_initialized();
328
329   if (TRACE_START() != JNI_OK) {
330     vm_exit_during_initialization("Failed to start tracing backend.");
331   }
332
333   if (CleanChunkPoolAsync) {
334     Chunk::start_chunk_pool_cleaner_task();
335   }
336
337   // initialize compiler(s)
338 #if defined(COMPILER1) || defined(COMPILER2) || defined(SHARK)
339   CompileBroker::compilation_init();
340 #endif
341
342   if (EnableInvokeDynamic) {
343     // Pre-initialize some JSR292 core classes to avoid deadlock during class loading.
344     // It is done after compilers are initialized, because otherwise compilations of
345     // signature polymorphic MH intrinsics can be missed
346     // (see SystemDictionary::find_method_handle_intrinsic).
347     initialize_class(vmSymbols::java_lang_invoke_MethodHandle(), CHECK_0);
348     initialize_class(vmSymbols::java_lang_invoke_MemberName(), CHECK_0);
349     initialize_class(vmSymbols::java_lang_invoke_MethodHandleNatives(), CHECK_0);
350   }
351
352 #if INCLUDE_MANAGEMENT
353   Management::initialize(THREAD);
354 #endif // INCLUDE_MANAGEMENT
355
356   if (HAS_PENDING_EXCEPTION) {
357     // management agent fails to start possibly due to
358     // configuration problem and is responsible for printing
359     // stack trace if appropriate. Simply exit VM.
360     vm_exit(1);
361   }
362
363   if (Arguments::has_profile())       FlatProfiler::engage(main_thread, true);
364   if (MemProfiling)                   MemProfiler::engage();
365   StatSampler::engage();
366   if (CheckJNICalls)                  JniPeriodicChecker::engage();
367
368   BiasedLocking::init();
369
370   if (JDK_Version::current().post_vm_init_hook_enabled()) {
371     call_postVMInitHook(THREAD);
372     // The Java side of PostVMInitHook.run must deal with all
373     // exceptions and provide means of diagnosis.
374     if (HAS_PENDING_EXCEPTION) {
375       CLEAR_PENDING_EXCEPTION;
376     }
377   }
378
379   {
380       MutexLockerEx ml(PeriodicTask_lock, Mutex::_no_safepoint_check_flag);
381       // Make sure the watcher thread can be started by WatcherThread::start()
382       // or by dynamic enrollment.
383       WatcherThread::make_startable();
384       // Start up the WatcherThread if there are any periodic tasks
385       // NOTE:  All PeriodicTasks should be registered by now. If they
386       //   aren't, late joiners might appear to start slowly (we might
387       //   take a while to process their first tick).
388       if (PeriodicTask::num_tasks() > 0) {
389           WatcherThread::start();
390       }
391   }
392
393   // Give os specific code one last chance to start
394   os::init_3();
395
396   create_vm_timer.end();
397 #ifdef ASSERT
398   _vm_complete = true;
399 #endif
400   return JNI_OK;
401 }

309 行处,看到了os::signal_init() 的调用(hotspot/src/share/vm/runtime/os.cpp),这就是我们要找的。接着,我们看下其具体实现

 1 void os::signal_init() {
 2   if (!ReduceSignalUsage) {
 3     // Setup JavaThread for processing signals
 4     EXCEPTION_MARK;
 5     Klass* k = SystemDictionary::resolve_or_fail(vmSymbols::java_lang_Thread(), true, CHECK);
 6     instanceKlassHandle klass (THREAD, k);
 7     instanceHandle thread_oop = klass->allocate_instance_handle(CHECK);
 8
 9     const char thread_name[] = "Signal Dispatcher";
10     Handle string = java_lang_String::create_from_str(thread_name, CHECK);
11
12     // Initialize thread_oop to put it into the system threadGroup
13     Handle thread_group (THREAD, Universe::system_thread_group());
14     JavaValue result(T_VOID);
15     JavaCalls::call_special(&result, thread_oop,
16                            klass,
17                            vmSymbols::object_initializer_name(),
18                            vmSymbols::threadgroup_string_void_signature(),
19                            thread_group,
20                            string,
21                            CHECK);
22
23     KlassHandle group(THREAD, SystemDictionary::ThreadGroup_klass());
24     JavaCalls::call_special(&result,
25                             thread_group,
26                             group,
27                             vmSymbols::add_method_name(),
28                             vmSymbols::thread_void_signature(),
29                             thread_oop,         // ARG 1
30                             CHECK);
31
32     os::signal_init_pd();
33
34     { MutexLocker mu(Threads_lock);
35       JavaThread* signal_thread = new JavaThread(&signal_thread_entry);
36
37       // At this point it may be possible that no osthread was created for the
38       // JavaThread due to lack of memory. We would have to throw an exception
39       // in that case. However, since this must work and we do not allow
40       // exceptions anyway, check and abort if this fails.
41       if (signal_thread == NULL || signal_thread->osthread() == NULL) {
42         vm_exit_during_initialization("java.lang.OutOfMemoryError",
43                                       "unable to create new native thread");
44       }
45
46       java_lang_Thread::set_thread(thread_oop(), signal_thread);
47       java_lang_Thread::set_priority(thread_oop(), NearMaxPriority);
48       java_lang_Thread::set_daemon(thread_oop());
49
50       signal_thread->set_threadObj(thread_oop());
51       Threads::add(signal_thread);
52       Thread::start(signal_thread);
53     }
54     // Handle ^BREAK
55     os::signal(SIGBREAK, os::user_handler());
56   }
57 }

这里的完全可以看出来,在此函数中35行处,创建了一个 java 线程,用于执行signal_thread_entry 函数,那我们来看看,这个 signal_thread_entry 函数到底做了什么?

 1 // sigexitnum_pd is a platform-specific special signal used for terminating the Signal thread.
 2 static void signal_thread_entry(JavaThread* thread, TRAPS) {
 3   os::set_priority(thread, NearMaxPriority);
 4   while (true) {
 5     int sig;
 6     {
 7       // FIXME : Currently we have not decieded what should be the status
 8       //         for this java thread blocked here. Once we decide about
 9       //         that we should fix this.
10       sig = os::signal_wait();
11     }
12     if (sig == os::sigexitnum_pd()) {
13        // Terminate the signal thread
14        return;
15     }
16
17     switch (sig) {
18       case SIGBREAK: {
19         // Check if the signal is a trigger to start the Attach Listener - in that
20         // case don't print stack traces.
21         if (!DisableAttachMechanism && AttachListener::is_init_trigger()) {
22           continue;
23         }
24         // Print stack traces
25         // Any SIGBREAK operations added here should make sure to flush
26         // the output stream (e.g. tty->flush()) after output.  See 4803766.
27         // Each module also prints an extra carriage return after its output.
28         VM_PrintThreads op;
29         VMThread::execute(&op);
30         VM_PrintJNI jni_op;
31         VMThread::execute(&jni_op);
32         VM_FindDeadlocks op1(tty);
33         VMThread::execute(&op1);
34         Universe::print_heap_at_SIGBREAK();
35         if (PrintClassHistogram) {
36           VM_GC_HeapInspection op1(gclog_or_tty, true /* force full GC before heap inspection */);
37           VMThread::execute(&op1);
38         }
39         if (JvmtiExport::should_post_data_dump()) {
40           JvmtiExport::post_data_dump();
41         }
42         break;
43       }
44       default: {
45         // Dispatch the signal to java
46         HandleMark hm(THREAD);
47         Klass* k = SystemDictionary::resolve_or_null(vmSymbols::sun_misc_Signal(), THREAD);
48         KlassHandle klass (THREAD, k);
49         if (klass.not_null()) {
50           JavaValue result(T_VOID);
51           JavaCallArguments args;
52           args.push_int(sig);
53           JavaCalls::call_static(
54             &result,
55             klass,
56             vmSymbols::dispatch_name(),
57             vmSymbols::int_void_signature(),
58             &args,
59             THREAD
60           );
61         }
62         if (HAS_PENDING_EXCEPTION) {
63           // tty is initialized early so we don't expect it to be null, but
64           // if it is we can't risk doing an initialization that might
65           // trigger additional out-of-memory conditions
66           if (tty != NULL) {
67             char klass_name[256];
68             char tmp_sig_name[16];
69             const char* sig_name = "UNKNOWN";
70             InstanceKlass::cast(PENDING_EXCEPTION->klass())->
71               name()->as_klass_external_name(klass_name, 256);
72             if (os::exception_name(sig, tmp_sig_name, 16) != NULL)
73               sig_name = tmp_sig_name;
74             warning("Exception %s occurred dispatching signal %s to handler"
75                     "- the VM may need to be forcibly terminated",
76                     klass_name, sig_name );
77           }
78           CLEAR_PENDING_EXCEPTION;
79         }
80       }
81     }
82   }
83 }

函数里面意思已经很清晰明了了,首先在10行处,有一个os::signal_wait()的调用,该调用的主要是阻塞当前线程,并等待接收系统信号,然后再根据接收到的信号 sig 做 switch 逻辑,对于不同的信号做不同的处理。至此,关于“目标 JVM 对OS信号监听的实现”这一点,就已经分析结束了。简单的一句话总结就是,JVM 在启动的时候,会创建一个名为“Signal Dispatcher”的线程用于接收os 的信号,以便对不同信号分别做处理。

经过3.1的分析,我们已经知道在 JVM 启动之后,内部会有线程监听并处理 os 的信号,那么,这个时候,如果我们想和已经启动的 JVM 建立通信,当然就可以毫不犹豫的使用信号来进行了。不过,基于信号的通信,也是存在限制的,一方面,os 支持的信号是有限的,二来信号的通信往往是单向的,不方便通信双方进行高效的通信。基于这些,笔者认为,为了使得 Client JVM 和 Target JVM 更好的通信,就采用了 Socket 通信来实现二者的通信。那接下来我们看看,这个通道究竟是如何创建的?

当我们需要 attach 到某个目标 JVM 进程上去的时候,我们通常会写如下代码

1 VirtualMachine vm = VirtualMachine.attach(pid);

这样我们就能得到目标 JVM 的相关信息了,是不是很简单?不过,今天我们要做的可不是这么简单的事情,我们需要深入其后,了解其根本。接下来我们就以com.sun.tools.attach.VirtualMachineattach 方法入手,逐层揭开其神秘面纱。

 1 public static VirtualMachine attach(String id)
 2         throws AttachNotSupportedException, IOException
 3     {
 4         if (id == null) {
 5             throw new NullPointerException("id cannot be null");
 6         }
 7         List<AttachProvider> providers = AttachProvider.providers();
 8         if (providers.size() == 0) {
 9             throw new AttachNotSupportedException("no providers installed");
10         }
11         AttachNotSupportedException lastExc = null;
12         for (AttachProvider provider: providers) {
13             try {
14                 return provider.attachVirtualMachine(id);
15             } catch (AttachNotSupportedException x) {
16                 lastExc = x;
17             }
18         }
19         throw lastExc;
20     }

这是attach的源码,入参为目标 JVM 的进程 ID,其实现委派给了 AttachProvider 了,通过provider.attachVirtualMachine(id);来实现真正的 attach 操作。由于 AttachProvider 是个抽象类,所以这个方法的真正实现在子类中,在 Linux 环境下,我们看 sun.tools.attach.BsdAttachProvider.java 的实现。

 1 public VirtualMachine attachVirtualMachine(String vmid)
 2         throws AttachNotSupportedException, IOException
 3     {
 4         checkAttachPermission();
 5
 6         // AttachNotSupportedException will be thrown if the target VM can be determined
 7         // to be not attachable.
 8         testAttachable(vmid);
 9
10         return new BsdVirtualMachine(this, vmid);
11     }

这个方法非常简单,就是 new 了一个 BsdVirtualMachine 对象,并且把目标进程 ID 带过去了。看sun.tools.attach.BsdVirtualMachine.java 的构造函数

 1 /**
 2      * Attaches to the target VM
 3      */
 4     BsdVirtualMachine(AttachProvider provider, String vmid)
 5         throws AttachNotSupportedException, IOException
 6     {
 7         super(provider, vmid);
 8
 9         // This provider only understands pids
10         int pid;
11         try {
12             pid = Integer.parseInt(vmid);
13         } catch (NumberFormatException x) {
14             throw new AttachNotSupportedException("Invalid process identifier");
15         }
16
17         // Find the socket file. If not found then we attempt to start the
18         // attach mechanism in the target VM by sending it a QUIT signal.
19         // Then we attempt to find the socket file again.
20         path = findSocketFile(pid);
21         if (path == null) {
22             File f = new File(tmpdir, ".attach_pid" + pid);
23             createAttachFile(f.getPath());
24             try {
25                 sendQuitTo(pid);
26
27                 // give the target VM time to start the attach mechanism
28                 int i = 0;
29                 long delay = 200;
30                 int retries = (int)(attachTimeout() / delay);
31                 do {
32                     try {
33                         Thread.sleep(delay);
34                     } catch (InterruptedException x) { }
35                     path = findSocketFile(pid);
36                     i++;
37                 } while (i <= retries && path == null);
38                 if (path == null) {
39                     throw new AttachNotSupportedException(
40                         "Unable to open socket file: target process not responding " +
41                         "or HotSpot VM not loaded");
42                 }
43             } finally {
44                 f.delete();
45             }
46         }
47
48         // Check that the file owner/permission to avoid attaching to
49         // bogus process
50         checkPermissions(path);
51
52         // Check that we can connect to the process
53         // - this ensures we throw the permission denied error now rather than
54         // later when we attempt to enqueue a command.
55         int s = socket();
56         try {
57             connect(s, path);
58         } finally {
59             close(s);
60         }
61     }

首先看20行处的findSocketFile(pid);这里是找对应的 socket (/tmp/.java_pid${pid})文件,这个文件就是我们在第二大点图中画出来的,用于进程间通信的 socket 文件,如果不存在,即第一次进入该方法的时候。这时会运行到74行的createAttachFile(f.getPath());来创建一个attach 文件,socket 文件的命名方式为:/tmp/../.attach_pid${pid},关于这两个方法(findSocketFile和createAttachFile)的具体实现,这里就不展开了,感兴趣的可以直接去查看jdk/src/solaris/native/sun/tools/attach/BsdVirtualMachine.c的相关源码。然后就会运行到一个非常关键的方法25行的sendQuitTo(pid);这个方法的实现,我们等会进入BsdVirtualMachine.c看下源码,其主要目的就是给该进程发送一个信号。之后会进入到31行处的 do…while循环,自旋反复轮询指定的次数来获取该 socket 文件的路径,直到超时或者 path(即 socket 文件路径) 不为空,最后在55行处,建立一个 socket,并且在57行处通过 path 进行 socket 的连接,从而完成了客户端(Client JVM)到目标进程(Target JVM)的 socket 通道建立。不过,请打住,这里是不是少了点什么?我相信细心的你肯定发现了,至少还存2个问题,

1. Target JVM 的 socket 服务端是何时创建的?

2. 用于通信的 socket 文件是在哪里创建的?

带着这两个问题,我们进入25行关键方法sendQuitTo(pid);的源码解读,该方法是个本地方法,位于jdk/src/solaris/native/sun/tools/attach/BsdVirtualMachine.c中

 1 /*
 2  * Class:     sun_tools_attach_BsdVirtualMachine
 3  * Method:    sendQuitTo
 4  * Signature: (I)V
 5  */
 6 JNIEXPORT void JNICALL Java_sun_tools_attach_BsdVirtualMachine_sendQuitTo
 7   (JNIEnv *env, jclass cls, jint pid)
 8 {
 9     if (kill((pid_t)pid, SIGQUIT)) {
10         JNU_ThrowIOExceptionWithLastError(env, "kill");
11     }
12 }

看到第9行的时候,是不是觉得这里必然和前面3.1中大篇幅分析的信号处理线程“Signal Dispatcher”有种必然联系了?没错,这里就是通过 kill 这个系统调用像目标 JVM,发送了一个 SIGQUIT 的信号,该信号是个#define,即宏,表示的数字“3”,即类似在 linux 命令行执行了“kill -3 ${pid}”的操做(其实,这个命令正是获取目标 JVM 线程 dump 文件的一种方式,读者可以试试)。既然这里向目标 JVM 发送了这么个信号,那么我们现在就移步到3.1中讲到过的 signal_thread_entry 方法中去。

 1 static void signal_thread_entry(JavaThread* thread, TRAPS) {
 2   os::set_priority(thread, NearMaxPriority);
 3   while (true) {
 4     int sig;
 5     {
 6       // FIXME : Currently we have not decieded what should be the status
 7       //         for this java thread blocked here. Once we decide about
 8       //         that we should fix this.
 9       sig = os::signal_wait();
10     }
11     if (sig == os::sigexitnum_pd()) {
12        // Terminate the signal thread
13        return;
14     }
15
16     switch (sig) {
17       case SIGBREAK: {
18         // Check if the signal is a trigger to start the Attach Listener - in that
19         // case don't print stack traces.
20         if (!DisableAttachMechanism && AttachListener::is_init_trigger()) {
21           continue;
22         }
23         // Print stack traces
24         // Any SIGBREAK operations added here should make sure to flush
25         // the output stream (e.g. tty->flush()) after output.  See 4803766.
26         // Each module also prints an extra carriage return after its output.
27         VM_PrintThreads op;
28         VMThread::execute(&op);
29         VM_PrintJNI jni_op;
30         VMThread::execute(&jni_op);
31         VM_FindDeadlocks op1(tty);
32         VMThread::execute(&op1);
33         Universe::print_heap_at_SIGBREAK();
34         if (PrintClassHistogram) {
35           VM_GC_HeapInspection op1(gclog_or_tty, true /* force full GC before heap inspection */);
36           VMThread::execute(&op1);
37         }
38         if (JvmtiExport::should_post_data_dump()) {
39           JvmtiExport::post_data_dump();
40         }
41         break;
42       }
43       default: {
44         // Dispatch the signal to java
45         HandleMark hm(THREAD);
46         Klass* k = SystemDictionary::resolve_or_null(vmSymbols::sun_misc_Signal(), THREAD);
47         KlassHandle klass (THREAD, k);
48         if (klass.not_null()) {
49           JavaValue result(T_VOID);
50           JavaCallArguments args;
51           args.push_int(sig);
52           JavaCalls::call_static(
53             &result,
54             klass,
55             vmSymbols::dispatch_name(),
56             vmSymbols::int_void_signature(),
57             &args,
58             THREAD
59           );
60         }
61         if (HAS_PENDING_EXCEPTION) {
62           // tty is initialized early so we don't expect it to be null, but
63           // if it is we can't risk doing an initialization that might
64           // trigger additional out-of-memory conditions
65           if (tty != NULL) {
66             char klass_name[256];
67             char tmp_sig_name[16];
68             const char* sig_name = "UNKNOWN";
69             InstanceKlass::cast(PENDING_EXCEPTION->klass())->
70               name()->as_klass_external_name(klass_name, 256);
71             if (os::exception_name(sig, tmp_sig_name, 16) != NULL)
72               sig_name = tmp_sig_name;
73             warning("Exception %s occurred dispatching signal %s to handler"
74                     "- the VM may need to be forcibly terminated",
75                     klass_name, sig_name );
76           }
77           CLEAR_PENDING_EXCEPTION;
78         }
79       }
80     }
81   }
82 }

这里的17行,我们看到了有个对 SIGBREAK(宏定义) 信号处理的 case,事实上,这个SIGBREAK和前面客户端发过来的SIGQUIT 的值是一样的,都是“3”,熟悉 C语言的读者应该不难理解。所以,当客户端发送这个信号给目标 JVM 时,就理所应当的进入了这个 case 的处理逻辑。这里的27行到40行,事实上就是对“kill -3 ${pid}”执行时对应的处理逻辑“进行目标 JVM 进程的线程 dump 操作”。现在我们重点关注一下20行的 if 语句,第一个 boolean 值,某认情况下是false(可通过/hotspot/src/share/vm/runtime/globals.c)查看,表示某认情况下是不禁止attach 机制的,于是就会进入第二个条件的判断AttachListener::is_init_trigger(),这里的判断还是比较有意思的(即判断当前杀是不是需要进行 attach 的初始化操作),我们进入源码,源码的文件为:hotspot/src/os/bsd/vm/attachListener_bsd.cpp

 1 // If the file .attach_pid<pid> exists in the working directory
 2 // or /tmp then this is the trigger to start the attach mechanism
 3 bool AttachListener::is_init_trigger() {
 4   if (init_at_startup() || is_initialized()) {
 5     return false;               // initialized at startup or already initialized
 6   }
 7   char path[PATH_MAX + 1];
 8   int ret;
 9   struct stat st;
10
11   snprintf(path, PATH_MAX + 1, "%s/.attach_pid%d",
12            os::get_temp_directory(), os::current_process_id());
13   RESTARTABLE(::stat(path, &st), ret);
14   if (ret == 0) {
15     // simple check to avoid starting the attach mechanism when
16     // a bogus user creates the file
17     if (st.st_uid == geteuid()) {
18       init();
19       return true;
20     }
21   }
22   return false;
23 }

方法进入的第一行,即判断是不是在 JVM 启动时就初始化或者之前已经初始化过,如果是,则直接返回,否则继续当前方法。方法的第11行,是在处理/tmp/attach_pid${pid}路径(这个文件就是 Client JVM 在attach 时创建的),并把 path 传入13行定义的宏进行判断,如果这个文件存在,且刚好是当前用户的创建的 attach_pid 文件,则进入18行的 init() 方法,否则什么也不做,返回 false。接着我们进入 init 的源码(hotspot/src/share/vm/services/attachListener.cpp)

 1 // Starts the Attach Listener thread
 2 void AttachListener::init() {
 3   EXCEPTION_MARK;
 4   Klass* k = SystemDictionary::resolve_or_fail(vmSymbols::java_lang_Thread(), true, CHECK);
 5   instanceKlassHandle klass (THREAD, k);
 6   instanceHandle thread_oop = klass->allocate_instance_handle(CHECK);
 7
 8   const char thread_name[] = "Attach Listener";
 9   Handle string = java_lang_String::create_from_str(thread_name, CHECK);
10
11   // Initialize thread_oop to put it into the system threadGroup
12   Handle thread_group (THREAD, Universe::system_thread_group());
13   JavaValue result(T_VOID);
14   JavaCalls::call_special(&result, thread_oop,
15                        klass,
16                        vmSymbols::object_initializer_name(),
17                        vmSymbols::threadgroup_string_void_signature(),
18                        thread_group,
19                        string,
20                        THREAD);
21
22   if (HAS_PENDING_EXCEPTION) {
23     tty->print_cr("Exception in VM (AttachListener::init) : ");
24     java_lang_Throwable::print(PENDING_EXCEPTION, tty);
25     tty->cr();
26
27     CLEAR_PENDING_EXCEPTION;
28
29     return;
30   }
31
32   KlassHandle group(THREAD, SystemDictionary::ThreadGroup_klass());
33   JavaCalls::call_special(&result,
34                         thread_group,
35                         group,
36                         vmSymbols::add_method_name(),
37                         vmSymbols::thread_void_signature(),
38                         thread_oop,             // ARG 1
39                         THREAD);
40
41   if (HAS_PENDING_EXCEPTION) {
42     tty->print_cr("Exception in VM (AttachListener::init) : ");
43     java_lang_Throwable::print(PENDING_EXCEPTION, tty);
44     tty->cr();
45
46     CLEAR_PENDING_EXCEPTION;
47
48     return;
49   }
50
51   { MutexLocker mu(Threads_lock);
52     JavaThread* listener_thread = new JavaThread(&attach_listener_thread_entry);
53
54     // Check that thread and osthread were created
55     if (listener_thread == NULL || listener_thread->osthread() == NULL) {
56       vm_exit_during_initialization("java.lang.OutOfMemoryError",
57                                     "unable to create new native thread");
58     }
59
60     java_lang_Thread::set_thread(thread_oop(), listener_thread);
61     java_lang_Thread::set_daemon(thread_oop());
62
63     listener_thread->set_threadObj(thread_oop());
64     Threads::add(listener_thread);
65     Thread::start(listener_thread);
66   }
67 }

从源码中,我们可以看出来,这里最主要的功能是,创建一个名为“Attach Listener”的 Java 线程,该线程启动后会调用attach_listener_thread_entry这个方法(52行),来完成有关的任务处理。进入attach_listener_thread_entry方法

 1 // The Attach Listener threads services a queue. It dequeues an operation
 2 // from the queue, examines the operation name (command), and dispatches
 3 // to the corresponding function to perform the operation.
 4
 5 static void attach_listener_thread_entry(JavaThread* thread, TRAPS) {
 6   os::set_priority(thread, NearMaxPriority);
 7
 8   thread->record_stack_base_and_size();
 9
10   if (AttachListener::pd_init() != 0) {
11     return;
12   }
13   AttachListener::set_initialized();
14
15   for (;;) {
16     AttachOperation* op = AttachListener::dequeue();
17     if (op == NULL) {
18       return;   // dequeue failed or shutdown
19     }
20
21     ResourceMark rm;
22     bufferedStream st;
23     jint res = JNI_OK;
24
25     // handle special detachall operation
26     if (strcmp(op->name(), AttachOperation::detachall_operation_name()) == 0) {
27       AttachListener::detachall();
28     } else {
29       // find the function to dispatch too
30       AttachOperationFunctionInfo* info = NULL;
31       for (int i=0; funcs[i].name != NULL; i++) {
32         const char* name = funcs[i].name;
33         assert(strlen(name) <= AttachOperation::name_length_max, "operation <= name_length_max");
34         if (strcmp(op->name(), name) == 0) {
35           info = &(funcs[i]);
36           break;
37         }
38       }
39
40       // check for platform dependent attach operation
41       if (info == NULL) {
42         info = AttachListener::pd_find_operation(op->name());
43       }
44
45       if (info != NULL) {
46         // dispatch to the function that implements this operation
47         res = (info->func)(op, &st);
48       } else {
49         st.print("Operation %s not recognized!", op->name());
50         res = JNI_ERR;
51       }
52     }
53
54     // operation complete - send result and output to client
55     op->complete(res, &st);
56   }
57 }

这里需要关注两个方面的内容,

第一、第10行的AttachListener::pd_init();

第二、第15行开始的 for 循环里面的内容。

首先看AttachListener::pd_init()

 1 int AttachListener::pd_init() {
 2   JavaThread* thread = JavaThread::current();
 3   ThreadBlockInVM tbivm(thread);
 4
 5   thread->set_suspend_equivalent();
 6   // cleared by handle_special_suspend_equivalent_condition() or
 7   // java_suspend_self() via check_and_wait_while_suspended()
 8
 9   int ret_code = BsdAttachListener::init();
10
11   // were we externally suspended while we were waiting?
12   thread->check_and_wait_while_suspended();
13
14   return ret_code;
15 }

以上的 pd_init() 方法是在hotspot/src/os/bsd/vm/attachListener_bsd.cpp中实现的,我们看第9行的代码,调用了BsdAttachListener::init()一个这样的方法,该方法的主要作用就是生产 socket 通信文件的。源码如下

 1 // Initialization - create a listener socket and bind it to a file
 2
 3 int BsdAttachListener::init() {
 4   char path[UNIX_PATH_MAX];          // socket file
 5   char initial_path[UNIX_PATH_MAX];  // socket file during setup
 6   int listener;                      // listener socket (file descriptor)
 7
 8   // register function to cleanup
 9   ::atexit(listener_cleanup);
10
11   int n = snprintf(path, UNIX_PATH_MAX, "%s/.java_pid%d",
12                    os::get_temp_directory(), os::current_process_id());
13   if (n < (int)UNIX_PATH_MAX) {
14     n = snprintf(initial_path, UNIX_PATH_MAX, "%s.tmp", path);
15   }
16   if (n >= (int)UNIX_PATH_MAX) {
17     return -1;
18   }
19
20   // create the listener socket
21   listener = ::socket(PF_UNIX, SOCK_STREAM, 0);
22   if (listener == -1) {
23     return -1;
24   }
25
26   // bind socket
27   struct sockaddr_un addr;
28   addr.sun_family = AF_UNIX;
29   strcpy(addr.sun_path, initial_path);
30   ::unlink(initial_path);
31   int res = ::bind(listener, (struct sockaddr*)&addr, sizeof(addr));
32   if (res == -1) {
33     ::close(listener);
34     return -1;
35   }
36
37   // put in listen mode, set permissions, and rename into place
38   res = ::listen(listener, 5);
39   if (res == 0) {
40     RESTARTABLE(::chmod(initial_path, S_IREAD|S_IWRITE), res);
41     if (res == 0) {
42       // make sure the file is owned by the effective user and effective group
43       // (this is the default on linux, but not on mac os)
44       RESTARTABLE(::chown(initial_path, geteuid(), getegid()), res);
45       if (res == 0) {
46         res = ::rename(initial_path, path);
47       }
48     }
49   }
50   if (res == -1) {
51     ::close(listener);
52     ::unlink(initial_path);
53     return -1;
54   }
55   set_path(path);
56   set_listener(listener);
57
58   return 0;
59 }

从方法的注释,就能看出这个方法就是用来创建一个基于文件的 socket 的 listener 端,即服务端的。具体的创建过程,代码写的已经很清楚了,我做下简单描述,11行处,构建 socket 通信文件的路径(/tmp/.java_pid${pid}),21 行,创建一个 socket,其中关注 socket 函数的第一个参数,当为 PF_UNIX 时,表示创建文件 socket,详情可以参考 linux 的 socket 函数说明,然后到29行,将 socket 文件 path 拷贝到 socket 通信地址中,即以此文件作为通信地址,然后在31行时,将 socket 和该 socket 文件地址做一个绑定,38行,表示对当前 socket 进行监听(数字5表示监听时可容纳客户端连接的队列的大小),如果有Client JVM 的客户端连接上来,并且发送了相关消息,该服务端就可以对其进行相应处理了。至此,进程间 socket 的通信的通道就建立了。

其次看下 for 循环做了什么?其实很简单,16行,BsdAttachListener::dequeue() 从监听器的队列中拿到一个 Client JVM 的AttachOperation(当客户端 attach 上 target JVM 之后,往目标 JVM 发送任意 socket 信息,都会被放置到这个队列中,等待被处理),此处会被阻塞,直到收到请求,如下源码的13行, socket 的 accept 函数处于等待状态,等待来之客户端 JVM 的相关请求,一旦获取到请求,则将请求组装好返回给调用者一个BadAttachOperation 对象。

 1 // Dequeue an operation
 2 //
 3 // In the Bsd implementation there is only a single operation and clients
 4 // cannot queue commands (except at the socket level).
 5 //
 6 BsdAttachOperation* BsdAttachListener::dequeue() {
 7   for (;;) {
 8     int s;
 9
10     // wait for client to connect
11     struct sockaddr addr;
12     socklen_t len = sizeof(addr);
13     RESTARTABLE(::accept(listener(), &addr, &len), s);
14     if (s == -1) {
15       return NULL;      // log a warning?
16     }
17
18     // get the credentials of the peer and check the effective uid/guid
19     // - check with jeff on this.
20     uid_t puid;
21     gid_t pgid;
22     if (::getpeereid(s, &puid, &pgid) != 0) {
23       ::close(s);
24       continue;
25     }
26     uid_t euid = geteuid();
27     gid_t egid = getegid();
28
29     if (puid != euid || pgid != egid) {
30       ::close(s);
31       continue;
32     }
33
34     // peer credential look okay so we read the request
35     BsdAttachOperation* op = read_request(s);
36     if (op == NULL) {
37       ::close(s);
38       continue;
39     } else {
40       return op;
41     }
42   }
43 }

所以,只要收到一个AttachOperation不为“detachall”的操纵请求就会进入到45行处进行处理,这里的目的就是为了拿到对应的操作AttachOperationFunctionInfo对象,如果不为空,则调用其func,来完成对客户端的响应,如47行所示。AttachOperationFunctionInfo(/hotspot/src/share/vm/services/attachListener.cpp)的定义如下

 1 // names must be of length <= AttachOperation::name_length_max
 2 static AttachOperationFunctionInfo funcs[] = {
 3   { "agentProperties",  get_agent_properties },
 4   { "datadump",         data_dump },
 5   { "dumpheap",         dump_heap },
 6   { "load",             JvmtiExport::load_agent_library },
 7   { "properties",       get_system_properties },
 8   { "threaddump",       thread_dump },
 9   { "inspectheap",      heap_inspection },
10   { "setflag",          set_flag },
11   { "printflag",        print_flag },
12   { "jcmd",             jcmd },
13   { NULL,               NULL }
14 };

从这里,我们可以看到,threaddump、dumpheap 等我们常用的操纵。到此为止,水落石出,涉及到 attach 操纵的服务端的原理基本已经理清楚了。接下来我们以 jstack 为例,来看下客户端 JVM 是不是确实是以我们上面分析出来的方式与服务端 JVM 进行通信,并获取到它想要的内容的。

我们首先进入 jstack 的源码,源码目录为jdk/src/share/classes/sun/tools/jstack/JStack.java。进入 main 函数

 1 public static void main(String[] args) throws Exception {
 2         if (args.length == 0) {
 3             usage(1); // no arguments
 4         }
 5
 6         boolean useSA = false;
 7         boolean mixed = false;
 8         boolean locks = false;
 9
10         // Parse the options (arguments starting with "-" )
11         int optionCount = 0;
12         while (optionCount < args.length) {
13             String arg = args[optionCount];
14             if (!arg.startsWith("-")) {
15                 break;
16             }
17             if (arg.equals("-help") || arg.equals("-h")) {
18                 usage(0);
19             }
20             else if (arg.equals("-F")) {
21                 useSA = true;
22             }
23             else {
24                 if (arg.equals("-m")) {
25                     mixed = true;
26                 } else {
27                     if (arg.equals("-l")) {
28                        locks = true;
29                     } else {
30                         usage(1);
31                     }
32                 }
33             }
34             optionCount++;
35         }
36
37         // mixed stack implies SA tool
38         if (mixed) {
39             useSA = true;
40         }
41
42         // Next we check the parameter count. If there are two parameters
43         // we assume core file and executable so we use SA.
44         int paramCount = args.length - optionCount;
45         if (paramCount == 0 || paramCount > 2) {
46             usage(1);
47         }
48         if (paramCount == 2) {
49             useSA = true;
50         } else {
51             // If we can't parse it as a pid then it must be debug server
52             if (!args[optionCount].matches("[0-9]+")) {
53                 useSA = true;
54             }
55         }
56
57         // now execute using the SA JStack tool or the built-in thread dumper
58         if (useSA) {
59             // parameters (<pid> or <exe> <core>
60             String params[] = new String[paramCount];
61             for (int i=optionCount; i<args.length; i++ ){
62                 params[i-optionCount] = args[i];
63             }
64             runJStackTool(mixed, locks, params);
65         } else {
66             // pass -l to thread dump operation to get extra lock info
67             String pid = args[optionCount];
68             String params[];
69             if (locks) {
70                 params = new String[] { "-l" };
71             } else {
72                 params = new String[0];
73             }
74             runThreadDump(pid, params);
75         }
76     }

当采用 jstack -l 时,会走65行的 else 分支,最终执行77行的runThreadDump方法

 1 // Attach to pid and perform a thread dump
 2     private static void runThreadDump(String pid, String args[]) throws Exception {
 3         VirtualMachine vm = null;
 4         try {
 5             vm = VirtualMachine.attach(pid);
 6         } catch (Exception x) {
 7             String msg = x.getMessage();
 8             if (msg != null) {
 9                 System.err.println(pid + ": " + msg);
10             } else {
11                 x.printStackTrace();
12             }
13             if ((x instanceof AttachNotSupportedException) &&
14                 (loadSAClass() != null)) {
15                 System.err.println("The -F option can be used when the target " +
16                     "process is not responding");
17             }
18             System.exit(1);
19         }
20
21         // Cast to HotSpotVirtualMachine as this is implementation specific
22         // method.
23         InputStream in = ((HotSpotVirtualMachine)vm).remoteDataDump((Object[])args);
24
25         // read to EOF and just print output
26         byte b[] = new byte[256];
27         int n;
28         do {
29             n = in.read(b);
30             if (n > 0) {
31                 String s = new String(b, 0, n, "UTF-8");
32                 System.out.print(s);
33             }
34         } while (n > 0);
35         in.close();
36         vm.detach();
37     }

5行:执行VirtualMachine.attach(pid);则会达到3.1、3.2的效果,即服务端已经做好了所有 attach 所需的准备,如 socket 服务端、socket 通信文件、socket 请求处理线程“Attach Listener”。

23行:通过调用 HotSpotVirtualMachine 对象的 remoteDataDump 函数进行远程 dump,获得输入流 InputStream in,最后通过读取输入流的内容,来通过标准输出流输出从服务端获取的数据。至此,jstack -l 命令完成所有操作。

接下来,我们重点分析HotSpotVirtualMachine 对象的 remoteDataDump 函数。首先上HotSpotVirtualMachine(/jdk/src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java) 对象的 remoteDataDump的源码

1 // Remote ctrl-break. The output of the ctrl-break actions can
2     // be read from the input stream.
3     public InputStream remoteDataDump(Object ... args) throws IOException {
4         return executeCommand("threaddump", args);
5     }

请注意4行的 cmd 字符串为“threaddump”,这个和3.2中 AttachOperationFunctionInfo 的定义是吻合的,也就是说最终在服务端会调用 thread_dump 方法,来执行线程 dump,并将结果返回给客户端。接着我们看下下 executeCommand方法,该方法只是简单的调用 execute 方法,如下

 1 /*
 2      * Convenience method for simple commands
 3      */
 4     private InputStream executeCommand(String cmd, Object ... args) throws IOException {
 5         try {
 6             return execute(cmd, args);
 7         } catch (AgentLoadException x) {
 8             throw new InternalError("Should not get here", x);
 9         }
10     }

exectue 方法在该类中为抽象方法,其具体实现放在了sun.tools.attach.BsdVirtualMachine.java中,我们看下在其具体实现这里可是最最关键的地方了

 1 /**
 2      * Execute the given command in the target VM.
 3      */
 4     InputStream execute(String cmd, Object ... args) throws AgentLoadException, IOException {
 5         assert args.length <= 3;                // includes null
 6
 7         // did we detach?
 8         String p;
 9         synchronized (this) {
10             if (this.path == null) {
11                 throw new IOException("Detached from target VM");
12             }
13             p = this.path;
14         }
15
16         // create UNIX socket
17         int s = socket();
18
19         // connect to target VM
20         try {
21             connect(s, p);
22         } catch (IOException x) {
23             close(s);
24             throw x;
25         }
26
27         IOException ioe = null;
28
29         // connected - write request
30         // <ver> <cmd> <args...>
31         try {
32             writeString(s, PROTOCOL_VERSION);
33             writeString(s, cmd);
34
35             for (int i=0; i<3; i++) {
36                 if (i < args.length && args[i] != null) {
37                     writeString(s, (String)args[i]);
38                 } else {
39                     writeString(s, "");
40                 }
41             }
42         } catch (IOException x) {
43             ioe = x;
44         }
45
46
47         // Create an input stream to read reply
48         SocketInputStream sis = new SocketInputStream(s);
49
50         // Read the command completion status
51         int completionStatus;
52         try {
53             completionStatus = readInt(sis);
54         } catch (IOException x) {
55             sis.close();
56             if (ioe != null) {
57                 throw ioe;
58             } else {
59                 throw x;
60             }
61         }
62
63         if (completionStatus != 0) {
64             sis.close();
65
66             // In the event of a protocol mismatch then the target VM
67             // returns a known error so that we can throw a reasonable
68             // error.
69             if (completionStatus == ATTACH_ERROR_BADVERSION) {
70                 throw new IOException("Protocol mismatch with target VM");
71             }
72
73             // Special-case the "load" command so that the right exception is
74             // thrown.
75             if (cmd.equals("load")) {
76                 throw new AgentLoadException("Failed to load agent library");
77             } else {
78                 throw new IOException("Command failed in target VM");
79             }
80         }
81
82         // Return the input stream so that the command output can be read
83         return sis;
84     }

17行:创建一个 socket,这个是 socket 是一个 jni 本地方法,有兴趣的可以去看对应的实现,源码在jdk/src/solaris/native/sun/tools/attach/BsdVirtualMachine.c中,其关键操作就一个return socket(PF_UNIX, SOCK_STREAM, 0) 客户端 socket 连接。

21行:这里也是一个本地方法,调用了 connect(s,p),这里的 p 就是 attach 时产生的/tmp/.java_pid${pid}的 socket 文件路径,这样,客户端就和目标 JVM 连接上了,该方法同样是一个 native 方法,可以通过查看BsdVirtualMachine.c的源码来进行查看,如下,重点在16行使用 socket 文件路径作为连接地址 和 18 行与目标 JVM 端启动的 socket server 建立连接;

 1 /*
 2  * Class:     sun_tools_attach_BsdVirtualMachine
 3  * Method:    connect
 4  * Signature: (ILjava/lang/String;)I
 5  */
 6 JNIEXPORT void JNICALL Java_sun_tools_attach_BsdVirtualMachine_connect
 7   (JNIEnv *env, jclass cls, jint fd, jstring path)
 8 {
 9     jboolean isCopy;
10     const char* p = GetStringPlatformChars(env, path, &isCopy);
11     if (p != NULL) {
12         struct sockaddr_un addr;
13         int err = 0;
14
15         addr.sun_family = AF_UNIX;
16         strcpy(addr.sun_path, p);
17
18         if (connect(fd, (struct sockaddr*)&addr, sizeof(addr)) == -1) {
19             err = errno;
20         }
21
22         if (isCopy) {
23             JNU_ReleaseStringPlatformChars(env, path, p);
24         }
25
26         /*
27          * If the connect failed then we throw the appropriate exception
28          * here (can't throw it before releasing the string as can't call
29          * JNI with pending exception)
30          */
31         if (err != 0) {
32             if (err == ENOENT) {
33                 JNU_ThrowByName(env, "java/io/FileNotFoundException", NULL);
34             } else {
35                 char* msg = strdup(strerror(err));
36                 JNU_ThrowIOException(env, msg);
37                 if (msg != NULL) {
38                     free(msg);
39                 }
40             }
41         }
42     }

31~44行:想 Target JVM 端发送命令 threaddump、以及可能存在的相关参数,如-l;这里的 writeString 同样是一个本地方法,涉及到的底层操作就是一个 C 语言库的 write 操作,感兴趣的可以自己看源码,不再赘述;

48~83行:这里就是对当前 socket 连接,构建一个 SocketInputStream 对象,并等待Target JVM 端数据完全返回,最后将这个 InputStream 对象作为方法返回参数返回。

总结

本文结合 Attach 的原理和使用案例(jstack -l),对 Attach 的各个方面都进行了深入的分析和总结,希望能对有需要的同学有所帮助。当然,以上均为本人个人所学,所以难免会有错误和疏忽的地方,如果您发现了,还麻烦指出。