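In the first article on the zygote process we saw that ZygoteInit forks the systemServer process (if you haven't read it yet, start there: [android启动之zygote进程(一)](https://liqi.site/archives/android%E5%90%AF%E5%8A%A8%E4%B9%8Bzygote%E8%BF%9B%E7%A8%8B%E4%B8%80)). That article left some of the systemServer code unanalyzed, and this article picks it up.
Let's first recap the systemServer code we have already seen. When we analyzed the init process, the startup path led into ZygoteInit's main method, so let's look at that method again: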
```java
public static void main(String argv[]) {
..............
if (startSystemServer) {
startSystemServer(abiList, socketName, zygoteServer);
}
...........
} catch (Zygote.MethodAndArgsCaller caller) {
caller.run();
} catch (Throwable ex) {
Log.e(TAG, "System zygote died with exception", ex);
zygoteServer.closeServerSocket();
throw ex;
}
}
```
Only the parts related to systemServer are shown here. We got as far as the call to startSystemServer, which starts systemServer:
```java
private static boolean startSystemServer(String abiList, String socketName, ZygoteServer zygoteServer)
throws Zygote.MethodAndArgsCaller, RuntimeException {
.................
/* For child process */
if (pid == 0) { // the forked systemServer process takes this branch
if (hasSecondZygote(abiList)) {
waitForSecondaryZygote(socketName);
}
// Close the zygote socket inherited from the parent
zygoteServer.closeServerSocket();
// Continue the setup inside the freshly forked systemServer
handleSystemServerProcess(parsedArgs);
}
return true;
}
```
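This method forks the systemServer process and finally calls handleSystemServerProcess, which already runs inside the systemServer process. We continue the analysis of the systemServer startup flow from that method.
# The entry method of systemServer
Let's first look at handleSystemServerProcess: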
```java
private static void handleSystemServerProcess(
ZygoteConnection.Arguments parsedArgs)
throws Zygote.MethodAndArgsCaller {
// set umask to 0077 so new files and directories will default to owner-only permissions.
// Set the file-creation mask
Os.umask(S_IRWXG | S_IRWXO);
if (parsedArgs.niceName != null) {
// Set the process name to system_server
Process.setArgV0(parsedArgs.niceName);
}
final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
if (systemServerClasspath != null) {
performSystemServerDexOpt(systemServerClasspath);
// Capturing profiles is only supported for debug or eng builds since selinux normally
// prevents it.
boolean profileSystemServer = SystemProperties.getBoolean(
"dalvik.vm.profilesystemserver", false);
if (profileSystemServer && (Build.IS_USERDEBUG || Build.IS_ENG)) {
try {
File profileDir = Environment.getDataProfilesDePackageDirectory(
Process.SYSTEM_UID, "system_server");
File profile = new File(profileDir, "primary.prof");
profile.getParentFile().mkdirs();
profile.createNewFile();
String[] codePaths = systemServerClasspath.split(":");
VMRuntime.registerAppInfo(profile.getPath(), codePaths);
} catch (Exception e) {
Log.wtf(TAG, "Failed to set up system server profile", e);
}
}
}
```
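This is the first part of the method. It mostly performs initialization and optimization and does not affect the control flow, so we only skim it. The first step sets the permissions that newly created files and directories will get, which is a good moment for a short introduction to permissions.
Anyone familiar with Linux has seen commands such as chmod 777 applied to a file or directory. What does that mean? Permissions are written as three digits, and each position stands for one class of user: the owner of the file or directory, users in the same group, and all other users. The digit in each position is the permission granted to that class.
Now for the meaning of the digits themselves. There are three basic permissions: read, write and execute. 000 means not readable, not writable and not executable, while 111 means readable, writable and executable, so 0 means the permission is absent and 1 means it is present. Where do the other digits come from? 0 and 1 are the binary form, and the other digits are simply the same value written in decimal. Binary 111 is 7, binary 011 is 3, and so on. One decimal digit is therefore enough to describe a permission set, which is more compact than binary.
With the positions and the digits covered, let's look at how umask relates to permissions. As said above, the highest permission digit is 7 (read, write and execute). A newly created file, however, normally gets at most 6: it is not executable right away and has to be made executable later with chmod. The table below shows the permission digit that results from each umask digit:
|umask digit|File|Directory|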
|-------|-------|-------|
|0|6|7|
|1|6|6|
|2|4|5|
|3|4|4|
|4|2|3|
|5|2|2|
|6|0|1|
|7|0|0|
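The table maps each umask digit to the resulting permission: umask digit 0 yields permission 6 or 7 (a new file gets at most 6), and umask digit 7 yields permission 0. Back in the code, the mask is built from S_IRWXG and S_IRWXO, which are defined as follows: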
```cpp
#define S_IRWXG 00070
#define S_IRWXO 00007
```
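OR-ing the two gives 077. By the table above, a umask of 077 results in permissions 600 for files and 700 for directories, so only the owner gets access. That is all the permission background this code needs, so let's move on.
Next the process name is set, which at this point is system_server. After that the code checks whether the SYSTEMSERVERCLASSPATH environment variable has a value and, if so, performs dex optimization. We won't cover the optimization here; it may come up later in an article on package installation. What we do want to trace is where this environment variable is set.
We start with the file init.environ.rc.in: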
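The umask arithmetic above can be checked with a few lines of Java. This is only an illustrative sketch: the class and method names are made up for this article, and the requested modes 0666 and 0777 are the usual defaults a program asks for when creating a file or a directory.

```java
final class UmaskDemo {
    /** Returns the mode that is actually applied: the requested bits minus the bits set in the mask. */
    static int applyUmask(int requestedMode, int umask) {
        return requestedMode & ~umask;
    }

    public static void main(String[] args) {
        // Same value as S_IRWXG | S_IRWXO in handleSystemServerProcess
        int umask = 0070 | 0007;
        // A new file is typically requested with 0666, a new directory with 0777
        System.out.println("file: " + Integer.toOctalString(applyUmask(0666, umask)));
        System.out.println("dir:  " + Integer.toOctalString(applyUmask(0777, umask)));
    }
}
```

With the mask 077 the group and other bits are always cleared, which is why the results match the 600 and 700 read from the table.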
```cpp
on init
export ANDROID_BOOTLOGO 1
export ANDROID_ROOT /system
export ANDROID_ASSETS /system/app
export ANDROID_DATA /data
export ANDROID_STORAGE /storage
export EXTERNAL_STORAGE /sdcard
export ASEC_MOUNTPOINT /mnt/asec
export BOOTCLASSPATH %BOOTCLASSPATH%
export SYSTEMSERVERCLASSPATH %SYSTEMSERVERCLASSPATH%
%EXPORT_GLOBAL_ASAN_OPTIONS%
%EXPORT_GLOBAL_GCOV_OPTIONS%
```
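We have met this file before: it is a file of commands to execute, so readers of the earlier articles will find it familiar. SYSTEMSERVERCLASSPATH is passed to export here, and what export really does is write the value into the environment. The method that implements it was analyzed earlier, so we won't repeat it; see the previous articles if needed.
What we care about is the value of SYSTEMSERVERCLASSPATH. It comes from the file /system/core/rootdir/Android.mk in the source tree: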
```cpp
$(LOCAL_BUILT_MODULE): $(LOCAL_PATH)/init.environ.rc.in $(bcp_dep)
@echo "Generate: $< -> $@"
@mkdir -p $(dir $@)
$(hide) sed -e 's?%BOOTCLASSPATH%?$(PRODUCT_BOOTCLASSPATH)?g' $< >$@
$(hide) sed -i -e 's?%SYSTEMSERVERCLASSPATH%?$(PRODUCT_SYSTEM_SERVER_CLASSPATH)?g' $@
$(hide) sed -i -e 's?%EXPORT_GLOBAL_ASAN_OPTIONS%?$(EXPORT_GLOBAL_ASAN_OPTIONS)?g' $@
$(hide) sed -i -e 's?%EXPORT_GLOBAL_GCOV_OPTIONS%?$(EXPORT_GLOBAL_GCOV_OPTIONS)?g' $@
```
The path is taken from the variable PRODUCT_SYSTEM_SERVER_CLASSPATH, which is set in /build/core/dex_preopt.mk in the source tree:
```cpp
PRODUCT_SYSTEM_SERVER_CLASSPATH := $(subst $(space),:,$(foreach m,$(PRODUCT_SYSTEM_SERVER_JARS),/system/framework/$(m).jar))
```
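The value is a colon-separated list of jar files under /system/framework. We won't look into what each jar contains; knowing where the path comes from is enough. Let's return to the earlier code.
So far the systemServer process has set its umask and optimized the dex files. Now it starts calling the methods that matter: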
```java
if (parsedArgs.invokeWith != null) { // not set when starting systemServer, so the else branch runs
...............
} else {
ClassLoader cl = null;
if (systemServerClasspath != null) {
cl = createPathClassLoader(systemServerClasspath, parsedArgs.targetSdkVersion);
Thread.currentThread().setContextClassLoader(cl);
}
/*
* Pass the remaining arguments to SystemServer.
*/
// Eventually calls SystemServer's main method
ZygoteInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
}
```
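The code first checks whether the invokeWith field is null. We introduced this field earlier: in a normal startup it is null, and it only has a value when the process is launched through some other tool. We therefore only follow the normal startup path.
That means the else branch. It first sets up a ClassLoader, which is used later to load the class through reflection, and then calls ZygoteInit's zygoteInit method: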
```java
// Starts the class that was passed in. This runs in a process forked from zygote, which is why the binder thread pool is initialized below
public static final void zygoteInit(int targetSdkVersion, String[] argv,
ClassLoader classLoader) throws Zygote.MethodAndArgsCaller {
if (RuntimeInit.DEBUG) {
Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
}
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
// Redirect the log output streams
RuntimeInit.redirectLogStreams();
// Common initialization
RuntimeInit.commonInit();
// Initialize this process's binder thread pool
ZygoteInit.nativeZygoteInit();
// Initialization specific to the new process
RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
}
```
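The method is short, but it calls several others. It first initializes the log output streams, which we only need to be aware of. It then calls RuntimeInit.commonInit(), which performs more initialization. Let's look at that method: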
```java
protected static final void commonInit() {
if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");
// Pre-handler that logs an uncaught exception first
Thread.setUncaughtExceptionPreHandler(new LoggingHandler());
// Default handler for uncaught exceptions
Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler());
// Set the time zone
TimezoneGetter.setInstance(new TimezoneGetter() {
@Override
public String getId() {
return SystemProperties.get("persist.sys.timezone");
}
});
TimeZone.setDefault(null);
// Reset the log manager
LogManager.getLogManager().reset();
new AndroidConfig();
// Set the HTTP user agent
String userAgent = getDefaultUserAgent();
System.setProperty("http.agent", userAgent);
// Initialize network traffic accounting
NetworkManagementSocketTagger.install();
String trace = SystemProperties.get("ro.kernel.android.tracing");
if (trace.equals("1")) {
Slog.i(TAG, "NOTE: emulator trace profiling enabled");
Debug.enableEmulatorTraceOutput();
}
initialized = true;
}
```
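This method sets the time zone and the user agent, resets logging and initializes traffic accounting. Each of these belongs to a different module, and readers interested in one of them can dig deeper. Here we focus on the uncaught-exception handlers installed at the beginning.
In everyday development we often use try-catch to handle exceptions, and if an exception is not caught anywhere the program crashes. Java has a hook through which exceptions that no try block caught are delivered: the dispatchUncaughtException method: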
```java
public final void dispatchUncaughtException(Throwable e) {
// BEGIN Android-added: uncaughtExceptionPreHandler for use by platform.
Thread.UncaughtExceptionHandler initialUeh =
Thread.getUncaughtExceptionPreHandler();
if (initialUeh != null) {
try {
initialUeh.uncaughtException(this, e);
} catch (RuntimeException | Error ignored) {
// Throwables thrown by the initial handler are ignored
}
}
// END Android-added: uncaughtExceptionPreHandler for use by platform.
getUncaughtExceptionHandler().uncaughtException(this, e);
}
```
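The method first calls getUncaughtExceptionPreHandler to obtain an UncaughtExceptionHandler and runs its uncaughtException method. It then calls getUncaughtExceptionHandler and runs that handler's uncaughtException method. So even when your code is not wrapped in a try block, the exception is still handled here. Let's first see which object the first call, getUncaughtExceptionPreHandler, returns: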
```java
public static UncaughtExceptionHandler getUncaughtExceptionPreHandler() {
return uncaughtExceptionPreHandler;
}
```
It returns uncaughtExceptionPreHandler, and this field is set by exactly the first call in the commonInit method we are analyzing:
```java
public static void setUncaughtExceptionPreHandler(UncaughtExceptionHandler eh) {
uncaughtExceptionPreHandler = eh;
}
```
In commonInit, the object passed to setUncaughtExceptionPreHandler is a LoggingHandler. Let's look at that class:
```java
private static class LoggingHandler implements Thread.UncaughtExceptionHandler {
@Override
public void uncaughtException(Thread t, Throwable e) {
// Don't re-enter if KillApplicationHandler has already run
if (mCrashing) return;
if (mApplicationObject == null) {
// The "FATAL EXCEPTION" string is still used on Android even though
// apps can set a custom UncaughtExceptionHandler that renders uncaught
// exceptions non-fatal.
Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
} else {
StringBuilder message = new StringBuilder();
// The "FATAL EXCEPTION" string is still used on Android even though
// apps can set a custom UncaughtExceptionHandler that renders uncaught
// exceptions non-fatal.
message.append("FATAL EXCEPTION: ").append(t.getName()).append("\n");
final String processName = ActivityThread.currentProcessName();
if (processName != null) {
message.append("Process: ").append(processName).append(", ");
}
message.append("PID: ").append(Process.myPid());
Clog_e(TAG, message.toString(), e);
}
}
}
```
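Its uncaughtException method is what runs when an exception goes uncaught. It prints the process name, the pid and related information, so by default every crash logs these details about the process. The second call in commonInit installs another UncaughtExceptionHandler, this time through setDefaultUncaughtExceptionHandler: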
```java
public static void setDefaultUncaughtExceptionHandler(UncaughtExceptionHandler ueh) {
@SuppressWarnings("removal")
SecurityManager sm = System.getSecurityManager();
if (sm != null) {
sm.checkPermission(
new RuntimePermission("setDefaultUncaughtExceptionHandler"));
}
defaultUncaughtExceptionHandler = ueh;
}
```
This stores the UncaughtExceptionHandler in the defaultUncaughtExceptionHandler field, which is also used by the dispatchUncaughtException method discussed above. Let's see what the class installed here looks like:
```java
private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
public void uncaughtException(Thread t, Throwable e) {
try {
// Don't re-enter -- avoid infinite loops if crash-reporting crashes.
if (mCrashing) return;
mCrashing = true;
// Try to end profiling. If a profiler is running at this point, and we kill the
// process (below), the in-memory buffer will be lost. So try to stop, which will
// flush the buffer. (This makes method trace profiling useful to debug crashes.)
if (ActivityThread.currentActivityThread() != null) {
ActivityThread.currentActivityThread().stopProfiling();
}
// Bring up crash dialog, wait for it to be dismissed
// Show the crash dialog
ActivityManager.getService().handleApplicationCrash(
mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
} catch (Throwable t2) {
if (t2 instanceof DeadObjectException) {
// System process is dead; ignore
} else {
try {
Clog_e(TAG, "Error reporting crash", t2);
} catch (Throwable t3) {
// Even Clog_e() fails! Oh well.
}
}
} finally {
// Try everything to make sure this process goes away.
// Kill the process
Process.killProcess(Process.myPid());
System.exit(10);
}
}
}
```
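The handler first calls AMS's handleApplicationCrash, which brings up the standard crash dialog, and the finally block then calls Process.killProcess to kill the current process. The familiar sequence of a crash dialog followed by the app exiting is set up right here. Having seen this, you can try the same mechanism in your own app to catch exceptions you did not anticipate during development. If a third-party library you integrate throws an error it does not handle, a handler like this could deal with it provisionally. These are all ideas worth thinking about after reading the code.
# Initializing the binder thread
That is all for commonInit. Back in ZygoteInit's zygoteInit method, the next call is ZygoteInit.nativeZygoteInit(). It is a native method, implemented as follows: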
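As a starting point for experimenting with this in your own code, here is a minimal sketch using only the standard Thread.setDefaultUncaughtExceptionHandler API (the pre-handler shown earlier is internal to the Android platform and is not used here). The class name, the thread name and the message are invented for the example, and a real app would normally keep a reference to the previous handler and delegate to it.

```java
import java.util.concurrent.atomic.AtomicReference;

final class UncaughtHandlerDemo {
    /** Last crash report seen by the handler; hypothetical storage for this sketch. */
    static final AtomicReference<String> lastCrash = new AtomicReference<>();

    /** Installs a process-wide handler that records the thread name and the message. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) ->
                lastCrash.set(thread.getName() + ": " + error.getMessage()));
    }

    /** Starts a worker that throws without any try-catch and waits for it to die. */
    static String crashOnWorker(String threadName, String message) {
        Thread worker = new Thread(() -> {
            throw new IllegalStateException(message);
        }, threadName);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return lastCrash.get();
    }

    public static void main(String[] args) {
        install();
        System.out.println(crashOnWorker("worker-1", "boom"));
    }
}
```

Unlike KillApplicationHandler, this handler does not terminate the process, so only the worker thread dies and the rest of the program keeps running.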
```cpp
static void com_android_internal_os_ZygoteInit_nativeZygoteInit(JNIEnv* env, jobject clazz)
{
gCurRuntime->onZygoteInit();
}
virtual void onZygoteInit()
{
sp<ProcessState> proc = ProcessState::self();
ALOGV("App process: starting thread pool.\n");
proc->startThreadPool();
}
void ProcessState::startThreadPool()
{
AutoMutex _l(mLock);
if (!mThreadPoolStarted) {
mThreadPoolStarted = true;
spawnPooledThread(true);
}
}
void ProcessState::spawnPooledThread(bool isMain)
{
if (mThreadPoolStarted) {
String8 name = makeBinderThreadName();
ALOGV("Spawning new pooled thread, name=%s\n", name.string());
sp<Thread> t = new PoolThread(isMain);
t->run(name.string());
}
}
```
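The calls chain all the way down to ProcessState's spawnPooledThread method. Readers of the binder series may remember that every process starts binder threads after launch. A binder thread sleeps while it has nothing to do, and once cross-process communication arrives the binder driver wakes the corresponding thread, which then handles the work. spawnPooledThread is where the process initializes its binder thread. It calls run on a PoolThread, so let's look at that class: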
```cpp
class PoolThread : public Thread
{
public:
PoolThread(bool isMain)
: mIsMain(isMain)
{
}
protected:
virtual bool threadLoop() // thread body
{
IPCThreadState::self()->joinThreadPool(mIsMain);
return false;
}
const bool mIsMain;
};
```
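PoolThread extends Thread and overrides threadLoop, which is triggered from run. We will see in a moment how the Thread class gets from run to threadLoop. threadLoop calls IPCThreadState's joinThreadPool, which was mentioned in the binder series but never analyzed in detail, because it is rather abstract without the context of process startup. Now that we are discussing process startup, looking at how the binder thread starts comes naturally, so we will cover this method as well.
First, how does the C++ Thread class get to threadLoop? In Java we put the code a thread should execute into its run method. C++ works a little differently. Readers familiar with pthread know that the function to execute is passed in when the thread is created, like this: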
```c++
pthread_create(&tid, NULL, fun, (void*)b)
```
This call creates a thread, and its third argument is the function the thread will execute. That is all we need to know about it here. Next, Thread's run method:
```cpp
status_t Thread::run(const char* name, int32_t priority, size_t stack)
{
...........
bool res;
if (mCanCallJava) {
res = createThreadEtc(_threadLoop,
this, name, priority, stack, &mThread);
} else {
res = androidCreateRawThreadEtc(_threadLoop,
this, name, priority, stack, &mThread);
}
............
}
```
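mCanCallJava indicates whether the new thread is allowed to call into Java code; it does not say whether this is the main thread. PoolThread does not override the Thread default, which is true, so createThreadEtc is called. Note that the first argument is _threadLoop, a static wrapper that in turn calls our threadLoop method:
```cpp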
inline bool createThreadEtc(thread_func_t entryFunction,
void *userData,
const char* threadName = "android:unnamed_thread",
int32_t threadPriority = PRIORITY_DEFAULT,
size_t threadStackSize = 0,
thread_id_t *threadId = 0)
{
return androidCreateThreadEtc(entryFunction, userData, threadName,
threadPriority, threadStackSize, threadId) ? true : false;
}
```
createThreadEtc forwards to androidCreateThreadEtc, so let's look at that next:
```cpp
static android_create_thread_fn gCreateThreadFn = androidCreateRawThreadEtc;
int androidCreateThreadEtc(android_thread_func_t entryFunction,
void *userData,
const char* threadName,
int32_t threadPriority,
size_t threadStackSize,
android_thread_id_t *threadId)
{
return gCreateThreadFn(entryFunction, userData, threadName,
threadPriority, threadStackSize, threadId);
}
```
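This calls gCreateThreadFn, which by default points to androidCreateRawThreadEtc. (A process can install a different function through androidSetCreateThreadFunc; as far as I can tell the Android runtime does so with a wrapper that attaches the thread to the VM and then still ends up in the raw version.) So we continue with androidCreateRawThreadEtc: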
```cpp
int androidCreateRawThreadEtc(android_thread_func_t entryFunction,
void *userData,
const char* threadName __android_unused,
int32_t threadPriority,
size_t threadStackSize,
android_thread_id_t *threadId)
{
...........
int result = pthread_create(&thread, &attr,
(android_pthread_entry)entryFunction, userData);
.............
}
```
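Here we finally see pthread_create, and its third argument entryFunction is the _threadLoop wrapper passed in from run, which is what calls threadLoop. That completes the path from run to threadLoop. Now let's see what threadLoop does: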
```cpp
virtual bool threadLoop() // thread body
{
IPCThreadState::self()->joinThreadPool(mIsMain);
return false;
}
```
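To recap the path from run to threadLoop, here is a rough Java analogue of the pattern. It is a sketch and not the Android code: LoopingThread and its static trampoline are names invented here, the trampoline plays the role of the static _threadLoop wrapper that is handed to the thread-creation API, and the assumption that the wrapper keeps calling threadLoop until it returns false is a simplification of the C++ implementation.

```java
final class LoopingThreadDemo {
    /** Runs a worker whose threadLoop returns false after `limit` calls and reports the call count. */
    static int countIterations(int limit) {
        int[] calls = {0};
        LoopingThread worker = new LoopingThread() {
            @Override
            protected boolean threadLoop() {
                calls[0]++;
                return calls[0] < limit; // false ends the thread
            }
        };
        Thread t = worker.run("demo");
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println("threadLoop was called " + countIterations(3) + " times");
    }
}

abstract class LoopingThread {
    /** One unit of work; return true to be called again, false to finish. */
    protected abstract boolean threadLoop();

    /** Static entry point given to the thread API, standing in for _threadLoop. */
    private static void trampoline(LoopingThread self) {
        while (self.threadLoop()) {
            // keep looping until the subclass asks to stop
        }
    }

    /** Creates and starts the underlying thread, like Thread::run in the C++ code. */
    Thread run(String name) {
        Thread t = new Thread(() -> trampoline(this), name);
        t.start();
        return t;
    }
}
```

PoolThread's threadLoop returns false, so in this model its body would run exactly once; the long-running loop lives inside joinThreadPool instead.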
It calls IPCThreadState's joinThreadPool method, so let's step into it:
```cpp
void IPCThreadState::joinThreadPool(bool isMain) // add the current thread to the thread pool
{
LOG_THREADPOOL("**** THREAD %p (PID %d) IS JOINING THE THREAD POOL\n", (void*)pthread_self(), getpid());
mOut.writeInt32(isMain ? BC_ENTER_LOOPER : BC_REGISTER_LOOPER);
status_t result;
do {
processPendingDerefs();
// now get the next command to be processed, waiting if necessary
result = getAndExecuteCommand();
if (result < NO_ERROR && result != TIMED_OUT && result != -ECONNREFUSED && result != -EBADF) {
ALOGE("getAndExecuteCommand(fd=%d) returned unexpected error %d, aborting",
mProcess->mDriverFD, result);
abort();
}
// Let this thread exit the thread pool if it is no longer
// needed and it is not the main process thread.
if(result == TIMED_OUT && !isMain) {
break;
}
} while (result != -ECONNREFUSED && result != -EBADF);
LOG_THREADPOOL("**** THREAD %p (PID %d) IS LEAVING THE THREAD POOL err=%d\n",
(void*)pthread_self(), getpid(), result);
mOut.writeInt32(BC_EXIT_LOOPER);
talkWithDriver(false);
}
```
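The exit conditions of the loop above are easy to misread, so here is a small Java model of just that logic. It is a sketch under stated assumptions: the numeric values of TIMED_OUT, ECONNREFUSED and EBADF are the usual Linux errno-based values and are only illustrative, the results array stands in for successive return values of getAndExecuteCommand, and the abort() branch for unexpected errors is left out.

```java
final class JoinLoopDemo {
    // Illustrative values; the real constants come from the native headers
    static final int NO_ERROR = 0;
    static final int TIMED_OUT = -110;
    static final int ECONNREFUSED = 111;
    static final int EBADF = 9;

    /** Returns how many commands a thread handles before it leaves the pool. */
    static int commandsProcessed(int[] results, boolean isMain) {
        int processed = 0;
        for (int result : results) {
            processed++;
            // A non-main thread leaves the pool once it times out waiting for work
            if (result == TIMED_OUT && !isMain) {
                break;
            }
            // Any thread leaves when the driver connection is gone
            if (result == -ECONNREFUSED || result == -EBADF) {
                break;
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        int[] results = {NO_ERROR, TIMED_OUT, NO_ERROR};
        System.out.println("main thread handled " + commandsProcessed(results, true));
        System.out.println("extra thread handled " + commandsProcessed(results, false));
    }
}
```

The point of the model is that a thread started with isMain set to true survives a timeout, while a thread added on demand gives itself back.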
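This method already belongs to binder. The binder articles did not analyze the entry point that creates the thread pool at process startup, so we look at it here. mOut is a Parcel. Because isMain is true for this thread, the command written first is BC_ENTER_LOOPER. Then getAndExecuteCommand is called, which returns the result of the exchange with the binder driver. As long as things are normal, getAndExecuteCommand is called in a loop; otherwise the loop exits. At the end the method writes the command for leaving the binder thread into mOut and calls talkWithDriver, which we will also meet inside getAndExecuteCommand. Let's look at getAndExecuteCommand first: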
```cpp
status_t IPCThreadState::getAndExecuteCommand()
{
status_t result;
int32_t cmd;
// Send the commands queued in mOut to the binder driver
result = talkWithDriver();
if (result >= NO_ERROR) {
size_t IN = mIn.dataAvail(); // amount of data returned by the binder driver
if (IN < sizeof(int32_t)) return result;
cmd = mIn.readInt32(); // command returned by the binder driver
IF_LOG_COMMANDS() {
alog << "Processing top-level Command: "
<< getReturnString(cmd) << endl;
}
pthread_mutex_lock(&mProcess->mThreadCountLock);
mProcess->mExecutingThreadsCount++;
if (mProcess->mExecutingThreadsCount >= mProcess->mMaxThreads &&
mProcess->mStarvationStartTimeMs == 0) {
mProcess->mStarvationStartTimeMs = uptimeMillis();
}
pthread_mutex_unlock(&mProcess->mThreadCountLock);
// Handle the command returned by the binder driver
result = executeCommand(cmd);
pthread_mutex_lock(&mProcess->mThreadCountLock);
mProcess->mExecutingThreadsCount--;
if (mProcess->mExecutingThreadsCount < mProcess->mMaxThreads &&
mProcess->mStarvationStartTimeMs != 0) {
int64_t starvationTimeMs = uptimeMillis() - mProcess->mStarvationStartTimeMs;
if (starvationTimeMs > 100) {
ALOGE("binder thread pool (%zu threads) starved for %" PRId64 " ms",
mProcess->mMaxThreads, starvationTimeMs);
}
mProcess->mStarvationStartTimeMs = 0;
}
pthread_cond_broadcast(&mProcess->mThreadCountDecrement);
pthread_mutex_unlock(&mProcess->mThreadCountLock);
}
return result;
}
```
The method first calls talkWithDriver. We covered it in depth in the binder articles: it is the method that talks to the binder driver. We won't explain it in detail again and only follow the flow here:
```cpp
status_t IPCThreadState::talkWithDriver(bool doReceive)
{
.............
if (ioctl(mProcess->mDriverFD, BINDER_WRITE_READ, &bwr) >= 0) // the actual exchange with the binder driver
err = NO_ERROR;
else
err = -errno;
.............
}
```
The ioctl call enters the binder driver, in the binder_ioctl function of the kernel's binder.c. The command is BINDER_WRITE_READ, and execution eventually reaches binder_thread_write:
```c
case BC_ENTER_LOOPER: // register the calling thread as a binder thread
if (binder_debug_mask & BINDER_DEBUG_THREADS)
printk(KERN_INFO "binder: %d:%d BC_ENTER_LOOPER\n",
proc->pid, thread->pid);
// BINDER_LOOPER_STATE_REGISTERED marks a thread that the driver itself requested
// because the pool ran short, so such a thread must not send BC_ENTER_LOOPER as well
if (thread->looper & BINDER_LOOPER_STATE_REGISTERED) {
thread->looper |= BINDER_LOOPER_STATE_INVALID;
binder_user_error("binder: %d:%d ERROR:"
" BC_ENTER_LOOPER called after "
"BC_REGISTER_LOOPER\n",
proc->pid, thread->pid);
}
thread->looper |= BINDER_LOOPER_STATE_ENTERED; // mark as a registered binder thread
break;
```
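The code is shown for reference only. It was analyzed in detail in the binder article [android进程间通信binder(一)](https://liqi.site/archives/android%E8%BF%9B%E7%A8%8B%E9%97%B4%E9%80%9A%E4%BF%A1binder%E4%B8%80). Once the thread has been registered as a binder thread and everything went well, control returns to binder_ioctl in the driver file binder.c, which calls binder_thread_read. If there is no work at that point, the thread goes to sleep. When inter-process communication arrives later, the normal binder flow starts; for the details see the binder article linked above.
Back in IPCThreadState's joinThreadPool: at the end of the method, after the binder thread has left the loop because of a timeout or an error, talkWithDriver is called once more to send BC_EXIT_LOOPER to the binder driver. This command ends up in binder_thread_write in the kernel's binder.c, where the binder thread is unregistered: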
```c
case BC_EXIT_LOOPER:
if (binder_debug_mask & BINDER_DEBUG_THREADS)
printk(KERN_INFO "binder: %d:%d BC_EXIT_LOOPER\n",
proc->pid, thread->pid);
thread->looper |= BINDER_LOOPER_STATE_EXITED; // mark the binder thread as exited
break;
```
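Unregistering follows the same logic as registering: one sets the BINDER_LOOPER_STATE_ENTERED bit in the looper field of the binder_thread object, the other sets BINDER_LOOPER_STATE_EXITED, and that is all there is to registering and unregistering. Binder is fairly complex and cannot be explained in a few words here, so readers who are unsure about the details should refer to the article linked above, which analyzes binder thoroughly.
That completes the registration of the binder thread. Back in ZygoteInit, the next call is RuntimeInit's applicationInit method: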
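The flag handling in the two driver snippets can be mimicked in a few lines of Java. This is only a sketch of the bit manipulation: the numeric values assigned to the four states are assumptions chosen for illustration, and the class is not part of any Android API.

```java
final class LooperStateDemo {
    // Assumed bit values, for illustration only
    static final int REGISTERED = 0x01; // thread was requested by the driver
    static final int ENTERED = 0x02;    // thread joined on its own via BC_ENTER_LOOPER
    static final int EXITED = 0x04;     // thread sent BC_EXIT_LOOPER
    static final int INVALID = 0x08;    // thread used the protocol incorrectly

    /** Mirrors the BC_ENTER_LOOPER case: flag misuse, then mark the thread as entered. */
    static int enterLooper(int looper) {
        if ((looper & REGISTERED) != 0) {
            looper |= INVALID;
        }
        return looper | ENTERED;
    }

    /** Mirrors the BC_EXIT_LOOPER case: only sets the exited bit. */
    static int exitLooper(int looper) {
        return looper | EXITED;
    }

    public static void main(String[] args) {
        int looper = enterLooper(0);
        System.out.println("after enter: 0x" + Integer.toHexString(looper));
        looper = exitLooper(looper);
        System.out.println("after exit:  0x" + Integer.toHexString(looper));
    }
}
```

Note that both operations only set bits and never clear them, so a thread that has exited still carries its entered bit.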
```java
protected static void applicationInit(int targetSdkVersion, String[] argv, ClassLoader classLoader)
throws Zygote.MethodAndArgsCaller {
// If the application calls System.exit(), terminate the process
// immediately without running any shutdown hooks. It is not possible to
// shutdown an Android application gracefully. Among other things, the
// Android runtime shutdown hooks close the Binder driver, which can cause
// leftover running threads to crash before the process actually exits.
// With true, a call to System.exit() skips onExit, which would release the binder file descriptor;
// threads that are still running at that point could otherwise crash
nativeSetExitWithoutCleanup(true);
// We want to be fairly aggressive about heap utilization, to avoid
// holding on to a lot of memory that isn't needed.
// Set the target heap utilization
VMRuntime.getRuntime().setTargetHeapUtilization(0.75f);
// Set the target SDK version
VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);
final Arguments args;
try {
// Wrap the arguments in an Arguments object
args = new Arguments(argv);
} catch (IllegalArgumentException ex) {
Slog.e(TAG, ex.getMessage());
// let the process exit
return;
}
// The end of of the RuntimeInit event (see #zygoteInit).
Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
// Remaining arguments are passed to the start class's static main
// Go on to call the main method of the start class
invokeStaticMain(args.startClass, args.startArgs, classLoader);
}
```
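This method leads to the main method of the start class. It first calls nativeSetExitWithoutCleanup with true. The parameter controls whether AppRuntime's onExit is skipped when the process exits through System.exit; onExit is where the binder file descriptor gets closed. As analyzed above, every process has binder threads from startup and inter-process commands are handled on them, so closing binder while threads are still running can crash the program. Google's comment above the call says as much, and it is enough to be aware of it.
Next the target heap utilization and the SDK version are set, and then the arguments passed in from earlier are parsed. An Arguments object is created to do the parsing: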
```java
Arguments(String args[]) throws IllegalArgumentException {
parseArgs(args);
}
private void parseArgs(String args[])
throws IllegalArgumentException {
int curArg = 0;
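// Walk through the arguments
for (; curArg < args.length; curArg++) {
String arg = args[curArg];
// A bare "--" ends the options: skip it and stop
if (arg.equals("--")) {
curArg++;
break;
} else if (!arg.startsWith("--")) {
// The first argument that does not start with "--" is the class name: stop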
break;
}
}
if (curArg == args.length) {
throw new IllegalArgumentException("Missing classname argument to RuntimeInit!");
}
startClass = args[curArg++];
startArgs = new String[args.length - curArg];
System.arraycopy(args, curArg, startArgs, 0, startArgs.length);
}
```
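The constructor delegates to parseArgs. The parsing follows the same logic we saw in app_main's main method after the zygote process was started: the arguments are extracted in the same way and stored in the Arguments object. We won't walk through it again; readers who are unsure can refer to the explanation of app_main.
Back in applicationInit, the last call is invokeStaticMain. Let's follow it: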
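To see what parseArgs produces for a concrete input, the logic can be copied into a standalone class. This is a sketch: RuntimeArgsDemo and the sample arguments are invented for the example, and Arrays.copyOfRange is used in place of System.arraycopy for brevity.

```java
import java.util.Arrays;

final class RuntimeArgsDemo {
    final String startClass;
    final String[] startArgs;

    RuntimeArgsDemo(String[] args) {
        int curArg = 0;
        for (; curArg < args.length; curArg++) {
            String arg = args[curArg];
            if (arg.equals("--")) {
                curArg++; // skip the separator itself
                break;
            } else if (!arg.startsWith("--")) {
                break; // first non-option argument is the class name
            }
        }
        if (curArg == args.length) {
            throw new IllegalArgumentException("Missing classname argument");
        }
        startClass = args[curArg++];
        // Everything after the class name is passed on to its main method
        startArgs = Arrays.copyOfRange(args, curArg, args.length);
    }

    public static void main(String[] args) {
        // Hypothetical input: one option, a class name and two arguments
        RuntimeArgsDemo parsed = new RuntimeArgsDemo(
                new String[] {"--flag", "com.example.Main", "a", "b"});
        System.out.println(parsed.startClass + " " + Arrays.toString(parsed.startArgs));
    }
}
```

Leading options that start with "--" are simply skipped here; the first remaining argument becomes startClass and the rest become startArgs.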
```java
private static void invokeStaticMain(String className, String[] argv, ClassLoader classLoader)
throws Zygote.MethodAndArgsCaller {
Class<?> cl;
try {
// Load the class
cl = Class.forName(className, true, classLoader);
} catch (ClassNotFoundException ex) {
throw new RuntimeException(
"Missing class when invoking static main " + className,
ex);
}
Method m;
try {
// Look up the class's main method
m = cl.getMethod("main", new Class[] { String[].class });
} catch (NoSuchMethodException ex) {
throw new RuntimeException(
"Missing static main on " + className, ex);
} catch (SecurityException ex) {
throw new RuntimeException(
"Problem getting static main on " + className, ex);
}
int modifiers = m.getModifiers();
// Fail if the method is not public static
if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
throw new RuntimeException(
"Main method is not public and static on " + className);
}
/*
* This throw gets caught in ZygoteInit.main(), which responds
* by invoking the exception's run() method. This arrangement
* clears up all the stack frames that were required in setting
* up the process.
*/
// The exception thrown here is caught in ZygoteInit.main()
throw new Zygote.MethodAndArgsCaller(m, argv);
}
```
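This method prepares the call to main. It first loads the class, whose name arrived through the arguments; the class about to be started is com.android.server.SystemServer. Class.forName returns its Class object, and the method to call, main, is looked up on it. The code then checks that the method is public static and throws if it is not. If all is well, it finally throws a Zygote.MethodAndArgsCaller exception. Where is this exception caught? Back at the very beginning of this article; here is the code again: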
```java
public static void main(String argv[]) {
..............
if (startSystemServer) {
startSystemServer(abiList, socketName, zygoteServer);
}
...........
} catch (Zygote.MethodAndArgsCaller caller) {
caller.run();
} catch (Throwable ex) {
Log.e(TAG, "System zygote died with exception", ex);
zygoteServer.closeServerSocket();
throw ex;
}
}
```
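Notice the first catch: its parameter is Zygote.MethodAndArgsCaller, so the exception is caught here in ZygoteInit's main method. Don't get confused by the name, though. The class is called ZygoteInit, but this process was forked from zygote and we are now inside the systemServer process, so it is the systemServer process that catches the exception.
Let's see what the caught object executes: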
```java
// The systemServer process starts its main method by throwing this exception, whose run() is then executed
public static class MethodAndArgsCaller extends Exception
implements Runnable {
/** method to call */
private final Method mMethod;
/** argument array */
private final String[] mArgs;
public MethodAndArgsCaller(Method method, String[] args) {
mMethod = method;
mArgs = args;
}
public void run() {
try {
// Invoke the method through reflection, for example main
mMethod.invoke(null, new Object[] { mArgs });
} catch (IllegalAccessException ex) {
throw new RuntimeException(ex);
} catch (InvocationTargetException ex) {
Throwable cause = ex.getCause();
if (cause instanceof RuntimeException) {
throw (RuntimeException) cause;
} else if (cause instanceof Error) {
throw (Error) cause;
}
throw new RuntimeException(ex);
}
}
}
```
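The run method invokes the target method through reflection, and as explained above the target here is the main method of com.android.server.SystemServer, which we will follow next. Before that, a question: why is the method executed by throwing an exception and calling it from the catch block? Couldn't it be invoked through reflection in the normal flow?
The reason is to clear the stack. After zygote has started, all other processes are created by forking it, so at fork time the stack may still hold frames left over from zygote that the new process does not need. On top of that, the new process runs handleSystemServerProcess after the fork, and main is only called at the very end of that chain, so the stack already contains data before main is invoked. When the entry method of a process starts normally, the stack should be empty, and leftover data might cause bugs, so it is best to clear it. With Java's exception mechanism, all stack frames between the throw site and the catch site are discarded. The catch site is the method from which zygote started systemServer in the first place, which guarantees that everything pushed between starting systemServer and calling main is gone. This is why Google uses a thrown exception to run the process entry method here. With that settled, let's step into the main method of com.android.server.SystemServer.
# Entering SystemServer's main method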
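The effect of this throw-and-catch trampoline can be observed in plain Java. The sketch below is not the Android code: TrampolineDemo, its nested MethodAndArgsCaller and the entry method are stand-ins written for this article, the 20 nested calls merely simulate deep setup code, and the stack depth is measured with a Throwable's stack trace.

```java
import java.lang.reflect.Method;

final class TrampolineDemo {
    /** Hypothetical stand-in for Zygote.MethodAndArgsCaller. */
    static final class MethodAndArgsCaller extends Exception {
        private final Method method;
        private final String[] args;

        MethodAndArgsCaller(Method method, String[] args) {
            this.method = method;
            this.args = args;
        }

        void run() throws ReflectiveOperationException {
            method.invoke(null, new Object[] { args });
        }
    }

    static int depthAtThrow;
    static int depthAtEntry;

    /** Plays the role of SystemServer.main and records how deep the stack is. */
    public static void entry(String[] args) {
        depthAtEntry = new Throwable().getStackTrace().length;
    }

    /** Simulates nested setup calls, then throws instead of calling entry directly. */
    private static void deepSetup(int levels) throws MethodAndArgsCaller, NoSuchMethodException {
        if (levels > 0) {
            deepSetup(levels - 1);
            return;
        }
        depthAtThrow = new Throwable().getStackTrace().length;
        Method main = TrampolineDemo.class.getMethod("entry", String[].class);
        throw new MethodAndArgsCaller(main, new String[0]);
    }

    /** Plays the role of ZygoteInit.main: catches the exception and runs it. */
    static void launch() {
        try {
            deepSetup(20);
        } catch (MethodAndArgsCaller caller) {
            try {
                caller.run(); // the setup frames have already been unwound here
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        launch();
        System.out.println("frames when thrown:  " + depthAtThrow);
        System.out.println("frames inside entry: " + depthAtEntry);
    }
}
```

The second number should come out smaller than the first, because the entry method runs from the catch block after the setup frames have been discarded.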
```java
public static void main(String[] args) {
new SystemServer().run();
}
```
main calls SystemServer's run method. Let's see what run does:
```java
private void run() {
try {
traceBeginAndSlog("InitBeforeStartServices");
// If a device's clock is before 1970 (before 0), a lot of
// APIs crash dealing with negative numbers, notably
// java.io.File#setLastModified, so instead we fake it and
// hope that time from cell towers or NTP fixes it shortly.
// Initialize the system time: if the clock is before 1970, set it to 1970
if (System.currentTimeMillis() < EARLIEST_SUPPORTED_TIME) {
Slog.w(TAG, "System clock is before 1970; setting to 1970.");
SystemClock.setCurrentTimeMillis(EARLIEST_SUPPORTED_TIME);
}
//
// Default the timezone property to GMT if not set.
// Initialize the time zone
String timezoneProperty = SystemProperties.get("persist.sys.timezone");
if (timezoneProperty == null || timezoneProperty.isEmpty()) {
Slog.w(TAG, "Timezone not set; setting to GMT.");
SystemProperties.set("persist.sys.timezone", "GMT");
}
// If the system has "persist.sys.language" and friends set, replace them with
// "persist.sys.locale". Note that the default locale at this point is calculated
// using the "-Duser.locale" command line flag. That flag is usually populated by
// AndroidRuntime using the same set of system properties, but only the system_server
// and system apps are allowed to set them.
//
// NOTE: Most changes made here will need an equivalent change to
// core/jni/AndroidRuntime.cpp
// Initialize the locale
if (!SystemProperties.get("persist.sys.language").isEmpty()) {
final String languageTag = Locale.getDefault().toLanguageTag();
SystemProperties.set("persist.sys.locale", languageTag);
SystemProperties.set("persist.sys.language", "");
SystemProperties.set("persist.sys.country", "");
SystemProperties.set("persist.sys.localevar", "");
}
// The system server should never make non-oneway calls
// Warn whenever a blocking binder call is made; system_server should only make oneway (asynchronous) calls
Binder.setWarnOnBlocking(true);
// Here we go!
Slog.i(TAG, "Entered the Android system server!");
int uptimeMillis = (int) SystemClock.elapsedRealtime();
EventLog.writeEvent(EventLogTags.BOOT_PROGRESS_SYSTEM_RUN, uptimeMillis);
if (!mRuntimeRestart) {
MetricsLogger.histogram(null, "boot_system_server_init", uptimeMillis);
}
// In case the runtime switched since last boot (such as when
// the old runtime was removed in an OTA), set the system
// property so that it is in sync. We can't do this in
// libnativehelper's JniInvocation::Init code where we already
// had to fallback to a different runtime because it is
// running as root and we need to be the system user to set
// the property. http://b/11463182
// Record the VM library in a system property
SystemProperties.set("persist.sys.dalvik.vm.lib.2", VMRuntime.getRuntime().vmLibrary());
// Enable performance profiling
// Enable the sampling profiler.
if (SamplingProfilerIntegration.isEnabled()) {
SamplingProfilerIntegration.start();
mProfilerSnapshotTimer = new Timer();
mProfilerSnapshotTimer.schedule(new TimerTask() {
@Override
public void run() {
SamplingProfilerIntegration.writeSnapshot("system_server", null);
}
}, SNAPSHOT_INTERVAL, SNAPSHOT_INTERVAL);
}
// Mmmmmm... more memory!
// Clear the heap growth limit
VMRuntime.getRuntime().clearGrowthLimit();
// The system server has to run all of the time, so it needs to be
// as efficient as possible with its memory usage.
// Target heap utilization of 0.8
VMRuntime.getRuntime().setTargetHeapUtilization(0.8f);
// Some devices rely on runtime fingerprint generation, so make sure
// we've defined it before booting further.
// Make sure the build fingerprint property is defined
Build.ensureFingerprintProperty();
// Within the system server, it is an error to access Environment paths without
// explicitly specifying a user.
// Inside system_server, Environment paths must be accessed with an explicit user
Environment.setUserRequired(true);
// Within the system server, any incoming Bundles should be defused
// to avoid throwing BadParcelableException.
// With this set, incoming Bundles are defused so that BadParcelableException is not thrown
BaseBundle.setShouldDefuse(true);
// Ensure binder calls into the system always run at foreground priority.
// Binder calls into the system run at foreground priority
BinderInternal.disableBackgroundScheduling(true);
// Increase the number of binder threads in system_server
// Set the maximum number of binder threads
BinderInternal.setMaxThreads(sMaxBinderThreads);
// Prepare the main looper thread (this thread).
// Set the thread priority
android.os.Process.setThreadPriority(
android.os.Process.THREAD_PRIORITY_FOREGROUND);
// Do not allow this process to put itself into the background
android.os.Process.setCanSelfBackground(false);
// Prepare the looper of the main thread
Looper.prepareMainLooper();
// Initialize native services.
// Load the android_servers native library
System.loadLibrary("android_servers");
// Check whether we failed to shut down last time we tried.
// This call may not return.
// Check whether a previous shutdown or reboot is still pending and handle it
performPendingShutdown();
// Initialize the system context.
// Create the system context
createSystemContext();
// Create the system service manager.
// Create the SystemServiceManager
mSystemServiceManager = new SystemServiceManager(mSystemContext);
mSystemServiceManager.setRuntimeRestarted(mRuntimeRestart);
// Store the SystemServiceManager in LocalServices, a map whose key is the class and whose value is the instance
LocalServices.addService(SystemServiceManager.class, mSystemServiceManager);
// Prepare the thread pool for init tasks that can be parallelized
// Initialize the thread pool used for init tasks
SystemServerInitThreadPool.get();
} finally {
traceEnd(); // InitBeforeStartServices
}
```
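run is an entry method that initializes the modules the rest of the system will need. We go through it in sections, and this is the first. It initializes a lot: the time and time zone, the native library, the handler looper, the context, the SystemServiceManager and more. SystemServiceManager should not be confused with the binder ServiceManager covered in the binder articles: it lives inside system_server and is responsible for creating the system services and driving their lifecycle. We won't describe the class in detail here. After creating it, the code calls addService to put it into LocalServices: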
```java
private static final ArrayMap<Class<?>, Object> sLocalServiceObjects =
new ArrayMap<Class<?>, Object>();
public static <T> void addService(Class<T> type, T service) {
synchronized (sLocalServiceObjects) {
if (sLocalServiceObjects.containsKey(type)) {
throw new IllegalStateException("Overriding service registration");
}
sLocalServiceObjects.put(type, service);
}
}
```
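Here there is a map, sLocalServiceObjects, whose key is a Class and whose value is an object. Services meant to be used inside the system_server process are stored in it and later retrieved by class; registering the same class twice throws an IllegalStateException. SystemServiceManager is stored here now, and we will see other services stored the same way later.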
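To make the class-keyed lookup concrete, here is a minimal sketch of the same idea outside Android. The class name LocalRegistry and the getService helper are made up for this illustration and are not the framework's code; only the duplicate-registration check mirrors the listing above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for LocalServices: at most one instance per class.
final class LocalRegistry {
    private static final Map<Class<?>, Object> SERVICES = new HashMap<>();

    private LocalRegistry() {}

    static <T> void addService(Class<T> type, T service) {
        synchronized (SERVICES) {
            if (SERVICES.containsKey(type)) {
                throw new IllegalStateException("Overriding service registration");
            }
            SERVICES.put(type, service);
        }
    }

    // Returns the registered instance, or null if nothing was registered for the class.
    static <T> T getService(Class<T> type) {
        synchronized (SERVICES) {
            // Class.cast keeps the lookup type-safe without an unchecked cast
            return type.cast(SERVICES.get(type));
        }
    }
}
```

Because the key is the Class object itself, the caller gets back a correctly typed instance without any string names, which suits services that are only used inside one process.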
In the first part of the run method we also saw a call to createSystemContext, which creates the Context:
```java
private void createSystemContext() {
// Get the ActivityThread object
ActivityThread activityThread = ActivityThread.systemMain();
// Get the system Context
mSystemContext = activityThread.getSystemContext();
mSystemContext.setTheme(DEFAULT_SYSTEM_THEME);
// Get the UI context; compared with the context above, its Resources probably differ and support theming
final Context systemUiContext = activityThread.getSystemUiContext();
systemUiContext.setTheme(DEFAULT_SYSTEM_THEME);
}
```
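As we can see, it first calls ActivityThread.systemMain, which returns an ActivityThread object. From the earlier AMS articles we know that an ActivityThread describes a process and holds many things that belong to it, such as the binder that represents the process and the process's Activities. What happens here is that system_server gets its own ActivityThread, from which its Context is then obtained. Let's look at systemMain: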
```java
public static ActivityThread systemMain() {
// The system process on low-memory devices do not get to use hardware
// accelerated drawing, since this can add too much overhead to the
// process.
// If this is not a high-end graphics device
if (!ActivityManager.isHighEndGfx()) {
// Disable hardware-accelerated rendering
ThreadedRenderer.disable(true);
} else {
ThreadedRenderer.enableForegroundTrimming();
}
ActivityThread thread = new ActivityThread();
thread.attach(true);
return thread;
}
```
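This method decides, based on the hardware configuration, whether to disable hardware-accelerated rendering to reduce the load on the process. It then creates an ActivityThread and calls its attach method. We analyzed the creation of an ActivityThread in the earlier AMS articles, so here we step in only to look at what is different: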
```java
private void attach(boolean system) {
sCurrentActivityThread = this;
mSystemThread = system; // whether this is the system process
if (!system) {
...........
}else{
android.ddm.DdmHandleAppName.setAppName("system_process",
UserHandle.myUserId());
try {
// Create an Instrumentation
mInstrumentation = new Instrumentation();
// Create the ContextImpl object
ContextImpl context = ContextImpl.createAppContext(
this, getSystemContext().mPackageInfo);
// Create the process's Application object
mInitialApplication = context.mPackageInfo.makeApplication(true, null);
// Call the Application's onCreate method
mInitialApplication.onCreate();
} catch (Exception e) {
throw new RuntimeException(
"Unable to instantiate Application():" + e.toString(), e);
}
}
............
}
```
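We have already met this attach method while analyzing AMS. Its parameter says whether this is the system process. Earlier we followed the non-system path; here we look at the system path.
The argument here is true, meaning the system process, so the else branch runs. It creates an Instrumentation object. Readers familiar with AMS know that many component lifecycle callbacks are dispatched through this object, so it can be seen as a wrapper class. After that it creates the Context and the familiar Application, and finally calls the Application's onCreate. All of this was covered in the AMS articles, so we will not repeat it; readers who are not familiar with it can refer to them.
Creating the ActivityThread of the system process is fairly simple: the initialization done here also exists for non-system processes, and the flow is shorter. Back in SystemServer's createSystemContext, the next step obtains the Context from this ActivityThread, and then a systemUiContext as well. The difference from the first context lies in the Resources it holds, which carry theme-related resources. It is enough to know this for now.
Back in SystemServer's run method, we also see a thread pool object being initialized: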
```java
private ExecutorService mService = ConcurrentUtils.newFixedThreadPool(4,
"system-server-init-thread", Process.THREAD_PRIORITY_FOREGROUND);
public static synchronized SystemServerInitThreadPool get() {
if (sInstance == null) {
sInstance = new SystemServerInitThreadPool();
}
Preconditions.checkState(sInstance.mService != null, "Cannot get " + TAG
+ " - it has been shut down");
return sInstance;
}
```
The thread pool object holds an ExecutorService with four threads. We can guess that later code will hand work to these threads; that is all we need to know for now.
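As a rough illustration of how such a pool is used, the sketch below submits a few tasks to a four-thread pool and then shuts it down, which is the same submit-then-shutdown shape that the startup code follows. It uses only java.util.concurrent; the class InitPoolSketch and its method are invented for this example, and the 10-second timeout is an arbitrary assumption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class InitPoolSketch {
    // Runs the given tasks on a pool of four threads and returns how many completed.
    static int runInitTasks(Runnable... tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger done = new AtomicInteger();
        for (Runnable task : tasks) {
            pool.submit(() -> {
                task.run();
                done.incrementAndGet();
            });
        }
        // shutdown() accepts no new tasks, but tasks already submitted still run
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("init tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for init tasks", e);
        }
        return done.get();
    }
}
```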
# Starting the system services
So far the work has mostly been initialization. More concrete work follows, so let's look at the next part of the code:
```java
try {
traceBeginAndSlog("StartServices");
// Start the bootstrap services
startBootstrapServices();
// Start the core services
startCoreServices();
// Start the remaining services
startOtherServices();
SystemServerInitThreadPool.shutdown();
} catch (Throwable ex) {
Slog.e("System", "******************************************");
Slog.e("System", "************ Failure starting system services", ex);
throw ex;
} finally {
traceEnd();
}
// For debug builds, log event loop stalls to dropbox for analysis.
// On debug builds, event loop stalls are logged to dropbox
if (StrictMode.conditionallyEnableDebugLogging()) {
Slog.i(TAG, "Enabled StrictMode for system server main thread.");
}
if (!mRuntimeRestart && !isFirstBootOrUpgrade()) {
int uptimeMillis = (int) SystemClock.elapsedRealtime();
MetricsLogger.histogram(null, "boot_system_server_ready", uptimeMillis);
final int MAX_UPTIME_MILLIS = 60 * 1000;
if (uptimeMillis > MAX_UPTIME_MILLIS) {
Slog.wtf(SYSTEM_SERVER_TIMING_TAG,
"SystemServer init took too long. uptimeMillis=" + uptimeMillis);
}
}
// Loop forever.
Looper.loop(); // The main thread waits here for Handler messages
throw new RuntimeException("Main thread loop unexpectedly exited");
```
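Let's look at this part as a whole. It mainly calls three service-related methods: startBootstrapServices, startCoreServices and startOtherServices. Android starts many system services at boot, such as AMS and WMS, and this is where they are started; we will go through these methods in turn. Afterwards SystemServerInitThreadPool.shutdown() closes the thread pool. We can infer that the three methods use the pool and that it is closed once they are done with it. Finally, loop is called to wait for messages. We do not analyze the Handler module here because a separate article covers it, and the intent of this code should be clear.
Next we look at what these service-starting methods do, beginning with startBootstrapServices: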
```java
private void startBootstrapServices() {
Slog.i(TAG, "Reading configuration...");
final String TAG_SYSTEM_CONFIG = "ReadingSystemConfig";
traceBeginAndSlog(TAG_SYSTEM_CONFIG);
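// Read the system configuration. submit expects a Runnable as its first argument; here it is written as a method reference (class name, double colon, method name)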
SystemServerInitThreadPool.get().submit(SystemConfig::getInstance, TAG_SYSTEM_CONFIG);
traceEnd();
// Wait for installd to finish starting up so that it has a chance to
// create critical directories such as /data/user with the appropriate
// permissions. We need this to complete before we initialize other services.
traceBeginAndSlog("StartInstaller");
// Start Installer, which is used when installing apps (it talks to the installd daemon mentioned above)
Installer installer = mSystemServiceManager.startService(Installer.class);
traceEnd();
// In some cases after launching an app we need to access device identifiers,
// therefore register the device identifier policy before the activity manager.
traceBeginAndSlog("DeviceIdentifiersPolicyService");
// Start DeviceIdentifiersPolicyService, which governs access to device identifiers
mSystemServiceManager.startService(DeviceIdentifiersPolicyService.class);
traceEnd();
// Activity manager runs the show.
traceBeginAndSlog("StartActivityManager");
// Start AMS
mActivityManagerService = mSystemServiceManager.startService(
ActivityManagerService.Lifecycle.class).getService();
// Hand the SystemServiceManager to AMS
mActivityManagerService.setSystemServiceManager(mSystemServiceManager);
// Hand the Installer to AMS
mActivityManagerService.setInstaller(installer);
traceEnd();
// Power manager needs to be started early because other services need it.
// Native daemons may be watching for it to be registered so it must be ready
// to handle incoming binder calls immediately (including being able to verify
// the permissions for those calls).
traceBeginAndSlog("StartPowerManager");
// Start the power manager service
mPowerManagerService = mSystemServiceManager.startService(PowerManagerService.class);
traceEnd();
// Now that the power manager has been started, let the activity manager
// initialize power management features.
traceBeginAndSlog("InitPowerManagement");
// Initialize power management in AMS
mActivityManagerService.initPowerManagement();
traceEnd();
// Bring up recovery system in case a rescue party needs a reboot
if (!SystemProperties.getBoolean("config.disable_noncore", false)) {
traceBeginAndSlog("StartRecoverySystemService");
// Start the recovery system service
mSystemServiceManager.startService(RecoverySystemService.class);
traceEnd();
}
// Now that we have the bare essentials of the OS up and running, take
// note that we just booted, which might send out a rescue party if
// we're stuck in a runtime restart loop.
RescueParty.noteBoot(mSystemContext);
// Manages LEDs and display backlight so we need it to bring up the display.
traceBeginAndSlog("StartLightsService");
// Start the lights service
mSystemServiceManager.startService(LightsService.class);
traceEnd();
// Display manager is needed to provide display metrics before package manager
// starts up.
traceBeginAndSlog("StartDisplayManager");
// Start the display manager service
mDisplayManagerService = mSystemServiceManager.startService(DisplayManagerService.class);
traceEnd();
// We need the default display before we can initialize the package manager.
traceBeginAndSlog("WaitForDisplay");
// Boot phase 100: at this point we wait for the default display
mSystemServiceManager.startBootPhase(SystemService.PHASE_WAIT_FOR_DEFAULT_DISPLAY);
traceEnd();
// Only run "core" apps if we're encrypting the device.
// Read the encryption state
String cryptState = SystemProperties.get("vold.decrypt");
if (ENCRYPTING_STATE.equals(cryptState)) {
Slog.w(TAG, "Detected encryption in progress - only parsing core apps");
mOnlyCore = true;
} else if (ENCRYPTED_STATE.equals(cryptState)) {
Slog.w(TAG, "Device encrypted - only parsing core apps");
mOnlyCore = true;
}
// Start the package manager.
if (!mRuntimeRestart) {
MetricsLogger.histogram(null, "boot_package_manager_init_start",
(int) SystemClock.elapsedRealtime());
}
traceBeginAndSlog("StartPackageManagerService");
// Create PackageManagerService (PKMS)
mPackageManagerService = PackageManagerService.main(mSystemContext, installer,
mFactoryTestMode != FactoryTest.FACTORY_TEST_OFF, mOnlyCore);
// Whether this is the first boot
mFirstBoot = mPackageManagerService.isFirstBoot();
// Get the PackageManager
mPackageManager = mSystemContext.getPackageManager();
traceEnd();
if (!mRuntimeRestart && !isFirstBootOrUpgrade()) {
MetricsLogger.histogram(null, "boot_package_manager_init_ready",
(int) SystemClock.elapsedRealtime());
}
// Manages A/B OTA dexopting. This is a bootstrap service as we need it to rename
// A/B artifacts after boot, before anything else might touch/need them.
// Note: this isn't needed during decryption (we don't have /data anyways).
if (!mOnlyCore) {
boolean disableOtaDexopt = SystemProperties.getBoolean("config.disable_otadexopt",
false);
if (!disableOtaDexopt) {
traceBeginAndSlog("StartOtaDexOptService");
try {
// Create the service that handles dexopt for A/B OTA updates
OtaDexoptService.main(mSystemContext, mPackageManagerService);
} catch (Throwable e) {
reportWtf("starting OtaDexOptService", e);
} finally {
traceEnd();
}
}
}
traceBeginAndSlog("StartUserManagerService");
// Start the user manager service
mSystemServiceManager.startService(UserManagerService.LifeCycle.class);
traceEnd();
// Initialize attribute cache used to cache resources from packages.
traceBeginAndSlog("InitAttributerCache");
// Initialize the attribute (resource) cache
AttributeCache.init(mSystemContext);
traceEnd();
// Set up the Application instance for the system process and get started.
traceBeginAndSlog("SetSystemProcess");
// Register several system services, then record system_server's own process information in AMS
mActivityManagerService.setSystemProcess();
traceEnd();
// DisplayManagerService needs to setup android.display scheduling related policies
// since setSystemProcess() would have overridden policies due to setProcessGroup
// Put the display-related threads into the top scheduling group
mDisplayManagerService.setupSchedulerPolicies();
// Manages Overlay packages
traceBeginAndSlog("StartOverlayManagerService");
// Start OverlayManagerService
mSystemServiceManager.startService(new OverlayManagerService(mSystemContext, installer));
traceEnd();
// The sensor service needs access to package manager service, app ops
// service, and permissions service, therefore we start it after them.
// Start sensor service in a separate thread. Completion should be checked
// before using it.
mSensorServiceStart = SystemServerInitThreadPool.get().submit(() -> {
BootTimingsTraceLog traceLog = new BootTimingsTraceLog(
SYSTEM_SERVER_TIMING_ASYNC_TAG, Trace.TRACE_TAG_SYSTEM_SERVER);
traceLog.traceBegin(START_SENSOR_SERVICE);
// Start the sensor service
startSensorService();
traceLog.traceEnd();
}, START_SENSOR_SERVICE);
}
```
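This method starts the bootstrap services. It is very long and starts a great many services, so we cannot go through them one by one. The comments in the code above note what most of the started services are for, and readers interested in a particular module can dig into it. We will not analyze individual services here; our focus is on how the system starts a service.
First, look at the opening statement SystemServerInitThreadPool.get().submit(SystemConfig::getInstance, TAG_SYSTEM_CONFIG). This uses the thread pool created earlier: a task is submitted to the pool to run SystemConfig.getInstance. SystemConfig is the system configuration class, so this simply reads the system configuration. We do not care about the configuration details, but the syntax deserves a note. submit takes a Runnable to run on a pool thread, yet SystemConfig::getInstance is a method: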
```java
public static SystemConfig getInstance() {
synchronized (SystemConfig.class) {
if (sInstance == null) {
sInstance = new SystemConfig();
}
return sInstance;
}
}
```
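This may look puzzling, but it is Java 8 syntax: a method reference, which is shorthand for a lambda expression. Let's look at the Runnable interface: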
```java
@FunctionalInterface
public interface Runnable {
public abstract void run();
}
```
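Note the @FunctionalInterface annotation. It marks an interface that has exactly one abstract method, and any such interface can be implemented with a lambda expression (the annotation itself is optional and only lets the compiler check the rule). For example, the traditional way to create a Runnable is: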
```java
Runnable aa = new Runnable() {
@Override
public void run() {
xxx();
}
};
```
Here we call a method xxx() inside run. With a Java 8 lambda expression the same thing can be written as:
```java
Runnable aa = () -> xxx();
```
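The parentheses before -> hold the parameters. run takes no parameters, so empty parentheses are enough, and the body follows ->. When the body only calls one existing method, it can be shortened further to a method reference such as SystemConfig::getInstance; because run returns void, the value returned by getInstance is simply discarded. That is why submit in the code above can be passed the method directly.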
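The three spellings can be compared side by side in a small, self-contained sketch. The class MethodRefSketch and its getInstance method are hypothetical stand-ins invented for this example (they are not SystemConfig); the point is only that a method returning a value can still be used where a Runnable is expected.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class MethodRefSketch {
    static final AtomicInteger CALLS = new AtomicInteger();

    // Hypothetical stand-in for a getInstance() method: it has a side effect and returns a value.
    static String getInstance() {
        CALLS.incrementAndGet();
        return "config";
    }

    // Runs the same call through an anonymous class, a lambda and a method reference,
    // and returns how many times getInstance() was invoked.
    static int runAllForms() {
        CALLS.set(0);
        Runnable anonymous = new Runnable() {
            @Override
            public void run() {
                getInstance();
            }
        };
        Runnable lambda = () -> getInstance();
        Runnable methodRef = MethodRefSketch::getInstance; // the return value is discarded
        anonymous.run();
        lambda.run();
        methodRef.run();
        return CALLS.get();
    }
}
```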
Back in startBootstrapServices, the long stretch of code that follows starts one service after another by calling SystemServiceManager's startService method. Let's look at it:
```java
public <T extends SystemService> T startService(Class<T> serviceClass) {
try {
final String name = serviceClass.getName();
Slog.i(TAG, "Starting " + name);
Trace.traceBegin(Trace.TRACE_TAG_SYSTEM_SERVER, "StartService " + name);
// Create the service.
// The class must be a subclass of SystemService
if (!SystemService.class.isAssignableFrom(serviceClass)) {
throw new RuntimeException("Failed to create " + name
+ ": service must extend " + SystemService.class.getName());
}
final T service;
try {
// Create the service instance through reflection
Constructor<T> constructor = serviceClass.getConstructor(Context.class);
service = constructor.newInstance(mContext);
} catch (InstantiationException ex) {
throw new RuntimeException("Failed to create service " + name
+ ": service could not be instantiated", ex);
} catch (IllegalAccessException ex) {
throw new RuntimeException("Failed to create service " + name
+ ": service must have a public constructor with a Context argument", ex);
} catch (NoSuchMethodException ex) {
throw new RuntimeException("Failed to create service " + name
+ ": service must have a public constructor with a Context argument", ex);
} catch (InvocationTargetException ex) {
throw new RuntimeException("Failed to create service " + name
+ ": service constructor threw an exception", ex);
}
// Continue in the startService(SystemService) overload
startService(service);
return service;
} finally {
Trace.traceEnd(Trace.TRACE_TAG_SYSTEM_SERVER);
}
}
```
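The parameter is a Class, so we can guess that the instance is created through reflection. The method first gets the class name and then checks that the class is a subclass of SystemService, which tells us that the services started this way all extend SystemService. It then builds the service instance through reflection, using the constructor that takes a Context, and finally passes that instance to the other startService overload: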
```java
public void startService(@NonNull final SystemService service) {
// Register it.
mServices.add(service); // Add it to the list of started services
// Start it.
long time = System.currentTimeMillis();
try {
service.onStart(); // Call the service's onStart method
} catch (RuntimeException ex) {
throw new RuntimeException("Failed to start service " + service.getClass().getName()
+ ": onStart threw an exception", ex);
}
warnIfTooLong(System.currentTimeMillis() - time, service, "onStart");
}
```
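This method is simple: it adds the service to mServices, the list of SystemService instances, and then calls the service's onStart, at which point the service is started. Why add it to this list? Before answering, let's see what onStart does, using the familiar AMS as the example. The class passed to startService was ActivityManagerService.Lifecycle, a SystemService wrapper whose onStart forwards to the start method of AMS: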
```java
private void start() {
removeAllProcessGroups();
mProcessCpuThread.start();
mBatteryStatsService.publish(mContext);
mAppOpsService.publish(mContext);
Slog.d("AppOps", "AppOpsService published");
LocalServices.addService(ActivityManagerInternal.class, new LocalService());
// Wait for the synchronized block started in mProcessCpuThread,
// so that any other acccess to mProcessCpuTracker from main thread
// will be blocked during mProcessCpuTracker initialization.
try {
mProcessCpuInitLatch.await();
} catch (InterruptedException e) {
Slog.wtf(TAG, "Interrupted wait during start", e);
Thread.currentThread().interrupt();
throw new IllegalStateException("Interrupted wait during start");
}
}
```
This is the start method of AMS. Leaving the rest aside, look directly at mAppOpsService.publish(mContext):
```java
public void publish(Context context) {
mContext = context;
ServiceManager.addService(Context.APP_OPS_SERVICE, asBinder());
}
```
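This method adds the service to ServiceManager. Readers of the earlier binder articles know that once a service is registered with ServiceManager, other processes can use it. So the main job of start is to publish services to the outside, which is also what the name publish means. Going back to startService, the service is also added to the mServices list, and the services in that list will be started further later on. Wait, weren't they already started? More precisely, they continue to initialize. There are a great many system services and they depend on one another, so they have to be started in dependency order, some earlier and some later, so that a service can find the services it depends on. Once a system service has been created, its onStart method is called, mainly to publish itself so that ServiceManager knows about it and can offer it to other processes. It is also added to mServices. If the service has not finished initializing, it waits for a later notification; when a suitable point is reached, the system walks the list and lets each service continue its initialization.
What is this suitable point? Back in startBootstrapServices, most of the remaining code calls startService to start a service, but there is also the call mSystemServiceManager.startBootPhase(SystemService.PHASE_WAIT_FOR_DEFAULT_DISPLAY). This is a method of SystemServiceManager, so let's look at it: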
```java
// Call onBootPhase on every service that has already been started
public void startBootPhase(final int phase) {
if (phase <= mCurrentPhase) {
throw new IllegalArgumentException("Next phase must be larger than previous");
}
mCurrentPhase = phase;
Slog.i(TAG, "Starting phase " + mCurrentPhase);
try {
Trace.traceBegin(Trace.TRACE_TAG_SYSTEM_SERVER, "OnBootPhase " + phase);
final int serviceLen = mServices.size();
// Iterate over the services that were started before this phase
for (int i = 0; i < serviceLen; i++) {
final SystemService service = mServices.get(i);
long time = System.currentTimeMillis();
Trace.traceBegin(Trace.TRACE_TAG_SYSTEM_SERVER, service.getClass().getName());
try {
// Call the service's onBootPhase method
service.onBootPhase(mCurrentPhase);
} catch (Exception ex) {
throw new RuntimeException("Failed to boot service "
+ service.getClass().getName()
+ ": onBootPhase threw an exception during phase "
+ mCurrentPhase, ex);
}
warnIfTooLong(System.currentTimeMillis() - time, service, "onBootPhase");
Trace.traceEnd(Trace.TRACE_TAG_SYSTEM_SERVER);
}
} finally {
Trace.traceEnd(Trace.TRACE_TAG_SYSTEM_SERVER);
}
}
```
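This method is not complicated: it takes each element of the mServices list described above and calls its onBootPhase. This call is the suitable point we mentioned. Once a given set of services has been started, the system can be considered to have reached a phase, meaning that the parts of the already started services that could not be initialized before can now be initialized. So the saved services are taken from mServices, and calling onBootPhase amounts to letting each one continue its initialization. Note the parameter: the initialization of a service may itself be split into several phases, so the parameter says which phase has been reached, and the service does whatever it can do in that phase. Phases must be strictly increasing; passing a phase that is not larger than the current one throws IllegalArgumentException. Here are all the phases defined in SystemService: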
```java
/*
* Boot Phases
*/
public static final int PHASE_WAIT_FOR_DEFAULT_DISPLAY = 100; // maybe should be a dependency?
/**
* After receiving this boot phase, services can obtain lock settings data.
*/
public static final int PHASE_LOCK_SETTINGS_READY = 480;
/**
* After receiving this boot phase, services can safely call into core system services
* such as the PowerManager or PackageManager.
*/
public static final int PHASE_SYSTEM_SERVICES_READY = 500;
/**
* After receiving this boot phase, services can broadcast Intents.
*/
public static final int PHASE_ACTIVITY_MANAGER_READY = 550;
/**
* After receiving this boot phase, services can start/bind to third party apps.
* Apps will be able to make Binder calls into services at this point.
*/
public static final int PHASE_THIRD_PARTY_APPS_CAN_START = 600;
/**
* After receiving this boot phase, services can allow user interaction with the device.
* This phase occurs when boot has completed and the home application has started.
* System services may prefer to listen to this phase rather than registering a
* broadcast receiver for ACTION_BOOT_COMPLETED to reduce overall latency.
*/
public static final int PHASE_BOOT_COMPLETED = 1000;
```
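The interplay between startService and startBootPhase can be reproduced with a stripped-down model. The sketch below is not the framework code: Service, Manager and Recording are simplified, hypothetical stand-ins for SystemService, SystemServiceManager and a concrete service, and the initial phase value of -1 is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class BootPhaseSketch {
    // Hypothetical, simplified stand-in for SystemService.
    static abstract class Service {
        void onStart() {}
        void onBootPhase(int phase) {}
    }

    // Hypothetical, simplified stand-in for SystemServiceManager.
    static final class Manager {
        private final List<Service> services = new ArrayList<>();
        private int currentPhase = -1;

        void startService(Service service) {
            services.add(service); // remember it for later phase callbacks
            service.onStart();
        }

        void startBootPhase(int phase) {
            if (phase <= currentPhase) {
                throw new IllegalArgumentException("Next phase must be larger than previous");
            }
            currentPhase = phase;
            for (Service service : services) {
                service.onBootPhase(phase);
            }
        }
    }

    // A service that records the order of its callbacks.
    static final class Recording extends Service {
        final List<String> log = new ArrayList<>();

        @Override
        void onStart() {
            log.add("start");
        }

        @Override
        void onBootPhase(int phase) {
            log.add("phase " + phase);
        }
    }

    // Returns the callback log of a service that was started before phase 100.
    static List<String> demo() {
        Manager manager = new Manager();
        Recording early = new Recording();
        manager.startService(early);
        manager.startBootPhase(100);
        manager.startService(new Recording()); // started after phase 100, so it never sees 100
        manager.startBootPhase(500);
        return early.log;
    }
}
```

A service started early receives every later phase, while a service started after a phase has passed only receives the phases that follow, which is why the order of the startService calls matters.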
The phase constants run from 100 to 1000. We do not need to know exactly what each one means; the names give a rough idea. The first, 100, is PHASE_WAIT_FOR_DEFAULT_DISPLAY, so it is related to the display, and 550 is PHASE_ACTIVITY_MANAGER_READY, the phase in which AMS is ready. Back in startBootstrapServices, the phase set there is PHASE_WAIT_FOR_DEFAULT_DISPLAY, and a DisplayManagerService was started just before it, so let's look at that service's onBootPhase method:
```java
@Override
public void onBootPhase(int phase) {
if (phase == PHASE_WAIT_FOR_DEFAULT_DISPLAY) {
synchronized (mSyncRoot) {
long timeout = SystemClock.uptimeMillis() + WAIT_FOR_DEFAULT_DISPLAY_TIMEOUT;
while (mLogicalDisplays.get(Display.DEFAULT_DISPLAY) == null) {
long delay = timeout - SystemClock.uptimeMillis();
if (delay <= 0) {
throw new RuntimeException("Timeout waiting for default display "
+ "to be initialized.");
}
if (DEBUG) {
Slog.d(TAG, "waitForDefaultDisplay: waiting, timeout=" + delay);
}
try {
mSyncRoot.wait(delay);
} catch (InterruptedException ex) {
}
}
}
}
}
```
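Sure enough, there is code that handles PHASE_WAIT_FOR_DEFAULT_DISPLAY: it waits, up to a timeout, until the default display exists. We will not go into the details because our focus is the startup flow. Back in startBootstrapServices, the code next creates PackageManagerService by calling PackageManagerService.main: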
```java
public static PackageManagerService main(Context context, Installer installer,
boolean factoryTest, boolean onlyCore) {
// Self-check for initial settings.
PackageManagerServiceCompilerMapping.checkProperties();
PackageManagerService m = new PackageManagerService(context, installer,
factoryTest, onlyCore);
m.enableSystemUserPackages();
ServiceManager.addService("package", m);
return m;
}
```
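After creating a PackageManagerService with new, this method does not call startService as before; it registers the service directly with ServiceManager's addService. This suggests that PackageManagerService does not need to wait for other services through the boot phases: once it is registered with ServiceManager it is ready to use. We can conclude that system_server starts services in two ways depending on the service, through startService or through addService. Let's continue.
Further down we see the call mActivityManagerService.setSystemProcess(), which records the current system_server process in AMS. From the AMS analysis we know that every process that interacts with AMS has its information stored in AMS so that AMS can find the process. system_server is no different: it is also a process that interacts with AMS, so setSystemProcess stores its information in AMS. Let's look at the method: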
```java
public void setSystemProcess() {
try {
ServiceManager.addService(Context.ACTIVITY_SERVICE, this, true);
ServiceManager.addService(ProcessStats.SERVICE_NAME, mProcessStats);
ServiceManager.addService("meminfo", new MemBinder(this));
ServiceManager.addService("gfxinfo", new GraphicsBinder(this));
ServiceManager.addService("dbinfo", new DbBinder(this));
if (MONITOR_CPU_USAGE) {
ServiceManager.addService("cpuinfo", new CpuBinder(this));
}
ServiceManager.addService("permission", new PermissionController(this));
ServiceManager.addService("processinfo", new ProcessInfoService(this));
// Get the ApplicationInfo of the system package "android"
ApplicationInfo info = mContext.getPackageManager().getApplicationInfo(
"android", STOCK_PM_FLAGS | MATCH_SYSTEM_ONLY);
// Store system_server's ApplicationInfo through its ActivityThread (mSystemThread)
mSystemThread.installSystemApplicationInfo(info, getClass().getClassLoader());
synchronized (this) {
// Create the ProcessRecord of system_server
ProcessRecord app = newProcessRecordLocked(info, info.processName, false, 0);
app.persistent = true;
app.pid = MY_PID;
app.maxAdj = ProcessList.SYSTEM_ADJ;
// Give the ProcessRecord system_server's IApplicationThread binder
app.makeActive(mSystemThread.getApplicationThread(), mProcessStats);
synchronized (mPidsSelfLocked) {
mPidsSelfLocked.put(app.pid, app);
}
updateLruProcessLocked(app, false, null);
updateOomAdjLocked();
}
} catch (PackageManager.NameNotFoundException e) {
throw new RuntimeException(
"Unable to find android system package", e);
}
}
```
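This method also begins by registering several services with ServiceManager through addService, which suggests that these services do not depend on other services either. It then uses the PackageManager that was just started to get system_server's ApplicationInfo. We met ApplicationInfo in the AMS analysis: for an ordinary application it mainly holds data from the app's manifest. system_server is a system process rather than an ordinary installed app, but recall that the run method of SystemServer called createSystemContext(); the package name "android" was set there while the ActivityThread and its system context were being created. PackageManager now looks up the ApplicationInfo by that package name, and it is then stored on the AMS side. AMS has a dedicated field, mSystemThread, that holds system_server's ActivityThread (it is a special system class, after all), and its installSystemApplicationInfo method is called to store the ApplicationInfo: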
```java
public void installSystemApplicationInfo(ApplicationInfo info, ClassLoader classLoader) {
synchronized (this) {
getSystemContext().installSystemApplicationInfo(info, classLoader);
getSystemUiContext().installSystemApplicationInfo(info, classLoader);
// give ourselves a default profiler
mProfiler = new Profiler();
}
}
........
void installSystemApplicationInfo(ApplicationInfo info, ClassLoader classLoader) {
assert info.packageName.equals("android");
mApplicationInfo = info;
mClassLoader = classLoader;
}
```
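The information ends up in fields of LoadedApk, so AMS can follow the chain to find everything related to system_server. Back in setSystemProcess, a ProcessRecord is created next. We have seen this class in the AMS articles: it holds all the data about one process. Its most important field is the IApplicationThread, a binder used for inter-process communication. makeActive is called to set it: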
```java
public void makeActive(IApplicationThread _thread, ProcessStatsService tracker) {
if (thread == null) {
final ProcessState origBase = baseProcessTracker;
if (origBase != null) {
origBase.setState(ProcessStats.STATE_NOTHING,
tracker.getMemFactorLocked(), SystemClock.uptimeMillis(), pkgList);
origBase.makeInactive();
}
baseProcessTracker = tracker.getProcessStateLocked(info.packageName, uid,
info.versionCode, processName);
baseProcessTracker.makeActive();
for (int i=0; i<pkgList.size(); i++) {
ProcessStats.ProcessStateHolder holder = pkgList.valueAt(i);
if (holder.state != null && holder.state != origBase) {
holder.state.makeInactive();
}
holder.state = tracker.getProcessStateLocked(pkgList.keyAt(i), uid,
info.versionCode, processName);
if (holder.state != baseProcessTracker) {
holder.state.makeActive();
}
}
}
thread = _thread;
}
```
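We analyzed this method in the AMS articles, so just look at the last statement, which sets the IApplicationThread. The ProcessRecord now holds system_server's binder. Finally the ProcessRecord is put into AMS's mPidsSelfLocked, a map with the process pid as key and the ProcessRecord as value, so AMS now has the process information of system_server.
That is about all for startBootstrapServices. It mainly starts the basic services. Services that depend on others and cannot fully initialize yet are kept in the mServices list; once the required services have been started, startBootPhase calls back into each of them so that they can continue with the next stage of initialization. startBootstrapServices also records the system_server process in AMS so that it can communicate with AMS later. Back in SystemServer's run method, the next service-starting method is startCoreServices: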
```java
private void startCoreServices() {
// Records errors and logs, for example wtf()
traceBeginAndSlog("StartDropBoxManager");
// Start the DropBox log manager service
mSystemServiceManager.startService(DropBoxManagerService.class);
traceEnd();
traceBeginAndSlog("StartBatteryService");
// Tracks the battery level. Requires LightService.
// Start the battery service, which tracks the battery level
mSystemServiceManager.startService(BatteryService.class);
traceEnd();
// Tracks application usage stats.
traceBeginAndSlog("StartUsageService");
// Start the app usage statistics service
mSystemServiceManager.startService(UsageStatsService.class);
// Hand the usage stats service to AMS
mActivityManagerService.setUsageStatsManager(
LocalServices.getService(UsageStatsManagerInternal.class));
traceEnd();
// Tracks whether the updatable WebView is in a ready state and watches for update installs.
traceBeginAndSlog("StartWebViewUpdateService");
// Start the WebView update service
mWebViewUpdateService = mSystemServiceManager.startService(WebViewUpdateService.class);
traceEnd();
}
```
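This method also starts a few services, which, judging by its name, can be regarded as core services. It does not call startBootPhase, so the boot phase stays where it was at the end of the previous method. We will not look into the individual services; the flow is simple and just starts a few more. The next method, startOtherServices, is extremely long, but the vast majority of its code still starts services. Because there is so much code, only the parts we are going to analyze are shown below; the rest can be read in the source: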
```java
private void startOtherServices() {
........
mSystemServiceManager.startBootPhase(SystemService.PHASE_LOCK_SETTINGS_READY);
traceEnd();
traceBeginAndSlog("StartBootPhaseSystemServicesReady");
mSystemServiceManager.startBootPhase(SystemService.PHASE_SYSTEM_SERVICES_READY);
traceEnd();
...........
// Run systemReady in AMS, which will start the home screen (launcher)
mActivityManagerService.systemReady(() -> {
Slog.i(TAG, "Making services ready");
traceBeginAndSlog("StartActivityManagerReadyPhase");
mSystemServiceManager.startBootPhase(
SystemService.PHASE_ACTIVITY_MANAGER_READY);
traceEnd();
traceBeginAndSlog("StartObservingNativeCrashes");
try {
mActivityManagerService.startObservingNativeCrashes();
} catch (Throwable e) {
reportWtf("observing native crashes", e);
}
traceEnd();
..................
mSystemServiceManager.startBootPhase(
SystemService.PHASE_THIRD_PARTY_APPS_CAN_START);
....................
}, BOOT_TIMINGS_TRACE_LOG);
}
```
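This method calls startBootPhase several times. As analyzed above, it is called each time a stage of startup has been completed, and it calls back the services started so far so that they can continue initializing. The final phase is completed in AMS.
In the method above, AMS's systemReady is called, and the callback passed to it reaches the PHASE_THIRD_PARTY_APPS_CAN_START phase through startBootPhase. This is the second-to-last phase. The last one is triggered inside AMS after the home screen has started, which we will come to later. By the time execution reaches this point, essentially all system services have been started, and once the launcher process is up, the whole Android boot flow is complete. Besides starting services and calling startBootPhase to initialize them phase by phase, the full source of this method also calls the systemReady method of the individual services at the end. After that call a service is fully started and functional, and only the final preparation in AMS for their interaction remains. So let's see what AMS's systemReady does: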
```java
public void systemReady(final Runnable goingCallback, BootTimingsTraceLog traceLog) {
traceLog.traceBegin("PhaseActivityManagerReady");
synchronized(this) {
if (mSystemReady) {
// If we're done calling all the receivers, run the next "boot phase" passed in
// by the SystemServer
if (goingCallback != null) {
goingCallback.run();
}
return;
}
..............
mSystemReady = true;
}
ArrayList<ProcessRecord> procsToKill = null;
synchronized(mPidsSelfLocked) {
// Iterate over all running processes
for (int i=mPidsSelfLocked.size()-1; i>=0; i--) {
ProcessRecord proc = mPidsSelfLocked.valueAt(i);
// A process that is not persistent is added to procsToKill
if (!isAllowedWhileBooting(proc.info)){
if (procsToKill == null) {
procsToKill = new ArrayList<ProcessRecord>();
}
procsToKill.add(proc);
}
}
}
synchronized(this) {
if (procsToKill != null) {
// Iterate over the non-persistent processes selected above and kill them
for (int i=procsToKill.size()-1; i>=0; i--) {
ProcessRecord proc = procsToKill.get(i);
Slog.i(TAG, "Removing system update proc: " + proc);
removeProcessLocked(proc, true, false, "system update done");
}
}
// Now that we have cleaned up any update processes, we
// are ready to start launching real processes and know that
// we won't trample on them any more.
mProcessesReady = true;
}
```
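We read this method in two parts, and this is the first. If mSystemReady is already true, the method has run before, so it simply invokes the Runnable passed as the first argument, which was defined in startOtherServices and contains the startBootPhase calls, and returns. Otherwise mSystemReady is set to true further down.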
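The "do the setup once, but always run the callback" shape of systemReady can be isolated in a few lines. This is only a sketch of the pattern under simplified assumptions: the class ReadyOnceSketch and its setupRuns counter are invented for illustration, and the one-time work of the real method is reduced to incrementing that counter.

```java
final class ReadyOnceSketch {
    private boolean ready;
    private int setupRuns;

    void systemReady(Runnable callback) {
        synchronized (this) {
            if (ready) {
                // Already set up: only run the caller's callback
                if (callback != null) {
                    callback.run();
                }
                return;
            }
            ready = true;
        }
        setupRuns++; // stands in for the one-time setup work
        if (callback != null) {
            callback.run();
        }
    }

    int setupRuns() {
        return setupRuns;
    }
}
```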
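Next, systemReady checks whether the currently running processes are persistent. Normally at this point they are all system processes and should be persistent. Any that are not are first collected in procsToKill and then removed by calling removeProcessLocked. Let's look at that method: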
```java
boolean removeProcessLocked(ProcessRecord app,
boolean callerWillRestart, boolean allowRestart, String reason) {
final String name = app.processName;
final int uid = app.uid;
if (DEBUG_PROCESSES) Slog.d(TAG_PROCESSES,
"Force removing proc " + app.toShortString() + " (" + name + "/" + uid + ")");
// Look up the ProcessRecord registered under this process name and uid
ProcessRecord old = mProcessNames.get(name, uid);
if (old != app) {
// This process is no longer active, so nothing to do.
Slog.w(TAG, "Ignoring remove of inactive process: " + app);
return false;
}
// Remove the process from the ProcessRecord collection
removeProcessNameLocked(name, uid);
// If it is the heavy-weight process, also send a Handler message to cancel its notification
if (mHeavyWeightProcess == app) {
mHandler.sendMessage(mHandler.obtainMessage(CANCEL_HEAVY_NOTIFICATION_MSG,
mHeavyWeightProcess.userId, 0));
mHeavyWeightProcess = null;
}
boolean needRestart = false;
if (app.pid > 0 && app.pid != MY_PID) {
int pid = app.pid;
synchronized (mPidsSelfLocked) {
mPidsSelfLocked.remove(pid);
mHandler.removeMessages(PROC_START_TIMEOUT_MSG, app);
}
mBatteryStatsService.noteProcessFinish(app.processName, app.info.uid);
if (app.isolated) {
mBatteryStatsService.removeIsolatedUid(app.uid, app.info.uid);
getPackageManagerInternalLocked().removeIsolatedUid(app.uid);
}
boolean willRestart = false;
if (app.persistent && !app.isolated) {
if (!callerWillRestart) {
willRestart = true;
} else {
needRestart = true;
}
}
// Kill the process
app.kill(reason, true);
// Clean up after the process, for example its service death notifications
handleAppDiedLocked(app, willRestart, allowRestart);
// If it should be restarted, start it again
if (willRestart) {
removeLruProcessLocked(app);
addAppLocked(app.info, null, false, null /* ABI override */);
}
} else {
mRemovedProcesses.add(app);
}
return needRestart;
}
```
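This method is also fairly simple. It first looks up the ProcessRecord by process name and uid; if it is the same object as the one to be removed, it is removed from the AMS map of running processes. If the process has a valid pid and is not the current process (MY_PID), it is killed and the follow-up work is done, such as cleaning up the service death notifications related to it. A persistent process is restarted here unless the caller has said it will restart it itself, in which case the method returns true. Otherwise the record is added to the list of processes waiting to be removed and handled later. Back in systemReady, here is the second half: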
```java
// Initialize some data from settings
retrieveSettings();
final int currentUserId;
synchronized (this) {
currentUserId = mUserController.getCurrentUserIdLocked();
// Read the granted URI permissions into mGrantedUriPermissions
readGrantedUriPermissionsLocked();
}
// Run the callback that was passed in from startOtherServices in system_server
if (goingCallback != null) goingCallback.run();
traceLog.traceBegin("ActivityManagerStartApps");
// Record the start of the user's running time
mBatteryStatsService.noteEvent(BatteryStats.HistoryItem.EVENT_USER_RUNNING_START,
Integer.toString(currentUserId), currentUserId);
// Record the start of the user's foreground time
mBatteryStatsService.noteEvent(BatteryStats.HistoryItem.EVENT_USER_FOREGROUND_START,
Integer.toString(currentUserId), currentUserId);
mSystemServiceManager.startUser(currentUserId);
synchronized (this) {
// Only start up encryption-aware persistent apps; once user is
// unlocked we'll come back around and start unaware apps
// Start the persistent apps
startPersistentApps(PackageManager.MATCH_DIRECT_BOOT_AWARE);
// Start up initial activity.
mBooting = true;
// Enable home activity for system user, so that the system can always boot. We don't
// do this when the system user is not setup since the setup wizard should be the one
// to handle home activity in this case.
if (UserManager.isSplitSystemUser() &&
Settings.Secure.getInt(mContext.getContentResolver(),
Settings.Secure.USER_SETUP_COMPLETE, 0) != 0) {
ComponentName cName = new ComponentName(mContext, SystemUserHomeActivity.class);
try {
// Enable the SystemUserHomeActivity component
AppGlobals.getPackageManager().setComponentEnabledSetting(cName,
PackageManager.COMPONENT_ENABLED_STATE_ENABLED, 0,
UserHandle.USER_SYSTEM);
} catch (RemoteException e) {
throw e.rethrowAsRuntimeException();
}
}
// Start the home screen
startHomeActivityLocked(currentUserId, "systemReady");
try {
if (AppGlobals.getPackageManager().hasSystemUidErrors()) {
Slog.e(TAG, "UIDs on the system are inconsistent, you need to wipe your"
+ " data partition or your device will be unstable.");
mUiHandler.obtainMessage(SHOW_UID_ERROR_UI_MSG).sendToTarget();
}
} catch (RemoteException e) {
}
if (!Build.isBuildConsistent()) {
Slog.e(TAG, "Build fingerprint is not consistent, warning user");
mUiHandler.obtainMessage(SHOW_FINGERPRINT_ERROR_UI_MSG).sendToTarget();
}
long ident = Binder.clearCallingIdentity();
try {
// Send the ACTION_USER_STARTED broadcast
Intent intent = new Intent(Intent.ACTION_USER_STARTED);
intent.addFlags(Intent.FLAG_RECEIVER_REGISTERED_ONLY
| Intent.FLAG_RECEIVER_FOREGROUND);
intent.putExtra(Intent.EXTRA_USER_HANDLE, currentUserId);
broadcastIntentLocked(null, null, intent,
null, null, 0, null, null, null, AppOpsManager.OP_NONE,
null, false, false, MY_PID, SYSTEM_UID,
currentUserId);
// Send the ACTION_USER_STARTING broadcast
intent = new Intent(Intent.ACTION_USER_STARTING);
intent.addFlags(Intent.FLAG_RECEIVER_REGISTERED_ONLY);
intent.putExtra(Intent.EXTRA_USER_HANDLE, currentUserId);
broadcastIntentLocked(null, null, intent,
null, new IIntentReceiver.Stub() {
@Override
public void performReceive(Intent intent, int resultCode, String data,
Bundle extras, boolean ordered, boolean sticky, int sendingUser)
throws RemoteException {
}
}, 0, null, null,
new String[] {INTERACT_ACROSS_USERS}, AppOpsManager.OP_NONE,
null, true, false, MY_PID, SYSTEM_UID, UserHandle.USER_ALL);
} catch (Throwable t) {
Slog.wtf(TAG, "Failed sending first user broadcasts", t);
} finally {
Binder.restoreCallingIdentity(ident);
}
// Resume the top activity of the focused stack
mStackSupervisor.resumeFocusedStackTopActivityLocked();
mUserController.sendUserSwitchBroadcastsLocked(-1, currentUserId);
traceLog.traceEnd(); // ActivityManagerStartApps
traceLog.traceEnd(); // PhaseActivityManagerReady
}
```
The second half of the code first calls retrieveSettings to read some configuration from the settings. Let's take a brief look at this method:
```java
private void retrieveSettings() {
final ContentResolver resolver = mContext.getContentResolver();
final boolean freeformWindowManagement =
mContext.getPackageManager().hasSystemFeature(FEATURE_FREEFORM_WINDOW_MANAGEMENT)
|| Settings.Global.getInt(
resolver, DEVELOPMENT_ENABLE_FREEFORM_WINDOWS_SUPPORT, 0) != 0;
final boolean supportsPictureInPicture =
mContext.getPackageManager().hasSystemFeature(FEATURE_PICTURE_IN_PICTURE);
final boolean supportsMultiWindow = ActivityManager.supportsMultiWindow(mContext);
final boolean supportsSplitScreenMultiWindow =
ActivityManager.supportsSplitScreenMultiWindow(mContext);
final boolean supportsMultiDisplay = mContext.getPackageManager()
.hasSystemFeature(FEATURE_ACTIVITIES_ON_SECONDARY_DISPLAYS);
final String debugApp = Settings.Global.getString(resolver, DEBUG_APP);
final boolean waitForDebugger = Settings.Global.getInt(resolver, WAIT_FOR_DEBUGGER, 0) != 0;
final boolean alwaysFinishActivities =
Settings.Global.getInt(resolver, ALWAYS_FINISH_ACTIVITIES, 0) != 0;
final boolean forceRtl = Settings.Global.getInt(resolver, DEVELOPMENT_FORCE_RTL, 0) != 0;
final boolean forceResizable = Settings.Global.getInt(
resolver, DEVELOPMENT_FORCE_RESIZABLE_ACTIVITIES, 0) != 0;
final long waitForNetworkTimeoutMs = Settings.Global.getLong(resolver,
NETWORK_ACCESS_TIMEOUT_MS, NETWORK_ACCESS_TIMEOUT_DEFAULT_MS);
final boolean supportsLeanbackOnly =
mContext.getPackageManager().hasSystemFeature(FEATURE_LEANBACK_ONLY);
// Transfer any global setting for forcing RTL layout, into a System Property
SystemProperties.set(DEVELOPMENT_FORCE_RTL, forceRtl ? "1":"0");
final Configuration configuration = new Configuration();
Settings.System.getConfiguration(resolver, configuration);
if (forceRtl) {
// This will take care of setting the correct layout direction flags
configuration.setLayoutDirection(configuration.locale);
}
synchronized (this) {
mDebugApp = mOrigDebugApp = debugApp;
mWaitForDebugger = mOrigWaitForDebugger = waitForDebugger;
mAlwaysFinishActivities = alwaysFinishActivities;
mSupportsLeanbackOnly = supportsLeanbackOnly;
mForceResizableActivities = forceResizable;
final boolean multiWindowFormEnabled = freeformWindowManagement
|| supportsSplitScreenMultiWindow
|| supportsPictureInPicture
|| supportsMultiDisplay;
if ((supportsMultiWindow || forceResizable) && multiWindowFormEnabled) {
mSupportsMultiWindow = true;
mSupportsFreeformWindowManagement = freeformWindowManagement;
mSupportsSplitScreenMultiWindow = supportsSplitScreenMultiWindow;
mSupportsPictureInPicture = supportsPictureInPicture;
mSupportsMultiDisplay = supportsMultiDisplay;
} else {
mSupportsMultiWindow = false;
mSupportsFreeformWindowManagement = false;
mSupportsSplitScreenMultiWindow = false;
mSupportsPictureInPicture = false;
mSupportsMultiDisplay = false;
}
mWindowManager.setForceResizableTasks(mForceResizableActivities);
mWindowManager.setSupportsPictureInPicture(mSupportsPictureInPicture);
// This happens before any activities are started, so we can change global configuration
// in-place.
updateConfigurationLocked(configuration, null, true);
final Configuration globalConfig = getGlobalConfiguration();
if (DEBUG_CONFIGURATION) Slog.v(TAG_CONFIGURATION, "Initial config: " + globalConfig);
// Load resources only after the current configuration has been set.
final Resources res = mContext.getResources();
mHasRecents = res.getBoolean(com.android.internal.R.bool.config_hasRecents);
mThumbnailWidth = res.getDimensionPixelSize(
com.android.internal.R.dimen.thumbnail_width);
mThumbnailHeight = res.getDimensionPixelSize(
com.android.internal.R.dimen.thumbnail_height);
mAppErrors.loadAppsNotReportingCrashesFromConfigLocked(res.getString(
com.android.internal.R.string.config_appsNotReportingCrashes));
mUserController.mUserSwitchUiEnabled = !res.getBoolean(
com.android.internal.R.bool.config_customUserSwitchUi);
if ((globalConfig.uiMode & UI_MODE_TYPE_TELEVISION) == UI_MODE_TYPE_TELEVISION) {
mFullscreenThumbnailScale = (float) res
.getInteger(com.android.internal.R.integer.thumbnail_width_tv) /
(float) globalConfig.screenWidthDp;
} else {
mFullscreenThumbnailScale = res.getFraction(
com.android.internal.R.fraction.thumbnail_fullscreen_scale, 1, 1);
}
mWaitForNetworkTimeoutMs = waitForNetworkTimeoutMs;
}
}
```
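The multi-window part of retrieveSettings is the only piece with real branching, so it may help to see it on its own. The sketch below is a simplified model under stated assumptions: `MultiWindowFlagsSketch`, `Flags`, and `settingEnabled` are hypothetical names for illustration, and plain booleans stand in for the values that the real method reads from PackageManager and Settings.Global.

```java
// Simplified model of the multi-window flag derivation; not the real AMS code.
public class MultiWindowFlagsSketch {

    /** The five flags that the method above stores in AMS fields. */
    public static final class Flags {
        public final boolean multiWindow;
        public final boolean freeform;
        public final boolean splitScreen;
        public final boolean pictureInPicture;
        public final boolean multiDisplay;

        Flags(boolean multiWindow, boolean freeform, boolean splitScreen,
              boolean pictureInPicture, boolean multiDisplay) {
            this.multiWindow = multiWindow;
            this.freeform = freeform;
            this.splitScreen = splitScreen;
            this.pictureInPicture = pictureInPicture;
            this.multiDisplay = multiDisplay;
        }
    }

    /** Integer settings are treated as booleans: any non-zero value means "on". */
    public static boolean settingEnabled(int rawValue) {
        return rawValue != 0;
    }

    public static Flags resolve(boolean supportsMultiWindow, boolean forceResizable,
                                boolean freeform, boolean splitScreen,
                                boolean pictureInPicture, boolean multiDisplay) {
        // At least one concrete multi-window form has to be available.
        boolean anyFormEnabled = freeform || splitScreen || pictureInPicture || multiDisplay;
        if ((supportsMultiWindow || forceResizable) && anyFormEnabled) {
            return new Flags(true, freeform, splitScreen, pictureInPicture, multiDisplay);
        }
        // Otherwise every flag is switched off, even ones that were individually true.
        return new Flags(false, false, false, false, false);
    }

    public static void main(String[] args) {
        Flags f = resolve(false, settingEnabled(1), false, false, true, false);
        System.out.println("multiWindow=" + f.multiWindow
                + ", pictureInPicture=" + f.pictureInPicture);
    }
}
```

Note the all-or-nothing shape: a device that reports picture-in-picture but neither supports multi-window nor forces resizable activities ends up with every flag set to false.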
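We won't go through retrieveSettings line by line. As you can see, it reads various configuration values from the settings and from resource files and stores them, so it is another initialization step. Let's go back to systemReady and continue. Next it reads the granted URI permissions, records the user-running events in battery stats, and so on, and finally gets ready to start the home screen. On devices with a split system user that has completed setup, the code first enables SystemUserHomeActivity so that the system user always has a home activity to boot into. It then calls startHomeActivityLocked to start the home screen process. We'll cover how the home screen process is started in the next article; for now it is enough to know that it is started at the end of systemServer startup, and that this is how Android reaches the home screen.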
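After starting the home screen, systemServer also sends the ACTION_USER_STARTED and ACTION_USER_STARTING broadcasts, and finally calls resumeFocusedStackTopActivityLocked on mStackSupervisor, which brings the home screen task to the front. We already analyzed this method when we covered AMS, so we won't go into detail here.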
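To keep the order of the second half of systemReady in mind, here is a small sketch that only records the sequence of steps. It is a toy model: `SystemReadyOrderSketch` and its `record` method are hypothetical, the step names are just labels taken from the code above, and the battery-stats and startUser calls are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the call order in the second half of systemReady; not the real AMS code.
public class SystemReadyOrderSketch {
    private final List<String> trace = new ArrayList<>();

    /** Lets a callback add its own entry to the trace. */
    public void record(String step) {
        trace.add(step);
    }

    /** Records the steps in the same order as the code shown earlier. */
    public List<String> systemReady(Runnable goingCallback) {
        trace.add("retrieveSettings");
        trace.add("readGrantedUriPermissions");
        // The callback handed in by SystemServer runs before any app is started.
        if (goingCallback != null) {
            goingCallback.run();
        }
        trace.add("startPersistentApps");
        trace.add("startHomeActivity");
        trace.add("broadcast ACTION_USER_STARTED");
        trace.add("broadcast ACTION_USER_STARTING");
        trace.add("resumeFocusedStackTopActivity");
        return trace;
    }

    public static void main(String[] args) {
        SystemReadyOrderSketch ams = new SystemReadyOrderSketch();
        List<String> steps = ams.systemReady(() -> ams.record("goingCallback"));
        for (String step : steps) {
            System.out.println(step);
        }
    }
}
```

The point to take away is that the callback from SystemServer runs before the persistent apps and the home screen are started, and the user broadcasts only go out after the home screen start has been requested.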
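That completes the systemServer flow. We have used three articles to walk through Android startup, from the init process to the zygote process and on to systemServer. Startup really does a lot of work. If most of your day-to-day work is Android app development, looking at the low-level boot path shows just how much is involved. Our analysis does not cover everything, and given the amount of material it could not. But by following the main flow we get a picture of the key steps, so when we need to dig into a particular part we can quickly find the right place to start. That's it for this article. Next we'll continue with the flow from systemServer to the home screen process, and look at how a new process is created once Android is up and running. To wrap up, here is the sequence diagram for systemServer.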

(Sequence diagram: Android startup, the SystemServer process)