A Brief Analysis of the App Process Crash Mechanism

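Author: 杰杰_88

Link: https://www.jianshu.com/p/ecd52cd90a4b

Notice: this article is published with authorization from 杰杰_88. Please contact the original author for permission before reposting.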

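Conclusion: an App process crash is not a process crash in the true sense (compare it with a crash in native code). Java code throws an exception that nobody handles, and the App then kills itself.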

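At work I ran into a background Service that died (the "has stopped" dialog popped up) and then did not restart for a long time. The log showed that the process was not killed right after it threw FATAL EXCEPTION; it was killed and restarted only much later. Puzzled, I decided to look at what the App crash flow actually is.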

What it looks like

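When an Android App process throws an exception for whatever reason and nothing catches it, the user sees a "XX has stopped" dialog. I used to assume the app process was already dead at that point.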

What actually happens

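I had never looked closely at how an App stops running. In fact, when the "has stopped" dialog pops up, the process has not fully exited yet; the real exit happens when the process kills itself. Below I record the whole path from an App throwing an uncaught exception to the process actually disappearing.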

App process creation

To analyze how an app process goes away, first look at how it comes into being.

Key code

App process creation flow:

[Figure: App process startup flow]

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

startResult = Process.start(entryPoint,
        app.processName, uid, uid, gids, debugFlags, mountExternal,
        app.info.targetSdkVersion, seInfo, requiredAbi, instructionSet,
        app.info.dataDir, invokeWith, entryPointArgs);

frameworks/base/core/java/android/os/ZygoteProcess.java

    // ZygoteState holds the Socket connection to the Zygote process
    private ZygoteState openZygoteSocketIfNeeded(String abi) throws ZygoteStartFailedEx {
        Preconditions.checkState(Thread.holdsLock(mLock), "ZygoteProcess lock not held");

        if (primaryZygoteState == null || primaryZygoteState.isClosed()) {
            try {
                primaryZygoteState = ZygoteState.connect(mSocket);
            } catch (IOException ioe) {
                throw new ZygoteStartFailedEx("Error connecting to primary zygote", ioe);
            }
        }

        if (primaryZygoteState.matches(abi)) {
            return primaryZygoteState;
        }

        // The primary zygote didn't match. Try the secondary.
        if (secondaryZygoteState == null || secondaryZygoteState.isClosed()) {
            try {
                secondaryZygoteState = ZygoteState.connect(mSecondarySocket);
            } catch (IOException ioe) {
                throw new ZygoteStartFailedEx("Error connecting to secondary zygote", ioe);
            }
        }

        if (secondaryZygoteState.matches(abi)) {
            return secondaryZygoteState;
        }

        throw new ZygoteStartFailedEx("Unsupported zygote ABI: " + abi);
    }


    private static Process.ProcessStartResult zygoteSendArgsAndGetResult(
            ZygoteState zygoteState, ArrayList<String> args)
            throws ZygoteStartFailedEx {
        try {
            // Throw early if any of the arguments are malformed. This means we can
            // avoid writing a partial response to the zygote.
            int sz = args.size();
            for (int i = 0; i < sz; i++) {
                if (args.get(i).indexOf('\n') >= 0) {
                    throw new ZygoteStartFailedEx("embedded newlines not allowed");
                }
            }

            /**
             * See com.android.internal.os.SystemZygoteInit.readArgumentList()
             * Presently the wire format to the zygote process is:
             * a) a count of arguments (argc, in essence)
             * b) a number of newline-separated argument strings equal to count
             *
             * After the zygote process reads these it will write the pid of
             * the child or -1 on failure, followed by boolean to
             * indicate whether a wrapper process was used.
             */
            final BufferedWriter writer = zygoteState.writer;
            final DataInputStream inputStream = zygoteState.inputStream;

            writer.write(Integer.toString(args.size()));
            writer.newLine();

            for (int i = 0; i < sz; i++) {
                String arg = args.get(i);
                writer.write(arg);
                writer.newLine();
            }

            writer.flush();

            // Should there be a timeout on this?
            Process.ProcessStartResult result = new Process.ProcessStartResult();

            // Always read the entire result from the input stream to avoid leaving
            // bytes in the stream for future process starts to accidentally stumble
            // upon.
            result.pid = inputStream.readInt();
            result.usingWrapper = inputStream.readBoolean();

            if (result.pid < 0) {
                throw new ZygoteStartFailedEx("fork() failed");
            }
            return result;
        } catch (IOException ex) {
            zygoteState.close();
            throw new ZygoteStartFailedEx(ex);
        }
    }
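
The wire format described in the comment above (an argument count, then one argument per line) can be sketched in plain Java. This is only an illustration under assumptions: the class name ZygoteWireFormatSketch, the method encode and the sample arguments in the note below are hypothetical, and the sketch builds a string instead of writing to a real zygote socket.

```java
final class ZygoteWireFormatSketch {
    /**
     * Encodes the arguments the way the comment describes:
     * a count line first, then each argument on its own line.
     */
    static String encode(java.util.List<String> args) {
        // Validate everything up front so that a malformed argument
        // never results in a partially written message.
        for (String arg : args) {
            if (arg.indexOf('\n') >= 0) {
                throw new IllegalArgumentException("embedded newlines not allowed");
            }
        }

        StringBuilder out = new StringBuilder();
        out.append(args.size()).append('\n');   // a) count of arguments
        for (String arg : args) {
            out.append(arg).append('\n');       // b) newline-separated arguments
        }
        return out.toString();
    }
}
```

For example, encoding the two made-up arguments "--flag=1" and "com.example.Main" yields a "2" line followed by one line per argument. Checking for newlines before writing anything is what keeps the receiver from ever seeing a truncated or misaligned argument list.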

The command that zygoteSendArgsAndGetResult sends over the LocalSocket is received by Zygote:

frameworks/base/core/java/com/android/internal/os/ZygoteConnection.java

pid = Zygote.forkAndSpecialize(parsedArgs.uid, parsedArgs.gid, parsedArgs.gids,
        parsedArgs.debugFlags, rlimits, parsedArgs.mountExternal, parsedArgs.seInfo,
        parsedArgs.niceName, fdsToClose, fdsToIgnore, parsedArgs.instructionSet,
        parsedArgs.appDataDir);

This is where the real app process is forked. The forked child process then runs:

ZygoteInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs,
        null /* classLoader */);

The command it runs eventually enters through the main function of ActivityThread.java, which starts the App's lifecycle.

RuntimeInit.commonInit()

In the flow above, once the App process has been forked it executes this function:

RuntimeInit.commonInit()

Inside it:

    Thread.setUncaughtExceptionPreHandler(new LoggingHandler());
    Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler());

The runtime delivers an uncaught exception through Thread.dispatchUncaughtException:

    /**
     * Dispatch an uncaught exception to the handler. This method is
     * intended to be called only by the runtime and by tests.
     *
     * @hide
     */
    // @VisibleForTesting (would be private if not for tests)
    public final void dispatchUncaughtException(Throwable e) {
        Thread.UncaughtExceptionHandler initialUeh =
                Thread.getUncaughtExceptionPreHandler();
        if (initialUeh != null) {
            try {
                initialUeh.uncaughtException(this, e);
            } catch (RuntimeException | Error ignored) {
                // Throwables thrown by the initial handler are ignored
            }
        }
        getUncaughtExceptionHandler().uncaughtException(this, e);
    }

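setUncaughtExceptionPreHandler installs LoggingHandler as the "uncaught exception pre-handler", and setDefaultUncaughtExceptionHandler installs KillApplicationHandler as the actual "default uncaught exception handler". Going by the names and by dispatchUncaughtException, when an exception goes uncaught LoggingHandler is called first and KillApplicationHandler second. (For a thread without a handler of its own, getUncaughtExceptionHandler() returns the thread group, which delegates to the default handler.) LoggingHandler is what prints FATAL EXCEPTION and the stack trace: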

E AndroidRuntime: FATAL EXCEPTION: main
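
The "log first, then kill" ordering can be reproduced in plain Java. The sketch below is an assumption-laden illustration, not the framework code: setUncaughtExceptionPreHandler is not part of the public Java API, so the pre-handler is simulated by a composite handler, and the names TwoStageHandlerSketch and runCrashingThread are hypothetical. The "kill" stage only records an event instead of killing anything.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class TwoStageHandlerSketch {
    /** Runs a thread that throws and returns the handler events in order. */
    static List<String> runCrashingThread() {
        List<String> events = Collections.synchronizedList(new ArrayList<String>());

        // Stage 1: stands in for LoggingHandler (only records the exception).
        Thread.UncaughtExceptionHandler preHandler =
                (t, e) -> events.add("log: " + e.getMessage());
        // Stage 2: stands in for KillApplicationHandler (no real kill here).
        Thread.UncaughtExceptionHandler killHandler =
                (t, e) -> events.add("kill: " + e.getMessage());

        // Mirrors the shape of dispatchUncaughtException: failures in the
        // pre-handler are ignored, and the second handler always runs.
        Thread.UncaughtExceptionHandler dispatcher = (t, e) -> {
            try {
                preHandler.uncaughtException(t, e);
            } catch (RuntimeException | Error ignored) {
                // Throwables thrown by the first stage are ignored.
            }
            killHandler.uncaughtException(t, e);
        };

        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "worker");
        worker.setUncaughtExceptionHandler(dispatcher);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return events;
    }
}
```

A per-thread handler is used here rather than the process-wide default so that the sketch does not change global state.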

KillApplicationHandler:

    /**
     * Handle application death from an uncaught exception.  The framework
     * catches these for the main threads, so this should only matter for
     * threads created by applications.  Before this method runs,
     * {@link LoggingHandler} will already have logged details.
     */
    private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
        public void uncaughtException(Thread t, Throwable e) {
            try {
                // Don't re-enter -- avoid infinite loops if crash-reporting crashes.
                if (mCrashing) return;
                mCrashing = true;

                // Try to end profiling. If a profiler is running at this point, and we kill the
                // process (below), the in-memory buffer will be lost. So try to stop, which will
                // flush the buffer. (This makes method trace profiling useful to debug crashes.)
                if (ActivityThread.currentActivityThread() != null) {
                    ActivityThread.currentActivityThread().stopProfiling();
                }

                final String processName = ActivityThread.currentProcessName();
                if (processName != null) {
                    if (Build.IS_USERDEBUG && processName.equals(SystemProperties.get("persist.debug.process")))  {
                        Log.w(TAG, "process: " + processName + " crash message is skip");
                        return;
                    }
                }

                // Bring up crash dialog, wait for it to be dismissed
                ActivityManager.getService().handleApplicationCrash(
                        mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
            } catch (Throwable t2) {
                if (t2 instanceof DeadObjectException) {
                    // System process is dead; ignore
                } else {
                    try {
                        Clog_e(TAG, "Error reporting crash", t2);
                    } catch (Throwable t3) {
                        // Even Clog_e() fails!  Oh well.
                    }
                }
            } finally {
                // Try everything to make sure this process goes away.
                Process.killProcess(Process.myPid());
                System.exit(10);
            }
        }
    }

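The following code talks to ActivityManagerService to pop up the "has stopped" dialog. Note the comment: execution continues only after the dialog has been dismissed.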

// Bring up crash dialog, wait for it to be dismissed
ActivityManager.getService().handleApplicationCrash(
        mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));

On the ActivityManagerService side, the call ends up waiting at the following code:

AppErrors.java crashApplicationInner():

        synchronized (mService) {
            /**
             * If crash is handled by instance of {@link android.app.IActivityController},
             * finish now and don't show the app error dialog.
             */
            if (handleAppCrashInActivityController(r, crashInfo, shortMsg, longMsg, stackTrace,
                    timeMillis, callingPid, callingUid)) {
                return;
            }

            /**
             * If this process was running instrumentation, finish now - it will be handled in
             * {@link ActivityManagerService#handleAppDiedLocked}.
             */
            if (r != null && r.instr != null) {
                return;
            }

            // Log crash in battery stats.
            if (r != null) {
                mService.mBatteryStatsService.noteProcessCrash(r.processName, r.uid);
            }

            AppErrorDialog.Data data = new AppErrorDialog.Data();
            data.result = result;
            data.proc = r;

            // If we can't identify the process or it's already exceeded its crash quota,
            // quit right away without showing a crash dialog.
            if (r == null || !makeAppCrashingLocked(r, shortMsg, longMsg, stackTrace, data)) {
                return;
            }

            final Message msg = Message.obtain();
            msg.what = ActivityManagerService.SHOW_ERROR_UI_MSG;

            task = data.task;
            msg.obj = data;
            mService.mUiHandler.sendMessage(msg);
        }

        int res = result.get();

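result is an AppErrorResult. result.get() calls wait(), which blocks the current Binder call until the matching notify arrives. The code before it pops up the "has stopped" dialog, AppErrorDialog. result is passed into AppErrorDialog along with data, and when the dialog is dismissed it calls result.set(), which wakes up the waiting Binder thread: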

AppErrorResult

final class AppErrorResult {
    public void set(int res) {
        synchronized (this) {
            mHasResult = true;
            mResult = res;
            notifyAll();
        }
    }

    public int get() {
        synchronized (this) {
            while (!mHasResult) {
                try {
                    wait();
                } catch (InterruptedException e) {
                }
            }
        }
        return mResult;
    }

    boolean mHasResult = false;

    int mResult;
}
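
The interaction between a blocked caller and a dialog that dismisses itself after a timeout can be sketched in plain Java. Everything here is illustrative: BlockingDialogResultSketch, waitForDismiss and the two constant values are hypothetical, a scheduled executor stands in for the dialog's handler, and the timeout in the note below is a few milliseconds rather than minutes.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class BlockingDialogResultSketch {
    // Illustrative result codes; not the values used by the framework.
    static final int FORCE_QUIT = 1;
    static final int TIMEOUT = 2;

    /** Same wait/notify shape as the AppErrorResult listing. */
    static final class Result {
        private boolean hasResult;
        private int value;

        synchronized void set(int v) {
            hasResult = true;
            value = v;
            notifyAll(); // wake up the thread blocked in get()
        }

        synchronized int get() {
            boolean interrupted = false;
            while (!hasResult) {
                try {
                    wait();
                } catch (InterruptedException e) {
                    interrupted = true; // keep waiting, restore the flag later
                }
            }
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
            return value;
        }
    }

    /** Blocks the caller until the simulated dialog times out. */
    static int waitForDismiss(long timeoutMillis) {
        Result result = new Result();
        ScheduledExecutorService ui = Executors.newSingleThreadScheduledExecutor();
        try {
            // Nobody clicks anything, so only the timeout sets the result.
            ui.schedule(() -> result.set(TIMEOUT), timeoutMillis, TimeUnit.MILLISECONDS);
            return result.get();
        } finally {
            ui.shutdown();
        }
    }
}
```

Calling waitForDismiss(50) blocks the calling thread for roughly 50 ms and then returns TIMEOUT, which is the same shape as a Binder thread sitting in result.get() until the dialog gives up on the user.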

The remaining handling then runs, and only after the Binder call returns does the App process finally kill itself:

finally {
    // Try everything to make sure this process goes away.
    Process.killProcess(Process.myPid());
    System.exit(10);
}

Note that the AppErrorDialog constructor contains:

// After the timeout, pretend the user clicked the quit button
mHandler.sendMessageDelayed(
        mHandler.obtainMessage(TIMEOUT),
        DISMISS_TIMEOUT);

If the user never responds, the call returns after 5 minutes. Watch for the following log:

Slog.w(TAG, "handleApplicationStrictModeViolation; res=" + res);

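Because the call returns only after the timeout, the app process can stay around in a crashed state for as long as 5 minutes. Every thread except the one that threw keeps working, so strange things may happen. The process should have died and been restarted, but since it has not been killed, ActivityManagerService does not receive the binderDied notification, so no restart happens until the timeout expires.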
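
The "crashed but still working" state can be demonstrated with a small plain-Java sketch. It is only an analogy under assumptions: a CountDownLatch stands in for the pending dialog result, and the names CrashedButAliveSketch and ticksWhileHandlerBlocked are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class CrashedButAliveSketch {
    /**
     * Returns how many work items another thread completed while the
     * crashed thread was still blocked inside its uncaught-exception handler,
     * or -1 if the crashed thread was unexpectedly no longer alive.
     */
    static int ticksWhileHandlerBlocked() {
        CountDownLatch handlerEntered = new CountDownLatch(1);
        CountDownLatch dialogDismissed = new CountDownLatch(1); // stands in for the dialog result
        AtomicInteger ticks = new AtomicInteger();

        Thread crashing = new Thread(() -> {
            throw new IllegalStateException("boom");
        }, "crashing");
        crashing.setUncaughtExceptionHandler((t, e) -> {
            handlerEntered.countDown();
            awaitQuietly(dialogDismissed); // blocked, like waiting for the crash report call to return
        });

        Thread worker = new Thread(() -> {
            awaitQuietly(handlerEntered);  // start only once the "crash" is in progress
            for (int i = 0; i < 3; i++) {
                ticks.incrementAndGet();   // other threads keep doing their work
            }
        }, "worker");

        crashing.start();
        worker.start();
        joinQuietly(worker);

        boolean crashedThreadStillAlive = crashing.isAlive();
        int observed = ticks.get();

        dialogDismissed.countDown();       // the "dialog" goes away, the handler returns
        joinQuietly(crashing);
        return crashedThreadStillAlive ? observed : -1;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void joinQuietly(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The worker finishes all of its work while the crashing thread is still parked in its handler, which mirrors how the rest of an app keeps running until the crash dialog is dismissed or times out.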
