How to Efficiently Run 600+ Pytest Appium Android Tests Within an Hour Using Single Emulator or Parallel Execution?

vipulsai · April 21, 2025, 5:01am

I have a large suite of 600+ pytest Appium test cases for an Android APK. Running them on a single emulator takes several hours, both locally and in Jenkins (headless mode). Attempts at parallel execution with pytest-xdist cause emulator crashes, Appium connection failures, and high test flakiness due to resource constraints on the machine.

I want to complete the full test run successfully in under an hour. Given the hardware limits and instability when running multiple emulators in parallel, I’m considering two possible approaches:

Running all tests sequentially on a single emulator but optimizing for speed and stability
Running tests in parallel across multiple emulators but without overwhelming the machine and causing failures

What are the best strategies, configurations, or tools to achieve fast, reliable execution in either of these approaches? How can I balance emulator resource usage, Appium stability, and test execution speed effectively? Any advice on managing emulator instances, Appium server setup, or test distribution for large-scale Android automation would be highly appreciated.

mykola-mokhnach · April 21, 2025, 7:14am

There are multiple articles available on how to optimize emulator performance, for example "Reconsider the Android emulator for faster testing" by Doug Stevenson (Mesmer, on Medium).

As for my personal experience:

A really powerful machine is necessary to run tests on multiple emulators in parallel on it. Everything matters: CPU, RAM, I/O.
In my tests, the same test performed worse on a headless emulator than on a non-headless one. I assume this depends heavily on the availability of hardware acceleration, but it still makes sense for you to compare the two.
In my experience, horizontal scaling gives better performance and stability than vertical scaling. Examples of horizontally scaled systems are:
- Multiple Docker instances with real devices connected to each one (1 real device → 1 Docker host). Here the performance should be optimal, as real devices provide the resources needed for the app and the test to run without affecting host resources.
- Multiple Docker instances with emulators running in headless mode (1 emulator → 1 Docker host). In this solution a single unit is slower because of the lack of hardware acceleration, but this might be compensated for by the number of parallel runners. It is important to mention that the Docker VMs should be properly distributed across their physical hosts to balance the usage of shared host resources.
- Multiple VM instances with hardware acceleration enabled, each running an emulator (1 emulator → 1 virtualized host). This might be the most complicated setup type. You need to look at GPU virtualization technologies, which allow a host GPU to be shared among virtual machines; see, for example, "Proxmox: Installing and Enabling Intel Split GPU SR-IOV (GVT-g) on 12Gen/13Gen CPUs" by Abdelwahed AJ (Medium).
- Multiple host devices with emulators and no virtualization (1 emulator → 1 physical host). This might be more expensive and less flexible from the perspective of scaling and maintenance, but it is also the most performant option if done properly. Consider using mini PCs with integrated Intel graphics for this purpose. I mention Intel because AFAIK their drivers work best on Linux.
Consider using a mobile testing cloud provider. It would be a (much more) expensive option, but it won’t require any local hardware setup or maintenance. Simply plug and play. If you don’t like a provider’s performance, you can always switch to another one.
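
As a rough sketch of how the number of parallel runners can compensate for slower individual units (as in the headless Docker option above), the required runner count can be estimated from the measured sequential run time. The function name, the slowdown factor, and all the numbers below are assumptions for illustration, not measurements, and the estimate assumes the tests are spread evenly across runners.

```python
import math


def runners_needed(sequential_minutes, target_minutes,
                   slowdown=1.0, fixed_overhead_minutes=0.0):
    """Estimate how many parallel runners are needed to meet a time target.

    sequential_minutes: wall time of the full suite on one reference emulator
    target_minutes: desired wall time for the whole run
    slowdown: how much slower one runner is than the reference (assumed value,
              e.g. 1.5 for a headless emulator without hardware acceleration)
    fixed_overhead_minutes: per-runner startup cost (emulator boot, app install)
    """
    budget = target_minutes - fixed_overhead_minutes
    if budget <= 0:
        raise ValueError("fixed overhead alone exceeds the target time")
    return math.ceil(sequential_minutes * slowdown / budget)


# Hypothetical numbers: a 4-hour sequential run, a 60-minute target,
# runners 1.5x slower than the reference, 5 minutes of startup per runner.
print(runners_needed(240, 60, slowdown=1.5, fixed_overhead_minutes=5))  # 7
```

In practice the result is a lower bound, since uneven test durations and flaky retries eat into the budget.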

You’d also need to think about a software orchestration layer over the runner machines. Check, for example, ZooKeeper or Ansible for this purpose.
You may also use minikube/k3s/Proxmox to manage scaling and deployment of virtual instances.
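
Whatever orchestration tool is used, each runner needs its own slice of the suite. A minimal sketch, assuming per-test durations are available from a previous run (the test names and timings here are hypothetical): assign the longest tests first, always to the currently least loaded runner, so that all runners finish at roughly the same time.

```python
import heapq


def shard_tests(durations, runner_count):
    """Split tests into runner_count shards with roughly equal total duration.

    durations: dict mapping a test ID to its duration in seconds
    Returns a list of shards; each shard is a list of test IDs.
    """
    if runner_count < 1:
        raise ValueError("runner_count must be at least 1")
    # Min-heap of (accumulated load, runner index): the least loaded runner
    # is always popped first.
    heap = [(0.0, i) for i in range(runner_count)]
    heapq.heapify(heap)
    shards = [[] for _ in range(runner_count)]
    # Longest tests first; ties are broken by name to keep the result stable.
    for name, seconds in sorted(durations.items(),
                                key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        shards[idx].append(name)
        heapq.heappush(heap, (load + seconds, idx))
    return shards


# Hypothetical durations in seconds, e.g. collected from an earlier run.
example = {
    "test_login": 90,
    "test_checkout": 60,
    "test_search": 40,
    "test_profile": 30,
    "test_logout": 20,
}
for index, shard in enumerate(shard_tests(example, 2)):
    print(index, shard)
```

Each resulting list could then be passed as test IDs to a separate pytest invocation on its own runner.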