Using C/C++ in Android: A Comprehensive Guide For Beginners | by Shubham Panchal | Jan, 2024
As Android developers, we have all been using Java or Kotlin to build beautiful UIs and features. The entire Android stack, from app development to app execution, revolves around the JVM and Java-like features, with the exception of the kernel (the Linux kernel), which is written in C.
Java, as a programming language, has many good features that make it the first choice for app development. It is platform-independent (because it executes on a virtual machine), JIT-compiled, supports multi-threading and offers an expressive, simple syntax for programmers. Due to its platform-agnostic properties, Java packages are portable across CPU architectures, which makes library development easier and thereby enhances the overall ecosystem of plugins, build tools and utility packages.
There is a tradeoff between the number of features and performance. Languages like Assembly have the least memory and execution overhead but also offer the fewest features from a programmer’s perspective. Moving up the hierarchy, languages like C and C++ provide a good set of features while remaining close to the underlying hardware. Above them are languages like Java and Python, which eliminate platform dependence by using virtual machines. Programs written in these languages carry larger runtime overheads but are a developer’s paradise.
Why would someone need C/C++ support in their Android projects?
As discussed above, in some systems performance is more important than developer-friendliness, which shifts the focus from Java/Kotlin to ‘native languages’ (C/C++). Let us go through some examples to understand the role of native code and the performance improvements it brings:
- Graphics, Rendering and Interaction: Developing user interfaces and making them look attractive may seem like child’s play in high-level frameworks like Jetpack Compose. At the pixel level, thousands of calculations are made to determine the intensity of shadows, lighting modes and textures of objects. These calculations make heavy use of linear algebra constructs, like vectors and matrices and their respective operations. Handling touch interactions, which involves processing raw coordinates from the touch sensors on the mobile screen and differentiating between a click, double-click, drag or swipe gesture, also needs heavy computation. These computations are better performed in languages closer to the hardware, where additional optimizations can be applied.
- Machine Learning: The role of C/C++ is evident from the fact that popular frameworks like PyTorch and TensorFlow have a major portion of their codebases written in C/C++. TensorFlow uses operations written in C++ and provides wrappers (interfaces) to use those operations from Python code. The adoption of C++ is natural, as codebases for linear-algebra operations and CUDA (used for parallel processing) were written years ago and have been battle-tested since. Python serves as one of the interfaces for TensorFlow, making the C/C++ internals look neat and tidy, and easy to use for non-programmers.
Many such systems prioritize performance at the cost of readability and some other factors. Next, we’ll have a short discussion on Instruction Set Architectures (ISAs) and how program execution changes across CPU architectures.
The figure above depicts how C/C++ code is used in Android: there are two independent build processes, one for C/C++ code and another for Java/Kotlin code. In this blog, we’ll focus on the C/C++ build process and see how the code communicates with the JVM for function invocation.
We’ll first go through a brief overview of how C/C++ and Java programs are compiled, which mainly highlights the platform-specific nature of C/C++ compilation. Next, we discuss JNI, which acts as the glue between C/C++ and Java code. We conclude our discussion with CMake, shared libraries and ABIs, which are the bottom-most components of the build process.
Let’s get started 🚀
➡️ C++ is a compiled language, where the source code gets converted to executable binary code. The executable contains the binary version of the source program, constants and library code if required.
➡️ This executable is parsed by a component of the operating system called the loader, which allocates memory for the execution of the program and reads instructions from the executable. For instance, if a hello-world C++ program is compiled with g++ on Ubuntu, it will run on another Linux distro too, as long as that machine understands the x86_64 instruction set.
➡️ Mobile devices mostly operate on the arm64 instruction set, hence a program compiled for x86 would not work on them: the two executables are written in completely different languages (as seen by the loader).
Android devices can primarily run on four architectures: armeabi-v7a, arm64-v8a, x86 and x86_64. The arm- architectures are for the ARM-based processors used in most Android mobile phones, whereas the x86- based architectures are used on Intel or AMD processors; examples include emulators running on Windows machines and Chromebooks.
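To make the architecture dependence concrete, here is a minimal sketch that reports which Android ABI a C++ file was compiled for. It assumes a GCC or Clang compiler (the predefined macros below are theirs), and the function name target_abi is made up for this illustration. Mapping every 32-bit ARM build to armeabi-v7a is a simplification.

```cpp
#include <string>

// Returns the Android ABI name matching the architecture this file
// was compiled for, based on compiler-predefined macros (GCC/Clang).
// target_abi is a hypothetical helper, not part of the NDK.
std::string target_abi() {
#if defined(__aarch64__)
    return "arm64-v8a";      // 64-bit ARM
#elif defined(__arm__)
    return "armeabi-v7a";    // 32-bit ARM (simplified mapping)
#elif defined(__x86_64__)
    return "x86_64";         // 64-bit Intel/AMD
#elif defined(__i386__)
    return "x86";            // 32-bit Intel/AMD
#else
    return "unknown";
#endif
}
```

The same source file yields a different return value, and entirely different machine code, for each architecture it is compiled for, which is exactly why one binary cannot serve all devices.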
➡️ If you’ve learnt Java at some point in time, a remarkable feature that is often highlighted in videos and blogs is platform independence or build once, run anywhere. Instead of transforming the source code to a machine-dependent executable format, Java converts the code to an intermediate representation (IR).
➡️ The IR is platform-agnostic, meaning the IR generated on x86 and arm platforms is the same, regardless of the differences in their instruction sets. The IR is parsed by a platform-dependent component called the Java Virtual Machine (JVM), which reads instructions from it and executes them on the underlying CPU. As the JVM has one hand on the IR and the other on the machine’s CPU, it is not platform-agnostic.
Also: The JVM supports Just-In-Time (JIT) compilation, a technique which can provide large performance gains compared to purely interpreted execution. Vaidehi Joshi has written a great blog on JIT compilation.
➡️ The JVM can run on almost all CPU architectures and execute Java code compiled on any platform (as the generated IR is platform-agnostic); the only requirement is that a JVM is installed on the target machine.
To summarize, Java and C++ have different compilation strategies. The key point is that C++ execution is architecture-dependent, so if we use C++ with an architecture-neutral language like Java, we need to make sure that the C++ dependencies are built for each architecture on which they will run.
The JNI, or Java Native Interface, is a framework that allows communication between the JVM and native code (C, C++ or Assembly code) with ease. In general terms, it is a foreign function interface (FFI): it allows code written in one language to communicate with code written in some other language, usually by means of function calls. Java code can look up definitions of functions present in C++ modules, where they’re flagged for use by the JVM.
JNI defines types like jstring, which represent their corresponding Java types (String, in this case) in C++. A JNI function defined in C++, for instance,
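// C++ source file
#include <jni.h>
extern "C" JNIEXPORT jstring JNICALL
Java_com_example_app_MainActivity_compute(  // com.example.app is a placeholder package
    JNIEnv* env,
    jobject instance,
    jstring message,
    jlong length) {
    // Method block goes here
}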
will have an equivalent Kotlin function
// Kotlin source file
external fun compute(message: String, length: Long): String
In MainActivity.kt, the JVM needs to find a definition for the function compute we’ve declared in the code. As we know, the definition is contained within the C++ source file, so how do we provide it to the Java program? We compile our C++ code and package it as a shared library, within which the JVM will search for definitions of the JNI functions.
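The JVM finds the definition by name: for a native method, it looks for a symbol built from the fully-qualified class name and the method name. The sketch below reproduces a simplified version of that naming rule. The helper jni_symbol_name and the package com.example.app are hypothetical, and only the '.' and '_' cases are handled (the JNI specification also escapes ';', '[' and non-ASCII characters, and adds a signature suffix for overloaded methods).

```cpp
#include <string>

// Builds the symbol name the JVM looks for when resolving a native method.
// Simplified sketch: handles only '.' and '_' in the input names.
std::string jni_symbol_name(const std::string& qualified_class,
                            const std::string& method) {
    auto mangle = [](const std::string& part) {
        std::string out;
        for (char c : part) {
            if (c == '.') out += '_';        // package separator becomes '_'
            else if (c == '_') out += "_1";  // a literal underscore is escaped
            else out += c;
        }
        return out;
    };
    return "Java_" + mangle(qualified_class) + "_" + mangle(method);
}
```

For example, jni_symbol_name("com.example.app.MainActivity", "compute") yields Java_com_example_app_MainActivity_compute, which is the name the C++ function must carry for the lookup to succeed.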
Android NDK and Toolchains
We develop Android apps on Windows, macOS or Linux-based operating systems. Most of these systems do not run on the ARM architectures that Android phones use, and compiling the code on the Android device itself is not practical. How, then, do we compile code for the Android-specific ARM architectures which mobile phones use?
We use the Android NDK, the Android Native Development Kit, which offers compilers and linkers to build Android-ARM libraries and executables from x86 or even other arm devices (Apple Silicon or Raspberry Pi). This process of building code for one target (like Android-ARM) on a system running some other target (like x86_64) is termed cross-compilation. So, on a Windows machine, using compilers from the Android NDK, we can build shared libraries for the app which would run perfectly on the mobile device, i.e. on an ARM device.
The Android NDK ships a CMake toolchain file, supplied to CMake through the CMAKE_TOOLCHAIN_FILE variable, that informs CMake which compiler to use. As Wikipedia says, a toolchain is a set of programming tools that is used to perform a complex software development task or to create a software product. The Android NDK provides toolchains targeting different Android API levels to build and compile C/C++ programs.
What is CMake?
If we were to compile a simple C++ hello-world program, we would use GNU’s g++ compiler, which is pre-installed on most Linux distros:
g++ main.cpp -o main
➡️ For a single source file, main.cpp, a single command will do the work. Larger codebases may have multiple modules and numerous C/C++ source files which have to be compiled or built into a shared/static library. Dependencies of such codebases, which are other C++ projects, need to be integrated well. Such a large codebase will also take a long time to compile.
➡️ To counter these problems, GNU’s Make tool can be used. It provides features to manage multiple targets and incremental builds, track header-file dependencies and build code in multiple languages. So, instead of running multiple compilation commands by hand, a single Make script performs the compilation efficiently.
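The idea behind an incremental build can be sketched in a few lines: a target is rebuilt only if it does not exist yet or if one of its dependencies changed after it was last built. This is an illustration of the rule, not Make's actual implementation; needs_rebuild is a hypothetical name, and timestamps are plain integers with a negative value standing for a missing target.

```cpp
#include <vector>

// Decides whether a target must be rebuilt, given its modification time
// and the modification times of its dependencies (sources, headers).
// A negative target_mtime means "the target does not exist yet".
bool needs_rebuild(long target_mtime, const std::vector<long>& dependency_mtimes) {
    if (target_mtime < 0) return true;        // nothing has been built yet
    for (long dep : dependency_mtimes) {
        if (dep > target_mtime) return true;  // a dependency changed after the last build
    }
    return false;                             // target is up to date
}
```

With this rule, touching a single source file only triggers recompilation of the targets that depend on it, which is where the time savings on large codebases come from.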
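# CMakeLists.txt (the target name native-lib is an assumed placeholder)
# Tell CMake to build a shared library (.so) for the given
# source file native-lib.cpp.
# native-lib.cpp also contains the JNI functions
add_library(native-lib SHARED native-lib.cpp)
# CMake can also link other libraries to the current build
# android and log are used to provide android-specific routines
# and logging respectively
target_link_libraries(native-lib android log)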
➡️ CMake can generate Make scripts in a compiler-independent manner and has its own syntax, which allows developers to add dependencies, headers and other libraries that have to be linked at build time. CMake is analogous to Gradle, as both are build systems.
For a quick read, refer to this StackOverflow answer on What is the difference between using a Makefile and CMake to compile the code?
➡️ Compilation of C/C++ code can result in either an executable or a library, both containing binary representations of the source code. An executable has additional details such as the entry point from where execution starts (which eventually calls the main function). Libraries provide functions that can be called by other programs, by linking the library with the program’s object code. On Linux and Android, both executables and shared libraries follow the ELF format.
➡️ In Android, the C/C++ files are compiled to shared libraries, ending with a .so (shared object) extension. These libraries expose the JNI functions we wrote earlier, as they were marked with extern "C" and JNIEXPORT in their prototypes. The JVM can look up these functions in the .so files and use their binary code to execute them on the device.
➡️ Such an interaction between the app’s code and the library code, which happens at the binary level, takes place through an application binary interface (ABI). In contrast, an application programming interface (API) facilitates such an interaction at the source-code level, well before compilation happens.
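One thing an ABI fixes is the size and layout of data types, so that two separately compiled binaries agree on what they pass to each other. The sketch below, with hypothetical helper names, shows values that depend on the ABI the code was compiled for, next to a fixed-width type that does not.

```cpp
#include <cstddef>
#include <cstdint>

// Fixed-width types have the same size under every ABI, which is why
// JNI represents Java's long as a 64-bit integer (jlong) rather than C++ 'long'.
static_assert(sizeof(std::int64_t) == 8, "int64_t is always 8 bytes");

// Pointer width is decided by the ABI: 32 bits on armeabi-v7a and x86,
// 64 bits on arm64-v8a and x86_64.
constexpr std::size_t pointer_bits() { return sizeof(void*) * 8; }

// 'long' is 4 bytes on the 32-bit Android ABIs and 8 bytes on the 64-bit ones.
constexpr std::size_t long_bytes() { return sizeof(long); }
```

A library built for a 32-bit ABI and an app process running under a 64-bit ABI would disagree on these sizes, which is one reason a separate .so is shipped per ABI.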
For an intuitive explanation of ABIs, do check out my LinkedIn post: When two pieces of software need to communicate in the source code, we use APIs. What if two binary modules want to communicate?
The JVM can now access the functions exposed in the shared libraries, and the OS executes them as needed.
I hope this article was interesting and you learnt something new. Do share your doubts and suggestions in the comments below. Have a nice day ahead!