I recently read an article about optimizing tail recursion in kotlin through the tailrec keyword. I was curious how this is implemented under the hood and decided to do a little experiment.

Tail recursion is a special case of recursion in which the recursive call is the last operation before the function returns. Simply put, the function must return the final result, or return the result of its recursive call.

This example of calculating the Fibonacci sequence very well demonstrates the principle of tail recursion

tailrec fun fibonacci(n: Int, a: BigInteger, b: BigInteger): BigInteger {
return if (n == 0) b else fibonacci(n - 1, a + b, a)
}

Tail recursion is very well optimized through iteration in a flat loop. Let’s look at the decompiled code to understand how tailrec optimizes our recursion

@NotNull
public final BigInteger fibonacci(
int n, @NotNull BigInteger a, @NotNull BigInteger b
) {
while(true) {
if (n == 0) {
return b;
}

n = n - 1;
BigInteger var10001 = a.add(b);
b = a;
a = var10001;
}
}

As you can see, everything has turned into a regular loop and our recursion is no longer in the code. But this will only work strictly for cases of simple tail recursion. If you try to use tailrec for more complex recursive functions, there will be no optimization.

Problem: Adding tailrec to a function without tail recursion will compile successfully, but will not give you any optimization. The maximum you will get is a compiler warning.

To be honest, I was discouraged and created an Issue in JetBrains because in my opinion it is an error condition and not a warning. But perhaps they have a reasonable explanation.

A problem I like to give in interviews is to write a function to find all views that match the search criteria given in an external predicate. This problem has a trivial solution by recursion, but while the candidate is solving it, we can discuss a large number of issues: disadvantages of recursion, functional approach, optimization issues, inlining, function extension, etc.

Here is the simplest solution to this problem using recursion.

fun ViewGroup.findViewRecursion(predicate: (View) -> Boolean): List<View> {
val accumulator = mutableListOf<View>()
if (predicate(this)) {
accumulator.add(this)
}
children.forEach { child ->
when {
child is ViewGroup -> accumulator.addAll(
child.findViewRecursion(predicate)
)
predicate(child) -> accumulator.add(child)
}
}
return accumulator
}

It is clear that this solution has a problem in the form of recursion, but the most important thing is that each time we call recursion, we create and copy a list to return the results to the top.

The problem of copying lists on each call of recursion can be very easily solved by introducing an accumulator in the form of a variable list, which can be passed to the recursive function.

fun ViewGroup.findViewRecursionOpt(predicate: (View) -> Boolean): List<View> {
val accumulator = mutableListOf<View>()
this.internalFindView(predicate, accumulator)
return accumulator
}

private fun ViewGroup.internalFindView(
predicate: (View) -> Boolean,
accumulator: MutableList<View> = mutableListOf()
) {
if (predicate(this)) {
accumulator.add(this)
}
children.forEach { child ->
when {
child is ViewGroup -> {
child.internalFindView(predicate, accumulator)
}
predicate(child) -> accumulator.add(child)
}
}
}

We have already found out that the tailrec keyword will not help us here, since this is a more complex type of recursion. But how do you remove recursion from this fairly typical tree traversal method?

This method is standard and allows you to transform almost any recursion into a one-by-one iteration without recursion. That is, we will store information about depth-first calls not in the stack, but in an external list.

This optimization principle is best demonstrated by the code itself.

fun ViewGroup.findViewQueue(predicate: (View) -> Boolean): List<View> {
val accumulator = mutableListOf<View>()
val queue: Queue<View> = LinkedList()

// add self as a first element of queue
queue.add(this)
while (queue.isNotEmpty()) {
// get and remove next item from queue
val view = queue.poll()

if (predicate(view)) {
accumulator.add(view)
}

// add to queue all childs for current view
if (view is ViewGroup) {
view.children.forEach { queue.add(it) }
}
}

return accumulator
}

First, we add the current item to the queue. Then, in a loop, while there are items in our queue, we take the next item from the queue and process it. If that item has children, we add them to our queue.

This way we can easily process all our nested views without using recursion. This method is a simple and elegant solution that allows you to easily replace almost any recursion.

The queue optimization method works great, but it is not flexible and difficult to integrate into existing solutions.

I decided to create a generic, lazy iterator for iterating over trees. Such an iterator can be easily integrated into standard sequence processing chains and loops. The idea is that we give the iterator the root element and a lambda that returns an iterator of its children for the current element. This way we traverse the tree hierarchy lazily and don’t have nested recursion.

/**
* Lazy iterator for iterate by abstract hierarchy
* @param rootIterator Iterator for root elements of hierarchy
* @param getChildIterator function which get child iterator for current item
* if current item has child return child iterator else return null
*
* Example of using for ViewGroup hierarchy
* TreeIterator<View>(viewGroup.children.iterator) { (it as? ViewGroup)?.children?.iterator() }
*
* @author Max Sidorov on 15.12.2023
*/
class TreeIterator<T>(
rootIterator: Iterator<T>,
private val getChildIterator: ((T) -> Iterator<T>?)
) : Iterator<T> {
private val stack = mutableListOf<Iterator<T>>()

private var iterator: Iterator<T> = rootIterator

override fun hasNext(): Boolean {
return iterator.hasNext()
}

override fun next(): T {
val item = iterator.next()
prepareNextIterator(item)
return item
}

/**
* calculate next iterator for [item]
* if current item has child then get child iterator and save current iterator to stack
* else if current iterator hasn't more elements then restore parent iterator from stack
*/
private fun prepareNextIterator(item: T) {
val childIterator = getChildIterator(item)
if (childIterator != null && childIterator.hasNext()) {
stack.add(iterator)
iterator = childIterator
} else {
while (!iterator.hasNext() && stack.isNotEmpty()) {
iterator = stack.last()
stack.removeLast()
}
}
}
}

Implementing our View search using this iterator becomes a trivial task. And such an iterator can be easily integrated into any lazy data processing chains via sequence.

fun ViewGroup.findViewTreeIterator(predicate: (View) -> Boolean): Sequence<View> {
val treeIterator = TreeIterator(children.iterator()) { view ->
(view as? ViewGroup)?.children?.iterator()
}

return sequenceOf(this)
.plus(treeIterator.asSequence())
.filter { predicate(it) }
}

But there is another interesting way to get rid of recursion, which I would like to talk about here. Sequence has a mechanism for lazy yield statements that I didn’t know about before. Actually, I discovered them when I was writing this article and studying the code for the standard androidx.core.view extensions for ViewGroup.

public val ViewGroup.descendants: Sequence<View>
get() = sequence {
forEach { child ->
yield(child)
if (child is ViewGroup) {
yieldAll(child.descendants)
}
}
}

This function creates a lazy sequence that allows linear iteration through the entire hierarchy of child Views. The key here are the yield and yieldAll functions, which lazily substitute new elements into the overall sequence iterator.

Thus, the addition of child elements will happen lazily, not immediately. That is, the children of the next level of the hierarchy will be added to the sequence iterator exactly at the moment when we reach it while processing our sequence. And each time elements of only one hierarchy level will be added. That is, there will be no recursion here, but new wrappers will be created over the iterators to process each level of the hierarchy (which is not free).

Conceptually, this approach is very similar to how my tree iterator works, but here it is done at the level of support for a yield operator that produces a sequence.

I haven’t actually fully figured out how the yield call works under the hood yet. It uses a principle very similar to the principle of suspend function interrupts in coroutines. I will probably write a big article about it later on.

And finally, our View search function itself, which in the case of yeld has also become trivial.

fun ViewGroup.findViewYield(predicate: (View) -> Boolean): Sequence<View> {
return sequenceOf(this)
.plus(this.descendants)
.filter { predicate(it) }
}

I was wondering which recursion optimization method would give the greatest performance gain.

I created a View hierarchy with different nesting depths and wrote tests that search this hierarchy in all ways. For measurements I used the kotlinx.benchmark library.

To be honest, the results surprised me

As you can see, the usual recursion with list copying gave an error already at 3000 levels of nesting. Optimized recursion with an accumulator gave an error at 5000 levels deep.

Speaking about speed, the optimized recursion is twice as fast as the queue. That is, passing parameters across the stack when calling recursion works much faster than adding and removing an item from the queue. If your tree does not imply a large number of nesting levels, recursion is the best option in terms of speed.

However, if you need to get rid of recursion to handle a hierarchy with a potentially large number of nesting levels, then optimization via a queue is the best option in terms of speed. Although optimization with a lazy iterator does not lose much to it, it is a more flexible solution.

But the biggest disappointment is the yield statement in sequence. I assumed that the speed of the sequence and the lazy expansion of the sequence through yield calls might be low, but I didn’t expect it to be so low. It works hundreds of times slower than my iterator solution, although the principle of their operation is similar.

Source codes of measurement tests: https://github.com/maxssoft/yield_recursion

The test results clearly show that recursion is not an absolute evil, and there is no need to try to get rid of it if the size and depth of your tree allows you to work through recursion.

Moreover, all attempts to remove recursion lead to a decrease in performance, since you have to pay for everything. But in the case where the depth of your tree can exceed the maximum stack size, then the best way is to optimize through a queue. But the most flexible and convenient solution is to use TreeIterator.

And one more important point. Any optimizations must be measured, since it very often happens that logical and optimized code (in the opinion of the developer) begins to work slower than simple and unoptimized code. What seems logical and optimal to us is not always logical and optimal for the compiler.

I created an issue on Google to replace the view hierarchy iteration implementation in the ViewGroup.descendants function with my lazy iterator solution. This is a standard feature and many developers use it without realizing that the performance of hierarchy processing using this feature drops hundreds of times.

Google Issue with optimization of ViewGroup.descendants function

JetBrains issue with problem of sequence.yield call

I will be grateful for your likes in issue and my article.

Source link