I feel like I need to discuss this, so I decided to write an article on a hot topic in the Kotlin dev community about one-off events antipatterns.

So the story begins with this article by Manuel Vivo

ViewModel: One-off event antipatterns

I recommend that you read the article and especially the comments to understand the situation fully, but I’ll summarize the article:

Manuel Vivo argues that one-off events that happen inside the UI / UI logic layers of an application must be expressed as state variables instead of streams of objects that can be consumed by the UI components.

I left a comment under that article criticizing the approach the author is suggesting, and I will get into the arguments in a moment. The article since then has gained 1.5K+ claps, and my comment got 300+ claps.

With this article I do not mean to denigrate Manuel in any way or suggest that he is incompetent. Instead, I would like to talk about the drawbacks of the approach suggested in that post, so that you, the reader, can hopefully gain a better understanding of the pros and cons of each approach and make an educated choice.

Let’s begin by walking step by step through the arguments presented in the article.

Exposing events that haven’t been reduced to state from a ViewModel means the ViewModel is not the source of truth for the state derived from those events; Unidirectional Data Flow (UDF) describes the advantages of sending events only to consumers that outlive their producers.

First of all, in my humble opinion, several of the statements here can be corrected:

  1. Exposing events from the VM does not mean that the VM is not the source of truth. I’m not sure I even follow the logic of the statement here, because events are still produced strictly in the viewmodel and cannot be sent from outside, assuming the VM does not expose the underlying mutable channel to consumers of course. That would be a simple leaky abstraction in that case and had an easy fix.
  2. UDF does not mention the advantages of sending events to consumers that outlive producers. I’m not exactly sure how this was concluded, i think that consumers that outlive producers would create a leaky abstraction because now the UI component (composable, consumer) is somehow supposed to outlive the ViewModel or a singleton MVI container. For example, making the producer aware of the existence of a lifecycle of some external entity. The relationship has essentially been flipped and this leads to a leak of the lifecycle of the UI to the underlying business logic layer. In apps, the business logic layer should ideally operate independently of the UI and its lifecycle. So this case, even if supposedly valid, is not related to our discussion since we are talking about the opposite direction of event flow.

Let’s move on.

You should handle ViewModel events immediately, causing a UI state update. Trying to expose events as an object using other reactive solutions such as Channel or SharedFlow doesn’t guarantee the delivery and processing of the events.

I do agree though that both of these APIs (SharedFlow and Channel) do not guarantee the reception of events by the consumer, however, let’s talk a little about why that is a normal and intended behavior and why that problem in my humble opinion can be handled differently in other ways.

First of all, the usage of SharedFlow as a state event holder is not correct in more than 95% of cases (derived from personal experience, mind this). The reason is that SharedFlow will ignore all events that are being emitted when no subscribers are present (by default). This is an intentional behavior of the shared flow API. So using SharedFlow for events is a risky ordeal. Consider this explanation on how SharedFlow operates

“If something has happened and there are consumers that want to know, I will let them all know about it. If there are none, then there is no one to notify, and I will drop the event”.

Can you see how this logic is not suitable for one-off events?

  1. Consumers (Composables / UI Components, but essentially just users) can leave and return at any point, and they will expect that when they return, an event will be waiting for them to happen. This does not make sense as a valid use case of SharedFlow for one-off events.
  2. When there are multiple subscribers, they will all expect that only one of them should receive the event, because if an event is consumed by multiple consumers, then by definition of an “event”, it is expected to be handled in the exact same way across all subscribers. This means that, for example, it makes no sense to show two snackbars with the same text at once, it makes no sense to send two notifications with the same content at once, and it makes no sense to navigate to a different screen in two places at once.

There is a valid use-case for this however, when multiple subscribers can handle the same event in different ways. I consider this use-case an undesirable one however, since the behavior of handling an event will now be spread across multiple places in the code.

So to conclude this section, there is a better approach than sharing events in most cases. Even for valid cases, the producer is going to want to ensure that all subscribers will be present (to correctly fulfill the expectations), and that requirement calls for a separate API, ideally provided by the architectural framework, not the client code.

When events are shared, in most cases, other things are shared. Sharing just events is a use-case that I never encountered throughout my app development career. Whenever I thought I had encountered one, I later had to rewrite that code, because I realized that was an architectural mistake for other reasons than missed events.

Second of all, let’s talk about the supposedly “dangerous” case when an event is lost with a Channel. Even channels, which guarantee that an event will be sent, stored, and then emitted to subscribers, do not guarantee, however, that all the code that was consuming the event, will actually be able to execute in full when the event is handled. This is also a design decision made by the Kotlin team that is called “Prompt Cancelation Guarantee”.

What this means is that even though the coroutine that handles the event will always be started (guaranteed by Channel), there is no guarantee it won’t be canceled due to the Cooperative Cancellation principle of Kotlin Coroutines.

In practice, this means that in about 0.0001% of execution runs (this is a simple guess of mine, just a number to give you a general idea), the code that handles the received event will be canceled “mid-way”, before handling the event properly. The reason this number is so small is that for this issue to arise, the event has to be sent in a window of about 20–50ms before a configuration change or other event that causes cancellation of flow subscriptions. So here are my arguments why that is completely okay:

  1. About 0.0001% is such an incredibly small number that most consumer applications can simply ignore this issue and be fine.
  2. This issue is very easily fixable by a simple context switching inside the handler code. What seemed strange to me is that Manuel has created the issue I linked to himself, but then wrote the article despite receiving an actually working solution in one of the comments under the issue, not mentioning all the other working solutions. I do not know why Manuel chose to simply ignore the solutions that were provided to him. Despite being error prone at first glance, a good architecture framework will simply do the context switching for you.

So with all this we’ve essentially managed to mitigate the issue by addressing the root cause of the problem and not searching for (in my opinion) workarounds.

The next argument the author makes is that:

This is an antipattern because the payment result state modeled in the UI layer isn’t durable or atomic if we think about it in terms of an ACID transaction. The payment may have succeeded as far as the repository is concerned, but we never moved to the proper next screen.

In my opinion, a use-case like this is an architectural problem. This statement assumes that relying on UI to be responsible for data integrity is somehow the correct approach. In a real banking app, a transaction result will not be a simple matter of navigating to a page.

A payment result changes the state of the backend service and/or the database, resulting in persistent state change that can be then observed by the UI logic component (e.g. ViewModel). In this case, a reactive application that follows the aforementioned UDF principles will always react to the changes in the state of the source of truth (not the Viewmodel, but the true source like a backend service database). When the user returns to the payment flow, a request or a websocket connection must be made (for example) to verify that the user is not going to perform the same transaction twice. For example, a transaction id will return PURCHASED state or something similar. In case you are wondering how this could be implemented, I am going to again refer to architectural frameworks whose responsibilities should include such cases, like observing state changes upon resubscription.

I am glad that Manuel added a post-production note about switching context dispatchers that highlighted the approach I described above. In addition to that, I want to say I do not consider that…

However, if that’s not enforced by a lint check, this solution could be error-prone as devs could easily forget it.

…for the reason that this setup, again, is usually done either by an architectural framework or a simple function that can be reused across the whole app. The root cause of the problem of “forgetting” is again solved by abstracting the details away and reusing code. We already have so many things to be aware during development, trying to keep the architectural setup in one place, like a library or a module, is a good idea in my opinion.

The ViewModel should tell the UI what the app state is and the UI should determine how to reflect that. The ViewModel shouldn’t tell the UI which actions it should take.

I agree that determining how to change the UI of the app is not the responsibility of the ViewModel (Container etc.). Trying to add code that decides what action to take based on UI-bound configuration into the ViewModel is where the true problem usually lies among developers.

For example, the problem stated in the article…

For an app that supports multiple screen sizes, the UI action to perform given a ViewModel event might be different depending on the screen size.

…is easily solved by moving the code responsible for the UI-bound logic to the UI itself, not keeping it in a VM. The ViewModel is unaware of the UI that it manages, and if screen size is required to determine the next action taken by the UI in response to an event, then that code should be located on the UI layer itself. For example, I assume someone could write the code that navigates to the payment completion screen like this:

// ViewModel
fun performPayment() {
val isTablet = TODO("how to get this here?")
if (isTablet) sideEffect(GoToPaymentCompletion) else sideEfect(ShowSnackbar(R.string.payment_completed))

This code is problematic because the UI logic is leaked through the abstraction to the VM layer. A correct approach is as simple as:

// ViewModel
fun performPayment() {
// Composable
fun PerformPaymentScreen(nav: Navigator) {
val isTablet = windowSizeClass.isWideScreen
val viewModel by viewModel<PerformPaymentViewModel>()
val snackbarHostState = rememberSnackbarHostState()
viewModel.subscribe { event -> // somewhere here a Dispatchers.Main.immediate switching has been abstracted away from ✅
when(event) {
is ShowPaymentCompletionResult -> if (isTablet) snackbarHostState.showSnackbar(/* ... */) else nav.toPaymentCompletion()

I know. I was making this mistake myself at first. But when you gain experience with KMP and MVI, you are forced to learn to avoid such problems naturally.

State is, events happen. The longer an event isn’t handled, the harder the problem becomes. For ViewModel events, process the event as soon as possible and generate a new UI state from it.

Yes, that’s exactly right. State is, events happen.

But let’s first talk about the second part. That statement assumes that for some reason there will be a significant delay between event sending and consumption. I would argue — not often. The thing is, we’re doing most of our work already on Dispatchers.Main.immediate. That’s the default dispatcher the ViewModel class uses, and that is the dispatcher we use to collect the events now. So the coroutines framework will smartly recognize that no context switching is required and, according to the Dispatchers.Main.immediate documentation, will simply send, process, collect, and then handle an event immediately. That’s literally what the word “immediate” means. That dispatcher was made specifically for this purpose – to be immediate.

In any case, even if I am incorrect here, I don’t believe the argument about squeezing the few milliseconds required to create and pass an object in order to send it to the UI is a worthy one. All of this will happen in under one frame (8ms) on most devices. Even if it doesn’t, the user will never notice a delay until 50 to 70 ms have passed. This is an example of premature optimization in my opinion. If something is not as fast as it should be (which is rare for simple UI events), I usually first start looking at client code, not implementation details of a framework.

To conclude the article, let’s look at this lil’ note at the end of the article

Note: If in your use case the Activity doesn’t finish() and is kept in the backstack, your ViewModel would need to expose a function to clear the paymentResult from the UiState (i.e. setting the field to null) that will be called after the Activity starts the other one. An example of this can be found in the Consuming events can trigger state updates section of the docs.

In my opinion, this is the biggest problem with the approach. Remember how we said that events “happen”? When events are represented through states, they now somehow “are” (a state). I strongly disagree.

To me personally, what is probably more important than watching for rare issues is the fact that the code we write must be understandable, intuitive, and simple. Consider a real word example of how you as a person and how your device operates.

Let’s say you get a notification. But is the notification “a state”? In that case, when we send a notification, what is the “state” that was changed? What is the state that the notification now has? Should we react to something that has already happened multiple times? It has been sent off to the operating system, nowhere to be found in our application code. And it should stay that way. Because our brain assumes that is how the real world works, and how we assume our code should work. I think most people will agree that, in the real world, there are things that happen, and there are things that “exist”. Then why, in the software’s code, there should be no familiar and recognizable, natural concepts representing certain things in the real world?

Throughout my experience, the practice of representing events as state often backfired on our team. With teams that I lead and with projects I work on, when people try to represent events as state, two things happen:

  1. The state property amount grows unnaturally large, becoming cluttered with dozens of properties that mean essentially nothing and are simple flags to represent that something has happened, that need to be cleared, over and over, for each flag there is. The state becomes harder and harder to understand, maintain, and eventually, the state consistency falls apart. This happens simply because team members struggle to understand how an event can be represented as a “state”, and that that state now has to be kept track of. Human brain is able to naturally distinguish events and states, and to contradict that notion is to let developers make obvious mistakes in the app’s logic.
  2. Sooner or later, bugs start to happen. Dozens of functions that set a boolean to true and then to false grow out of control, and soon, one of the developers simply forgets to call the function to “clear the payment result”, leading, for example, to a really bad bug with infinite navigation loop that sends the user back to the success page of a transaction over and over, locking them out of the app forever. This is not represented in crashlytics or analytics services, but rather, in one-star user reviews, which is the worst scenario ever possible for an application in my opinion.

Those above were some of the reasons our team has migrated from “events as state” approach and could not be happier with the stability, maintainability and performance of our app since then.

Source link