Here’s an interesting question I saw on StackOverflow recently; it was interesting because the answer seems obvious at first, but making a small change to the question makes the answer very different.
The original question was: suppose we have an asynchronous workflow where we need to get an integer to pass to another method. Which of these, if any, is the better way to express that workflow?
Task<int> ftask = FAsync();
int f = await ftask;
M(f);

or

int f = await FAsync();
M(f);
The answer of course is that all of these are the same workflow; they differ only in the verbosity of the code. You might argue that when debugging the code it is easier to debug if you have one operation per line. Or you might argue that efficient use of vertical screen space is important for readability and so the last version is better. There’s not a clear best practice here, so do whatever you think works well for your application.
(If it is not clear to you that these are all the same workflow, remember that “await” does not magically make a synchronous operation into an asynchronous one, any more than “if(M())” makes M() a “conditional operation”. The await operator is just that: an operator that operates on values; the value returned by a method call is a value like any other! I’ll say more about the true meaning of await at the end of this episode.)
But now suppose we make a small change to the problem. What if instead we have:
M(await FAsync(), await GAsync());
? This workflow is equivalent to:
Task<int> ftask = FAsync();
int f = await ftask;
Task<int> gtask = GAsync();
int g = await gtask;
M(f, g);
but that causes the start of the GAsync task to be delayed until after the FAsync task finishes! If the execution of GAsync does not depend on the completion of FAsync then we would be better off writing:
Task<int> ftask = FAsync();
Task<int> gtask = GAsync();
int f = await ftask;
int g = await gtask;
M(f, g);

or

Task<int> ftask = FAsync();
Task<int> gtask = GAsync();
M(await ftask, await gtask);
and possibly get some additional efficiency in our workflow; if FAsync is for some reason delayed then we can still work on GAsync’s workflow.
Always remember when designing asynchronous workflows: an await is by definition a position in the workflow where the workflow pauses (asynchronously!) until a task completes. If it is possible to delay those pauses until later in the workflow, you can sometimes gain very real efficiencies!
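To make the difference concrete, here is a minimal sketch that times both shapes of the workflow. The bodies of FAsync and GAsync are hypothetical stand-ins that simulate I/O with Task.Delay, and M is assumed to simply add its arguments; none of that comes from the original question.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class AwaitOrderingDemo
{
    // Hypothetical stand-ins: each simulates an I/O-bound call with a delay.
    static async Task<int> FAsync() { await Task.Delay(300); return 1; }
    static async Task<int> GAsync() { await Task.Delay(300); return 2; }

    // Assumed for illustration: M just combines the two results.
    static int M(int f, int g) => f + g;

    // GAsync is not even started until FAsync's task has completed.
    public static async Task<(int Result, TimeSpan Elapsed)> SequentialAsync()
    {
        var sw = Stopwatch.StartNew();
        int result = M(await FAsync(), await GAsync());
        return (result, sw.Elapsed);
    }

    // Both tasks are started before the first await, so the delays overlap.
    public static async Task<(int Result, TimeSpan Elapsed)> ConcurrentAsync()
    {
        var sw = Stopwatch.StartNew();
        Task<int> ftask = FAsync();
        Task<int> gtask = GAsync();
        int result = M(await ftask, await gtask);
        return (result, sw.Elapsed);
    }

    public static async Task Main()
    {
        var seq = await SequentialAsync();
        var con = await ConcurrentAsync();
        Console.WriteLine($"sequential: {seq.Result} in {seq.Elapsed.TotalMilliseconds:F0} ms");
        Console.WriteLine($"concurrent: {con.Result} in {con.Elapsed.TotalMilliseconds:F0} ms");
    }
}
```

With these simulated delays the sequential version should take roughly twice as long as the concurrent one, though exact timings depend on the machine and timer resolution.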
Thanks, that’s a good tip
I’ve made changes like this in a lot of projects I’ve joined. I don’t think a majority of programmers understand when the operation is kicked off. I believe they assume it is when you await the operation, instead of when you call it. I’ll see N IO calls made one after another with no ordering dependency and make this change for easy gains. It’s a pretty common fix I end up making in a lot of code bases.
One of the unfortunate side effects of making something easy to do is that it also tends to make it easy to misunderstand and to misuse. I would hazard that *maybe* 15-20% of average C# developers know how async/await works and how to use it correctly in common scenarios.
The worst is when misconceptions lead to exclusion of the entire feature. “Async causes deadlocks so we can’t use it” is an argument I’ve heard more than once.
A problem feeding the misunderstanding is different languages adopting different conventions here. As you say, C# starts immediately. But in other languages, it varies – Python has a variety of semantics, and Rust explicitly doesn’t begin until the task is awaited.
We have similar issues with iterator blocks. In an asynchronous coroutine the task is “hot”; when you call Task FooAsync(), it runs until the first await in FooAsync() yields control. But when you call a method IEnumerable Foo() that does a yield return, the body of Foo() does not run *at all*; it returns an enumerable and the body of Foo() runs when MoveNext is called on the enumerable for the first time. Both can be confusing.
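The contrast between the two can be sketched with a small log of what runs when. The method names FooAsync and Foo follow the comment above; the Log list and the gate used to hold FooAsync at its first await are assumptions made only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class HotVersusLazyDemo
{
    public static readonly List<string> Log = new List<string>();

    // "Hot": the body runs synchronously, on the caller's thread, up to the first
    // await of an incomplete task.
    static async Task FooAsync(Task gate)
    {
        Log.Add("FooAsync: before first await");
        await gate; // control returns to the caller here
        Log.Add("FooAsync: after first await");
    }

    // "Cold": calling Foo() runs none of this body; it runs on the first MoveNext.
    static IEnumerable<int> Foo()
    {
        Log.Add("Foo: body started");
        yield return 1;
    }

    public static async Task RunAsync()
    {
        Log.Clear();
        var gate = new TaskCompletionSource<bool>();

        Task task = FooAsync(gate.Task);
        Log.Add("caller: FooAsync returned");

        IEnumerable<int> sequence = Foo();
        Log.Add("caller: Foo returned");

        using (IEnumerator<int> e = sequence.GetEnumerator())
        {
            e.MoveNext(); // only now does the iterator body start running
        }

        gate.SetResult(true); // let FooAsync resume past its await
        await task;
    }

    public static async Task Main()
    {
        await RunAsync();
        foreach (string line in Log) Console.WriteLine(line);
    }
}
```

The log shows "FooAsync: before first await" recorded before the caller regains control, while "Foo: body started" appears only after "caller: Foo returned".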
How about the use of
That would work too, but it is not clear to me that the code would be any more clear. Could you show us what you think the code should look like?
When I looked at your code, using `Task.WhenAll` seemed to jump out at me (rather than awaiting the tasks individually). The reason is that `WhenAll` matches the semantics of what I believe is the intent: “Start all the tasks, and, when they are *all* finished, then call M with the results”. By awaiting them individually, the casual reader will think that the order of the tasks’ completion may be somehow significant. But, that’s just me
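For readers wondering what that might look like, here is one possible sketch of the `Task.WhenAll` shape. The bodies of FAsync, GAsync and M are hypothetical placeholders, as is the wrapper method RunAsync; only the use of `Task.WhenAll` reflects the suggestion above.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-ins for the methods discussed in the post.
    static async Task<int> FAsync() { await Task.Delay(100); return 1; }
    static async Task<int> GAsync() { await Task.Delay(100); return 2; }
    static int M(int f, int g) => f + g;

    public static async Task<int> RunAsync()
    {
        // Both tasks are started before anything is awaited.
        Task<int> ftask = FAsync();
        Task<int> gtask = GAsync();

        // WhenAll completes when both tasks have completed; the results array
        // is in the same order as the tasks passed in.
        int[] results = await Task.WhenAll(ftask, gtask);
        return M(results[0], results[1]);
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync());
    }
}
```

One trade-off: because both tasks must share a result type to come back in a single array, indexing into `results` is a little less self-describing than `await ftask` and `await gtask`.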
`Task.WhenAll()` could be used to avoid unobserved exceptions. As it is, if both `ftask` and `gtask` have errors, you’d never observe the `gtask` error.
That’s a good point, but let’s take a wider view.
Suppose ftask is awaited and the task has completed abnormally; the await throws an exception which is either caught or uncaught.
If it is uncaught then we have an uncaught exception and the meaning of the program is undefined, so any behaviour is correct, and the program will likely terminate.
If it is caught then during the workflow that results from handling of the exception either the program is shut down or it continues running.
If it is shut down, then who cares if gtask’s failure state is never observed? The program is shut down. Do you really care if there are dirty dishes in the sink and unpolished doorknobs if you’re bulldozing the house?
If it continues to run then either the result of gtask is unneeded or it is needed in whatever workflow is recovering from the failure of ftask.
If it is unneeded then why should we care if gtask is faulted or not? It’s an unneeded task in the current workflow.
If it is needed then we get the result by awaiting gtask, and now we’ve observed its failure state.
So in every possible scenario either we observe the failure, or we don’t care about it.
Are there possibilities that I’m missing here?
Unless I’ve missed something, ThrowUnobservedTaskExceptions and/or TaskScheduler.UnobservedTaskException are probably important here (at least in the case that both tasks are faulted but only ftask’s exception is observed). For example, if ThrowUnobservedTaskExceptions is set to true, then not observing the exception in gtask will cause the process to be terminated once gtask is GCed, which is quite possibly not what was intended!
Yes, this was my primary concern; I should have been more clear. I agree that the caller is unlikely to care about the specifics of the second error; in fact, if you await Task.WhenAll, only one of the exceptions is thrown, but they are all “observed”. Something as simple as cancellation could cause both tasks to throw a cancellation exception, so I don’t think this scenario is rare.
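The behaviour described here can be sketched as follows. The failing bodies of FAsync and GAsync and the choice of exception types are assumptions for illustration only.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllExceptionsDemo
{
    // Hypothetical failing operations.
    static async Task<int> FAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("F failed");
    }

    static async Task<int> GAsync()
    {
        await Task.Yield();
        throw new TimeoutException("G failed");
    }

    // Returns the type name of the exception that await rethrew, and how many
    // exceptions the WhenAll task itself recorded.
    public static async Task<(string Thrown, int Recorded)> RunAsync()
    {
        Task<int[]> all = Task.WhenAll(FAsync(), GAsync());
        try
        {
            await all;
            return ("none", 0);
        }
        catch (Exception ex)
        {
            // await surfaces a single exception, but the combined task's
            // AggregateException holds the failures from both tasks.
            return (ex.GetType().Name, all.Exception.InnerExceptions.Count);
        }
    }

    public static async Task Main()
    {
        var outcome = await RunAsync();
        Console.WriteLine($"thrown: {outcome.Thrown}, recorded: {outcome.Recorded}");
    }
}
```

Code that needs every failure, rather than just the one that await rethrows, can inspect `all.Exception.InnerExceptions` in the catch block as shown.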
That is an excellent point which I had forgotten! Thanks — I should update the text to point that out.
Lots of devs get into the bad habit of sequential awaits because their first exposure is EF Core, which would throw runtime exceptions all over the place if two ToListAsync() tasks were in flight at once. It instills a bad practice in them… too bad.
By the way, was there, during the design stage of async/await, a proposition that had automatic, implicit asynchronous wait, and a new special keyword that would prevent it? E.g., instead of
Task-of-int ftask = FAsync();
int f = await ftask;
it would’ve been
Task-of-int ftask = nowait FAsync();
int f = ftask; // var f = ftask would also make f an int, not a Task-of-int
and instead of “M(await FAsync(), await GAsync())” it’d be “M(FAsync(), GAsync())”, and instead of “Task.WhenAll(FAsync(), GAsync())” it’d be “Task.WhenAll(nowait FAsync(), nowait GAsync())”. The type checker would catch most of the inconsistencies in scenarios without heavily nested Task types (like Task-of-Task-of-int or Task-of-Task-of-Task-of-void).
We had this idea during one of the discussions about “await/async” pitfalls, when my conversation partner (wait, does English have no simple, single word for “interlocutor”?) pointed out to me that he regularly forgets to put “await” before method calls, and the compiler is completely okay with it if there is at least one other “await” in the current method. If, on the other hand, the Task-of-T expressions were auto-awaited inside async methods (which is what you want to do 90% of the time), and there were a special keyword to prevent it, that would have been more like the “pit of success” approach.