Development Perils: How to not create a mobile application

by: Matthias Ngeo

Ever since Forus Labs' first mobile application, TimeBloc, was acquired in September 2020, I've mused about writing a short postmortem on its less-than-stellar development. Perhaps as a conclusion to the first chapter in our software engineering careers. I hesitated each time, unsure of how to concisely fit everything into an article. It's almost 2023. Enough time has passed that memories of that period are becoming hazy. I can't hesitate any longer.

Rare photograph of TimeBloc's development circa 2019

Those seeking groundbreaking insights into software engineering should stop reading. This article just describes the aftermath of ignoring practices beaten to death by others.

So gather around the fireplace, as I tell a tale of poor software engineering decisions; of how to not create a mobile application.

Choose Wisely

Our tale begins in early 2019. Three lads, fresh out of polytechnic (high-school equivalent), had an overabundance of time before embarking on their compulsory service. They assumed creating a mobile application to be an entertaining and straightforward affair. However, none of them had any prior professional experience creating mobile applications. You can probably sense where this is heading.

During meetings at their local Starbucks, they pitched wild and fantastical features to include in their time-blocking application. One of those features was real-time synchronization of all the user's data, i.e. time-blocks and settings, across their devices. Debates on whether even that was too fantastical dragged on until a compromise was reached: the feature was deferred to a subsequent release. Unknowingly, the three lads had steered the project away from certain doom.

Deferring the feature was one of the few mistakes avoided. Only after implementing a similar feature in a subsequent application did we discover just how fraught with difficulty it was.

In essence, it was a distributed computing problem. Users could concurrently modify and sync data across several of their devices. To further complicate matters, the application had to be offline-first. That is to say, the application must work even when unconnected to the internet. Modifications had to be reconciled and propagated as they arrived piecemeal. Think "Multi-Leader Replication on Steroids".
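To give a feel for the problem, here is a minimal Python sketch of the simplest reconciliation rule, last-writer-wins. It is not how any of our applications handled syncing. The `Edit` record, its fields and the `merge` function are all made up for illustration, and the sketch assumes every device can stamp its edits with a comparable counter, which is itself a hard problem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """A hypothetical change to one time-block, made on one device."""
    block_id: str
    title: str
    edited_at: int  # assumed logical timestamp; larger means later
    device: str     # used only to break ties deterministically


def merge(local: dict[str, Edit], incoming: list[Edit]) -> dict[str, Edit]:
    """Fold edits arriving from other devices into the local state.

    For each time-block, the edit with the greatest (edited_at, device)
    pair wins, so every device converges on the same state regardless of
    the order in which edits arrive.
    """
    merged = dict(local)
    for edit in incoming:
        current = merged.get(edit.block_id)
        if current is None or (edit.edited_at, edit.device) > (
            current.edited_at,
            current.device,
        ):
            merged[edit.block_id] = edit
    return merged


# Two devices edit the same time-block while offline.
phone_edit = Edit("block-1", "Gym", edited_at=5, device="phone")
laptop_edit = Edit("block-1", "Gym at 6pm", edited_at=7, device="laptop")

# The edits may arrive in either order once the devices reconnect.
phone_first = merge(merge({}, [phone_edit]), [laptop_edit])
laptop_first = merge(merge({}, [laptop_edit]), [phone_edit])
```

Both arrival orders end in the same state, but only because the phone's edit is quietly thrown away. Deciding what should happen instead is where the real difficulty starts.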

Had the three lads stubbornly insisted on real-time synchronization of data, TimeBloc would have remained vaporware to this day. A project's features are quite literally make-or-break. The moral of the story is to choose wisely what to implement: err on the side of caution, and if in doubt, leave it out. KISS. Likewise, don't implement offline-first support and data synchronization together. It's difficult.

Ecological Survey

A few weeks passed in the blink of an eye. The three lads had finished bike-shedding the application's initial features. Said features remained tame, devoid of those deemed too outlandish. Before development commenced, one question remained. Which language and framework should they use?

"Sea port at sunset" - Claude Lorrain, 1639

The three lads found themselves at a port seeking passage across a perilous, sprawling ocean. Once aboard, it was nigh impossible to switch ships mid-voyage. Moored close to shore were two colossal ships, native Android and iOS development, surrounded by flocks of passengers awaiting embarkation. Both ships were remarkably popular, their seaworthiness tried and tested by time. Moored further down the pier was React Native. Despite having been built later, it had proven to be seaworthy and attracted a respectable crowd. Lastly, there was Flutter, a brand-new ship yet to sail its maiden voyage. It incorporated the latest advancements in shipbuilding and was surrounded by crowds on the dock. Nevertheless, few in those crowds were actual passengers.

Lacking the manpower and funding to develop two separate applications, native Android and iOS development were out of the picture. Each ship sailed to a single, different destination, and our funds afforded us passage on only one, yet we sought to visit both. Thus, the only contenders were cross-platform frameworks like Flutter and React Native.

After brief experimentation and poring over documentation, Flutter was chosen. Unbeknown to us was the importance of conducting a thorough ecological survey. That is to say, smitten by the ship's advanced exterior, we forgot to check if the ship's interior was even furnished. Flutter in 2019 wasn't Flutter in 2022. It was still in its infancy. Likewise, the community and open-source ecosystem surrounding the framework were still budding. It was only discovered partway through development that there was no support for Lottie animations.

Sample Lottie animation

Although Rive was supported, good luck convincing any freelance designer to create an animation in that format. Stuck between a rock and a hard place, the difficult decision was eventually made to scrap all animations.

Some other memorable issues included the notification scheduling library not accounting for daylight saving time, and the SQLite Flutter library not supporting desktop environments. The latter meant unit tests depending on SQLite couldn't be run outside an Android/iOS emulator. It greatly influenced the decision to skip unit tests, covered in the next section.
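For the curious, the daylight saving pitfall tends to look something like the Python sketch below. I'm not claiming this is what the library actually did. The function names are hypothetical, the New York time zone is picked purely because its clocks changed on 10 March 2019, and the sketch assumes Python 3.9+ with a time zone database available on the system.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")


def next_by_elapsed_time(fire_at: datetime) -> datetime:
    """Buggy: treats "same time tomorrow" as exactly 24 elapsed hours."""
    in_utc = fire_at.astimezone(timezone.utc) + timedelta(hours=24)
    return in_utc.astimezone(fire_at.tzinfo)


def next_by_wall_clock(fire_at: datetime) -> datetime:
    """Keeps the same local time on the next calendar day.

    Adding a timedelta to an aware datetime moves the wall-clock reading
    and leaves the zone to work out the UTC offset for the new date.
    """
    return fire_at + timedelta(days=1)


# A daily 09:00 reminder, scheduled the day before the clocks go forward.
reminder = datetime(2019, 3, 9, 9, 0, tzinfo=NEW_YORK)

drifted = next_by_elapsed_time(reminder)  # lands at 10:00 local time
on_time = next_by_wall_clock(reminder)    # lands at 09:00 local time
```

The day after the clocks change only has 23 hours of elapsed time before 09:00 comes around again, so the first version fires an hour late and stays late from then on.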

Because of its recency, Flutter's community had yet to take root. This manifested as less publicly available information, owing to the lack of grey-haired Gandalf-types that thrived on other platforms. Consequently, debugging and troubleshooting problems was far more difficult.

One particularly nasty incident occurred after integrating background notification scheduling. In production, reports that the application crashed during start-up began coming in. Further examination revealed that it only affected iPhone 8 devices running a certain iOS version. To complicate matters, the issue could not be replicated on an emulator nor did we own an iPhone 8 running that iOS version. An entire weekend was spent frantically debugging the issue, scouring the internet for any hints to no avail. Desperate, the decision was made to remove background notification scheduling altogether in an emergency patch.

Developing any non-trivial piece of software will inevitably require features offered beyond the language or framework. It is often the surrounding community that provides those missing pieces. Reusing the ship analogy, embarking on a ship guarantees passage but not comfort. The moral of the story is, always conduct an ecological survey on the surrounding open-source ecosystem and community when deciding on a language/framework.

Test Now

In yet another blink of an eye, a few months had passed. Our three lads found themselves wading knee-deep in development work. Things had progressed slower than anticipated while the looming deadline drew close. The metaphorical ship's pace had to be quickened. To lighten the ship, the three lads tossed the lifeboats overboard. They reasoned that the ship wasn't on fire, and that the lifeboats could be retrieved if it ever were, or at the end of the voyage. Long story short, they weren't. The lifeboats remain lost at sea to this day.

Skipping unit testing was controversial. Although we acknowledged it to be potentially disastrous, the motivations seemed rational. Unit tests benefited maintenance in the long term. However, there wasn't going to be a long term if the application missed the initial deadline. Tests could always be added once things had stabilized. In the interim, manual testing should suffice. It couldn't be that bad.

In short, test later gradually became test never. Things could be that bad. Manual testing was time-consuming and unreliable in a constant development flux. That meant manual tests gradually subsided too, while developer confidence plummeted. Eventually, manual testing was only conducted when gluing the UI and business logic together.

The application was built using a pseudo-BLoC architecture composed of several layers. Each developer tackled a single layer in isolation. Contrary to the adage of "integrating often and early", integration only commenced once all layers were individually completed. It was neither often nor early.

Skipping tests and delaying integration was a potent combination. Progress halted and development descended into the nine circles of hell. It was only discovered during integration that each layer behaved contrary to the other developers' expectations. Similar to the Tower of Babel, further examination revealed contrasting interpretations of each layer's supposed behaviour. To remedy the issue, several bootleg modifications were applied over the span of a day, further damaging the application's structure.
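In hindsight, even a handful of tests pinning down each layer's behaviour would have surfaced those contrasting interpretations on day one. Here is a rough sketch in Python rather than Dart, and not taken from TimeBloc's code. `InMemoryBlockStore` and its methods are hypothetical stand-ins for a persistence layer, and the two expectations are merely examples of the kind of detail developers silently disagree on.

```python
import unittest
from datetime import date


class InMemoryBlockStore:
    """Hypothetical persistence layer that keeps time-block titles per day."""

    def __init__(self) -> None:
        self._blocks: dict[date, list[str]] = {}

    def add(self, day: date, title: str) -> None:
        self._blocks.setdefault(day, []).append(title)

    def blocks_on(self, day: date) -> list[str]:
        # Return a copy so callers cannot mutate the stored state.
        return list(self._blocks.get(day, []))


class BlockStoreContract(unittest.TestCase):
    """Expectations the other layers are allowed to rely on."""

    def setUp(self) -> None:
        self.store = InMemoryBlockStore()
        self.day = date(2019, 6, 1)

    def test_day_without_blocks_yields_empty_list_not_none(self) -> None:
        self.assertEqual(self.store.blocks_on(self.day), [])

    def test_blocks_are_returned_in_insertion_order(self) -> None:
        self.store.add(self.day, "Gym")
        self.store.add(self.day, "Lunch")
        self.assertEqual(self.store.blocks_on(self.day), ["Gym", "Lunch"])

    def test_mutating_the_result_does_not_change_the_store(self) -> None:
        self.store.add(self.day, "Gym")
        self.store.blocks_on(self.day).append("Sneaky edit")
        self.assertEqual(self.store.blocks_on(self.day), ["Gym"])
```

Saved to a file, the tests can be run with `python -m unittest`. The tests themselves are trivial. What matters is that the answers to "what comes back for an empty day?" are written down somewhere other than in one developer's head.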

Rare photo of developer debugging TimeBloc, circa 2019

To worsen matters, every imaginable bug surfaced in swarms during manual testing. The application would spontaneously crash and data would become corrupted seemingly at random. Since each individual layer wasn't tested, identifying and isolating the root causes became miniature D&D campaigns. A bug could be caused by the UI, the persistence layer, or anything in between. Speaking from personal experience, nothing is as soul-draining as reaching work at 10am and debugging until 4am the next morning.

In the end, although the application barely met the looming deadline, the decision to forgo unit testing turned the application into a "Haunted Graveyard" for the rest of its lifetime. Future development stalled. Features couldn't be added and existing bugs couldn't be fully stamped out. Because of that, rewriting the application was under consideration shortly before the application was acquired.

We failed to acknowledge the immediate maintenance benefits of unit tests. The time spent performing manual testing surpassed the predicted time to write equivalent unit tests severalfold, and that is before counting the time spent debugging or the toll on developers' morale. Similarly, integrating changes late increased the cost of debugging and modification substantially. This combination forced us to cut features and postpone our plans to implement monetization in the initial release. Shedding tests to quicken velocity is almost always counterproductive. Test now before it becomes test never. That goes both for writing unit tests and integrating early. See Chapter 11 of Software Engineering at Google for a more in-depth treatment of the subject.

Perils

After the previous sections, some leftover material remains, none of it substantial enough to warrant an entire section. Listed below, in no particular order, are other perils encountered during development.

Final Thoughts

Our first foray into the world of professional software engineering wasn't glamorous. Nevertheless, it still represented a significant step forward. Although plenty of lessons were learnt through blood, toil, tears and sweat, I'm glad to be able to sit here and laugh at our own foolish mistakes in hindsight. Likewise, I hope you had a chuckle at the sheer madness even if you didn't take away anything else.

TL;DR


Article was originally published on Medium.

