Mpesa downtimes – Safaricom NOT to blame
by Idd Salim on Apr.19, 2010, under Personal, Symbiotic, Zunguka
Last week, I called my landlord to ask her why there was no water in my apartment. She told me, “Call the city council and ask them. I just connect you to water. I don't provide it.” So I packed my stuff and moved to the leafy suburbs where the taps never run dry.
Also last week [what an eventful week, Arsenal match included], I mathematically demonstrated that it was IMPOSSIBLE for Mpesa to go down due to user load. But the downtime issue occurred again last week! This downtime lasted so long that I, for just some seconds, assumed that the eye of the Nebula had finally opened and the finger of god was about to start poking us all. I could not send money home, and I had to cancel my Friday night Pool Hustling to take the money back home, by hand. Yuck! 2002 all over again.
But now, Mpesa is back up. We are all smiling. Long live Safaricom. Until the next downtime. Then we can all switch back to Safaricom-ni-madogi mode.
After a response by Kaduki and a blog posting by Kachwanya (both very learned, incisive and non-partisan friends of mine and former Stacherians) about an element of the downtime that Safaricom cannot control, I decided to do my own research, and what I found out was interesting.
The Mpesa Architecture
Note/Disclaimer: The map above is my own sketch of how the Mpesa system would hypothetically work. It is in no way endorsed by Vodafone or Safcom. Ok.. Safcom wouldn’t, of course; so let me stop at Vodafone.
From the WAN-map above, we see that Mpesa has 3 primary points-of-failure.
Point 1 : Data Path-1-to-2 Request Path
If the link between Saf and Voda fails (cut, rained on, power issues or just the plain fear of Makmende), your Mpesa will fail. Shared responsibility – Saf-Voda
Point 2 : Data Path-2-to-3 Auth path
If for some reason Voda does not get a full handshake and ACK from the bank, then your Mpesa will fail. Shared responsibility – Bank-Voda
Point 3 : Data Path-3-to-2-to-1 Response path
If for some reason Saf does not get a full and timely response from Voda, then your Mpesa will fail. Shared responsibility – Saf-Voda
So, clearly, Safaricom might do, and I suspect always does, her part, and VERY fast at that, but the multiple points of failure make them look bad in the eyes of the public.
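To put a rough number on why a chain of hops looks worse than any single hop, here is a minimal sketch. It assumes the three paths fail independently, and the availability figures are made up by me purely for illustration; they are not measurements from Safaricom, Vodafone or any bank.

```python
from math import prod


def end_to_end_availability(hop_availabilities):
    """Chance that every hop in a serial chain is up at the same moment.

    Assumes the hops fail independently of each other.
    """
    return prod(hop_availabilities)


# Hypothetical availability per path, for illustration only.
hops = {
    "Path 1-to-2 (request, Saf to Voda)": 0.999,
    "Path 2-to-3 (auth, Voda to bank)": 0.999,
    "Path 3-to-2-to-1 (response, back to Saf)": 0.999,
}

overall = end_to_end_availability(hops.values())
print(f"end-to-end availability: {overall:.6f}")
```

Under those assumptions, the whole chain is always a little less available than its weakest link, even if every party does its own part well.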
Solution
Many come to mind:
- Develop a Kenyan Mpesa. Locally hosted and run. No downtime.
- Take and work on daily data snapshots. Reconcile with Voda at end of day, à la the ATM model.
- Work on a store-and-forward modus operandi where there is a system-trust threshold based on the last known user balance, so that the client ALWAYS gets served and reconciliation is delayed a bit. This could also be made more secure by placing repeat requests by this client on queue-2 if reconciliation is not yet done.
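The store-and-forward idea in the last bullet could look roughly like the sketch below. Everything in it is my own assumption: the class and method names, the 50% trust fraction, and the choice to decline requests above the threshold (the idea as stated does not say what should happen to those). It is a toy model, not how Mpesa actually works.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Account:
    last_known_balance: float                      # balance at the last reconciliation
    pending: list = field(default_factory=list)   # served locally, not yet reconciled


class StoreAndForward:
    """Toy model: serve locally within a trust threshold, reconcile later."""

    def __init__(self, trust_fraction=0.5):
        self.trust_fraction = trust_fraction  # hypothetical: share of balance we trust
        self.accounts = {}
        self.queue_2 = deque()                # repeat requests awaiting reconciliation

    def open_account(self, user, balance):
        self.accounts[user] = Account(last_known_balance=balance)

    def request(self, user, amount):
        acct = self.accounts[user]
        if acct.pending:
            # Client already has an unreconciled transfer: hold the repeat request.
            self.queue_2.append((user, amount))
            return "queued"
        if amount <= acct.last_known_balance * self.trust_fraction:
            acct.pending.append(amount)       # store now, forward upstream later
            return "served"
        return "declined"                     # assumption: above threshold is refused

    def reconcile(self, user, confirmed_balance):
        """Upstream has confirmed the balance; replay this user's queued requests."""
        acct = self.accounts[user]
        acct.last_known_balance = confirmed_balance
        acct.pending.clear()
        results = []
        for _ in range(len(self.queue_2)):    # one pass over what was queued so far
            queued_user, amount = self.queue_2.popleft()
            if queued_user == user:
                results.append(self.request(queued_user, amount))
            else:
                self.queue_2.append((queued_user, amount))
        return results


system = StoreAndForward(trust_fraction=0.5)
system.open_account("alice", 1000)
print(system.request("alice", 400))   # within 50% of 1000
print(system.request("alice", 100))   # repeat before reconciliation
print(system.reconcile("alice", 600)) # queued request is replayed
```

The trust fraction is the knob here: set it low and less money is at risk while the upstream link is down, set it high and more clients get served without waiting.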
That’s all, folks!
Back to code.