The benefits of cloud computing and all its flavors have been well documented already. For the purpose of this post I will not dive deep into the overall pros & cons. I will focus on how cloud has changed the game within the context of innovation and how it has democratized the playing field between big and small budget projects & companies.
I do so with a very practical, real-world example of a project I helped a client with, using Google Cloud, particularly the AutoML Vision services (now part of the Unified AI Platform), and how we worked in very small, quick iterations, Lean Startup style.
In less than 2 months, we had a working POC that had validated our technical approach and initial hypothesis. This opened the door for an additional round of budget approvals and the build for V1.0. I’ll refrain from mentioning the company and the idea, as it’s still a work in progress, but I’ll abstract the overall technical and process approach that helped us achieve what we wanted in record time.
Cloud Benefits for Innovation
“The cloud democratizes the ability to test great ideas and bring them to life.”
Ranjit Bawa, principal and U.S. technology cloud leader, Deloitte Consulting LLP
Just to name a few benefits that are specific to innovation:
- Faster time-to-market
- Only pay for what you use
- Makes the business more agile and provides teams with innovation tools
- Reduces delivery cycles
- Adds scalability, agility, efficiency
- OpEx model vs CapEx
I personally identified the benefits above as the key ones for our specific business and timing needs.
Case in Point – Real-Time Multi-Object Detection Mobile Apps
The Project & Challenges
The main goal was to come up with a POV and next steps on the technical feasibility of an AR mobile app that could detect multiple different objects in real time (with little or no delay) in a “sea” of similar objects. For instance, if we configured the app to detect 5 specific objects, it would have to do so while the camera had around 50 or more similar objects in view. To add to the challenge, we had to test whether the approach worked well with no Internet connectivity.
Imagine something like the following use case. You have a lot of baseball cards on the floor and, in real time, the cards you are interested in are highlighted on your screen as you move along. Thankfully, our challenge didn’t require our objects to be as unorganized as in the image below, but with enough model training, I don’t think that would have been an issue.
The Process & Approach
We had to move fast, so we decided to divide the project into 3 quick iterations of 2 weeks each, where we focused our efforts as follows:
At the end of each 2-week sprint, we had a “pivot/persevere/kill” meeting, where as a team we decided whether the idea had some validity and whether it was worth continuing the effort. This type of meeting is key in a Lean Startup approach.
Key Considerations
Based on the type of users, and the real world scenarios where the app would be used, we came up with the following key considerations and observations we had to pay close attention to.
- Lighting & shadows in different real world situations
- No Internet connectivity
- Distance from user to objects
- Angle of camera & objects
- Predictability and homogeneity of scenarios – high-correlation of training and testing data
Which Platform / Technology to Use
Initially, we considered several possible options regarding technology strategy. We brainstormed some ideas around using:
- AR markers
- OCR (optical character recognition)
- Object detection
Based on the key considerations explained above, we quickly identified that going the Object Detection/AI way would give us the best chance of success. Now, the question was which platform we should use, or whether we should even attempt to build something custom, quick & dirty. There’s just no way we would have been able to build something from scratch in just a few weeks, so we decided to look into several cloud platforms.
In the end, it was a decision between Google Cloud AutoML Vision and AWS SageMaker. We decided to go with Google Cloud AutoML Vision because it ticked all the right boxes. It had a no-code/low-code dashboard, and it was super easy to add training data. It also had an out-of-the-box labeling tool, which is something that’s needed for object detection problems. SageMaker was also nice, but it required some coding, which I didn’t have time for (remember, we only had 6 weeks, and only 2 weeks to test the viability of the approach).
It also has the option to export the trained model as a TensorFlow Lite model that could be hosted on the device.
So, Google Cloud AutoML Vision – Object Detection API it was.
BONUS: if you are a new user, Google gives you 3 months free, along with $300 of credit, which meant that we could run the entirety of our tests and iterations at basically no cost.
Note: AutoML Vision has been rolled up into Google AI Platform (Unified), as of the time of this writing.
Training & Testing
When testing AI/ML applications, it all comes down to how good the model is, and how closely the training data set you use matches a real world scenario. I crowdsourced the training data set to people in my company. To be completely honest, we didn’t get much from them, but just enough.
I had my concerns and reservations about how good the output model could be and how accurate it would be in a real world scenario. The training data set was not what I would call best-in-class: the images were highly correlated, and the set fell a bit short of the recommended size, which could cause imbalanced data, data leakage and/or bad splits.
We exported the resulting model as a TensorFlow Lite model that we could embed within a skeleton mobile app for real-world testing. The app itself was nothing more than a shell with just a couple of parameters we could configure on the fly:
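One common way to reduce the leakage risk with highly correlated images is to split by capture session rather than by individual image, so near-duplicate shots never land on both sides of the train/test boundary. The following is a minimal sketch of that idea, not what the project actually did: the `group_id` values, the split names and the 80/10/10 ratio are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def assign_split(group_id: str, train: float = 0.8, validation: float = 0.1) -> str:
    """Deterministically map a capture-session id to a split name."""
    # Hashing the group id means every image from the same session
    # always ends up in the same split, run after run.
    digest = hashlib.sha256(group_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    if bucket < train:
        return "TRAIN"
    if bucket < train + validation:
        return "VALIDATION"
    return "TEST"


def split_by_group(images):
    """Group (filename, group_id) pairs into splits, keeping sessions together."""
    splits = defaultdict(list)
    for filename, group_id in images:
        splits[assign_split(group_id)].append(filename)
    return dict(splits)


# Hypothetical file names and session ids, for illustration only.
images = [
    ("img_001.jpg", "session-a"),
    ("img_002.jpg", "session-a"),
    ("img_003.jpg", "session-b"),
]
splits = split_by_group(images)
```

With a small data set the realized proportions can drift far from 80/10/10, because whole sessions move together; that is the price of keeping correlated images on one side of the split.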
- Confidence threshold
- Snapshot interval (AutoML Vision is an image-based API, so for a real-time/AR experience, we had to take snapshots every N milliseconds and call the object detection API)
Again, Google Cloud conveniently provides several model export options that we could test as part of the feasibility exercise:
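To make the two parameters concrete, here is a rough sketch of how a confidence threshold and a snapshot interval might interact in a frame loop. It is written in Python purely for illustration (the real shell app was a mobile app whose code is not shown here), and `Detection`, `SnapshotThrottle`, `process_frame` and the `detect` callable are hypothetical names standing in for whatever the embedded model exposes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Detection:
    label: str
    score: float  # model confidence in [0, 1]
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def filter_detections(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep only detections at or above the confidence threshold."""
    return [d for d in detections if d.score >= threshold]


class SnapshotThrottle:
    """Lets a frame through at most once every interval_ms milliseconds."""

    def __init__(self, interval_ms: float):
        self.interval_ms = interval_ms
        self._last_ms: Optional[float] = None

    def should_capture(self, now_ms: float) -> bool:
        if self._last_ms is None or now_ms - self._last_ms >= self.interval_ms:
            self._last_ms = now_ms
            return True
        return False


def process_frame(frame, now_ms: float, throttle: SnapshotThrottle,
                  detect: Callable[[object], List[Detection]],
                  threshold: float) -> Optional[List[Detection]]:
    """Run detection on a frame only when the throttle allows it."""
    if not throttle.should_capture(now_ms):
        return None  # skipped frame: the UI can keep the previous highlights
    return filter_detections(detect(frame), threshold)
```

The trade-off is the usual one: a shorter interval feels closer to real time but runs inference more often, while a higher threshold hides uncertain matches at the cost of missing some objects.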
The Results
We were pleasantly surprised by how good the results were. It took some back and forth on the training data set as well as the in-app configuration settings, but after just a few tries we were getting excellent results that we could reasonably use to back up the next phase of the project. We spent the last couple of weeks creating a slightly better, more user-friendly version of the app, since the intention was to test it in real world scenarios with a handful of real users. We also used that time to improve the training data set, even though we already knew the results were good enough with a less than ideal image data set.
Conclusion
In just 6 weeks, we went from an idea and little budget to testing and validating that:
- An object-detection / ML approach works well for the challenge we had
- Accuracy of model and app passed the acceptance criteria we had set up for ourselves
- Mobile app size / embedded device model was under the limit
- Object detection speed, frame capture & delay were all acceptable
- We had a POC version of the app to distribute to a handful of real users for testing in real-world scenarios
This is something that we would never have been able to do without using the cloud as our innovation tool and Lean Startup as our development and testing approach. We were a team of 2, a cloud / AI architect (myself) and a mobile developer (to develop the shell app), and we moved at a very fast yet consistent pace.
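For readers who want to set similar acceptance criteria, object detection accuracy is commonly scored by matching predicted boxes to hand-labeled ones using intersection over union (IoU). The sketch below shows that general technique for a single object class; the 0.5 IoU cutoff is only an illustrative assumption, and the actual criteria used in this project are not spelled out in this post.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predicted, truth, iou_threshold=0.5):
    """Greedily match each predicted box to at most one ground-truth box."""
    unmatched = list(truth)
    true_positives = 0
    for box in predicted:
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= iou_threshold:
            unmatched.remove(best)  # each labeled box can be matched only once
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

Precision answers "how many highlighted objects were right", while recall answers "how many of the objects we cared about were found"; an acceptance bar would typically set a minimum for both.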
References
- https://deloitte.wsj.com/cio/2018/06/15/how-cloud-can-boost-innovation-2/
- https://cloud.google.com/vision/automl/object-detection/docs/prepare
- https://cloud.google.com/ai-platform-unified/docs/start/introduction-unified-platform
- https://cloud.google.com/ai-platform-unified/docs/start/automl-users
- https://cloud.google.com/vision/automl/object-detection/docs/train-edge