It’s not every day that you get the opportunity to have your work showcased front and center on the main landing page of one of the largest companies in the world. Well, today is definitely my lucky day. I was interviewed last month about a drone-related project I’ve been working on that focuses on insurance use cases and on improving safety and productivity using cognitive/artificial intelligence via IBM Watson. I knew it was going to be used for some marketing materials, but the last thing I expected was to have my image right there on ibm.com. I see this as a tremendous honor, and I am humbled by the opportunity and exposure.
Here’s an interview I did with IBM DeveloperWorks TV at the recent World of Watson conference. In it I discuss a project I’ve been working on that analyzes drone imagery to perform automatic damage detection using the Watson Visual Recognition service, and that generates 3D models from the drone images using photogrammetry. The best part: the entire thing runs in the cloud on IBM Bluemix.
Bare Metal servers are dedicated machines in the cloud: not shared, and not virtualized. I’ve got mine set up as a Linux server with 24 cores (48 threads), 64 GB of RAM, an SSD RAID array, multiple GPUs, etc., and it has cut my photogrammetry rendering from hours on my laptop down to merely 10 minutes (in my opinion, the best part).
I’ve done all of my testing with DJI Phantom and DJI Inspire aircraft, but really it could work with images from any camera that embeds GPS information.
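For context, the GPS data in question is just standard EXIF metadata written by the camera. Here’s a minimal sketch, using Apple’s ImageIO framework (not part of the actual processing pipeline), of how you could check whether an image carries the coordinates the reconstruction needs:

import Foundation
import ImageIO

// Read the embedded GPS coordinates from an image's EXIF/GPS metadata.
// Returns nil if the camera did not geotag the photo.
func gpsCoordinates(for imageURL: URL) -> (latitude: Double, longitude: Double)? {
    guard let source = CGImageSourceCreateWithURL(imageURL as CFURL, nil),
        let properties = CGImageSourceCopyPropertiesAtIndex(source, 0, nil) as? [CFString: Any],
        let gps = properties[kCGImagePropertyGPSDictionary] as? [CFString: Any],
        let lat = gps[kCGImagePropertyGPSLatitude] as? Double,
        let latRef = gps[kCGImagePropertyGPSLatitudeRef] as? String,
        let lon = gps[kCGImagePropertyGPSLongitude] as? Double,
        let lonRef = gps[kCGImagePropertyGPSLongitudeRef] as? String else {
            return nil
    }
    // South latitudes and west longitudes are stored as positive values plus a reference flag.
    return (latRef == "S" ? -lat : lat, lonRef == "W" ? -lon : lon)
}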
It’s been a while since I’ve posted here on the blog… In fact, I just did the math, and it’s been over 7 months. Lots of things have happened since then: I’ve moved to a new team within IBM, built new developer tools, worked directly with clients on their solutions, worked on a few high-profile keynotes, built apps for kinetic motion and activity tracking, built a mobile client for a chat bot, and even completed some new drone projects. It’s been exciting to say the least, but the real reason I’m writing this post is to share a few of the public projects I’ve been involved with from recent conferences.
I recently returned from Gartner Symposium and IBM’s annual World of Watson conference, and it’s been one of the busiest, yet most exciting, two-week spans I’ve experienced in quite a while.
At both events, we showed a project I’ve been working on with IBM’s Global Business Services team that focuses on using small consumer drones and drone imagery to transform insurance use cases. In particular, it leverages IBM Watson to automatically detect roof damage, and uses photogrammetry to create 3D reconstructions and generate measurements of affected areas, expediting and automating claims processing.
This application leverages many of the services IBM Bluemix has to offer: on-demand Cloud Foundry runtimes, a Cloudant NoSQL database, scalable Cloud Object Storage (S3-compatible storage), and Bare Metal servers on SoftLayer. Bare Metal servers are *awesome*… I have a dedicated server in the cloud with 24 cores (48 threads), 64 GB of RAM, a RAID array of SSD drives, and two high-end GPUs. It has taken the photogrammetric reconstruction and Watson analysis process from 2-3 hours on my laptop down to about 10 minutes.
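The damage-detection piece itself isn’t shared in this post, but to give a flavor of the Watson side: once a custom Visual Recognition classifier has been trained on example images, classifying a new image is a single REST call. The sketch below assumes the v3 REST API of that era, with a hypothetical API key, classifier ID, and image URL (none of these come from the actual project):

import Foundation

// Hypothetical values; the real app's classifier and credentials are not shown here.
let apiKey = "YOUR_VISUAL_RECOGNITION_API_KEY"
let classifierId = "roof_damage_classifier"          // illustrative custom classifier name
let imageURL = "https://example.com/drone-photo.jpg" // image to classify

// Watson Visual Recognition v3 "classify" call against a hosted image URL.
var components = URLComponents(string: "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classify")!
components.queryItems = [
    URLQueryItem(name: "api_key", value: apiKey),
    URLQueryItem(name: "version", value: "2016-05-20"),
    URLQueryItem(name: "url", value: imageURL),
    URLQueryItem(name: "classifier_ids", value: classifierId)
]

URLSession.shared.dataTask(with: components.url!) { data, _, error in
    guard let data = data, error == nil else { return }
    // The JSON response contains per-class confidence scores for the image.
    if let json = try? JSONSerialization.jsonObject(with: data) {
        print(json)
    }
}.resume()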
It’s been an incredibly interesting project, and you can check it out yourself in the links below.
World of Watson
World of Watson was a whirlwind of the best kind… I had the opportunity to join IBM SVP of Cloud, Robert LeBlanc, on stage as part of the Cloud keynote at T-Mobile Arena (a huge venue that seats over 20,000 people) to show off the drone/insurance demo, plus two more presentations and an “ask me anything” session on the expo floor.
You can also check out my session “Elevate Your Apps with IBM Bluemix” on UStream to see an overview in much more detail:
…and that’s not all. I also finally got to see a complete working version of the Olympic cycling team’s training app on the expo floor, including cycling/biometric feedback, video, etc. I worked with an IBM jStart team and wrote the video integration layer for the mobile app using IBM Cloud Object Storage and Aspera for efficient network transmission.
On this project we’ve been working with a partner, DataWing, which provides drone image/data capture as a service. However, I’ve also been flying and capturing my own data. The app can process virtually any images with the appropriate metadata, but I’ve been putting both the DJI Phantom and Inspire 1 to work, and they’re performing fantastically.
Here’s a sample point-cloud scan I did of my office. 🙂
In my last post I mentioned some new announcements related to the Swift programming language at IBM. Upon further thought, I guess it’s probably not a bad idea to re-post more detail here too…
If you didn’t see or hear it last week, IBM unveiled several projects to advance the Swift language, which we think will have a profound impact on developers and developer productivity in the years to come. You can view a replay of the IBM announcement in the video embedded below, or just scroll down for direct links:
Here are quick links to each of the projects listed:
Kitura
A lightweight web framework, written in Swift, that allows you to easily build web services with complex routes. Learn more…
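To give a sense of how lightweight it is, here’s a minimal route based on Kitura’s basic routing API (details shifted a bit across the early releases, so treat this as a sketch rather than the announcement’s exact code):

import Kitura

let router = Router()

// A simple GET route; Kitura also supports parameterized routes.
router.get("/hello") { request, response, next in
    response.send("Hello from Kitura!")
    next()
}

// Start an HTTP server on port 8080 and hand it the router.
Kitura.addHTTPServer(onPort: 8080, with: router)
Kitura.run()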
Updated IBM Swift Sandbox
The Swift Sandbox enables developers to learn Swift, build prototypes and share code snippets. Whatever your Swift ambitions, join the over 100,000 community members using the Sandbox today. Learn more…
OpenWhisk
OpenWhisk is an event-driven compute platform that executes application logic in response to events or through direct invocations from web or mobile apps, or other endpoints. Learn more…
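For a taste of the programming model, an OpenWhisk action can be as small as a single Swift function: the platform calls main() with the event’s parameters as a dictionary and treats the returned dictionary as the JSON result (the exact dictionary types have varied across Swift versions, so consider this a sketch):

// hello.swift: a minimal OpenWhisk action written in Swift
func main(args: [String: Any]) -> [String: Any] {
    // Parameters passed on invocation (or bound to the action) arrive in args.
    let name = args["name"] as? String ?? "stranger"
    // Whatever dictionary is returned becomes the action's JSON response.
    return ["greeting": "Hello, \(name)!"]
}

You would then create and invoke it with the wsk CLI, e.g. wsk action create hello hello.swift followed by wsk action invoke hello --blocking --result --param name World.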
I recently gave a presentation at IBM Insight on cognitive computing in mobile apps. I showed two apps: one that uses Watson natural language processing to perform search queries, and another that uses the Watson translation and text-to-speech services to take text in one language, translate it to another, and then have the app play back spoken audio in the translated language. It’s this second app that I want to highlight today.
In fact, it gets much cooler than that. I had an idea: “What if we hook up an OCR (optical character recognition) engine to the translation services?” That way, you can take a picture of something, extract the text, and translate it. It turns out, it’s not that hard, and I was able to put together this sample app in just under two days. Check out the video below to see it in action.
To be clear, I ended up using a version of the open source Tesseract OCR engine targeting iOS. This is not based on any of the work IBM Research is doing with OCR or natural scene OCR, and should not be confused with any IBM OCR work. This is basic OCR and works best with dark text on a light background.
The Tesseract engine lets you pass in an image, handles the OCR operations, and returns a collection of words that it was able to extract from that image. Once you have the text, you can do whatever you want with it.
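For illustration, here’s roughly what that looks like in Swift with the G8Tesseract class from the TesseractOCRiOS pod (a sketch; the exact optionality and method names can vary by pod version):

import UIKit
import TesseractOCR

// Run Tesseract OCR over a captured photo and return whatever text it can extract.
func recognizeText(in image: UIImage) -> String {
    let tesseract = G8Tesseract(language: "eng") // English-trained language data
    tesseract?.image = image                     // dark text on a light background works best
    tesseract?.recognize()                       // synchronous; run off the main thread in practice
    return tesseract?.recognizedText ?? ""
}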
So, here’s where Watson Developer Cloud Services come into play. First, I used the Watson Language Translation Service to perform the translation. When using this service, I make a request to my Node.js app running on IBM Bluemix (IBM’s cloud platform). The Node.js app acts as a facade and delegates to the Watson service for the actual translation.
You can check out a sample on the web here:
On the mobile client, you just make a request to your service and do something with the response. The example below uses the IMFResourceRequest API to make a request to the server (this can be done in either Objective-C or Swift). IMFResourceRequest is the MobileFirst wrapper for networking requests that enables the MobileFirst/Mobile Client Access service to capture operational analytics for every request made by the app.
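Here’s a rough sketch of what such a request can look like in Swift. This is not the original snippet: the /translate route and host are illustrative placeholders, and the exact Swift bridging of requestWithPath:method: and sendWithCompletionHandler: can vary by SDK and Swift version:

// Request a translation from the Node.js facade running on Bluemix.
// The route and host below are illustrative placeholders.
let path = "https://my-translator-app.mybluemix.net/translate?text=Hello&source=en&target=es"

// Routing the call through IMFResourceRequest lets Mobile Client Access
// record operational analytics for the request.
let request = IMFResourceRequest(path: path, method: "GET")
request?.send { response, error in
    if let error = error {
        print("Translation request failed: \(error)")
    } else if let text = response?.responseText {
        // e.g. update the UI, or hand the result to Text to Speech
        print("Translated text: \(text)")
    }
}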
Once you receive the result from the server, you can update the UI, make a request to the text-to-speech service, or do pretty much anything else.
To generate audio using the Watson Text to Speech service, you can either use the Watson Speech SDK, or use the Node.js facade again to broker requests to the Watson Text to Speech service. In this sample I used the Node.js facade to generate FLAC audio, which I played in the native iOS app using the open source Origami Engine library, which supports the FLAC audio format.
You can preview audio generated using the Watson Text To Speech service using the embedded audio below. Note: In this sample I’m using the OGG file format; it will only work in browsers that support OGG.
English: Hello and welcome! Please share this article with your friends!
Spanish: Hola y bienvenido! Comparta este artículo con sus amigos!
On the native iOS client, I download the audio file and play it using the Origami Engine player. This could also be done with the Watson iOS SDK (much easier), but I wrote this sample before the SDK was available.
//download the contents of the audio file
NSData *audioData = [NSData dataWithContentsOfURL:flacURL];
NSString *docDirPath = NSTemporaryDirectory();
NSString *filePath = [NSString stringWithFormat:@"%@transcript.flac", docDirPath];
[audioData writeToFile:filePath atomically:YES];
//pass the file url to the origami player and play the audio
NSURL* fileUrl = [NSURL fileURLWithPath:filePath];
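//(playback itself would follow here, e.g. by handing fileUrl to an OrigamiEngine ORGMEngine instance; the exact call depends on the library version)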
Cognitive computing is all about augmenting the user’s experience and enabling users to perform their tasks more efficiently and effectively. The Watson language services enable any app to better facilitate communication and broaden the reach of content across diverse user bases. You should definitely check them out to see how Watson services can benefit you.
So, I mentioned that this app uses IBM MobileFirst offerings on Bluemix. In particular I am using the Mobile Client Access service to collect logs and operational analytics from the app. This lets you capture logs and usage metrics for apps that are live “out in the wild”, providing insight into what people are using, how they’re using it, and the health of the system at any point in time.
You can access the sample iOS client and Node.js code at https://github.com/triceam/Watson-Translator. Setup instructions are available in the readme document. I intend to update this app with some more translation use cases in the future, so be sure to check back!