Introducing the new Watson iOS SDK (beta)

I’ve written here in the past about both the impact of cognitive computing and how you can integrate IBM Watson services into your mobile apps to add cognitive language processing capabilities and more. I’m happy to share that IBM has just released a new beta SDK that makes integrating more Watson services into your iOS applications easier than ever.

If you aren’t familiar with cognitive computing, or the transformative impact that it is already having on entire industries, then I strongly suggest checking out this video and related article on IBM DeveloperWorks.

IBM Watson services, which are based on machine learning algorithms, give you the ability to handle unstructured data: text analysis, language translation, speech processing, and more. This makes consuming, mining, and responding to unstructured or “dark” data faster, more efficient, and more powerful than ever.

The new Watson iOS SDK provides developers with an API to simplify integration of the Watson Developer Cloud services into their mobile apps, including the Dialog, Language Translation, Natural Language Classifier, Personality Insights, Speech to Text, Text to Speech, AlchemyLanguage, and AlchemyVision services – all of which are available today and can now be integrated with just a few lines of code.

The Watson iOS SDK makes integration with Watson services really easy. For example, if you want to take advantage of the Language Translation service, you first have to set up a service instance. Once the translation service is set up, you can leverage its translation capabilities within your mobile app:

//instantiate the LanguageTranslation service with your service credentials
let service = LanguageTranslation(username: "yourname", password: "yourpass")

//invoke a translation method: translate two strings from English to Spanish
service.translate(["Hello", "Welcome"], source: "en", target: "es", callback: { (text: [String], error) in
  //do something with the translated text strings
})

I’ve put together a sample application that demonstrates the Language Translation service integration, which you can access at github.com/triceam/Watson-iOS-SDK-Demo.

Be sure to check out the sample’s readme for additional details and setup instructions. As with all of the Watson services, you must have a properly configured service instance, with authentication credentials, in order to consume it within your app.

The new Watson iOS SDK is written in Swift, is open source, and the team encourages you to provide feedback, submit issues, or make contributions.  You can learn more about the Watson iOS SDK, get the source code, and access the open source project here.

Mobile Apps with Language & Translation Services using IBM Watson & IBM MobileFirst

UPDATE 12/22/15: IBM recently released a new iOS SDK for Watson that makes integration with Watson services even easier. You can read more about it here.


I recently gave a presentation at IBM Insight on Cognitive Computing in mobile apps.  I showed two apps: one that uses Watson natural language processing to perform search queries, and another that uses the Watson translation and text to speech services to take text in one language, translate it to another language, and then have the app play back the spoken audio in the translated language.  It’s this second app that I want to highlight today.

In fact, it gets much cooler than that.  I had an idea: “What if we hook up an OCR (optical character recognition) engine to the translation services?” That way, you can take a picture of something, extract the text, and translate it.  It turns out, it’s not that hard, and I was able to put together this sample app in just under two days.  Check out the video below to see it in action.

To be clear, I ended up using a version of the open source Tesseract OCR engine targeting iOS. This is not based on any of the work IBM Research is doing with OCR or natural scene OCR, and should not be confused with any IBM OCR work.  This is basic OCR and works best with dark text on a light background.

The Tesseract engine lets you pass in an image, then handles the OCR operations, returning a collection of words that it was able to extract from that image. Once you have the text, you can do whatever you want with it.
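
Here’s a rough sketch of that flow. I’m assuming the popular TesseractOCRiOS wrapper and its G8Tesseract class for illustration; it may not match the exact wrapper or code used in my sample, so treat it as a sketch rather than a drop-in snippet:

#import <TesseractOCR/TesseractOCR.h>

//run OCR on a UIImage and return the individual words
- (NSArray *)wordsFromImage:(UIImage *)image
{
  //"eng" loads the English trained data bundled with the app
  G8Tesseract *tesseract = [[G8Tesseract alloc] initWithLanguage:@"eng"];
  tesseract.image = image;
  [tesseract recognize];

  //recognizedText is the full extracted string - split it into words
  NSArray *words = [tesseract.recognizedText
    componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
  return words;
}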

So, here’s where Watson Developer Cloud Services come into play. First, I used the Watson Language Translation Service to perform the translation.  When using this service, I make a request to my Node.js app running on IBM Bluemix (IBM’s cloud platform).  The Node.js app acts as a facade and delegates to the Watson service for the actual translation.

You can check out a sample of the translation in action on the web.

On the mobile client, you just make a request to your service and do something with the response. The example below uses the IMFResourceRequest API to make a request to the server (this can be done in either Objective-C or Swift). IMFResourceRequest is the MobileFirst wrapper for networking requests that enables the MobileFirst/Mobile Client Access service to capture operational analytics for every request made by the app.

//parameters for the translation request
NSDictionary *params = @{
  @"text":text,
  @"source":@"en",
  @"target":language
};

//create the request against the Node.js facade running on Bluemix
IMFResourceRequest * imfRequest =
  [IMFResourceRequest requestWithPath:@"https://translator.mybluemix.net/translate"
                      method:@"GET" parameters:params];

//send the request and pull the first translation out of the JSON response
[imfRequest sendWithCompletionHandler:^(IMFResponse *response, NSError *error) {
  NSDictionary* json = response.responseJson;
  NSArray *translations = [json objectForKey:@"translations"];
  NSDictionary *translationObj = [translations objectAtIndex:0];
  self.lastTranslation = [translationObj objectForKey:@"translation"];
  // now do something with the result - like update the UI
}];

On the Node.js server, the /translate route simply takes the request and brokers it to the Watson Language Translation service (using the Watson Node.js SDK):

//"language_translation" is a Language Translation client created with the
//Watson Node.js SDK (initialized elsewhere with the service credentials)
app.get('/translate', function(req, res){
  //pass the query parameters (text, source, target) straight through to Watson
  language_translation.translate(req.query, function(err, translation) {
    if (err) {
      console.log(err);
      res.send( err );
    } else {
      console.log(translation);
      res.send( translation );
    }
  });
});

Once you receive the result from the server, you can update the UI, make a request to the text to speech service, or do pretty much anything else.

To generate audio using the Watson Text to Speech service, you can either use the Watson Speech SDK, or you can use the Node.js facade again to broker requests to the Watson Text to Speech service. In this sample I used the Node.js facade to generate FLAC audio, which I played in the native iOS app using the open source Origami Engine library, which supports the FLAC audio format.

You can preview audio generated by the Watson Text to Speech service using the embedded players below. Note: in this sample I’m using the OGG file format; it will only work in browsers that support OGG.

English: Hello and welcome! Please share this article with your friends!

Spanish: Hola y bienvenido! Comparta este artículo con sus amigos!

//"textToSpeech" is a Text to Speech client created with the Watson Node.js SDK
//(initialized elsewhere with the service credentials)
app.get('/synthesize', function(req, res) {
  //stream the synthesized audio from Watson straight back to the client
  var transcript = textToSpeech.synthesize(req.query);
  transcript.on('response', function(response) {
    if (req.query.download) {
      response.headers['content-disposition'] = 'attachment; filename=transcript.flac';
    }
  });
  transcript.on('error', function(error) {
    console.log('Synthesize error: ', error);
  });
  transcript.pipe(res);
});

On the native iOS client, I download the audio file and play it using the Origami Engine player. This could also be done with the Watson iOS SDK (much easier), but I wrote this sample before the SDK was available.

//format the URL with the phrase to speak and the voice to use
//e.g. phrase = @"Hola!", voice = @"es-US_SofiaVoice"
NSString *urlString = [NSString stringWithFormat:@"https://translator.mybluemix.net/synthesize?text=%@&voice=%@&accept=audio/flac&download=1", phrase, voice];
NSString* webStringURL = [urlString stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSURL *flacURL = [NSURL URLWithString:webStringURL];

//download the contents of the audio file
NSData *audioData = [NSData dataWithContentsOfURL:flacURL];
NSString *docDirPath = NSTemporaryDirectory();
NSString *filePath = [NSString stringWithFormat:@"%@transcript.flac", docDirPath];
[audioData writeToFile:filePath atomically:YES];

//pass the file url to the origami player and play the audio
NSURL* fileUrl = [NSURL fileURLWithPath:filePath];
[self.orgmPlayer playUrl:fileUrl];

Cognitive computing is all about augmenting the experience of the user, and enabling users to perform their duties more efficiently and effectively. The Watson language services enable any app to better facilitate communication and broaden the reach of content across diverse user bases. You should definitely check them out to see how Watson services can benefit you.

MobileFirst

So, I mentioned that this app uses IBM MobileFirst offerings on Bluemix. In particular I am using the Mobile Client Access service to collect logs and operational analytics from the app. This lets you capture logs and usage metrics for apps that are live “out in the wild”, providing insight into what people are using, how they’re using it, and the health of the system at any point in time.

Analytics from the Mobile Client Access service
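
If you’re curious what the client side of that looks like, here’s a minimal sketch assuming the IMFCore framework from the Mobile Client Access iOS SDK; the backend route, app GUID, and logger name are placeholders, so check the SDK documentation for the exact setup:

#import <IMFCore/IMFCore.h>

//connect the app to the Mobile Client Access instance on Bluemix
//(the route and GUID placeholders come from your Bluemix app's settings)
[[IMFClient sharedInstance] initializeWithBackendRoute:@"https://your-app.mybluemix.net"
                                           backendGUID:@"your-app-guid"];

//capture a log statement locally...
IMFLogger *logger = [IMFLogger loggerForName:@"WatsonTranslator"];
[logger logInfoWithMessages:@"translation requested"];

//...then ship persisted logs to the service so they show up
//alongside the operational analytics in the Bluemix dashboard
[IMFLogger send];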

Be sure to check out the MobileFirst on Bluemix and MobileFirst Platform offerings for more detail.

Source

You can access the sample iOS client and Node.js code at https://github.com/triceam/Watson-Translator. Setup instructions are available in the readme document. I intend to update this app with some more translation use cases in the future, so be sure to check back!

Wearables & IBM MobileFirst – Video & Sample Code

Last week I attended IBM Insight in Las Vegas. It was a great event, with tons of great information for attendees. I had a few sessions on mobile applications. In particular, my dev@Insight session on Wearables powered by IBM MobileFirst was recorded. You can check it out here:

https://youtu.be/d4AEwCOmvug

Sorry it’s not in HD, but the content is still great! (Yes, I am biased.)

In this session I showed how you can power wearable apps, specifically those on smart watch devices, using either the MobileFirst Platform Foundation Server, or the MobileFirst offerings on IBM Bluemix (cloud).

Key takeaways from the session:

  1. Wearables are the most personal computing devices ever. Your users can use them to be notified of information, search/consume data, or even collect environmental data for reporting or actionable analysis.
  2. Regardless of whether you are developing for a peripheral device like the Apple Watch or Microsoft Band, or a standalone device like Android Wear, you are developing an app that runs in an environment that mirrors that of a native app. So, the fundamental development principles are exactly the same: you write native code that uses standard protocols and common conventions to interact with the back-end (see the sketch after this list).
  3. Caveat to #1: Your user interface is much smaller. You should design the user interface and services to accommodate the reduced amount of information that can be displayed.
  4. You can share code across both the phone/tablet and watch/wearable experience (depending on the target device).
  5. Using IBM MobileFirst you can easily expose data, add authentication, and capture analytics for both the mobile and wearable solutions.
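
To make point #2 concrete, here’s a minimal sketch of a watch extension calling a back-end endpoint with plain NSURLSession (not the MobileFirst client API); the /stocks URL is a placeholder for whatever your MobileFirst adapter or Bluemix app exposes, and the full samples linked below show the MobileFirst-specific setup:

//in the watch extension: fetch data from the back end over HTTPS
//(the URL is a placeholder for whatever endpoint your server exposes)
NSURL *url = [NSURL URLWithString:@"https://your-app.mybluemix.net/stocks"];
NSURLSessionDataTask *task = [[NSURLSession sharedSession] dataTaskWithURL:url
  completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
    if (error != nil || data == nil) {
      NSLog(@"request failed: %@", error);
      return;
    }
    //parse the JSON payload and update the watch UI on the main queue
    NSArray *items = [NSJSONSerialization JSONObjectWithData:data options:0 error:nil];
    dispatch_async(dispatch_get_main_queue(), ^{
      //populate the WKInterfaceTable / labels with the parsed items
      NSLog(@"received %lu items", (unsigned long)[items count]);
    });
  }];
[task resume];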

Demos/Code Samples:

In the session I showed three sample wearable apps.  Full source code and setup instructions for each app are available at: https://github.com/triceam/MobileFirst-Wearables/

Stocks

A sample WatchKit (Apple Watch) app powered by IBM MobileFirst Platform Foundation Server.

Contacts

A sample WatchKit (Apple Watch) app powered by IBM MobileFirst on Bluemix.

Heartrate

A simple heart rate monitor using the Microsoft Band, powered by MobileFirst on Bluemix and IBM Cloudant.

Thoughts on Cognitive Computing

You may have heard a lot of buzz coming out of IBM lately about Cognitive Computing, and you might have also wondered “what the heck are they talking about?”  You may have heard of services for data and predictive analytics, services for natural language text processing, services for sentiment analysis, services that understand speech and translate languages, but it’s sometimes hard to see the forest for the trees.

I highly recommend taking a moment to watch this video that introduces Cognitive Computing from IBM:

Those services that I mentioned above are all examples of Cognitive Computing systems, and are all available for you to use today.

From IBM Research:

Cognitive computing systems learn and interact naturally with people to extend what either humans or machines could do on their own.

They help human experts make better decisions by penetrating the complexity of Big Data.

Cognitive systems are often based upon massive sets of data and powerful analytics algorithms that detect patterns and concepts that can be turned into actionable information for end users.  It’s not “artificial intelligence” in the sense that the services/machines act on their own; rather, it’s a system that provides the user with tools or information that enable them to make better decisions.

The benefits of cognitive systems in a nutshell:

  1. They augment the user’s experience
  2. They provide the ability to process information faster
  3. They make complex information easier to understand
  4. They enable you to do things you might not otherwise be able to do

Curious where this will lead?  Now take a moment and watch this video talking about the industry-transforming opportunities that Cognitive Computing is already beginning to bring to life:

So, why is the “mobile guy” talking about Cognitive Computing?

First, it’s because Cognitive Computing is big… I mean, really, really big.  Cognitive systems are literally transforming industries and putting powerful analytics and insight into the hands of both experts and “normal people”.  When I say “into the hands”, I again mean this literally; much of this cognitive ability is being delivered to end users through their mobile devices.

It’s also because cognitive systems fit nicely with IBM’s MobileFirst product offerings.  It doesn’t matter whether you’re using the MobileFirst Platform Foundation server on premises or leveraging the MobileFirst offerings on IBM Bluemix; in both cases you can easily consume IBM Watson cognitive services to augment and enhance the interactions and data for your mobile applications. Check out the Bluemix catalog to see how you might start adding Watson cognitive or big data abilities to your apps today.

Last, and this is purely personal opinion, I see the MobileFirst offerings themselves as providing somewhat of a cognitive service for developing mobile apps.  If you look at it from the operational analytics perspective, you have immediate insight and a snapshot of the health of your system that you would never have had otherwise.  You can see what types of devices are hitting your system, what services are being used, how long things are taking, and detect issues, all without any additional development effort on your end. It’s not predictive analytics, but it sure is helpful and gets us moving in the right direction.

IBM Watson Speech Services Just Got A Whole Lot Easier

UPDATE 12/22/15: IBM recently released a new iOS SDK for Watson that makes integration with Watson services even easier. You can read more about it here.


IBM’s Watson Developer Cloud speech services just got a whole lot easier for mobile developers.  I just learned about these two new SDKs, and can’t wait to integrate them into my own mobile applications.

The Watson Speech to Text and Text to Speech services are now available in both native iOS and Android SDKs, making it even easier to integrate language services into your apps.

These native APIs now include audio streaming back to the Watson Speech to Text service, for lower latency responses to spoken languages.
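
I haven’t rebuilt my demos on top of it yet, but the general shape of a streaming transcription call with the native iOS speech SDK looks something like the sketch below; treat the class and method names (STTConfiguration, SpeechToText, recognize:) as illustrative and check the SDK’s readme for the exact API:

//configure the Speech to Text service with your Bluemix service credentials
STTConfiguration *config = [[STTConfiguration alloc] init];
config.basicAuthUsername = @"<service-username>";
config.basicAuthPassword = @"<service-password>";

SpeechToText *stt = [SpeechToText initWithConfig:config];

//start streaming microphone audio to the service; interim transcripts
//arrive in the callback while the user is still speaking, which is what
//delivers the lower-latency responses
[stt recognize:^(NSDictionary *result, NSError *error) {
  if (error != nil) {
    NSLog(@"speech to text error: %@", error);
    return;
  }
  NSLog(@"transcript: %@", [stt getTranscript:result]);
}];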

I can guarantee you that my “voice-driven iOS apps” demo will be updated soon, and I’ll be using this for all future language processing services.