Horea's Blog

Ready to launch that sweet Web-App that blows Siri out of the water? If so, you’re in the right place. I’ll show you how to generate a token server-side for Watson’s Speech To Text Service, so your Web-App can use it without ever seeing your credentials.

OpenWhisk

Now you may be thinking…OpenWhisk? Why not use Node.js? Everyone knows Node, and there’s much more documentation if I get stuck. While I cannot refute the last statement, I’ll show you why using a new technology like OpenWhisk, similar to Amazon’s Lambda, might save you cash down the road.

Getting Started

Every OpenWhisk action needs a function named “main”. Inside this function, we will make an HTTP request to fetch our token so that we can later use it in our client-side code without having to reveal our credentials.
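To make the shape of “main” concrete before the real code, here is a minimal sketch of an action. The greeting logic and the `name` parameter are made up for illustration; the only parts that matter are that the function is called main, receives its input as a single params object, and returns an object (or, as in the token code, a promise).

```javascript
// Minimal sketch of an OpenWhisk action. The greeting is purely
// illustrative; only the shape of main() matters here.
function main(params) {
  // params holds the action's input parameters as a plain object
  var name = params.name || 'stranger';

  // Returning a plain object finishes the action right away;
  // returning a Promise lets the action finish later.
  return { greeting: 'Hello, ' + name + '!' };
}
```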

The Code

Don’t be afraid to copy and paste. Just make sure you read the comments and understand what you’re pasting…


//package that we need to make an HTTP call
var request = require('request');

function main(params) {
  //Speech To Text Service Credentials
  var username = "username";
  var password = "password";

  //HTTP Basic authentication expects "username:password" encoded in base64
  //(Buffer.from replaces the deprecated "new Buffer" constructor)
  var auth = "Basic " + Buffer.from(username + ":" + password).toString("base64");

  //url for our HTTP Request
  var speechToTextUrl = "https://stream.watsonplatform.net/authorization/api/v1/token?url=https://stream.watsonplatform.net/speech-to-text/api";

  //promises are a way for our action to work asynchronously
  var promise = new Promise(function(resolve, reject) {
    //using the request package, a simplified HTTP request client
    request({
      url:     speechToTextUrl,
      headers : {
            "Authorization" : auth
        }
    },
    function(error, response, body){
      //fail the action if the HTTP request itself failed
      if (error) {
        return reject(error);
      }
      resolve ({
        //tell the browser that our web action responds with plain text
        headers: {
          'Content-Type': 'text/plain'
        },
        //display the token returned from our HTTP request
        body: body
      });
    });
  });
  return promise;
}
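If the base64 step in the action looks mysterious, it is just how HTTP Basic authentication works: the header value is the word Basic followed by the base64 encoding of username:password. Here is a small sketch that pulls that step into its own function. The name basicAuthHeader and the user/pass credentials are placeholders of mine, not part of the action above.

```javascript
// Hypothetical helper, not part of the original action: builds the value of
// an HTTP Basic Authorization header from a username and password.
function basicAuthHeader(username, password) {
  // Encode "username:password" as base64
  var encoded = Buffer.from(username + ':' + password).toString('base64');
  return 'Basic ' + encoded;
}

// Example with placeholder credentials
console.log(basicAuthHeader('user', 'pass')); // prints "Basic dXNlcjpwYXNz"
```

Keep in mind that base64 is an encoding, not encryption, which is exactly why this header should be built server-side and never in the browser.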

Webify

The only thing left to do is turn this into a web action. Simply add the --web true flag as shown below.

wsk action create {yourPackage}/speechToText speechToText.js --web true

Conclusion

Now that you have a web action that returns a token, all you have to do is make a simple XHR request to your Web Action URL (if you do not know what your Web Action URL is, read more here) and you’ll be ready to use voice recognition in your Web-App. Since we are using a serverless (Function as a Service) cloud platform, you don’t have to keep a server up 24/7 as you would if you spun up an old-school Node.js server. What that means for you is that you’ll be saving a lot of money, since you are billed only a small fraction of a cent per second of execution. Now that’s pretty neat :)
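For the client side, here is a rough sketch of what that XHR request could look like. The function name fetchToken and the optional second argument are my own additions: the second argument only exists so that the function can be exercised outside a browser with a stand-in for XMLHttpRequest, and in a real page you would leave it out. The URL in the usage note is a placeholder, so swap in your own Web Action URL.

```javascript
// Hypothetical client-side helper: asks the web action for a token.
// XhrCtor is optional and defaults to the browser's XMLHttpRequest.
function fetchToken(webActionUrl, XhrCtor) {
  var Xhr = XhrCtor || XMLHttpRequest;

  return new Promise(function (resolve, reject) {
    var xhr = new Xhr();
    xhr.open('GET', webActionUrl);

    xhr.onload = function () {
      if (xhr.status >= 200 && xhr.status < 300) {
        // the web action responds with the token as plain text
        resolve(xhr.responseText);
      } else {
        reject(new Error('Token request failed with status ' + xhr.status));
      }
    };

    xhr.onerror = function () {
      reject(new Error('Network error while requesting token'));
    };

    xhr.send();
  });
}
```

In the browser you would call something like fetchToken('https://your-web-action-url') and then hand the resolved token to whatever speech client you are using.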
