
Watson Speech iOS (Objective-C) SDK

An SDK for iOS mobile applications that enables use of the Bluemix Watson Speech To Text and Text To Speech APIs from Watson Developer Cloud.

The SDK includes support for recording and streaming audio and receiving a transcript of the audio in response.

Installation

Using the framework

Download watsonsdk.framework.zip and unzip it somewhere convenient.
Once unzipped, drag the watsonsdk.framework folder into your Xcode project view under the Frameworks folder.

Some additional standard iOS frameworks must be added.

Select your project in the Xcode file explorer and open the "Build Phases" tab. Expand the "Link Binary With Libraries" section and click the + icon.
Add the following frameworks:
- AudioToolbox.framework
- AVFoundation.framework
- CFNetwork.framework
- CoreAudio.framework
- Foundation.framework
- libicucore.tbd (or libicucore.dylib on older versions)
- QuartzCore.framework
- Security.framework

Include headers

in Objective-C

	#import <watsonsdk/SpeechToText.h>
	#import <watsonsdk/TextToSpeech.h>

in Swift

Add the Objective-C headers above to a bridging header file. The Swift sample uses SwiftSpeechHeader.h for this.

#Sample Applications

This repository contains a sample application demonstrating the SDK functionality.

To run the application, clone this repository and then navigate in Finder to the folder containing the SDK files.

Double click on the watsonsdk.xcodeproj to launch Xcode.

To run the sample application, change the compile target to 'watsonsdktest-objective-c' or 'watsonsdktest-swift' and run on the iPhone simulator.

Note that this is sample code and no security review has been performed on the code.

The Swift sample was tested in Xcode 8.2.1.

#Speech To Text

Create a STT Configuration

By default the Configuration will use the IBM Bluemix service API endpoint. Custom endpoints can be set using setApiURL, but in most cases this is not required.

in Objective-C

	STTConfiguration *conf = [[STTConfiguration alloc] init];

in Swift

	let conf:STTConfiguration = STTConfiguration()

Authentication

There are currently two authentication options.

Basic Authentication, using the credentials provided by the Bluemix Service instance.

in Objective-C

	[conf setBasicAuthUsername:@"<userid>"];
	[conf setBasicAuthPassword:@"<password>"];

in Swift

	conf.basicAuthUsername = "<userid>"
	conf.basicAuthPassword = "<password>"

Token authentication, for use when a token authentication provider is running (for example at https://my-token-factory/token).

in Objective-C

    [conf setTokenGenerator:^(void (^tokenHandler)(NSString *token)){
        NSURL *url = [[NSURL alloc] initWithString:@"https://<token-factory-url>"];
        NSMutableURLRequest *request = [[NSMutableURLRequest alloc] init];
        [request setHTTPMethod:@"GET"];
        [request setURL:url];
        
        // Out-parameters start as nil; the request fills them in
        NSError *error = nil;
        NSHTTPURLResponse *responseCode = nil;
        // Note: sendSynchronousRequest is deprecated since iOS 9; it is kept here for brevity
        NSData *oResponseData = [NSURLConnection sendSynchronousRequest:request returningResponse:&responseCode error:&error];
        if (oResponseData == nil || [responseCode statusCode] != 200) {
            NSLog(@"Error getting %@, HTTP status code %li, error %@", url, (long)[responseCode statusCode], error);
            return;
        }
        // Hand the token string to the SDK
        tokenHandler([[NSString alloc] initWithData:oResponseData encoding:NSUTF8StringEncoding]);
    } ];

in Swift

    ...
    conf.tokenGenerator = self.tokenGenerator()
    ...

    func tokenGenerator() -> ((((String?) -> Void)?)) -> Void {
        let url = URL(string: "https://<token-factory-url>")
        return ({ ( _ tokenHandler: (((_ token:String?) -> Void)?) ) -> () in
            SpeechUtility.performGet({ (data:Data?, response:URLResponse?, error:Error?) in
                if let error = error {
                    print("Error occurred while requesting token: \(error.localizedDescription)")
                    return
                }
                guard let httpResponse = response as? HTTPURLResponse else {
                    print("Invalid response")
                    return
                }
                if httpResponse.statusCode != 200 {
                    print("Error response: \(httpResponse.statusCode)")
                    return
                }
                // Decode the response body as the token string
                guard let data = data, let token = String(data: data, encoding: .utf8) else {
                    print("Invalid token data")
                    return
                }
                // Hand the token to the SDK
                tokenHandler?(token)
            }, for: url, delegate: self, disableCache: true, header: nil)
        })
    }
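
The checks in the completion handler above (transport error, response type, status code, body decoding) can also be pulled into a small pure function, which makes them easy to unit test without a network. The following is only a sketch using Foundation types; `TokenError` and `extractToken` are hypothetical names for illustration and are not part of the SDK.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

// Hypothetical error type for illustration; not part of the Watson SDK.
enum TokenError: Error, Equatable {
    case transport(String)   // the request itself failed
    case notHTTP             // the response was missing or not an HTTP response
    case badStatus(Int)      // the token factory answered with a non-200 status
    case unreadableBody      // the body was missing, empty, or not valid UTF-8
}

/// Validates a token-factory response and extracts the token string.
/// The checks run in the same order as in the completion handler above.
func extractToken(data: Data?, response: URLResponse?, error: Error?) -> Result<String, TokenError> {
    if let error = error {
        return .failure(.transport(error.localizedDescription))
    }
    guard let http = response as? HTTPURLResponse else {
        return .failure(.notHTTP)
    }
    guard http.statusCode == 200 else {
        return .failure(.badStatus(http.statusCode))
    }
    guard let data = data,
          let token = String(data: data, encoding: .utf8),
          !token.isEmpty else {
        return .failure(.unreadableBody)
    }
    return .success(token)
}
```

Inside the completion handler, a `.success` value would be passed on to `tokenHandler`, while a `.failure` value would be logged.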

Create a SpeechToText instance

in Objective-C

	@property (nonatomic, strong) SpeechToText *stt;
	
	...
	
	self.stt = [SpeechToText initWithConfig:conf];

in Swift

	var stt:SpeechToText?
	
	...
	
	self.stt = SpeechToText(config: conf)

Get a list of models supported by the service

in Objective-C

	[stt listModels:^(NSDictionary* jsonDict, NSError* err){
        
        if(err == nil) {
            // ... read values from jsonDict ...
        }

	}];

in Swift

	stt?.listModels({ (jsonDict: [AnyHashable: Any]?, error: Error?) in
		if error == nil {
			print(jsonDict!)
		}
	})

Get details of a particular model

Details of a specific speech recognition model can be obtained using the listModel function, passing the name of the model.

in Objective-C

	[stt listModel:^(NSDictionary* jsonDict, NSError* err){
        
        if(err == nil) {
            // ... read values from jsonDict ...
        }
	    
    	} withName:@"WatsonSpeechModel"];

in Swift

	stt?.listModel({ (jsonDict: [AnyHashable : Any]?, error: Error?) in
		if error == nil {
			print(jsonDict!)
		}
	}, withName: "WatsonSpeechModel")

Use a named model

The speech recognition model can be changed in the configuration.

in Objective-C

	[conf setModelName:@"ja-JP_BroadbandModel"];

in Swift

	conf.modelName = "ja-JP_BroadbandModel"

Enabling audio compression

By default, audio sent to the server is uncompressed PCM encoded data. Compressed audio using the Opus codec can be enabled.

in Objective-C

	[conf setAudioCodec:WATSONSDK_AUDIO_CODEC_TYPE_OPUS];

in Swift

	conf.audioCodec = WATSONSDK_AUDIO_CODEC_TYPE_OPUS

Start audio transcription

in Objective-C

	[stt recognize:^(NSDictionary* res, NSError* err){
        
        if(err == nil) {
            SpeechToTextResult *sttResult = [stt getResult:res];
            
            if([sttResult transcript]) {
                if([sttResult isFinal]) {
                    // final transcript
                    NSLog(@"%@", [sttResult transcript]);
                }
                else {
                    // partial transcript
                    NSLog(@"%@", [sttResult transcript]);
                }
            }
        }
        else {
            [stt stopRecordingAudio];
            [stt endConnection];
        }
    }];

in Swift

	self.sttInstance?.recognize({ (result: [AnyHashable : Any]?, error: Error?) in
            if error == nil {
                let sttResult = self.sttInstance?.getResult(result)
                guard let transcript = sttResult?.transcript else {
                    return
                }
                if sttResult?.isFinal == true {
                    // final transcript
                    print(transcript)
                }
                else {
                    // partial transcript
                    print(transcript)
                }
                // display the latest transcript
                self.result.text = transcript
            }
            else {
                // stop recording and close the connection on error
                self.sttInstance?.stopRecordingAudio()
                self.sttInstance?.endConnection()
            }
        })

End audio transcription

The app must explicitly indicate to the SDK when transmission should be ended if the continuous option is YES.

in Objective-C

	[conf setContinuous:YES];

	...
    
	[stt endTransmission];

in Swift

	conf.continuous = true

	...

	stt?.endTransmission()

Obtain a confidence score

A confidence score is available for any final transcripts (whole sentences). It can be obtained from the SpeechToTextResult instance.

in Objective-C

    SpeechToTextResult *sttResult = [stt getResult:res];

    NSLog(@"Confidence score: %@", [sttResult confidenceScore]);

in Swift

	let sttResult = self.sttInstance?.getResult(result)
	// unwrap first so the output is not wrapped in "Optional(...)"
	if let score = sttResult?.confidenceScore {
		print("Confidence score: \(score)")
	}

Receive speech power levels during the recognize

in Objective-C

	[stt recognize:^(NSDictionary *res, NSError *err) {
		...
	} powerHandler:^(float power) {
		NSLog(@"Power level: %f", power);
	}];

in Swift

	self.sttInstance?.recognize({ (result: [AnyHashable : Any]?, error: Error?) in
		...
	}, powerHandler: { (power: Float) in
		print("Power level: \(power)")
	})

Text To Speech

Create a Configuration

By default the Configuration will use the IBM Bluemix service API endpoint. Custom endpoints can be set using setApiURL, but in most cases this is not required.

in Objective-C

	TTSConfiguration *conf = [[TTSConfiguration alloc] init];
	[conf setBasicAuthUsername:@"<userid>"];
	[conf setBasicAuthPassword:@"<password>"];

in Swift

	let conf: TTSConfiguration = TTSConfiguration()
	conf.basicAuthUsername = "<userid>"
	conf.basicAuthPassword = "<password>"

Set the voice

You can change the voice model used for TTS by setting it in the configuration.

in Objective-C

	[conf setVoiceName:@"en-US_MichaelVoice"];

in Swift

	conf.voiceName = "en-US_MichaelVoice"

Use Token Authentication

If you use tokens (from your own server) to get access to the service, provide a token generator to the Configuration. userid and password will not be used if a token generator is provided.

in Objective-C

    [conf setTokenGenerator:^(void (^tokenHandler)(NSString *token)){
        // get a token from your server in secure way
        NSString *token = ...

        // provide the token to the tokenHandler
        tokenHandler(token);
    }];
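
in Swift

Only an Objective-C version is shown for Text To Speech. The sketch below wraps any asynchronous token source in a closure of the same shape as the Swift token generator in the Speech To Text section. It assumes that TTSConfiguration exposes `tokenGenerator` with that same shape, which has not been verified here; `TokenHandler`, `TokenGenerator` and `makeTokenGenerator` are hypothetical names and not part of the SDK.

```swift
// Closure shapes mirroring the Speech To Text Swift example;
// assumed (not verified) to match TTSConfiguration as well.
typealias TokenHandler = (String?) -> Void
typealias TokenGenerator = (TokenHandler?) -> Void

/// Wraps an asynchronous token source in the generator shape.
/// `fetchToken` is a placeholder for your own secure call to your server.
func makeTokenGenerator(
    fetchToken: @escaping (@escaping (String?) -> Void) -> Void
) -> TokenGenerator {
    return { tokenHandler in
        fetchToken { token in
            // No token: leave the handler uncalled, as the samples do on errors
            guard let token = token else { return }
            tokenHandler?(token)
        }
    }
}

// Usage with a stubbed token source (replace with a real request):
let generator = makeTokenGenerator { done in done("<token>") }
generator({ token in print("Received token: \(token ?? "")") })
```

With the SDK linked, the generator would presumably be assigned with `conf.tokenGenerator = generator`.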

Create a TextToSpeech instance

in Objective-C

	self.tts = [TextToSpeech initWithConfig:conf];

in Swift

    var tts: TextToSpeech?
    
    
    ...
    self.tts = TextToSpeech(config: conf)

Get a list of voices supported by the service

in Objective-C

	[tts listVoices:^(NSDictionary* jsonDict, NSError* err){
        
        if(err == nil) {
            // ... read values from jsonDict ...
        }

    }];

in Swift

	tts?.listVoices({ (jsonDict:[AnyHashable: Any]?, error:Error?) in
            if error == nil {
                print(jsonDict!)
            }
        })

Generate and play audio

in Objective-C

	[self.tts synthesize:^(NSData *data, NSError *reqErr) {
    	
    	// request error
    	if(reqErr){
            NSLog(@"Error requesting data: %@", [reqErr description]);
            return;
        }

        // play audio and log when playing has finished
        [self.tts playAudio:^(NSError *err) {
            if(err)
                NSLog(@"error playing audio %@", [err localizedDescription]);
            else
            	NSLog(@"audio finished playing");
            
        } withData:data];
        
    } theText:@"Hello World"];

in Swift

	tts?.synthesize({ (data: Data?, reqError: Error?) in
		if reqError == nil {
			tts?.playAudio({ (error: Error?) in
				if error == nil {
					// ... do something after the audio has played ...
				}
				else {
					// ... data error handling ...
				}
			}, withData: data)
		}
		else {
			// ... request error handling ...
		}
	}, theText: "Hello World")

Generate and play customized audio

in Objective-C

    [self.tts synthesize:^(NSData *data, NSError *reqErr) {
        
        // request error
        if(reqErr){
            NSLog(@"Error requesting data: %@", [reqErr description]);
            return;
        }

        // play audio and log when playing has finished
        [self.tts playAudio:^(NSError *err) {
            if(err)
                NSLog(@"error playing audio %@", [err localizedDescription]);
            else
                NSLog(@"audio finished playing");
            
        } withData:data];
        
    } theText:@"Hello World" customizationId:@"your-customization-id"];

in Swift

    tts?.synthesize({ (data: Data?, reqError: Error?) in
        if reqError == nil {
            tts?.playAudio({ (error: Error?) in
                if error == nil {
                    // ... do something after the audio has played ...
                }
                else {
                    // ... data error handling ...
                }
            }, withData: data)
        }
        else {
            // ... request error handling ...
        }
    }, theText: "Hello World", customizationId: "your-customization-id")

Open Source @ IBM

Find more open source projects on the IBM GitHub Page.
