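"Aqiang, why has the handwriting pad disappeared again?"

Aqiang is a programmer, and his grandmother, who is always eager to try new things, has recently taken to online shopping. She got the hang of the shopping app without much trouble, but what should have been a smooth experience got stuck at one step: searching for items. When entering text by handwriting, she would either switch to an unfamiliar input method by accident or tap one of the cryptic command symbols on the screen. As a result, Aqiang often receives requests for help from her.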
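Demo

Real-time speech recognition and audio-to-text conversion have a wide range of use cases:

1. Games: when you team up with friends in an online multiplayer game, real-time speech recognition lets you communicate with teammates without friction. It keeps your hands free and spares you the awkwardness of turning on the mic and exposing your voice.
2. Office: long meetings are common at work, and typing notes by hand is both inefficient and prone to missing details. With audio file transcription, you can transcribe the discussion and then tidy up and polish the text after the meeting, getting twice the result with half the effort.
3. Learning: more and more teaching material now comes as audio. Pausing to take notes while following along breaks the rhythm and the continuity of learning. With audio file transcription, you can work through the material systematically first and then review and organize the text, for a better learning experience.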
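How it works

HUAWEI ML Kit provides real-time speech recognition and audio file transcription capabilities.

Real-time speech recognition converts short speech input (up to 60 seconds) into text in real time, with a recognition accuracy that can reach 95% or higher. It currently supports Mandarin Chinese, English, mixed Chinese-English speech, French, German, Spanish, Italian, and Arabic. Text is output in real time as the user speaks.

Audio file transcription converts audio files of up to 5 hours into text. It can output punctuation, producing text that is properly segmented and easy to read. It can also generate text with timestamps, which makes it easier to build further features on top of the result. The current version supports Chinese and English.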
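As one illustration of what the timestamps make possible, the sketch below turns time-stamped sentences into caption-style lines. It is plain Java and does not call the SDK: `TimedSegment` and `SubtitleFormatter` are hypothetical names, and the sketch assumes the start and end times are expressed in milliseconds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for one time-stamped sentence of a transcription result. */
final class TimedSegment {
    final String text;
    final int startMs; // assumed to be milliseconds from the start of the audio
    final int endMs;

    TimedSegment(String text, int startMs, int endMs) {
        this.text = text;
        this.startMs = startMs;
        this.endMs = endMs;
    }
}

final class SubtitleFormatter {
    private SubtitleFormatter() {
    }

    /** Formats a millisecond offset as HH:MM:SS,mmm. */
    static String formatTimestamp(int ms) {
        int hours = ms / 3_600_000;
        int minutes = (ms / 60_000) % 60;
        int seconds = (ms / 1_000) % 60;
        int millis = ms % 1_000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d,%03d", hours, minutes, seconds, millis);
    }

    /** Builds one "start --> end  text" line per segment. */
    static List<String> toCaptionLines(List<TimedSegment> segments) {
        List<String> lines = new ArrayList<>();
        for (TimedSegment segment : segments) {
            lines.add(formatTimestamp(segment.startMs) + " --> "
                    + formatTimestamp(segment.endMs) + "  " + segment.text);
        }
        return lines;
    }
}
```

In an app, the segments would be filled from the sentence list of the transcription result; the same idea works for jumping to a position in the audio when the user taps a sentence.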
Development steps

1. Preparations

Configure the Huawei Maven repository address in the project-level build.gradle file, and put the agconnect-services.json file in the app directory:

    buildscript {
        repositories {
            google()
            jcenter()
            maven { url 'https://developer.huawei.com/repo/' }
        }
        dependencies {
            classpath 'com.android.tools.build:gradle:3.5.4'
            // AppGallery Connect plugin
            classpath 'com.huawei.agconnect:agcp:1.4.1.300'
        }
    }

    allprojects {
        repositories {
            google()
            jcenter()
            maven { url 'https://developer.huawei.com/repo/' }
        }
    }
Set the authentication information for your app as described in the notes on using cloud authentication information.
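Add the SDK dependencies and apply the AppGallery Connect plugin in the app-level build.gradle file:

    dependencies {
        // Audio file transcription SDK
        implementation 'com.huawei.hms:ml-computer-voice-aft:2.2.0.300'
        // Real-time speech recognition SDK
        implementation 'com.huawei.hms:ml-computer-voice-asr:2.2.0.300'
        // Real-time speech recognition plugin
        implementation 'com.huawei.hms:ml-computer-voice-asr-plugin:2.2.0.300'
        ...
    }
    apply plugin: 'com.huawei.agconnect'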
Configure the signing file in the app-level build.gradle file, and put the signing file (xxx.jks) in the app directory:

    signingConfigs {
        release {
            storeFile file("xxx.jks")
            // Replace the placeholders with your own key alias and passwords.
            keyAlias 'xxx'
            keyPassword 'xxxxxx'
            storePassword 'xxxxxx'
            v1SigningEnabled true
            v2SigningEnabled true
        }
    }

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'
        }

        debug {
            signingConfig signingConfigs.release
            debuggable true
        }
    }
Add the required permissions in the AndroidManifest.xml file:

    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
    <uses-permission android:name="android.permission.ACCESS_WIFI_STATE" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />

    <application
        android:requestLegacyExternalStorage="true"
        ...>
    </application>
2. Integrating real-time speech recognition

Request the permission dynamically:

    // Request the microphone permission if it has not been granted yet.
    if (ActivityCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
            != PackageManager.PERMISSION_GRANTED) {
        requestAudioPermission();
    }

    private void requestAudioPermission() {
        final String[] permissions = new String[]{Manifest.permission.RECORD_AUDIO};
        if (!ActivityCompat.shouldShowRequestPermissionRationale(this, Manifest.permission.RECORD_AUDIO)) {
            ActivityCompat.requestPermissions(this, permissions, Constants.AUDIO_PERMISSION_CODE);
            return;
        }
        // If a rationale should be shown, explain why the microphone is needed
        // before requesting the permission again.
    }
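The snippet above only sends the request; the user's answer arrives later in the activity's `onRequestPermissionsResult` callback. A minimal sketch of the check that callback could perform is shown below. It is plain Java so that it can run outside Android: `PermissionResults` is a hypothetical helper, and its constant mirrors `PackageManager.PERMISSION_GRANTED`, whose value is 0.

```java
final class PermissionResults {
    /** Mirrors android.content.pm.PackageManager.PERMISSION_GRANTED (value 0). */
    static final int PERMISSION_GRANTED = 0;

    private PermissionResults() {
    }

    /**
     * Returns true only if every requested permission was granted.
     * An empty or missing result array means the request was cancelled.
     */
    static boolean allGranted(int[] grantResults) {
        if (grantResults == null || grantResults.length == 0) {
            return false;
        }
        for (int result : grantResults) {
            if (result != PERMISSION_GRANTED) {
                return false;
            }
        }
        return true;
    }
}
```

Inside `onRequestPermissionsResult`, the app would compare the request code with `Constants.AUDIO_PERMISSION_CODE` and start speech recognition only when `allGranted(grantResults)` returns true.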
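Create an Intent to set the real-time speech recognition parameters.

    // Set the API key read from agconnect-services.json.
    MLApplication.getInstance().setApiKey(
            AGConnectServicesConfig.fromContext(this).getString("client/api_key"));
    // Use the Intent to configure the recognition parameters.
    Intent intentPlugin = new Intent(this, MLAsrCaptureActivity.class)
            // Recognition language; LAN_ZH_CN is Mandarin Chinese.
            .putExtra(MLAsrCaptureConstants.LANGUAGE, MLAsrConstants.LAN_ZH_CN)
            // FEATURE_WORDFLUX shows the text on the speech pickup UI as it is recognized.
            .putExtra(MLAsrCaptureConstants.FEATURE, MLAsrCaptureConstants.FEATURE_WORDFLUX);
    // The request code must be an int; 1 is used here.
    startActivityForResult(intentPlugin, 1);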
Override the onActivityResult method to process the result returned by the speech recognition service.

    @Override
    protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        String text = "";
        if (null == data) {
            addTagItem("Intent data is null.", true);
        }
        // 1 is the request code that was passed to startActivityForResult.
        if (requestCode == 1) {
            if (data == null) {
                return;
            }
            Bundle bundle = data.getExtras();
            if (bundle == null) {
                return;
            }
            switch (resultCode) {
                case MLAsrCaptureConstants.ASR_SUCCESS:
                    // Read the recognized text.
                    if (bundle.containsKey(MLAsrCaptureConstants.ASR_RESULT)) {
                        text = bundle.getString(MLAsrCaptureConstants.ASR_RESULT);
                    }
                    if (text == null || "".equals(text)) {
                        text = "Result is null.";
                        Log.e(TAG, text);
                    } else {
                        // Put the recognized text into the search box and run the search.
                        searchEdit.setText(text);
                        goSearch(text, true);
                    }
                    break;
                case MLAsrCaptureConstants.ASR_FAILURE:
                    // Collect the error code, if any.
                    if (bundle.containsKey(MLAsrCaptureConstants.ASR_ERROR_CODE)) {
                        text = text + bundle.getInt(MLAsrCaptureConstants.ASR_ERROR_CODE);
                    }
                    // Collect the error message, if any.
                    if (bundle.containsKey(MLAsrCaptureConstants.ASR_ERROR_MESSAGE)) {
                        String errorMsg = bundle.getString(MLAsrCaptureConstants.ASR_ERROR_MESSAGE);
                        if (errorMsg != null && !"".equals(errorMsg)) {
                            text = "[" + text + "]" + errorMsg;
                        }
                    }
                    // Collect the sub error code, if any.
                    if (bundle.containsKey(MLAsrCaptureConstants.ASR_SUB_ERROR_CODE)) {
                        int subErrorCode = bundle.getInt(MLAsrCaptureConstants.ASR_SUB_ERROR_CODE);
                        text = "[" + text + "]" + subErrorCode;
                    }
                    Log.e(TAG, text);
                    break;
                default:
                    break;
            }
        }
    }
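The failure branch above builds its log text by wrapping brackets around whatever has been collected so far, which becomes hard to read once all three fields are present. One possible alternative is a small formatter with labeled fields, sketched below. `AsrErrorText` is a hypothetical helper, not part of the SDK, and the values in the usage note are arbitrary rather than real service error codes.

```java
import java.util.ArrayList;
import java.util.List;

final class AsrErrorText {
    private AsrErrorText() {
    }

    /**
     * Joins whichever error fields are present into one log line.
     * Any argument may be null when the result bundle did not contain that key.
     */
    static String describe(Integer errorCode, String errorMessage, Integer subErrorCode) {
        List<String> parts = new ArrayList<>();
        if (errorCode != null) {
            parts.add("code=" + errorCode);
        }
        if (subErrorCode != null) {
            parts.add("subCode=" + subErrorCode);
        }
        if (errorMessage != null && !errorMessage.isEmpty()) {
            parts.add("message=" + errorMessage);
        }
        return parts.isEmpty() ? "unknown error" : String.join(", ", parts);
    }
}
```

For example, `AsrErrorText.describe(1, "network unavailable", 2)` yields `code=1, subCode=2, message=network unavailable`, and a call with no fields at all yields `unknown error`.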
3. Integrating audio file transcription

Request the storage permissions dynamically.

    private static final int REQUEST_EXTERNAL_STORAGE = 1;
    private static final String[] PERMISSIONS_STORAGE = {
            Manifest.permission.READ_EXTERNAL_STORAGE,
            Manifest.permission.WRITE_EXTERNAL_STORAGE
    };

    public static void verifyStoragePermissions(Activity activity) {
        // Check whether the write permission has been granted.
        int permission = ActivityCompat.checkSelfPermission(activity,
                Manifest.permission.WRITE_EXTERNAL_STORAGE);
        if (permission != PackageManager.PERMISSION_GRANTED) {
            // Not granted yet, so prompt the user.
            ActivityCompat.requestPermissions(activity, PERMISSIONS_STORAGE,
                    REQUEST_EXTERNAL_STORAGE);
        }
    }
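Create and initialize an audio file transcription engine, and create an audio file transcription configurator.

    // Set the API key read from agconnect-services.json.
    MLApplication.getInstance().setApiKey(
            AGConnectServicesConfig.fromContext(getApplication()).getString("client/api_key"));
    MLRemoteAftSetting setting = new MLRemoteAftSetting.Factory()
            // Language of the audio to transcribe.
            .setLanguageCode("zh")
            // Output punctuation in the transcribed text.
            .enablePunctuation(true)
            // Output the time offset of each word.
            .enableWordTimeOffset(true)
            // Output the time offset of each sentence.
            .enableSentenceTimeOffset(true)
            .create();

    // Create and initialize the transcription engine.
    MLRemoteAftEngine engine = MLRemoteAftEngine.getInstance();
    engine.init(this);
    // Pass the listener callback to the engine.
    engine.setAftListener(aftListener);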
Create a listener callback to process the audio file transcription result:

    private MLRemoteAftListener aftListener = new MLRemoteAftListener() {
        @Override
        public void onResult(String taskId, MLRemoteAftResult result, Object ext) {
            // Called when a transcription result is available.
            if (result.isComplete()) {
                // Process the transcription result here.
            }
        }

        @Override
        public void onError(String taskId, int errorCode, String message) {
            // Called when a transcription error occurs.
        }

        @Override
        public void onInitComplete(String taskId, Object ext) {
            // Reserved callback.
        }

        @Override
        public void onUploadProgress(String taskId, double progress, Object ext) {
            // Reserved callback.
        }

        @Override
        public void onEvent(String taskId, int eventId, Object ext) {
            // Reserved callback.
        }
    };
Long audio transcription: for audio files longer than 1 minute

    private MLRemoteAftListener asrListener = new MLRemoteAftListener() {
        @Override
        public void onInitComplete(String taskId, Object ext) {
            Log.e(TAG, "MLAsrCallBack onInitComplete");
            // Initialization is complete, so start the long audio transcription task.
            start(taskId);
        }

        @Override
        public void onUploadProgress(String taskId, double progress, Object ext) {
            Log.e(TAG, "MLAsrCallBack onUploadProgress");
        }

        @Override
        public void onEvent(String taskId, int eventId, Object ext) {
            Log.e(TAG, "MLAsrCallBack onEvent" + eventId);
            if (MLAftEvents.UPLOADED_EVENT == eventId) {
                // The file has been uploaded, so start polling for the result.
                startQueryResult(taskId);
            }
        }

        @Override
        public void onResult(String taskId, MLRemoteAftResult result, Object ext) {
            Log.e(TAG, "MLAsrCallBack onResult taskId is :" + taskId + " ");
            if (result != null) {
                Log.e(TAG, "MLAsrCallBack onResult isComplete: " + result.isComplete());
                if (result.isComplete()) {
                    // The task is finished, so stop polling for it.
                    TimerTask timerTask = timerTaskMap.get(taskId);
                    if (null != timerTask) {
                        timerTask.cancel();
                        timerTaskMap.remove(taskId);
                    }
                    if (result.getText() != null) {
                        Log.e(TAG, taskId + " MLAsrCallBack onResult result is : " + result.getText());
                        tvText.setText(result.getText());
                    }
                    // Word-level timestamps.
                    List<MLRemoteAftResult.Segment> words = result.getWords();
                    if (words != null && words.size() != 0) {
                        for (MLRemoteAftResult.Segment word : words) {
                            Log.e(TAG, "MLAsrCallBack word  text is : " + word.getText()
                                    + ", startTime is : " + word.getStartTime()
                                    + ". endTime is : " + word.getEndTime());
                        }
                    }
                    // Sentence-level timestamps.
                    List<MLRemoteAftResult.Segment> sentences = result.getSentences();
                    if (sentences != null && sentences.size() != 0) {
                        for (MLRemoteAftResult.Segment sentence : sentences) {
                            Log.e(TAG, "MLAsrCallBack sentence  text is : " + sentence.getText()
                                    + ", startTime is : " + sentence.getStartTime()
                                    + ". endTime is : " + sentence.getEndTime());
                        }
                    }
                }
            }
        }

        @Override
        public void onError(String taskId, int errorCode, String message) {
            Log.i(TAG, "MLAsrCallBack onError : " + message + "errorCode, " + errorCode);
            switch (errorCode) {
                case MLAftErrors.ERR_AUDIO_FILE_NOTSUPPORTED:
                    break;
                default:
                    break;
            }
        }
    };

    // Start the transcription task.
    private void start(String taskId) {
        Log.e(TAG, "start");
        engine.setAftListener(asrListener);
        engine.startTask(taskId);
    }

    // Query the result of the long audio transcription task.
    private void getResult(String taskId) {
        Log.e(TAG, "getResult");
        engine.setAftListener(asrListener);
        engine.getLongAftResult(taskId);
    }

    // Polling tasks, keyed by task ID.
    private Map<String, TimerTask> timerTaskMap = new HashMap<>();

    private void startQueryResult(final String taskId) {
        Timer mTimer = new Timer();
        TimerTask mTimerTask = new TimerTask() {
            @Override
            public void run() {
                getResult(taskId);
            }
        };
        // Query the result for the first time after 5 seconds, then every 10 seconds.
        mTimer.schedule(mTimerTask, 5000, 10000);
        // Remember the task so that it can be cancelled when the result is complete.
        timerTaskMap.put(taskId, mTimerTask);
    }
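The polling pattern in the listing (query periodically, then cancel once the result is complete) can be isolated into a small helper. The sketch below is plain Java with no SDK calls: `TaskPoller` is a hypothetical class, and the `Runnable` stands in for whatever queries the result. Unlike the listing, it shares one daemon `Timer` across all tasks instead of creating a new `Timer` for every task, so cancelled tasks do not leave idle timer threads behind.

```java
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;

final class TaskPoller {
    // One shared daemon timer, so idle polling never keeps the process alive.
    private final Timer timer = new Timer("aft-poller", true);
    private final Map<String, TimerTask> tasks = new ConcurrentHashMap<>();

    /** Starts polling for a task; returns false if that task is already being polled. */
    boolean start(String taskId, final Runnable query, long delayMs, long periodMs) {
        TimerTask task = new TimerTask() {
            @Override
            public void run() {
                try {
                    query.run();
                } catch (RuntimeException e) {
                    // Swallow the failure so one bad query does not kill the shared timer.
                }
            }
        };
        if (tasks.putIfAbsent(taskId, task) != null) {
            return false;
        }
        timer.schedule(task, delayMs, periodMs);
        return true;
    }

    /** Stops polling for a task; returns false if the task was not being polled. */
    boolean stop(String taskId) {
        TimerTask task = tasks.remove(taskId);
        if (task == null) {
            return false;
        }
        task.cancel();
        return true;
    }

    boolean isPolling(String taskId) {
        return tasks.containsKey(taskId);
    }
}
```

With such a helper, the upload event handler would call `start(taskId, query, 5000, 10000)` and the result handler would call `stop(taskId)` once the result is complete, matching the 5-second delay and 10-second period used in the listing.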
Obtain the audio file and upload it to the transcription engine:

    // Obtain the URI of the audio file.
    Uri uri = getFileUri();
    // Obtain the audio duration in milliseconds.
    Long audioTime = getAudioFileTimeFromUri(uri);
    // Audio shorter than 60 seconds uses short audio transcription.
    if (audioTime < 60000) {
        // uri is the audio resource read from local storage or the recorder.
        this.taskId = this.engine.shortRecognize(uri, this.setting);
        Log.i(TAG, "Short audio transcription.");
    } else {
        // longRecognize handles audio from 1 minute up to 5 hours.
        this.taskId = this.engine.longRecognize(uri, this.setting);
        Log.i(TAG, "Long audio transcription.");
    }

    private Long getAudioFileTimeFromUri(Uri uri) {
        Long time = null;
        Cursor cursor = this.getContentResolver()
                .query(uri, null, null, null, null);
        if (cursor != null) {
            try {
                // Read the duration only if the query returned a row.
                if (cursor.moveToFirst()) {
                    time = cursor.getLong(cursor.getColumnIndexOrThrow(MediaStore.Video.Media.DURATION));
                }
            } finally {
                cursor.close();
            }
        }
        if (time == null) {
            // Fall back to MediaPlayer when the content resolver has no duration.
            MediaPlayer mediaPlayer = new MediaPlayer();
            try {
                mediaPlayer.setDataSource(String.valueOf(uri));
                mediaPlayer.prepare();
                time = Long.valueOf(mediaPlayer.getDuration());
            } catch (IOException e) {
                Log.e(TAG, "Failed to read the file time.");
                time = 0L;
            } finally {
                mediaPlayer.release();
            }
        }
        return time;
    }
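The choice between the two transcription modes depends only on the duration, so it can be expressed as a small pure function that is easy to test. The sketch below is plain Java: `TranscriptionModeChooser` and its `Mode` enum are hypothetical names, and the limits are the ones stated in this article (under 60 seconds for short audio, up to 5 hours in total). Treating a missing or non-positive duration as unsupported is an assumption of the sketch.

```java
final class TranscriptionModeChooser {
    enum Mode { SHORT, LONG, UNSUPPORTED }

    /** Audio shorter than this goes to short audio transcription. */
    static final long SHORT_LIMIT_MS = 60_000L;
    /** Upper bound stated for audio file transcription: 5 hours. */
    static final long MAX_DURATION_MS = 5L * 60 * 60 * 1000;

    private TranscriptionModeChooser() {
    }

    /**
     * Picks a transcription mode from the audio duration in milliseconds.
     * A null or non-positive duration is treated as unsupported (assumption).
     */
    static Mode choose(Long durationMs) {
        if (durationMs == null || durationMs <= 0 || durationMs > MAX_DURATION_MS) {
            return Mode.UNSUPPORTED;
        }
        return durationMs < SHORT_LIMIT_MS ? Mode.SHORT : Mode.LONG;
    }
}
```

Calling `choose` before uploading lets the app tell the user up front that a file is too long or unreadable, instead of waiting for the engine to report an error.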
For more information, visit the HUAWEI Developers website.
>> Get the development guide
>> HMS Core open-source repositories: GitHub, Gitee
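Title: A Technical Walkthrough of Voice Search Shopping in 300 Lines of Code
Author: OSChina
Published: April 15, 2021, 09:46
Last updated: July 13, 2025, 05:44
Original link: https://haoxiang.eu.org/c26674eb/
Copyright: The copyright of this article belongs to the author. It is licensed under CC BY-NC-SA 4.0. Please credit the source when reposting.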