HTML5 Speech Recognition

2024-04-19, user-submitted study notes

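To implement speech recognition in an HTML5 web page, the overall flow is: the user clicks a record button and speaks; the recording is uploaded; the back end receives the request and calls a speech recognition method; the recognition result is returned, and the front end handles it from there.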

Reference: https://blog.csdn.net/qq_33609401/article/details/78172080, which covers this in detail; HZRecorder.js can be copied from there.

Technically, the work therefore splits into two parts: capturing audio and recognizing speech.

I. Capturing audio

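Add a record button to the HTML page and use HZRecorder.js to access the device's recording capability. (Note: on mobile browsers, recording only works when the site has an SSL certificate, i.e. the URL uses https. Some people say that packaging the page as an app gets around this, but it does not: without an SSL certificate, the call still fails after packaging.)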

Front end:

<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta name="viewport" content="width=device-width,height=device-height, initial-scale=1.0, maximum-scale=1.0, user-scalable=0" />
    <title></title>
</head>
<body>
    <div>
        <audio controls autoplay></audio>
        <input onclick="startRecording()" type="button" value="Record" />
        <input onclick="stopRecording()" type="button" value="Stop" />
        <input onclick="playRecording()" type="button" value="Play" />
        <input onclick="uploadAudio()" type="button" value="Submit" />
    </div>
    <script type="text/javascript" src="HZRecorder.js"></script>
    <script>
        var recorder;
        var audio = document.querySelector('audio');

        // Request a recorder from HZRecorder and start recording.
        function startRecording() {
            HZRecorder.get(function (rec) {
                recorder = rec;
                recorder.start();
            });
        }

        // The guards below avoid a TypeError when a button is pressed
        // before any recording has been started.
        function stopRecording() {
            if (recorder) recorder.stop();
        }

        function playRecording() {
            if (recorder) recorder.play(audio);
        }

        // Upload the recording to the servlet and report the outcome.
        function uploadAudio() {
            if (!recorder) return;
            recorder.upload("http://localhost:8080/****/UploadVideoServlet", function (state, e) {
                switch (state) {
                    case 'uploading':
                        break;
                    case 'ok':
                        alert("Upload succeeded");
                        break;
                    case 'error':
                        alert("Upload failed");
                        break;
                    case 'cancel':
                        alert("Upload was cancelled");
                        break;
                }
            });
        }
    </script>
</body>
</html>

Back end: UploadVideoServlet handles the upload and calls the speech recognition code.

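import java.io.IOException;
import java.util.List;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.commons.fileupload.FileItem;
import org.apache.commons.fileupload.disk.DiskFileItemFactory;
import org.apache.commons.fileupload.servlet.ServletFileUpload;

@WebServlet("/UploadVideoServlet")
public class UploadVideoServlet extends HttpServlet {
    private static final long serialVersionUID = 1L;

    public UploadVideoServlet() {
        super();
    }

    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        response.getWriter().append("Served at: ").append(request.getContextPath());
        this.doPost(request, response);
    }

    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // Status message
        String message = "";
        try {
            DiskFileItemFactory factory = new DiskFileItemFactory();
            ServletFileUpload upload = new ServletFileUpload(factory);
            List<FileItem> list = upload.parseRequest(request);
            for (FileItem item : list) {
                // Call the speech recognition method
                //Sample.main(item.get()); // Baidu
                WebIAT.main(item.get()); // iFLYTEK
                message = "File uploaded successfully!";
            }
        } catch (Exception e) {
            message = "File upload failed!";
            e.printStackTrace();
        }
        // Report the outcome to the caller.
        response.getWriter().append(message);
    }
}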

II. Speech recognition: implemented with the API provided by iFLYTEK or Baidu.

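(For both iFLYTEK and Baidu: (1) you first need to apply for an APPID and API_KEY; the exact steps are on each vendor's official site. (2) Both offer several integration methods; I did not try each of them.)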
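Both samples in this section are configured for 16 kHz audio (`sms16k` for iFLYTEK, `rate=16000` for Baidu), so it may be worth checking the uploaded bytes before forwarding them. The following is only a minimal sketch: `WavHeaderCheck` and its methods are hypothetical names, not part of HZRecorder.js or either SDK, and it assumes the upload is a canonical 44-byte PCM WAV header in which the `fmt ` chunk directly follows the `RIFF`/`WAVE` tags.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads fields from a canonical PCM WAV header.
// Assumption: the "fmt " chunk starts at byte 12, which holds for simple
// recorders but is not guaranteed for every WAV file.
class WavHeaderCheck {

    private static String tag(byte[] data, int offset) {
        return new String(data, offset, 4, StandardCharsets.US_ASCII);
    }

    // True when the bytes start with a RIFF/WAVE header followed by "fmt ".
    static boolean isRiffWave(byte[] data) {
        return data != null && data.length >= 44
                && tag(data, 0).equals("RIFF")
                && tag(data, 8).equals("WAVE")
                && tag(data, 12).equals("fmt ");
    }

    private static ByteBuffer le(byte[] data) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
    }

    static int channels(byte[] data)      { return le(data).getShort(22) & 0xFFFF; }
    static int sampleRate(byte[] data)    { return le(data).getInt(24); }
    static int bitsPerSample(byte[] data) { return le(data).getShort(34) & 0xFFFF; }

    // True when the header is well formed and declares the expected rate.
    static boolean hasSampleRate(byte[] data, int expectedHz) {
        return isRiffWave(data) && sampleRate(data) == expectedHz;
    }

    // Builds an empty 44-byte PCM header, used here only for demonstration.
    static byte[] sampleHeader(int sampleRate, int channels, int bits) {
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        b.putInt(36);                                  // 36 + data length (0)
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        b.putInt(16);                                  // fmt chunk size for PCM
        b.putShort((short) 1);                         // audio format 1 = PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(sampleRate * channels * bits / 8);    // byte rate
        b.putShort((short) (channels * bits / 8));     // block align
        b.putShort((short) bits);
        b.put("data".getBytes(StandardCharsets.US_ASCII));
        b.putInt(0);                                   // no sample data
        return b.array();
    }

    public static void main(String[] args) {
        byte[] header = sampleHeader(16000, 1, 16);
        System.out.println("16 kHz WAV: " + hasSampleRate(header, 16000));
        System.out.println("channels: " + channels(header));
    }
}
```

In the servlet, a check such as `WavHeaderCheck.hasSampleRate(item.get(), 16000)` could run before the recognition call; whether each service also requires mono or 16-bit samples should be confirmed against its documentation.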

1. iFLYTEK example:

Reference: https://doc.xfyun.cn/rest_api/%E8%AF%AD%E9%9F%B3%E5%90%AC%E5%86%99.html

import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.HashMap;
import java.util.Map;

import org.apache.commons.codec.binary.Base64;
import org.apache.commons.codec.digest.DigestUtils;

// Note: HttpUtil is not defined in this post; it is presumably the helper
// class that ships with the iFLYTEK WebAPI demo code.
public class WebIAT {
    // Dictation (IAT) WebAPI endpoint
    private static final String WEBIAT_URL = "http://api.xfyun.cn/v1/service/v1/iat";
    // App APPID (must be a webapi-type app with the speech dictation service enabled;
    // see the post on creating a webapi app: http://bbs.xfyun.cn/forum.php?mod=viewthread&tid=36481)
    private static final String APPID = "*******";
    // API key (after enabling dictation for a webapi app:
    // Console > My Apps > Speech Dictation > the apikey of that service)
    private static final String API_KEY = "*************************";
    // Audio encoding
    private static final String AUE = "raw";
    // Engine type
    private static final String ENGINE_TYPE = "sms16k";
    // End-of-speech silence timeout (range 0-10000 ms)
    private static final String VAD_EOS = "10000";

    /**
     * Dictation WebAPI sample call.
     * @param bytes raw audio bytes to recognize
     * @throws IOException
     */
    public static void main(byte[] bytes) throws IOException {
        Map<String, String> header = buildHttpHeader();
        //byte[] audioByteArray = FileUtil.read(AUDIO_PATH);
        byte[] audioByteArray = bytes;
        String audioBase64 = new String(Base64.encodeBase64(audioByteArray), "UTF-8");
        String result = HttpUtil.doPost1(WEBIAT_URL, header, "audio=" + URLEncoder.encode(audioBase64, "UTF-8"));
        System.out.println("Dictation WebAPI call result: " + result);
    }

    // If the returned code is an error code, look it up at https://www.xfyun.cn/document/error-code

    /**
     * Builds the HTTP request headers.
     */
    private static Map<String, String> buildHttpHeader() throws UnsupportedEncodingException {
        String curTime = System.currentTimeMillis() / 1000L + "";
        String param = "{\"aue\":\"" + AUE + "\"" + ",\"engine_type\":\"" + ENGINE_TYPE + "\"" + ",\"vad_eos\":\"" + VAD_EOS + "\"}";
        String paramBase64 = new String(Base64.encodeBase64(param.getBytes("UTF-8")));
        String checkSum = DigestUtils.md5Hex(API_KEY + curTime + paramBase64);
        Map<String, String> header = new HashMap<String, String>();
        header.put("Content-Type", "application/x-www-form-urlencoded; charset=utf-8");
        header.put("X-Param", paramBase64);
        header.put("X-CurTime", curTime);
        header.put("X-CheckSum", checkSum);
        header.put("X-Appid", APPID);
        return header;
    }
}
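The header logic in `buildHttpHeader` comes down to three steps: Base64-encode the JSON parameters, concatenate API key + timestamp + encoded parameters, and take the MD5 hex digest. As a rough illustration, the same steps can be written with only the JDK (`java.util.Base64` and `MessageDigest`) in place of commons-codec. `IatHeaderSketch` is a hypothetical name and the credentials in `main` are placeholders; this is a sketch of the computation shown above, not iFLYTEK's reference code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical JDK-only version of the header computation in WebIAT.
class IatHeaderSketch {

    // Lowercase hex MD5 of the UTF-8 bytes of the input.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is unavailable", e);
        }
    }

    // Builds the same five headers as WebIAT.buildHttpHeader, with the
    // timestamp passed in so the result is reproducible.
    static Map<String, String> buildHeader(String appId, String apiKey,
                                           String paramJson, long curTimeSeconds) {
        String curTime = Long.toString(curTimeSeconds);
        String paramBase64 = Base64.getEncoder()
                .encodeToString(paramJson.getBytes(StandardCharsets.UTF_8));
        Map<String, String> header = new LinkedHashMap<>();
        header.put("Content-Type", "application/x-www-form-urlencoded; charset=utf-8");
        header.put("X-Param", paramBase64);
        header.put("X-CurTime", curTime);
        header.put("X-CheckSum", md5Hex(apiKey + curTime + paramBase64));
        header.put("X-Appid", appId);
        return header;
    }

    public static void main(String[] args) {
        // Placeholder credentials, for illustration only.
        String param = "{\"aue\":\"raw\",\"engine_type\":\"sms16k\",\"vad_eos\":\"10000\"}";
        Map<String, String> header = buildHeader("demo-appid", "demo-key", param,
                System.currentTimeMillis() / 1000L);
        header.forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

Passing the timestamp as a parameter keeps the checksum reproducible, which makes it easier to compare against a known-good request when the service rejects the signature.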

2. Baidu example:

Reference: https://ai.baidu.com/docs#/ASR-Online-Java-SDK/top

import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

import org.json.JSONObject;

public class Sample {
    private static final String serverURL = "http://vop.baidu.com/server_api";
    private static String token = "";
    // put your own params here
    private static final String apiKey = "****************"; // replace with your own apiKey
    private static final String secretKey = "*******************"; // replace with your own secretKey
    private static final String cuid = "*******"; // unique identifier, e.g. a machine code

    public static void main(byte[] bytes) throws Exception {
        getToken();
        method2(bytes);
    }

    // Fetch an access token using the API key and secret key.
    private static void getToken() throws Exception {
        String getTokenURL = "https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials"
                + "&client_id=" + apiKey + "&client_secret=" + secretKey;
        HttpURLConnection conn = (HttpURLConnection) new URL(getTokenURL).openConnection();
        token = new JSONObject(printResponse(conn)).getString("access_token");
    }

    // POST the raw audio bytes to the recognition endpoint.
    private static void method2(byte[] bytes) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(serverURL
                + "?cuid=" + cuid + "&token=" + token).openConnection();

        // add request header
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "audio/wav; rate=16000");
        conn.setDoInput(true);
        conn.setDoOutput(true);

        // send request
        DataOutputStream wr = new DataOutputStream(conn.getOutputStream());
        wr.write(bytes);
        wr.flush();
        wr.close();

        printResponse(conn);
    }

    // Read the response body, print it as formatted JSON, and return it.
    private static String printResponse(HttpURLConnection conn) throws Exception {
        if (conn.getResponseCode() != 200) {
            // request error
            return "";
        }
        InputStream is = conn.getInputStream();
        BufferedReader rd = new BufferedReader(new InputStreamReader(is));
        String line;
        StringBuffer response = new StringBuffer();
        while ((line = rd.readLine()) != null) {
            response.append(line);
            response.append('\r');
        }
        rd.close();
        System.out.println(new JSONObject(response.toString()).toString(4));
        return response.toString();
    }
}
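`Sample.main` calls `getToken()` on every upload, so each recognition costs two HTTP round trips. A possible refinement is to cache the token until shortly before it expires. The sketch below rests on assumptions: `TokenCache` and `Token` are hypothetical names, the token fetch is passed in as a function rather than tied to Baidu's endpoint, and it assumes the token response carries a lifetime in seconds (the standard OAuth 2.0 `expires_in` field), which should be verified against Baidu's documentation.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical token cache: reuses a token until shortly before it expires.
class TokenCache {

    // Token value plus its lifetime in seconds (assumed to come from the
    // "expires_in" field of the token response).
    static final class Token {
        final String value;
        final long expiresInSeconds;

        Token(String value, long expiresInSeconds) {
            this.value = value;
            this.expiresInSeconds = expiresInSeconds;
        }
    }

    private static final long SAFETY_MARGIN_SECONDS = 60;

    private final Supplier<Token> fetcher;   // performs the real HTTP call
    private final LongSupplier clockMillis;  // injectable clock
    private String cached;
    private long validUntilMillis;

    TokenCache(Supplier<Token> fetcher, LongSupplier clockMillis) {
        this.fetcher = fetcher;
        this.clockMillis = clockMillis;
    }

    // Returns the cached token, fetching a new one when none is cached
    // or the cached one is within the safety margin of expiring.
    synchronized String get() {
        long now = clockMillis.getAsLong();
        if (cached == null || now >= validUntilMillis) {
            Token fresh = fetcher.get();
            cached = fresh.value;
            long usableSeconds = Math.max(0, fresh.expiresInSeconds - SAFETY_MARGIN_SECONDS);
            validUntilMillis = now + usableSeconds * 1000L;
        }
        return cached;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake fetcher standing in for the HTTP request in Sample.getToken().
        TokenCache cache = new TokenCache(
                () -> new Token("token-" + (++calls[0]), 3600),
                System::currentTimeMillis);
        System.out.println(cache.get());
        System.out.println(cache.get());
        System.out.println("fetches: " + calls[0]);
    }
}
```

In `Sample`, the fetcher would wrap the existing token request and read both `access_token` and the lifetime from the JSON response; injecting the clock keeps the expiry logic easy to exercise without waiting.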

Sample conversion result:

Once the result is available, the front end can carry out any further processing.

Appendix: HZRecorder.js
