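To remove silence from captured sound in our final year project, we used the algorithm from the paper cited in the code comment below ('A New Silence Removal and Endpoint Detection Algorithm for Speech and Speaker Recognition Applications').

In this post, I am publishing the endpoint detection and silence removal code (an implementation of this algorithm in Java).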

These links might be useful to you as well.

The constructor of the following Java class, EndPointDetection, takes two parameters:

array of the original signal's amplitude data : float[] originalSignal

sampling rate of the original signal in Hz : int samplingRate

package org.ioe.tprsa.audio.preProcessings;

/**
 * @author Ganesh Tiwari
 * @reference 'A New Silence Removal and Endpoint Detection Algorithm
 * for Speech and Speaker Recognition Applications' by IIT, Kharagpur
 */
public class EndPointDetection {

    private float[] originalSignal; // input
    private float[] silenceRemovedSignal; // output
    private int samplingRate;
    private int firstSamples;
    private int samplePerFrame;

    public EndPointDetection(float[] originalSignal, int samplingRate) {
        this.originalSignal = originalSignal;
        this.samplingRate = samplingRate;
        samplePerFrame = this.samplingRate / 1000; // samples in one 1 ms frame
        // first 200 ms, according to formula; capped so a short signal is never read past its end
        firstSamples = Math.min(samplePerFrame * 200, originalSignal.length);
    }

    public float[] doEndPointDetection() {
        // for identifying each sample whether it is voiced or unvoiced
        float[] voiced = new float[originalSignal.length];
        float sum = 0;
        double sd = 0.0;
        double m = 0.0;

        // 1. calculation of mean over the first samples
        for (int i = 0; i < firstSamples; i++) {
            sum += originalSignal[i];
        }
        m = sum / firstSamples; // mean
        sum = 0; // reuse var for S.D.

        // 2. calculation of Standard Deviation
        for (int i = 0; i < firstSamples; i++) {
            sum += Math.pow((originalSignal[i] - m), 2);
        }
        sd = Math.sqrt(sum / firstSamples);

        // 3. one-dimensional Mahalanobis distance function,
        // i.e. check whether |x-u|/s is greater than the threshold or not
        for (int i = 0; i < originalSignal.length; i++) {
            if ((Math.abs(originalSignal[i] - m) / sd) > 0.3) { // 0.3 = THRESHOLD.. adjust value yourself
                voiced[i] = 1;
            } else {
                voiced[i] = 0;
            }
        }

        // 4. calculation of voiced and unvoiced signals
        // mark each frame to be voiced or unvoiced frame
        int frameCount = 0;
        // starts at 0 (the original started at 1, which left one extra frame of zeros at the end of the output)
        int usefulFramesCount = 0;
        int count_voiced = 0;
        int count_unvoiced = 0;
        int[] voicedFrame = new int[originalSignal.length / samplePerFrame];

        // the following calculation truncates the remainder
        int loopCount = originalSignal.length - (originalSignal.length % samplePerFrame);
        for (int i = 0; i < loopCount; i += samplePerFrame) {
            count_voiced = 0;
            count_unvoiced = 0;
            for (int j = i; j < i + samplePerFrame; j++) {
                if (voiced[j] == 1) {
                    count_voiced++;
                } else {
                    count_unvoiced++;
                }
            }
            // a frame is voiced when most of its samples are voiced
            if (count_voiced > count_unvoiced) {
                usefulFramesCount++;
                voicedFrame[frameCount++] = 1;
            } else {
                voicedFrame[frameCount++] = 0;
            }
        }

        // 5. silence removal: copy only the voiced frames to the output
        silenceRemovedSignal = new float[usefulFramesCount * samplePerFrame];
        int k = 0;
        for (int i = 0; i < frameCount; i++) {
            if (voicedFrame[i] == 1) {
                for (int j = i * samplePerFrame; j < i * samplePerFrame + samplePerFrame; j++) {
                    silenceRemovedSignal[k++] = originalSignal[j];
                }
            }
        }
        // end
        return silenceRemovedSignal;
    }
}
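To see the per-sample thresholding step (steps 1 to 3 above) in isolation, here is a minimal sketch on a toy signal. The class name MahalanobisSketch, the method markVoiced, the sample values and the threshold of 3.0 are all assumptions made for illustration; they are not part of the class above, which uses 0.3 and asks you to tune the value yourself.

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of the original project.
public class MahalanobisSketch {

    /**
     * Estimates mean and standard deviation from the first noiseSamples samples
     * (assumed to contain only background noise) and marks a sample as voiced
     * when |x - mean| / sd is greater than the threshold.
     */
    public static boolean[] markVoiced(float[] signal, int noiseSamples, double threshold) {
        int n = Math.min(noiseSamples, signal.length);

        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += signal[i];
        }
        double mean = sum / n;

        double squared = 0;
        for (int i = 0; i < n; i++) {
            squared += (signal[i] - mean) * (signal[i] - mean);
        }
        double sd = Math.sqrt(squared / n);

        boolean[] voiced = new boolean[signal.length];
        for (int i = 0; i < signal.length; i++) {
            voiced[i] = Math.abs(signal[i] - mean) / sd > threshold;
        }
        return voiced;
    }

    public static void main(String[] args) {
        // Toy data: four quiet "noise" samples followed by a few loud ones.
        float[] signal = {0.01f, -0.01f, 0.01f, -0.01f, 0.5f, -0.6f, 0.01f, 0.7f};
        boolean[] voiced = markVoiced(signal, 4, 3.0);
        // prints [false, false, false, false, true, true, false, true]
        System.out.println(Arrays.toString(voiced));
    }
}
```

The quiet samples sit about one standard deviation from the noise mean, so they stay below the threshold, while the loud samples are tens of standard deviations away. Note that if the leading samples are perfectly constant, sd is 0 and the ratio is no longer meaningful, so real recordings need some background noise in the first 200 ms.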

Q: Hi Ganesh, so is it impossible to listen to the voice after normalizePCM and endpoint detection?

A: You can play the recorded audio after doing those time-domain operations.

You need to play the PCM array using the code at: http://ganeshtiwaridotcomdotnp.blogspot.com/2011/12/java-audio-playing-pcm-amplitude-array.html

You can find other code related to sound processing in Java here:

http://ganeshtiwaridotcomdotnp.blogspot.com/search/label/Audio%20Processing
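As a rough idea of what playing a PCM amplitude array can look like, here is a minimal sketch using the standard javax.sound.sampled API. It is not the code from the linked post. The class name PcmPlaybackSketch and its methods are hypothetical, and it assumes the amplitudes are already normalized to the range [-1, 1] and that the audio is mono.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;

// Hypothetical illustration class, not the code from the linked post.
public class PcmPlaybackSketch {

    /** Converts amplitudes (assumed to be in [-1, 1]) to 16-bit signed little-endian PCM bytes. */
    public static byte[] toPcm16(float[] samples) {
        byte[] out = new byte[samples.length * 2];
        for (int i = 0; i < samples.length; i++) {
            // clamp so out-of-range values do not wrap around
            float clamped = Math.max(-1f, Math.min(1f, samples[i]));
            short s = (short) Math.round(clamped * Short.MAX_VALUE);
            out[2 * i] = (byte) (s & 0xFF);            // low byte first
            out[2 * i + 1] = (byte) ((s >> 8) & 0xFF); // then high byte
        }
        return out;
    }

    /** Plays the samples as mono 16-bit audio on the default output device. */
    public static void play(float[] samples, int samplingRate) throws LineUnavailableException {
        // 16 bits per sample, 1 channel, signed, little-endian
        AudioFormat format = new AudioFormat(samplingRate, 16, 1, true, false);
        byte[] pcm = toPcm16(samples);
        SourceDataLine line = AudioSystem.getSourceDataLine(format);
        try {
            line.open(format);
            line.start();
            line.write(pcm, 0, pcm.length);
            line.drain(); // block until everything queued has been played
        } finally {
            line.close();
        }
    }
}
```

With the class from this post, a call could look like PcmPlaybackSketch.play(new EndPointDetection(signal, 16000).doEndPointDetection(), 16000), where signal and the 16000 Hz rate stand in for your own recording.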
