XU Feng, LI Ping. Outdoor Human Footsteps Event and Environment Joint Recognition[J]. Journal of Huaqiao University (Natural Science), 2021, 42(5): 676-683. [doi:10.11830/ISSN.1000-5013.202008011]

Outdoor Human Footsteps Event and Environment Joint Recognition

Journal of Huaqiao University (Natural Science) [ISSN: 1000-5013 / CN: 35-1079/N]

Volume:
Vol. 42
Issue:
No. 5, 2021
Pages:
676-683
Publication Date:
2021-09-20

Article Info

Title:
Outdoor Human Footsteps Event and Environment Joint Recognition
Article ID:
1000-5013(2021)05-0676-08
Author(s):
XU Feng, LI Ping
College of Information Science and Engineering, Huaqiao University, Xiamen 361021, China
Keywords:
cross double footsteps; joint recognition; multitask learning; fusion features
CLC Number:
TP391.4
DOI:
10.11830/ISSN.1000-5013.202008011
Document Code:
A
Abstract:
To realize joint recognition of outdoor human footstep events and their environment, a dataset of human running and walking in complex, acoustically similar environments was first constructed, and a cross double-footstep segmentation scheme was proposed to cross-segment the continuous footstep signal. Features were then extracted separately from the event perspective and the environment perspective, and two fusion features were designed with task balance in mind. Finally, three deep learning models were applied to recognize both tasks. The results show that the proposed method simplifies and balances the tasks, so that the multitask design for joint recognition of outdoor human footstep events and environment achieves accurate recognition without a complex model.
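The abstract names a "cross double footsteps" segmentation scheme but does not detail it on this page. The sketch below is one plausible reading, offered purely as an assumption: footstep onsets are located with a simple short-time-energy detector, and consecutive onsets are paired so that adjacent segments overlap by one footstep. The names detect_onsets and cross_double_segments and all thresholds are illustrative, not taken from the paper.

import numpy as np

def detect_onsets(signal, sr, frame_ms=20, hop_ms=10, k=3.0):
    """Locate footstep onsets as peaks of the short-time energy envelope.
    A frame counts as active when its energy exceeds mean + k * std
    (a simple heuristic, not the paper's detector)."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n_frames = 1 + (len(signal) - frame) // hop
    energy = np.array([
        np.sum(signal[i * hop:i * hop + frame] ** 2) for i in range(n_frames)
    ])
    active = energy > energy.mean() + k * energy.std()
    # Keep only rising edges so each footstep yields a single onset.
    rising = np.flatnonzero(active & ~np.roll(active, 1))
    return rising * hop  # onset positions in samples

def cross_double_segments(signal, onsets, sr, tail_s=0.3):
    """Form overlapping double-footstep segments (step1, step2),
    (step2, step3), ... so that neighbouring segments share one step."""
    segments = []
    for a, b in zip(onsets[:-1], onsets[1:]):
        end = min(len(signal), b + int(tail_s * sr))
        segments.append(signal[a:end])
    return segments

# Usage on a synthetic signal: four impulsive "steps" over low noise.
sr = 16000
x = 0.01 * np.random.randn(5 * sr)
for onset_s in (0.5, 1.4, 2.3, 3.2):
    i = int(onset_s * sr)
    x[i:i + 800] += np.hanning(800) * 0.8
onsets = detect_onsets(x, sr)
segs = cross_double_segments(x, onsets, sr)
print(len(segs))  # 3 overlapping two-step segments from 4 steps

Under this reading, every footstep appears in two segments, once as the leading step and once as the trailing step, which is one way such a cross scheme can enrich a small outdoor recording set.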
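The two fusion features "designed from the perspective of task balance" are likewise not specified here. A common pattern, shown only as an illustrative assumption, is to stack an event-oriented descriptor on an environment-oriented one frame by frame; event_env_fusion and the particular feature choices (MFCC plus log-mel) are hypothetical, not the paper's design.

import numpy as np
import librosa

def event_env_fusion(y, sr):
    # Event-oriented view: MFCCs summarize the transient footstep spectrum.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)            # shape (20, T)
    # Environment-oriented view: log-mel energies keep the background texture.
    logmel = librosa.power_to_db(
        librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64))    # shape (64, T)
    # Frame-wise concatenation yields one fused (84, T) feature map
    # that a single model can consume for both tasks.
    return np.vstack([mfcc, logmel])

# Usage on one second of placeholder noise standing in for a recording.
sr = 16000
y = np.random.randn(sr).astype(np.float32)
print(event_env_fusion(y, sr).shape)  # (84, T)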


Memo

Received: 2020-08-10
Corresponding Author: LI Ping (1981-), female, associate professor, Ph.D.; her research focuses on intelligent control and nonlinear systems. E-mail: pingping_1213@126.com.
Funding: National Natural Science Foundation of China (61603144); Natural Science Foundation of Fujian Province (2018J01095); Major Industry-University-Research Cooperation Project of Fujian Province Universities (2013H6016); Scientific Research Funds for Young and Middle-Aged Teachers of Huaqiao University (ZQN-PY509)
Last Update: 2021-09-20