[1] GUO Yuhong, TONG Yunhai. Grouping Randomized Model in Privacy Preserving Frequent Item Set Mining[J]. Journal of Huaqiao University (Natural Science), 2020, 41(2): 230-236. [doi:10.11830/ISSN.1000-5013.201911025]

Grouping Randomized Model in Privacy Preserving Frequent Item Set Mining

Journal of Huaqiao University (Natural Science) [ISSN: 1000-5013 / CN: 35-1079/N]

Volume: 41
Issue: No. 2, 2020
Pages: 230-236
Section:
Publication date: 2020-03-20

Article Info

Title: Grouping Randomized Model in Privacy Preserving Frequent Item Set Mining
Article ID: 1000-5013(2020)02-0230-07
Author(s): GUO Yuhong¹, TONG Yunhai²
Affiliations: 1. School of Information Science and Technology, University of International Relations, Beijing 100091, China; 2. Department of Intelligence Science, Peking University, Beijing 100871, China
Keywords: randomized response; privacy preserving; frequent item set; support reconstruction; data mining; Warner model
CLC number: TP311
DOI: 10.11830/ISSN.1000-5013.201911025
Document code: A
Abstract:
A study of privacy-preserving frequent item set mining shows that existing single-parameter randomized response models apply one parameter across a wide range of data at coarse granularity, so they cannot provide fine-grained, differentiated privacy protection. Building on the Warner model and single-parameter randomization models, this paper proposes an individual-grouping multi-parameter randomization model, PN/g, and gives the corresponding support reconstruction method for privacy-preserving frequent item set mining. To meet diversified and differentiated privacy protection needs, the model divides N individuals into several groups and assigns each group its own randomization parameter, which yields differentiated privacy protection. An example analysis shows that, combined with the proposed support reconstruction method, privacy-preserving frequent item set mining based on grouping randomization can be carried out: effective frequent item sets and association rules are mined while the privacy of different groups is protected.
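
The grouping idea in the abstract can be illustrated with a minimal sketch. It assumes the classical Warner scheme for a single binary item: each individual reports the true bit with probability p and the flipped bit otherwise, and each group has its own p. The abstract does not state the PN/g reconstruction formulas, so the per-group estimator and the size-weighted combination below are assumptions, and all function names are hypothetical. Multi-item sets would need a fuller reconstruction than this single-item case.

```python
import random

def randomize(bit, p, rng):
    """Warner-style response: report the true bit with probability p, else its flip."""
    return bit if rng.random() < p else 1 - bit

def reconstruct_support(observed_ones, n, p):
    """Estimate the true fraction of 1s from n randomized answers (classical Warner estimator)."""
    if n <= 0:
        raise ValueError("group must be non-empty")
    if abs(2 * p - 1) < 1e-12:
        raise ValueError("p = 0.5 leaves no information to reconstruct from")
    observed = observed_ones / n
    # E[observed] = p * s + (1 - p) * (1 - s); solve for the true support s.
    return (observed - (1 - p)) / (2 * p - 1)

def grouped_support(groups):
    """groups: iterable of (observed_ones, n, p), one tuple per group.

    Assumption (not from the paper): the overall support is the
    size-weighted mean of the per-group estimates.
    """
    groups = list(groups)
    total = sum(n for _, n, _ in groups)
    return sum(n * reconstruct_support(ones, n, p) for ones, n, p in groups) / total

def simulate(true_bits_by_group, params, seed=0):
    """Randomize each group's bits with its own p; return (observed_ones, n, p) tuples."""
    rng = random.Random(seed)
    return [
        (sum(randomize(b, p, rng) for b in bits), len(bits), p)
        for bits, p in zip(true_bits_by_group, params)
    ]
```

For instance, `grouped_support([(62, 100, 0.8), (130, 300, 0.9)])` gives per-group estimates of about 0.7 and 0.4167 and an overall support of about 0.4875. A group whose p is closer to 0.5 gets stronger protection but a noisier estimate, which is the trade-off that per-group parameters let a data collector tune.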


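Memo

Received: 2019-11-08
Corresponding author: GUO Yuhong (born 1979), female, associate professor, Ph.D.; research interests: data mining and recommender systems. E-mail: yhguo@uir.cn.
Funding: Fundamental Research Funds for the Central Universities, University of International Relations (3262017T48, 3262018T02)
Last Update: 2020-03-20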