[1] LIN Miaojun, YANG Zimin, CHEN Bin. Data Purchase Volume and Model Pricing in Machine Learning Model Transactions[J]. Journal of Huaqiao University (Natural Science), 2025, (1): 95-103. [doi:10.11830/ISSN.1000-5013.202404034]

Data Purchase Volume and Model Pricing in Machine Learning Model Transactions

Journal of Huaqiao University (Natural Science) [ISSN: 1000-5013 / CN: 35-1079/N]

Volume:
Issue:
No. 1, 2025
Pages:
95-103
Column:
Publication date:
2025-01-10

Article Info

Title:
Data Purchase Volume and Model Pricing in Machine Learning Model Transactions
Article ID:
1000-5013(2025)01-0095-09
Author(s):
LIN Miaojun (1), YANG Zimin (2), CHEN Bin (2)
1. Financial Department, Huaqiao University, Quanzhou 362021, China; 2. School of Mathematical Science, Huaqiao University, Quanzhou 362021, China
Keywords:
data pricing; model pricing; Shapley value; integer programming
CLC number:
TP274;F49
DOI:
10.11830/ISSN.1000-5013.202404034
Document code:
A
Abstract:
A cost allocation problem is constructed from the sum of Shapley values over the data boundary. The Shapley values are computed with the fast truncated Monte Carlo algorithm, and the existence of an optimal solution for the data boundary is proved. For the scenario of pricing different versions of a model, the pricing problem of maximizing the model broker's revenue is defined and then transformed into an equivalent integer linear programming problem. The correctness of the proposed method is verified numerically on public datasets, and the method is compared experimentally with four existing methods. The experimental results show that the proposed method increases both the model broker's revenue and the purchase ratio of model buyers.
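
The truncated Monte Carlo step mentioned in the abstract can be illustrated with a minimal sketch. It follows the general idea from the Data Shapley literature: sample random permutations of the data points, accumulate each point's marginal contribution, and stop evaluating a permutation once the running utility is within a tolerance of the full-data utility. Everything named here (`toy_utility`, `tmc_shapley`, the tolerance, the six-point toy data set) is a hypothetical stand-in and not the paper's implementation; in the paper's setting the utility would presumably be a trained model's performance, which this sketch replaces with a simple diminishing-returns function.

```python
import random


def toy_utility(subset):
    """Hypothetical utility: points 0-4 are useful with diminishing returns,
    point 5 is pure noise and never changes the utility."""
    useful = sum(1 for index in subset if index < 5)
    return 1.0 - 0.5 ** useful


def tmc_shapley(n_points, utility, n_permutations=200, tolerance=0.05, seed=0):
    """Estimate per-point Shapley values by truncated Monte Carlo sampling."""
    rng = random.Random(seed)
    full_value = utility(frozenset(range(n_points)))
    totals = [0.0] * n_points
    for _ in range(n_permutations):
        order = list(range(n_points))
        rng.shuffle(order)
        coalition = set()
        previous = utility(frozenset())
        for index in order:
            if abs(full_value - previous) < tolerance:
                # Truncation: the remaining points are treated as adding nothing,
                # which saves the (normally expensive) utility evaluation.
                marginal = 0.0
            else:
                coalition.add(index)
                current = utility(frozenset(coalition))
                marginal = current - previous
                previous = current
            totals[index] += marginal
    return [total / n_permutations for total in totals]


values = tmc_shapley(6, toy_utility)
for index, value in enumerate(values):
    print(f"point {index}: estimated Shapley value {value:.4f}")
```

Truncation trades a small bias for speed: the estimates sum to the utility reached at the truncation point rather than to the full-data utility, so the gap is bounded by the chosen tolerance.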

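Memo

Received: 2024-04-11
Corresponding author: CHEN Bin (1984-), male, associate professor, Ph.D.; his research focuses on operations research and cybernetics. E-mail: chenbinmath@163.com.
Funding: National Natural Science Foundation of China (12071165); Natural Science Foundation of Fujian Province (2023J01124); Fundamental Research Funds for the Central Universities (ZQN-1102)
Last Update: 2025-01-20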