
Infrared Spectral Study on the Origin Identification of Boletus Tomentipes Based on the Random Forest Algorithm and Data Fusion Strategy

Document type: Foreign-language journal article

Authors: Hu Yi-ran 1; Li Jie-qing 1; Liu Hong-gao 2; Fan Mao-pan 1; Wang Yuan-zhong 3

Author affiliations: 1. Yunnan Agr Univ, Coll Resources & Environm, Kunming 650201, Yunnan, Peoples R China

2. Yunnan Agr Univ, Coll Agron & Biotechnol, Kunming 650201, Yunnan, Peoples R China

3. Yunnan Acad Agr Sci, Inst Med Plants, Kunming 650200, Yunnan, Peoples R China

Keywords: Boletus tomentipes; Geographic origin identification; Data fusion; Fourier transform mid-infrared spectrum; Fourier transform near-infrared spectrum

Journal: SPECTROSCOPY AND SPECTRAL ANALYSIS (Impact factor: 0.589; 5-year impact factor: 0.504)

ISSN: 1000-0593

Year, volume, issue: 2020, Vol. 40, No. 5

Pages:

Indexed in: SCI

Abstract: Boletus tomentipes Earle, as a healthy food, is favored by many consumers. Nutrient accumulation in the fruiting body is affected by the growth environment (altitude, climate, etc.), and nutrient content differs significantly between regions, so an accurate, rapid, and inexpensive origin identification technique is urgently needed. In this paper, a data fusion strategy combined with the random forest (RF) algorithm was used to identify the origin of B. tomentipes, and the effects of several feature extraction methods on the classification performance of the RF models were compared. Fourier transform near-infrared and Fourier transform mid-infrared spectra of 87 samples from 4 producing areas (north subtropical, north temperate, south subtropical, and middle subtropical zones) were collected and their spectral characteristics analyzed. The samples were divided by the Kennard-Stone algorithm into a training set of two thirds (58) and a validation set of one third (29). RF identification models were built on 4 kinds of infrared spectra (near-infrared average spectra of stipes (N-b), near-infrared average spectra of caps (N-g), mid-infrared average spectra of stipes (M-b), and mid-infrared average spectra of caps (M-g)) and on three data fusion strategies (low-level, mid-level, and high-level fusion), and the effects of different feature extraction methods (variable importance in projection, Boruta, and latent variables (LV)) on the classification results were compared. The optimal ntree and mtry were selected according to the out-of-bag (OOB) error. Classification performance was evaluated by specificity, sensitivity, training set accuracy, and validation set accuracy, and the best method for identifying the origin of B. tomentipes was chosen on the basis of these indicators.
The results showed that (1) both near-infrared and mid-infrared spectra could identify the origin of B. tomentipes; (2) a discriminant model built from a single spectrum combined with RF was not ideal; (3) all three fusion strategies improved origin identification of B. tomentipes. Ranked from best to worst, the identification results were high-level fusion, mid-level fusion, and low-level fusion. Using the near-infrared and mid-infrared spectra of B. tomentipes, a high-level fusion strategy based on LV features combined with RF gave an identification model for samples from different regions with high validation set accuracy (99.6%), high sensitivity (0.969), and high specificity (0.986). As a reliable method, it can identify the geographical origin of B. tomentipes quickly and accurately.
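The Kennard-Stone split mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the distance metric or any spectral preprocessing, so Euclidean distance on raw feature vectors is an assumption here, and the function names `kennard_stone` and `split_train_validation` are hypothetical, not taken from the paper.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule.

    Assumption: Euclidean distance between feature vectors (not stated in the abstract).
    """
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")

    # Pairwise Euclidean distances between all samples.
    dist = [[math.dist(a, b) for b in samples] for a in samples]

    # Start with the two samples that are farthest apart.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]

    # Repeatedly add the sample whose nearest selected neighbour is farthest away.
    while len(selected) < n_select:
        best = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


def split_train_validation(samples, train_fraction=2 / 3):
    """Split sample indices into a Kennard-Stone training set and the rest."""
    n_train = round(len(samples) * train_fraction)
    train = kennard_stone(samples, n_train)
    chosen = set(train)
    validation = [k for k in range(len(samples)) if k not in chosen]
    return train, validation
```

With 87 samples and the default fraction of two thirds, this sketch returns 58 training indices and 29 validation indices, matching the set sizes reported in the abstract.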
