数据集载入
1 | # -*- coding: utf-8 -*- |
1 | df.head() |
| Unnamed: 0 | custid | trade_no | bank_card_no | low_volume_percent | middle_volume_percent | take_amount_in_later_12_month_highest | trans_amount_increase_rate_lately | trans_activity_month | trans_activity_day | ... | loans_max_limit | loans_avg_limit | consfin_credit_limit | consfin_credibility | consfin_org_count_current | consfin_product_count | consfin_max_limit | consfin_avg_limit | latest_query_day | loans_latest_day | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 2791858 | 20180507115231274000000023057383 | 卡号1 | 0.01 | 0.99 | 0 | 0.90 | 0.55 | 0.313 | ... | 2900.0 | 1688.0 | 1200.0 | 75.0 | 1.0 | 2.0 | 1200.0 | 1200.0 | 12.0 | 18.0 |
| 1 | 10 | 534047 | 20180507121002192000000023073000 | 卡号1 | 0.02 | 0.94 | 2000 | 1.28 | 1.00 | 0.458 | ... | 3500.0 | 1758.0 | 15100.0 | 80.0 | 5.0 | 6.0 | 22800.0 | 9360.0 | 4.0 | 2.0 |
| 2 | 12 | 2849787 | 20180507125159718000000023114911 | 卡号1 | 0.04 | 0.96 | 0 | 1.00 | 1.00 | 0.114 | ... | 1600.0 | 1250.0 | 4200.0 | 87.0 | 1.0 | 1.0 | 4200.0 | 4200.0 | 2.0 | 6.0 |
| 3 | 13 | 1809708 | 20180507121358683000000388283484 | 卡号1 | 0.00 | 0.96 | 2000 | 0.13 | 0.57 | 0.777 | ... | 3200.0 | 1541.0 | 16300.0 | 80.0 | 5.0 | 5.0 | 30000.0 | 12180.0 | 2.0 | 4.0 |
| 4 | 14 | 2499829 | 20180507115448545000000388205844 | 卡号1 | 0.01 | 0.99 | 0 | 0.46 | 1.00 | 0.175 | ... | 2300.0 | 1630.0 | 8300.0 | 79.0 | 2.0 | 2.0 | 8400.0 | 8250.0 | 22.0 | 120.0 |
5 rows × 90 columns
1 | df.info() |
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4754 entries, 0 to 4753
Data columns (total 90 columns):
Unnamed: 0 4754 non-null int64
custid 4754 non-null int64
trade_no 4754 non-null object
bank_card_no 4754 non-null object
low_volume_percent 4752 non-null float64
middle_volume_percent 4752 non-null float64
take_amount_in_later_12_month_highest 4754 non-null int64
trans_amount_increase_rate_lately 4751 non-null float64
trans_activity_month 4752 non-null float64
trans_activity_day 4752 non-null float64
transd_mcc 4752 non-null float64
trans_days_interval_filter 4746 non-null float64
trans_days_interval 4752 non-null float64
regional_mobility 4752 non-null float64
student_feature 1756 non-null float64
repayment_capability 4754 non-null int64
is_high_user 4754 non-null int64
number_of_trans_from_2011 4752 non-null float64
first_transaction_time 4752 non-null float64
historical_trans_amount 4754 non-null int64
historical_trans_day 4752 non-null float64
rank_trad_1_month 4752 non-null float64
trans_amount_3_month 4754 non-null int64
avg_consume_less_12_valid_month 4752 non-null float64
abs 4754 non-null int64
top_trans_count_last_1_month 4752 non-null float64
avg_price_last_12_month 4754 non-null int64
avg_price_top_last_12_valid_month 4650 non-null float64
reg_preference_for_trad 4752 non-null object
trans_top_time_last_1_month 4746 non-null float64
trans_top_time_last_6_month 4746 non-null float64
consume_top_time_last_1_month 4746 non-null float64
consume_top_time_last_6_month 4746 non-null float64
cross_consume_count_last_1_month 4328 non-null float64
trans_fail_top_count_enum_last_1_month 4738 non-null float64
trans_fail_top_count_enum_last_6_month 4738 non-null float64
trans_fail_top_count_enum_last_12_month 4738 non-null float64
consume_mini_time_last_1_month 4728 non-null float64
max_cumulative_consume_later_1_month 4754 non-null int64
max_consume_count_later_6_month 4746 non-null float64
railway_consume_count_last_12_month 4742 non-null float64
pawns_auctions_trusts_consume_last_1_month 4754 non-null int64
pawns_auctions_trusts_consume_last_6_month 4754 non-null int64
jewelry_consume_count_last_6_month 4742 non-null float64
status 4754 non-null int64
source 4754 non-null object
first_transaction_day 4752 non-null float64
trans_day_last_12_month 4752 non-null float64
id_name 4478 non-null object
apply_score 4450 non-null float64
apply_credibility 4450 non-null float64
query_org_count 4450 non-null float64
query_finance_count 4450 non-null float64
query_cash_count 4450 non-null float64
query_sum_count 4450 non-null float64
latest_query_time 4450 non-null object
latest_one_month_apply 4450 non-null float64
latest_three_month_apply 4450 non-null float64
latest_six_month_apply 4450 non-null float64
loans_score 4457 non-null float64
loans_credibility_behavior 4457 non-null float64
loans_count 4457 non-null float64
loans_settle_count 4457 non-null float64
loans_overdue_count 4457 non-null float64
loans_org_count_behavior 4457 non-null float64
consfin_org_count_behavior 4457 non-null float64
loans_cash_count 4457 non-null float64
latest_one_month_loan 4457 non-null float64
latest_three_month_loan 4457 non-null float64
latest_six_month_loan 4457 non-null float64
history_suc_fee 4457 non-null float64
history_fail_fee 4457 non-null float64
latest_one_month_suc 4457 non-null float64
latest_one_month_fail 4457 non-null float64
loans_long_time 4457 non-null float64
loans_latest_time 4457 non-null object
loans_credit_limit 4457 non-null float64
loans_credibility_limit 4457 non-null float64
loans_org_count_current 4457 non-null float64
loans_product_count 4457 non-null float64
loans_max_limit 4457 non-null float64
loans_avg_limit 4457 non-null float64
consfin_credit_limit 4457 non-null float64
consfin_credibility 4457 non-null float64
consfin_org_count_current 4457 non-null float64
consfin_product_count 4457 non-null float64
consfin_max_limit 4457 non-null float64
consfin_avg_limit 4457 non-null float64
latest_query_day 4450 non-null float64
loans_latest_day 4457 non-null float64
dtypes: float64(70), int64(13), object(7)
memory usage: 3.3+ MB
1 | typedic = {} # 类型字典 |
int64格式共有13个: ['Unnamed: 0', 'custid', 'take_amount_in_later_12_month_highest', 'repayment_capability', 'is_high_user', 'historical_trans_amount', 'trans_amount_3_month', 'abs', 'avg_price_last_12_month', 'max_cumulative_consume_later_1_month', 'pawns_auctions_trusts_consume_last_1_month', 'pawns_auctions_trusts_consume_last_6_month', 'status']
object格式共有7个: ['trade_no', 'bank_card_no', 'reg_preference_for_trad', 'source', 'id_name', 'latest_query_time', 'loans_latest_time']
float64格式共有70个: ['low_volume_percent', 'middle_volume_percent', 'trans_amount_increase_rate_lately', 'trans_activity_month', 'trans_activity_day', 'transd_mcc', 'trans_days_interval_filter', 'trans_days_interval', 'regional_mobility', 'student_feature', 'number_of_trans_from_2011', 'first_transaction_time', 'historical_trans_day', 'rank_trad_1_month', 'avg_consume_less_12_valid_month', 'top_trans_count_last_1_month', 'avg_price_top_last_12_valid_month', 'trans_top_time_last_1_month', 'trans_top_time_last_6_month', 'consume_top_time_last_1_month', 'consume_top_time_last_6_month', 'cross_consume_count_last_1_month', 'trans_fail_top_count_enum_last_1_month', 'trans_fail_top_count_enum_last_6_month', 'trans_fail_top_count_enum_last_12_month', 'consume_mini_time_last_1_month', 'max_consume_count_later_6_month', 'railway_consume_count_last_12_month', 'jewelry_consume_count_last_6_month', 'first_transaction_day', 'trans_day_last_12_month', 'apply_score', 'apply_credibility', 'query_org_count', 'query_finance_count', 'query_cash_count', 'query_sum_count', 'latest_one_month_apply', 'latest_three_month_apply', 'latest_six_month_apply', 'loans_score', 'loans_credibility_behavior', 'loans_count', 'loans_settle_count', 'loans_overdue_count', 'loans_org_count_behavior', 'consfin_org_count_behavior', 'loans_cash_count', 'latest_one_month_loan', 'latest_three_month_loan', 'latest_six_month_loan', 'history_suc_fee', 'history_fail_fee', 'latest_one_month_suc', 'latest_one_month_fail', 'loans_long_time', 'loans_credit_limit', 'loans_credibility_limit', 'loans_org_count_current', 'loans_product_count', 'loans_max_limit', 'loans_avg_limit', 'consfin_credit_limit', 'consfin_credibility', 'consfin_org_count_current', 'consfin_product_count', 'consfin_max_limit', 'consfin_avg_limit', 'latest_query_day', 'loans_latest_day']
1 | df.columns |
Index(['Unnamed: 0', 'custid', 'trade_no', 'bank_card_no',
'low_volume_percent', 'middle_volume_percent',
'take_amount_in_later_12_month_highest',
'trans_amount_increase_rate_lately', 'trans_activity_month',
'trans_activity_day', 'transd_mcc', 'trans_days_interval_filter',
'trans_days_interval', 'regional_mobility', 'student_feature',
'repayment_capability', 'is_high_user', 'number_of_trans_from_2011',
'first_transaction_time', 'historical_trans_amount',
'historical_trans_day', 'rank_trad_1_month', 'trans_amount_3_month',
'avg_consume_less_12_valid_month', 'abs',
'top_trans_count_last_1_month', 'avg_price_last_12_month',
'avg_price_top_last_12_valid_month', 'reg_preference_for_trad',
'trans_top_time_last_1_month', 'trans_top_time_last_6_month',
'consume_top_time_last_1_month', 'consume_top_time_last_6_month',
'cross_consume_count_last_1_month',
'trans_fail_top_count_enum_last_1_month',
'trans_fail_top_count_enum_last_6_month',
'trans_fail_top_count_enum_last_12_month',
'consume_mini_time_last_1_month',
'max_cumulative_consume_later_1_month',
'max_consume_count_later_6_month',
'railway_consume_count_last_12_month',
'pawns_auctions_trusts_consume_last_1_month',
'pawns_auctions_trusts_consume_last_6_month',
'jewelry_consume_count_last_6_month', 'status', 'source',
'first_transaction_day', 'trans_day_last_12_month', 'id_name',
'apply_score', 'apply_credibility', 'query_org_count',
'query_finance_count', 'query_cash_count', 'query_sum_count',
'latest_query_time', 'latest_one_month_apply',
'latest_three_month_apply', 'latest_six_month_apply', 'loans_score',
'loans_credibility_behavior', 'loans_count', 'loans_settle_count',
'loans_overdue_count', 'loans_org_count_behavior',
'consfin_org_count_behavior', 'loans_cash_count',
'latest_one_month_loan', 'latest_three_month_loan',
'latest_six_month_loan', 'history_suc_fee', 'history_fail_fee',
'latest_one_month_suc', 'latest_one_month_fail', 'loans_long_time',
'loans_latest_time', 'loans_credit_limit', 'loans_credibility_limit',
'loans_org_count_current', 'loans_product_count', 'loans_max_limit',
'loans_avg_limit', 'consfin_credit_limit', 'consfin_credibility',
'consfin_org_count_current', 'consfin_product_count',
'consfin_max_limit', 'consfin_avg_limit', 'latest_query_day',
'loans_latest_day'],
dtype='object')
可视化各特征
1 | name_list_1 = [ |
1 | f, ax = plt.subplots(5, 3, figsize=(20, 20)) |

1 | f, ax = plt.subplots(5, 3, figsize=(20, 20)) |

1 | f, ax = plt.subplots(5, 3, figsize=(20, 20)) |

1 | f, ax = plt.subplots(5, 3, figsize=(20, 20)) |

1 | f, ax = plt.subplots(5, 3, figsize=(20, 20)) |

1 | f, ax = plt.subplots(3, 3, figsize=(20, 20)) |

从以上分布图可以看出,连续型数据的特征基本上符合正态分布,指数分布或者泊松分布等,但在[‘rank_trad_1_month’,’trans_top_time_last_1_month’,’trans_top_time_last_6_month’,’consume_top_time_last_1_month’,’consume_top_time_last_6_month’,’loans_credit_limit’,’loans_credibility_limit’,’loans_max_limit’,’consfin_credibility’]这几个特征上出现数据于最左最右位置会出现骤升。从业务层面考虑,暂时未得出满意结论,待确认特征取数逻辑后再分析。
还有,’avg_price_last_12_month’,’consfin_credit_limit’这两个特征虽然是连续型特征,但由图上看,出现标签分段,考虑继续探索,查看是否存在分段。
另外几个离散型特征,如’reg_preference_for_trad’,’regional_mobility’等,在标签status上区别明显,感觉是对模型有用的特征。
1 | sns.boxplot(y = 'consfin_credit_limit',x = 'status',data = df) |

1 | sns.boxplot(y = 'avg_price_last_12_month',x = 'status',data = df) |

1 | sns.boxplot(y = 'rank_trad_1_month',x = 'status',data = df) |

数据预处理
为什么需要处理缺失值?
不同缺失值的处理方式在一定程度上影响了特征提取、建模和模型训练缺失值太多,可以尝试着直接删除,如果不删除,处理不好,可能会引来噪声缺失值较少,少于某一缺失率时,直接删除又会带来信息的损失,此时可以采取适当的填充方式。
缺失率大于多少时应当抛弃该特征?
一般为70%,但是还要分析该特征与训练目标的重要程度
缺失值填充有哪些方法?
插值填充
- 特殊值
- 均值、加权均值(缺点:降低数据的方差,即随机性)
- 中位数
- 众数(多用在类别变数)
- …
插补法
- 随机插补法:从总体中随机抽取某个样本代替缺失样本
- 多重插补法:通过变量之间的关系对缺失数据进行预测,利用蒙特卡洛方法生成多个完整的数据集,在对这些数据集进行分析,最后对分析结果进行汇总处理
- 热平台插补:指在非缺失数据集中找到一个与缺失值所在样本相似的样本(匹配样本),利用其中的观测值对缺失值进行插补
- 拉格朗日差值法
- 牛顿插值法
列的维度上,如果是连续性,就使用平均值插补,如果是离散性,就使用众数来插补; 行的维度上,引入预测模型,可考虑辅助回归,通过变量间的关系来预测缺失数据
丢弃
如果行和列的缺失达到一定的比例,建议放弃整行或整列数据
模型法
更多时候我们会基于已有的其他字段,将缺失字段作为目标变量进行预测,从而得到最为可能的补全值。如果带有缺失值的列是数值变量,采用回归模型补全;如果是分类变量,则采用分类模型补全
专家补全
对于少量且具有重要意义的数据记录,专家补足也是非常重要的一种途径
不处理
- 当缺失值对模型的影响不大时,直接在包含空值的数据上进行数据挖掘
- 很多模型对于缺失值有容忍度或灵活的处理方法,因此在预处理阶段可以不做处理
- 树模型对于缺失值的敏感度较低,大部分时候可以在数据有缺失时使用
- 涉及到距离度量(distance measurement)时,如计算两个点之间的距离,缺失数据就变得比较重要。因为涉及到“距离”这个概念,那么缺失值处理不当就会导致效果很差,如K近邻算法(KNN)和支持向量机(SVM)
- 线性模型的代价函数(loss function)往往涉及到距离(distance)的计算,计算预测值和真实值之间的差别,这容易导致对缺失值敏感
- 神经网络的鲁棒性强,对于缺失数据不是非常敏感,但一般没有那么多数据可供使用。
- 贝叶斯模型对于缺失数据也比较稳定,数据量很小的时候首推贝叶斯模型。
时间序列
依情况处理,主要看单独时间字段或时间字段与某些字段的组合属性对目标分析的作用程度,再采取相应方式来进行特征提取比如:可以将时间字段与其他字段属性进行组合,分析每天、每周、每月或特点星期几等情况下特征数据频率信息,总的来说还是得看分析得目标
通常看到的情况是,除了考虑时间序列这个单独的特征外,往往是将时间序列和具有时间属性的特征联合起来分析,查看组合特征的对所需要分析的内容的影响
异常值和离群值
大多数的参数统计数值如均值、标准差、相关系数等,以及基于这些参数的统计分析均对离群值高度敏感。因此离群值的存在会对数据分析造成极大影响
判断
- 用分位数搞定,看跟平均值的偏差超过几倍标准差
- LOF算法
- describe的时候加一个箱型图
处理
- 对数转换
- 缩尾
- 截尾
- 插值
无用特征删除
将所有object格式的改为数值格式,其中有部分无用特征则可直接删除
由上可看出需处理包含:[‘trade_no’, ‘bank_card_no’, ‘reg_preference_for_trad’, ‘source’, ‘id_name’, ‘latest_query_time’, ‘loans_latest_time’]
1 | df['source'].value_counts(dropna = False) |
xs 4754
Name: source, dtype: int64
1 | to_drop = ['trade_no', 'bank_card_no', 'student_feature','cross_consume_count_last_1_month', 'id_name', 'source'] |
缺失填充
1 | df['reg_preference_for_trad'].value_counts(dropna = False) |
一线城市 3403
三线城市 1064
境外 150
二线城市 131
其他城市 4
NaN 2
Name: reg_preference_for_trad, dtype: int64
1 | df['reg_preference_for_trad'].fillna('其他城市',inplace = True) |
一线城市 3403
三线城市 1064
境外 150
二线城市 131
其他城市 6
Name: reg_preference_for_trad, dtype: int64
1 | df.isnull().sum().sort_values(ascending = False) |
latest_three_month_apply 304
query_finance_count 304
latest_six_month_apply 304
latest_query_time 304
query_sum_count 304
query_cash_count 304
latest_one_month_apply 304
query_org_count 304
apply_credibility 304
apply_score 304
latest_query_day 304
latest_three_month_loan 297
loans_score 297
loans_credibility_behavior 297
loans_count 297
loans_settle_count 297
loans_overdue_count 297
loans_org_count_behavior 297
consfin_org_count_behavior 297
loans_cash_count 297
latest_one_month_loan 297
loans_latest_day 297
latest_six_month_loan 297
loans_product_count 297
history_suc_fee 297
consfin_avg_limit 297
consfin_product_count 297
consfin_org_count_current 297
consfin_credibility 297
consfin_credit_limit 297
...
trans_amount_increase_rate_lately 3
trans_days_interval 2
regional_mobility 2
number_of_trans_from_2011 2
middle_volume_percent 2
transd_mcc 2
trans_activity_day 2
trans_activity_month 2
low_volume_percent 2
historical_trans_day 2
first_transaction_time 2
first_transaction_day 2
rank_trad_1_month 2
avg_consume_less_12_valid_month 2
trans_day_last_12_month 2
top_trans_count_last_1_month 2
max_cumulative_consume_later_1_month 0
custid 0
status 0
take_amount_in_later_12_month_highest 0
pawns_auctions_trusts_consume_last_6_month 0
pawns_auctions_trusts_consume_last_1_month 0
reg_preference_for_trad 0
historical_trans_amount 0
avg_price_last_12_month 0
abs 0
repayment_capability 0
is_high_user 0
trans_amount_3_month 0
Unnamed: 0 0
Length: 84, dtype: int64
特征间探索
1 | corr = df[:].corr() |

1 | corr |
| Unnamed: 0 | custid | low_volume_percent | middle_volume_percent | take_amount_in_later_12_month_highest | trans_amount_increase_rate_lately | trans_activity_month | trans_activity_day | transd_mcc | trans_days_interval_filter | ... | loans_max_limit | loans_avg_limit | consfin_credit_limit | consfin_credibility | consfin_org_count_current | consfin_product_count | consfin_max_limit | consfin_avg_limit | latest_query_day | loans_latest_day | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unnamed: 0 | 1.000000 | -0.007628 | 0.002861 | -0.022844 | -0.019728 | 0.014076 | -0.021715 | -0.001027 | -0.006975 | -0.021338 | ... | 0.012189 | -0.004457 | 0.005967 | 0.007746 | 0.012062 | 0.009583 | 0.004020 | 0.008860 | 0.020774 | 0.026387 |
| custid | -0.007628 | 1.000000 | 0.031430 | 0.090599 | -0.030282 | -0.001299 | -0.118811 | -0.216062 | -0.147485 | 0.144513 | ... | -0.236185 | -0.152017 | -0.093180 | -0.041979 | -0.129297 | -0.139197 | -0.115324 | -0.076767 | 0.099208 | 0.098628 |
| low_volume_percent | 0.002861 | 0.031430 | 1.000000 | -0.151504 | -0.066047 | 0.005331 | 0.023895 | -0.093754 | -0.095780 | 0.054680 | ... | -0.140173 | -0.128145 | -0.083987 | -0.032534 | -0.074851 | -0.067226 | -0.096647 | -0.094627 | 0.069301 | 0.047500 |
| middle_volume_percent | -0.022844 | 0.090599 | -0.151504 | 1.000000 | -0.135050 | 0.001608 | -0.031743 | -0.055491 | -0.094736 | 0.010146 | ... | -0.145599 | -0.110230 | -0.153241 | 0.003266 | -0.007761 | -0.014847 | -0.169575 | -0.174126 | -0.001628 | -0.003253 |
| take_amount_in_later_12_month_highest | -0.019728 | -0.030282 | -0.066047 | -0.135050 | 1.000000 | -0.006699 | 0.060997 | 0.112582 | 0.181435 | -0.055318 | ... | 0.071798 | 0.038880 | 0.039328 | 0.015841 | 0.021387 | 0.021337 | 0.054433 | 0.040634 | -0.000570 | -0.010231 |
| trans_amount_increase_rate_lately | 0.014076 | -0.001299 | 0.005331 | 0.001608 | -0.006699 | 1.000000 | 0.015389 | -0.003061 | -0.011644 | 0.013220 | ... | 0.007119 | 0.000119 | 0.028868 | 0.006174 | -0.019901 | -0.019603 | 0.007123 | 0.040865 | -0.001457 | 0.027221 |
| trans_activity_month | -0.021715 | -0.118811 | 0.023895 | -0.031743 | 0.060997 | 0.015389 | 1.000000 | 0.063373 | 0.069272 | -0.091630 | ... | 0.061847 | 0.037275 | -0.036822 | -0.000400 | 0.003332 | 0.005534 | -0.022388 | -0.043764 | -0.024161 | -0.014880 |
| trans_activity_day | -0.001027 | -0.216062 | -0.093754 | -0.055491 | 0.112582 | -0.003061 | 0.063373 | 1.000000 | 0.674547 | -0.491304 | ... | 0.445155 | 0.202242 | 0.096622 | 0.181157 | 0.553486 | 0.551140 | 0.224509 | 0.085255 | -0.188965 | -0.299110 |
| transd_mcc | -0.006975 | -0.147485 | -0.095780 | -0.094736 | 0.181435 | -0.011644 | 0.069272 | 0.674547 | 1.000000 | -0.407126 | ... | 0.313715 | 0.165959 | 0.066250 | 0.131905 | 0.371197 | 0.374660 | 0.156951 | 0.057663 | -0.120330 | -0.211593 |
| trans_days_interval_filter | -0.021338 | 0.144513 | 0.054680 | 0.010146 | -0.055318 | 0.013220 | -0.091630 | -0.491304 | -0.407126 | 1.000000 | ... | -0.222970 | -0.118522 | -0.055205 | -0.142221 | -0.287044 | -0.286212 | -0.111345 | -0.045801 | 0.069560 | 0.163433 |
| trans_days_interval | -0.006297 | 0.200145 | 0.062421 | -0.000383 | -0.053329 | 0.000521 | -0.151579 | -0.542118 | -0.447308 | 0.690819 | ... | -0.271875 | -0.163049 | -0.034716 | -0.102588 | -0.285654 | -0.286510 | -0.105000 | -0.024740 | 0.102343 | 0.178628 |
| regional_mobility | 0.011694 | -0.134926 | -0.042939 | 0.009671 | 0.024089 | -0.026960 | -0.219911 | 0.143697 | 0.114936 | -0.124350 | ... | 0.099696 | 0.062712 | 0.092796 | 0.054084 | 0.147120 | 0.145322 | 0.091150 | 0.079245 | -0.051839 | -0.107139 |
| student_feature | 0.014227 | 0.026639 | -0.007327 | 0.018852 | -0.013524 | -0.002682 | 0.004966 | 0.011653 | 0.005921 | -0.005412 | ... | 0.021982 | 0.007634 | 0.001860 | 0.009800 | 0.020368 | 0.017777 | 0.000096 | 0.004799 | -0.020274 | -0.031571 |
| repayment_capability | -0.006726 | -0.026254 | -0.041522 | -0.178436 | 0.055174 | -0.000572 | 0.024737 | 0.189494 | 0.115434 | -0.048998 | ... | 0.128100 | 0.066266 | 0.052900 | 0.007068 | 0.046341 | 0.046471 | 0.063720 | 0.052120 | -0.020964 | -0.049951 |
| is_high_user | -0.002995 | -0.004640 | 0.018546 | -0.029172 | 0.068763 | -0.002017 | 0.044231 | -0.034766 | -0.080892 | 0.146678 | ... | -0.006073 | 0.014809 | 0.001286 | -0.000752 | -0.020497 | -0.017732 | 0.007764 | 0.013155 | -0.011908 | -0.015299 |
| number_of_trans_from_2011 | 0.016355 | -0.276856 | -0.031668 | -0.084995 | 0.056183 | -0.013609 | -0.198322 | 0.286353 | 0.262917 | -0.181413 | ... | 0.195444 | 0.142227 | 0.105527 | 0.066191 | 0.212621 | 0.215217 | 0.138342 | 0.088776 | -0.087938 | -0.140635 |
| historical_trans_amount | 0.002346 | -0.179652 | -0.117333 | -0.255846 | 0.102959 | -0.002527 | 0.016522 | 0.526801 | 0.392433 | -0.254563 | ... | 0.324514 | 0.165218 | 0.137554 | 0.073505 | 0.288324 | 0.288420 | 0.201670 | 0.138920 | -0.087267 | -0.156048 |
| historical_trans_day | 0.004245 | -0.345875 | -0.068345 | -0.116128 | 0.112908 | -0.006987 | 0.052108 | 0.855832 | 0.603820 | -0.424616 | ... | 0.430792 | 0.223623 | 0.111262 | 0.142560 | 0.472101 | 0.473890 | 0.219427 | 0.097498 | -0.171007 | -0.267034 |
| rank_trad_1_month | 0.011515 | 0.116798 | 0.102996 | 0.139914 | -0.089095 | 0.028153 | -0.017110 | -0.517923 | -0.481565 | 0.345540 | ... | -0.235028 | -0.121879 | -0.120020 | -0.133787 | -0.328297 | -0.327323 | -0.180749 | -0.115613 | 0.139572 | 0.243659 |
| trans_amount_3_month | -0.003329 | -0.069252 | -0.063526 | -0.189438 | 0.074279 | -0.003598 | 0.010864 | 0.293735 | 0.250140 | -0.139654 | ... | 0.186909 | 0.084463 | 0.065007 | 0.033856 | 0.151395 | 0.153457 | 0.094556 | 0.063275 | -0.044342 | -0.099134 |
| avg_consume_less_12_valid_month | -0.001742 | -0.087905 | -0.042251 | -0.154257 | 0.031360 | 0.006201 | 0.014532 | 0.081550 | 0.194729 | -0.017535 | ... | 0.054239 | 0.010832 | 0.024610 | 0.026514 | 0.103333 | 0.101638 | 0.044864 | 0.011955 | -0.069481 | -0.090181 |
| abs | -0.004830 | -0.059940 | -0.059972 | -0.247023 | 0.081978 | -0.003474 | 0.038276 | 0.306292 | 0.160103 | -0.054408 | ... | 0.152274 | 0.105389 | 0.072716 | 0.009934 | 0.071542 | 0.074592 | 0.102727 | 0.090651 | -0.042772 | -0.065586 |
| top_trans_count_last_1_month | 0.021195 | 0.101220 | 0.037765 | 0.059410 | -0.063713 | 0.029701 | -0.026923 | -0.503568 | -0.475823 | 0.381422 | ... | -0.199205 | -0.082819 | -0.071977 | -0.138804 | -0.323863 | -0.324179 | -0.128756 | -0.058697 | 0.140451 | 0.247943 |
| avg_price_last_12_month | 0.011454 | -0.090752 | -0.286213 | -0.556220 | 0.129908 | -0.004808 | -0.006858 | 0.068127 | 0.067431 | -0.003254 | ... | 0.204154 | 0.186553 | 0.183277 | -0.017057 | 0.019580 | 0.017570 | 0.179284 | 0.205885 | -0.033386 | -0.020140 |
| avg_price_top_last_12_valid_month | -0.018637 | 0.124158 | 0.509637 | 0.369498 | -0.123202 | -0.030556 | 0.011839 | -0.138470 | -0.145411 | 0.064464 | ... | -0.316923 | -0.285338 | -0.172475 | -0.002851 | -0.087684 | -0.080339 | -0.197348 | -0.195194 | 0.079399 | 0.062422 |
| trans_top_time_last_1_month | -0.007772 | -0.003624 | -0.023020 | -0.015316 | 0.028547 | -0.021723 | 0.018931 | 0.092705 | 0.143620 | -0.127328 | ... | 0.039372 | 0.019334 | 0.005990 | 0.060674 | 0.059770 | 0.056594 | 0.021747 | 0.000811 | -0.030739 | -0.069952 |
| trans_top_time_last_6_month | 0.012229 | 0.063470 | 0.064055 | 0.025356 | -0.079647 | 0.032999 | -0.016856 | -0.519642 | -0.400228 | 0.541678 | ... | -0.227717 | -0.111353 | -0.055343 | -0.151634 | -0.336984 | -0.334724 | -0.106172 | -0.042912 | 0.088539 | 0.199473 |
| consume_top_time_last_1_month | -0.012425 | -0.013828 | -0.030325 | -0.024816 | 0.022475 | -0.020937 | 0.021517 | 0.126348 | 0.162267 | -0.155562 | ... | 0.053634 | 0.023311 | 0.018442 | 0.073859 | 0.083782 | 0.080835 | 0.038965 | 0.012452 | -0.044495 | -0.078152 |
| consume_top_time_last_6_month | 0.010297 | 0.071615 | 0.026853 | 0.033989 | -0.062038 | 0.032106 | -0.009154 | -0.506739 | -0.371122 | 0.489491 | ... | -0.228478 | -0.109085 | -0.055605 | -0.128403 | -0.331808 | -0.329305 | -0.105383 | -0.042412 | 0.083349 | 0.194636 |
| cross_consume_count_last_1_month | -0.019612 | -0.015749 | 0.056716 | -0.076003 | 0.042310 | -0.004524 | -0.001588 | 0.217779 | 0.236684 | -0.098826 | ... | 0.101269 | 0.038369 | 0.032169 | 0.024935 | 0.129515 | 0.131031 | 0.041163 | 0.025667 | -0.026850 | -0.081532 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| loans_score | -0.014716 | -0.006276 | -0.007019 | -0.039519 | -0.003928 | 0.010481 | -0.020418 | -0.211471 | -0.141698 | 0.105373 | ... | -0.145502 | -0.071344 | 0.076157 | -0.043698 | -0.242594 | -0.249384 | -0.004854 | 0.085289 | 0.035587 | -0.019444 |
| loans_credibility_behavior | -0.000191 | -0.023587 | 0.037414 | -0.031636 | -0.017790 | -0.003788 | 0.002694 | -0.059242 | -0.028579 | 0.016791 | ... | -0.109592 | -0.059538 | 0.113228 | 0.202241 | 0.132995 | 0.141105 | 0.112848 | 0.110999 | 0.014235 | -0.034587 |
| loans_count | 0.008971 | -0.104423 | -0.103643 | 0.093372 | 0.034853 | -0.011057 | 0.010844 | 0.713700 | 0.433028 | -0.340780 | ... | 0.454809 | 0.167113 | 0.097746 | 0.189068 | 0.669975 | 0.658099 | 0.240708 | 0.067635 | -0.194495 | -0.388366 |
| loans_settle_count | 0.008509 | -0.106395 | -0.115451 | 0.084747 | 0.044691 | -0.011398 | 0.006203 | 0.723008 | 0.440667 | -0.346732 | ... | 0.466131 | 0.175771 | 0.109141 | 0.183622 | 0.633814 | 0.618343 | 0.239062 | 0.079620 | -0.198341 | -0.374859 |
| loans_overdue_count | -0.002123 | -0.047870 | 0.003258 | 0.055341 | -0.018406 | -0.009483 | 0.026175 | 0.319103 | 0.193574 | -0.145065 | ... | 0.189896 | 0.071337 | 0.001805 | 0.089767 | 0.389047 | 0.396295 | 0.109103 | -0.020359 | -0.069116 | -0.166293 |
| loans_org_count_behavior | 0.002662 | -0.107997 | -0.113256 | 0.086219 | 0.020780 | -0.012629 | 0.014655 | 0.698811 | 0.451913 | -0.330187 | ... | 0.462677 | 0.133557 | 0.065915 | 0.231726 | 0.799828 | 0.784147 | 0.257087 | 0.038651 | -0.216354 | -0.402870 |
| consfin_org_count_behavior | 0.012062 | -0.129297 | -0.074851 | -0.007761 | 0.021387 | -0.019901 | 0.003332 | 0.553486 | 0.371197 | -0.287044 | ... | 0.357549 | 0.137506 | 0.102057 | 0.297390 | 1.000000 | 0.988027 | 0.345697 | 0.074170 | -0.185038 | -0.399885 |
| loans_cash_count | -0.002987 | -0.078110 | -0.115566 | 0.123857 | 0.016962 | -0.006482 | 0.018475 | 0.662281 | 0.420938 | -0.298750 | ... | 0.443326 | 0.108990 | 0.034866 | 0.156549 | 0.555002 | 0.539896 | 0.164961 | 0.012515 | -0.197164 | -0.337008 |
| latest_one_month_loan | -0.001009 | -0.082016 | -0.034938 | -0.017258 | 0.008845 | -0.010482 | 0.010030 | 0.251331 | 0.152255 | -0.109048 | ... | 0.216387 | 0.078718 | 0.057899 | 0.087776 | 0.328566 | 0.334659 | 0.136317 | 0.039233 | -0.167327 | -0.564125 |
| latest_three_month_loan | 0.009841 | -0.102432 | -0.041141 | -0.007416 | 0.009611 | -0.012746 | 0.024060 | 0.277217 | 0.180567 | -0.132316 | ... | 0.235019 | 0.083134 | 0.055062 | 0.099953 | 0.410173 | 0.415173 | 0.158521 | 0.038337 | -0.172112 | -0.556912 |
| latest_six_month_loan | 0.013859 | -0.084090 | -0.080426 | 0.080167 | 0.034499 | -0.007440 | 0.023376 | 0.618386 | 0.381795 | -0.294128 | ... | 0.395462 | 0.110447 | 0.067585 | 0.169560 | 0.653119 | 0.646769 | 0.214958 | 0.040151 | -0.201667 | -0.457269 |
| history_suc_fee | 0.005751 | -0.154554 | -0.068576 | 0.038832 | 0.036804 | -0.013452 | 0.008331 | 0.707752 | 0.432147 | -0.361614 | ... | 0.439947 | 0.158091 | 0.129951 | 0.194162 | 0.682134 | 0.674063 | 0.258122 | 0.092018 | -0.197019 | -0.392292 |
| history_fail_fee | 0.005633 | -0.039117 | 0.009093 | 0.055486 | -0.019751 | -0.006480 | 0.033715 | 0.269878 | 0.148111 | -0.115664 | ... | 0.164983 | 0.047106 | 0.008807 | 0.098387 | 0.399674 | 0.411005 | 0.130966 | -0.011496 | -0.060527 | -0.199490 |
| latest_one_month_suc | 0.000168 | -0.078523 | -0.025131 | -0.022327 | 0.022699 | -0.009579 | 0.009873 | 0.228821 | 0.146654 | -0.111359 | ... | 0.194397 | 0.056937 | 0.023565 | 0.092642 | 0.349219 | 0.360903 | 0.119473 | 0.014118 | -0.167496 | -0.459412 |
| latest_one_month_fail | 0.013939 | -0.041356 | 0.021126 | -0.018844 | -0.004671 | -0.005436 | 0.017951 | 0.133230 | 0.080658 | -0.053043 | ... | 0.104707 | 0.041028 | -0.000725 | 0.053177 | 0.228943 | 0.244784 | 0.078403 | -0.011238 | -0.051815 | -0.232811 |
| loans_long_time | 0.000992 | -0.216323 | -0.074834 | 0.040822 | 0.012660 | -0.002678 | 0.030914 | 0.384281 | 0.301560 | -0.317888 | ... | 0.368329 | 0.268476 | 0.144168 | 0.229963 | 0.345921 | 0.337877 | 0.198060 | 0.134422 | -0.134214 | -0.214274 |
| loans_credit_limit | -0.006194 | -0.160582 | -0.138446 | -0.086706 | 0.050185 | 0.013118 | 0.036172 | 0.229054 | 0.177929 | -0.146462 | ... | 0.745694 | 0.901755 | 0.085200 | 0.057358 | 0.151520 | 0.152471 | 0.123290 | 0.087213 | -0.111170 | -0.157736 |
| loans_credibility_limit | -0.033850 | -0.000850 | -0.051545 | 0.090824 | -0.025575 | 0.003868 | -0.005513 | 0.161363 | 0.154721 | -0.083218 | ... | 0.328845 | 0.456704 | -0.022646 | -0.029875 | 0.115503 | 0.114720 | 0.021164 | -0.036374 | -0.127687 | -0.138313 |
| loans_org_count_current | -0.002987 | -0.078110 | -0.115566 | 0.123857 | 0.016962 | -0.006482 | 0.018475 | 0.662281 | 0.420938 | -0.298750 | ... | 0.443326 | 0.108990 | 0.034866 | 0.156549 | 0.555002 | 0.539896 | 0.164961 | 0.012515 | -0.197164 | -0.337008 |
| loans_product_count | -0.001941 | -0.072159 | -0.114296 | 0.122741 | 0.018572 | -0.007575 | 0.019401 | 0.656939 | 0.416980 | -0.295409 | ... | 0.441316 | 0.108962 | 0.031936 | 0.155744 | 0.553607 | 0.539424 | 0.163650 | 0.009397 | -0.199549 | -0.348389 |
| loans_max_limit | 0.012189 | -0.236185 | -0.140173 | -0.145599 | 0.071798 | 0.007119 | 0.061847 | 0.445155 | 0.313715 | -0.222970 | ... | 1.000000 | 0.716175 | 0.108760 | 0.101659 | 0.357549 | 0.354231 | 0.208389 | 0.109425 | -0.190074 | -0.282204 |
| loans_avg_limit | -0.004457 | -0.152017 | -0.128145 | -0.110230 | 0.038880 | 0.000119 | 0.037275 | 0.202242 | 0.165959 | -0.118522 | ... | 0.716175 | 1.000000 | 0.091117 | 0.031452 | 0.137506 | 0.137271 | 0.130307 | 0.098641 | -0.099086 | -0.144835 |
| consfin_credit_limit | 0.005967 | -0.093180 | -0.083987 | -0.153241 | 0.039328 | 0.028868 | -0.036822 | 0.096622 | 0.066250 | -0.055205 | ... | 0.108760 | 0.091117 | 1.000000 | 0.241837 | 0.102057 | 0.081338 | 0.788637 | 0.901470 | -0.047756 | -0.110230 |
| consfin_credibility | 0.007746 | -0.041979 | -0.032534 | 0.003266 | 0.015841 | 0.006174 | -0.000400 | 0.181157 | 0.131905 | -0.142221 | ... | 0.101659 | 0.031452 | 0.241837 | 1.000000 | 0.297390 | 0.288275 | 0.216232 | 0.275861 | -0.085131 | -0.230984 |
| consfin_org_count_current | 0.012062 | -0.129297 | -0.074851 | -0.007761 | 0.021387 | -0.019901 | 0.003332 | 0.553486 | 0.371197 | -0.287044 | ... | 0.357549 | 0.137506 | 0.102057 | 0.297390 | 1.000000 | 0.988027 | 0.345697 | 0.074170 | -0.185038 | -0.399885 |
| consfin_product_count | 0.009583 | -0.139197 | -0.067226 | -0.014847 | 0.021337 | -0.019603 | 0.005534 | 0.551140 | 0.374660 | -0.286212 | ... | 0.354231 | 0.137271 | 0.081338 | 0.288275 | 0.988027 | 1.000000 | 0.329687 | 0.054292 | -0.183685 | -0.402011 |
| consfin_max_limit | 0.004020 | -0.115324 | -0.096647 | -0.169575 | 0.054433 | 0.007123 | -0.022388 | 0.224509 | 0.156951 | -0.111345 | ... | 0.208389 | 0.130307 | 0.788637 | 0.216232 | 0.345697 | 0.329687 | 1.000000 | 0.790580 | -0.096890 | -0.193945 |
| consfin_avg_limit | 0.008860 | -0.076767 | -0.094627 | -0.174126 | 0.040634 | 0.040865 | -0.043764 | 0.085255 | 0.057663 | -0.045801 | ... | 0.109425 | 0.098641 | 0.901470 | 0.275861 | 0.074170 | 0.054292 | 0.790580 | 1.000000 | -0.042131 | -0.088072 |
| latest_query_day | 0.020774 | 0.099208 | 0.069301 | -0.001628 | -0.000570 | -0.001457 | -0.024161 | -0.188965 | -0.120330 | 0.069560 | ... | -0.190074 | -0.099086 | -0.047756 | -0.085131 | -0.185038 | -0.183685 | -0.096890 | -0.042131 | 1.000000 | 0.230078 |
| loans_latest_day | 0.026387 | 0.098628 | 0.047500 | -0.003253 | -0.010231 | 0.027221 | -0.014880 | -0.299110 | -0.211593 | 0.163433 | ... | -0.282204 | -0.144835 | -0.110230 | -0.230984 | -0.399885 | -0.402011 | -0.193945 | -0.088072 | 0.230078 | 1.000000 |
82 rows × 82 columns
参考: