Overview

Dataset statistics

Number of variables: 20
Number of observations: 1025
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 302
Duplicate rows (%): 29.5%
Total size in memory: 69.2 KiB
Average record size in memory: 69.1 B
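
The duplicate figures above depend on how duplicates are counted, and this section does not say which convention the report applies. The following is a minimal sketch, using only the Python standard library and a small made-up table, of two common conventions: the number of distinct rows that occur more than once, and the number of surplus copies beyond the first occurrence. The helper name `duplicate_summary` and the toy rows are illustrative assumptions, not part of the report.

```python
from collections import Counter


def duplicate_summary(rows):
    """Summarise duplicates in a list of rows (each row is a tuple of values)."""
    counts = Counter(rows)
    # Distinct row patterns that appear more than once.
    repeated_patterns = sum(1 for n in counts.values() if n > 1)
    # Copies beyond the first occurrence of each repeated pattern.
    surplus_copies = sum(n - 1 for n in counts.values() if n > 1)
    return {
        "rows": len(rows),
        "repeated_patterns": repeated_patterns,
        "surplus_copies": surplus_copies,
        "repeated_patterns_pct": round(100 * repeated_patterns / len(rows), 1),
    }


# Hypothetical toy data: (age, sex_male, diagnosis)
toy_rows = [
    (52, 1, 0),
    (52, 1, 0),
    (52, 1, 0),
    (61, 0, 1),
    (61, 0, 1),
    (45, 1, 1),
]
summary = duplicate_summary(toy_rows)
print(summary)

# The reported percentage follows from the reported counts: 302 of 1025 rows.
reported_pct = round(100 * 302 / 1025, 1)
print(reported_pct)
```

On the toy data the two conventions give different answers (2 repeated patterns versus 3 surplus copies), which is why it is worth checking which one a tool reports before dropping rows.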

Variable types

Numeric: 5
Categorical: 15

Alerts

Duplicates: Dataset has 302 (29.5%) duplicate rows
High correlation: chest_pain_type_non-anginal pain is highly correlated with chest_pain_type_typical angina
High correlation: chest_pain_type_typical angina is highly correlated with chest_pain_type_non-anginal pain and 1 other field
High correlation: thalassemia_normal is highly correlated with thalassemia_reversable defect and 1 other field
High correlation: thalassemia_reversable defect is highly correlated with thalassemia_normal
High correlation: diagnosis is highly correlated with chest_pain_type_typical angina and 1 other field
High correlation: diagnosis is highly correlated with thalassemia_normal and 1 other field
High correlation: age is highly correlated with max_heart_rate_achieved and 1 other field
High correlation: max_heart_rate_achieved is highly correlated with age and 4 other fields
High correlation: st_depression is highly correlated with rest_ecg_left ventricular hypertrophy
High correlation: num_major_vessels is highly correlated with age
High correlation: sex_male is highly correlated with thalassemia_normal
High correlation: chest_pain_type_atypical angina is highly correlated with chest_pain_type_typical angina
High correlation: chest_pain_type_non-anginal pain is highly correlated with chest_pain_type_typical angina
High correlation: chest_pain_type_typical angina is highly correlated with max_heart_rate_achieved and 5 other fields
High correlation: rest_ecg_left ventricular hypertrophy is highly correlated with st_depression
High correlation: exercise_induced_angina_yes is highly correlated with max_heart_rate_achieved and 2 other fields
High correlation: st_slope_flat is highly correlated with max_heart_rate_achieved and 1 other field
High correlation: thalassemia_normal is highly correlated with sex_male and 3 other fields
High correlation: thalassemia_reversable defect is highly correlated with thalassemia_normal and 1 other field
High correlation: diagnosis is highly correlated with max_heart_rate_achieved and 5 other fields
Zeros: st_depression has 329 (32.1%) zeros
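
Several of these alerts pair indicator columns that appear to come from the same underlying variable. As a rough illustration, assume that chest_pain_type_non-anginal pain and chest_pain_type_typical angina are one-hot indicators of one variable, so that no row has both set to 1. This is an assumption based on the column names, not something the report states. Under that assumption the Pearson correlation follows from the marginal counts alone, because the joint count of (1, 1) is zero. The function name below is hypothetical, and the sketch does not reproduce the report's own correlation measures or its alert threshold.

```python
import math


def exclusive_indicator_corr(ones_a, ones_b, n):
    """Pearson correlation of two 0/1 columns that are never both 1."""
    p, q = ones_a / n, ones_b / n
    covariance = -p * q  # E[A*B] = 0 when the indicators are mutually exclusive
    return covariance / math.sqrt(p * (1 - p) * q * (1 - q))


# Counts of ones reported for the two chest-pain columns (1025 observations).
r = exclusive_indicator_corr(284, 497, 1025)
print(round(r, 2))
```

The result is negative and moderately large in magnitude, which is the expected pattern for indicator columns derived from one categorical variable.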

Reproduction

Analysis started: 2022-09-04 01:57:14.235198
Analysis finished: 2022-09-04 01:57:21.144530
Duration: 6.91 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
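
A report of this kind can be regenerated with the same software version. The sketch below assumes pandas and pandas-profiling are installed and that the data live in a CSV file. The function name `build_profile` and both file names are placeholders, and the configuration the original run used is not reproduced here.

```python
def build_profile(csv_path, html_path, title="Dataset profile"):
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imports are local so the function can be defined even when the
    # third-party packages are not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title=title)
    report.to_file(html_path)
    return html_path


# Hypothetical usage; both file names are placeholders:
# build_profile("data.csv", "data_profile.html")
```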

Variables

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 41
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.43414634
Minimum: 29
Maximum: 77
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.1 KiB

Quantile statistics

Minimum: 29
5-th percentile: 39
Q1: 48
Median: 56
Q3: 61
95-th percentile: 68
Maximum: 77
Range: 48
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.072290233
Coefficient of variation (CV): 0.1666654268
Kurtosis: -0.5256178129
Mean: 54.43414634
Median Absolute Deviation (MAD): 6
Skewness: -0.2488659017
Sum: 55795
Variance: 82.30645008
Monotonicity: Not monotonic
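
Several of the figures above are derived from others, so they can be cross-checked by hand. The sketch below recomputes the range, IQR, coefficient of variation, variance, and mean for age from the reported values only. It uses the usual textbook definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared), which are assumed rather than taken from the report.

```python
# Figures copied from the age summary.
minimum, maximum = 29, 77
q1, q3 = 48, 61
mean = 54.43414634
std = 9.072290233
total, n_obs = 55795, 1025

value_range = maximum - minimum  # reported Range: 48
iqr = q3 - q1                    # reported IQR: 13
cv = std / mean                  # reported CV: 0.1666654268
variance = std ** 2              # reported Variance: 82.30645008
mean_from_sum = total / n_obs    # reported Mean: 54.43414634

print(value_range, iqr)
print(round(cv, 6), round(variance, 4), round(mean_from_sum, 6))
```

The same checks apply to the other numeric variables by substituting their reported values.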
Histogram with fixed size bins (bins=41)

Most frequent values

Value              Count  Frequency (%)
58                 68     6.6%
57                 57     5.6%
54                 53     5.2%
59                 46     4.5%
52                 43     4.2%
51                 39     3.8%
56                 39     3.8%
62                 37     3.6%
60                 37     3.6%
44                 36     3.5%
Other values (31)  570    55.6%

Smallest values

Value  Count  Frequency (%)
29     4      0.4%
34     6      0.6%
35     15     1.5%
37     6      0.6%
38     12     1.2%
39     14     1.4%
40     11     1.1%
41     32     3.1%
42     26     2.5%
43     26     2.5%

Largest values

Value  Count  Frequency (%)
77     3      0.3%
76     3      0.3%
74     3      0.3%
71     11     1.1%
70     14     1.4%
69     9      0.9%
68     12     1.2%
67     31     3.0%
66     25     2.4%
65     27     2.6%

resting_blood_pressure
Real number (ℝ≥0)

Distinct: 49
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 131.6117073
Minimum: 94
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.1 KiB

Quantile statistics

Minimum: 94
5-th percentile: 108
Q1: 120
Median: 130
Q3: 140
95-th percentile: 163.2
Maximum: 200
Range: 106
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 17.51671801
Coefficient of variation (CV): 0.1330939197
Kurtosis: 0.9912207431
Mean: 131.6117073
Median Absolute Deviation (MAD): 10
Skewness: 0.7397682261
Sum: 134902
Variance: 306.8354097
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=49)

Most frequent values

Value              Count  Frequency (%)
120                128    12.5%
130                123    12.0%
140                107    10.4%
110                64     6.2%
150                55     5.4%
138                45     4.4%
128                39     3.8%
125                38     3.7%
160                36     3.5%
112                30     2.9%
Other values (39)  360    35.1%

Smallest values

Value  Count  Frequency (%)
94     7      0.7%
100    14     1.4%
101    3      0.3%
102    6      0.6%
104    3      0.3%
105    9      0.9%
106    3      0.3%
108    21     2.0%
110    64     6.2%
112    30     2.9%

Largest values

Value  Count  Frequency (%)
200    4      0.4%
192    3      0.3%
180    10     1.0%
178    7      0.7%
174    3      0.3%
172    3      0.3%
170    15     1.5%
165    4      0.4%
164    3      0.3%
160    36     3.5%

cholesterol
Real number (ℝ≥0)

Distinct: 152
Distinct (%): 14.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 246
Minimum: 126
Maximum: 564
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.1 KiB

Quantile statistics

Minimum: 126
5-th percentile: 175
Q1: 211
Median: 240
Q3: 275
95-th percentile: 330
Maximum: 564
Range: 438
Interquartile range (IQR): 64

Descriptive statistics

Standard deviation: 51.59251021
Coefficient of variation (CV): 0.2097256512
Kurtosis: 3.996803049
Mean: 246
Median Absolute Deviation (MAD): 33
Skewness: 1.074072778
Sum: 252150
Variance: 2661.787109
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value               Count  Frequency (%)
204                 21     2.0%
234                 21     2.0%
197                 19     1.9%
212                 18     1.8%
254                 17     1.7%
269                 16     1.6%
177                 14     1.4%
240                 14     1.4%
282                 14     1.4%
239                 13     1.3%
Other values (142)  858    83.7%

Smallest values

Value  Count  Frequency (%)
126    3      0.3%
131    3      0.3%
141    3      0.3%
149    8      0.8%
157    4      0.4%
160    3      0.3%
164    3      0.3%
166    4      0.4%
167    4      0.4%
168    3      0.3%

Largest values

Value  Count  Frequency (%)
564    3      0.3%
417    3      0.3%
409    3      0.3%
407    4      0.4%
394    3      0.3%
360    3      0.3%
354    3      0.3%
353    4      0.4%
342    4      0.4%
341    4      0.4%

max_heart_rate_achieved
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 91
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 149.1141463
Minimum: 71
Maximum: 202
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.1 KiB

Quantile statistics

Minimum: 71
5-th percentile: 108
Q1: 132
Median: 152
Q3: 166
95-th percentile: 182
Maximum: 202
Range: 131
Interquartile range (IQR): 34

Descriptive statistics

Standard deviation: 23.00572375
Coefficient of variation (CV): 0.1542826372
Kurtosis: -0.08882248803
Mean: 149.1141463
Median Absolute Deviation (MAD): 16
Skewness: -0.5137771771
Sum: 152842
Variance: 529.2633251
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Most frequent values

Value              Count  Frequency (%)
162                35     3.4%
160                31     3.0%
163                29     2.8%
173                28     2.7%
152                28     2.7%
144                26     2.5%
132                26     2.5%
150                25     2.4%
125                25     2.4%
143                23     2.2%
Other values (81)  749    73.1%

Smallest values

Value  Count  Frequency (%)
71     4      0.4%
88     3      0.3%
90     3      0.3%
95     4      0.4%
96     7      0.7%
97     4      0.4%
99     3      0.3%
103    8      0.8%
105    10     1.0%
106    3      0.3%

Largest values

Value  Count  Frequency (%)
202    4      0.4%
195    3      0.3%
194    3      0.3%
192    3      0.3%
190    4      0.4%
188    3      0.3%
187    3      0.3%
186    6      0.6%
185    3      0.3%
184    3      0.3%

st_depression
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 40
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.071512195
Minimum: 0
Maximum: 6.2
Zeros: 329
Zeros (%): 32.1%
Negative: 0
Negative (%): 0.0%
Memory size: 8.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.8
Q3: 1.8
95-th percentile: 3.4
Maximum: 6.2
Range: 6.2
Interquartile range (IQR): 1.8

Descriptive statistics

Standard deviation: 1.175053255
Coefficient of variation (CV): 1.096630781
Kurtosis: 1.314470889
Mean: 1.071512195
Median Absolute Deviation (MAD): 0.8
Skewness: 1.210899388
Sum: 1098.3
Variance: 1.380750152
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=40)

Most frequent values

Value              Count  Frequency (%)
0                  329    32.1%
1.2                58     5.7%
1                  51     5.0%
0.6                47     4.6%
0.8                44     4.3%
1.4                44     4.3%
1.6                37     3.6%
0.2                37     3.6%
1.8                36     3.5%
2                  32     3.1%
Other values (30)  310    30.2%

Smallest values

Value  Count  Frequency (%)
0      329    32.1%
0.1    23     2.2%
0.2    37     3.6%
0.3    10     1.0%
0.4    30     2.9%
0.5    15     1.5%
0.6    47     4.6%
0.7    3      0.3%
0.8    44     4.3%
0.9    10     1.0%

Largest values

Value  Count  Frequency (%)
6.2    3      0.3%
5.6    4      0.4%
4.4    4      0.4%
4.2    6      0.6%
4      12     1.2%
3.8    4      0.4%
3.6    15     1.5%
3.5    3      0.3%
3.4    10     1.0%
3.2    8      0.8%

num_major_vessels
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 0
3rd row: 0
4th row: 1
5th row: 3

Common Values

Value  Count  Frequency (%)
0      578    56.4%
1      226    22.0%
2      134    13.1%
3      69     6.7%
4      18     1.8%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

sex_male
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
1      713    69.6%
0      312    30.4%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

chest_pain_type_atypical angina
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      858    83.7%
1      167    16.3%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

chest_pain_type_non-anginal pain
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      741    72.3%
1      284    27.7%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

chest_pain_type_typical angina
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      528    51.5%
1      497    48.5%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
1      872    85.1%
0      153    14.9%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

rest_ecg_left ventricular hypertrophy
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      1010   98.5%
1      15     1.5%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

rest_ecg_normal
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      528    51.5%
1      497    48.5%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

exercise_induced_angina_yes
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      680    66.3%
1      345    33.7%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

st_slope_flat
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      543    53.0%
1      482    47.0%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

st_slope_upslope
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      951    92.8%
1      74     7.2%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      961    93.8%
1      64     6.2%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

thalassemia_normal
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
1      544    53.1%
0      481    46.9%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

thalassemia_reversable defect
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1025
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      615    60.0%
1      410    40.0%

Figure: Histogram of lengths of the category
Figure: Category Frequency Plot

Most occurring characters: same counts as Common Values, overall and per category, script, and block.
Most occurring categories: Decimal Number, 1025 (100.0%)
Most occurring scripts: Common, 1025 (100.0%)
Most occurring blocks: ASCII, 1025 (100.0%)

diagnosis
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size8.1 KiB
1
526 
0
499 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters1025
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
1526
51.3%
0499
48.7%

Length

2022-09-03T21:57:24.755626image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
Histogram of lengths of the category

Category Frequency Plot

2022-09-03T21:57:24.834694image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
ValueCountFrequency (%)
1526
51.3%
0499
48.7%

Most occurring characters

ValueCountFrequency (%)
1526
51.3%
0499
48.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1025 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 526 | 51.3%
0 | 499 | 48.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1025 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 526 | 51.3%
0 | 499 | 48.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1025 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 526 | 51.3%
0 | 499 | 48.7%

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
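
As a rough illustration of the calculation just described, the sketch below ranks both variables and then divides the covariance of the ranks by the product of their standard deviations. Giving tied values their average rank is an assumption (the report does not state its tie handling), and the function names are illustrative rather than part of any library.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their std devs.

    Assumes both inputs have the same length and are not constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

For a monotonic but nonlinear relationship such as `spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])`, the result is 1 up to floating-point rounding, because only the ordering of the values matters.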

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a perfectly linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
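
The formula above can be sketched in a few lines of plain Python. The function name `pearson_r` is illustrative, and the example values are the ages and maximum heart rates as read from the first five rows of the report's sample table. The final check illustrates the invariance under a change of location and positive scale.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their std devs.

    Assumes equal-length inputs that are not constant.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Values from the first five sample rows (age, max_heart_rate_achieved).
age = [52, 53, 70, 61, 62]
max_heart_rate = [168, 155, 125, 161, 106]

r = pearson_r(age, max_heart_rate)

# Shifting and rescaling one variable (positive scale) leaves r unchanged.
age_rescaled = [2 * a + 10 for a in age]
r_rescaled = pearson_r(age_rescaled, max_heart_rate)
print(abs(r - r_rescaled) < 1e-12)
```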

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
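
Taken literally, the description above corresponds to the variant usually called τ-a. The sketch below implements that variant and counts tied pairs as neither concordant nor discordant. Statistical libraries often default to a tie-adjusted variant instead, so results on data with ties may differ from the report's. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs.

    Pairs tied in x or in y count as neither concordant nor discordant.
    Assumes equal-length inputs with at least two observations.
    """
    concordant = 0
    discordant = 0
    total_pairs = 0
    for i, j in combinations(range(len(x)), 2):
        total_pairs += 1
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / total_pairs
```

For example, `kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])` has six pairs, of which five are concordant and one is discordant, giving (5 - 1) / 6.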

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
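
A minimal sketch of a bias-corrected Cramér's V along the lines of Bergsma's correction is shown below, computed from a contingency table of counts. It uses the plain Pearson chi-squared statistic without a continuity correction, which is an assumption; the report's own implementation may differ in this detail. The function name and the example tables are hypothetical.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts).

    Assumes the table holds more than one observation in total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic (no continuity correction: an assumption).
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Hypothetical 2x2 tables: independent variables vs. perfectly associated ones.
independent = [[10, 10], [10, 10]]
perfect = [[50, 0], [0, 50]]
```

With these tables, `cramers_v_corrected(independent)` is 0 and `cramers_v_corrected(perfect)` is 1 up to floating-point rounding, matching the two ends of the range described above.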

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

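index | age | resting_blood_pressure | cholesterol | max_heart_rate_achieved | st_depression | num_major_vessels | sex_male | chest_pain_type_atypical angina | chest_pain_type_non-anginal pain | chest_pain_type_typical angina | fasting_blood_sugar_lower than 120mg/ml | rest_ecg_left ventricular hypertrophy | rest_ecg_normal | exercise_induced_angina_yes | st_slope_flat | st_slope_upslope | thalassemia_fixed defect | thalassemia_normal | thalassemia_reversable defect | diagnosis
0 | 52 | 125 | 212 | 168 | 1.0 | 2 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
1 | 53 | 140 | 203 | 155 | 3.1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0
2 | 70 | 145 | 174 | 125 | 2.6 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0
3 | 61 | 148 | 203 | 161 | 0.0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
4 | 62 | 138 | 294 | 106 | 1.9 | 3 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0
5 | 58 | 100 | 248 | 122 | 1.0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1
6 | 58 | 114 | 318 | 140 | 4.4 | 3 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0
7 | 55 | 160 | 289 | 145 | 0.8 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0
8 | 46 | 120 | 249 | 144 | 0.8 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
9 | 54 | 122 | 286 | 116 | 3.2 | 2 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0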

Last rows

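index | age | resting_blood_pressure | cholesterol | max_heart_rate_achieved | st_depression | num_major_vessels | sex_male | chest_pain_type_atypical angina | chest_pain_type_non-anginal pain | chest_pain_type_typical angina | fasting_blood_sugar_lower than 120mg/ml | rest_ecg_left ventricular hypertrophy | rest_ecg_normal | exercise_induced_angina_yes | st_slope_flat | st_slope_upslope | thalassemia_fixed defect | thalassemia_normal | thalassemia_reversable defect | diagnosis
1015 | 58 | 128 | 216 | 131 | 2.2 | 3 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0
1016 | 65 | 138 | 282 | 174 | 1.4 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0
1017 | 53 | 123 | 282 | 95 | 2.0 | 2 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0
1018 | 41 | 110 | 172 | 158 | 0.0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0
1019 | 47 | 112 | 204 | 143 | 0.1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1
1020 | 59 | 140 | 221 | 164 | 0.0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1
1021 | 60 | 125 | 258 | 141 | 2.8 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0
1022 | 47 | 110 | 275 | 118 | 1.0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0
1023 | 50 | 110 | 254 | 159 | 0.0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1
1024 | 54 | 120 | 188 | 113 | 1.4 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0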

Duplicate rows

Most frequently occurring

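index | age | resting_blood_pressure | cholesterol | max_heart_rate_achieved | st_depression | num_major_vessels | sex_male | chest_pain_type_atypical angina | chest_pain_type_non-anginal pain | chest_pain_type_typical angina | fasting_blood_sugar_lower than 120mg/ml | rest_ecg_left ventricular hypertrophy | rest_ecg_normal | exercise_induced_angina_yes | st_slope_flat | st_slope_upslope | thalassemia_fixed defect | thalassemia_normal | thalassemia_reversable defect | diagnosis | # duplicates
10 | 38 | 138 | 175 | 173 | 0.0 | 4 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 8
0 | 29 | 130 | 204 | 202 | 0.0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 4
3 | 35 | 120 | 198 | 130 | 1.6 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 4
4 | 35 | 122 | 192 | 174 | 0.0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 4
6 | 35 | 138 | 183 | 182 | 1.4 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 4
9 | 38 | 120 | 231 | 182 | 3.8 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 4
12 | 39 | 118 | 219 | 140 | 1.2 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 4
13 | 39 | 138 | 220 | 152 | 0.0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 4
15 | 40 | 110 | 167 | 114 | 2.0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 4
17 | 40 | 152 | 223 | 181 | 0.0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 4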
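
A table like the one above can be approximated by counting how often each complete row occurs and keeping the rows seen more than once. The sketch below does this in plain Python on a small hypothetical list of rows; how the report itself defines the "# duplicates" column is not shown in this section, so treating it as the total number of occurrences is an assumption. The function name is illustrative.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return (row, occurrences) for rows that appear more than once,
    ordered from most to least frequent."""
    counts = Counter(tuple(row) for row in rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    repeated.sort(key=lambda item: item[1], reverse=True)
    return repeated[:top]


# Hypothetical three-column rows, for illustration only.
rows = [
    (38, 138, 175),
    (29, 130, 204),
    (38, 138, 175),
    (38, 138, 175),
    (29, 130, 204),
    (35, 120, 198),
]
duplicates = most_frequent_duplicates(rows)
```

Here `duplicates` holds the row `(38, 138, 175)` with a count of 3, followed by `(29, 130, 204)` with a count of 2; the row that occurs only once is left out.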