Overview

Dataset statistics

Number of variables: 13
Number of observations: 6099
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 619.6 KiB
Average record size in memory: 104.0 B
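The average record size follows from the two figures above it: total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming KiB means 1024 bytes (the function name is illustrative, not part of the report):

```python
KIB = 1024  # bytes per kibibyte (assumed unit convention)

def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the mean number of bytes used per row."""
    return total_kib * KIB / n_rows

avg_bytes = average_record_size(619.6, 6099)
print(f"{avg_bytes:.1f} B")
```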

Variable types

DateTime: 2
Categorical: 2
Numeric: 9

Alerts

PT08.S1(CO) is highly correlated with PT08.S2(NMHC) and 3 other fields [High correlation]
PT08.S2(NMHC) is highly correlated with PT08.S1(CO) and 3 other fields [High correlation]
PT08.S3(NOx) is highly correlated with PT08.S1(CO) and 2 other fields [High correlation]
PT08.S4(NO2) is highly correlated with PT08.S1(CO) and 3 other fields [High correlation]
PT08.S5(O3) is highly correlated with PT08.S1(CO) and 2 other fields [High correlation]
T is highly correlated with PT08.S4(NO2) and 2 other fields [High correlation]
RH is highly correlated with T [High correlation]
AH is highly correlated with PT08.S4(NO2) and 1 other field [High correlation]
PT08.S1(CO) is highly correlated with PT08.S2(NMHC) and 3 other fields [High correlation]
PT08.S2(NMHC) is highly correlated with PT08.S1(CO) and 3 other fields [High correlation]
PT08.S3(NOx) is highly correlated with PT08.S1(CO) and 2 other fields [High correlation]
PT08.S4(NO2) is highly correlated with PT08.S1(CO) and 4 other fields [High correlation]
PT08.S5(O3) is highly correlated with PT08.S1(CO) and 3 other fields [High correlation]
T is highly correlated with PT08.S4(NO2) and 2 other fields [High correlation]
RH is highly correlated with T [High correlation]
AH is highly correlated with PT08.S4(NO2) and 1 other field [High correlation]
PT08.S1(CO) is highly correlated with PT08.S2(NMHC) and 2 other fields [High correlation]
PT08.S2(NMHC) is highly correlated with PT08.S1(CO) and 3 other fields [High correlation]
PT08.S3(NOx) is highly correlated with PT08.S1(CO) and 2 other fields [High correlation]
PT08.S4(NO2) is highly correlated with PT08.S2(NMHC) and 1 other field [High correlation]
PT08.S5(O3) is highly correlated with PT08.S1(CO) and 2 other fields [High correlation]
AH is highly correlated with PT08.S4(NO2) [High correlation]
Month is highly correlated with PT08.S4(NO2) and 2 other fields [High correlation]
Hour is highly correlated with PT08.S2(NMHC) and 1 other field [High correlation]
PT08.S1(CO) is highly correlated with PT08.S2(NMHC) and 3 other fields [High correlation]
PT08.S2(NMHC) is highly correlated with Hour and 4 other fields [High correlation]
PT08.S3(NOx) is highly correlated with PT08.S1(CO) and 3 other fields [High correlation]
PT08.S4(NO2) is highly correlated with Month and 6 other fields [High correlation]
PT08.S5(O3) is highly correlated with PT08.S1(CO) and 3 other fields [High correlation]
T is highly correlated with Month and 3 other fields [High correlation]
RH is highly correlated with Hour and 1 other field [High correlation]
AH is highly correlated with Month and 2 other fields [High correlation]
DateTime has unique values [Unique]
Hour has 287 (4.7%) zeros [Zeros]

Reproduction

Analysis started: 2022-09-01 23:47:45.574097
Analysis finished: 2022-09-01 23:47:55.809870
Duration: 10.24 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
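The reported duration is the difference between the two timestamps. A small sketch of that check with the standard library, assuming both timestamps share one time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2022-09-01 23:47:45.574097", FMT)
finished = datetime.strptime("2022-09-01 23:47:55.809870", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```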

Variables

Date
Date

Distinct: 341
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Memory size: 47.8 KiB
Minimum: 2004-03-10 00:00:00
Maximum: 2005-04-04 00:00:00
Histogram with fixed size bins (bins=50)

Month
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 47.8 KiB
March: 1054
May: 543
June: 537
July: 511
January: 501
Other values (7): 2953

Length

Max length: 9
Median length: 7
Mean length: 5.916871618
Min length: 3

Characters and Unicode

Total characters: 36087
Distinct characters: 26
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
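The "categories" counted in this section are Unicode general categories such as Lowercase Letter (`Ll`) and Uppercase Letter (`Lu`). A minimal sketch of such a count using Python's `unicodedata` module; the helper name is illustrative, and this is not necessarily how the report computes it:

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

# "Lu" = Uppercase Letter, "Ll" = Lowercase Letter
print(category_counts(["March", "May"]))
```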

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: March
2nd row: March
3rd row: March
4th row: March
5th row: March

Common Values

Value             Count  Frequency (%)
March             1054   17.3%
May               543    8.9%
June              537    8.8%
July              511    8.4%
January           501    8.2%
April             484    7.9%
February          482    7.9%
November          459    7.5%
December          417    6.8%
September         401    6.6%
Other values (2)  710    11.6%

Length

Histogram of lengths of the category
Value             Count  Frequency (%)
march             1054   17.3%
may               543    8.9%
june              537    8.8%
july              511    8.4%
january           501    8.2%
april             484    7.9%
february          482    7.9%
november          459    7.5%
december          417    6.8%
september         401    6.6%
Other values (2)  710    11.6%

Most occurring characters

Value              Count  Frequency (%)
e                  4727   13.1%
r                  4616   12.8%
a                  3081   8.5%
u                  2779   7.7%
b                  2095   5.8%
y                  2037   5.6%
c                  1807   5.0%
M                  1597   4.4%
J                  1549   4.3%
m                  1277   3.5%
Other values (16)  10522  29.2%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  29988  83.1%
Uppercase Letter  6099   16.9%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
e                 4727   15.8%
r                 4616   15.4%
a                 3081   10.3%
u                 2779   9.3%
b                 2095   7.0%
y                 2037   6.8%
c                 1807   6.0%
m                 1277   4.3%
t                 1111   3.7%
h                 1054   3.5%
Other values (8)  5404   18.0%
Uppercase Letter
Value  Count  Frequency (%)
M      1597   26.2%
J      1549   25.4%
A      858    14.1%
F      482    7.9%
N      459    7.5%
D      417    6.8%
S      401    6.6%
O      336    5.5%

Most occurring scripts

Value  Count  Frequency (%)
Latin  36087  100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  36087  100.0%

Weekday
Categorical

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.8 KiB
Saturday: 1005
Sunday: 946
Monday: 899
Friday: 863
Thursday: 804
Other values (2): 1582

Length

Max length: 9
Median length: 8
Mean length: 7.110345958
Min length: 6

Characters and Unicode

Total characters: 43366
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Wednesday
2nd row: Wednesday
3rd row: Wednesday
4th row: Wednesday
5th row: Wednesday

Common Values

Value      Count  Frequency (%)
Saturday   1005   16.5%
Sunday     946    15.5%
Monday     899    14.7%
Friday     863    14.1%
Thursday   804    13.2%
Tuesday    796    13.1%
Wednesday  786    12.9%
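The Weekday figures can be cross-checked from the counts in this table alone: the frequencies are counts divided by the number of observations, and the total character count and mean length follow from each weekday name's length. A short sketch of that arithmetic (variable names are illustrative):

```python
# Weekday counts copied from the Common Values table.
weekday_counts = {
    "Saturday": 1005, "Sunday": 946, "Monday": 899, "Friday": 863,
    "Thursday": 804, "Tuesday": 796, "Wednesday": 786,
}

n_rows = sum(weekday_counts.values())

# Relative frequency of each category, in percent.
frequency_pct = {day: 100 * count / n_rows for day, count in weekday_counts.items()}

# Each value contributes len(value) characters once per occurrence.
total_characters = sum(len(day) * count for day, count in weekday_counts.items())
mean_length = total_characters / n_rows

for day, pct in frequency_pct.items():
    print(f"{day:<10}{weekday_counts[day]:>6}{pct:>7.1f}%")
print("Total characters:", total_characters)
print(f"Mean length: {mean_length:.6f}")
```

The results agree with the 6099 observations, 43366 total characters and mean length of about 7.11 reported for this variable.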

Length

Histogram of lengths of the category

Category Frequency Plot

Value      Count  Frequency (%)
saturday   1005   16.5%
sunday     946    15.5%
monday     899    14.7%
friday     863    14.1%
thursday   804    13.2%
tuesday    796    13.1%
wednesday  786    12.9%

Most occurring characters

Value             Count  Frequency (%)
a                 7104   16.4%
d                 6885   15.9%
y                 6099   14.1%
u                 3551   8.2%
r                 2672   6.2%
n                 2631   6.1%
s                 2386   5.5%
e                 2368   5.5%
S                 1951   4.5%
T                 1600   3.7%
Other values (7)  6119   14.1%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  37267  85.9%
Uppercase Letter  6099   14.1%

Most frequent character per category

Lowercase Letter
Value             Count  Frequency (%)
a                 7104   19.1%
d                 6885   18.5%
y                 6099   16.4%
u                 3551   9.5%
r                 2672   7.2%
n                 2631   7.1%
s                 2386   6.4%
e                 2368   6.4%
t                 1005   2.7%
o                 899    2.4%
Other values (2)  1667   4.5%
Uppercase Letter
Value  Count  Frequency (%)
S      1951   32.0%
T      1600   26.2%
M      899    14.7%
F      863    14.1%
W      786    12.9%

Most occurring scripts

Value  Count  Frequency (%)
Latin  43366  100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  43366  100.0%

DateTime
Date

UNIQUE

Distinct: 6099
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 47.8 KiB
Minimum: 2004-03-10 18:00:00
Maximum: 2005-04-04 14:00:00
Histogram with fixed size bins (bins=50)

Hour
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 24
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.99294966
Minimum: 0
Maximum: 23
Zeros: 287
Zeros (%): 4.7%
Negative: 0
Negative (%): 0.0%
Memory size: 47.8 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 6
Median: 12
Q3: 18
95th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 6.880900037
Coefficient of variation (CV): 0.5737454279
Kurtosis: -1.10497181
Mean: 11.99294966
Median Absolute Deviation (MAD): 6
Skewness: -0.1144539213
Sum: 73145
Variance: 47.34678531
Monotonicity: Not monotonic
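Several of these statistics are derived from others in the same summary. The sketch below recomputes the range, interquartile range, coefficient of variation and variance for Hour from the reported minimum, maximum, quartiles, mean and standard deviation. It is a consistency check on the published figures, not a recomputation from the raw data:

```python
# Figures copied from the Hour summary.
minimum, maximum = 0, 23
q1, q3 = 6, 18
mean = 11.99294966
std = 6.880900037

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.4f}")
```

The same relations hold for the other numeric variables; for example, the PT08.S1(CO) range of 1000 is 1667 - 667.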
Histogram with fixed size bins (bins=24)
Most frequent values

Value              Count  Frequency (%)
22                 292    4.8%
23                 292    4.8%
0                  287    4.7%
15                 286    4.7%
7                  284    4.7%
12                 283    4.6%
1                  280    4.6%
16                 280    4.6%
13                 279    4.6%
14                 276    4.5%
Other values (14)  3260   53.5%

Smallest values

Value  Count  Frequency (%)
0      287    4.7%
1      280    4.6%
2      269    4.4%
3      25     0.4%
4      145    2.4%
5      260    4.3%
6      274    4.5%
7      284    4.7%
8      231    3.8%
9      246    4.0%

Largest values

Value  Count  Frequency (%)
23     292    4.8%
22     292    4.8%
21     271    4.4%
20     253    4.1%
19     236    3.9%
18     250    4.1%
17     268    4.4%
16     280    4.6%
15     286    4.7%
14     276    4.5%

PT08.S1(CO)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 840
Distinct (%): 13.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1093.204952
Minimum: 667
Maximum: 1667
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.8 KiB

Quantile statistics

Minimum: 667
5th percentile: 835
Q1: 955
Median: 1070
Q3: 1212
95th percentile: 1438
Maximum: 1667
Range: 1000
Interquartile range (IQR): 257

Descriptive statistics

Standard deviation: 182.2290977
Coefficient of variation (CV): 0.1666925286
Kurtosis: -0.3126180085
Mean: 1093.204952
Median Absolute Deviation (MAD): 128
Skewness: 0.5033105253
Sum: 6667457
Variance: 33207.44405
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
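A fixed-size-bin histogram divides the interval from the minimum to the maximum into equally wide bins and counts the values falling in each. The sketch below assumes the common convention in which the right edge belongs to the last bin; the report does not state its binning rule, and the sample values are hypothetical:

```python
def fixed_width_histogram(values, lo, hi, bins):
    """Count values into `bins` equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # A value equal to `hi` is placed in the last bin (assumed convention).
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return width, counts

# Hypothetical sample; 667 and 1667 are the PT08.S1(CO) minimum and maximum.
sample = [667, 686, 687, 973, 1100, 1667]
width, counts = fixed_width_histogram(sample, lo=667, hi=1667, bins=50)
print(width)                             # bin width in sensor units
print(counts[0], counts[1], counts[-1])  # first, second and last bin
```

With a range of 1000 and 50 bins, each PT08.S1(CO) bin is 20 units wide.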
Most frequent values

Value               Count  Frequency (%)
973                 23     0.4%
938                 22     0.4%
1100                22     0.4%
988                 21     0.3%
969                 20     0.3%
1016                20     0.3%
1065                20     0.3%
1111                19     0.3%
984                 19     0.3%
1021                19     0.3%
Other values (830)  5894   96.6%

Smallest values

Value  Count  Frequency (%)
667    1      < 0.1%
683    1      < 0.1%
692    1      < 0.1%
695    2      < 0.1%
703    1      < 0.1%
729    1      < 0.1%
732    2      < 0.1%
738    1      < 0.1%
740    1      < 0.1%
741    1      < 0.1%

Largest values

Value  Count  Frequency (%)
1667   2      < 0.1%
1664   1      < 0.1%
1657   1      < 0.1%
1651   1      < 0.1%
1643   1      < 0.1%
1642   1      < 0.1%
1636   1      < 0.1%
1633   2      < 0.1%
1625   1      < 0.1%
1621   1      < 0.1%

PT08.S2(NMHC)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 978
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 928.6655189
Minimum: 440
Maximum: 1504
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.8 KiB

Quantile statistics

Minimum: 440
5th percentile: 601
Q1: 758.5
Median: 913
Q3: 1089
95th percentile: 1319
Maximum: 1504
Range: 1064
Interquartile range (IQR): 330.5

Descriptive statistics

Standard deviation: 220.6129523
Coefficient of variation (CV): 0.237559108
Kurtosis: -0.6364182336
Mean: 928.6655189
Median Absolute Deviation (MAD): 165
Skewness: 0.2717299101
Sum: 5663931
Variance: 48670.07472
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value               Count  Frequency (%)
880                 20     0.3%
776                 19     0.3%
800                 18     0.3%
896                 17     0.3%
803                 17     0.3%
985                 17     0.3%
849                 16     0.3%
931                 16     0.3%
962                 16     0.3%
826                 16     0.3%
Other values (968)  5927   97.2%

Smallest values

Value  Count  Frequency (%)
440    1      < 0.1%
449    1      < 0.1%
454    1      < 0.1%
459    1      < 0.1%
460    1      < 0.1%
465    2      < 0.1%
466    1      < 0.1%
470    2      < 0.1%
474    1      < 0.1%
476    1      < 0.1%

Largest values

Value  Count  Frequency (%)
1504   1      < 0.1%
1503   1      < 0.1%
1501   1      < 0.1%
1500   1      < 0.1%
1499   1      < 0.1%
1497   1      < 0.1%
1496   1      < 0.1%
1495   1      < 0.1%
1493   1      < 0.1%
1492   2      < 0.1%

PT08.S3(NOx)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 904
Distinct (%): 14.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 820.3277586
Minimum: 360
Maximum: 1374
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.8 KiB

Quantile statistics

Minimum: 360
5th percentile: 525
Q1: 675.5
Median: 800
Q3: 944
95th percentile: 1185
Maximum: 1374
Range: 1014
Interquartile range (IQR): 268.5

Descriptive statistics

Standard deviation: 197.1205218
Coefficient of variation (CV): 0.2402948331
Kurtosis: -0.2934934982
Mean: 820.3277586
Median Absolute Deviation (MAD): 133
Skewness: 0.4288411019
Sum: 5003179
Variance: 38856.50013
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value               Count  Frequency (%)
705                 19     0.3%
765                 19     0.3%
800                 19     0.3%
733                 19     0.3%
846                 19     0.3%
737                 18     0.3%
751                 18     0.3%
702                 18     0.3%
793                 18     0.3%
830                 17     0.3%
Other values (894)  5915   97.0%

Smallest values

Value  Count  Frequency (%)
360    1      < 0.1%
370    1      < 0.1%
381    1      < 0.1%
384    1      < 0.1%
396    1      < 0.1%
404    1      < 0.1%
407    1      < 0.1%
410    1      < 0.1%
415    1      < 0.1%
417    1      < 0.1%

Largest values

Value  Count  Frequency (%)
1374   1      < 0.1%
1373   2      < 0.1%
1370   2      < 0.1%
1368   1      < 0.1%
1366   3      < 0.1%
1365   1      < 0.1%
1364   1      < 0.1%
1363   1      < 0.1%
1361   3      < 0.1%
1359   1      < 0.1%

PT08.S4(NO2)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1405
Distinct (%): 23.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1430.273323
Minimum: 601
Maximum: 2337
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.8 KiB

Quantile statistics

Minimum: 601
5th percentile: 880
Q1: 1192
Median: 1444
Q3: 1662
95th percentile: 1968.1
Maximum: 2337
Range: 1736
Interquartile range (IQR): 470

Descriptive statistics

Standard deviation: 329.5066174
Coefficient of variation (CV): 0.2303801742
Kurtosis: -0.5563477879
Mean: 1430.273323
Median Absolute Deviation (MAD): 233
Skewness: -0.01741131692
Sum: 8723237
Variance: 108574.6109
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value                Count  Frequency (%)
1488                 15     0.2%
1418                 15     0.2%
1552                 14     0.2%
1539                 14     0.2%
1580                 13     0.2%
1490                 13     0.2%
1307                 13     0.2%
1467                 13     0.2%
1382                 13     0.2%
1374                 13     0.2%
Other values (1395)  5963   97.8%

Smallest values

Value  Count  Frequency (%)
601    1      < 0.1%
605    1      < 0.1%
621    1      < 0.1%
637    1      < 0.1%
640    1      < 0.1%
642    1      < 0.1%
647    1      < 0.1%
652    1      < 0.1%
655    1      < 0.1%
660    1      < 0.1%

Largest values

Value  Count  Frequency (%)
2337   1      < 0.1%
2332   1      < 0.1%
2319   1      < 0.1%
2316   2      < 0.1%
2311   1      < 0.1%
2306   1      < 0.1%
2305   1      < 0.1%
2301   1      < 0.1%
2288   1      < 0.1%
2283   1      < 0.1%

PT08.S5(O3)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1399
Distinct (%): 22.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1007.403837
Minimum: 288
Maximum: 2108
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.8 KiB

Quantile statistics

Minimum: 288
5th percentile: 501
Q1: 756
Median: 979
Q3: 1238
95th percentile: 1598.1
Maximum: 2108
Range: 1820
Interquartile range (IQR): 482

Descriptive statistics

Standard deviation: 334.890392
Coefficient of variation (CV): 0.332429141
Kurtosis: -0.4601943517
Mean: 1007.403837
Median Absolute Deviation (MAD): 240
Skewness: 0.3322983599
Sum: 6144156
Variance: 112151.5747
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value                Count  Frequency (%)
836                  18     0.3%
825                  15     0.2%
905                  15     0.2%
826                  14     0.2%
949                  14     0.2%
926                  14     0.2%
807                  14     0.2%
940                  13     0.2%
891                  13     0.2%
1019                 13     0.2%
Other values (1389)  5956   97.7%

Smallest values

Value  Count  Frequency (%)
288    1      < 0.1%
307    1      < 0.1%
310    1      < 0.1%
313    2      < 0.1%
322    2      < 0.1%
326    1      < 0.1%
328    1      < 0.1%
332    2      < 0.1%
341    1      < 0.1%
342    1      < 0.1%

Largest values

Value  Count  Frequency (%)
2108   1      < 0.1%
2026   1      < 0.1%
2023   1      < 0.1%
2021   2      < 0.1%
2020   2      < 0.1%
2016   1      < 0.1%
2008   1      < 0.1%
2004   1      < 0.1%
1999   1      < 0.1%
1972   1      < 0.1%

T
Real number (ℝ)

HIGH CORRELATION

Distinct: 414
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.18083292
Minimum: -1.9
Maximum: 41.1
Zeros: 1
Zeros (%): < 0.1%
Negative: 8
Negative (%): 0.1%
Memory size: 47.8 KiB

Quantile statistics

Minimum: -1.9
5th percentile: 4.4
Q1: 11.8
Median: 17.6
Q3: 24.2
95th percentile: 34.5
Maximum: 41.1
Range: 43
Interquartile range (IQR): 12.4

Descriptive statistics

Standard deviation: 8.858076248
Coefficient of variation (CV): 0.4872205958
Kurtosis: -0.4899169847
Mean: 18.18083292
Median Absolute Deviation (MAD): 6.2
Skewness: 0.3076622934
Sum: 110884.9
Variance: 78.46551482
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value               Count  Frequency (%)
21.3                42     0.7%
20.8                39     0.6%
20.2                39     0.6%
13.5                38     0.6%
12                  36     0.6%
17.8                35     0.6%
13.4                35     0.6%
16.3                35     0.6%
19.3                34     0.6%
12.3                34     0.6%
Other values (404)  5732   94.0%

Smallest values

Value  Count  Frequency (%)
-1.9   1      < 0.1%
-1.3   2      < 0.1%
-1.2   1      < 0.1%
-0.6   1      < 0.1%
-0.5   1      < 0.1%
-0.2   1      < 0.1%
-0.1   1      < 0.1%
0      1      < 0.1%
0.2    1      < 0.1%
0.3    3      < 0.1%

Largest values

Value  Count  Frequency (%)
41.1   3      < 0.1%
41     1      < 0.1%
40.9   1      < 0.1%
40.6   2      < 0.1%
40.5   1      < 0.1%
40.4   5      0.1%
40.3   4      0.1%
40.2   3      < 0.1%
40.1   6      0.1%
40     2      < 0.1%

RH
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 734
Distinct (%): 12.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.07504509
Minimum: 9.2
Maximum: 88.7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.8 KiB

Quantile statistics

Minimum: 9.2
5th percentile: 20
Q1: 34.3
Median: 47.9
Q3: 61.45
95th percentile: 77.2
Maximum: 88.7
Range: 79.5
Interquartile range (IQR): 27.15

Descriptive statistics

Standard deviation: 17.42229095
Coefficient of variation (CV): 0.3623978078
Kurtosis: -0.8709640956
Mean: 48.07504509
Median Absolute Deviation (MAD): 13.6
Skewness: 0.03965543715
Sum: 293209.7
Variance: 303.536222
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value               Count  Frequency (%)
47.8                21     0.3%
39.4                20     0.3%
49.4                20     0.3%
34.5                19     0.3%
44.1                19     0.3%
43                  19     0.3%
42.8                19     0.3%
53.1                18     0.3%
45.9                18     0.3%
58.4                18     0.3%
Other values (724)  5908   96.9%

Smallest values

Value  Count  Frequency (%)
9.2    1      < 0.1%
9.3    1      < 0.1%
9.8    1      < 0.1%
9.9    2      < 0.1%
10.4   1      < 0.1%
11.1   1      < 0.1%
11.6   1      < 0.1%
12.3   1      < 0.1%
12.6   2      < 0.1%
12.7   1      < 0.1%

Largest values

Value  Count  Frequency (%)
88.7   1      < 0.1%
87.1   1      < 0.1%
87     1      < 0.1%
86.5   2      < 0.1%
86     1      < 0.1%
85.7   2      < 0.1%
85.5   1      < 0.1%
85.4   2      < 0.1%
85.3   1      < 0.1%
85.1   2      < 0.1%

AH
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 4949
Distinct (%): 81.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9931203968
Minimum: 0.1847
Maximum: 2.0297
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 47.8 KiB

Quantile statistics

Minimum: 0.1847
5th percentile: 0.38559
Q1: 0.70425
Median: 0.9688
Q3: 1.2666
95th percentile: 1.70552
Maximum: 2.0297
Range: 1.845
Interquartile range (IQR): 0.56235

Descriptive statistics

Standard deviation: 0.3982713569
Coefficient of variation (CV): 0.4010302861
Kurtosis: -0.5950051195
Mean: 0.9931203968
Median Absolute Deviation (MAD): 0.2801
Skewness: 0.2641157285
Sum: 6057.0413
Variance: 0.1586200737
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values

Value                Count  Frequency (%)
0.9722               5      0.1%
0.8736               5      0.1%
0.9129               4      0.1%
1.3689               4      0.1%
1.0451               4      0.1%
0.8193               4      0.1%
0.8394               4      0.1%
0.9385               4      0.1%
0.9462               4      0.1%
0.9033               4      0.1%
Other values (4939)  6057   99.3%

Smallest values

Value   Count  Frequency (%)
0.1847  1      < 0.1%
0.1862  1      < 0.1%
0.191   1      < 0.1%
0.1975  1      < 0.1%
0.1988  1      < 0.1%
0.2029  1      < 0.1%
0.2031  1      < 0.1%
0.2062  1      < 0.1%
0.2086  1      < 0.1%
0.2157  1      < 0.1%

Largest values

Value   Count  Frequency (%)
2.0297  1      < 0.1%
2.029   1      < 0.1%
2.0224  1      < 0.1%
2.019   1      < 0.1%
2.0184  1      < 0.1%
2.0183  1      < 0.1%
2.0181  1      < 0.1%
2.0177  1      < 0.1%
2.0127  1      < 0.1%
2.012   1      < 0.1%

Interactions

[Figure: 9 × 9 grid of pairwise interaction plots for the numeric variables (81 panels)]

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
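The recipe above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. The helper names and sample data are illustrative; giving tied values the mean of their ranks is a common convention assumed here, not something the report states:

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def covariance(a, b):
    """Population covariance of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their std deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    return covariance(rx, ry) / sqrt(covariance(rx, rx) * covariance(ry, ry))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # monotonic but nonlinear in x
print(spearman_rho(x, y))
```

Because y increases whenever x does, the ranks coincide and ρ reaches its maximum even though the relationship is not linear.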

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
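A minimal sketch of this definition, with a check of the invariance property: shifting and rescaling each variable separately (with positive scale factors) leaves r unchanged. The data are hypothetical and not taken from the report:

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Hypothetical, roughly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_transformed)
```

A negative scale factor on one variable would flip the sign of r while leaving its magnitude unchanged.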

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
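The pair-counting definition translates directly into a short quadratic-time sketch. This is the simple variant described above (often called tau-a), in which tied pairs count as neither concordant nor discordant. Library implementations such as SciPy's default to tau-b, which adjusts the denominator for ties, so results can differ when ties are present. The data are hypothetical:

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total pairs; ties count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1      # both variables order the pair the same way
        elif direction < 0:
            discordant += 1      # the variables order the pair oppositely
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
print(kendall_tau(x, y))  # 5 concordant pairs, 1 discordant pair, 6 pairs in total
```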

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples, so this report uses the bias-corrected measure proposed by Bergsma in 2013.
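As a rough illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table using the correction as it is commonly stated: subtract (r-1)(c-1)/(n-1) from φ² and shrink the row and column counts accordingly. It assumes every row and column total is positive, uses hypothetical tables, and is not taken from the report's own code:

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, c = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (r - 1) * (c - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    c_corr = c - (c - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(r_corr - 1, c_corr - 1))

v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # perfectly associated
v_independent = cramers_v_corrected([[25, 25], [25, 25]])  # independent
print(v_perfect, v_independent)
```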

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  Date        Month  Weekday    DateTime             Hour  PT08.S1(CO)  PT08.S2(NMHC)  PT08.S3(NOx)  PT08.S4(NO2)  PT08.S5(O3)  T     RH    AH
0      2004-03-10  March  Wednesday  2004-03-10 18:00:00  18    1360.0       1046.0         1056.0        1692.0        1268.0       13.6  48.9  0.7578
1      2004-03-10  March  Wednesday  2004-03-10 19:00:00  19    1292.0       955.0          1174.0        1559.0        972.0        13.3  47.7  0.7255
2      2004-03-10  March  Wednesday  2004-03-10 20:00:00  20    1402.0       939.0          1140.0        1555.0        1074.0       11.9  54.0  0.7502
3      2004-03-10  March  Wednesday  2004-03-10 21:00:00  21    1376.0       948.0          1092.0        1584.0        1203.0       11.0  60.0  0.7867
4      2004-03-10  March  Wednesday  2004-03-10 22:00:00  22    1272.0       836.0          1205.0        1490.0        1110.0       11.2  59.6  0.7888
5      2004-03-10  March  Wednesday  2004-03-10 23:00:00  23    1197.0       750.0          1337.0        1393.0        949.0        11.2  59.2  0.7848
6      2004-03-11  March  Thursday   2004-03-11 08:00:00  8     1333.0       900.0          1136.0        1517.0        1102.0       10.8  57.4  0.7408
7      2004-03-11  March  Thursday   2004-03-11 09:00:00  9     1351.0       960.0          1079.0        1583.0        1028.0       10.5  60.6  0.7691
8      2004-03-11  March  Thursday   2004-03-11 10:00:00  10    1233.0       827.0          1218.0        1446.0        860.0        10.8  58.4  0.7552
9      2004-03-11  March  Thursday   2004-03-11 11:00:00  11    1179.0       762.0          1328.0        1362.0        671.0        10.5  57.9  0.7352
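In these rows, Date, Month, Weekday and Hour are all consistent with the DateTime column. The report does not say how they were produced; the sketch below shows one way to derive them with the standard library, assuming English names from the default C locale (the function name is illustrative):

```python
from datetime import datetime

def calendar_fields(timestamp: str) -> dict:
    """Split a 'YYYY-MM-DD HH:MM:SS' string into the report's calendar columns."""
    dt = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return {
        "Date": dt.date().isoformat(),
        "Month": dt.strftime("%B"),    # full month name (English in the C locale)
        "Weekday": dt.strftime("%A"),  # full weekday name
        "Hour": dt.hour,
    }

print(calendar_fields("2004-03-10 18:00:00"))
```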

Last rows

Index  Date        Month  Weekday    DateTime             Hour  PT08.S1(CO)  PT08.S2(NMHC)  PT08.S3(NOx)  PT08.S4(NO2)  PT08.S5(O3)  T     RH    AH
6089   2005-04-04  April  Monday     2005-04-04 05:00:00  5     888.0        528.0          1077.0        987.0         578.0        10.4  59.9  0.7550
6090   2005-04-04  April  Monday     2005-04-04 06:00:00  6     1031.0       730.0          760.0         1129.0        905.0        9.5   63.1  0.7531
6091   2005-04-04  April  Monday     2005-04-04 07:00:00  7     1384.0       1221.0         470.0         1600.0        1457.0       9.7   61.9  0.7446
6092   2005-04-04  April  Monday     2005-04-04 08:00:00  8     1446.0       1362.0         415.0         1777.0        1705.0       13.5  48.9  0.7553
6093   2005-04-04  April  Monday     2005-04-04 09:00:00  9     1297.0       1102.0         507.0         1375.0        1583.0       18.2  36.3  0.7487
6094   2005-04-04  April  Monday     2005-04-04 10:00:00  10    1314.0       1101.0         539.0         1374.0        1729.0       21.9  29.3  0.7568
6095   2005-04-04  April  Monday     2005-04-04 11:00:00  11    1163.0       1027.0         604.0         1264.0        1269.0       24.3  23.7  0.7119
6096   2005-04-04  April  Monday     2005-04-04 12:00:00  12    1142.0       1063.0         603.0         1241.0        1092.0       26.9  18.3  0.6406
6097   2005-04-04  April  Monday     2005-04-04 13:00:00  13    1003.0       961.0          702.0         1041.0        770.0        28.3  13.5  0.5139
6098   2005-04-04  April  Monday     2005-04-04 14:00:00  14    1071.0       1047.0         654.0         1129.0        816.0        28.5  13.1  0.5028