This project uses satellite imagery to assess environmental conditions on the ground. We apply computer vision techniques to the imagery to look for signs of drought stress: by training models on a carefully labeled dataset, we aim to learn patterns that indicate drought conditions and the quality of forage at a given location.
Dataset
The current dataset consists of 86,317 train and 10,778 validation satellite images, 65x65 pixels each, in 11 spectral bands (B1-B11), with 10,774 images withheld to test long-term generalization (107,869 total). Human experts (pastoralists) have labeled each image with the number of cows that the geographic location at the center of the image could support (0, 1, 2, or 3+ cows). Each pixel represents a 30 meter square, so the full images are 1.95 kilometers across. Pastoralists are asked to rate the quality of the area within 20 meters of where they are standing, which corresponds to an area slightly larger than a single pixel. Since forage quality is correlated across space, the larger image may still be useful for prediction.
The data is in TFRecord format, split into train and val, and takes up ~4.3GB (2.15GB zipped). You can learn more about the format of the satellite images in the DroughtWatch benchmark repository (https://github.com/wandb/droughtwatch).
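If you want to confirm these split sizes yourself, a quick (if slow) sanity check is to count the serialized records after downloading and extracting the data as described in the next section. This is only a sketch; the data/train and data/val paths assume the layout created by the download step below.
import glob
import tensorflow as tf

# Count serialized examples per split; expected: train 86317, val 10778.
for split in ["train", "val"]:
    shards = glob.glob(f"data/{split}/part-*")
    n_examples = sum(1 for _ in tf.data.TFRecordDataset(shards))
    print(split, n_examples)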
The data used in this research was collected through a research collaboration between the International Livestock Research Institute, Cornell University, and UC San Diego. It was supported by the Atkinson Centre for a Sustainable Future’s Academic Venture Fund, Australian Aid through the AusAID Development Research Awards Scheme Agreement No. 66138, the National Science Foundation (0832782, 1059284, 1522054), and ARO grant W911-NF-14-1-0498.
Downloading the Data
!echo "Downloading data"
!curl -SL https://storage.googleapis.com/wandb_datasets/dw_train_86K_val_10K.zip > dw_data.zip
!unzip dw_data.zip
!rm dw_data.zip
!mv droughtwatch_data/ data/
Downloading data
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2050M 100 2050M 0 0 86.6M 0 0:00:23 0:00:23 --:--:-- 118M
Archive:  dw_data.zip
   creating: droughtwatch_data/
   creating: droughtwatch_data/val/
  inflating: droughtwatch_data/val/part-r-00000
   ...
   creating: droughtwatch_data/train/
  inflating: droughtwatch_data/train/part-r-00000
   ...
(unzip listing truncated: 100 validation and 400 training TFRecord shards extracted)
Visualizing data
import os
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
The dataset is stored as TFRecord files, read with tf.data.TFRecordDataset. In a TFRecord file, data is serialized into a binary format, which reduces storage space and speeds up reading and processing. Each record contains one serialized example, which holds the spectral bands and the label for that location.
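To make the "serialized example" idea concrete, here is a minimal sketch that reads one raw record, parses it with the tf.train.Example protobuf, and lists the feature keys it carries. The shard name is just an example; any part-r-* file extracted above works.
import tensorflow as tf

# Peek at one raw serialized record to see which features it contains.
raw_dataset = tf.data.TFRecordDataset("data/val/part-r-00000")
for raw_record in raw_dataset.take(1):
    example = tf.train.Example()
    example.ParseFromString(raw_record.numpy())
    print(sorted(example.features.feature.keys()))  # band names plus 'label'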
# Build a list of TFRecord shard paths in a directory
dirlist = lambda di: [os.path.join(di, file) for file in os.listdir(di) if 'part-' in file]

# Use the (smaller) validation split for a quick visual check
viz_files = dirlist('data/val/')

# Parse a TFRecord shard into a list of examples with the RGB bands and label
def parse_visual(path):
    # Create a TFRecordDataset from the input shard
    dataset = tf.data.TFRecordDataset(path)
    # Features expected in each TFRecord example
    features = {
        'B2': tf.io.FixedLenFeature([], tf.string),    # Blue band
        'B3': tf.io.FixedLenFeature([], tf.string),    # Green band
        'B4': tf.io.FixedLenFeature([], tf.string),    # Red band
        'label': tf.io.FixedLenFeature([], tf.int64),  # Target value
    }
    # Parse each serialized example in the shard
    parsed_examples = [tf.io.parse_single_example(record, features) for record in dataset]
    return parsed_examples

# Parse the examples from the first shard
parsed_examples = parse_visual(viz_files[0])
def get_img_from_example(parsed_example, intensify=True):
    # Initialize an empty RGB array with dimensions 65x65x3
    rgbArray = np.zeros((65, 65, 3), 'uint8')
    # Iterate over the red, green, and blue bands (B4, B3, B2)
    for i, band in enumerate(['B4', 'B3', 'B2']):
        # Decode the raw band bytes into a numpy array
        band_data = np.frombuffer(parsed_example[band].numpy(), dtype=np.uint8)
        # Reshape the band data into a 65x65 image
        band_data = band_data.reshape(65, 65)
        if intensify:
            # Stretch the band to the full 0-255 range
            band_data = band_data / np.max(band_data) * 255
        else:
            band_data = band_data * 255
        # Assign the band to the corresponding channel of the RGB array
        rgbArray[..., i] = band_data
    # Extract the label from the parsed example
    label = tf.cast(parsed_example['label'], tf.int32).numpy()
    return rgbArray, label
fig = plt.figure(figsize=(20, 30), dpi=80, facecolor='w', edgecolor='k')
for i in range(1, 26):
    plt.subplot(5, 5, i)
    img, label = get_img_from_example(parsed_examples[i + 7])
    ax = plt.imshow(img).axes
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    plt.title(str(label))
fig.show()
Defining constants
# Total size of the dataset
TOTAL_TRAIN = 86317
TOTAL_VAL = 10778
# Fraction of the training set actually used for training
SIZE = 0.1  # modify this only
SIZE_TRAIN = int(TOTAL_TRAIN * SIZE)
SIZE_VAL = int(TOTAL_VAL)
# Parameters of the data (do not change)
IMG_DIM = 65
NUM_CLASSES = 4
Defining features
The satellite imagery data contains different spectral bands. These bands represent different wavelengths of light captured by the satellite sensor, each providing unique information about the Earth's surface.
Band  Resolution  Wavelength        Description
B1    30 meters   0.43 - 0.45 µm    Coastal aerosol
B2    30 meters   0.45 - 0.51 µm    Blue
B3    30 meters   0.53 - 0.59 µm    Green
B4    30 meters   0.64 - 0.67 µm    Red
B5    30 meters   0.85 - 0.88 µm    Near infrared
B6    30 meters   1.57 - 1.65 µm    Shortwave infrared 1
B7    30 meters   2.11 - 2.29 µm    Shortwave infrared 2
B8    15 meters   0.52 - 0.90 µm    Panchromatic
B9    15 meters   1.36 - 1.38 µm    Cirrus
B10   30 meters   10.60 - 11.19 µm  Thermal infrared 1, resampled from 100m to 30m
B11   30 meters   11.50 - 12.51 µm  Thermal infrared 2, resampled from 100m to 30m
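As a rough sketch of how these bands are laid out on disk: each band in a record is stored as a flat string of 65x65 uint8 values that can be decoded independently. The helper below assumes an example parsed with the full feature spec defined below (parse_visual above only decodes B2-B4), so treat it as illustrative rather than part of the pipeline.
import numpy as np

def band_summary(parsed_example,
                 bands=('B1', 'B2', 'B3', 'B4', 'B5', 'B6',
                        'B7', 'B8', 'B9', 'B10', 'B11')):
    """Print mean/std pixel value per band for one fully parsed example."""
    for band in bands:
        arr = np.frombuffer(parsed_example[band].numpy(), dtype=np.uint8)
        arr = arr[:65 * 65].reshape(65, 65)
        print(f"{band:>3}: mean={arr.mean():6.1f}  std={arr.std():6.1f}")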
We create a dictionary 'features' that defines the structure of the data stored in TFRecord format. This is needed for several reasons:
- Data structure definition: TFRecord is a generic binary format that does not inherently describe the data it contains. The dictionary makes the structure explicit; each key-value pair names a feature and gives its data type and shape.
- Parsing: when reading TFRecord files during training or inference, TensorFlow uses this dictionary as a blueprint to decode the binary data of each example into tensors that can be fed to the model.
- Consistency and compatibility: defining the structure up front ensures that the features the model expects match the features stored in the TFRecord files, preventing parsing errors and mismatches.
- Integration with TensorFlow APIs: high-level APIs such as tf.data.TFRecordDataset and tf.io.parse_single_example rely on the feature dictionary to read and parse the data correctly.
features = {
    'B1': tf.io.FixedLenFeature([], tf.string),
    'B2': tf.io.FixedLenFeature([], tf.string),
    'B3': tf.io.FixedLenFeature([], tf.string),
    'B4': tf.io.FixedLenFeature([], tf.string),
    'B5': tf.io.FixedLenFeature([], tf.string),
    'B6': tf.io.FixedLenFeature([], tf.string),
    'B7': tf.io.FixedLenFeature([], tf.string),
    'B8': tf.io.FixedLenFeature([], tf.string),
    'B9': tf.io.FixedLenFeature([], tf.string),
    'B10': tf.io.FixedLenFeature([], tf.string),
    'B11': tf.io.FixedLenFeature([], tf.string),
    'label': tf.io.FixedLenFeature([], tf.int64),
}
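For illustration, this is roughly how a record matching the schema above would be serialized on the writing side. The real TFRecords were produced upstream, so this is only a sketch of what the parser expects; the make_example helper and the dummy arrays are hypothetical.
import numpy as np
import tensorflow as tf

def make_example(bands, label):
    # Each band becomes a bytes feature holding the raw uint8 pixel buffer
    feature = {
        name: tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[arr.astype(np.uint8).tobytes()]))
        for name, arr in bands.items()
    }
    feature['label'] = tf.train.Feature(int64_list=tf.train.Int64List(value=[label]))
    return tf.train.Example(features=tf.train.Features(feature=feature))

dummy_bands = {f'B{i}': np.zeros((65, 65), dtype=np.uint8) for i in range(1, 12)}
serialized = make_example(dummy_bands, label=2).SerializeToString()
parsed = tf.io.parse_single_example(serialized, features)  # round-trips cleanly
If a key required by 'features' is missing from a record, parsing raises an error, which is exactly the consistency check described above.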
Extracting training and validation data
def get_data(train_data_size, val_data_size, local=True):

    def load_data_local(data_path):
        train = file_list_from_folder("train", data_path)
        val = file_list_from_folder("val", data_path)
        return train, val

    def file_list_from_folder(folder, data_path):
        folderpath = os.path.join(data_path, folder)
        filelist = []
        for filename in os.listdir(folderpath):
            if filename.startswith('part-') and not filename.endswith('gstmp'):
                filelist.append(os.path.join(folderpath, filename))
        return filelist

    def parse_tfrecords(filelist, batch_size, buffer_size, include_viz=False):
        # try a subset of possible bands
        def _parse_(serialized_example, keylist=['B1', 'B4', 'B3', 'B2', 'B5', 'B6', 'B7', 'B8', 'B9', 'B10', 'B11']):
            example = tf.io.parse_single_example(serialized_example, features)

            def getband(example_key):
                img = tf.io.decode_raw(example_key, tf.uint8)
                return tf.reshape(img[:IMG_DIM ** 2], shape=(IMG_DIM, IMG_DIM, 1))

            bandlist = [getband(example[key]) for key in keylist]
            # combine bands into a single (65, 65, len(keylist)) tensor
            image = tf.concat(bandlist, -1)
            # one-hot encode ground truth labels
            label = tf.cast(example['label'], tf.int32)
            label = tf.one_hot(label, NUM_CLASSES)
            return {'image': image}, label

        tfrecord_dataset = tf.data.TFRecordDataset(filelist)
        tfrecord_dataset = tfrecord_dataset.map(lambda x: _parse_(x)).shuffle(buffer_size).repeat(-1).batch(batch_size)
        tfrecord_iterator = iter(tfrecord_dataset)
        # take a single batch of batch_size examples
        image, label = tfrecord_iterator.get_next()
        return image, label

    if local:
        data_path = "/content/data/"
        train_tfrecords, val_tfrecords = load_data_local(data_path)

    X_train, y_train = parse_tfrecords(train_tfrecords, train_data_size, train_data_size)
    X_val, y_val = parse_tfrecords(val_tfrecords, val_data_size, val_data_size)
    return X_train, X_val, y_train, y_val
X_train_total, X_val_total, y_train_total, y_val_total = get_data(SIZE_TRAIN, SIZE_VAL, local=True)
The holdout function implements a holdout strategy for splitting the data into training, validation, and test sets: the downloaded training set is split into new train and validation subsets, and the downloaded validation set is held out as the test set.
def holdout(X_train_total, X_val_total, y_train_total, y_val_total, proportion=(2/3)):
    '''Hold out function'''
    k = int(proportion * SIZE_TRAIN)  # modify this only
    X_train, y_train = X_train_total["image"][:k], y_train_total[:k]
    X_val, y_val = X_train_total["image"][k:], y_train_total[k:]
    X_test, y_test = X_val_total["image"], y_val_total
    return X_train, y_train, X_val, y_val, X_test, y_test
X_train, y_train, X_val, y_val, X_test, y_test = holdout(X_train_total, X_val_total, y_train_total, y_val_total)
Cleaning the data
import numpy as np

def clean_data(X, y):
    '''Delete empty images (std < 10)'''
    def find_empty_images(X):
        empty_images = []
        X = np.array(X)
        for i in range(X.shape[0]):
            if X[i].std() < 10:
                empty_images.append(i)
        return empty_images

    X = np.array(X)
    y = np.array(y)
    empty_imgs = find_empty_images(X)
    new_index = [i for i in range(X.shape[0]) if i not in empty_imgs]
    X = np.take(X, new_index, axis=0)
    y = np.take(y, new_index, axis=0)
    return X, y
X_train, y_train = clean_data(X_train, y_train)
X_val, y_val = clean_data(X_val, y_val)
X_test, y_test = clean_data(X_test, y_test)
When working with multi-channel image data such as satellite imagery, selecting specific channels can focus the model on relevant information, reduce computational cost, and improve performance by discarding irrelevant or redundant channels. Here we define a function that selects specific channels from a dataset of images and returns a modified dataset containing only those channels.
def dataset_select_channels(train_images, list_of_channels):
    '''Input a dataset and a list of channels as a list.
    Example: dataset_select_channels(train_images, ['B4', 'B3', 'B2']) to convert
    images to RGB.
    '''
    channels_index = [features_list.index(i) for i in list_of_channels]
    data = np.array(train_images)
    return data[:, :, :, channels_index]
features_list = [ 'B1', 'B4', 'B3', 'B2', 'B5', 'B6', 'B7', 'B8', 'B9', 'B10', 'B11']
# Select features
list_of_channels = ['B7', 'B6', 'B5'] # Modify this for each model, full_list_of_channels = ['B1','B4', 'B3', 'B2', 'B5', 'B6', 'B7','B8','B9','B10','B11']
X_train = dataset_select_channels(X_train, list_of_channels)
X_val = dataset_select_channels(X_val, list_of_channels)
X_test = dataset_select_channels(X_test, list_of_channels)
In this project we use the EfficientNet model to classify the satellite images. EfficientNet introduces a compound scaling method that uniformly scales the network's depth, width, and input resolution in a principled way, allowing it to balance model complexity and computational cost and to perform well across different scales.
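As a rough illustration of the compound-scaling rule, the coefficients below are the ones reported in Tan & Le (2019), cited in the references; the mapping of phi to a specific Bx variant is approximate, since the released models round these values.
# Compound scaling: depth, width, and resolution are scaled together by a
# single coefficient phi, subject to alpha * beta^2 * gamma^2 ~= 2.
alpha, beta, gamma = 1.2, 1.1, 1.15   # grid-searched on the B0 baseline (Tan & Le, 2019)
phi = 3                               # roughly the scale of EfficientNet-B3 used below

depth_mult = alpha ** phi    # ~1.73x more layers than B0
width_mult = beta ** phi     # ~1.33x more channels than B0
res_mult = gamma ** phi      # ~1.52x larger input resolution than B0
print(depth_mult, width_mult, res_mult)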
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers.experimental.preprocessing import Resizing
from tensorflow.keras.applications import EfficientNetB3
def efficientnet_model():
    '''Transfer-learning model that takes X_train with ['B7', 'B6', 'B5']'''
    IMG_SIZE = 65
    inputs = layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
    x = inputs
    # Resize to the expected input size (a no-op at 65x65)
    x = Resizing(65, 65)(x)
    # EfficientNetB3 backbone pretrained on ImageNet, without its classification head
    activationnetB3 = EfficientNetB3(include_top=False, weights="imagenet")(x)
    outputsflatten = layers.Flatten()(activationnetB3)
    outputsdense1 = layers.Dense(64, activation="relu")(outputsflatten)
    outputsdense2 = layers.Dense(64, activation="relu")(outputsdense1)
    outputsdense3 = layers.Dense(4, activation="softmax")(outputsdense2)
    model = tf.keras.Model(inputs, outputsdense3)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
def train_efficient_net(model, X_train, X_val, y_train, y_val):
    '''Fit the EfficientNet model'''
    es = EarlyStopping(monitor='val_loss', patience=20, verbose=1, restore_best_weights=True)
    datagen = tf.keras.preprocessing.image.ImageDataGenerator()
    datagen.fit(X_train)
    X_val = Resizing(65, 65, interpolation="bilinear")(X_val)
    history = model.fit(datagen.flow(X_train, y_train, batch_size=16),
                        epochs=50,
                        validation_data=(X_val, y_val),
                        verbose=1,
                        callbacks=[es])
    return history

model = efficientnet_model()
history = train_efficient_net(model, X_train, X_val, y_train, y_val)
Epoch 1/50
356/356 [==============================] - 127s 165ms/step - loss: 1.0772 - accuracy: 0.5864 - val_loss: 1.3423 - val_accuracy: 0.5862
Epoch 2/50
356/356 [==============================] - 53s 150ms/step - loss: 0.9970 - accuracy: 0.6083 - val_loss: 2.0164 - val_accuracy: 0.5967
Epoch 3/50
356/356 [==============================] - 64s 180ms/step - loss: 0.9669 - accuracy: 0.6080 - val_loss: 0.9412 - val_accuracy: 0.6030
Epoch 4/50
356/356 [==============================] - 62s 174ms/step - loss: 0.9691 - accuracy: 0.6018 - val_loss: 0.9759 - val_accuracy: 0.6048
Epoch 5/50
356/356 [==============================] - 61s 171ms/step - loss: 0.9953 - accuracy: 0.6074 - val_loss: 1.1164 - val_accuracy: 0.6048
Epoch 6/50
356/356 [==============================] - 60s 167ms/step - loss: 0.9601 - accuracy: 0.6031 - val_loss: 1.0989 - val_accuracy: 0.6048
Epoch 7/50
356/356 [==============================] - 58s 163ms/step - loss: 0.9383 - accuracy: 0.6088 - val_loss: 0.9617 - val_accuracy: 0.5988
Epoch 8/50
356/356 [==============================] - 60s 168ms/step - loss: 0.9211 - accuracy: 0.6183 - val_loss: 0.9464 - val_accuracy: 0.6205
Epoch 9/50
356/356 [==============================] - 63s 179ms/step - loss: 0.8892 - accuracy: 0.6441 - val_loss: 0.9397 - val_accuracy: 0.6398
Epoch 10/50
356/356 [==============================] - 62s 175ms/step - loss: 0.8634 - accuracy: 0.6538 - val_loss: 0.8646 - val_accuracy: 0.6493
Epoch 11/50
356/356 [==============================] - 59s 167ms/step - loss: 0.8486 - accuracy: 0.6547 - val_loss: 0.9940 - val_accuracy: 0.6289
Epoch 12/50
356/356 [==============================] - 65s 182ms/step - loss: 0.8432 - accuracy: 0.6634 - val_loss: 0.9236 - val_accuracy: 0.6240
Epoch 13/50
356/356 [==============================] - 64s 181ms/step - loss: 0.8334 - accuracy: 0.6608 - val_loss: 0.8994 - val_accuracy: 0.6549
Epoch 14/50
356/356 [==============================] - 62s 173ms/step - loss: 0.7738 - accuracy: 0.6833 - val_loss: 1.0798 - val_accuracy: 0.6430
Epoch 15/50
356/356 [==============================] - 55s 153ms/step - loss: 0.7767 - accuracy: 0.6838 - val_loss: 1.0439 - val_accuracy: 0.6356
Epoch 16/50
356/356 [==============================] - 60s 168ms/step - loss: 0.7524 - accuracy: 0.6961 - val_loss: 1.0306 - val_accuracy: 0.6692
Epoch 17/50
356/356 [==============================] - 61s 172ms/step - loss: 0.7062 - accuracy: 0.7117 - val_loss: 1.6739 - val_accuracy: 0.6282
Epoch 18/50
356/356 [==============================] - 60s 169ms/step - loss: 0.7049 - accuracy: 0.7205 - val_loss: 1.7295 - val_accuracy: 0.6310
Epoch 19/50
356/356 [==============================] - 57s 160ms/step - loss: 0.7070 - accuracy: 0.7212 - val_loss: 0.9054 - val_accuracy: 0.6559
Epoch 20/50
356/356 [==============================] - 63s 177ms/step - loss: 0.6562 - accuracy: 0.7425 - val_loss: 3.2931 - val_accuracy: 0.6135
Epoch 21/50
356/356 [==============================] - 63s 177ms/step - loss: 0.6672 - accuracy: 0.7363 - val_loss: 1.0890 - val_accuracy: 0.6496
Epoch 22/50
356/356 [==============================] - 61s 171ms/step - loss: 0.6414 - accuracy: 0.7519 - val_loss: 1.2751 - val_accuracy: 0.6542
Epoch 23/50
356/356 [==============================] - 60s 169ms/step - loss: 0.5726 - accuracy: 0.7791 - val_loss: 1.0321 - val_accuracy: 0.6433
Epoch 24/50
328/356 [==========================>...] - ETA: 4s - loss: 0.5317 - accuracy: 0.8028
Accuracy achieved on the held-out test set using the EfficientNet model:
results = model.evaluate(X_test,y_test,verbose=1)
print(f'The accuracy of the model is:{results[1]}')
print(results)
334/334 [==============================] - 175s 513ms/step - loss: 1.2502 - accuracy: 0.4534
The accuracy of the model is:0.45342081785202026
[1.2502360343933105, 0.45342081785202026]
References:
- Tan, M., & Le, Q. V. (2020). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv preprint arXiv:1905.11946.
- Hobbs, A., & Svetlichnaya, S. (2020). Satellite-based Prediction of Forage Conditions for Livestock in Northern Kenya. arXiv preprint arXiv:2004.04081.
- https://github.com/wandb/droughtwatch