This project uses satellite imagery to monitor environmental conditions in fine detail. We apply computer vision techniques to this imagery, looking for signs of drought stress. By training models on a carefully labeled dataset, we aim to learn patterns that indicate how much forage the land can support, an on-the-ground measure of drought severity.
Dataset
The current dataset consists of 86,317 training and 10,778 validation satellite images, 65x65 pixels each, in 10 spectrum bands, with 10,774 images withheld to test long-term generalization (107,869 total). Human experts (pastoralists) have labeled these with the number of cows that the geographic location at the center of the image could support (0, 1, 2, or 3+ cows). Each pixel represents a 30 meter square, so the images at full size are 1.95 kilometers across. Pastoralists are asked to rate the quality of the area within 20 meters of where they are standing, which corresponds to an area slightly larger than a single pixel. Since forage quality is correlated across space, the larger image may be useful for prediction.
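As a quick sanity check on those dimensions:

```python
# Each pixel covers a 30 m square, so a 65x65-pixel image spans
# 65 * 30 = 1950 m, i.e. 1.95 km across.
pixels = 65
meters_per_pixel = 30
image_width_m = pixels * meters_per_pixel
print(image_width_m / 1000)  # 1.95
```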
The data is in TFRecords format, split into train and val, and takes up ~4.3GB (2.15GB zipped). You can learn more about the format of the satellite images here.
The data used in this research was collected through a research collaboration between the International Livestock Research Institute, Cornell University, and UC San Diego. It was supported by the Atkinson Center for a Sustainable Future's Academic Venture Fund, Australian Aid through the AusAID Development Research Awards Scheme Agreement No. 66138, the National Science Foundation (0832782, 1059284, 1522054), and ARO grant W911-NF-14-1-0498.
Downloading the Data
!echo "Downloading data"
!curl -SL https://storage.googleapis.com/wandb_datasets/dw_train_86K_val_10K.zip > dw_data.zip
!unzip dw_data.zip
!rm dw_data.zip
!mv droughtwatch_data/ data/
Downloading data
100 2050M  100 2050M    0     0  86.6M      0  0:00:23  0:00:23 --:--:--  118M
Archive:  dw_data.zip
   creating: droughtwatch_data/
   creating: droughtwatch_data/val/
  inflating: droughtwatch_data/val/part-r-00000
   creating: droughtwatch_data/train/
  inflating: droughtwatch_data/train/part-r-00000
  ... (output truncated: the archive extracts the val/ and train/ TFRecord part-r shards)
Visualizing data
import os
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
The dataset is stored in TFRecord format and loaded as a tf.data.TFRecordDataset. In a TFRecord file, data is serialized into a binary format, which reduces storage space and speeds up reading and processing. Each record in the file contains a serialized example, which typically includes features and labels for machine learning tasks.
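As a minimal sketch of what this serialization looks like in practice, the snippet below writes one synthetic 65x65 band (random values, not the real data) into a tf.train.Example and parses it back with a feature spec, mirroring the round-trip the dataset's records go through:

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for one 65x65 uint8 band (not the real dataset)
band = np.random.randint(0, 256, size=(65, 65), dtype=np.uint8)

# Serialize one example the way the dataset stores it: raw bytes per band
# plus an int64 label
example = tf.train.Example(features=tf.train.Features(feature={
    'B2': tf.train.Feature(bytes_list=tf.train.BytesList(value=[band.tobytes()])),
    'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[2])),
}))
serialized = example.SerializeToString()

# Parse it back with a feature spec, as we do below for the real records
spec = {
    'B2': tf.io.FixedLenFeature([], tf.string),
    'label': tf.io.FixedLenFeature([], tf.int64),
}
parsed = tf.io.parse_single_example(serialized, spec)
restored = np.frombuffer(parsed['B2'].numpy(), dtype=np.uint8).reshape(65, 65)
```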
import os
import tensorflow as tf

# Generate a list of TFRecord shard paths in a directory
dirlist = lambda di: [os.path.join(di, file) for file in os.listdir(di) if 'part-' in file]

# Get a list of validation file paths (used here for visualization)
val_files = dirlist('data/val/')

# Parse the examples in a single TFRecord file
def parse_visual(data):
    # Create a TFRecordDataset from the input file
    dataset = tf.data.TFRecordDataset(data)
    # Define the features expected in each TFRecord example
    features = {
        'B2': tf.io.FixedLenFeature([], tf.string),  # Blue band data
        'B3': tf.io.FixedLenFeature([], tf.string),  # Green band data
        'B4': tf.io.FixedLenFeature([], tf.string),  # Red band data
        'label': tf.io.FixedLenFeature([], tf.int64),  # Label (0-3 cows)
    }
    # Parse each serialized example in the dataset using the feature spec
    parsed_examples = [tf.io.parse_single_example(record, features) for record in dataset]
    return parsed_examples

# Parse the examples from the first validation file
parsed_examples = parse_visual(val_files[0])
def get_img_from_example(parsed_example, intensify=True):
    # Initialize an empty 65x65 RGB image
    rgbArray = np.zeros((65, 65, 3), 'uint8')
    # Iterate over the red, green, and blue bands (B4, B3, B2)
    for i, band in enumerate(['B4', 'B3', 'B2']):
        # Extract the raw band bytes and convert them to a numpy array
        band_data = np.frombuffer(parsed_example[band].numpy(), dtype=np.uint8)
        # Reshape the band data into a 65x65 image
        band_data = band_data.reshape(65, 65)
        if intensify:
            # Stretch the band to the full 0-255 range for display
            band_data = band_data / np.max(band_data) * 255
        # Assign the band data to the corresponding RGB channel
        rgbArray[..., i] = band_data
    # Extract the integer label from the parsed example
    label = tf.cast(parsed_example['label'], tf.int32).numpy()
    return rgbArray, label
fig = plt.figure(figsize=(20, 30), dpi=80, facecolor='w', edgecolor='k')
for i in range(1, 26):
    ax = plt.subplot(5, 5, i)
    img, label = get_img_from_example(parsed_examples[i + 7])
    plt.imshow(img)
    # Hide the axes and show the label as the title
    ax.axis('off')
    plt.title(str(label))
plt.show()
Defining constants
# Total size of the dataset
TOTAL_TRAIN = 86317
TOTAL_VAL = 10778

# Fraction of the training set to sample for quicker experiments
SIZE = 0.1  # modify this only
SIZE_TRAIN = int(TOTAL_TRAIN * SIZE)
SIZE_VAL = int(TOTAL_VAL)  # the full validation set is kept

# Parameters of the data (do not change)
IMG_DIM = 65
NUM_CLASSES = 4
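To make the effect of SIZE concrete: sampling 10% of the training set gives int(86317 * 0.1) = 8631 images, and at a hypothetical batch size of 32 (not set anywhere in this notebook) one epoch over that sample takes ceil(8631 / 32) = 270 steps:

```python
import math

# 10% sample of the 86,317 training images
SIZE_TRAIN = int(86317 * 0.1)  # 8631

# Hypothetical batch size, for illustration only
batch_size = 32
steps_per_epoch = math.ceil(SIZE_TRAIN / batch_size)  # 270
```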
Defining features
The satellite imagery data contains different spectral bands. These bands represent different wavelengths of light captured by the satellite sensor, each providing unique information about the Earth's surface.
Band  Resolution  Wavelength         Description
B1    30 meters   0.43 - 0.45 µm     Coastal aerosol
B2    30 meters   0.45 - 0.51 µm     Blue
B3    30 meters   0.53 - 0.59 µm     Green
B4    30 meters   0.64 - 0.67 µm     Red
B5    30 meters   0.85 - 0.88 µm     Near infrared
B6    30 meters   1.57 - 1.65 µm     Shortwave infrared 1
B7    30 meters   2.11 - 2.29 µm     Shortwave infrared 2
B8    15 meters   0.52 - 0.90 µm     Panchromatic
B9    15 meters   1.36 - 1.38 µm     Cirrus
B10   30 meters   10.60 - 11.19 µm   Thermal infrared 1, resampled from 100m to 30m
B11   30 meters   11.50 - 12.51 µm   Thermal infrared 2, resampled from 100m to 30m
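These bands can also be combined into standard vegetation indices. As an illustrative aside (not part of this notebook's pipeline), NDVI contrasts the red (B4) and near-infrared (B5) bands, since healthy vegetation reflects strongly in the near infrared; the reflectance values below are made up:

```python
import numpy as np

# Hypothetical red (B4) and near-infrared (B5) values for a 2x2 patch
red = np.array([[60.0, 80.0], [50.0, 70.0]])
nir = np.array([[200.0, 90.0], [180.0, 75.0]])

# NDVI = (NIR - Red) / (NIR + Red): close to 1 over dense vegetation,
# close to 0 over bare or drought-stressed ground
ndvi = (nir - red) / (nir + red)
```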
We create a dictionary, 'features', which defines the structure of the data stored in TFRecord format. We need to do this because:
Data structure definition: TFRecord is a generic binary format that doesn't inherently understand the structure of the data it contains. Each key-value pair in the dictionary names a feature and specifies its data type and shape.
Parsing: When reading TFRecord files during training or inference, TensorFlow needs to know how to interpret the binary data in each example. The dictionary serves as the blueprint for decoding that data into tensors that can be fed into the model.
Consistency and compatibility: Defining the structure upfront ensures that the features the model expects match the features stored in the TFRecord files, preventing parsing errors and mismatches during training or inference.
Integration with TensorFlow APIs: High-level APIs such as tf.data.TFRecordDataset and tf.io.parse_single_example rely on this feature dictionary to read and parse the data correctly.
features = {
    'B1': tf.io.FixedLenFeature([], tf.string),
    'B2': tf.io.FixedLenFeature([], tf.string),
    'B3': tf.io.FixedLenFeature([], tf.string),
    'B4': tf.io.FixedLenFeature([], tf.string),
    'B5': tf.io.FixedLenFeature([], tf.string),
    'B6': tf.io.FixedLenFeature([], tf.string),
    'B7': tf.io.FixedLenFeature([], tf.string),
    'B8': tf.io.FixedLenFeature([], tf.string),
    'B9': tf.io.FixedLenFeature([], tf.string),
    'B10': tf.io.FixedLenFeature([], tf.string),
    'B11': tf.io.FixedLenFeature([], tf.string),
    'label': tf.io.FixedLenFeature([], tf.int64),
}
Extracting training and validation data
import os
import tensorflow as tf

def get_data(train_data_size, val_data_size, local=True):
    def load_data_local(data_path):
        train = file_list_from_folder("train", data_path)
        val = file_list_from_folder("val", data_path)
        return train, val

    def file_list_from_folder(folder, data_path):
        folderpath = os.path.join(data_path, folder)
        filelist = []
        for filename in os.listdir(folderpath):
            if filename.startswith('part-') and not filename.endswith('gstmp'):
                filelist.append(os.path.join(folderpath, filename))
        return filelist

    def parse_tfrecords(filelist, batch_size, buffer_size, include_viz=False):
        # try a subset of possible bands
        def _parse_(serialized_example, keylist=['B1', 'B4', 'B3', 'B2', 'B5', 'B6', 'B7', 'B8', 'B9', 'B10', 'B11']):
            example = tf.io.parse_single_example(serialized_example, features)

            # decode one band's raw bytes into a (65, 65, 1) tensor
            def getband(example_key):
                img = tf.io.decode_raw(example_key, tf.uint8)
                return tf.reshape(img[:IMG_DIM**2], shape=(IMG_DIM, IMG_DIM, 1))

            bandlist = [getband(example[key]) for key in keylist]
            # combine bands into a single tensor
            image = tf.concat(bandlist, -1)

            # one-hot encode ground truth labels
            label = tf.cast(example['label'], tf.int32)
            label = tf.one_hot(label, NUM_CLASSES)
            return {'image': image}, label

        tfrecord_dataset = tf.data.TFRecordDataset(filelist)
        tfrecord_dataset = tfrecord_dataset.map(_parse_).shuffle(buffer_size).repeat(-1).batch(batch_size)
        tfrecord_iterator = iter(tfrecord_dataset)
        image, label = next(tfrecord_iterator)
        return image, label

    if local:
        data_path = "/content/data/"
        train_tfrecords, val_tfrecords = load_data_local(data_path)

    X_train, y_train = parse_tfrecords(train_tfrecords, train_data_size, train_data_size)
    X_val, y_val = parse_tfrecords(val_tfrecords, val_data_size, val_data_size)
    return X_train, X_val, y_train, y_val
X_train_total, X_val_total, y_train_total, y_val_total = get_data(SIZE_TRAIN, SIZE_VAL, local=True)
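The per-band decoding inside `_parse_` is essentially a bytes-to-array reshape followed by channel stacking and a one-hot label. The same steps can be mimicked in NumPy on synthetic bytes (a sketch, not the real TFRecord data; `IMG_DIM = 65` and `NUM_CLASSES = 4` as in the dataset description):

```python
import numpy as np

IMG_DIM = 65      # images are 65x65 pixels
NUM_CLASSES = 4   # labels: 0, 1, 2, or 3+ cows

# Synthetic stand-in for one band's raw bytes as stored in a TFRecord.
raw = np.arange(IMG_DIM ** 2, dtype=np.uint8).tobytes()

# tf.io.decode_raw + tf.reshape, expressed with NumPy:
band = np.frombuffer(raw, dtype=np.uint8)[:IMG_DIM ** 2].reshape(IMG_DIM, IMG_DIM, 1)

# Stacking bands along the last axis mirrors tf.concat(bandlist, -1):
image = np.concatenate([band, band, band], axis=-1)

# One-hot encoding a label, as tf.one_hot(label, NUM_CLASSES) does:
label = 2
one_hot = np.eye(NUM_CLASSES, dtype=np.float32)[label]
```

Checking shapes this way (`(65, 65, 1)` per band, `(65, 65, 3)` after stacking three bands) catches reshape mistakes before running the full TensorFlow pipeline.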
The holdout function implements a holdout strategy for splitting the dataset into training, validation, and test sets.
def holdout(X_train_total, X_val_total, y_train_total, y_val_total, proportion=(2/3)):
    '''Hold out function'''
    k = int(proportion * SIZE_TRAIN)  # Modify this only
    X_train, y_train = X_train_total["image"][:k], y_train_total[:k]
    X_val, y_val = X_train_total["image"][k:], y_train_total[k:]
    X_test, y_test = X_val_total["image"], y_val_total
    return X_train, y_train, X_val, y_val, X_test, y_test

X_train, y_train, X_val, y_val, X_test, y_test = holdout(X_train_total, X_val_total, y_train_total, y_val_total)
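To see what the split does to the shapes, here is a minimal NumPy sketch with a toy dataset of 9 samples and the default 2/3 proportion (the array contents are synthetic, purely for illustration):

```python
import numpy as np

SIZE_TRAIN = 9        # toy dataset size
proportion = 2 / 3
k = int(proportion * SIZE_TRAIN)  # first k samples stay in training

X_total = np.arange(9 * 4).reshape(9, 2, 2)  # 9 toy "images" of shape (2, 2)
y_total = np.arange(9)

X_train, y_train = X_total[:k], y_total[:k]  # first two thirds
X_val, y_val = X_total[k:], y_total[k:]      # remaining third
```

With 9 samples, `k` is 6, so 6 samples train and 3 validate; the original validation set then serves as the untouched test set, exactly as in `holdout` above.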
Cleaning the data
import numpy as np

def clean_data(X, y):
    '''Delete empty images (std < 10)'''
    def find_empty_images(X):
        empty_images = []
        X = np.array(X)
        for i in range(X.shape[0]):
            if X[i].std() < 10:
                empty_images.append(i)
        return empty_images

    X = np.array(X)
    y = np.array(y)
    empty_imgs = find_empty_images(X)
    new_index = [i for i in range(X.shape[0]) if i not in empty_imgs]
    X = np.take(X, new_index, axis=0)
    y = np.take(y, new_index, axis=0)
    return X, y

X_train, y_train = clean_data(X_train, y_train)
X_val, y_val = clean_data(X_val, y_val)
X_test, y_test = clean_data(X_test, y_test)
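The std-based filter can be checked on toy data: a constant ("empty") image has standard deviation 0 and is dropped, while a high-variance image survives. A small NumPy sketch using the same threshold of 10 (synthetic 4x4 images, not real satellite data):

```python
import numpy as np

# Two 4x4 toy images: one flat ("empty"), one with spread-out values.
flat = np.full((4, 4), 7, dtype=np.float32)                 # std == 0, should be dropped
textured = (np.arange(16, dtype=np.float32) * 3).reshape(4, 4)  # std ~13.8, should be kept

X = np.stack([flat, textured])
y = np.array([0, 1])

# Same criterion as find_empty_images, inverted to build the keep list.
keep = [i for i in range(X.shape[0]) if X[i].std() >= 10]
X_clean, y_clean = X[keep], y[keep]
```

Only the textured image and its label survive, matching what `clean_data` does on the real arrays.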
When working with multi-channel image data such as satellite imagery, selecting specific channels can focus the model on relevant information, reduce computational cost, and improve performance by discarding irrelevant or redundant bands. Here we define a function that selects specific channels from a dataset of images and returns a modified dataset containing only those channels.
def dataset_select_channels(train_images, list_of_channels):
    '''Input a dataset and a list of channels:
    Example: dataset_select_channels(train_images, ['B4', 'B3', 'B2']) to convert
    images to RGB.
    '''
    channels_index = [features_list.index(i) for i in list_of_channels]
    data = np.array(train_images)
    return data[:, :, :, channels_index]

features_list = ['B1', 'B4', 'B3', 'B2', 'B5', 'B6', 'B7', 'B8', 'B9', 'B10', 'B11']

# Select features
list_of_channels = ['B7', 'B6', 'B5']  # Modify this for each model; the full list is features_list above
X_train = dataset_select_channels(X_train, list_of_channels)
X_val = dataset_select_channels(X_val, list_of_channels)
X_test = dataset_select_channels(X_test, list_of_channels)
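Channel selection is plain fancy indexing on the last axis. A sketch with a random toy batch (synthetic data; same `features_list` ordering as above):

```python
import numpy as np

features_list = ['B1', 'B4', 'B3', 'B2', 'B5', 'B6', 'B7', 'B8', 'B9', 'B10', 'B11']

rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(2, 65, 65, 11), dtype=np.uint8)  # 2 toy images, 11 bands

# Map channel names to positions in features_list, then index the last axis.
idx = [features_list.index(c) for c in ['B7', 'B6', 'B5']]
selected = batch[:, :, :, idx]
```

Note that the positions follow `features_list` order, not alphabetical band order, so `'B7'` maps to index 6 here; getting this mapping wrong silently feeds the model the wrong bands.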
In this project we use the EfficientNet model to classify the satellite images. EfficientNet introduces a compound scaling method that uniformly scales the network's depth, width, and resolution in a principled way, allowing it to balance model complexity and computational cost and achieve strong performance across different scales.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers.experimental.preprocessing import Resizing
from tensorflow.keras.applications import EfficientNetB3

def efficientnet_model():
    '''Transfer learning model that takes X_train with ['B7', 'B6', 'B5']'''
    IMG_SIZE = 65
    inputs = layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
    x = Resizing(IMG_SIZE, IMG_SIZE)(inputs)
    activationnetB3 = EfficientNetB3(include_top=False, weights="imagenet")(x)
    outputsflatten = layers.Flatten()(activationnetB3)
    outputsdense1 = layers.Dense(64, activation="relu")(outputsflatten)
    outputsdense2 = layers.Dense(64, activation="relu")(outputsdense1)
    outputsdense3 = layers.Dense(4, activation="softmax")(outputsdense2)
    model = tf.keras.Model(inputs, outputsdense3)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
def train_efficient_net(model, X_train, X_val, y_train, y_val):
    '''Function to fit the EfficientNet model'''
    es = EarlyStopping(monitor='val_loss', patience=20, verbose=1, restore_best_weights=True)
    datagen = tf.keras.preprocessing.image.ImageDataGenerator()
    datagen.fit(X_train)
    X_val = Resizing(65, 65, interpolation="bilinear")(X_val)
    history = model.fit(datagen.flow(X_train, y_train, batch_size=16),
                        epochs=50,
                        validation_data=(X_val, y_val),
                        verbose=1,
                        callbacks=[es])
    return history

model = efficientnet_model()
history = train_efficient_net(model, X_train, X_val, y_train, y_val)
Epoch 1/50 356/356 [==============================] - 127s 165ms/step - loss: 1.0772 - accuracy: 0.5864 - val_loss: 1.3423 - val_accuracy: 0.5862 Epoch 2/50 356/356 [==============================] - 53s 150ms/step - loss: 0.9970 - accuracy: 0.6083 - val_loss: 2.0164 - val_accuracy: 0.5967 Epoch 3/50 356/356 [==============================] - 64s 180ms/step - loss: 0.9669 - accuracy: 0.6080 - val_loss: 0.9412 - val_accuracy: 0.6030 Epoch 4/50 356/356 [==============================] - 62s 174ms/step - loss: 0.9691 - accuracy: 0.6018 - val_loss: 0.9759 - val_accuracy: 0.6048 Epoch 5/50 356/356 [==============================] - 61s 171ms/step - loss: 0.9953 - accuracy: 0.6074 - val_loss: 1.1164 - val_accuracy: 0.6048 Epoch 6/50 356/356 [==============================] - 60s 167ms/step - loss: 0.9601 - accuracy: 0.6031 - val_loss: 1.0989 - val_accuracy: 0.6048 Epoch 7/50 356/356 [==============================] - 58s 163ms/step - loss: 0.9383 - accuracy: 0.6088 - val_loss: 0.9617 - val_accuracy: 0.5988 Epoch 8/50 356/356 [==============================] - 60s 168ms/step - loss: 0.9211 - accuracy: 0.6183 - val_loss: 0.9464 - val_accuracy: 0.6205 Epoch 9/50 356/356 [==============================] - 63s 179ms/step - loss: 0.8892 - accuracy: 0.6441 - val_loss: 0.9397 - val_accuracy: 0.6398 Epoch 10/50 356/356 [==============================] - 62s 175ms/step - loss: 0.8634 - accuracy: 0.6538 - val_loss: 0.8646 - val_accuracy: 0.6493 Epoch 11/50 356/356 [==============================] - 59s 167ms/step - loss: 0.8486 - accuracy: 0.6547 - val_loss: 0.9940 - val_accuracy: 0.6289 Epoch 12/50 356/356 [==============================] - 65s 182ms/step - loss: 0.8432 - accuracy: 0.6634 - val_loss: 0.9236 - val_accuracy: 0.6240 Epoch 13/50 356/356 [==============================] - 64s 181ms/step - loss: 0.8334 - accuracy: 0.6608 - val_loss: 0.8994 - val_accuracy: 0.6549 Epoch 14/50 356/356 [==============================] - 62s 173ms/step - loss: 0.7738 - accuracy: 0.6833 - 
val_loss: 1.0798 - val_accuracy: 0.6430 Epoch 15/50 356/356 [==============================] - 55s 153ms/step - loss: 0.7767 - accuracy: 0.6838 - val_loss: 1.0439 - val_accuracy: 0.6356 Epoch 16/50 356/356 [==============================] - 60s 168ms/step - loss: 0.7524 - accuracy: 0.6961 - val_loss: 1.0306 - val_accuracy: 0.6692 Epoch 17/50 356/356 [==============================] - 61s 172ms/step - loss: 0.7062 - accuracy: 0.7117 - val_loss: 1.6739 - val_accuracy: 0.6282 Epoch 18/50 356/356 [==============================] - 60s 169ms/step - loss: 0.7049 - accuracy: 0.7205 - val_loss: 1.7295 - val_accuracy: 0.6310 Epoch 19/50 356/356 [==============================] - 57s 160ms/step - loss: 0.7070 - accuracy: 0.7212 - val_loss: 0.9054 - val_accuracy: 0.6559 Epoch 20/50 356/356 [==============================] - 63s 177ms/step - loss: 0.6562 - accuracy: 0.7425 - val_loss: 3.2931 - val_accuracy: 0.6135 Epoch 21/50 356/356 [==============================] - 63s 177ms/step - loss: 0.6672 - accuracy: 0.7363 - val_loss: 1.0890 - val_accuracy: 0.6496 Epoch 22/50 356/356 [==============================] - 61s 171ms/step - loss: 0.6414 - accuracy: 0.7519 - val_loss: 1.2751 - val_accuracy: 0.6542 Epoch 23/50 356/356 [==============================] - 60s 169ms/step - loss: 0.5726 - accuracy: 0.7791 - val_loss: 1.0321 - val_accuracy: 0.6433 Epoch 24/50 328/356 [==========================>...] - ETA: 4s - loss: 0.5317 - accuracy: 0.8028
Accuracy achieved in the classification task using the EfficientNet model:
results = model.evaluate(X_test,y_test,verbose=1)
print(f'The accuracy of the model is:{results[1]}')
print(results)
334/334 [==============================] - 175s 513ms/step - loss: 1.2502 - accuracy: 0.4534 The accuracy of the model is:0.45342081785202026 [1.2502360343933105, 0.45342081785202026]
References:
- Tan, M., & Le, Q. V. (2020). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv preprint arXiv:1905.11946.
- Hobbs, A., & Svetlichnaya, S. (2020). Satellite-based Prediction of Forage Conditions for Livestock in Northern Kenya. arXiv preprint arXiv:2004.04081.
- https://github.com/wandb/droughtwatch