Mapping urban flood susceptibility in Ouagadougou, Burkina Faso

Karim Traoré1,2 · Tazen Fowe1 · Mathieu Ouédraogo2,3 · Malicki Zorom1 · Maïmouna Bologo/Traoré1 · Patrice Toé2 · Harouna Karambiri1

Original Article, Environmental Earth Sciences. https://doi.org/10.1007/s12665-024-11871-0
Received: 4 March 2024 / Accepted: 14 September 2024
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024

Corresponding author: Karim Traoré (karim.traore@2ie-edu.org; karimtraore.sm@gmail.com)
Tazen Fowe: tazen.fowe@2ie-edu.org
Mathieu Ouédraogo: m.ouedraogo@cgiar.org
Malicki Zorom: malicki.zorom@2ie-edu.org
Maïmouna Bologo/Traoré: maimouna.bologo@2ie-edu.org
Patrice Toé: patrice_toe57@yahoo.fr
Harouna Karambiri: harouna.karambiri@2ie-edu.org

1 Laboratoire Eaux Hydro-Systèmes Et Agriculture (LEHSA), Institut International d’Ingénierie de L’Eau, Et de L’Environnement (2iE), Rue de La Science, 01 BP 594, Ouagadougou 01, Burkina Faso
2 Laboratoire d’Etudes Rurales Sur L’Environnement Et Le Développement Economique Et Social (LERE/DES), Université Nazi BONI (UNB), 01 BP 1091, Bobo-Dioulasso 01, Burkina Faso
3 Alliance of Bioversity International and CIAT, Dakar, Senegal

Abstract

Ouagadougou, the capital city of Burkina Faso, is facing significant economic and social damage due to recurring floods. This study aimed to develop a flood susceptibility map for Ouagadougou using a logistic regression (LR) model and 14 flood conditioning factors, namely elevation, slope, aspect, profile curvature, plan curvature, topographic position index (TPI), topographic roughness index (TRI), flow direction, topographic wetness index (TWI), distance to river, rainfall, land use/land cover (LULC), normalized difference vegetation index (NDVI) and soil type. A historical flood inventory map was created from household survey data, identifying 1026 flooded sites, which were divided into a training dataset (70%) and a validation dataset (30%). The factors that had a statistically significant influence (p-value < 0.05 and |Z| > 1.96) at the 95% confidence level were, in order of importance, elevation, distance to river, rainfall, plan curvature and NDVI. The receiver operating characteristic (ROC) curve method was used to validate the model. The area under the curve (AUC) values of the model were 81% for the prediction rate and 82% for the success rate, indicating its effectiveness in identifying areas susceptible to flooding. The results showed that 18.48% of the city has very high susceptibility to flooding, 18.99% has high susceptibility, 18.43% has moderate susceptibility, and 19.98% and 24.18% have low and very low susceptibility, respectively. This research provides valuable information for policy makers to develop effective flood prevention and urban development strategies.

Keywords Flood susceptibility · Geospatial data · Household survey · Logistic regression · Ouagadougou · Burkina Faso

Introduction

Floods are the most frequent natural disasters, affecting the largest number of people worldwide (Wannous and Velasquez 2017). The Emergency Events Database (EM-DAT), developed by the Center for Research on the Epidemiology of Disasters (CRED) at the Catholic University of Louvain (UCLouvain), lists 5,169 flood events that affected the world between 1980 and 2023.
The resulting economic damage is estimated, by the same source, at 973.14 trillion US dollars; the losses of 1993, 1998, 2010, 2011, 2013, 2016, 2020 and 2021 were particularly high. Floods can also cause considerable damage and devastation, including injury, death, loss of property and livelihoods, destruction of infrastructure and displacement of populations (Nicholls et al. 2015). For example, in 2019, 127 floods affected 69 countries, killed 1586 people and displaced 10 million others (IDMC 2019). The risk of flooding is particularly high in urban areas, and some studies suggest that the problem of urban flooding is likely to worsen with the increased frequency of extreme precipitation due to climate change (Feloni et al. 2022).
In addition, accelerated urban development transforms natural areas into impermeable surfaces, which increases urban runoff and, consequently, the exposure of populations to the risk of flooding (Hammond et al. 2015). All regions of the world are affected by floods; however, developing countries are particularly vulnerable to natural hazards such as floods because of their low adaptive capacity and inadequate infrastructure to cope with these disasters (Shah et al. 2019). The situation requires effective, sustainable flood management to reduce the extent of the damage (Aslam 2018; Hong et al. 2018).

Although it is impossible to prevent floods entirely, future flood-prone areas can be delineated through the use of forecasting techniques (Tehrany et al. 2015). Consequently, the development of flood forecasting models becomes necessary, as they could enable a flood risk reduction plan to be put in place and assistance services to be provided in the event of a flood disaster (Schumann et al. 2022). Identifying areas susceptible to flooding by means of flood susceptibility maps is an essential step towards reducing future flood damage. In addition, specifying areas with low susceptibility to flooding could be useful for development activities (Sarhadi et al. 2012).

Over the years, the rapid development of Geographic Information Systems (GIS) and Remote Sensing (RS) has greatly contributed to the advancement of flood studies, thanks to various methods and approaches that have improved data collection and analysis. The techniques employed include qualitative, semi-quantitative and quantitative methods, each based on hypotheses concerning the factors influencing the occurrence of floods (Kaya and Derin 2023). Qualitative and semi-quantitative methods such as the analytic hierarchy process (AHP) (Sharir et al. 2022) and spatial multi-criteria evaluation (SMCE) (Fadhil et al.
2020) are based on expert knowledge, while quantitative methods provide numerical values for the probability of flooding. The literature abounds in quantitative approaches, including the frequency ratio (FR) (Megahed et al. 2023), logistic regression (LR) (Liu et al. 2022), the weight of evidence (WoE) (Saleh et al. 2022), artificial neural networks (ANN) (Priscillia et al. 2021) and the evidential belief function (EBF) (Chowdhuri et al. 2020). Machine learning-based models such as the random forest (RF) (Ren et al. 2024), the boosted regression tree (BRT) (Arabameri et al. 2019), the local spatial sequential long short-term memory neural network (LSS-LSTM) (Fang et al. 2021) and the naïve Bayes tree (NBT) (Khosravi et al. 2018) have also been explored for their ability to predict floods based on a variety of factors. Each of these models has specific advantages and disadvantages, and its performance often depends on the nature of the study area.

Although advanced machine learning algorithms have been used to assess flood susceptibility, the classic LR model remains popular and has played an important role in producing susceptibility maps and explaining the roles of flood conditioning factors worldwide (Shafizadeh-Moghadam et al. 2018). The LR model offers many advantages in terms of data processing and representation of results. For example, the independent variables of the LR model do not need to be normally distributed, and LR model results can be very effective in assessing the accuracy of sample data (Boateng and Abaye 2019). However, the model has not previously been applied in the urban areas of Sahelian West Africa, which offer a distinctive setting for testing and studying the capacity to predict flood-prone areas for mitigation and management purposes. The geographical characteristics of a given area can make such an environment susceptible to flooding (Mind’je et al. 2019).
This can be considered true for Ouagadougou, the capital city of Burkina Faso, since its geographical characteristics and climatic profile make it vulnerable to flooding (Hangnon et al. 2015; Nouaceur 2020).

Since the end of the twentieth century, Burkina Faso has experienced recurrent flooding whose impacts are very detrimental to the population. In this country, the average number of floods is increasing, from one flood per year between 1986 and 2005 to five floods per year between 2006 and 2016, and a third of these floods occurred in Ouagadougou (Tazen et al. 2019). In September 2020, the country was hit hard by heavy rains, with nearly 41 deaths, 112 injuries and more than 100,000 people affected. This heavy rainfall affected more than 2,300 households in Ouagadougou (Da and Bonnet 2021). Even today, Ouagadougou is still marked by major floods, notably those of 2009, 2016 and 2020.

Faced with the intensification of flooding in the city of Ouagadougou, improving flood management practices has become a priority for the government of Burkina Faso. To reduce the exposure of populations to the risk of flooding, the government adopted a decree locating and demarcating flood zones in the city of Ouagadougou after the major flood of September 1, 2009. Under this decree, the administrative limits of non-constructible and submersible zones are defined (and delimited locally) at a distance of 100 m and 300 m, respectively, from the high-water mark. According to the decree, an unbuildable flood zone is defined by easements of 100 m on either side of the primary rainwater drainage channels, as well as by areas located below the water level of the backwaters. However, the zoning maps do not consider the topography of the site or the hydrology of the environment.
The mapping of flood zones corresponds to the definition of the safety perimeters of the primary canals and the water levels of the dams. This representation reduces the risk of flooding in Ouagadougou to the natural overflow of water bodies or the backflow of pipes, and does not consider flooding due to stormwater runoff. This map therefore cannot be considered a hazard map. In addition, some authors have attempted to map susceptibility to flooding in the city of Ouagadougou (Bani and Yonkeu 2016; Kafando et al. 2023). However, these studies used the AHP method, which relies on expert judgement, and did not take into account the diversity of flood susceptibility factors and their interactions. Moreover, the flood susceptibility factors they used appear to be subject to collinearity bias, and the resulting maps were not validated. Given these limitations, the available susceptibility maps are not robust enough to be used in decision-making for better flood management in the city of Ouagadougou.

This study aims to produce a new flood susceptibility map for the city of Ouagadougou using an LR model and independent flood conditioning factors.
The objectives of this study are threefold: (1) to determine the main factors contributing to flood occurrence in the city; (2) to identify flood-prone and flood-safe areas; and (3) to explore the capability of the LR model for flood susceptibility mapping and evaluate its performance. This research treats susceptibility to flooding as a natural phenomenon, without integrating the anthropogenic causes that aggravate it.

The following section describes the study area. Sect. “Materials and methods” describes the methodological approach adopted in this research. Sect. “Results and discussion” presents and discusses the results, with the necessary comparisons to previous research. Concluding remarks are presented in Sect. “Conclusion”.

Description of the study area

The study was carried out in Ouagadougou, the capital city of Burkina Faso (Fig. 1). It is the country’s largest economic and cultural center and its most populated city, with an estimated population of 2,684,052 inhabitants in 2019, representing 11.78% of the country’s total population (INSD 2022). Ouagadougou has a flat topography that slopes gently from south to north, with no physical barriers hindering its expansion. The shallow, erosion-prone soil is sandy, sandy-clay or clay (Kêdowidé et al. 2010). The climate is dry tropical, characterized by a single rainy season from May to October, with peak rainfall generally recorded in August, and a dry season from November to April. Annual rainfall varied between 511 and 1003 mm over the period 1980–2020. Temperatures range from a daily low of 16 °C in December to a daily high of 40 °C in March and April (Tazen et al. 2019). Ouagadougou is part of the Massili watershed, which flows into the Nakanbe River through a moderate hydrographic network.
Due to the presence of rivers and dams, the city is prone to flooding, particularly during heavy rains.

Fig. 1 The location of the study area and the flooded sites

Materials and methods

Description of the general framework of the study

To map flood susceptibility, open-access spatial data for the city of Ouagadougou were collected from various sources (Table 1). All conditioning factors shown in Table 1 were included in the LR model for flood susceptibility analysis. The collected data were used at different stages of the research (Fig. 2). Erdas Imagine 9.2, ArcGIS 10.5 and R 4.1.1 were used as the basic tools for spatial data management and analysis.

Flood inventory map

To establish the flood susceptibility map, the first step is to draw up a flood inventory map of the study area, as likely flood-susceptible areas are predicted from the mathematical relationships between past floods and the factors influencing them (De Risi et al. 2018; Eder et al. 2022). The performance of a model largely depends on the datasets used to train and validate it (Fenza et al. 2021).
In this study, to collect past flood sites, a perception survey was conducted between July 04 and September 10, 2022 among 2459 randomly selected households spread over the Ouagadougou commune (Fig. 1). The survey identified 1026 flood sites in the study area. These 1026 flood sites were then randomly divided into groups of 70% (718 sites) and 30% (308 sites) to calibrate and validate the flood susceptibility model, respectively (Dutta et al. 2023). Flood susceptibility mapping is treated as a binary classification problem in which the flood inventory is divided into two classes, flood sites and non-flood sites (Talukdar et al. 2020). Therefore, to construct the flood inventory map, which is the dependent factor for model implementation, binary values are required: 1 for flood sites and 0 for non-flood sites. In this study, flood sites are sites that are regularly flooded, while non-flood sites are sites where no flooding has been recorded in recent years. To avoid bias, several researchers have recommended choosing the same number of non-flooded sites as flooded (positive) samples (Tang et al. 2020; Tehrany et al. 2013). Therefore, on the basis of field surveys, 1026 non-flooded sites were randomly selected. The non-flooded sites were then randomly divided into groups of 70% (718 sites) and 30% (308 sites). Thus, the dependent factor took the form of a training dataset comprising 718 flood sites coded 1 and 718 non-flood sites coded 0. Similarly, a test dataset comprising 308 flood sites coded 1 and 308 non-flood sites coded 0 was set aside to evaluate the final model. According to previous studies (Tehrany et al. 2015; Zaharia et al. 2017), this step is mandatory, as the training sample is used to train the model while the test sample is used to validate its results.

Table 1 Datasets used and their sources
ID | Data | Resolution | Format | Source | Derived maps
1 | Digital elevation model (DEM) | 30 m | Raster | https://urs.earthdata.nasa.gov | Elevation, slope, aspect, plan curvature, profile curvature, topographic wetness index (TWI), topographic position index (TPI), topographic roughness index (TRI), flow direction, distance to river
2 | Landsat 8 imagery (2022) | 30 m | Raster | https://earthexplorer.usgs.gov | Land use/land cover (LULC), normalized difference vegetation index (NDVI)
3 | Soil data | – | Vector | National Soil Services in Burkina Faso (BUNASOLS) | Soil map
4 | Rainfall data (2011–2020) | – | Vector | https://crudata.uea.ac.uk | Average annual rainfall map
5 | Flood sites | – | Vector | Household survey | Flood inventory map

Fig. 2 The working framework for flood susceptibility analysis

Flood conditioning factors

Flood susceptibility modeling is generally very complex and requires several geospatial datasets (Towfiqul Islam et al. 2021). The identification of flood susceptibility factors is therefore essential for assessing flood susceptibility. These factors must be chosen within a framework that addresses the problem as a whole. Furthermore, the number of factors should be kept to a minimum in order to reduce the complexity of the assessment process (Stefanidis & Stathis 2013). Based on previous studies on flood susceptibility modeling (Rahman et al. 2023; Wang et al. 2023), topographic, hydrological and complementary factors were selected as flood conditioning parameters.
These parameters are available at different spatial resolutions. A resampling technique was applied to bring all of them to a spatial resolution of 30 m in the WGS 84 / Universal Transverse Mercator (UTM) zone 30 coordinate system (Bole et al. 2024; Wang et al. 2023). According to previous studies (Hurt 2023; Sola & Sevilla 1997), machine learning models such as the LR model require input data to be normalized to the same interval, as results can otherwise be biased by the larger magnitude of the untransformed data. In this study, min–max scaling is used to rescale the flood conditioning factors to the range [0, 1]. The general formula for min–max scaling to [0, 1] is given as Eq. (1):

$$X_n = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{1}$$

where $X_n$ is the normalized value, $X$ is the measured value, and $X_{\min}$ and $X_{\max}$ are the minimum and maximum values of the measured parameter, respectively. In addition, the categorical factors (LULC, soil type) for each pixel of the training and validation samples are assigned normalized values (between 0 and 1).

Topographic factors

Eight topographic factors, namely elevation, slope, aspect, plan curvature, TPI, TRI, flow direction and profile curvature, were used in this study (Fig. 3).

Elevation (m): elevation has been identified as a major factor for flood modeling (Tehrany et al. 2015). It has an inverse relationship with flood sensitivity, meaning that low-lying areas are more sensitive to flooding than higher ground, since water flows along topographic gradients (Botzen et al. 2013). In this study, the elevation map was automatically extracted from the DEM with a pixel size of 30 × 30 m in ArcGIS (Fig. 3a).

Slope (degree): slope is generally of great importance in mapping flood susceptibility. The slope angle determines surface runoff, the velocity of water flow, the aggravation of soil erosion and vertical percolation and, therefore, significantly influences the physical predisposition to flooding (Vojtek and Vojteková 2019). Flooding frequently occurs in areas with low slopes (Kaur et al. 2017). In this study, the slope map was automatically extracted using the DEM in ArcGIS (Fig. 3b).

Aspect: the aspect gives, for each cell, the direction of the downward slope measured clockwise in degrees from 0 to 360, where 0 is north-facing, 90 is east-facing, 180 is south-facing and 270 is west-facing. The value −1 is assigned to flat areas with no downward slope. Aspect has implications for soil moisture content; north-facing slopes generally have a higher moisture content, which can make the soil more fragile in the face of flood risk (Choubin et al. 2022). In this study, the aspect map was automatically extracted using the DEM in ArcGIS (Fig. 3c).

Plan curvature: the plan curvature reflects the rate of change of the aspect, describes the horizontal shape of the topography and highlights converging (concave curvature) or diverging (convex curvature) water flows (Ohlmacher 2007). Plan curvature represents a flood conditioning variable. In this study, the plan curvature map was automatically extracted using the DEM in ArcGIS (Fig. 3d).

Profile curvature: the profile curvature reflects the complexities of the ground and can be determined as the slope of the slope (Dai et al. 2023). Profile curvature represents a flood conditioning variable.
It is parallel to the slope and indicates the direction of maximum slope. It affects the acceleration and deceleration of flow across the surface. Profile curvature values may be negative, positive or zero. A negative value means that the surface is upwardly convex at that cell, and the flow will be decelerated. A positive value means that the surface is upwardly concave at that cell, and the flow will be accelerated. A value of zero indicates that the surface is linear (no change in slope). In this study, the profile curvature map was automatically extracted using the DEM in ArcGIS (Fig. 3e).

TPI: TPI indicates the difference in altitude between a focal cell and its neighboring cells in a digital elevation model (Choubin et al. 2022). TPI can be used to determine the ruggedness of the terrain. This index measures the difference between the elevation at each pixel ($Z_0$) and the average elevation ($\bar{Z}$) around it within a predetermined radius ($R$). It is obtained from Eqs. (2) and (3) (Weiss 2001). In this study, the TPI map was automatically extracted using the DEM in ArcGIS (Fig. 3f).

$$TPI = Z_0 - \bar{Z} \tag{2}$$

$$\bar{Z} = \frac{1}{n} \sum_{i=1}^{n} Z_i \tag{3}$$

where $n$ is the total number of surrounding pixels and $Z_i$ is the elevation of each adjacent pixel. Positive TPI values indicate that the central cell is situated higher than its neighborhood, and negative values indicate that it is situated lower, as is usually the case in valleys.

TRI: TRI objectively expresses topographic heterogeneity. It measures the amount of elevation difference between adjacent cells of a digital elevation grid. The process essentially computes the difference in elevation between a central cell and the eight cells immediately surrounding it. The TRI is derived by taking the square root of the average of the squares of the eight elevation differences, and corresponds to the average elevation change between any point on a grid and its surrounding area, Eq. (4) (Riley et al. 1999). In this study, the TRI map was automatically extracted using the DEM in ArcGIS (Fig. 3g).

$$TRI = \left[ \frac{1}{N} \sum_{i=1}^{N} \left( Z_0 - Z_i \right)^2 \right]^{1/2} \tag{4}$$

where $Z_0$ is the elevation of the central pixel, $Z_i$ is the elevation of the neighboring pixel, and $N$ is the number of surrounding pixels, which is generally taken to be eight.

Flow direction: one of the fundamental requirements of surface hydrology is the ability to determine the flow direction of each raster pixel. Flow direction indicates how overland flow is distributed over a catchment and is a key parameter in hydrological modeling, for example in flood forecasting (Pham et al. 2021). The flow direction grid stores, for each cell, the direction of steepest descent along which water flows. In this study, the flow direction map was automatically extracted using the DEM in ArcGIS (Fig. 3h).

Fig. 3 Topographic factors: a elevation, b slope, c aspect, d plan curvature, e profile curvature, f TPI, g TRI, h flow direction

Hydrological factors

Three hydrological factors, TWI, distance to river and rainfall, were used in this study (Fig. 4).

TWI: TWI is an important factor for mapping flood susceptibility (Choubin et al. 2022). TWI is a useful model for estimating where water will accumulate in an area with elevation differences.
It is a function of slope and the upstream contributing area (Nhu et al. 2020). It provides a measure of water accumulation, saturation and flooding potential for each pixel in a given basin (Manfreda et al. 2011). Areas with high TWI (wetter areas) are more prone to flooding than areas with low TWI (Samanta et al. 2018). TWI can be calculated using Eq. (5) (Beven & Kirkby 1979). In this study, the TWI map was automatically extracted using the DEM in ArcGIS (Fig. 4a).

$$TWI = \ln\left( \frac{A_s}{\tan(\beta)} \right) \tag{5}$$

where $A_s$ is the upstream contributing area (m2) and $\beta$ is the slope angle of the site (in radians).

Distance to river (m): the distance to the river is an important factor in determining susceptibility to flooding, as the areas most affected during a flood are those located close to the river bank, where water overflows (Chapi et al. 2017; Fernández and Lutz 2010). In this study, the river distance map (Fig. 4b) was extracted from the hydrographic network map using the Euclidean distance tool in ArcGIS.

Rainfall (mm): rainfall is the main driver of runoff and flooding (Chapi et al. 2017). Floods are generally preceded by heavy and prolonged rainfall (Aldiansyah and Wardani 2023). However, low-intensity rainfall can also cause flooding in an area if the soil is already saturated (Ali et al. 2020). The spatial distribution of rainfall was estimated using inverse distance weighting (IDW). In this study, the average annual rainfall over the period 2011–2020 was used (Fig. 4c).

Complementary factors

Three complementary factors, LULC, NDVI and soil type, were used in this study (Fig. 5).
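Returning briefly to the hydrological factors, the TWI of Eq. (5), that is, the natural logarithm of the upstream contributing area divided by the tangent of the slope angle, can be evaluated per pixel as in the sketch below. This is only an illustration, since the study derived TWI in ArcGIS. The function name `topographic_wetness_index` and the `min_slope_rad` floor are assumptions introduced here; the floor avoids division by zero on perfectly flat cells, and the source does not say how flat cells were handled.

```python
import math


def topographic_wetness_index(upstream_area_m2, slope_rad, min_slope_rad=1e-3):
    """Evaluate TWI = ln(A_s / tan(beta)) for a single pixel.

    upstream_area_m2: upstream contributing area A_s (m2), must be positive.
    slope_rad: slope angle beta in radians.
    min_slope_rad: assumed lower bound on the slope, so that tan(beta) is never zero.
    """
    if upstream_area_m2 <= 0:
        raise ValueError("upstream contributing area must be positive")
    slope = max(slope_rad, min_slope_rad)  # clamp flat cells (assumption)
    return math.log(upstream_area_m2 / math.tan(slope))


# Same contributing area on a gentle and on a steeper slope (hypothetical values).
gentle = topographic_wetness_index(900.0, 0.01)
steep = topographic_wetness_index(900.0, 0.10)
print(gentle > steep)
```

For a fixed contributing area, the gentler slope gives the larger TWI, which matches the statement that wetter, flatter cells are more prone to flooding.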
LULC: LULC is also an important factor in identifying flood-prone areas. This factor influences the infiltration rate and runoff volume (Yin et al. 2017). In particular, urban areas, which are mainly composed of impermeable surfaces and bare land, increase water runoff (Costache et al. 2020). The LULC map is based on Landsat 8 OLI images (27/04/2022) with a resolution of 30 m, downloaded from the US government website https://earthexplorer.usgs.gov (Chowdhuri et al. 2020). To derive the LULC of the study area, a maximum likelihood supervised classification algorithm was used (Altaf et al. 2014). The Erdas Imagine 9.2 software was used for this classification. The classification was performed so that the classes produced are those with a direct influence on water runoff. In the study area, four types of LULC were detected, namely housing area, water surfaces, fallow and vegetation. The resulting LULC map was verified using 798 ground control points in the field. The Kappa coefficient for the final map was estimated by Eq. (6) (Foody 2002):

$$\kappa = \frac{N \sum_{i=1}^{r} X_{ii} - \sum_{i=1}^{r} \left( X_{i+} X_{+i} \right)}{N^2 - \sum_{i=1}^{r} \left( X_{i+} X_{+i} \right)} \tag{6}$$

where $r$ is the number of rows in the error matrix; $X_{ii}$ is the number of observations in row $i$ and column $i$; $X_{i+}$ is the total number of observations in row $i$; $X_{+i}$ is the total number of observations in column $i$; and $N$ is the total number of observations included in the matrix.

In this research, the Kappa coefficient obtained for LULC is 74.39%. This value indicates that LULC classification using a supervised classification method is both reliable and relevant for the city of Ouagadougou (Rwanga and Ndambuki 2017).

Fig. 4 Hydrological factors: a TWI, b distance to river, c rainfall

NDVI: NDVI is widely used to determine susceptibility to flooding (Askar et al. 2022). NDVI values vary between −1 and +1. Negative values of NDVI (values approaching −1) correspond to water. NDVI values close to zero (−0.1 to 0.1) generally correspond to barren areas of rock, sand, or snow. Finally, low positive values represent shrub and grassland (approximately 0.2 to 0.4), while high values indicate temperate and tropical rainforests (values approaching 1). There is an inverse relationship between vegetation density and flooding (Tehrany et al. 2013): areas with more vegetation cover are less prone to flooding than areas with less vegetation cover. To calculate the NDVI for the study area, Landsat 8 OLI images (27/04/2022) with a resolution of 30 m were selected and processed using Erdas Imagine 9.2 software. The NDVI is calculated from spectral reflectance measurements, as expressed in Eq. (7) (Tucker and Sellers 1986), where NIR and RED are the near-infrared and red bands, respectively, used to distinguish vegetated from non-vegetated areas.

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}} \tag{7}$$

Soil type: soil type directly affects the drainage process through inherent soil characteristics such as texture, permeability level and structure (Mojaddadi et al. 2017). The impact of soil type on flooding is significant since it controls the volumes of water that can infiltrate or run off at the surface (Basri et al. 2022). The soil type map was based on data from BUNASOLS for the city of Ouagadougou (Fig. 5c).

Fig. 5 Complementary factors: a LULC, b NDVI, c soil type

Pearson’s correlation coefficient

Correlation is a measure of association between variables (Schober et al. 2018). In correlated data, a change in the magnitude of one variable is associated with a change in the magnitude of another variable, either in the same direction (positive correlation) or in the opposite direction (negative correlation). In the applied sciences and the humanities, the term correlation is used in the context of a linear relationship between two continuous variables, and is expressed as Pearson’s correlation coefficient. Pearson’s correlation coefficient takes values between −1 and +1, where 0 indicates that there is no linear association; as the coefficient approaches −1 or +1, the relationship strengthens and eventually approaches a straight line. It is calculated using Eq. (8), where $R_{XY}$ represents the Pearson’s correlation coefficient between variables X and Y, and $n$ is the sample size.

$|R_{XY}|$ values correspond to specific levels: absence of correlation ($|R_{XY}| = 0$), very weak correlation ($0 < |R_{XY}| < 0.2$), weak correlation ($0.2 < |R_{XY}| < 0.4$), medium correlation ($0.4 < |R_{XY}| < 0.6$), strong correlation ($0.6 < |R_{XY}| < 0.8$) and very strong correlation ($0.8 < |R_{XY}| < 1$) (Cao et al. 2020; Wang et al. 2023).

Because of statistical collinearity between the explanatory variables, the estimated regression coefficients may be high in absolute value; their signs may be counter-intuitive; the variances of the estimators may be high; and the regression coefficients and the multiple correlation coefficient are unstable with respect to the correlation coefficients between the explanatory variables (Brunet-Le Rouzic 1981).
Statistical collinearity therefore creates significant difficulties in interpreting results (Tehrany et al. 2019). To improve model performance, one of the two variables is therefore removed when the correlation between them exceeds 0.8 in absolute value (Maharjan et al. 2024).

$$R_{XY} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{8}$$

Multicollinearity analysis

Multicollinearity arises in a regression analysis when multiple independent variables are substantially correlated with one another, in addition to being significantly related to the dependent variable (Young 2017). This condition can lead to wider confidence intervals and less reliable estimates for the predictor variables (Shrestha 2020). Consequently, the outcomes derived from a model affected by multicollinearity may lack credibility (Harrell 2015; Hosmer et al. 2013). Therefore, factors with high multicollinearity should be removed from models to improve prediction accuracy (Salmerón et al. 2018). Commonly used techniques for analyzing multicollinearity include the variance inflation factor (VIF) and tolerance (TOL). High multicollinearity is observed when VIF values exceed 10 and TOL values are less than 0.10 (Bai et al. 2010). The VIF and TOL can be calculated using Eqs. (9) and (10), respectively, where $R_j^2$ is the coefficient of determination obtained when factor $j$ is regressed on the other factors.

$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2} \tag{9}$$

$$\mathrm{TOL}_j = 1 - R_j^2 \tag{10}$$

Logistic regression (LR) model

Despite the existence of various modeling methods, forecasting and predicting flood-related disasters remain complex challenges (Dottori et al. 2018). The selection of appropriate approaches for studying susceptibility is often limited by the availability, quantity and quality of data, and by the objectives and scope of the study (Mind’je et al. 2019). Models must also be simple and straightforward for planners to use in policy making (Fustos et al. 2017). LR is a simple and powerful tool to support policy and decision makers using limited available data. This method is most useful for understanding the influence of several independent variables on a single dichotomous outcome variable. It produces good results from one or more independent variables, and these results are easy to interpret through the coefficients that the model generates to predict flood occurrence in different regions (Tehrany et al. 2013). An additional advantage of LR is that, by adding an appropriate link function to the usual linear regression model, variables can be continuous, discrete or any combination of both, and do not necessarily have a normal distribution (Al-Juaidi et al. 2018; Lin and Billa 2021). Thus, for this study, LR was chosen to predict the relationship between flood occurrence (the dependent variable) and its causal factors (one or more independent variables). The dependent variable is a binary variable (0 or 1) representing the absence or presence of flooding, so the LR function can be used. The relationship between flood occurrence and the other variables is described by Eqs. (11) and (12):

$$P = \frac{1}{1 + e^{-Z}} \tag{11}$$

$$Z = \beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n \tag{12}$$

where $P$ is the probability of flood occurrence, which varies from 0 to 1 on a sigmoid-shaped curve; $Z$ is a linear combination of the independent variables, and its value varies from −∞ to +∞; $\beta_0$ is the intercept term of the model; $n$ is the number of independent variables ($X_i$); and $\beta_i$ ($i$ = 1, 2, 3, …, $n$) is the slope coefficient for independent variable $X_i$. This flood probability ($P$) corresponds to the flood susceptibility in this study. The flood susceptibility estimated by LR varies between 0 and 1, and the closer the value is to 1, the higher the probability of flood occurrence.

Interaction analysis

An interaction occurs when the impact of one factor varies according to the levels of another, indicating that they do not independently affect an outcome (Knol et al. 2011). This variation can be measured on the additive scale (combined effect greater or less than the sum of the individual effects) or on the multiplicative scale (combined effect greater or less than the product of the individual effects). The analysis of interactions allows us to understand how factors jointly influence the occurrence of floods in the study area (Wang et al. 2023). Let A and B be two factors with levels A−, A+, B− and B+, and let $R$ denote the absolute risk (here and below). On the additive scale, if the relationship between A and B satisfies Eq. (13), there is no additive interaction between the two factors:

$$R_{A+B+} - R_{A-B-} = \left( R_{A+B-} - R_{A-B-} \right) + \left( R_{A-B+} - R_{A-B-} \right) \tag{13}$$

If the relationship between A and B satisfies Eq. (14), there is a positive additive interaction between the two factors, and vice versa:

$$R_{A+B+} - R_{A-B-} > \left( R_{A+B-} - R_{A-B-} \right) + \left( R_{A-B+} - R_{A-B-} \right) \tag{14}$$

On the multiplicative scale, if the relationship between A and B satisfies Eq. (15), there is no multiplicative interaction between the two factors:

$$R_{A+B+} / R_{A-B-} = \left( R_{A+B-} / R_{A-B-} \right) \times \left( R_{A-B+} / R_{A-B-} \right) \tag{15}$$

If the relationship between A and B satisfies Eq. (16), the two factors have a positive multiplicative interaction, and vice versa:

$$R_{A+B+} / R_{A-B-} > \left( R_{A+B-} / R_{A-B-} \right) \times \left( R_{A-B+} / R_{A-B-} \right) \tag{16}$$

The relative excess risk due to interaction (RERI) is calculated with Eq. (17):

$$\begin{aligned} \mathrm{RERI} &= \left( R_{A+B+}/R_{A-B-} - R_{A-B-}/R_{A-B-} \right) - \left( R_{A+B-}/R_{A-B-} - R_{A-B-}/R_{A-B-} \right) - \left( R_{A-B+}/R_{A-B-} - R_{A-B-}/R_{A-B-} \right) \\ &= \left( OR_{A+B+} - 1 \right) - \left( OR_{A+B-} - 1 \right) - \left( OR_{A-B+} - 1 \right) \\ &= OR_{A+B+} - OR_{A+B-} - OR_{A-B+} + 1 \end{aligned} \tag{17}$$

The attributable proportion due to interaction (AP) is calculated using Eq. (18):

$$\mathrm{AP} = \frac{\mathrm{RERI}}{OR_{A+B+}} \tag{18}$$

The synergy index S is calculated using Eq. (19):

$$S = \frac{OR_{A+B+} - 1}{\left( OR_{A+B-} - 1 \right) + \left( OR_{A-B+} - 1 \right)} \tag{19}$$

When there is no additive interaction between the two factors, the confidence intervals of RERI and AP should contain 0, and the confidence interval of S should contain 1. The LR model used in this study takes the form of Eq. (20):

$$\mathrm{logit}(P) = \ln\left( \frac{P}{1 - P} \right) = \ln(\mathrm{odds}) = \beta_0 + \beta_1 A + \beta_2 B + \beta_3 AB \tag{20}$$

The separate effect of A is expressed by Eqs. (21) and (22):

$$\ln\left( \mathrm{odds}_{A+B-} \right) - \ln\left( \mathrm{odds}_{A-B-} \right) = \ln\left( \frac{\mathrm{odds}_{A+B-}}{\mathrm{odds}_{A-B-}} \right) = \ln\left( OR_{A+B-} \right) = \beta_0 + \beta_1 - \beta_0 = \beta_1 \tag{21}$$

$$OR_{A+B-} = \exp(\beta_1) \tag{22}$$

The separate effect of B is expressed by Eqs. (23) and (24):

$$\ln\left( \mathrm{odds}_{A-B+} \right) - \ln\left( \mathrm{odds}_{A-B-} \right) = \ln\left( \frac{\mathrm{odds}_{A-B+}}{\mathrm{odds}_{A-B-}} \right) = \ln\left( OR_{A-B+} \right) = \beta_0 + \beta_2 - \beta_0 = \beta_2 \tag{23}$$

$$OR_{A-B+} = \exp(\beta_2) \tag{24}$$

The combined effect of A and B is expressed by Eqs. (25) and (26):

$$\ln\left( \mathrm{odds}_{A+B+} \right) - \ln\left( \mathrm{odds}_{A-B-} \right) = \ln\left( \frac{\mathrm{odds}_{A+B+}}{\mathrm{odds}_{A-B-}} \right) = \ln\left( OR_{A+B+} \right) = \beta_0 + \beta_1 + \beta_2 + \beta_3 - \beta_0 = \beta_1 + \beta_2 + \beta_3 \tag{25}$$

$$OR_{A+B+} = \exp(\beta_1 + \beta_2 + \beta_3) \tag{26}$$

The multiplicative interaction is evaluated using Eqs. (27) and (28):

$$\frac{OR_{A+B+}}{OR_{A+B-} \times OR_{A-B+}} = \frac{\exp(\beta_1 + \beta_2 + \beta_3)}{\exp(\beta_1) \times \exp(\beta_2)} = \exp(\beta_3) \tag{27}$$

$$\begin{cases} \exp(\beta_1 + \beta_2 + \beta_3) = \exp(\beta_1) \times \exp(\beta_2), & \beta_3 = 0, \text{ no multiplicative interaction} \\ \exp(\beta_1 + \beta_2 + \beta_3) > \exp(\beta_1) \times \exp(\beta_2), & \beta_3 > 0, \text{ positive multiplicative interaction} \\ \exp(\beta_1 + \beta_2 + \beta_3) < \exp(\beta_1) \times \exp(\beta_2), & \beta_3 < 0, \text{ negative multiplicative interaction} \end{cases} \tag{28}$$

The additive interaction is evaluated using Eqs. (29), (30) and (31):

$$\mathrm{RERI} = OR_{A+B+} - OR_{A+B-} - OR_{A-B+} + 1 = \exp(\beta_1 + \beta_2 + \beta_3) - \exp(\beta_1) - \exp(\beta_2) + 1 \tag{29}$$

$$\mathrm{AP} = \frac{\mathrm{RERI}}{OR_{A+B+}} = \frac{\exp(\beta_1 + \beta_2 + \beta_3) - \exp(\beta_1) - \exp(\beta_2) + 1}{\exp(\beta_1 + \beta_2 + \beta_3)} \tag{30}$$

$$S = \frac{OR_{A+B+} - 1}{\left( OR_{A+B-} - 1 \right) + \left( OR_{A-B+} - 1 \right)} = \frac{\exp(\beta_1 + \beta_2 + \beta_3) - 1}{\left[ \exp(\beta_1) - 1 \right] + \left[ \exp(\beta_2) - 1 \right]} \tag{31}$$

Model validation

The main aim of flood susceptibility mapping is to determine the areas that are prone to flood hazards. However, it is essential to validate the flood susceptibility map to ensure its reliability (Ahmed 2023; Hong et al. 2018). Without such validation, the relevance of the map is compromised and it is of little use for disaster reduction programs (Wubalem et al. 2020). In addition, political decision-makers need this validation to assess the accuracy of the results and ensure that they can be used with confidence in the decision-making process (Mind’je et al. 2020). Although researchers have used many techniques to validate flood susceptibility models, the receiver operating characteristic (ROC) method is routinely used (Liuzzo et al. 2019; Seleem et al. 2022; Shafizadeh-Moghadam et al. 2018; Tehrany et al. 2013) because of its simplicity and its clear and reliable results (Khosravi et al. 2016; Pradhan & Lee 2010).
Therefore, in this study, a receiver operating characteristic (ROC) analysis was carried out using a confusion matrix in order to determine the extent to which the areas of high flood susceptibility predicted by the LR model were consistent with the flood inventory map derived from the surveys. A confusion matrix is a table (Table 2) that reports the results of a classifier using specific terms: “true positives (TP)”, the results that are both predicted and actually positive; “false positives (FP)”, the results predicted positive but actually negative; “true negatives (TN)”, the results that are both predicted and actually negative; and “false negatives (FN)”, the results predicted negative but actually positive.

Table 2 Overview of confusion matrix
Flood susceptibility | Flood inventory: flooded site | Flood inventory: non-flooded site
Susceptible location | True positive (TP) | False positive (FP)
Non-susceptible location | False negative (FN) | True negative (TN)

The ROC curve is the plot of the true positive rate (sensitivity) against the false positive rate (1 − specificity) at each threshold setting of the diagnostic model for flood-prone areas (Delacour et al. 2005). The two operating characteristics (specificity and sensitivity) can be expressed as Eq. (32) and Eq. (33), respectively, and both range from 0 to 1 (Ha et al. 2021).

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{32}$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \tag{33}$$

In ROC analysis, the performance of a test method can be assessed by calculating the area under the curve (AUC) using Eq. (34) (Mind’je et al. 2020):

$$\mathrm{AUC} = \sum_{j=1}^{n+1} \frac{1}{2} \sqrt{\left( X_j - X_{j+1} \right)^2} \times \left( Y_j + Y_{j+1} \right) \tag{34}$$

In this equation, $X_j$ represents (1 − specificity) and $Y_j$ the sensitivity at threshold $j$, with $X_{n+1} = 1$ and $Y_{n+1} = 1$. By convention, the plot shows the false-positive rate (1 − specificity) on the x-axis and the true-positive rate (the sensitivity, or 1 − the false-negative rate) on the y-axis. In other words, the ROC curve plots the true-positive rate for past floods against the false-positive rate of the susceptibility index, and the AUC summarises it in a single value. The closer the AUC is to 1, the better the detection performance, while the closer the AUC is to 0, the worse the detection performance. AUC can be evaluated using the following numerical and qualitative classification: less than 0.5 (not useful), 0.5 to 0.6 (bad), 0.6 to 0.7 (sufficient), 0.7 to 0.8 (good), 0.8 to 0.9 (very good) and 0.9 to 1 (excellent) (Özay & Orhan 2023; Simundic 2012). The success rate is obtained when the AUC is calculated using the training data set. The prediction rate is obtained when the AUC is calculated using the test data set.

Fig. 6 Pearson’s correlation matrix

Table 3 The variance inflation factors (VIF) and tolerance (TOL) values of the flood conditioning factors
Conditioning factors | TOL | VIF
Soil type | 0.93 | 1.08
LULC | 0.97 | 1.03
Rainfall | 0.69 | 1.44
TPI | 0.56 | 1.79
NDVI | 0.94 | 1.06
Distance to river | 0.78 | 1.28
TRI | 0.98 | 1.02
TWI | 0.61 | 1.63
Flow direction | 0.97 | 1.03
Plan curvature | 0.61 | 1.64
Profile curvature | 0.64 | 1.57
Aspect | 0.96 | 1.04
Slope | 0.79 | 1.26
Elevation | 0.58 | 1.73

Results and discussion

Statistical validation analyses

Pearson’s correlation matrix, TOL and VIF were computed for the 14 selected flood conditioning factors to ensure the validity of the regression hypotheses.
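The screening just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow used in the study: the function names `vif_and_tolerance` and `flag_collinear` are hypothetical, NumPy is assumed to be available, and the factor values are assumed to have been sampled into a table with one row per location and one column per conditioning factor. The thresholds (|R| > 0.8, VIF > 10, TOL < 0.10) are the ones quoted in the text.

```python
import numpy as np


def vif_and_tolerance(X):
    """Return (vif, tol) arrays with one entry per column of X.

    For column j, R_j^2 comes from an ordinary least-squares regression of
    column j on all other columns plus an intercept, so that
    TOL_j = 1 - R_j^2 (Eq. 10) and VIF_j = 1 / (1 - R_j^2) (Eq. 9).
    Constant columns are not handled (they would give a zero denominator).
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    tol = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # intercept + other factors
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        tol[j] = ss_res / ss_tot  # equals 1 - R_j^2
    return 1.0 / tol, tol


def flag_collinear(X, names, r_max=0.8, vif_max=10.0, tol_min=0.10):
    """List factor pairs with |Pearson R| > r_max and factors failing VIF/TOL."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)  # Pearson correlation matrix (Eq. 8)
    k = len(names)
    pairs = [
        (names[i], names[j], float(corr[i, j]))
        for i in range(k)
        for j in range(i + 1, k)
        if abs(corr[i, j]) > r_max
    ]
    vif, tol = vif_and_tolerance(X)
    flagged = [names[j] for j in range(k) if vif[j] > vif_max or tol[j] < tol_min]
    return pairs, flagged
```

Applied to a table of the 14 factors, an empty `pairs` list and an empty `flagged` list would correspond to the situation in which all factors are retained.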
In the resulting correlation matrix (Fig. 6), the Pearson’s correlation coefficients are below 0.6 in absolute value, indicating that the variables are largely independent of each other. The variables with the highest positive correlation coefficient were distance to river and elevation, with an $R_{XY}$ value of 0.53. Plan curvature and profile curvature had the highest negative correlation coefficient, with an $R_{XY}$ value of −0.56. The lowest tolerance value reported as evidence of collinearity (Table 3) was obtained for elevation (0.58), which is higher than the theoretical critical value (0.10). Additionally, the VIF values for all variables were below the multicollinearity threshold (< 10.00). The multicollinearity analysis (Table 3) therefore showed no significant multicollinearity between the flood-related independent variables. Once the multicollinearity analysis was completed, all conditioning factors were used in the LR method.

Logistic regression (LR) estimation

For LR, the training sample is used to estimate the slope coefficients of all independent variables. The results of the LR analysis for all independent variables are shown in Table 4.
Table 4 Logistic regression coefficients
Conditioning factors | Estimated coefficient (β) | Standard error | z value | p-value
Intercept | −2.47 | 2.90 | −0.85 | 0.39
Soil type | 0.09 | 0.22 | −0.43 | 0.67
LULC | 0.38 | 0.26 | 1.47 | 0.14
Rainfall | 3.23 | 0.39 | 8.57 | < 2 × 10^−16 ***
TPI | 1.62 | 2.35 | 0.69 | 0.49
NDVI | −3.60 | 1.40 | −2.57 | 0.01 *
Distance to river | −3.90 | 0.44 | −8.95 | < 2.2 × 10^−16 ***
TRI | 0.29 | 0.64 | 0.45 | 0.65
TWI | −0.68 | 0.77 | −0.89 | 0.38
Flow direction | −0.08 | 0.23 | −0.34 | 0.75
Plan curvature | 8.28 | 3.75 | 2.21 | 0.03 *
Profile curvature | 1.92 | 3.21 | 0.60 | 0.55
Aspect | 0.27 | 0.22 | 1.26 | 0.21
Slope | −1.17 | 1.60 | −0.73 | 0.47
Elevation | −6.10 | 0.65 | −9.45 | < 2 × 10^−16 ***
***p-value ≤ 0.001; **p-value ≤ 0.01; *p-value ≤ 0.05

Table 5 Regression coefficients obtained for the five statistically significant conditioning factors
Conditioning factors | Estimated coefficient (β) | Standard error | z value | p-value
Intercept | −0.98 | 1.42 | −0.69 | 0.49
Plan curvature | 9.43 | 3.06 | 3.08 | 2.04 × 10^−3 **
Rainfall | 3.16 | 0.38 | 8.30 | < 2.2 × 10^−16 ***
NDVI | −3.79 | 1.41 | −2.69 | 7.11 × 10^−3 **
Distance to river | −3.80 | 0.45 | −8.43 | < 2.2 × 10^−16 ***
Elevation | −5.85 | 0.63 | −9.31 | < 2.2 × 10^−16 ***
***p-value ≤ 0.001; **p-value ≤ 0.01; *p-value ≤ 0.05

The z value and p-value columns represent the statistical significance of each coefficient in the model. A higher absolute z value indicates that the estimated coefficient is more statistically significant (Basu et al. 2017). A lower p-value indicates that the coefficient is more statistically significant, and a value less than 0.05 is often considered as evidence to reject the null hypothesis (Andrade 2019).
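As a rough illustration of how the z value and p-value columns relate to the coefficients and standard errors, the sketch below applies the usual Wald test (z = β / SE, two-sided p-value from the standard normal distribution). The function name `wald_test` is hypothetical, and the assumption that the reported p-values are two-sided normal p-values is ours; it reproduces the Table 5 entries only up to the rounding of the published β and SE values.

```python
import math


def wald_test(beta, se, alpha=0.05):
    """Return (z, p, significant) for one logistic regression coefficient.

    z is the coefficient divided by its standard error; p is the two-sided
    p-value 2 * (1 - Phi(|z|)), computed here as erfc(|z| / sqrt(2)).
    """
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p, p < alpha


# Plan curvature and NDVI rows of Table 5 (beta, standard error).
for name, beta, se in [("Plan curvature", 9.43, 3.06), ("NDVI", -3.79, 1.41)]:
    z, p, significant = wald_test(beta, se)
    print(f"{name}: z = {z:.2f}, p = {p:.2e}, significant = {significant}")
```

With α = 0.05, the condition p < 0.05 is equivalent to |z| > 1.96, which is the criterion used to retain factors.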
The results reveal that the null hypothesis was rejected for rainfall, NDVI, distance to river, plan curvature and elevation (Table 4), indicating that these predictors are statistically significant (p-value < 0.05 and |z| > 1.96) at the 95% confidence level in the LR model. In addition, a positive coefficient (β) means that the variable increases the likelihood of flooding, and a negative coefficient means that the variable decreases the probability of flood occurrence (Chauhan et al. 2010; Shafapour Tehrany et al. 2017). Table 5 shows the new regression coefficients obtained by fitting the LR model using only the five statistically significant predictors.

According to the estimated coefficients (β), plan curvature and rainfall showed a positive relationship with flood incidence, while elevation, distance to river and NDVI showed negative relationships (Table 5). Of these factors, elevation, distance to river and rainfall had the highest absolute z values (9.31, 8.43 and 8.30, respectively), while plan curvature and NDVI had the lowest absolute z values (3.08 and 2.69). This shows that elevation, distance to river and rainfall have a significant impact on the occurrence of flooding in the city of Ouagadougou. Based on the absolute z value (Table 5), the five significant predictors can be ranked by importance as follows: (1) elevation, (2) distance to river, (3) rainfall, (4) plan curvature and (5) NDVI. Finally, on the strength of the regression results (Table 5), these five factors with a statistically significant impact on the occurrence of floods were used in the GIS to obtain a flood probability map (Fig. 7a) based on Eq. (35). Flood susceptibility was then represented by classifying the probability in the range [0, 1] into five classes using the natural break method (Fig. 7b).
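To make the mapping step concrete, the following sketch evaluates the linear predictor with the Table 5 coefficients, converts it to a probability with the logistic function of Eq. (11), and assigns one of the five susceptibility classes using the class limits reported for the natural break classification (0.18, 0.36, 0.54 and 0.72). Several points are assumptions rather than details given in the text: the function and dictionary names are hypothetical, the factor values are assumed to be on the same scale as the data used to fit the model, and a probability falling exactly on a class limit is assigned to the higher class.

```python
import math

# Intercept and coefficients taken from Table 5.
INTERCEPT = -0.98
COEFFS = {
    "plan_curvature": 9.43,
    "rainfall": 3.16,
    "ndvi": -3.79,
    "distance_to_river": -3.80,
    "elevation": -5.85,
}

# Upper limits of the four lower classes; anything above is "very high".
CLASS_LIMITS = [(0.18, "very low"), (0.36, "low"), (0.54, "medium"), (0.72, "high")]


def flood_probability(factors):
    """Logistic probability for one pixel, given a dict of factor values."""
    z = INTERCEPT + sum(COEFFS[name] * factors[name] for name in COEFFS)
    return 1.0 / (1.0 + math.exp(-z))


def susceptibility_class(p):
    """Map a probability in [0, 1] to one of the five susceptibility classes."""
    for upper, label in CLASS_LIMITS:
        if p < upper:
            return label
    return "very high"


# Hypothetical pixel with all (scaled) factor values equal to zero.
pixel = {name: 0.0 for name in COEFFS}
p = flood_probability(pixel)
print(round(p, 3), susceptibility_class(p))
```

In a GIS the same expression would be evaluated cell by cell on the factor rasters; the dictionary-based version here only shows the arithmetic for a single location.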
$$Z = -0.98 + (9.43 \times \text{Plan curvature}) + (3.16 \times \text{Rainfall}) - (3.79 \times \text{NDVI}) - (3.80 \times \text{Distance to river}) - (5.85 \times \text{Elevation}) \tag{35}$$

The idea of the natural break method is to minimize the variance between objects within chosen subsets and maximize the variance between subsets (Fenglin et al. 2023; Jenks 1967; Lee & Kim 2021). The five classes were very high (0.72–1), high (0.54–0.72), medium (0.36–0.54), low (0.18–0.36) and very low (0–0.18). The corresponding susceptibility class statistics were extracted with the zonal statistics toolbox and are shown in Table 6. Table 7 shows the incidence of flooding according to the different categories of the flood conditioning factors.

Fig. 7 Flood susceptibility map: a probability map obtained by logistic regression analyses, b reclassified flood susceptibility map

Table 6 Relationship between number of floods and level of susceptible areas
Susceptibility classes | Number of floods | Flood ratio (%) | Partition area (ha) | Partition ratio (%) | Flood disaster density (per 1000 ha)
Very high | 466 | 45.42 | 10,178.17 | 18.48 | 45.78
High | 306 | 29.82 | 10,458.51 | 18.99 | 29.26
Medium | 174 | 14.33 | 10,149.95 | 18.43 | 17.14
Low | 79 | 7.70 | 10,970.58 | 19.98 | 7.20
Very low | 28 | 2.73 | 13,317.04 | 24.18 | 2.10
Total | 1052 | 100 | 55,074.25 | 100 | 19.10

Table 7 Incidence of flooding according to different categories of the flood conditioning factors
Factors | Classification | Number of floods | Flood ratio (%) | Partition area (ha) | Partition ratio (%) | Flood disaster density (per 1000 ha)
Elevation (m) | 275–296 | 214 | 29.76 | 9370.26 | 17.02 | 22.84
Elevation (m) | 296–305 | 282 | 39.22 | 15,287.85 | 27.76 | 18.45
Elevation (m) | 305–313 | 165 | 22.95 | 14,075.55 | 25.56 | 11.72
Elevation (m) | 313–323 | 50 | 6.95 | 11,463.57 | 20.82 | 4.36
Elevation (m) | 323–349 | 8 | 1.11 | 4866.93 | 8.84 | 1.64
Distance to river (m) | 0–218.40 | 419 | 58.28 | 16,275.06 | 29.56 | 25.75
Distance to river (m) | 218.40–466.69 | 183 | 25.45 | 13,196.70 | 23.97 | 13.87
Distance to river (m) | 466.69–726.22 | 76 | 10.57 | 11,405.70 | 20.71 | 6.66
Distance to river (m) | 726.22–1012.02 | 28 | 3.89 | 8921.07 | 16.20 | 3.14
Distance to river (m) | 1012.02–1667.09 | 13 | 1.81 | 5267.43 | 9.57 | 2.47
Rainfall (mm) | 882.38–883.54 | 159 | 22.11 | 14,092.38 | 25.59 | 11.28
Rainfall (mm) | 883.54–884.41 | 107 | 14.88 | 12,690.45 | 23.05 | 8.43
Rainfall (mm) | 884.41–885.34 | 193 | 26.84 | 10,382.94 | 18.86 | 18.59
Rainfall (mm) | 885.34–886.29 | 218 | 30.32 | 9696.96 | 17.61 | 22.48
Rainfall (mm) | 886.29–887.78 | 42 | 5.84 | 8200.89 | 14.89 | 5.1
Plan curvature | < 0 (convex) | 216 | 30.04 | 18,291.96 | 33.22 | 11.81
Plan curvature | 0 (linear surface) | 241 | 33.52 | 18,416.34 | 33.44 | 13.09
Plan curvature | > 0 (concave) | 262 | 36.44 | 18,355.86 | 33.34 | 14.27
NDVI | −0.23 to −0.05 | 102 | 14.19 | 3752.64 | 6.82 | 27.18
NDVI | −0.05 to −0.02 | 280 | 38.94 | 15,532.29 | 28.21 | 18.03
NDVI | −0.02 to 0 | 219 | 30.46 | 23,735.07 | 43.10 | 9.23
NDVI | 0–0.05 | 111 | 15.44 | 9799.65 | 17.80 | 11.33
NDVI | 0.05–0.37 | 7 | 0.97 | 2244.33 | 4.08 | 3.12
Slope | 0–1.22 | 253 | 35.19 | 19,540.71 | 35.48 | 12.94
Slope | 1.22–2.16 | 146 | 20.31 | 19,563.66 | 35.52 | 7.46
Slope | 2.16–3.19 | 253 | 35.19 | 10,930.41 | 19.85 | 23.14
Slope | 3.19–4.90 | 56 | 7.79 | 4386.24 | 7.96 | 12.77
Slope | 4.90–24.89 | 11 | 1.53 | 643.14 | 1.16 | 17.10
Aspect | −1 to 65.55 | 142 | 19.75 | 11,975.58 | 21.74 | 11.85
Aspect | 65.55–140.19 | 130 | 18.08 | 12,270.96 | 22.28 | 10.59
Aspect | 140.19–212.74 | 158 | 21.97 | 10,243.44 | 18.60 | 15.42
Aspect | 212.74–284.53 | 160 | 22.25 | 9838.08 | 17.86 | 16.26
Aspect | 284.53–358.73 | 129 | 17.94 | 10,736.1 | 19.49 | 12.01
TPI | −20.78 to −2.38 | 221 | 30.74 | 4589.73 | 8.33 | 48.15
TPI | −2.38 to −0.72 | 178 | 24.76 | 13,339.71 | 24.22 | 13.34
TPI | −0.72 to 0.67 | 74 | 10.29 | 18,030.96 | 32.74 | 4.10
TPI | 0.67–2.28 | 201 | 27.96 | 14,228.82 | 25.84 | 14.12
TPI | 2.28–26.77 | 45 | 6.26 | 4874.94 | 8.85 | 9.23
TRI | 0–0.40 | 40 | 5.56 | 2850.66 | 5.17 | 14.03
TRI | 0.40–0.47 | 140 | 19.47 | 12,657.06 | 22.98 | 11.06
TRI | 0.47–0.52 | 215 | 29.90 | 18,759.69 | 34.07 | 11.46
TRI | 0.52–0.58 | 232 | 32.27 | 15,306.75 | 27.80 | 15.15
TRI | 0.58–24.81 | 92 | 12.80 | 5490 | 9.97 | 16.75
Profile curvature | < 0 (convex) | 332 | 46.18 | 22,640.4 | 41.12 | 14.66
Profile curvature | 0 (linear surface) | 100 | 13.91 | 9722.52 | 17.66 | 10.28
Profile curvature | > 0 (concave) | 287 | 39.92 | 22,701.24 | 41.22 | 12.64
Flow direction | 1–37 | 141 | 19.61 | 39,801.96 | 72.28 | 3.54
Flow direction | 37–95 | 500 | 69.54 | 9444.96 | 17.15 | 52.93
Flow direction | 95–161 | 71 | 9.87 | 5129.73 | 9.31 | 13.84
Flow direction | 161–217 | 2 | 0.28 | 438.3 | 0.79 | 4.56
Flow direction | 217–255 | 5 | 0.70 | 249.21 | 0.45 | 20.06
TWI | 4.17–7.56 | 208 | 28.93 | 24,852.96 | 45.13 | 8.36
TWI | 7.56–9.07 | 100 | 13.91 | 15,763.59 | 28.62 | 6.34
TWI | 9.07–11.05 | 330 | 45.90 | 8466.93 | 15.37 | 38.97
TWI | 11.05–13.72 | 56 | 7.79 | 4453.2 | 8.08 | 12.57
TWI | 13.72–22.87 | 25 | 3.48 | 1527.48 | 2.77 | 16.36
Soil type | Hydromorphic soil | 657 | 91.38 | 42,744.46 | 77.64 | 15.37
Soil type | Poorly developed soil | 62 | 8.62 | 12,308.12 | 22.35 | 5.037
LULC | Fallow | 52 | 7.23 | 9222.62 | 16.75 | 5.63
LULC | Water surfaces | 16 | 2.23 | 1391.5 | 2.52 | 11.49
LULC | Vegetation | 53 | 7.37 | 6179.47 | 11.22 | 8.57
LULC | Housing area | 598 | 83.17 | 38,258.99 | 69.49 | 15.63

Analysis of interaction results

The interaction analysis was based on the five factors with a statistically significant influence on the occurrence of flooding, namely elevation, distance to river, rainfall, plan curvature and NDVI. Elevation, distance to river and rainfall were divided into high and low categories using the mean value as the threshold. NDVI was divided into wetlands for values less than or equal to 0 and other land for values greater than 0. Plan curvature was divided into flat land for values equal to 0 and other land for values other than 0.

Table 8 Analysis of interaction results
Conditioning factors | Individual effect | Common effect | Additive scale | Multiplicative scale
Elevation | 5.97 | | |
Distance to river | 3.78 | | |
Rainfall | 2.62 | | |
Plan curvature | 0.70 | | |
NDVI | 1.34 | | |
Elevation * Distance to river | | 0.66 | Positive interaction | Negative interaction
Elevation * Rainfall | | 0.65 | No interaction | Negative interaction
Elevation * Plan curvature | | 0.98 | No interaction | Negative interaction
Elevation * NDVI | | 0.91 | Positive interaction | Negative interaction
Distance to river * Rainfall | | 0.89 | No interaction | Negative interaction
Distance to river * Plan curvature | | 1.72 | No interaction | Negative interaction
Distance to river * NDVI | | 1.66 | Positive interaction | Positive interaction
Rainfall * Plan curvature | | 1.32 | No interaction | Negative interaction
Rainfall * NDVI | | 1.05 | No interaction | Negative interaction
Plan curvature * NDVI | | 0.80 | No interaction | Negative interaction

The results show that, for product interaction, the individual effects of elevation, distance to river, rainfall, plan curvature and NDVI are 5.97, 3.78, 2.62, 0.70 and 1.34, respectively (Table 8). Elevation had the greatest influence on flood susceptibility, followed by distance to river and rainfall. Regarding the interaction between factors, the multiplicative interaction between distance to river and NDVI promotes the occurrence of flooding, while the multiplicative interaction between the other factors is negative. On the additive scale, the interactions between elevation and distance to river and between distance to river and NDVI promote the occurrence of flooding, while there is no interaction between the other factors.

Validation of the flood susceptibility map

The ROC analysis was carried out in this study to confirm quantitatively whether the flood-susceptible areas obtained overlap with those where floods have occurred in the past. The ROC curves of the LR model are shown in Fig. 8, with AUC values of approximately 81% for the prediction rate and 82% for the success rate. On this basis, it can be argued that the model has a good predictive capacity, but the prediction of non-flooded sites is still insufficient considering that, for a perfect model, the AUC is equal to 100%.
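The AUC values quoted above can be obtained from the susceptibility values at flooded and non-flooded sites with the trapezoidal rule. The sketch below is a minimal pure-Python version, not the tool used in the study; the function names and the sample scores are hypothetical, and labels are assumed to be 1 for flooded and 0 for non-flooded sites.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr) lists, one point per distinct score threshold.

    Sites are sorted by decreasing susceptibility; lowering the threshold past
    a score adds the sites with that score to the "predicted flooded" set.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both flooded and non-flooded sites are required")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Tied scores are moved across the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / negatives)  # 1 - specificity
        tpr.append(tp / positives)  # sensitivity
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by summing trapezoids between points."""
    return sum(
        0.5 * abs(fpr[j + 1] - fpr[j]) * (tpr[j] + tpr[j + 1])
        for j in range(len(fpr) - 1)
    )


# Hypothetical susceptibility values and flood labels for eight sites.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
fpr, tpr = roc_points(scores, labels)
print(auc_trapezoid(fpr, tpr))
```

Running the same two functions on the training sites gives the success rate, and on the held-out sites the prediction rate.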
Discussion

Many factors can cause flooding, and future floods are very difficult to predict accurately (Khosravi et al. 2016). For this reason, it is important to collect as many flood-related variables as possible and to choose a suitable analysis model for detecting future floods. This study combined the LR model with 14 geospatial variables related to flooding in the study area: elevation, slope, aspect, plan curvature, profile curvature, TPI, TRI, TWI, flow direction, distance to river, rainfall, LULC, NDVI and soil type. A flood susceptibility map was created, providing satisfactory and reliable results that can be used as a geospatial database for flood risk management decision-making and the development of the city of Ouagadougou. The study revealed that 18.48% of the city has very high susceptibility to flooding, 18.99% has high susceptibility, 18.43% has moderate susceptibility, and 19.98% and 24.18% have low and very low susceptibility, respectively (Table 6). These results contrast with those of Kafando et al. (2023), who underestimated areas of high flood sensitivity, probably because they used the AHP technique, a method based on expert opinion rather than on actual flood incidents in Ouagadougou.

The findings of this research highlight five factors that significantly affect the likelihood of flooding (p-value less than 0.05): elevation, distance to river, rainfall, plan curvature, and NDVI (Table 5). Table 7 shows the incidence of flooding across the categories of these influential factors. Notably, flooding is more prevalent at lower elevations (275–296 m), where the density of flooding is high. Rainfall events exacerbate this issue: water flowing from higher elevations accumulates at lower ones, causing flooding.
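The flood ratio, partition ratio and flood density columns of Table 7 follow directly from the flood counts and class areas. The sketch below recomputes them for the LULC classes from the counts and areas printed in Table 7. The function and variable names are illustrative only, and small differences in the last digit relative to the table are to be expected from rounding.

```python
def class_statistics(classes):
    """Return flood ratio (%), partition ratio (%) and density per 1000 ha."""
    total_floods = sum(floods for floods, _ in classes.values())
    total_area = sum(area for _, area in classes.values())
    stats = {}
    for name, (floods, area_ha) in classes.items():
        stats[name] = {
            # share of all recorded floods that fall in this class
            "flood_ratio_pct": 100.0 * floods / total_floods,
            # share of the total mapped area occupied by this class
            "partition_ratio_pct": 100.0 * area_ha / total_area,
            # number of floods per 1000 ha of this class
            "density_per_1000ha": 1000.0 * floods / area_ha,
        }
    return stats


# LULC rows of Table 7: class -> (number of floods, partition area in ha).
lulc = {
    "Fallow": (52, 9222.62),
    "Water surfaces": (16, 1391.5),
    "Vegetation": (53, 6179.47),
    "Housing area": (598, 38258.99),
}

for name, s in class_statistics(lulc).items():
    print(f"{name}: {s['flood_ratio_pct']:.2f}% of floods, "
          f"{s['partition_ratio_pct']:.2f}% of area, "
          f"{s['density_per_1000ha']:.2f} floods per 1000 ha")
```

Comparing the flood ratio with the partition ratio shows at a glance which classes are over-represented among flooded sites: housing areas hold about 83% of the floods on about 69% of the land.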
Areas closer to rivers (0–218.40 m and 218.40–466.69 m) experience denser flooding because Ouagadougou’s primary rivers have been transformed into canals, reducing the area of permeable land. As a result, the riverbanks and beds have become almost entirely impervious to runoff, extending the time required for stormwater drainage. This leads to water overflowing into homes, where it pools for extended periods. Rainfall is positively correlated with flood density, but there are fewer floods in the wettest regions (882.38–883.54 mm and 883.54–884.41 mm). This discrepancy might stem from incomplete flood mapping or from the combined effect of other variables. Moreover, some areas, even those with low annual rainfall, may be flooded due to extreme rainfall events or several consecutive days of rain (Hangnon et al. 2015; Nouaceur 2020). Not taking extreme rainfall events, and therefore the seasonal variability of rainfall, into account is a limitation of this study and may also help to explain this result. Positive plan curvature (concave shape) enhances flood frequency since it concentrates water flow within the basin, increasing soil moisture and reducing saturation time, thus facilitating flooding during intense rainfall.

Fig. 8 Results of the ROC analysis: a success rate curve using the training dataset; b prediction rate curve using the validation dataset
Similarly, flooding is more common in areas with low NDVI values (−0.23 to −0.05 and −0.05 to −0.02), primarily because such areas are saturated with water, making them prone to flooding during heavy precipitation.

The variables found to affect flood susceptibility in the city of Ouagadougou are in line with research on other territories (Breinl et al. 2021; Hidayat Jati et al. 2019; Moazzam et al. 2018; Ullah and Zhang 2020; Waiyasusri et al. 2023). In a case study of flood susceptibility mapping in the Swat River basin, Rahman et al. (2023) found that slope significantly influenced flood risk, in contrast to Ouagadougou, where slope had little impact, probably due to the city's uniform morphology (Kêdowidé et al. 2010). This underlines the importance of taking local topography and geomorphology into account in flood susceptibility studies and the need to identify the main causal factors specific to the geography and physical characteristics of each area (Waiyasusri et al. 2023; Wang et al. 2023).

Analysis of the interaction of the various factors (Table 8) showed that the combined action of elevation and distance to river increases the probability of flooding on the additive scale. This can be explained by the fact that flooding occurs more frequently in areas of low elevation, and in the city of Ouagadougou it is mainly the areas close to rivers that are characterized by low elevation (Figs. 3a and 4b). Furthermore, the combined actions of elevation, distance to river and rainfall reduce the probability of flooding on the multiplicative scale, which is contrary to our previous understanding. However, Fig. 4d shows that areas of high rainfall are located in the western part of the study area, which is characterized by high elevation values and a reduced number of rivers. In these areas, water is easily evacuated during rainfall, reducing the risk of flooding.
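The distinction between the additive and multiplicative scales can be made concrete with two measures commonly used in epidemiology: the relative excess risk due to interaction (RERI) for the additive scale, and the ratio of the joint effect to the product of the individual effects for the multiplicative scale. The sketch below is only an illustration under that assumption. It does not claim to reproduce the study's exact computation, and the odds ratios are hypothetical rather than the values of Table 8.

```python
def interaction_scales(or_a, or_b, or_ab):
    """Classify the interaction between two binary factors from odds ratios.

    or_a  : odds ratio when only factor A is present
    or_b  : odds ratio when only factor B is present
    or_ab : odds ratio when both factors are present
    All three are relative to the category where neither factor is present.
    """
    reri = or_ab - or_a - or_b + 1.0   # additive scale, null value 0
    ratio = or_ab / (or_a * or_b)      # multiplicative scale, null value 1

    def label(value, null):
        if value > null:
            return "positive"
        if value < null:
            return "negative"
        return "none"

    return {
        "reri": reri,
        "additive": label(reri, 0.0),
        "ratio": ratio,
        "multiplicative": label(ratio, 1.0),
    }


# Hypothetical odds ratios (assumed for illustration, not from Table 8).
result = interaction_scales(or_a=2.0, or_b=3.0, or_ab=5.0)
print(result)
```

With these made-up values the joint effect exceeds the sum of the individual excess effects (RERI = 1 > 0) but falls short of their product (ratio of about 0.83 < 1). An interaction can therefore be positive on the additive scale and negative on the multiplicative scale at the same time, which is the pattern reported for elevation and distance to river. In practice the classification would also depend on the confidence intervals of these measures, which the sketch ignores.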
In assessing the performance of the model used in this study, it was noted that, in some cases, flooding occurred in areas with a susceptibility below 0.5, while in other cases no flooding occurred even in areas with a susceptibility higher than 0.5 (Table 6). Therefore, areas with relatively low susceptibility are not necessarily safer. Some areas may not be flooded because of flood protection facilities, such as retention basins and stormwater drainage networks, which municipal authorities have already built to cope with heavier rainfall (Sobieraj et al. 2022). Other areas, on the other hand, may be subject to flooding due to human factors, such as road infrastructure development and building density (Hitouri et al. 2024; Rafiei-Sardooi et al. 2021). The performance of the model could be improved if these factors were taken into account. Areas with high susceptibility require special management because of their high probability of flooding; however, low-susceptibility areas also need to be carefully managed in order to reduce flood damage.

Another shortcoming of this study is the resolution of the datasets used. Free low-resolution datasets were used to derive the predictive variables from the digital terrain model and from the soil and precipitation data. Data resolution affects the accuracy of prediction results: low-resolution datasets can fail to capture the detail required for accurate flood susceptibility modelling (Saha et al. 2021). This lack of detail can lead to inaccuracies in the identification of potentially flood-prone areas. In addition, free datasets often fail to take into account temporal variations in environmental conditions, such as changes in vegetation cover, urban sprawl, seasonal variations of rainfall and extreme weather events (Saber et al. 2020).
These temporal variations can have a significant impact on flood susceptibility, and neglecting them can lead to outdated or inaccurate forecasts. Finally, the technology employed for data collection and analysis may have inherent limitations, such as sensor errors and cloud cover in satellite imagery (Al-Wassai & Kalyankar 2013; Teh et al. 2020). These technological limitations can lead to errors or gaps in the data, affecting the accuracy and reliability of the model. Consequently, it is crucial to have access to high-quality, high-resolution data in order to capture the factors influencing flood susceptibility accurately and to improve the accuracy of the resulting maps. In future research, topographic data from Light Detection and Ranging (LiDAR) surveys, with a spatial resolution of around one meter (Néelz et al. 2006; Zwenzner & Voigt 2009), and high-resolution aerial photographs can be used to describe and analyze flooded areas (Schumann et al. 2009).

Conclusion

The identification of flood-prone areas is essential for the implementation of measures to protect people’s lives and property. Once flood-prone areas have been identified, managers and people living in the areas at highest risk need to be vigilant during the rainy season and should pay particular attention to possible flood disasters.
This research applied the LR model, using a flood inventory and fourteen predictive factors, to study and map flood susceptibility in Ouagadougou, Burkina Faso. The flood inventory includes 1,026 flood sites identified through a household survey in the city of Ouagadougou. These data were used to calibrate (70%) and validate (30%) the flood susceptibility model. The Pearson correlation test, a multicollinearity test and other steps were then carried out to ensure that the factors introduced into the model were valid. Finally, the AUC was used to assess the model’s performance. Five flood conditioning factors (elevation, distance to river, rainfall, plan curvature and NDVI) were identified as statistically significant (p-value < 0.05 and |Z| > 1.96) at the 95% confidence level and were used to produce the map of flood-susceptible areas. On the multiplicative scale, there is a positive interaction between distance to river and NDVI and a negative interaction between the other factors. On the additive scale, there is a positive interaction between elevation and distance to river and between distance to river and NDVI, while there is no interaction between the other factors. The AUC values were approximately 81% and 82% for the prediction and success rates, respectively, and the model is considered to have good predictive ability.
The reliability of the LR model is confirmed again by this research. The susceptibility map generated can be used to extract data on the population, infrastructure and resources exposed to flood risk. This information can serve as a reference for planners, future researchers and disaster risk managers, and as an additional decision-support tool for integrating flood risk into the urban development planning documents of the city of Ouagadougou.

It is crucial that municipal authorities integrate this flood susceptibility map into their urban planning. They should prioritize areas of low flood susceptibility for residential, commercial or public development. This approach aims to reduce the exposure of populations and critical infrastructure to flood risks. In addition, this map should be incorporated into stormwater management strategies, with particular emphasis on the development of efficient drainage systems in areas of high flood susceptibility. It is also recommended to launch educational and awareness-raising programs to inform residents about areas of high susceptibility to flooding and the protective measures available.

However, an examination of the map revealed certain discrepancies. In some cases, flooding was observed in areas identified as being of low susceptibility. Conversely, in other cases, no flooding was observed despite the high susceptibility indicated by the map. These observations can be attributed to the exclusion of anthropogenic factors from the model, as well as to the limited resolution of the data employed in this investigation. For future studies, it would be advisable to incorporate higher-resolution data to increase the accuracy of the model. Furthermore, this study treated flood susceptibility as a natural phenomenon, without taking into account anthropogenic factors, which aggravate it.
The integration of other parameters in addition to biophysical factors could potentially improve the model’s ability to predict flood-prone areas. It is recommended that further research examine socio-economic vulnerability, which encompasses anthropogenic causes of flooding such as population density, road network, quality of building materials and people’s perception of flooding. Overlaying the susceptibility map with a socio-economic vulnerability map could generate a flood risk map for the city of Ouagadougou, providing information on the potential consequences that communities could face during flood events. This approach is crucial for urban planning, emergency preparedness and raising public awareness.

Acknowledgements  The authors are grateful to the Institut International d’Ingénierie de l’Eau et de l’Environnement (2iE) and Université Nazi BONI (UNB) for their support, and to the editors and anonymous reviewers for their insightful and constructive suggestions to improve this manuscript. The authors also acknowledge the World Bank Group under the Africa Centers of Excellence for Development Impact (ACE Impact) Project for its support.

Author contributions  All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by K.T., T.F. and M.O. The first draft of the manuscript was written by K.T. All authors commented on previous versions of the manuscript. H.K. helped with funding acquisition. All authors read and approved the final manuscript.

Funding  This work was supported by the International Institute for Water and Environmental Engineering (2iE) and the World Bank through the Africa Centre of Excellence Project (ACE-Impact) [Grant Numbers IDA 6388/D443-BF].

Data availability  No datasets were generated or analysed during the current study.

Declarations

Conflict of interest  The authors declare no competing interests.
References

Ahmed A (2023) Flood susceptibility mapping utilizing the integration of geospatial and multivariate statistical analysis, Erbil area in Northern Iraq as a case study. Sci Rep 13:11919
Aldiansyah S, Wardani F (2023) Evaluation of flood susceptibility prediction based on a resampling method using machine learning. J Water Clim Change 14(3):937–961. https://doi.org/10.2166/wcc.2023.494
Ali SA, Parvin F, Pham QB, Vojtek M, Vojteková J, Costache R, Linh NTT, Nguyen HQ, Ahmad A, Ghorbani MA (2020) GIS-based comparative assessment of flood susceptibility mapping using hybrid multi-criteria decision-making approach, naïve Bayes tree, bivariate statistics and logistic regression: a case of Topľa basin, Slovakia. Ecol Indic 117:106620. https://doi.org/10.1016/j.ecolind.2020.106620
Al-Juaidi AEM, Nassar AM, Al-Juaidi OEM (2018) Evaluation of flood susceptibility mapping using logistic regression and GIS conditioning factors. Arab J Geosci 11(24):765. https://doi.org/10.1007/s12517-018-4095-0
Altaf S, Meraj G, Romshoo SA (2014) Morphometry and land cover based multi-criteria analysis for assessing the soil erosion susceptibility of the western Himalayan watershed.
Environ Monit Assess 186(12):8391–8412. https://doi.org/10.1007/s10661-014-4012-2
Al-Wassai FA, Kalyankar NV (2013) Major limitations of satellite images. J Glob Res Comput Sci 4(5):51–59. https://doi.org/10.48850/arXiv.1307.2434
Andrade C (2019) The P value and statistical significance: misunderstandings, explanations, challenges, and alternatives. Indian J Psychol Med 41(3):210–215. https://doi.org/10.4103/IJPSYM.IJPSYM_193_19
Arabameri A, Pradhan B, Lombardo L (2019) Comparative assessment using boosted regression trees, binary logistic regression, frequency ratio and numerical risk factor for gully erosion susceptibility modelling. CATENA 183:104223. https://doi.org/10.1016/j.catena.2019.104223
Askar S, Zeraat Peyma S, Yousef MM, Prodanova NA, Muda I, Elsahabi M, Hatamiafkoueieh J (2022) Flood susceptibility mapping using remote sensing and integration of decision table classifier and metaheuristic algorithms. Water 14(19):3062. https://doi.org/10.3390/w14193062
Aslam M (2018) Flood management current state, challenges and prospects in Pakistan: a review. Mehran Univ Res J Eng Technol 37(2):297–314. https://doi.org/10.2581/muet1982.1802.06
Bai S-B, Wang J, Lü G-N, Zhou P-G, Hou S-S, Xu S-N (2010) GIS-based logistic regression for landslide susceptibility mapping of the Zhongxian segment in the Three Gorges area, China. Geomorphol 115(1–2):23–31. https://doi.org/10.1016/j.geomorph.2009.09.025
Bani SS, Yonkeu S (2016) Risques d’inondation dans la ville de Ouagadougou: cartographie des zones à risques et mesures de prévention. JOASG 1(1):1–109
Basri H, Syakur S, Azmeri A, Fatimah E (2022) Floods and their problems: land uses and soil types perspectives. IOP Conf Ser Earth Environ Sci 951(1):012111.
https://doi.org/10.1088/1755-1315/951/1/012111
Basu A, Ghosh A, Mandal A, Martín N, Pardo L (2017) A Wald-type test statistic for testing linear hypothesis in logistic regression models based on minimum density power divergence estimator. Electron J Stat. https://doi.org/10.1214/17-EJS1295
Beven KJ, Kirkby MJ (1979) A physically based, variable contributing area model of basin hydrology / Un modèle à base physique de zone d’appel variable de l’hydrologie du bassin versant. Hydrol Sci Bull 24(1):43–69. https://doi.org/10.1080/02626667909491834
Boateng EY, Abaye DA (2019) A review of the logistic regression model with emphasis on medical research. J Data Anal Inf Process 07(04):190–207. https://doi.org/10.4236/jdaip.2019.74012
Bole N, Bandyopadhyay A, Bhadra A (2024) PixelSWAT: A user-friendly ArcGIS tool for preparing inputs to run SWAT in a distributed discretization scheme. Appl Comput Geosci 23:100175. https://doi.org/10.1016/j.acags.2024.100175
Botzen WJW, Aerts JCJH, Van Den Bergh JCJM (2013) Individual preferences for reducing flood risk to near zero through elevation. Mitig Adapt Strat Glob Change 18(2):229–244. https://doi.org/10.1007/s11027-012-9359-5
Breinl K, Lun D, Müller-Thomy H, Blöschl G (2021) Understanding the relationship between rainfall and flood probabilities through combined intensity-duration-frequency analysis. J Hydrol 602:126759. https://doi.org/10.1016/j.jhydrol.2021.126759
Brunet-Le Rouzic L (1981) Réflexions sur la colinéarité. Travaux de L’institut Géographique de Reims 47(1):61–74. https://doi.org/10.3406/tigr.1981.1107
Cao Y, Jia H, Xiong J, Cheng W, Li K, Pang Q, Yong Z (2020) Flash flood susceptibility assessment based on geodetector, certainty factor, and logistic regression analyses in Fujian province, China. ISPRS Int J Geo-Inf 9(12):748.
https://doi.org/10.3390/ijgi9120748
Chapi K, Singh VP, Shirzadi A, Shahabi H, Bui DT, Pham BT, Khosravi K (2017) A novel hybrid artificial intelligence approach for flood susceptibility assessment. Environ Model Softw 95:229–245. https://doi.org/10.1016/j.envsoft.2017.06.012
Chauhan S, Sharma M, Arora MK (2010) Landslide susceptibility zonation of the Chamoli region, Garhwal Himalayas, using logistic regression model. Landslides 7(4):411–423. https://doi.org/10.1007/s10346-010-0202-3
Choubin B, Hosseini FS, Rahmati O, Youshanloei MM (2022) A step towards considering the return period in flood spatial modeling [Preprint]. In Review. https://doi.org/10.21203/rs.3.rs-1677418/v1
Chowdhuri I, Pal SC, Chakrabortty R (2020) Flood susceptibility mapping by ensemble evidential belief function and binomial logistic regression model on river basin of eastern India. Adv Space Res 65(5):1466–1489. https://doi.org/10.1016/j.asr.2019.12.003
Costache R, Ngo PTT, Bui DT (2020) Novel ensembles of deep learning neural network and statistical learning for flash-flood susceptibility mapping. Water 12(6):1549. https://doi.org/10.3390/w12061549
Da MLC, Bonnet E (2021) Risques d’inondations au Sahel: Modélisation des facteurs sociaux porteurs de dommages structurels aux ménages de Bamako (Mali)
Dai X, Zhu Y, Sun K, Zou Q, Zhao S, Li W, Hu L, Wang S (2023) Examining the spatially varying relationships between landslide susceptibility and conditioning factors using a geographical random forest approach: a case study in Liangshan, China. Remote Sens 15(6):1513. https://doi.org/10.3390/rs15061513
De Risi R, Jalayer F, De Paola F, Lindley S (2018) Delineation of flooding risk hotspots based on digital elevation model, calculated and historical flooding extents: the case of Ouagadougou. Stoch Env Res Risk Assess 32(6):1545–1559.
https://doi.org/10.1007/s00477-017-1450-8
Delacour H, Servonnet A, Perrot A, Vigezzi JF, Ramirez JM (2005) La courbe ROC (receiver operating characteristic): Principes et principales applications en biologie clinique. Ann Biol Clin 63:145–154
Dottori F, Martina MLV, Figueiredo R (2018) A methodology for flood susceptibility and vulnerability analysis in complex flood scenarios. J Flood Risk Manag. https://doi.org/10.1111/jfr3.12234
Dutta M, Saha S, Saikh NI, Sarkar D, Mondal P (2023) Application of bivariate approaches for flood susceptibility mapping: a district level study in Eastern India. HydroResearch 6:108–121. https://doi.org/10.1016/j.hydres.2023.02.004
Eder M, Perosa F, Hohensinner S, Tritthart M, Scheuer S, Gelhaus M, Cyffka B, Kiss T, Van Leeuwen B, Tobak Z, Sipos G, Csikós N, Smetanová A, Bokal S, Samu A, Gruber T, Gălie A-C, Moldoveanu M, Mazilu P, Habersack H (2022) How can we identify active, former, and potential floodplains? Methods and lessons learned from the Danube River. Water 14(15):2295.
https://doi.org/10.3390/w14152295
Fadhil M, Ristya Y, Oktaviani N, Kusratmoko E (2020) Flood vulnerability mapping using the spatial multi-criteria evaluation (SMCE) method in the Minraleng Watershed, Maros Regency, South Sulawesi. E3S Web Conf 153:01004. https://doi.org/10.1051/e3sconf/202015301004
Fang Z, Wang Y, Peng L, Hong H (2021) Predicting flood susceptibility using LSTM neural networks. J Hydrol 594:125734. https://doi.org/10.1016/j.jhydrol.2020.125734
Feloni E, Anayiotos A, Baltas E (2022) A spatial analysis approach for urban flood occurrence and flood impact based on geomorphological, meteorological, and hydrological factors. Geographies 2(3):516–527. https://doi.org/10.3390/geographies2030031
Fenglin W, Ahmad I, Zelenakova M, Fenta A, Dar MA, Teka AH, Belew AZ, Damtie M, Berhan M, Shafi SN (2023) Exploratory regression modeling for flood susceptibility mapping in the GIS environment. Sci Rep 13(1):247. https://doi.org/10.1038/s41598-023-27447-0
Fenza G, Gallo M, Loia V, Orciuoli F, Herrera-Viedma E (2021) Data set quality in machine learning: consistency measure based on group decision making. Appl Soft Comput 106:107366.
https://doi.org/10.1016/j.asoc.2021.107366
Fernández DS, Lutz MA (2010) Urban flood hazard zoning in Tucumán Province, Argentina, using GIS and multicriteria decision analysis. Eng Geol 111(1–4):90–98. https://doi.org/10.1016/j.enggeo.2009.12.006
Foody GM (2002) Status of land cover classification accuracy assessment. Remote Sens Environ 80(1):185–201. https://doi.org/10.1016/S0034-4257(01)00295-4
Fustos I, Abarca-del-Rio R, Ávila A, Orrego R (2017) A simple logistic model to understand the occurrence of flood events into the Biobío River Basin in central Chile. J Flood Risk Manag 10(1):17–29. https://doi.org/10.1111/jfr3.12131
Ha H, Luu C, Bui QD, Pham D-H, Hoang T, Nguyen V-P, Vu MT, Pham BT (2021) Flash flood susceptibility prediction mapping for a road network using hybrid machine learning models. Nat Hazards 109(1):1247–1270. https://doi.org/10.1007/s11069-021-04877-5
Hammond MJ, Chen AS, Djordjević S, Butler D, Mark O (2015) Urban flood impact assessment: a state-of-the-art review. Urban Water J 12(1):14–29. https://doi.org/10.1080/1573062X.2013.857421
Hangnon H, De Longueville F, Ozer P (2015) PRÉCIPITATIONS ‘EXTRÊMES’ ET INONDATIONS À OUAGADOUGOU : QUAND LE DÉVELOPPEMENT URBAIN EST MAL MAÎTRISÉ… 6
Harrell FE (2015) Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. Springer International Publishing, Cham
Jati H, Suroso MI, Santoso PB (2019) Prediction of flood areas using the logistic regression method (case study of the provinces Banten, DKI Jakarta, and West Java). J Phys Conf Ser 1367(1):012087. https://doi.org/10.1088/1742-6596/1367/1/012087
Hitouri S, Mohajane M, Lahsaini M, Ali SA, Setargie TA, Tripathi G, D’Antonio P, Singh SK, Varasano A (2024) Flood susceptibility mapping using SAR data and machine learning algorithms in a small watershed in Northwestern Morocco. Remote Sensing 16(5):858.
https://doi.org/10.3390/rs16050858
Hong H, Tsangaratos P, Ilia I, Liu J, Zhu A-X, Chen W (2018) Application of fuzzy weight of evidence and data mining techniques in construction of flood susceptibility map of Poyang County, China. Sci Total Environ 625:575–588. https://doi.org/10.1016/j.scitotenv.2017.12.256
Hosmer DW, Lemeshow S, Sturdivant RX (2013) Applied logistic regression, 1st edn. Wiley, Hoboken
Hurt G (2023) Normalization and generalization in deep learning. A thesis submitted in partial fulfillment of the requirements for the Degree of Master of Science in Applied and Computational Mathematics, School of Mathematical Sciences, College of Science, Rochester Institute of Technology, Rochester, New York
IDMC (Internal Displacement Monitoring Centre) (2019) Global report on internal displacement. p. 82. https://api.internal-displacement.org/sites/default/files/publications/documents/2019-IDMC-GRID.pdf
INSD (Institut National de la Statistique et de la Démographie) (2022) Cinquième recensement général de la population et de l’habitation du Burkina Faso. p. 136. [Synthèse des résultats définitifs]. https://www.insd.bf/
Jenks GF (1967) The data model concept in statistical mapping, vol. 7
Kafando H, Ouedraogo B, Ojeh VN, Millogo AMD, Sow A (2023) Flood susceptibility mapping using the geographic information system and analytic hierarchy process technique: case of Ouagadougou municipality in Burkina Faso. J Geogr Nat Disasters 13(286):1–13
Kaur H, Gupta S, Parkash S, Thapa R, Mandal R (2017) Geospatial modelling of flood susceptibility pattern in a subtropical area of West Bengal, India. Environ Earth Sci 76(9):339. https://doi.org/10.1007/s12665-017-6667-9
Kaya CM, Derin L (2023) Parameters and methods used in flood susceptibility mapping: a review. J Water Clim Change 14(6):1935–1960.
https://doi.org/10.2166/wcc.2023.035
Kêdowidé CMG, Sedogo MP, Cissé G (2010) Dynamique spatio temporelle de l’agriculture urbaine à Ouagadougou: cas du Maraîchage comme une activité montante de stratégie de survie. VertigO—la re