Knowing the vast, rural area of Tanzania is crucial to provide timely and effective help for girls during Female Genital Mutilation (FGM) ‘cutting seasons’. In recent years, we have managed to map millions of buildings which can help us determine the distribution of the population. Although low population density areas in Tanzania are not sufficiently mapped yet, the initial steps have already been taken.
Goals of mapping
Crowd2Map Tanzania is a “crowdsourced mapping project aiming to put rural Tanzania on the map”. A primary goal is to help fight against FGM. Girls are rescued and taken to safe houses by local volunteers and police, and to do this they need maps. But maps can do more than show these rescue teams the way to remote villages. Spatial information can support development and increase commercial efficiency and economic growth opportunities for businesses and entrepreneurs by allowing them to make better-informed decisions. Growing wealth improves the quality of life and opens the way to more opportunities and better education.
Find the village
So, we now know where to find traces of human settlements, but how do we delineate each settlement and, more importantly, how do we know what the name of the settlement is?
Delimiting human habitats is not easy, because the structure of a settlement often depends on the region. For example, in the Ruvuma region (southern Tanzania) the settlements are well separated on the map. In contrast, in agricultural areas of the Shinyanga region, delimitation sometimes seems an impossible task.
And what about the names of the settlements? Local volunteers could help us identify the names of the roughly 10,000 to 12,000 settlements in Tanzania, or we can try to find open source data that contains this information. Recruiting hundreds of volunteers from all over the country is beyond our capacity, so in most places we need to focus on the second option. Fortunately, some open source data is available from The United Republic of Tanzania – Government Basic Statistics Portal, such as health facilities, schools, and waterpoints located all over Tanzania.
Our project objective is to add the missing village names in Tanzania, using open source government data about water sources in Tanzania.
Method for the estimation of village position
The shared database contains about 87,000 water sources, which can be lakes, rivers, machine-drilled boreholes or springs. It records the physical condition (quality, quantity) of each water source as well as its spatial location, including, for example, the name of the village where the water source is, or of the nearest village to it. This data helps us determine the name of the village in OSM.
For data validation the most suitable application is JOSM, which can also prepare the validated data for upload to OSM. During validation, the following datasets and imagery were used:
- Thiessen (Voronoi) polygons were calculated from the water points layer to get the influence zone of each water point. The polygons were then merged by attribute wherever the village name was the same. The resulting polygons help to determine the area where the village has to be.
- At the same time, the mean center was calculated for the points inside each polygon → potential position of the village. (Since in a few cases the same village name occurs more than once in the country, a combined “village+district” key was used to find the real mean center.) This is our village POI data, which needs to be added to OSM.
- OpenStreetMap imagery was used to identify traces of human activity where the area was well mapped. It also told us whether the name of the settlement had already been added to OSM.
- Maxar satellite imagery was used for those areas that weren’t mapped yet.
- Other useful datasets for validation:
  - Waterpoints: these can be really useful if the position of the village’s POI is unusually far from any populated area. In this case, it is worth looking at how the individual water points are distributed in the area. As another example, when the village consists of two sub-villages, the “SUBVILLAGE” attribute of the water database can help determine where the center of the village is likely to be.
  - Health facilities data: The government data contains more than 7,000 health facilities such as hospitals and clinics. The names of these facilities are usually, but not always, the same as the name of the municipality where they are located.
  - Education data: The government data contains almost 7,000 schools. The village names are available in this data.
The Voronoi polygon indicates the area where the village is located (or has to be). The village POI indicates the potential location of the settlement, but its accuracy depends on the number of water abstraction points and their location in and around the given settlement.
By the end of September, more than 143 districts had been validated (88% of all districts), and 5,505 village POIs had been added, which is 52% of the total village POIs in Tanzania.
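The two calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow actually used in the project: the sample records, the village and district names, and the function names are all made up. It relies on the fact that a location lies in the Thiessen cell of its nearest water point, so the merged polygon it falls in belongs to that nearest point's village. It also uses great-circle distance on raw latitude/longitude, whereas a GIS would normally build the polygons on projected coordinates, so results near polygon borders may differ slightly.

```python
import math
from collections import defaultdict

# Hypothetical sample records: (village, district, latitude, longitude).
# Names and coordinates are invented for illustration only.
WATER_POINTS = [
    ("Kijiji A", "District X", -10.00, 35.00),
    ("Kijiji A", "District X", -10.02, 35.04),
    ("Kijiji A", "District Y", -3.50, 33.00),  # same village name, other district
    ("Kijiji B", "District X", -10.30, 35.40),
    ("Kijiji B", "District X", -10.34, 35.40),
]


def mean_centers(points):
    """Average the coordinates of all water points sharing a (village, district) key."""
    groups = defaultdict(list)
    for village, district, lat, lon in points:
        groups[(village, district)].append((lat, lon))
    return {
        key: (
            sum(lat for lat, _ in coords) / len(coords),
            sum(lon for _, lon in coords) / len(coords),
        )
        for key, coords in groups.items()
    }


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def village_zone(lat, lon, points):
    """Return the (village, district) whose merged Thiessen zone contains the location.

    The zone is found via the nearest water point instead of building polygons.
    """
    nearest = min(points, key=lambda p: haversine_km(lat, lon, p[2], p[3]))
    return nearest[0], nearest[1]


centers = mean_centers(WATER_POINTS)
for (village, district), (lat, lon) in sorted(centers.items()):
    print(f"{village} ({district}): {lat:.4f}, {lon:.4f}")

# Which village zone would a building at this location fall into?
print(village_zone(-10.05, 35.05, WATER_POINTS))
```

Grouping by the combined village and district key keeps the two villages that share a name apart, so their mean centers do not end up somewhere between two distant districts.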
Crowd2Map volunteers in the lead
The OSM database currently contains 10,483 Tanzanian village points, a significant part of which was added by the volunteers of the Crowd2Map team. The following pie chart shows how these 10,483 POIs are divided between the top five volunteers and the rest of the mapper community:
Updated results – 31/10/2020
By the end of October, more than 157 districts had been validated (97% of all districts), and 6,759 village POIs had been added.