Knowing the vast rural areas of Tanzania is crucial for providing timely and effective help to girls during Female Genital Mutilation (FGM) ‘cutting seasons’. In recent years, we have managed to map millions of buildings, which helps us determine how the population is distributed. Although low population density areas in Tanzania are not yet sufficiently mapped, the initial steps have already been taken.

Goals of mapping

Crowd2Map Tanzania is a “crowdsourced mapping project aiming to put rural Tanzania on the map”. A primary goal is to help fight against FGM. Girls are rescued and taken to safe houses by local volunteers and police. However, for this they need maps. But maps can do more than just show these rescue teams the way to remote villages. The existence of spatial information can help with development and to increase commercial efficiency and economic growth opportunities for businesses and entrepreneurs, giving them the opportunity to make better-informed decisions. Growing wealth improves the quality of life, gives a chance for more opportunities and a better quality of education.

Find the village

So, we now know where to find traces of human settlements, but how do we delineate each settlement and, more importantly, how do we know what the name of the settlement is?

Delimiting human settlements is not easy, because the structure of a settlement often depends on the region. In the Ruvuma region (southern Tanzania), for example, the settlements are well separated on the map. In contrast, in agricultural areas of the Shinyanga region, delimitation sometimes seems an impossible task.

And what about the names of the settlements? Local volunteers could help us identify the names of all of the roughly 10,000 to 12,000 settlements in Tanzania, or we can look for open source data that contains this information. Recruiting hundreds of volunteers from all over the country is beyond our means, so in most places we need to rely on the second solution. Fortunately, some open source data is available from The United Republic of Tanzania – Government Basic Statistics Portal, such as the locations of health facilities, schools and water points all over Tanzania.

Our project objective is to add the missing village names in Tanzania, using open source government data about water sources in Tanzania. 

Water Points Location in Rural Water Supply – 2015-2016

Method for the estimation of village position

The shared database contains about 87,000 water sources, which can be lakes, rivers, machine-drilled boreholes or springs. It records the physical condition (quality, quantity) of each water source as well as its spatial location, including the name of the village where the water source is, or of the nearest village to it. This data helps us determine the name of the village in OpenStreetMap (OSM).


For data validation the most suitable application is JOSM, which can also prepare the validated data for upload to OSM. The following processing steps, datasets and imagery were used during validation:

  1. Thiessen (Voronoi) polygons were calculated from the water points layer to get the influence zone of each water point. The polygons were then merged by attribute wherever the village name was the same. The resulting polygons help to determine the area where the village has to be.
  2. At the same time, the mean center of the points inside each polygon was calculated, which gives the potential position of the village. (Since in a few cases the same village name occurs more than once in the country, a combined “village+district” key was used to find the real mean center.) This is the village POI that needs to be added to OSM.
  3. The OpenStreetMap layer was used to identify traces of human activity where the area was well mapped. It also showed whether the name of the settlement had already been added to OSM.
  4. Maxar satellite imagery was used for those areas that weren’t mapped yet.
  5. Other useful datasets for validation
    • Waterpoints: can be really useful if the village’s POI is unusually far from any populated area. In this case, it is worth looking at how the individual water points are distributed in the area. As another example, when the village consists of two sub-villages, the “SUBVILLAGE” attribute of the water database can help determine where the center of the village is.
    • Health facilities data: The government data contains more than 7,000 health facilities such as hospitals or clinics. The names of these facilities are usually, but not always, the same as the name of the municipality where they are located.
    • Education data: The government data contains almost 7,000 schools. The village names are available in this data.
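The mean center step in the list above can be illustrated with a minimal Python sketch. It is not the project's actual workflow, which was carried out in GIS software. The record layout (village, district, latitude, longitude) and the sample names and coordinates are hypothetical. The sketch averages latitude and longitude directly, which is a reasonable approximation only over the short distances within one village.

```python
from collections import defaultdict

# Hypothetical sample records: (village, district, latitude, longitude).
# The layout is an assumption, not the schema of the government dataset.
water_points = [
    ("Village X", "District A", -10.70, 35.60),
    ("Village X", "District A", -10.72, 35.64),
    ("Village X", "District B", -3.50, 33.40),
    ("Village X", "District B", -3.54, 33.42),
    ("Village X", "District B", -3.52, 33.44),
]


def mean_centers(points):
    """Return {(village, district): (mean_lat, mean_lon)}.

    Grouping on the combined village+district key keeps two villages
    that share a name in different districts apart.
    """
    groups = defaultdict(list)
    for village, district, lat, lon in points:
        groups[(village, district)].append((lat, lon))

    centers = {}
    for key, coords in groups.items():
        n = len(coords)
        centers[key] = (
            sum(lat for lat, _ in coords) / n,
            sum(lon for _, lon in coords) / n,
        )
    return centers


centers = mean_centers(water_points)
for key, (lat, lon) in sorted(centers.items()):
    print(key, round(lat, 4), round(lon, 4))
```

Grouping on the village name alone would average the two same-named villages into a single point far from both, which is why the combined key matters.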

In summary

The Voronoi (Thiessen) polygon delimits the area where the village is located (or has to be). The village POI marks the potential location of the settlement, but its accuracy depends on the number of water points and on their location in and around the given settlement.
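One way to see what the merged polygons represent: a location lies in a village's merged Voronoi area exactly when its nearest water point belongs to that village. The sketch below uses this equivalence instead of building polygon geometry. It is an illustration only; the sample records are hypothetical, and it uses great-circle distance, whereas GIS tools usually compute Thiessen polygons in a projected plane, so results very close to a polygon border may differ slightly.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


# Hypothetical sample records: (village, district, latitude, longitude).
water_points = [
    ("Village X", "District A", -10.70, 35.60),
    ("Village X", "District A", -10.72, 35.64),
    ("Village Y", "District A", -10.90, 35.90),
]


def zone_of(lat, lon, points):
    """Return the (village, district) key of the nearest water point.

    This is the key of the merged Voronoi area that contains (lat, lon).
    """
    village, district, _, _ = min(
        points, key=lambda p: haversine_km(lat, lon, p[2], p[3])
    )
    return (village, district)


print(zone_of(-10.71, 35.61, water_points))
print(zone_of(-10.88, 35.88, water_points))
```

A lookup like this can answer "which village area does this group of buildings fall in?" without drawing the polygons, but the polygons remain more convenient for visual validation in JOSM.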

In a well-mapped area - where, moreover, the settlements can be easily separated from each other - we did not have a difficult time with validation (mean centers before validation).
The mean center of the Waterpoints sometimes clearly shows the center of the settlement if these water points are evenly distributed within and around the settlement.
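Because the mean center is only as good as the water points behind it, a simple check could flag villages whose POI deserves a closer look during validation. The sketch below is a hypothetical triage aid, not a step described in this project. The thresholds (`min_points`, `max_spread_km`) and the sample coordinates are assumptions chosen for illustration, and the distance function is a flat-earth approximation that is adequate only over a few kilometres.

```python
import math


def approx_km(lat1, lon1, lat2, lon2):
    """Equirectangular distance approximation in km, for short distances."""
    km_per_deg = 111.32  # approximate length of one degree of latitude
    dy = (lat2 - lat1) * km_per_deg
    dx = (lon2 - lon1) * km_per_deg * math.cos(math.radians((lat1 + lat2) / 2))
    return math.hypot(dx, dy)


def needs_review(coords, min_points=3, max_spread_km=5.0):
    """Flag a village whose mean center rests on few or scattered points.

    coords is a list of (lat, lon) water points for one village.
    Both thresholds are hypothetical and would need tuning per region.
    """
    n = len(coords)
    if n < min_points:
        return True
    mean_lat = sum(lat for lat, _ in coords) / n
    mean_lon = sum(lon for _, lon in coords) / n
    # Largest distance from the mean center to any contributing water point.
    spread = max(approx_km(mean_lat, mean_lon, lat, lon) for lat, lon in coords)
    return spread > max_spread_km


compact = [(-3.50, 33.40), (-3.51, 33.41), (-3.50, 33.42)]
scattered = [(-3.50, 33.40), (-3.60, 33.55), (-3.40, 33.30)]

print(needs_review(compact))
print(needs_review(scattered))
```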

Provisional results

By the end of July 2020, more than 75 districts had been validated (46% of all districts), and 2,509 village POIs had been added, which is 39% of the total village POIs in Tanzania.

User name | First edit | Last edit | Lifespan (days) | Total edits
Bgabor | 18/04/2020 19:16 | 25/07/2020 16:37 | 97 | 1047
SHABANI MAGAWILA | 21/05/2020 12:08 | 26/07/2020 15:48 | 66 | 785
Kasunga | 24/04/2020 16:06 | 20/06/2020 06:28 | 56 | 884
Stuart Ward | 21/04/2020 18:42 | 21/04/2020 18:42 | 0 | 49
