| Article of the Month - December 2024 | 
		Improving Cadastral Accuracy for Disaster 
		Management: The Role of Segment Anything Model (SAM) in Digitizing 
		Historical Cadastral Maps
		Sanjeevan Shrestha, Tina Baidar and Shangharsha 
		Thapa, Nepal
		
			
			
				| Sanjeevan SHRESTHA | Tina BAIDAR | Shangharsha THAPA | 
		
		
			
		
			
			This paper was presented at the FIG Regional Conference 2024 in 
			Kathmandu, Nepal, 14-16 November 2024
			
		
						SUMMARY
		Up-to-date cadastral maps detailing land ownership, boundaries, 
		and values are crucial in disaster-prone regions like Nepal, where 
		accurate land data significantly affect disaster risk management, 
		including efficient resource allocation and response planning. Given the 
		challenges associated with updating cadastral mapping, there is a 
		pressing need to digitize existing maps to establish an up-to-date 
		cadastral database. The digitization of old cadastral maps faces 
		challenges like inconsistent skill levels, human errors, and data 
		quality issues, making the process time-consuming and prone to 
		inaccuracies. Hence, automating the process is essential to create an 
		accurate and up-to-date cadastral database.
		This study explores the application of the Segment Anything Model 
		(SAM) for automating the digitization of historical cadastral maps, 
		focusing on land parcel boundary extraction in the context of Nepal. 
		Using a diverse dataset of scanned cadastral 
		maps, the study evaluates SAM’s zero-shot segmentation performance under 
		different prompting conditions, including bounding box, multi-point 
		prompts, and their combinations. Key factors such as parcel size, shape, 
		eccentricity, clarity of boundaries, and noise levels of the cadastral 
		map were analyzed. SAM demonstrated promising results, particularly when 
		employing combined prompts, but challenges arose in handling noisy data 
		near parcel boundaries and complex configurations within the parcel. 
		Moreover, false positives between segmented parcels remain a 
		significant challenge, and increasing the scanning resolution did 
		not noticeably improve segmentation accuracy.
		The study concludes that SAM provides promising solutions for 
		enhancing cadastral digitization in Nepal. The challenges faced 
		highlight the need for integrating Geographic Information Systems (GIS) 
		with SAM, along with human oversight, to ensure the creation of accurate 
		and complete cadastral databases. Future research should focus on 
		fine-tuning SAM for one-shot learning or using the SAM-2 model, and on 
		integrating it with diverse remote sensing data to further improve 
		segmentation accuracy and resilience in land administration systems. 
		1. INTRODUCTION
		Nepal is highly vulnerable to various natural disasters, including 
		earthquakes, floods, landslides, and more recently, glacial lake 
		outburst floods (GLOFs). These disasters not only destroy lives and 
		properties but also adversely affect land administration by erasing 
		physical land boundaries and destroying land records (Lukman Syahid, 
		2011). In the event of a natural disaster, land tenure will only remain 
		secure if adequate land administration records exist or if landowners 
		possess legal documentation proving their rights to the land (Mitchell, 
		2009). The adverse effects of disasters can be minimized by 
		linking efficient land administration with disaster risk management. 
		Cadastral maps are foundational to land administration systems as they 
		provide detailed records of land parcels, ownership, boundaries, and 
		legal rights. These maps are essential for managing land-related 
		activities, including land registration, property taxation, and land use 
		planning. Up-to-date cadastral information is essential for disaster 
		risk management as it facilitates efficient resource allocation, 
		improves response planning, ensures accurate damage assessment, and 
		provides legal and administrative clarity (Lukman Syahid, 2011). It also 
		enables informed decision-making and improves environmental and risk 
		management strategies.
		In the context of cadastral records of Nepal, the initial cadastral 
		survey, completed in 1995 A.D., provided analog cadastral maps for all 
		of Nepal but excluded densely populated areas such as village blocks and 
		public lands (Sapkota, 2012). As demands grew for accurate and easily 
		accessible land records, the Department of Land Information and Archive 
		(DoLIA) was established in 2000 A.D. to implement a Land Information 
		System (LIS) aimed at efficient land management. DoLIA began archiving 
		cadastral records and developing software systems to acquire, through 
		digitization, both spatial data from hard-copy cadastral sheets and the 
		associated attribute data (Sapkota, 2012). Despite these advancements, 
		significant challenges persist in the scanning and digitization of old 
		maps, including susceptibility to human errors, variability in 
		interpretation and digitization skills among personnel, and 
		inconsistencies in data quality. The digitization process remains 
		time-consuming and error-prone due to differing skill levels and 
		interpretations among individuals working on different map sections, 
		resulting in edge problems and data inconsistencies. Additionally, not 
		all personnel are proficient in digital technology, further complicating 
		the digitization efforts. Given the challenges associated with 
		digitizing cadastral records, there is a pressing need to automate the 
		process to establish an up-to-date cadastral database. 
		Over the years, several studies have focused on developing 
		automatic map interpretation systems and methods for the automatic 
		extraction of cadastral records, aiming to streamline and improve the 
		efficiency of cadastral map digitization and analysis. One of the 
		earliest studies used a baseline automatic cadastral map interpretation 
		method that employed processes including noise removal and 
		skeletonization of scanned maps, vectorization, parcel detection, and 
		interpretation (Janssen, Duin, & Vossepoel, 2002). More recently, 
		one study combined four image processing steps to extract land regions 
		automatically from historical cadastral maps and reported that the 
		method extracted land boundaries with an average error of 0.62% and a 
		standard deviation of ±0.61% (Kim, Lee, Lee, & Seo, 2014). The 
		results imply that while the average error is low, there are some 
		fluctuations in accuracy across different maps. The study also 
		acknowledges limitations in the approach, particularly when dealing with 
		maps that lack clear delineations or contain ambiguities. Another study 
		reviewed the use of deep learning techniques, including convolutional 
		neural networks (CNN) and semantic segmentation methods, to automate the 
		digitization of historical cadastral maps (Ignjatić, Nikolić, Rikalović, 
		& Ćulibrk, 2018). It noted two limitations of deep learning algorithms: 
		they require large, high-quality training datasets, and the models 
		struggle to generalize across different map types. Moreover, accuracy 
		concerns persist, particularly with faded or complex map features, 
		necessitating ongoing human oversight to correct errors. Furthermore, 
		another study assessed the application of Object-Based Image Analysis 
		(OBIA) procedures for the semi-automatic digitization of heritage maps, 
		including historical cadastral maps, and demonstrated that OBIA is a 
		viable alternative to classical pixel-based classification methods 
		(Gobbi et al., 2019). Taken together, previous approaches share two 
		limitations: pixel-based and object-based classification and combined 
		image processing algorithms involve lengthy, complex procedures, while 
		deep neural networks such as CNNs need large training datasets that are 
		often unavailable. 
		The Segment Anything Model (SAM), recently released by Meta AI Research, 
		is a foundation model for image segmentation. SAM has 
		been trained on a massive dataset, consisting of 11 million images and 
		1.1 billion masks, and it demonstrates impressive zero-shot performance 
		across a wide range of segmentation tasks (Kirillov et al., 2023). 
		Foundation models like SAM, which have made significant strides in both 
		natural language processing (NLP) and more recently in computer vision, 
		are capable of zero-shot learning. This means they can adapt to new 
		datasets and perform unfamiliar tasks using 'prompting' techniques, even 
		with little or no prior training. This capability has the potential to 
		reduce human efforts during the digitization and annotation process and 
		presents an opportunity to alleviate time-intensive tasks. A recent 
		study demonstrated SAM's promising adaptability to the segmentation and 
		analysis of various remote sensing data (satellite, airborne, and UAV) 
		and recommended further research to improve the model’s 
		performance by integrating it with additional fine-tuning techniques and 
		other network architectures (Osco et al., 2023). 
		This study explores the potential of the Segment Anything Model (SAM) 
		for the automatic digitization of historical cadastral maps, with a 
		specific focus on land parcel boundary extraction. The primary objective 
		is to assess the feasibility and effectiveness of SAM in automating the 
		segmentation of land parcels from scanned cadastral maps into GIS 
		databases. The model's robustness and adaptability were evaluated under 
		varying scenarios and complexities of cadastral parcels in the context 
		of Nepal. A zero-shot segmentation technique, based on SAM, was employed 
		throughout the study to examine its performance across diverse 
		conditions.
		2. MATERIALS AND METHODS
		The study investigated SAM’s segmentation capacity with different 
		scanned cadastral maps under different prompting conditions. Figure 1 
		shows the schematic representation of the overall workflow of the study.
		
		
		Figure 1: Schematic representation of a step-wise process for 
		evaluating the efficacy of SAM
		2.1  Cadastral Data Synthesis
		The dataset for this study comprises a diverse array of scanned 
		cadastral images, providing a broad foundation for evaluating the 
		Segment Anything Model (SAM) in terms of robustness and adaptability 
		across a wide range of conditions (Table 1). The georeferenced analog 
		cadastral maps were systematically categorized by five key attributes: 
		size, shape, visual clarity, noise condition, and scanning resolution, 
		allowing for a detailed exploration of SAM’s capabilities. Regarding 
		size, the dataset included images of uniform, large, and small 
		dimensions, facilitating the assessment of SAM’s performance across 
		varying parcel sizes. For shape, the dataset covered both regular 
		parcels and those with significant eccentricity, enabling the model’s 
		adaptability to irregular geometries to be tested. Visual clarity was 
		addressed by comparing parcels with clear and blurred boundaries, which 
		provided insight into SAM’s ability to handle imperfect or degraded 
		imagery. Noise conditions were evaluated by including both noisy and 
		noise-free images, simulating issues like scanning defects. All images 
		were initially scanned at a resolution of 300 DPI to standardize the 
		evaluation. To further explore the effect of resolution, the same 
		cadastral maps were scanned at 300, 400, 500, 600, and 800 DPI, allowing 
		an additional layer of analysis on SAM's performance in response to 
		varying image quality and detail. This multi-dimensional dataset serves 
		as a rigorous test bed for assessing SAM’s versatility and effectiveness 
		in automating land parcel boundary segmentation under diverse 
		conditions.
		Table 1: Diverse cadastral datasets and prompting conditions
		
			
				| S.N | Condition | Scenario | Scanning Resolution (DPI) | Target | Box | Point | Combination | 
				| 1 | Size | Equal size | 300 | Parcel | Yes | Yes | Yes | 
				| 2 | Size | Big size | 300 | Parcel | Yes | Yes | Yes | 
				| 3 | Size | Small size | 300 | Parcel | Yes | Yes | Yes | 
				| 4 | Shape | Regular | 300 | Parcel | Yes | Yes | Yes | 
				| 5 | Shape | Large eccentricity | 300 | Parcel | Yes | Yes | Yes | 
				| 6 | Visual clarity | Clear | 300 | Parcel | Yes | Yes | Yes | 
				| 7 | Visual clarity | Blur | 300 | Parcel | Yes | Yes | Yes | 
				| 8 | Noise condition | Noise-free | 300 | Parcel | Yes | Yes | Yes | 
				| 9 | Noise condition | Noisy | 300 | Parcel | Yes | Yes | Yes | 
				| 10 | Scanning resolution | 300 DPI | 300 | Parcel | No | No | Yes | 
				| 11 | Scanning resolution | 400 DPI | 400 | Parcel | No | No | Yes | 
				| 12 | Scanning resolution | 500 DPI | 500 | Parcel | No | No | Yes | 
				| 13 | Scanning resolution | 600 DPI | 600 | Parcel | No | No | Yes | 
				| 14 | Scanning resolution | 800 DPI | 800 | Parcel | No | No | Yes | 
		
		2.2  Prompt Configuration 
		The study particularly investigated SAM’s segmentation capacity in 
		the context of automatic extraction of parcels from scanned analog 
		cadastral maps under different prompting conditions, focusing on 
		zero-shot segmentation. Multi-point and bounding box prompts were 
		provided as a baseline. A bounding box is a rectangular area that 
		restricts SAM’s segmentation to a single object (in our case, one 
		parcel). Multi-point prompts are a series of foreground and background 
		points within the image that guide SAM’s processing. We also experimented 
		with combining point-based and bounding box prompts in the segmentation 
		process. This combined approach was intended to harness the strengths of 
		both methods and enhance SAM's adaptability for automated cadastral 
		segmentation. 
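For readers who work with SAM directly rather than through a graphical interface, the three prompting conditions can be sketched as plain data. The snippet below is illustrative only: the `ParcelPrompt` class, the `prompt_variants` helper, and the pixel coordinates are hypothetical, and the dictionary keys merely mirror the `box`, `point_coords`, and `point_labels` arguments of `SamPredictor.predict` in Meta's `segment_anything` package (which expects NumPy arrays, with label 1 for foreground and 0 for background). It is not the configuration used in this study.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) in image pixel coordinates


@dataclass
class ParcelPrompt:
    """Hypothetical container for the prompts describing one parcel."""
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    foreground: List[Point]                 # points inside the parcel
    background: List[Point]                 # points outside the parcel


def prompt_variants(p: ParcelPrompt) -> Dict[str, dict]:
    """Build the box-only, point-only, and combined prompt sets."""
    coords = list(p.foreground) + list(p.background)
    labels = [1] * len(p.foreground) + [0] * len(p.background)
    points = {"point_coords": coords, "point_labels": labels}
    box = {"box": list(p.box)}
    return {"box": box, "point": points, "combination": {**box, **points}}


# Illustrative pixel coordinates, not taken from the study data.
parcel = ParcelPrompt(
    box=(120.0, 80.0, 310.0, 240.0),
    foreground=[(200.0, 150.0), (260.0, 200.0)],
    background=[(100.0, 60.0)],
)
variants = prompt_variants(parcel)
for name, kwargs in variants.items():
    print(name, sorted(kwargs))
```

In a scripted workflow, each dictionary would be converted to NumPy arrays and passed as keyword arguments once the image has been encoded, so the same parcel can be segmented under all three conditions without re-encoding the image.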
		2.3  Zero-Shot 
		This section outlines the process of adapting SAM for automatic 
		cadastral segmentation. The QGIS plugin "GeoSAM" was used to perform 
		zero-shot segmentation (Zhao, Fan, & Liu, 2023). Initially, image 
		features were extracted and saved using SAM’s image encoder through the 
		plugin's encoding module. SAM offers various models, including ViT-H, 
		ViT-L, and ViT-B, each with different computational requirements and 
		architectural complexities (Kirillov et al., 2023). For this study, we 
		employed the ViT-L model, which offers a balance between high accuracy 
		and manageable computational demand. Using the saved image features and 
		the prompts supplied to the prompt encoder (bounding boxes, multi-point 
		prompts, and their combination), valid masks representing individual 
		land parcels were generated and subsequently converted into polygon 
		shapefiles.
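Turning a mask outline into a polygon in map coordinates relies on the georeferencing of the scanned sheet. As a minimal sketch, assuming a six-parameter affine geotransform in the GDAL convention (origin x, pixel width, row rotation, origin y, column rotation, pixel height), the pixel vertices of an outline can be mapped to ground coordinates as shown below. The geotransform values, the pixel ring, and the function names are invented for illustration, and this is not the plugin's own code.

```python
from typing import List, Sequence, Tuple


def pixel_to_map(col: float, row: float, gt: Sequence[float]) -> Tuple[float, float]:
    """Apply a GDAL-style affine geotransform to one pixel position."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def ring_to_map(ring_px: List[Tuple[float, float]],
                gt: Sequence[float]) -> List[Tuple[float, float]]:
    """Convert a closed ring of (col, row) vertices to map coordinates."""
    return [pixel_to_map(col, row, gt) for col, row in ring_px]


# Assumed north-up sheet: 0.5 map units per pixel, origin at the top-left corner.
geotransform = (500000.0, 0.5, 0.0, 3000000.0, 0.0, -0.5)

# Hypothetical outline of one parcel mask, in pixel coordinates.
parcel_ring_px = [(0, 0), (10, 0), (10, 20), (0, 20), (0, 0)]
parcel_ring_map = ring_to_map(parcel_ring_px, geotransform)
print(parcel_ring_map)
```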
		2.4  Model Evaluation
		The performance of the zero-shot approach was evaluated qualitatively, 
		by visually inspecting the segmentation produced by each prompt type on 
		each scanned image scenario and drawing inferences from it. Quantitative 
		metrics were not used because the scenarios considered in the analysis 
		rarely occur in isolation: a single scanned image typically contains a 
		combination of several scenarios, which makes it difficult to attribute 
		a metric value to any one of them.
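Although no quantitative metrics were computed here, the color coding used in the result figures (true positive, false negative, false positive) corresponds to a pixel-wise comparison between a predicted mask and a reference mask. A minimal sketch of that comparison, assuming both masks are available as equally sized grids of 0/1 values, is given below. The function name and the toy masks are illustrative and were not part of the study.

```python
from typing import Dict, List

Mask = List[List[int]]  # rows of 0/1 values


def compare_masks(pred: Mask, ref: Mask) -> Dict[str, float]:
    """Count pixel-wise TP, FP, FN and derive IoU for one parcel."""
    if len(pred) != len(ref) or any(len(p) != len(r) for p, r in zip(pred, ref)):
        raise ValueError("masks must have the same shape")
    tp = fp = fn = 0
    for pred_row, ref_row in zip(pred, ref):
        for p, r in zip(pred_row, ref_row):
            if p and r:
                tp += 1      # parcel pixel correctly segmented
            elif p and not r:
                fp += 1      # segmented, but outside the reference parcel
            elif r and not p:
                fn += 1      # reference parcel pixel that was missed
    union = tp + fp + fn
    iou = tp / union if union else 1.0
    return {"tp": tp, "fp": fp, "fn": fn, "iou": iou}


reference = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
predicted = [
    [0, 1, 1, 1],   # one missed pixel at the parcel edge
    [0, 1, 1, 1],   # one missed pixel at the parcel edge
    [0, 0, 0, 1],   # one pixel spilling outside the parcel
]
result = compare_masks(predicted, reference)
print(result)
```

With a reference layer of manually digitized parcels, counts of this kind could complement visual inspection, subject to the caveat above that several scenarios usually coexist in one image.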
		3. RESULTS AND ANALYSIS 
		This section explains the results obtained from various prompt 
		configurations used for diverse cadastral datasets and analyzes the 
		results through visual inspection and comparison of the outputs. For 
		this, representative areas were selected for analysis, focusing on the 
		unique characteristics of parcels in the context of Nepal.
		The variation in parcel size and shape in Nepal is primarily due to 
		the differing map scales and the geographic diversity of land parcels. 
		Figure 2(i) illustrates the results of parcel extraction using SAM’s 
		selected base prompts (multi-point and bounding box) and their 
		combinations for areas with varying parcel densities. The figure 
		demonstrates that for equally sized parcels, all base prompts performed 
		comparably well, producing high accuracy, except at parcel boundaries 
		where false negatives were observed. However, for areas with dense 
		parcel configurations, a noticeable decline in accuracy was evident 
		compared to the performance on equally sized parcels. In such dense 
		areas, false negatives were observed not only at the parcel boundaries 
		but also within the parcel interior when using base prompts. This 
		underestimation was mitigated to some extent by employing a combination 
		of the base prompts. The underestimation of dense parcels can be 
		attributed to the complexity of closely packed parcels, which increases 
		the challenge of accurately delineating boundaries. In these scenarios, 
		the proximity of adjacent parcels may cause the model to struggle with 
		distinguishing between them, leading to boundary confusion and 
		misclassification. Additionally, the limited resolution of the base 
		prompts in high-density areas may contribute to the difficulty in 
		accurately capturing finer details within tightly clustered parcels.
		
		
		Figure 2: Visualization of prediction of three variations of prompts 
		of zero-shot segmentation of SAM on cadastral parcel extraction task 
		from historical scanned cadastral images (i) based on parcel density (a) 
		equally sized; (b) dense and variety of pixel; and (ii) based on 
		combination of parcel size and its eccentricity. Green pixels are True 
		Positive; red pixels are False Negative, and yellow pixels are False 
		Positive.
		Additionally, the capability of zero-shot segmentation was evaluated 
		for all sizes of parcels, as presented in Figure 2(ii). Across all 
		parcel sizes, the mixed prompt approach outperformed individual base 
		prompts. In Figure 2(ii)(a), both large and small parcels were accurately 
		extracted using all prompt types, particularly when the parcel shape 
		closely matched well-defined geometric forms, with an eccentricity value 
		near one. However, there was an observed underestimation in delineating 
		larger parcels with high eccentricity (i.e., where the length is 
		significantly greater than the width). Figure 2(ii)(b) further 
		demonstrates that as parcel eccentricity increases, the performance of 
		zero-shot segmentation declines, irrespective of parcel size. This 
		finding suggests that segmentation accuracy depends more on a parcel's 
		eccentricity than on its size. The increase in 
		eccentricity introduces greater heterogeneity within the parcel shape, 
		which poses a challenge to SAM’s segmentation capability.
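The text describes eccentricity informally as the length of a parcel relative to its width, with values near one for regular shapes. The exact formula is not stated, so the sketch below uses one simple stand-in: the ratio of the longer to the shorter side of the axis-aligned bounding box of the parcel polygon. Both the measure and the example coordinates are assumptions for illustration; a rotated minimum bounding rectangle would be more robust for parcels that are not aligned with the map axes.

```python
from typing import List, Tuple


def elongation_ratio(vertices: List[Tuple[float, float]]) -> float:
    """Ratio of long side to short side of the axis-aligned bounding box."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    short, long_ = sorted((width, height))
    if short == 0:
        raise ValueError("degenerate polygon with zero width or height")
    return long_ / short


# Hypothetical parcels: a compact plot and a long, narrow strip.
square_parcel = [(0, 0), (20, 0), (20, 20), (0, 20)]
strip_parcel = [(0, 0), (100, 0), (100, 8), (0, 8)]

print(elongation_ratio(square_parcel))  # 1.0
print(elongation_ratio(strip_parcel))   # 12.5
```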
		
		
		Figure 3: Visualization of prediction of three variations of prompts 
		of zero-shot segmentation of SAM on cadastral parcel extraction task 
		from historical scanned cadastral images based on different visibility 
		of parcel boundary. Green pixels are True Positive; red pixels are False 
		Negative, and yellow pixels are False Positive.
		
		
		Figure 4: Visualization of prediction of three variations of prompts 
		of zero-shot segmentation of SAM on cadastral parcel extraction task 
		from historical scanned cadastral images based on different noise 
		levels. Green pixels are True Positive; red pixels are False Negative, 
		and yellow pixels are False Positive.
		Figure 3 shows the results for scanned images whose parcel boundaries 
		lack clarity, primarily because of suboptimal scanning and the use of 
		pencil marks during parcel subdivision. 
		The clarity of parcel boundaries is a critical factor in 
		accurate delineation. To assess the impact of boundary clarity, the 
		zero-shot segmentation capability was tested across different levels of 
		line clarity in the scanned images. Notably, all prompt types produced 
		promising results in delineating parcels, even under varying degrees of 
		boundary clarity or ambiguity.
		Some representative scanned cadastral map images exhibited 
		significant noise, both within and adjacent to parcel boundaries. Noise 
		within the boundary, excluding parcel numbers, did not negatively impact 
		the performance of SAM's zero-shot segmentation, as illustrated in 
		Figure 4. In these cases, the model was able to delineate parcels 
		accurately despite the internal noise. However, noise located adjacent 
		to parcel boundaries significantly reduced the accuracy of segmentation, 
		as shown by the red box in Figure 4. This adjacent noise interfered with 
		the model's ability to precisely delineate parcel boundaries. 
		Additionally, when faint boundary lines were accompanied by adjacent 
		noise, the segmentation was further compromised. In such cases, 
		illustrated by the yellow box in Figure 4, the model either failed to 
		properly delineate the parcels or mistakenly merged two adjacent parcels 
		into one. This highlights the negative impact of adjacent noise on 
		segmentation accuracy and the importance of clear boundary delineation 
		in scanned images.
		
		
		Figure 5: Visualization of prediction of zero-shot segmentation of 
		SAM on cadastral parcel extraction task from historical scanned 
		cadastral images scanned at different scanning resolution levels. Green 
		pixels are True Positive; red pixels are False Negative, and yellow 
		pixels are False Positive.
		A common issue across all the experiments was the occurrence of false 
		negatives between parcels. This can be attributed to the fact that the 
		cadastral maps were scanned at a relatively low resolution (300 DPI), 
		where parcel boundary lines occupied only a few pixels, leading to 
		segmentation errors. To address this, the same cadastral map was scanned 
		at higher resolutions to assess the impact on mitigating boundary 
		underestimation. Surprisingly, increasing the scanning resolution did 
		not significantly reduce the occurrence of false negatives between 
		parcels, as shown in Figure 5. In fact, the delineation capability of 
		SAM further decreased with higher scanning resolutions. This reduction 
		in performance can be attributed to the increased heterogeneity 
		introduced by higher resolutions, which likely introduced more noise and 
		finer details, complicating SAM's ability to accurately segment parcel 
		boundaries.
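The relationship between scanning resolution and boundary width in pixels is simple arithmetic: a line of width w millimeters scanned at d DPI covers about w × d / 25.4 pixels. The sketch below evaluates this for the resolutions tested, assuming a drawn line width of 0.2 mm, which is an illustrative value rather than a measurement taken from the maps.

```python
MM_PER_INCH = 25.4


def line_width_px(line_width_mm: float, dpi: int) -> float:
    """Approximate number of pixels spanned by a drawn line at a given DPI."""
    return line_width_mm / MM_PER_INCH * dpi


ASSUMED_LINE_WIDTH_MM = 0.2  # hypothetical ink line width

for dpi in (300, 400, 500, 600, 800):
    width = line_width_px(ASSUMED_LINE_WIDTH_MM, dpi)
    print(f"{dpi} DPI: {width:.1f} px")
```

Under this assumption a boundary line grows from roughly two pixels at 300 DPI to roughly six at 800 DPI, which suggests that line width alone does not account for the behavior reported above.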
		In summary, our findings indicate that the combination of base 
		prompts consistently outperforms individual base prompts in the 
		zero-shot learning approach across all datasets. However, SAM's 
		zero-shot approach faces challenges when handling noisy data near 
		boundaries and areas with complex parcel configurations. Additionally, 
		the occurrence of false positives between segmented parcels remains a 
		persistent issue. These challenges highlight the need for integrating 
		Geographic Information Systems (GIS) with SAM, along with human 
		oversight, to ensure the creation of accurate and complete cadastral 
		databases.
		4. CONCLUSION
		In this study, we conducted a comprehensive analysis of the zero-shot 
		segmentation capabilities of the Segment Anything Model (SAM) for 
		cadastral data extraction from scanned historical cadastral maps under 
		various scenarios and complexities. Our analysis revealed that SAM's 
		different prompting methods (points, bounding boxes, and combinations) 
		performed notably well in most cases, except when dealing with noisy 
		data near boundaries and areas with complex parcel configurations. The 
		model demonstrated the potential to significantly reduce human workload 
		and error with minimal or no supervision. However, this initial 
		experiment was limited to exploring SAM's zero-shot capabilities. Future 
		research should focus on evaluating SAM's one-shot segmentation 
		capabilities as well as the SAM-2 model, which may further enhance 
		performance. Additionally, SAM has the potential to integrate with 
		diverse remote sensing data, such as UAV imagery, to quickly generate 
		segmentation outputs without the need for extensive training. This makes 
		SAM particularly well-suited for Nepal's varied geographic conditions, 
		especially in post-disaster scenarios like earthquakes or floods. By 
		incorporating SAM into existing GIS platforms and remote sensing 
		workflows, Nepal's cadastral system can be made more resilient to 
		natural disasters and ongoing land use challenges. 
		REFERENCES:
		
			- Gobbi, S., Ciolli, M., La Porta, N., Rocchini, D., Tattoni, C., & 
			Zatelli, P. (2019). New tools for the classification and filtering 
			of historical maps. ISPRS International Journal of Geo-Information, 
			8(10), 1–24.
			- Ignjatić, J., Nikolić, B., Rikalović, A., & Ćulibrk, D. (2018). 
			Ignjatić.pdf. 4617(Cd), 42–47.
			- Janssen, R. D. T., Duin, R. P. W., & Vossepoel, A. M. (2002). 
			Evaluation method for an automatic map interpretation system for 
			cadastral maps. 125–128.
			- Kim, N. W., Lee, J., Lee, H., & Seo, J. (2014). Accurate 
			segmentation of land regions in historical cadastral maps. Journal 
			of Visual Communication and Image Representation, 25(5), 1262–1274.
			- Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C., 
			Gustafson, L., … Girshick, R. (2023). Segment Anything. In 
			Proceedings of the IEEE/CVF International Conference on Computer 
			Vision (pp. 4015–4026).
			- Lukman Syahid, H. (2011). Land administration and disaster risk 
			management: case of earthquake in Indonesia. 91.
			- Mitchell, D. (2009). Reducing Vulnerability to Natural Disasters 
			in the Asia Pacific through Improved Land Administration and 
			Management. International Federation of Surveyors Working Week 2009, 
			Eilat, Israel, 3-8 May, (October 2010), 1–12.
			- Osco, L. P., Wu, Q., de Lemos, E. L., Gonçalves, W. N., Ramos, A. 
			P. M., Li, J., & Marcato, J. (2023). The Segment Anything Model 
			(SAM) for remote sensing applications: From zero to one shot. 
			International Journal of Applied Earth Observation and 
			Geoinformation, 124.
			- Sapkota, R. K. (2012). LIS Activities in Nepal: An Overview in 
			prospect of DoLIA. Nepalese Journal on Geoinformatics, 11, 23–28.
			- Zhao, Z., Fan, C., & Liu, L. (2023). Geo SAM: A QGIS plugin using 
			Segment Anything Model (SAM) to accelerate geospatial image 
			segmentation. Research Software.
BIOGRAPHICAL NOTES
		Sanjeevan Shrestha is a Chief Survey Officer at the Survey Department 
		with 13 years of experience under the Ministry of Land Management, 
		Cooperative and Poverty Alleviation. He currently also serves as the 
		Vice President of Nepalese Remote Sensing and Photogrammetry Society. He 
		holds a Master of Science degree in Geospatial Technologies from 
		Universidade Nova de Lisboa, Portugal, and the University of Münster, 
		Germany, through the Erasmus Mundus program and a Bachelor in Geomatics Engineering 
		from Kathmandu University, Nepal. His expertise includes remote sensing, 
		geospatial analysis, geo-statistics, and the applications of deep 
		learning and machine learning techniques.
		CONTACTS 
		Sanjeevan Shrestha
		Survey Department, Minbhawan. Kathmandu,
		NEPAL
		Tel. +9779865464752