Proposing a Method for Creating Test Data for Accuracy Assessment #431
Greetings @Cenan-Alhassan, and happy new year! Thank you for your proposal, I really appreciate your spirit of contribution! The tutorial you watched also has a web page (https://fromgistors.blogspot.com/2019/09/Accuracy-Assessment-of-Land-Cover-Classification.html) which describes the design of sampling units. In general, validation requires data that is not used for training the classification algorithm. I see that you are separating the polygons into training and testing data, which is similar to what is implemented in machine learning algorithms (such as the Multi-Layer Perceptron) to tune the algorithm.
I think the problem with the method you are proposing is that, even if you exclude the validation pixels from the training, you don't consider the real class area proportion (which can be different from the polygon area proportion), and the spatial distribution of samples is concentrated inside the input polygons (i.e. there could be spatial correlation between training and testing pixels, unless you have a very large number of input polygons randomly distributed across the image), which could increase the uncertainty of the accuracy metrics. If you haven't already, I invite you to read Olofsson et al. (2014), which describes the sample size design and the calculation of uncertainty (confidence intervals) of the metrics.
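As a concrete companion to that reference, below is a minimal sketch of the overall sample size formula for stratified random sampling from Olofsson et al. (2014, Eq. 13). The class proportions and conjectured user's accuracies in the example are illustrative placeholders, not values from any particular map.

```python
# Sketch of the overall sample size for stratified random sampling,
# Olofsson et al. (2014), Eq. 13:  n = ( sum_i(W_i * S_i) / S(O) )^2
#   W_i  : mapped area proportion of class i
#   S_i  : sqrt(U_i * (1 - U_i)), with U_i a conjectured user's accuracy for class i
#   S(O) : target standard error of the estimated overall accuracy
import math

def stratified_sample_size(W, U, target_se=0.01):
    assert abs(sum(W) - 1.0) < 1e-6, "mapped area proportions must sum to 1"
    S = [math.sqrt(u * (1.0 - u)) for u in U]                  # per-stratum standard deviations
    n = (sum(w * s for w, s in zip(W, S)) / target_se) ** 2    # Eq. 13
    return math.ceil(n)

# Illustrative values: four classes with mapped proportions W and guessed user's accuracies U
W = [0.02, 0.015, 0.320, 0.645]
U = [0.70, 0.60, 0.90, 0.95]
print(stratified_sample_size(W, U, target_se=0.01))  # ~641 sampling units in total
```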
-
Greetings, Mr Congedo.
I would like to propose a method for creating testing data for accuracy assessment.
Proposal
I have been working with the SCP plugin for a while, and one of the things I have needed to do a lot is create testing data to use for accuracy assessment.
As far as I am aware, there is no native method to create testing data using SCP. I came across a video on your channel where you created testing data: https://www.youtube.com/watch?v=H1cL0yhIygg
From what I understood, you created a certain number of pixel-sized vectors within each class of the classified raster and then manually assigned a class label to each vector. These labelled vectors were fed into the accuracy assessment.
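To illustrate that kind of workflow, here is a minimal sketch (not the plugin's actual code) that draws a fixed number of random pixels from each class of a classified raster and writes them out as pixel-sized squares to be labelled by hand. The function name, the n_per_class parameter, and the use of rasterio/geopandas are assumptions for illustration only, and a north-up raster is assumed.

```python
# Hypothetical sketch: draw N random pixels per class from a classified raster and
# save them as pixel-sized square polygons for manual labelling (assumes a north-up raster).
import numpy as np
import rasterio
import geopandas as gpd
from shapely.geometry import box

def random_pixel_squares(classified_raster, n_per_class=50, seed=0):
    rng = np.random.default_rng(seed)
    with rasterio.open(classified_raster) as src:
        band = src.read(1)
        classes = np.unique(band)
        if src.nodata is not None:
            classes = classes[classes != src.nodata]
        records = []
        for value in classes:
            rows, cols = np.nonzero(band == value)
            if len(rows) == 0:
                continue
            pick = rng.choice(len(rows), size=min(n_per_class, len(rows)), replace=False)
            for r, c in zip(rows[pick], cols[pick]):
                x0, y0 = src.transform * (c, r)          # upper-left corner of the pixel
                x1, y1 = src.transform * (c + 1, r + 1)  # lower-right corner of the pixel
                records.append({"map_class": int(value), "geometry": box(x0, y1, x1, y0)})
        return gpd.GeoDataFrame(records, geometry="geometry", crs=src.crs)

# e.g. random_pixel_squares("classification.tif").to_file("test_samples.gpkg")
```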
For my personal use, I wrote a script that creates testing data automatically. Let's say we have polygons that represent two classes (blue and green):
The script performs stratified random sampling on the polygons by separating a percentage of each polygon into testing data, while keeping the rest as training data.
For example, we can perform the split on the aforementioned polygons with two classes. A random area of 30% of each polygon is split into testing data.
Training polygons:

Testing polygons:

As you can see, each testing polygon retains the same class as the polygon it was split from. Using the overlap function between the original polygon and the testing polygon, we find that the overlap for all three polygons is roughly 30%, as intended (see bottom right):
The area of each testing polygon is composed of squares. Each square is the size of a pixel in the classification raster and sits on the same grid as the raster. This allows the testing vector to overlap the classified raster directly for proper accuracy assessment.
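To make the idea concrete, the following is a minimal sketch of such a split (not the actual script): for each labelled polygon, the pixel-sized squares lying on the raster grid are enumerated, roughly the requested fraction of them is moved into a testing polygon, and the remainder stays as the training polygon. The field name class_id, the explicit grid origin, and the use of geopandas/shapely are assumptions for illustration.

```python
# Hypothetical sketch: split each labelled polygon into testing squares (aligned to the
# raster grid) and a remaining training polygon.
import random
import geopandas as gpd
from shapely.geometry import box
from shapely.ops import unary_union

def split_training_testing(polygons_path, pixel_size, origin_x, origin_y,
                           test_fraction=0.3, class_field="class_id", seed=0):
    random.seed(seed)
    gdf = gpd.read_file(polygons_path)
    train_rows, test_rows = [], []
    for _, row in gdf.iterrows():
        geom = row.geometry
        minx, miny, maxx, maxy = geom.bounds
        # grid cell indices covering the polygon's bounds, snapped to the raster grid
        i0, i1 = int((minx - origin_x) // pixel_size), int((maxx - origin_x) // pixel_size) + 1
        j0, j1 = int((miny - origin_y) // pixel_size), int((maxy - origin_y) // pixel_size) + 1
        cells = []
        for i in range(i0, i1):
            for j in range(j0, j1):
                cell = box(origin_x + i * pixel_size, origin_y + j * pixel_size,
                           origin_x + (i + 1) * pixel_size, origin_y + (j + 1) * pixel_size)
                if geom.contains(cell):  # keep only squares fully inside the polygon
                    cells.append(cell)
        random.shuffle(cells)
        n_test = round(len(cells) * test_fraction)
        if n_test > 0:
            test_geom = unary_union(cells[:n_test])
            test_rows.append({class_field: row[class_field], "geometry": test_geom})
            geom = geom.difference(test_geom)  # remove the testing squares from the training polygon
        train_rows.append({class_field: row[class_field], "geometry": geom})
    return (gpd.GeoDataFrame(train_rows, geometry="geometry", crs=gdf.crs),
            gpd.GeoDataFrame(test_rows, geometry="geometry", crs=gdf.crs))
```

As a sanity check mirroring the overlap comparison above, the total area of the testing output divided by the total area of the input polygons should come out close to the requested fraction.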
Benefits Over Old Method
I believe this method has two very big benefits:
Inclusivity:
Since the testing polygons are derived from the training polygons, they will include every variation that is captured in the training polygons. So, assuming your training data is thorough and captures all the variations within each class, your testing data will as well. You can be assured that the testing data does not overlap the training data, but still includes all the spectral variations.
Convenience:
Creating the testing data follows the exact same process as creating the training data, instead of having to create the testing data separately, and the labelling is automatic.
Integrating the Script into SCP
In order to use this method at the moment, I need to first convert the .scpx training data to a standard vector, run the processing script, and then convert the training polygons back into .scpx for use with the ML models. The following are the parameters for the processing script:
It requires you to input the training data, the class IDs for reference, and the classification raster (for its pixel size), or to enter the pixel size manually. The split percentage can be entered manually or calculated with some other, more precise method.
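If this were wired up as a QGIS Processing algorithm, the inputs could be declared roughly as in the sketch below. The class name, parameter names, and outputs are hypothetical, and the split logic itself is left out.

```python
# Hypothetical QGIS Processing skeleton exposing the inputs described above.
from qgis.core import (
    QgsProcessing,
    QgsProcessingAlgorithm,
    QgsProcessingParameterFeatureSource,
    QgsProcessingParameterField,
    QgsProcessingParameterNumber,
    QgsProcessingParameterRasterLayer,
    QgsProcessingParameterVectorDestination,
)

class SplitTrainingTesting(QgsProcessingAlgorithm):

    def name(self):
        return 'split_training_testing'

    def displayName(self):
        return 'Split training polygons into training/testing'

    def createInstance(self):
        return SplitTrainingTesting()

    def initAlgorithm(self, config=None):
        self.addParameter(QgsProcessingParameterFeatureSource(
            'TRAINING', 'Training polygons', [QgsProcessing.TypeVectorPolygon]))
        self.addParameter(QgsProcessingParameterField(
            'CLASS_FIELD', 'Class ID field', parentLayerParameterName='TRAINING'))
        self.addParameter(QgsProcessingParameterRasterLayer(
            'CLASSIFICATION', 'Classification raster (used only for pixel size)', optional=True))
        self.addParameter(QgsProcessingParameterNumber(
            'TEST_PERCENT', 'Percentage split into testing data',
            type=QgsProcessingParameterNumber.Double, defaultValue=30.0,
            minValue=1.0, maxValue=50.0))
        self.addParameter(QgsProcessingParameterVectorDestination(
            'TRAIN_OUTPUT', 'Training polygons (output)'))
        self.addParameter(QgsProcessingParameterVectorDestination(
            'TEST_OUTPUT', 'Testing polygons (output)'))

    def processAlgorithm(self, parameters, context, feedback):
        # the grid-aligned split sketched earlier would run here
        return {}
```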
If implemented in SCP, all of these inputs except the split percentage could be pulled directly from the .scpx file and bandset that are loaded in the project, perhaps with a single click of a button.
Conclusion
I would like your opinion on whether this method is good or not, and whether it can or should be implemented in the plugin. I believe it is a very useful idea if done correctly. I would love to discuss it further! And happy new year!