A reconstruction method for cross-cut shredded documents based on the extreme learning machine algorithm

Date

2022-07-24

ISSN

1433-7479

Publisher

Springer

Type

Article

Peer reviewed

Yes

Abstract

Reconstruction of cross-cut shredded text documents (RCCSTD) has important applications in information security and judicial evidence collection. Manual reconstruction is very time-consuming, so efficient computer-assisted reconstruction is a crucial research topic. Fragment consensus information extraction and fragment pair compatibility measurement are two fundamental processes in RCCSTD. The existing classical methods for these two steps are limited: they can splice only documents with specific structures or characteristics, and their pairing error grows as the cutting becomes more fine-grained. To reconstruct the fragments more effectively, this paper improves the extraction method for consensus information and constructs a new global pairwise compatibility measurement model based on the extreme learning machine algorithm. The algorithm is designed to exploit all available information and computationally suggest matches, which increases its ability to discriminate between data in various complex situations; it then finds the best neighbor of each fragment for splicing according to pairwise compatibility. The overall performance of the approach is illustrated in several practical experiments. The results indicate that the matching accuracy of the proposed algorithm is better than that of previously published classical algorithms and remains higher on noisy datasets, making it a feasible method for intelligent RCCSTD systems in real scenarios.
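
The abstract does not give the model's details, so the following is only a minimal sketch of how a generic extreme learning machine could score fragment pairs. It is not the authors' model. The edge features, the `ELM` class, the `best_neighbor` helper, the tanh activation, the ridge term, and the synthetic data are all assumptions made for illustration. The only part taken from the standard technique is its core idea: hidden-layer weights are drawn at random and left fixed, and the output weights are solved in closed form by least squares.

```python
import numpy as np


class ELM:
    """Generic single-hidden-layer extreme learning machine (illustrative only)."""

    def __init__(self, n_hidden=50, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge  # small regularizer (an assumption) for a stable solve
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random, fixed hidden layer with a tanh activation.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        # Hidden weights are sampled once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: ridge-regularized least squares in closed form.
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ y)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


def best_neighbor(model, right_edge, candidate_left_edges):
    """Return the index of the candidate with the highest compatibility score."""
    features = np.abs(right_edge - candidate_left_edges)
    return int(np.argmax(model.predict(features)))


# Synthetic stand-in data: each fragment edge is a vector of 16 pixel values.
rng = np.random.default_rng(42)
n_pairs, edge_len = 200, 16
right_edges = rng.integers(0, 2, size=(n_pairs, edge_len)).astype(float)
# True neighbors: the left edge nearly continues the right edge.
noise = rng.normal(scale=0.05, size=right_edges.shape)
left_match = np.clip(right_edges + noise, 0.0, 1.0)
# Non-neighbors: an unrelated edge.
left_other = rng.integers(0, 2, size=(n_pairs, edge_len)).astype(float)

# Hypothetical pair feature: absolute difference between the touching edges.
X = np.vstack([np.abs(right_edges - left_match),
               np.abs(right_edges - left_other)])
y = np.concatenate([np.ones(n_pairs), -np.ones(n_pairs)])  # +1 match, -1 no match

model = ELM(n_hidden=50).fit(X, y)
scores = model.predict(X)
pos_mean = scores[:n_pairs].mean()
neg_mean = scores[n_pairs:].mean()
print("mean score, matching pairs:    ", pos_mean)
print("mean score, non-matching pairs:", neg_mean)

# Pick the best right-hand neighbor for fragment 0 among five candidates;
# the true continuation is placed last (index 4).
candidates = np.vstack([left_other[:4], left_match[:1]])
best = best_neighbor(model, right_edges[0], candidates)
print("chosen candidate index:", best)
```

Because the output weights come from a single linear solve, training is fast even when every fragment pair is scored. Real fragments would need features extracted from scanned images rather than synthetic edge vectors.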

Description

The file attached to this record is the author's final peer-reviewed version. The publisher's final version can be found by following the DOI link.

Keywords

Reconstruction of cross-cut shredded text documents (RCCSTD), Extreme learning machine algorithm, Consensus information, Pairwise compatibility measurement model

Citation

Zhang, Z., Zou, J., Yang, S., Zheng, J., Gong, D. and Pei, T. (2022) A reconstruction method for cross-cut shredded documents based on the extreme learning machine algorithm. Soft Computing, 26, pp. 12851–12862.
