Paper
11 October 2023 Lqscomp: an efficient quality score compression method based on run length coding and mapping
Fuzhi Li
Proceedings Volume 12800, Sixth International Conference on Computer Information Science and Application Technology (CISAT 2023); 128004Q (2023) https://doi.org/10.1117/12.3004069
Event: 6th International Conference on Computer Information Science and Application Technology (CISAT 2023), 2023, Hangzhou, China
Abstract
With the continuous development of high-throughput sequencing technology over the past two decades, the cost of gene sequencing has fallen sharply, resulting in rapid growth of genetic data. Effective compression of large-scale DNA data is therefore an urgent need. The core of DNA data compression is the compression of the base sequence and the quality score sequence. Base compression has made great progress in recent years, but quality score compression remains challenging. This paper proposes Lqscomp, a lossless compression tool for quality scores. It works in four steps: partitioning, indexing, mapping, and compression. Its distinguishing feature is that the mapping step uses a technique similar to run-length encoding. Experimental results show that Lqscomp achieves good compression performance on all test sets.
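To illustrate the general idea of run-length encoding on a quality score line, here is a minimal sketch. It is not the Lqscomp mapping step, whose details are not given in the abstract. The function names `rle_encode` and `rle_decode` and the sample quality string are hypothetical and chosen only for illustration.

```python
from itertools import groupby


def rle_encode(quality: str) -> list[tuple[str, int]]:
    """Collapse each run of identical quality symbols into (symbol, run_length)."""
    return [(symbol, sum(1 for _ in run)) for symbol, run in groupby(quality)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand (symbol, run_length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)


# Hypothetical quality score line (Phred-style ASCII symbols), not taken from the paper.
quality_line = "IIIIIIHHHH###"

encoded = rle_encode(quality_line)
decoded = rle_decode(encoded)

print(encoded)                   # [('I', 6), ('H', 4), ('#', 3)]
print(decoded == quality_line)   # True: the round trip is lossless
```

Quality score lines often contain long runs of the same symbol, which is why a run-based representation can shrink them before a general-purpose compressor is applied. Decoding restores the input exactly, which is the property a lossless tool needs.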
(2023) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Fuzhi Li "Lqscomp: an efficient quality score compression method based on run length coding and mapping", Proc. SPIE 12800, Sixth International Conference on Computer Information Science and Application Technology (CISAT 2023), 128004Q (11 October 2023); https://doi.org/10.1117/12.3004069
KEYWORDS
Computer programming
Associative arrays
Data compression
Data modeling
Image compression
Data storage
Data transmission