Journal List > J Korean Soc Med Inform > v.14(3) > 1102926

Kim, Yi, and Kim: CDA Compression via Automatic Type Inference

Abstract

OBJECTIVE

CDA (Clinical Document Architecture) is a standard for exchanging and sharing clinical documents among entities in the healthcare domain. As it proliferates, the number of CDA documents will grow exponentially, and storing them will require a huge amount of storage space. The main goal of this study is to devise an efficient compression method optimized for CDA documents so that the storage requirement can be lowered.

METHODS

The method proposed in this paper builds on Xmill, a compression method designed for XML documents in general. Xmill requires human intervention to compress documents effectively, and this is especially true for CDA. Our proposed method, CDACOM, automatically extracts type information from CDA documents to infer the data type of each value, assigns data values of the same type to the same data container, and applies an encoder optimized for that type to each container, so that a better compression rate can be achieved.
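
The container idea described above can be illustrated with a minimal Python sketch. It is not the CDACOM implementation: the type rules (`infer_type`), the variable-length integer encoder (`encode_varint`), and the use of zlib as the back-end compressor are all assumptions made for illustration, and the sketch ignores the document structure, which a real compressor would also have to encode.

```python
import re
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical type rules; the rules CDACOM actually uses are not given here.
INT_RE = re.compile(r"0|[1-9]\d*")   # non-negative integers without leading zeros
DEC_RE = re.compile(r"-?\d+\.\d+")   # simple decimal numbers


def infer_type(value):
    """Guess a data type from the lexical form of a value."""
    if INT_RE.fullmatch(value):
        return "int"
    if DEC_RE.fullmatch(value):
        return "decimal"
    return "text"


def encode_varint(n):
    """Encode a non-negative integer in 7-bit groups, low group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)
            return bytes(out)


def collect_containers(xml_text):
    """Group attribute values and element text by inferred type."""
    containers = defaultdict(list)
    for elem in ET.fromstring(xml_text).iter():
        values = list(elem.attrib.values())
        if elem.text and elem.text.strip():
            values.append(elem.text.strip())
        for value in values:
            containers[infer_type(value)].append(value)
    return containers


def compress_containers(containers):
    """Apply a type-specific encoding to each container, then zlib."""
    compressed = {}
    for type_name, values in containers.items():
        if type_name == "int":
            raw = b"".join(encode_varint(int(v)) for v in values)
        else:
            raw = "\n".join(values).encode("utf-8")
        compressed[type_name] = zlib.compress(raw, 9)
    return compressed
```

Calling `compress_containers(collect_containers(xml_text))` on a document returns one compressed byte string per inferred type. Grouping similar values tends to help a general-purpose compressor because each container is more homogeneous than the interleaved document text.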

RESULTS

Experiments with various types of CDA documents were performed to evaluate the effectiveness of CDACOM relative to Xmill. The results show that CDACOM outperforms Xmill, reducing the output file size by about 24.1% on average. When documents are combined and compressed together, the reduction relative to Xmill grows to about 50%.
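
One plausible reading of the percentages above is a relative reduction in output size against the Xmill baseline. The short Python sketch below assumes that definition; the function name `size_reduction` and the byte counts are hypothetical, chosen only to show the arithmetic, and are not measurements from the study.

```python
def size_reduction(baseline_bytes, candidate_bytes):
    """Fraction by which the candidate output is smaller than the baseline."""
    if baseline_bytes <= 0:
        raise ValueError("baseline size must be positive")
    return 1.0 - candidate_bytes / baseline_bytes


# Made-up sizes: a 1000-byte Xmill output and a 759-byte CDACOM output
# would correspond to a reduction of about 0.241, that is, 24.1%.
example = size_reduction(1000, 759)
```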

CONCLUSION

The proposed compression method, CDACOM, is effective and promising. It will help lower the cost of transmitting and storing CDA documents and, hence, expedite the adoption of the standard in the healthcare domain.
