Abstract
This paper describes the computer program l_bnl_compress.py, a Python script implementing the lossy compressions described in Bernstein, H. J., Soares, A. S., Horvat, K. & Jakoncic, J. (2024). J. Sync. Rad. 32(2) for macromolecular crystallographic (MX) diffraction data. The lossy compressions use pixel-by-pixel binning, image-by-image summing, JPEG-2000 Daubechies (DB) wavelet compression from the movie industry, and HCompress Haar (now known as DB0) wavelet compression from astronomy, all of which are combined with the usual lossless MX compressions.
Supplementary weblinks
Title
Massive Compression for High Data Rate Macromolecular Crystallography: Impact on Diffraction Data and Subsequent Structural Analysis
Description
This is a dataset containing raw "uncompressed" diffraction data from a test sample.
Data were collected on a lysozyme sample at 7.5 keV for an S-SAD experiment at the AMX beamline using an EIGER 9M detector.
The compression used to generate the cbf files can be derived from the filenames:
BINx: pixel binning by a factor x
SUMx: frame summing by a factor x
J2Kx: JPEG2000 compression used with a factor of x
HCOMPx: Hcompress used with a scale factor x
For example, lyso_BIN2_SUM2_HCOMP4 was compressed using 2x pixel binning, 2x frame summing, and Hcompress with a scale factor of 4.
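The naming convention above can be decoded mechanically. The following is a minimal sketch, not part of l_bnl_compress.py: the function name parse_compression_tags is hypothetical, and it assumes that tags are separated by underscores, that every factor is a whole number, and that anything after the first "." is a file extension.

```python
import re

# Tags come from the list above; integer-only factors are an assumption.
_TAG = re.compile(r"^(BIN|SUM|J2K|HCOMP)(\d+)$")


def parse_compression_tags(name):
    """Split a name such as 'lyso_BIN2_SUM2_HCOMP4' into (prefix, settings).

    Hypothetical helper for illustration only.
    """
    # Assumption: everything after the first '.' is an extension.
    stem = name.split(".", 1)[0]
    prefix_parts = []
    settings = {}
    for token in stem.split("_"):
        match = _TAG.match(token)
        if match:
            settings[match.group(1)] = int(match.group(2))
        elif not settings:
            # Tokens before the first tag form the sample prefix.
            prefix_parts.append(token)
        # Non-tag tokens after the first tag (e.g. frame numbers) are ignored.
    return "_".join(prefix_parts), settings


if __name__ == "__main__":
    print(parse_compression_tags("lyso_BIN2_SUM2_HCOMP4"))
```

For the example name above, this returns the prefix "lyso" and the settings BIN = 2, SUM = 2, HCOMP = 4. A name with no recognized tags yields an empty settings dictionary.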
More information will be included after publication. In the meantime, please contact the author if details about processing are required.
All zstd-compressed tar archives contain the cbf files, ready to be processed.
All data were collected at the AMX beamline at NSLS-II using a DECTRIS EIGER X 9M detector.
Title
Application of Lossy Compression for MX Diffraction Data: use of lossy but not lossy compression (l_bnl_compress)
Description
These are three data sets used to evaluate the application of lossy compression to MX diffraction data:
A lysozyme data set collected at 7500 eV to solve the S-SAD structure (Ly_01_22013).
A thermolysin data set, with a fragment bound, from a fragment screening campaign collected at the NSLS-II AMX beamline (tlys-817_10982).
A data set for CBASS Cap5 from Pseudomonas syringae as an activated tetramer with the cyclic dinucleotide 3'2'-c-diAMP ligand (Endo6_23AA_2v_502).
The CBASS Cap5 structure has been deposited in the PDB (8FMG.PDB and https://doi.org/10.1038/s41594-024-01220-x).
The 7500 eV lysozyme structure was solved using S-SAD phases, and the initial compression results are published (9B7F.PDB and https://doi.org/10.1107/S160057752400359X).
For each data set, we include the corresponding XDS.INP file containing all metadata required for data reduction.
We are grateful to Dr. Dale Kreitler (NSLS-II, BNL) for access to the thermolysin data set, and to Dr. Olga Rechkoblit (MSSM) and Dr. Aneel Aggarwal (MSSM) for access to the CBASS Cap5 data set.