Chengxin Zhang, Ph.D.
 Correspondence Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China, 518055
 Email cx.zhang@siat.ac.cn
 ORCID 0000-0001-7290-1324
Biography 

Dr. Chengxin Zhang received his PhD in bioinformatics from the Yang Zhang lab at the University of Michigan in 2020. From 2021 to 2023, he was an HHMI postdoctoral associate in the Anna Pyle lab at Yale University. In 2023, he became a research assistant professor in the Lydia Freddolino lab at the University of Michigan. Since 2025, he has been a PI at the Shenzhen Institutes of Advanced Technology. His research focuses on the development of novel algorithms for structure prediction and function annotation of proteins and RNAs.

Published Papers [Google Scholar Profile]


First, co-first and corresponding author papers
  • Zhang C, Freddolino L, Zhang Y (2025) A graphic and command line protocol for quick and accurate comparisons of protein and nucleic acid structures with US-align. Nature Protocols, in press.
  • Liu Q, Zhang C, Freddolino L (2024) InterLabelGO+: unraveling label correlations in protein function prediction. Bioinformatics, 40 (11), btae655 (Co-corresponding author)
  • Zhang C, Freddolino L (2024) FURNA: A database for functional annotations of RNA structures. PLOS Biology, 22 (7), e3002476 (Co-corresponding author)
  • Zhang C, Freddolino L (2024) A large-scale assessment of sequence database search tools for homology-based protein function prediction. Briefings in Bioinformatics, 25 (4), bbae349 (Co-corresponding author)
  • Liu Z, Zhang C, Zhang Q, Zhang Y, Yu DJ (2024) TM-search: an efficient and effective tool for protein structure database search. Journal of Chemical Information and Modeling, 64 (3), 1043-1049 (Co-first author)
  • Zhang C, Zhang X, Freddolino PL, Zhang Y (2024) BioLiP2: an updated structure database for biologically relevant ligand-protein interactions. Nucleic Acids Research, 52 (D1), D404-D412.
  • Zhang C (2023) BeEM: fast and faithful conversion of mmCIF format structure files to PDB format. BMC Bioinformatics, 24 (1), 1-6. (Corresponding author)
  • Li Y, Zhang C, Feng C, Pearce R, Freddolino PL, Zhang Y (2023) Integrating end-to-end learning with deep geometrical potentials for ab initio RNA structure prediction. Nature Communications, 14 (1), 5745. (Co-first author)
  • Perry ZR, Pyle AM, Zhang C (2023) Arena: Rapid and accurate reconstruction of full atomic RNA structures from coarse-grained models. Journal of Molecular Biology, 435 (18), 168210 (Corresponding author)
  • Zhang C, Zhang Y, Pyle AM (2023) rMSA: a sequence search and alignment algorithm to improve RNA structure modeling. Journal of Molecular Biology, 435 (14), 167904
  • Zhang C, Pyle AM (2023) PDC: a highly compact file format to store protein 3D coordinates. Database, baad018 (Corresponding author)
  • Zhang C, Shine M, Pyle AM, Zhang Y (2022) US-align: universal structure alignments of proteins, nucleic acids, and macromolecular complexes. Nature Methods, 19 (9), 1109-1115
  • Zhang C, Pyle AM (2022) CSSR: assignment of secondary structure to coarse-grained RNA tertiary structures. Acta Crystallographica Section D: Structural Biology, 78 (4)
  • Zhang C, Pyle AM (2022) A unified approach to sequential and non-sequential structure alignment of proteins, RNAs, and DNAs. iScience, 25 (10), 105218.
  • Zhang C, Zheng W, Cheng M, Omenn GS, Freddolino P, Zhang Y (2021) Functions of essential genes and a scale-free protein interaction network revealed by structure-based function and interaction prediction for a minimal genome. Journal of Proteome Research, 20(2), 1178-1189.
  • Li Y, Zhang C, Bell EW, Zheng W, Zhou X, Yu DJ, Zhang Y (2021) Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks. PLOS Computational Biology, 17(3), e1008865. (Co-first author)
  • Chang J, Zhang C, Cheng H, Tan YW (2021) Rational design of adenylate kinase thermostability through coevolution and sequence divergence analysis. International Journal of Molecular Sciences, 22(5), 2768. (Co-first author)
  • Zheng W, Zhang C, Li Y, Pearce R, Bell EW, Zhang Y (2021) Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations. Cell Reports Methods, 1(3), 100014. (Co-first author)
  • Zhang C, Zheng W, Mortuza SM, Li Y, Zhang Y (2020) DeepMSA: constructing deep multiple sequence alignment to improve contact prediction and fold-recognition for distant-homology proteins. Bioinformatics, 36(7), 2105-2112.
  • Zhang C, Zheng W, Huang X, Bell EW, Zhou X, Zhang Y (2020) Protein structure and sequence reanalysis of 2019-nCoV genome refutes snakes as its intermediate host and the unique similarity between its spike protein insertions and HIV-1. Journal of Proteome Research, 19(4), 1351-1360.
  • Wei X, Zhang C, Freddolino PL, Zhang Y (2020) Detecting Gene Ontology misannotations using taxon-specific rate ratio comparisons. Bioinformatics, 36(16), 4383-4388. (Co-first author)
  • Zhang C, Lane L, Omenn GS, Zhang Y (2019) A blinded testing of function annotation for uPE1 proteins by the I-TASSER/COFACTOR pipeline using the 2018-2019 additions to neXtProt and the CAFA3 challenge. Journal of Proteome Research, 18(12), 4154-4166.
  • Li Y, Zhang C, Bell EW, Yu D, Zhang Y (2019) Ensembling multiple raw coevolutionary features with deep residual neural networks for contact-map prediction in CASP13. Proteins: Structure, Function, and Bioinformatics, 87, 1082-1091. (Co-first author)
  • Gong S, Zhang C, Zhang Y (2019) RNA-align: quick and accurate alignment of RNA 3D structures based on size-independent TM-scoreRNA. Bioinformatics, 35(21), 4459-4461. (Co-first author)
  • Zheng W, Zhang C, Bell EW, Zhang Y (2019) I-TASSER gateway: a protein structure and function prediction server powered by XSEDE. Future Generation Computer Systems, 99, 73-85. (Co-first author)
  • Wang Y, Shi Q, Yang P, Zhang C, Mortuza SM, Xue Z, Ning K, Zhang Y (2019) Fueling ab initio folding with oceanic metagenomics enables structure and function predictions of new protein families. Genome Biology, 20 (229). (Co-first author)
  • Zhang C, Wei X, Omenn GS, Zhang Y (2018) Structure and protein interaction-based Gene Ontology annotations reveal likely functions of uncharacterized proteins on human chromosome 17. Journal of Proteome Research, 17(12), 4186-4196.
  • Zhang C, Zheng W, Freddolino PL, Zhang Y (2018) MetaGO: Predicting Gene Ontology of non-homologous proteins through low-resolution protein structure prediction and protein-protein network mapping. Journal of Molecular Biology, 430(15), 2256-2265.
  • Zhang C, Mortuza SM, He B, Wang Y, Zhang Y (2018) Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12. Proteins: Structure, Function, and Bioinformatics, 86(Suppl 1): 136-151.
  • Zhang C, Freddolino PL, Zhang Y (2017) COFACTOR: improved protein function prediction by combining structure, sequence and protein-protein interaction information. Nucleic Acids Research, 45(W1), W291-W299.

Other publications
  • Liu T, Xu L, Chung K, Sisto LJ, Hwang J, Zhang C, Van Zandt MC, Pyle AM (2025) Molecular insights into de novo small-molecule recognition by an intron RNA structure. Proceedings of the National Academy of Sciences of the United States of America (PNAS), in press
  • Van der Pijl RJ, Ma W, Lewis CTA, Haar L, Buhl A, Farman GP, Rhodehamel M, Jani VP, Nelson OL, Zhang C, Granzier H, Ochala J (2025) Increased cardiac myosin super-relaxation as an energy saving mechanism in hibernating grizzly bears. Molecular Metabolism, 92, 102084
  • Zheng W, Wuyun Q, Li Y, Zhang C, Freddolino L, Zhang Y (2024) Improving deep learning protein monomer and complex structure prediction using DeepMSA2 with huge metagenomics data. Nature Methods, 21 (2), 279-289
  • Wei X, Lu Y, Lin LL, Zhang C, Chen X, Wang S, Wu SA, Li ZJ, Quan Y, Sun S, Qi L (2024) Proteomic screens of SEL1L-HRD1 ER-associated degradation substrates reveal its role in glycosylphosphatidylinositol-anchored protein biogenesis. Nature Communications, 15 (1), 659
  • LaLone CA, Blatz DJ, Jensen MA, Vliet SMF, Mayasich S, Mattingly KZ, Transue TR, Melendez W, Wilkinson A, Simmons CW, Ng C, Zhang C, Zhang Y (2023) From protein sequence to structure: The next frontier in cross-species extrapolation for chemical safety evaluations. Environmental Toxicology and Chemistry, 42 (2), 463-474
  • Zhu YH, Zhang C, Yu DJ, Zhang Y (2022) Integrating unsupervised language model with triplet neural networks for protein gene ontology prediction. PLOS Computational Biology, 18 (12), e1010793
  • Shine M, Zhang C, Pyle AM (2022) AMIGOS III: pseudo-torsion angle visualization and motif-based structure comparison of nucleic acids. Bioinformatics, 38 (10), 2937-2939
  • Li Y, Zhang C, Yu DJ, Zhang Y (2022) Deep learning geometrical potential for high-accuracy ab initio protein structure prediction. iScience, 25 (6), 104425.
  • Zhu YH, Zhang C, Liu Y, Omenn GS, Freddolino PL, Yu DJ, Zhang Y (2022) Integrating transcript expression profiles with protein homology inferences for gene function prediction. Genomics, Proteomics & Bioinformatics, 20(5), 1013-1027.
  • MacCarthy EA, Zhang C, Zhang Y, KC D (2022) GPU-I-TASSER: a GPU accelerated I-TASSER protein structure prediction tool. Bioinformatics, 38 (6), 1754-1755.
  • Zhou X, Li Y, Zhang C, Zheng W, Zhang G, Zhang Y (2022) Progressive assembly of multi-domain protein structures from cryo-EM density maps. Nature Computational Science, 2 (4), 265-275.
  • Zhou X, Zheng W, Li Y, Pearce R, Zhang C, Bell EW, Zhang G, Zhang Y (2022) I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nature Protocols, 17 (10), 2326-2353
  • Si L, Shen Q, Li J, Chen L, Shen J, Xiao X, Bai H, Feng T, Ye AY, Li L, Zhang C, Li Z, Wang P, Oh CY, Nurani A, Niu S, Zhang C, Wei X, Yuan W, Liao H, Huang X, Wang N, Tian WX, Tian H, Li L, Liu X, Plebani R (2022) Generation of a live attenuated influenza A vaccine by proteolysis targeting. Nature Biotechnology, 40 (9), 1370-1377
  • Tsou PS, Lu C, Gurrea-Rubio M, Muraoka S, Campbell PL, Wu Q, Model EN, Lind ME, Vichaikul S, Mattichak MN, Brodie WD, Hervoso JL, Ory S, Amarista CI, Pervez R, Junginger L, Ali M, Hodish G, O'Mara MM, Ruth JH, Robida AM, Alt AJ, Zhang C, Urquhart AG, Lawton JN, Chung KC, Maerz T, Saunders TL, Groppi VE, Fox DA, Amin MA (2022) Soluble CD13 induces inflammatory arthritis by activating the bradykinin receptor B1. The Journal of Clinical Investigation, 132(11)
  • Wei X, Zou S, Xie Z, Wang Z, Huang N, Cen Z, Hao Y, Zhang C, Chen Z, Zhao F, Hu Z, Teng X, Gui Y, Liu X, Zheng H, Zhou H, Chen S, Cheng J, Zeng F, Zhou Y, Wu W, Hu J, Wei Y, Cui K, Li J (2022) EDIL3 deficiency ameliorates adverse cardiac remodeling by neutrophil extracellular traps (NET)-mediated macrophage polarization. Cardiovascular Research, 118 (9), 2179-2195.
  • Mohan HM, Trzeciakiewicz H, Pithadia A, Crowley EV, Pacitto R, Safren N, Trotter B, Zhang C, Zhou X, Zhang Y, Basrur V, Paulson HL, Sharkey LM (2022) RTL8 promotes nuclear localization of UBQLN2 to subnuclear compartments associated with protein quality control. Cellular and Molecular Life Sciences, 79 (3), 176
  • Li Y, Zhang C, Zheng W, Zhou X, Bell EW, Yu DJ, Zhang Y (2021) Protein inter-residue contact and distance prediction by coupling complementary coevolution features with deep residual networks in CASP14. Proteins: Structure, Function, and Bioinformatics, 89(12), 1911-1921.
  • Woodard J, Zhang C, Zhang Y (2021) ADDRESS: A database of disease-associated human variants incorporating protein structure and folding stabilities. Journal of Molecular Biology, 433(11), 166840.
  • Mortuza SM, Zheng W, Zhang C, Li Y, Pearce R, Zhang Y (2021) Improving fragment-based ab initio protein structure assembly using low-accuracy contact-map predictions. Nature Communications, 12, 5011.
  • Zheng W, Li Y, Zhang C, Zhou X, Pearce R, Bell EW, Zhang Y (2021) Protein structure prediction using deep learning distance and hydrogen-bonding restraints in CASP14. Proteins: Structure, Function, and Bioinformatics, 89(12), 1734-1751.
  • Gong W, Guerler A, Zhang C, Warner E, Li C, Zhang Y (2021) Integrating multimeric threading with high-throughput experiments for structural interactome of Escherichia coli. Journal of Molecular Biology, 433(10), 166944.
  • Zhou D, Feng H, Huang T, Yang Y, Qiu P, Zhang C, Olsen TR, Zhang J, Chen YE, Mizrak D, Yang B (2021) hiPSC modeling of lineage-specific smooth muscle cell defects caused by TGFBR1A230T variant, and its therapeutic implications for Loeys-Dietz Syndrome. Circulation, 144(14), 1145-1159.
  • Kryshtafovych A, Moult J, Billings WM, Corte DD, Fidelis K, Kwon S, Olechnovic K, Seok C, Venclovas C, Won J, CASP-COVID participants. (2021) Modeling SARS-CoV-2 proteins in the CASP-commons experiment. Proteins: Structure, Function, and Bioinformatics, 89 (12), 1987-1996.
  • Huang X, Zhang C, Pearce R, Omenn GS, Zhang Y (2020) Identifying the zoonotic origin of SARS-CoV-2 by modeling the binding affinity between the spike receptor-binding domain and host ACE2. Journal of Proteome Research, 19(12), 4844-4856.
  • Zheng W, Zhang C, Wuyun Q, Pearce R, Li Y, Zhang Y (2019) LOMETS2: Improved meta-threading server for fold-recognition and structure-based function annotation for distant-homology proteins. Nucleic Acids Research, 47 (W1), W429-W436.
  • Zhou X, Hu J, Zhang C, Zhang G, Zhang Y (2019) Assembling multi-domain protein structures through analogous global structural alignments. Proceedings of the National Academy of Sciences of the United States of America (PNAS), 116 (32), 15930-15938.
  • Wei Z, Li Y, Zhang C, Pearce R, Zhang Y (2019) Deep-learning contact-map guided protein structure prediction in CASP13. Proteins: Structure, Function, and Bioinformatics, 87, 1149-1164.
  • Li Y, Hu J, Zhang C, Yu DJ, Zhang Y (2019) ResPRE: high-accuracy protein contact prediction by coupling precision matrix with deep residual neural networks. Bioinformatics, 35(22), 4647-4655.
  • Wu J, Yin Q, Zhang C, Geng J, Wu H, Hu H, Ke X, Zhang Y (2019) Function prediction for G protein-coupled receptors through text mining and induction matrix completion. ACS Omega, 4(2), 3045-3054.
  • Zhou et al. (2019) The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biology, 20:244.
  • Wei Z, Wuyun Q, Li Y, Mortuza SM, Zhang C, Pearce R, Ruan J, Zhang Y (2019) Detecting distant-homology protein structures by aligning deep neural-network based contact maps. PLOS Computational Biology, 15(10): e1007411.
  • Xiong P, Zhang C, Zheng W, Zhang Y (2017) BindProfX: assessing mutation-induced binding affinity change by protein interface profiles with pseudo-counts. Journal of Molecular Biology, 429(3), 426-434.
  • Thomas JM, Simkovic F, Keegan R, Mayans O, Zhang C, Zhang Y, Rigden DJ (2017) Approaches to ab initio molecular replacement of α-helical transmembrane proteins. Acta Crystallographica Section D: Structural Biology, 73(12), 985-996.
  • Janson G, Zhang C, Prado MG, Paiardini A (2016) PyMod 2.0: improvements in protein sequence-structure analysis and homology modeling within PyMOL. Bioinformatics, 33(3), 444-446.