Abstract
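Background: Drug discovery is a complex and costly process involving several time-consuming and expensive stages that a new candidate compound must pass before it can be approved. One key step is the identification and optimization of lead compounds, which has been made easier by computational methods, including deep learning (DL) techniques. Various DL model architectures have been proposed to learn the vast landscape of protein-ligand interactions and predict their binding affinity, thereby helping to identify lead compounds. Objective: This survey fills a gap in previous work by comprehensively analyzing the most commonly used datasets and discussing their quality and limitations. It also provides a comprehensive classification of the latest DL methods for protein-ligand binding affinity prediction (BAP), offering a fresh perspective on this evolving field. Methods: We thoroughly examine the commonly used BAP datasets and their inherent characteristics. Our exploration extends to the various preprocessing steps and DL techniques found in the literature, including graph neural networks, convolutional neural networks, and Transformers. We conducted an extensive literature study to ensure that the most recent DL methods for BAP available at the time of writing are included. Results: The systematic approach used in this study highlights the inherent challenges of BAP via DL, such as data quality, model interpretability, and explainability, and puts forward considerations for future research directions. We present valuable insights intended to accelerate the development of more effective and reliable DL models for BAP within the research community. Conclusion: This study can substantially support future research on predicting the affinity between proteins and ligand molecules, thereby further improving the overall drug development process.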
Keywords: deep learning, protein-ligand binding affinity, compound-protein interaction, drug discovery, drug repurposing, DNA sequence.