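In the AlphaFold3 protein_dataset module, the get_anchor_ind method of the ProteinDataset class is a static method (decorated with @staticmethod) that returns the indices of "anchor residues". Its purpose is to find the known residues on either side of each masked region in a protein sequence, so that they can later serve as context.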
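The idea of flanking known residues can be illustrated with a minimal sketch. This is not the library's implementation: find_anchor_indices is a hypothetical helper, it works on plain Python lists of booleans instead of torch tensors, and its rule for choosing anchors (the nearest known index before and after each masked run) is an assumption based on the docstring quoted below.

```python
from itertools import groupby
from operator import itemgetter


def find_anchor_indices(masked, known):
    """Hypothetical helper: nearest known indices flanking each masked run."""
    masked_ind = [i for i, m in enumerate(masked) if m]
    known_ind = [i for i, k in enumerate(known) if k]
    anchors = []
    # Consecutive indices share the same (position - value) key,
    # so groupby splits masked_ind into contiguous runs.
    for _, g in groupby(enumerate(masked_ind), lambda x: x[0] - x[1]):
        run = list(map(itemgetter(1), g))
        start, end = run[0], run[-1]
        # Assumption: an anchor is the closest known residue on each side.
        before = [i for i in known_ind if i < start]
        after = [i for i in known_ind if i > end]
        if before:
            anchors.append(before[-1])
        if after:
            anchors.append(after[0])
    return anchors


masked = [False, False, True, True, False, False, True, False]
known = [True, True, False, False, True, True, False, True]
print(find_anchor_indices(masked, known))  # [1, 4, 5, 7]
```

Here the masked runs are positions 2-3 and position 6, so the sketch reports 1 and 4 as the anchors of the first run and 5 and 7 as the anchors of the second. A run that touches the start or end of the sequence simply has no anchor on that side.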
Source code:
@staticmethod
def get_anchor_ind(masked_res, mask):
    """Get the indices of the anchor residues.

    Anchor residues are defined as the first and last known residues before and
    after each continuous masked region.

    Parameters
    ----------
    masked_res : torch.Tensor
        A boolean tensor indicating which residues should be predicted
    mask : torch.Tensor
        A boolean tensor indicating which residues are known

    Returns
    -------
    list
        A list of indices of the anchor residues

    """
    anchor_ind = []
    masked_ind = torch.where(masked_res.bool())[0]
    known_ind = torch.where(mask.bool())[0]
    for _, g in groupby(enumerate(masked_ind), lambda x: x[0] - x[1]):
        group = map(itemgetter(1), g)
        group = list(map(int, group))
        start, end = group[0], group[-1]