Mapping diversity in African trypanosomes using high resolution spatial proteomics

Nature Communications volume 14, Article number: 4401 (2023)

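African trypanosomes are eukaryotic parasites that impose a significant human and veterinary disease burden on sub-Saharan Africa. Diversity between species and life-cycle stages is accompanied by distinct host and tissue tropisms within this group. Here, the spatial proteomes of two African trypanosome species, Trypanosoma brucei and Trypanosoma congolense, are mapped across two life-stages. The four resulting datasets provide evidence of expression of approximately 5500 proteins per cell-type. Over 2500 proteins per cell-type are classified to specific subcellular compartments, providing four comprehensive spatial proteomes. Comparative analysis reveals key routes of parasitic adaptation to different biological niches and provides insight into the molecular basis for diversity within and between these pathogen species.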

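Kinetoplastids are unicellular, flagellated eukaryotes that include important parasites of humans, livestock and crop species, and are usually transmitted by invertebrates. Within this class are the African trypanosomes, which collectively infect a wide range of mammals and cause human and animal African trypanosomiasis. Most research characterising African trypanosomes has been performed using Trypanosoma brucei, in part because two of its subspecies infect humans, and also because of the relative ease of in vitro culture and genetic manipulation of this species. T. brucei has been studied as a parasite, but also as a divergent model organism with both well-conserved and non-canonical eukaryotic biological features. The related species Trypanosoma congolense and Trypanosoma vivax are the major causative agents of trypanosomiasis in cattle. Despite their veterinary importance, these species have been studied considerably less1. The different African trypanosome species have distinct cellular and infection characteristics, but the molecular basis for most of these is unknown2,3,4,5.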

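African trypanosomes are exposed to a range of external environments during their life cycle, and the parasites differentiate into a series of life-stages that are each either adapted to growth and survival in the current environment or pre-adapted to the next6. Each life-stage is based on a common, highly organised and polarised cell architecture with a single flagellum and a collection of single- and multi-copy organelles. The relative size, position and protein content of the organelles vary between life-stages. As in all eukaryotic cells, the subcellular location of a protein in trypanosomes defines not only its biochemical environment but also its potential for molecular interactions. Protein function is therefore closely linked to protein localisation.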

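There are two main approaches to determining subcellular protein location: microscopy and proteomics. Microscopy can precisely confirm a specific location, can detect variation between cells within a sample, and can readily identify proteins localised to multiple sites, although it can be confounded by tagging or inappropriate expression. Because it requires either a protein-specific antibody or genetic manipulation of the protein of interest, microscopy is usually restricted to a small number of proteins per study. Proteome-wide microscopy analyses yield valuable and rich datasets, but they are non-trivial, time-consuming endeavours that have so far been limited to a few species, such as Saccharomyces cerevisiae7, humans8 and T. brucei9.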

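Spatial proteomics is based on the isolation or enrichment of organelles followed by mass spectrometry (MS), and can identify proteins enriched at specific subcellular locations, usually without genetic modification. Such methods have been highly effective at revealing the protein residents of organelles or structures within trypanosomatids, such as the mitochondrion, glycosomes, flagellum and nucleus. High-throughput MS-based methods can now systematically localise thousands of proteins in a single experiment across multiple conditions, states or cell-types. These methods include hyperLOPIT (hyperplexed localisation of organelle proteins by isotope tagging), a quantitative proteomics approach that applies machine learning algorithms24,25,26,27 to resolve spatial maps of the proteome without the need to isolate individual organelles. HyperLOPIT and related LOPIT methodologies have been used to generate spatial maps of mammalian, insect, yeast, plant and protozoan cell-types.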

[…] 0.85 for all pairwise comparisons), demonstrating the reproducibility between experimental iterations (Supplementary Fig. 7 and Supplementary Data 8)47. Next, to define the spatial proteomes for each cell-type, final classifications were generated by performing TAGM-MAP analysis on each combined 33-plex dataset (Fig. 1A and Supplementary Data 9). Using this approach, 2679 and 2795 proteins were classified in the T. brucei bloodstream form (BSF) and procyclic form (PCF), respectively (Fig. 3A, B and Supplementary Data 9). In T. congolense, 2507 and 2504 proteins were classified in the BSF and PCF, respectively (Fig. 3A, B and Supplementary Data 9). These four spatial proteomes provide a comprehensive localisation dataset for two closely related species, across two life-stages each, which has not previously been achieved on this scale in any parasitic organism.
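
The score reported above for the pairwise comparisons appears, from the Methods, to be the adjusted Rand index (ARI), computed after dropping "unknown" allocations. As a rough illustration only, the sketch below implements the standard ARI formula in plain Python; it is not the authors' R code. The function names, the dictionary layout (protein identifier mapped to compartment label) and the choice to drop a protein when it is "unknown" in either replicate are assumptions made for this example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        raise ValueError("need at least two items to compare")

    # Count item pairs that are grouped together in both labelings,
    # in labeling A only, and in labeling B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected at random
    maximum = (together_a + together_b) / 2           # agreement if labelings match
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every item in a single group in both
    return (together_both - expected) / (maximum - expected)


def compare_replicates(calls_a, calls_b, unknown="unknown"):
    """ARI over proteins that have a known compartment in both replicates.

    Assumption: a protein is dropped if it is 'unknown' in either replicate.
    """
    shared = [
        protein
        for protein in calls_a
        if protein in calls_b
        and calls_a[protein] != unknown
        and calls_b[protein] != unknown
    ]
    return adjusted_rand_index(
        [calls_a[p] for p in shared], [calls_b[p] for p in shared]
    )
```

Because the index depends only on which proteins are grouped together, it is unaffected by how the compartments happen to be named in each replicate.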

[…] 1e3; Average Reporter S/N >= 5; Isolation Interference <= 50%32. PSMs matching contaminants (cRAP and cRFP) and those with missing values in any of the 11-plex TMT quantitation channels were removed. PSM intensities were sum-normalised and then median-aggregated to the protein level. The 11-plex TMT experiments were then concatenated to form a 33-plex dataset, and proteins with missing values in any of the experiments were removed35.

[…] >0.999 and separately an outlier probability <5E-5. Proteins that did not meet the thresholding criteria were designated as 'unknown'. To assess the reproducibility of classifications produced by individual experimental iterations, TAGM-MAP was also performed with the default settings on each 11-plex dataset separately. Classifications were retained if their localisation probability exceeded 0.99. To assess the variability in classification between the experiments, datasets were compared pairwise using the adjusted Rand index (ARI), computed with the R package mmclust (v1.0.1)47; the ARI is 0 when consistency is what would be expected at random and 1 for perfect consistency. To avoid inflating or deflating the ARI due to an excess of "unknown" allocations, these were filtered out before comparison. TAGM-MCMC analysis was then used to provide insight into proteins designated 'unknown' by TAGM-MAP, where this could be due to dynamic protein localisation. This model was implemented using Markov-chain Monte-Carlo. The collapsed Gibbs sampler was run in parallel for 9 chains (T. brucei) and 4 chains (T. congolense), with each chain run for 10,000 iterations. Convergence was assessed using the Gelman-Rubin diagnostic; all Markov chains were retained for T. congolense, whilst for T. brucei the best two chains were retained. No thresholding criteria were applied, and protein allocations and compartment joint probabilities are reported.
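
To make the convergence check above concrete, the following is a minimal sketch of the classic Gelman-Rubin potential scale reduction factor for a single scalar quantity tracked across chains. It is not the authors' implementation: the function name is hypothetical, and the source does not state which per-iteration summary was monitored, so the input is simply assumed to be one equal-length list of numbers per chain.

```python
from math import sqrt
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of one scalar.

    `chains` is a list of lists; each inner list holds the monitored value
    at every retained iteration of one Markov chain (assumed layout).
    """
    n_chains = len(chains)
    if n_chains < 2:
        raise ValueError("need at least two chains")
    n_iter = len(chains[0])
    if n_iter < 2 or any(len(chain) != n_iter for chain in chains):
        raise ValueError("chains must share a length of at least two")

    # W: average of the within-chain sample variances.
    within = mean([variance(chain) for chain in chains])
    if within == 0:
        raise ValueError("chains are constant; the diagnostic is undefined")

    # B: between-chain variance, scaled by the chain length.
    between = n_iter * variance([mean(chain) for chain in chains])

    # Pooled estimate of the posterior variance, then the ratio to W.
    pooled = (n_iter - 1) / n_iter * within + between / n_iter
    return sqrt(pooled / within)
```

Values close to 1 suggest that the chains are sampling the same distribution; a common rule of thumb treats values above roughly 1.1 as a sign of poor convergence. A criterion of this kind could be used to decide which chains to retain, although the source does not say how the "best two chains" were chosen.
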
Joint probabilities were used to evaluate proteins that may exhibit localisation to more than one compartment.

[…] >=2X the number of genes versus the counterpart in T. brucei or T. congolense accordingly. Cases of two-one gene count orthogroups in T. brucei-T. congolense where two FASTA header identifiers matched a single gene identifier were removed from this set in T. brucei.
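
The surviving fragment above suggests that orthogroups were flagged when one species carried at least twice as many genes as its counterpart. A minimal sketch of such a filter is given below. The orthogroup identifiers, the input layout and the decision to skip orthogroups absent from either species are all assumptions for illustration, since the start of the original sentence is missing.

```python
def expanded_orthogroups(gene_counts, fold=2):
    """Flag orthogroups with at least `fold` times more genes in one species.

    `gene_counts` maps a (hypothetical) orthogroup identifier to a tuple of
    (number of T. brucei genes, number of T. congolense genes).
    Returns a dict mapping each flagged orthogroup to the expanded species.
    """
    flagged = {}
    for orthogroup, (n_brucei, n_congolense) in gene_counts.items():
        # Assumption: only orthogroups with members in both species are compared.
        if n_brucei == 0 or n_congolense == 0:
            continue
        if n_brucei >= fold * n_congolense:
            flagged[orthogroup] = "T. brucei"
        elif n_congolense >= fold * n_brucei:
            flagged[orthogroup] = "T. congolense"
    return flagged
```

The additional clean-up described above (removing two-one orthogroups in which two FASTA header identifiers map to one gene identifier) would need the underlying identifier mapping and is not reproduced here.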