Fig. 1
From: The paralog-to-contig assignment problem: high quality gene models from fragmented assemblies

The EMS-pipeline explicitly solves the paralog-to-contig assignment problem. Sequence matches to individual TCEs are collected in a step-wise procedure using either tblastn (starting from single sequences of individual TCEs) or hmmsearch (starting from a sequence alignment for each TCE). Depending on the input, pre-processing step (0a) or (0b) is performed before the similarity search. The colored boxes represent TCEs. The pre-processing steps, which are performed separately for all individual TCEs of all paralogs, are exemplified here for one paralog encoded by three exons. For a detailed description of the individual steps, see the "Methods" section.

AA amino acid sequence, HMMs hidden Markov models, ILP integer linear programming problem, TCE translated coding exon