Pairwise sequence comparison
The following exercises will illustrate several aspects of pairwise sequence comparison:
- Using some important web resources for sequence analysis software, presented in the preceding practicals.
- Understanding the usage of web-based sequence analysis tools.
- Using pairwise sequence comparison tools to illustrate some theoretical aspects studied during the course.
Reminder: some links for the recovery of sequences
[ Easily fetch sequences ]
[ Search SWISS-PROT ]
[ NCBI's Entrez ]
Pairwise alignment with LALIGN
Compare the sequences OPRM_RAT and SSR1_HUMAN (these are the SWISS-PROT IDs) with LALIGN using default parameters.
The sequences can be fetched here (choose the "FASTA" format) using the SWISS-PROT IDs.
Don't hesitate to look at the complete SWISS-PROT entries (OPRM_RAT and SSR1_HUMAN) to get more information about these two proteins!
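For reference, a FASTA file is plain text in which each record starts with a ">" header line followed by one or more sequence lines. A minimal sketch of reading such text in Python is shown below. The function name parse_fasta and the two records are made up for illustration; they are not the real OPRM_RAT or SSR1_HUMAN entries.

```python
def parse_fasta(text):
    """Return {identifier: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: keep the first whitespace-delimited token as the ID.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            # Sequence line: a record may span several lines.
            records[name].append(line)
    return {key: "".join(chunks) for key, chunks in records.items()}


# Made-up records, for illustration only.
example = """>seqA made-up record for illustration
MKTAYIAK
QRQISFVK
>seqB another made-up record
MKLV
"""

records = parse_fasta(example)
for identifier, sequence in records.items():
    print(identifier, len(sequence), sequence)
```

Pasting the text downloaded for the two SWISS-PROT IDs into `example` should give one entry per sequence, assuming the download is in standard FASTA format.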
Try to answer the following questions:
- Is this a local or a global alignment?
- Switch between local and global alignment. Try to understand the differences.
- Why are several alignments displayed when performing the local alignment?
- What does "% identity" mean? How is it computed?
- What do the symbols ":" and "." stand for?
- When two residues are different, there can be either a "." or a blank. Try to understand the difference and which parameters influence this result.
- Try to modify the gap penalties, and examine more closely how these parameters influence the occurrence and the length of gaps ("-").
- Try to modify the scoring matrix used (e.g. BLOSUM35 and BLOSUM80), and examine more closely how this choice influences the scores and the alignments.
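Several of the questions above (local versus global, "% identity", gap penalties) can be explored on toy sequences with a few lines of Python. The sketch below is not LALIGN's implementation: it assumes a simple match/mismatch score instead of a BLOSUM matrix and a linear gap penalty instead of separate gap opening and extension penalties. The names align_score and percent_identity are made up for illustration, and the identity convention used here (identical columns divided by all aligned columns) is only one of several in use, so check which one a given tool reports.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Best alignment score by dynamic programming (linear gap penalty).

    local=False: global alignment, both sequences aligned end to end.
    local=True:  local alignment, best-scoring pair of subsequences.
    """
    # Row 0: aligning a prefix of b against nothing costs one gap per residue
    # (global), or costs nothing because the alignment may start later (local).
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            value = max(prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                        prev[j] + gap,       # gap in b
                        cur[j - 1] + gap)    # gap in a
            if local:
                value = max(value, 0)        # a local alignment may restart
                best = max(best, value)
            cur.append(value)
        prev = cur
    return best if local else prev[-1]


def percent_identity(row_a, row_b):
    """Identical columns as a percentage of all aligned columns."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    same = sum(x == y and x != "-" for x, y in zip(row_a, row_b))
    return 100.0 * same / len(row_a)


# Toy sequences sharing only the short motif "ACG".
print(align_score("TTACGTTT", "GGACGGG", local=True))   # local score
print(align_score("TTACGTTT", "GGACGGG", local=False))  # global score
print(percent_identity("AC-GT", "ACTGA"))
```

The local score rewards only the shared motif, while the global score also pays for the unrelated flanking residues. Re-running with a different `gap` value gives a feel for how the penalty changes the balance between gaps and mismatches.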
Dotplot using Dotlet
Compare the same sequences (OPRM_RAT and SSR1_HUMAN) using Dotlet (note that Dotlet may not work on a Mac).
The sequences can be fetched here (choose the "FASTA" format).
Start by looking at the Dotlet documentation.
- Load the two sequences into Dotlet and compute the dotplot.
- What does the intensity (gray level) of a pixel mean?
- Try changing the grayscale limits. What would be the optimal positions for the upper and lower limits of the grayscale?
- What do the diagonal lines represent?
- Try to identify corresponding aligned regions in the dotplot and in the alignments found by LALIGN.
- Try to change the noise level by adjusting the window size, the threshold, or both.
- Try comparing each sequence against itself.
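To get a feel for how the window size and the threshold act on a dotplot, the idea can be sketched in a few lines of Python. This is a simplified illustration, not Dotlet's code: it assumes identity scoring (1 for identical residues, 0 otherwise) rather than a scoring matrix, and a binary mark instead of gray levels. The function name dotplot and the toy sequences are made up for illustration.

```python
def dotplot(a, b, window=3, threshold=2):
    """Return text rows marking window pairs that score at least `threshold`.

    Row i, column j compares a[i:i+window] with b[j:j+window] position by
    position and counts identical residues.
    """
    rows = []
    for i in range(len(a) - window + 1):
        row = []
        for j in range(len(b) - window + 1):
            score = sum(a[i + k] == b[j + k] for k in range(window))
            row.append("*" if score >= threshold else " ")
        rows.append("".join(row))
    return rows


# Toy sequences: the second one contains the first with one substitution.
for line in dotplot("HEAGAWGHEE", "PAWHEAGTWGHEEP", window=3, threshold=3):
    print("|" + line + "|")
```

Lowering `threshold` or shrinking `window` lets more chance matches through (more noise), while raising them keeps only the longer diagonal runs.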
Dotlet examples and method comparison
The Dotlet "learn by example" pages show several typical sequence analysis problems.
- Take a close look at the Dotlet examples.
- Try to understand the dotplots.
Supplementary exercise
For those who can't get enough: get some more practice by comparing some of the pairs of sequences below.
- CO9_HUMAN - PERF_HUMAN
- FOSL_DROME - GCN4_YEAST
- HBB_HUMAN - LGB1_PEA
- YOR6_ADEG1 - CD4_HUMAN