Difference between revisions of "BLAST for dummies"
Revision as of 11:56, 23 February 2015
Sequence similarity searches: queries and hits
BLAST (Basic Local Alignment Search Tool) is more or less the standard way of performing sequence similarity searches. By ‘sequences’ we mean biological sequences, either nucleotide or amino acid. Such searches are performed for many different reasons. Typically, the user has one or more unknown sequences and wants to understand what they are or what they do. In BLAST terminology, these are the query sequences.

A sequence search will (hopefully) identify sequences that are similar, or even identical, to the queries. The identified sequences are called hit sequences, or just hits. Typically, much more is known about the hits than about the query. For instance, we may know that a specific hit is an enzyme. If the match between the query and the hit is sufficiently good, we may conclude that the query sequence is also an enzyme (but not necessarily with exactly the same specificity!).

Sometimes we also perform BLAST searches with queries that are already well characterized. Continuing the previous example, we may use the sequence of a well-known enzyme as the query. In that case the hits do not help us identify the nature of the query, but they may tell us something about the distribution of this particular protein in other organisms (provided this information is included in the hit descriptions).
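To make the query/hit vocabulary concrete, here is a minimal toy sketch in Python. It is not the BLAST algorithm: it only counts identical positions between equal-length sequences, with no gaps, no scoring matrix and no statistics. The sequences, the names `enzyme_1` to `unrelated_3`, the descriptions and the 70% threshold are all made up for illustration.

```python
# Toy illustration of "query" and "hits". NOT how BLAST works internally:
# this is a naive position-by-position comparison of equal-length sequences.

def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("this toy comparison needs equal-length sequences")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def toy_search(query: str, database: dict, min_identity: float = 70.0) -> list:
    """Return (name, identity, description) for database entries similar to the query.

    The 70% default cutoff is an arbitrary assumption for this example.
    """
    hits = []
    for name, (sequence, description) in database.items():
        identity = percent_identity(query, sequence)
        if identity >= min_identity:
            hits.append((name, identity, description))
    # Best match first, as search tools usually report hits.
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


# Hypothetical database: name -> (sequence, what is known about it).
DATABASE = {
    "enzyme_1": ("MKTAYIAKQR", "well-studied enzyme"),
    "enzyme_2": ("MKTAYLAKQR", "enzyme from another organism"),
    "unrelated_3": ("GGSPLWWDEN", "structural protein"),
}

# The unknown sequence we want to learn something about.
QUERY = "MKTAYIAKQR"

hits = toy_search(QUERY, DATABASE)
for name, identity, description in hits:
    print(f"{name}: {identity:.0f}% identical ({description})")
```

Here the two enzyme entries are reported as hits and the unrelated entry is not. Because both hits are described as enzymes, we would tentatively guess that the query is an enzyme too, which is exactly the kind of reasoning described above.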