E-value & Bit Score Calculator - BLAST Statistics

Raw Alignment Score (S): Sum of substitution matrix scores

Database Size (D): Number of sequences in database

Query Length (m): Length of query sequence

Subject Length (n): Average length in database

Lambda (λ): Scale parameter for scoring system

K Parameter: Search space parameter

E-value: Expected number of hits by chance

Database Size: For bit score calculation

Database Size (log scale): 10^6
Raw Score: 50
Query Length: 300

E-value: 7836.7099 (expected hits by chance; Not Significant)
Bit Score: 23.9 (database-independent score)
P-value: 1.0000 (probability of chance occurrence)
Search Space: 4.92e+9 (effective search space)

Karlin-Altschul Statistics

E = K × m × n × e^(-λS)

Where:
E = E-value (expected number of hits by chance)
K = Search space scaling constant (~0.041 for BLOSUM62)
m = Query sequence length
n = Database length (or subject length)
λ = Scaling factor for the scoring system (~0.267 for BLOSUM62)
S = Raw alignment score
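The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name `e_value` and the example inputs are assumptions, and the defaults are simply the BLOSUM62 values quoted on this page.

```python
import math

# Defaults are the BLOSUM62 values quoted in the text (assumed here).
LAMBDA = 0.267
K = 0.041


def e_value(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance hits: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Hypothetical inputs: a 300-residue query against 4e8 database residues.
for s in (50, 100, 150):
    print(f"S = {s:3d}  E = {e_value(s, 300, 4e8):.3g}")
```

With m = 300 and a hypothetical total database length of 4 × 10^8 residues (an assumption; the page does not state it), S = 50 gives an E-value of roughly 7.8 × 10^3. Each additional raw score point multiplies E by e^(-λ), about 0.77 for λ = 0.267.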

Bit Score Conversion

S' = (λS - ln K) / ln 2

Where:
S' = Bit score (database-independent)
S = Raw score
λ, K = Karlin-Altschul parameters
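As a rough sketch of the conversion above, the following Python computes a bit score from a raw score. It also uses the equivalent form E = m × n × 2^(-S'), which follows from substituting S' into the E-value formula. The function names are illustrative assumptions, and the defaults are the BLOSUM62 values quoted on this page.

```python
import math


def bit_score(raw_score, lam=0.267, k=0.041):
    """Bit score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value_from_bits(bits, query_len, db_len):
    """E-value from a bit score: E = m * n * 2^(-S')."""
    return query_len * db_len * 2.0 ** (-bits)


print(round(bit_score(50), 1))  # 23.9
```

Because λ and K are folded into S', the second function needs only the search space (m × n), which is why bit scores can be compared across scoring systems.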

What is an E-value?

The E-value (Expectation value) represents the number of alignments with scores at least as good as the observed score that would be expected to occur by chance in a database search. Lower E-values indicate more significant matches.
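The P-value reported alongside the E-value is usually derived from it under a Poisson approximation: P = 1 - e^(-E), the probability of seeing at least one chance hit with a score this high. A minimal sketch, assuming that relation (the function name `p_value` is illustrative):

```python
import math


def p_value(e):
    """Probability of at least one chance hit, P = 1 - exp(-E)."""
    # -expm1(-e) equals 1 - exp(-e) but stays accurate for very small e.
    return -math.expm1(-e)


for e in (1e-10, 0.01, 1.0, 7836.7099):
    print(f"E = {e:<10g} P = {p_value(e):.4g}")
```

For small E the two values are nearly identical (P ≈ E); for large E the P-value saturates at 1, which matches the 1.0000 shown in the example output.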

Interpreting E-values

Common significance thresholds:

E < 1e-50: Extremely significant
E < 1e-10: Very significant
E < 0.01: Significant
E < 1: Possibly significant
E ≥ 1: Not significant
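The thresholds listed above map naturally onto a small lookup function. This is only a sketch of that table; the function name `classify_e_value` is an assumption, and the cutoffs are the conventional ones listed here rather than hard rules.

```python
def classify_e_value(e):
    """Map an E-value to the significance labels listed above."""
    if e < 1e-50:
        return "Extremely significant"
    if e < 1e-10:
        return "Very significant"
    if e < 0.01:
        return "Significant"
    if e < 1:
        return "Possibly significant"
    return "Not significant"


for e in (1e-80, 3e-12, 0.005, 0.5, 7836.7099):
    print(f"{e:g}: {classify_e_value(e)}")
```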

Bit Score vs E-value

Key differences:

Bit Score: Database-independent, allows comparison across searches
E-value: Database-dependent, changes with database size
Higher bit scores = better alignments
Lower E-values = more significant

Karlin-Altschul Parameters

Matrix-specific constants (BLOSUM62):

λ (lambda) ≈ 0.267
K ≈ 0.041
H (entropy) ≈ 0.68 bits

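Different matrices and gap penalties have different parameters. The λ and K values above are those for gapped BLOSUM62 alignments with gap costs 11/1; ungapped BLOSUM62 alignments use λ ≈ 0.318 and K ≈ 0.134.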

Database Size Effect

E-values increase linearly with database size. An alignment with E = 0.001 in a database of 1 million sequences would have E = 1 in a database of 1 billion sequences (1,000 times larger, assuming similar average sequence length), but the bit score remains constant.
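Because E is proportional to database size, an E-value from one search can be rescaled to another database by the ratio of the sizes. A minimal sketch, assuming both databases have a similar average sequence length (the function name `rescale_e_value` is illustrative):

```python
def rescale_e_value(e, old_db_size, new_db_size):
    """Rescale an E-value linearly with database size."""
    return e * new_db_size / old_db_size


# The example from the text: 1 million -> 1 billion sequences.
print(rescale_e_value(0.001, 1e6, 1e9))
```

The bit score is not an argument here because it does not change; only the search space does.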

FAQ

Q: Why do E-values change with database size?
A: Larger databases have more chances for random matches, increasing the expected number of hits by chance.

Q: What's a good E-value cutoff?
A: Typically 0.01 or 0.001, but it depends on your analysis goals and tolerance for false positives.
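Applying a cutoff in practice is a simple filter over the hit list. The sketch below uses made-up hit names and E-values purely for illustration; `filter_hits` is a hypothetical helper, not part of any BLAST tool.

```python
# Hypothetical hits as (name, e_value) pairs; the values are invented.
hits = [("hitA", 2e-30), ("hitB", 0.004), ("hitC", 0.5), ("hitD", 12.0)]


def filter_hits(hits, cutoff=0.01):
    """Keep hits whose E-value is at or below the cutoff."""
    return [(name, e) for name, e in hits if e <= cutoff]


print(filter_hits(hits))                # default cutoff 0.01
print(filter_hits(hits, cutoff=0.001))  # stricter cutoff
```

A stricter cutoff reduces false positives at the cost of possibly discarding distant but real homologs.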