Suppose you want to calculate an expression of the form
\(\displaystyle{\log\left(\sum_{i=1}^n \exp(a_i) – \sum_{j=1}^m \exp(b_j)\right)},\)
where \(\sum_{i=1}^n \exp(a_i) > \sum_{j=1}^m \exp(b_j)\). Such expressions can be difficult to evaluate directly since the exponentials can easily cause overflow errors. In this post, I’ll talk about a clever way to avoid such errors.
If there were no terms in the second sum we could use the log-sum-exp trick. That is, to calculate
\(\displaystyle{\log\left(\sum_{i=1}^n \exp(a_i) \right)},\)
we set \(a=\max\{a_i : 1 \le i \le n\}\) and use the identity
\(\displaystyle{a + \log\left(\sum_{i=1}^n \exp(a_i-a)\right) = \log\left(\sum_{i=1}^n \exp(a_i)\right)}. \)
Since \(a_i \le a\) for all \(i =1,\ldots,n\), the left hand side of the above equation can be computed without the risk of overflow. To calculate,
\(\displaystyle{\log\left(\sum_{i=1}^n \exp(a_i) – \sum_{j=1}^m \exp(b_j)\right)},\)
we can use the above method to separately calculate
\(\displaystyle{A = \log\left(\sum_{i=1}^n \exp(a_i) \right)}\) and \(\displaystyle{B = \log\left(\sum_{j=1}^m \exp(b_j)\right)}.\)
The final result we want is
\(\displaystyle{\log\left(\sum_{i=1}^n \exp(a_i) – \sum_{j=1}^m \exp(b_j) \right) = \log(\exp(A) – \exp(B)) = A + \log(1-\exp(B-A))}.\)
Since $A > B$, the right hand side of the above expression can be evaluated safely and we will have our final answer.
R code
The R code below defines a function that performs the above procedure
# Safely compute log(sum(exp(pos)) - sum(exp(neg)))
# The default value for neg is an empty vector.
logSumExp <- function(pos, neg = c()){
max_pos <- max(pos)
A <- max_pos + log(sum(exp(pos - max_pos)))
# If neg is empty, the calculation is done
if (length(neg) == 0){
return(A)
}
# If neg is non-empty, calculate B.
max_neg <- max(neg)
B <- max_neg + log(sum(exp(neg - max_neg)))
# Check that A is bigger than B
if (A <= B) {
stop("sum(exp(pos)) must be larger than sum(exp(neg))")
}
# log1p() is a built in function that accurately calculates log(1+x) for |x| << 1
return(A + log1p(-exp(B - A)))
}
An example
The above procedure can be used to evaulate
\(\displaystyle{\log\left(\sum_{i=1}^{200} i! – \sum_{i=1}^{500} \binom{500}{i}^2\right)}.\)
Evaluating this directly would quickly lead to errors since R (and most other programming languages) cannot compute \(200!\). However, R has the functions lfactorial() and lchoose() which can compute \(\log(i!)\) and \(\log\binom{n}{i}\) for large values of \(i\) and \(n\). We can thus put this expression in the general form at the start of this post
\(\displaystyle{\log\left(\sum_{i=1}^{200} \exp(\log(i!)) – \sum_{i=1}^{500} \exp\left(2\log\binom{500}{i}\right)\right)}.\)
The following R code thus us exactly what we want:
pos <- sapply(1:200, lfactorial)
neg <- sapply(1:500, function(i){2*lchoose(500, i)})
logSumExp(pos, neg)
# 863.237
Leave a Reply