Category: numerical-methods

  • Log-sum-exp trick with negative numbers

    Suppose you want to calculate an expression of the form

    \(\displaystyle{\log\left(\sum_{i=1}^n \exp(a_i) – \sum_{j=1}^m \exp(b_j)\right)},\)

    where \(\sum_{i=1}^n \exp(a_i) > \sum_{j=1}^m \exp(b_j)\). Such expressions can be difficult to evaluate directly since the exponentials can easily cause overflow errors. In this post, I’ll talk about a clever way to avoid such errors.

    If there were no terms in the second sum we could use the log-sum-exp trick. That is, to calculate

    \(\displaystyle{\log\left(\sum_{i=1}^n \exp(a_i) \right)},\)

    we set \(a=\max\{a_i : 1 \le i \le n\}\) and use the identity

    \(\displaystyle{a + \log\left(\sum_{i=1}^n \exp(a_i-a)\right) = \log\left(\sum_{i=1}^n \exp(a_i)\right)}. \)

    Since \(a_i \le a\) for all \(i =1,\ldots,n\), the left hand side of the above equation can be computed without the risk of overflow. To calculate,

    \(\displaystyle{\log\left(\sum_{i=1}^n \exp(a_i) – \sum_{j=1}^m \exp(b_j)\right)},\)

    we can use the above method to separately calculate

    \(\displaystyle{A = \log\left(\sum_{i=1}^n \exp(a_i) \right)}\) and \(\displaystyle{B = \log\left(\sum_{j=1}^m \exp(b_j)\right)}.\)

    The final result we want is

    \(\displaystyle{\log\left(\sum_{i=1}^n \exp(a_i) – \sum_{j=1}^m \exp(b_j) \right) = \log(\exp(A) – \exp(B)) = A + \log(1-\exp(B-A))}.\)

    Since $A > B$, the right hand side of the above expression can be evaluated safely and we will have our final answer.

    R code

    The R code below defines a function that performs the above procedure

    # Safely compute log(sum(exp(pos)) - sum(exp(neg)))
    # The default value for neg is an empty vector.
    
    logSumExp <- function(pos, neg = c()){
      max_pos <- max(pos)
      A <- max_pos + log(sum(exp(pos - max_pos)))
      
      # If neg is empty, the calculation is done
      if (length(neg) == 0){
        return(A)
      }
      
      # If neg is non-empty, calculate B.
      max_neg <- max(neg)
      B <- max_neg + log(sum(exp(neg - max_neg)))
      
      # Check that A is bigger than B
      if (A <= B) {
        stop("sum(exp(pos)) must be larger than sum(exp(neg))")
      }
      
      # log1p() is a built in function that accurately calculates log(1+x) for |x| << 1
      return(A + log1p(-exp(B - A)))
    }

    An example

    The above procedure can be used to evaulate

    \(\displaystyle{\log\left(\sum_{i=1}^{200} i! – \sum_{i=1}^{500} \binom{500}{i}^2\right)}.\)

    Evaluating this directly would quickly lead to errors since R (and most other programming languages) cannot compute \(200!\). However, R has the functions lfactorial() and lchoose() which can compute \(\log(i!)\) and \(\log\binom{n}{i}\) for large values of \(i\) and \(n\). We can thus put this expression in the general form at the start of this post

    \(\displaystyle{\log\left(\sum_{i=1}^{200} \exp(\log(i!)) – \sum_{i=1}^{500} \exp\left(2\log\binom{500}{i}\right)\right)}.\)

    The following R code thus us exactly what we want:

    pos <- sapply(1:200, lfactorial)
    neg <- sapply(1:500, function(i){2*lchoose(500, i)})
    
    logSumExp(pos, neg)
    # 863.237