What does Trump talk about when he talks about COVID-19?

The code and details can be found on GitHub here.


Background

Since March 14, 2020, President Trump has given semi-regular press updates on the global COVID-19 outbreak. While this pandemic is impacting nearly every aspect of everyday life, it is important to understand how our leaders perceive and communicate the next steps as we tackle these unprecedented issues.

The New York Times (NYT) recently ran a fascinating article that painstakingly analyzed over 260,000 words Trump has publicly spoken about the novel coronavirus. It took three NYT journalists to review all of these statements by hand, and they found that the President’s comments were particularly self-congratulatory and exaggerated, and that expressions of empathy were startlingly rare compared to how often he praised himself and his administration.

I wondered, though, if a more data-driven approach to Trump’s remarks would yield any new insights. In particular, I wanted to use data and natural language processing (NLP) to understand:

  • What topics does Trump think are particularly important?

  • What are some of the keywords for each day’s briefing?

  • What is Trump’s attitude (sentiment) during his speeches?

  • What underlying factors in the news or world correlate with his message?

Given that all his remarks are publicly recorded on the White House webpage, I knew we had what we needed to answer these questions.

Findings

What does Trump talk about?

I wanted to get an idea of what topics Trump talks about during his briefings. To answer this, I used non-negative matrix factorization (NMF) to cluster keywords around certain topics. The number of topics and their interpretation are up to the scientist. There’s no right answer, as this is an unsupervised learning problem, but I found that the topics made the most sense when limited to six.
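As a rough illustration of this step, here is a minimal sketch using scikit-learn. It assumes the cleaned statements live in a list of strings called statements; the placeholder data, vectorizer settings, and number of terms printed are illustrative rather than the exact parameters used for the results below.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF

# placeholder; in practice this is the full list of cleaned Trump statements
statements = ["we are working hard on the vaccines",
              "our farmers are being taken care of"]

vectorizer = TfidfVectorizer(stop_words='english', min_df=1)
X = vectorizer.fit_transform(statements)

nmf = NMF(n_components=6, init='random', random_state=0)  # six topics, as discussed above
W = nmf.fit_transform(X)   # statement-by-topic weights
H = nmf.components_        # topic-by-term weights

terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(H):
    top_terms = [terms[i] for i in topic.argsort()[::-1][:8]]
    print(f"Topic {k+1}:", ", ".join(top_terms))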

The most important keywords are listed below, along with my interpretation of each topic. For example, the first topic is clearly talking about new treatments, cures, and vaccines for COVID-19. We should expect this to be a topic during his briefings! Other ones make sense to me as well, like economic recovery. I was a little surprised (OK, maybe not too surprised) how often he remarked on his media coverage. And the COVID-19 Task Force topic generally seems to bring up how well his administration is handling the pandemic. This is consistent with what the NYT article found.

You can see for yourself below!

Examples of Trump remarks by topic

Topic 1: Vaccine and treatment

April 13: We have things happening that are unbelievable. I saw a presentation today that I can’t talk about yet, but it’s incredible. Plus, I think they’re doing — Tony — I think they’re doing very well in the vaccines. They’re working hard on the vaccines and I think you’ll have an answer for vaccines. I believe that there’s some great things coming out with respect to that. Now you need a testing period, but you’re going to have some great things.

Topic 2: Global impact

April 19: I had a G7 call and their economies are in tatters. They’re shattered, the G7 countries. You have Japan and Germany and France, and the different countries. Italy — look at what happened to Italy. Look at what happened to these countries. Look at what happened to Spain. Look what happened to Spain, how — how incredible. It’s just been shattered. And so many other countries are shattered.

Topic 3: Media coverage

March 29: I mean, even the media is much more fair. I wouldn’t say all of it, but that’s okay. They should be fair because they should want this to end. This is — this is about death.

Topic 4: Farms, trade, and tariffs

April 22: We don’t want to do — you know, the border has been turned off a number of times over the years. And you know what happened? Our farmers all went out of business. They were out of business. They couldn’t farm. We’re taking care of our farmers. Nobody ever took care of farmers like I take care of farmers.

Topic 5: COVID Task Force

April 16: I would now like to ask Vice President Mike Pence and Dr. Birx to further explain the new guidelines. I want to thank Dr. Birx. I want to thank Dr. Fauci. And I want to thank, really especially, a man who has devoted 24 hours a day to his task force and done such an incredible job — our great Vice President, Mike Pence.

Topic 6: Economic recovery

April 2: And we’ve learned a lot. We’ve learned about borders. We’ve learned about reliance on other countries. We’ve learned so much — so much that I think we really have a chance to be bigger and better and stronger. And I think it’s going to come back very quickly, but first we have to defeat this enemy.

In general, when Trump speaks, his statements fall into one of these six topics 18% of the time. That’s pretty good coverage given that we limit ourselves to only six topics. That said, relatively speaking, what does he talk about most?

Topic | Relative % | Total %
Vaccine & treatment | 16.6 | 2.9
Global impact | 21.4 | 3.7
Media coverage | 10.2 | 1.8
Farms, trade, & tariffs | 24.7 | 4.3
COVID Task Force | 14.1 | 2.5
Economic recovery | 12.9 | 2.3

When Trump’s statements fall into one of these six topics, Trump talks about farmers, trade, and tariffs nearly 25% of the time (4.3% overall). If we lump this group in with the economic recovery topic, it seems Trump talks about the economy well over a third of the time, more than double the time spent on the actual health response (vaccines and treatment). Another quarter of the time is spent commenting on the media and on how his administration is handling the crisis. These are rough numbers, but a pattern emerges: Trump’s preferred topics are the economy and his image (combining both press coverage and his administration’s COVID Task Force).

Keywords for each day’s briefing

It’s interesting to see what Trump thinks is important each day. One way to address this is to pull out the top keywords for each day’s briefing. The technique we use is term-frequency inverse-document-frequency (TF-IDF), treating each day’s briefing as a “document”. It’s a cheap and effective way to pull out the unique keywords that distinguish each day’s briefing.
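Here is a hedged sketch of what that looks like with scikit-learn, assuming briefings is a dict mapping each date to the cleaned text of that day’s briefing; the variable names, placeholder texts, and settings are illustrative.

from sklearn.feature_extraction.text import TfidfVectorizer

briefings = {"March 14": "fed proactive mic closure refinance",
             "March 15": "federal reserve google relax store"}  # placeholder texts

dates = list(briefings.keys())
tfidf = TfidfVectorizer(stop_words='english', ngram_range=(1, 3))
X = tfidf.fit_transform(briefings.values())
terms = tfidf.get_feature_names_out()

for row, date in enumerate(dates):
    scores = X[row].toarray().ravel()
    top5 = scores.argsort()[::-1][:5]
    print(date, [terms[i] for i in top5])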

Here’s an example of the important keywords found on March 14:

Date Keyword 1 Keyword 2 Keyword 3 Keyword 4 Keyword 5
March 14 fed proactive mic closure refinance

So here you can see that “mic” (or “microphone”) was a keyword of interest. Interestingly, Trump went on a bit of a rant during his talk because he was accused of touching the microphone:

Somebody said yesterday I touched the microphone. I was touching it because we have different height people and I’m trying to make it easy for them because they’re going to have to touch, because they wouldn’t be able to reach the mic; they wouldn’t be able to speak in the mic. So I’ll move the mic down. And they said, “Oh, he touched the microphone.” Well, if I don’t touch it, they’re going to have to touch it. Somebody is going to have to, so I might as well be the one to do it.

Amusing that the model identified this (somewhat) bizarre tangent.

You can see for yourself below. Near the beginning of the briefings, things seem much more focused on health care and government action (“proactive”, “self swab”, “buyback”). But as the economy progressively gets worse (see March 23!), we see more of an economic focus (“exchange”, “cure is worse than the problem”). The economic shutdown is no small deal!

Important keywords by day

Date Keyword 1 Keyword 2 Keyword 3 Keyword 4 Keyword 5
March 14 fed proactive mic closure refinance
March 15 predict Federal Reserve Google relax store
March 16 postponing symptom Friday night wash Rochelle
March 17 payroll tax Boeing telehealth tourism screened
March 18 self swab calm invincible target passenger
March 19 Syria chloroquine Tice mother chaos
March 21 student loan penalty inaccurately buyback religious leader
March 22 veteran VA Federal Medical Station preexisting rich
March 23 Boeing cure is worse than the problem watching closely exchange learning
March 24 timeline independence Larry large section established
March 25 Steve Olympics South Korea specification significance
March 26 steel Brady Tom South Korea 96
March 27 appreciative Peter smartest Boeing handle
March 30 June 1st rating waive 300,000 22 million
March 31 scarf guest ride impeached oil
April 01 cash violence train Logan Facebook
April 02 gesture cut flat Iran bump
April 03 voter id 81 voluntary mail oil
April 04 whistleblower transcript informer commissioner 180
April 05 182 sue rushing oil sign
April 06 letter London intensive nuclear replacement
April 08 delegate Bernie memo Easter voting
April 09 oil mental Scalia storage OPEC
April 10 pastor Denver interruption Mexico World Trade Organization
April 13 January committee clip Jan 17th Wuhan
April 15 judge approved Sonny session nominee
April 16 count wide arena devastated capacity
April 19 swab aroused ran pressure hotel
April 20 defending squeeze sustainable 60,000 rally
April 22 corona parlor misquoted ember flu
April 23 sun heat laboratory surface Brian Kemp
April 24 Stephen approving declined payment Tim
April 27 pharmacy damage Attorney General General Flynn quarter

You can also see the importance of oil in April. It wasn’t a huge issue early on, but you see it become more important starting around April 3, peaking around April 9. That day, if you recall, was when OPEC made huge cuts in oil production. So naturally it was a topic in Trump’s briefing that day!

How positive/negative is Trump during his speeches?

We can investigate how positively or negatively Trump speaks during his briefings. To do this, we look at the compound sentiment for each briefing using the Valence Aware Dictionary and sEntiment Reasoner (VADER). Although it was built with social media text in mind, it works well for our purposes. The compound score ranges from -1 (most negative) to +1 (most positive).

So given the statement “I hate you!”, VADER assigns a strongly negative compound score, while “I love you!” gets a strongly positive one. For mixed statements like “I love pizza, but hate the color red”, the compound sentiment comes out near zero. For Trump, we take the average compound sentiment over all his statements for a given day.
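For concreteness, here is a small sketch with the vaderSentiment package, using a hypothetical list of one day’s statements; the average of the compound scores is what gets plotted per briefing.

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import numpy as np

analyzer = SentimentIntensityAnalyzer()

briefing_statements = ["I love you!",
                       "I hate you!",
                       "I love pizza, but hate the color red"]  # hypothetical examples

compounds = [analyzer.polarity_scores(s)["compound"] for s in briefing_statements]
print(compounds)           # each compound score lies in [-1, +1]
print(np.mean(compounds))  # the daily average discussed above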

You can see that, on the whole, Trump is pretty positive-to-neutral. Politically, I think it is better to lean positive during the briefings. Of course, you don’t want to be too positive either during these tough times!

On average, his sentiment lies around 0.4, with a standard deviation of 0.1. So leaning positive, but by no means elated, which is probably the right tone to hit.

It’s interesting to note the spike in Trump’s positivity on April 24. Was he really more positive than usual? The model suggests so — and he was!

POLITICO noticed the same thing, and ran an article about it. After being rebuked the day before for suggesting sunlight and ingested disinfectant as possible cures for COVID-19, Trump took a more conventional approach. In this particular briefing, he didn’t bash any political opponents, didn’t talk about any unproven cures, and didn’t argue with reporters. He only predicted a quick economic turnaround. The article states:

For a ritual that in recent days has displayed some of the trappings of a political rally, the most unusual thing about the briefing was how conventional it was … There was no touting of unproven remedies, no swipes at political foes; only prediction of a swift economic rebound, praise for governors who are rolling out plans to reopen their states and updates on his administration’s public health and economic response to the crisis that has killed more than 50,000 Americans. (POLITICO 04/24/2020)

Fascinating to see the change in attitude reflected in the VADER analysis!

What might be influencing Trump’s attitude?

Perhaps more interesting is to try and find patterns in Trump’s sentiment. Initially, I noticed the big spike in sentiment on March 24, as well as a pretty big drop in sentiment on April 1. On March 24, there was a big stock market rally, and on April 1, the market suffered a few days of losses. I wondered: could it be that Trump’s mood is correlated with stock market performance?

To investigate, I plotted Trump’s sentiment versus the percent change in the closing value of the Dow Jones Industrial Average. On first look, you can see that the briefing sentiment shows some correlation with changes in the DJIA. If the market is doing bad, Trump is generally more negative, and if the market is doing well, Trump is more positive. It’s tough to say which direction the correlation goes (correlation is not causation!) but given that Trump usually holds his briefings later in the day, I’m inclined to think the market bears more influence on his mood. But the issues are complicated and there are many more variables at play, so take it with a grain of salt.

We can actually make the correlation between the stock market and Trump’s mood a bit tighter. Below we have a regression plot, and find that the Pearson correlation is around +0.45. Note that we don’t include the cases where Trump gives a briefing on a weekend or holiday, since there is no market close to compare against. We also see that the R² is 0.2 with a p-value of 0.03. So the Dow’s daily variation can account for about 20% of the variance in Trump’s sentiment, and the relationship is statistically significant (p-value < 0.05). This is good enough to say that there is likely a moderate positive correlation between Trump’s sentiment and market performance. Makes sense, as he prides himself on his business acumen and on keeping the American economy strong. (No comment on whether that corresponds to reality.)
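The regression itself is a one-liner with SciPy. Here is a sketch with made-up numbers standing in for the aligned daily sentiment and DJIA percent-change series (weekend and holiday briefings dropped).

import numpy as np
from scipy import stats

sentiment = np.array([0.42, 0.35, 0.51, 0.28, 0.44])     # placeholder daily averages
djia_pct_change = np.array([1.2, -0.8, 2.1, -1.5, 0.4])  # placeholder % change in DJIA close

fit = stats.linregress(djia_pct_change, sentiment)
print("Pearson r:", fit.rvalue)
print("R^2:", fit.rvalue**2)
print("p-value:", fit.pvalue)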

Details on Data Acquisition

The raw data was scraped from https://www.whitehouse.gov/remarks/ using requests and BeautifulSoup4. For now, we explore only Trump’s statements during the Coronavirus Task Force press briefings (which, as of today, May 6, may be ending).

The president’s name is listed before he speaks, so once the page is downloaded locally, it is relatively easy to extract just Trump’s statements. We treat a “statement” as any of Trump’s spoken words separated by a newline. This gave a variety of statements, usually around 3 to 5 sentences long. If any statement was less than 16 words, we ignored it (these are usually statements like “Thank you John, next question”, etc.), as they are not meaningful. If a statement was too long (more than 140 words), we split it into smaller chunks. This foresight allows us to use the BERT transformer in future work, since the default model cannot handle sequences of more than 512 tokens.
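A small sketch of those filtering rules (the thresholds come from the text above; the function name is mine):

def split_statement(statement, min_words=16, max_words=140):
    '''Drop statements under min_words; split statements over max_words into chunks.'''
    words = statement.split()
    if len(words) < min_words:
        return []               # too short to be meaningful ("Thank you John, next question")
    if len(words) <= max_words:
        return [statement]
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]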

Once the raw text was extracted to a pandas DataFrame, we applied standard text-cleaning routines to make all lowercase, remove stopwords, and add bigrams and trigrams (so mayor, de, blasio is not three separate words, but instead is treated as mayor_de_blasio, which is much more informative!).
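A rough sketch of that cleaning step, using NLTK’s stopword list and gensim’s Phrases model for the bigrams and trigrams (the placeholder data, thresholds, and variable names are illustrative):

from nltk.corpus import stopwords            # requires nltk.download('stopwords')
from gensim.models.phrases import Phrases, Phraser

stop_words = set(stopwords.words("english"))

def tokenize(text):
    # lowercase, keep alphabetic tokens, drop stopwords
    return [w for w in text.lower().split() if w.isalpha() and w not in stop_words]

raw_statements = ["Mayor de Blasio said it yesterday",
                  "We love our farmers and our great farmers love us"]  # placeholder
docs = [tokenize(s) for s in raw_statements]

bigram = Phraser(Phrases(docs, min_count=2, threshold=1))
trigram = Phraser(Phrases(bigram[docs], min_count=2, threshold=1))
docs = [trigram[bigram[doc]] for doc in docs]   # frequent phrases become single tokens, e.g. 'mayor_de_blasio'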

Neural Networks by Analogy with Linear Regression

Most scientists are aware of the importance and significance of neural networks. Yet for many, neural networks remain mysterious and enigmatic.

Here, I want to show that neural networks are simply generalizations of something we scientists are perhaps more comfortable with: linear regression.

Regression models are models that find relationships among data. Linear regression models fit linear functions to data, like the equations we learned in algebra. If you have one variable, it’s single-variate linear regression; if you have more than one variable, it’s multi-variate linear regression.

One example of a linear regression model would be Charles’ Law which says that temperature and volume of a gas are proportional. By collecting temperatures and volumes of a gas, we can derive the proportionality constant between the two. For a multi-variable case, one example might be finding the relationship between the price of a house and its square footage and how many bedrooms it has.

It turns out we can write linear regression in terms of a graph like the one below. In this example we have two types of input data (features), $x_1$ and $x_2$. The function to be fit takes each of these data points, multiplies it by a weight $w_1$ or $w_2$, and adds them together. The $b$ is the “bias”, which is analogous to the intercept $b$ in a linear equation $y = mx + b$.

To make a model, we have to solve for the weights $w_1, w_2$ and the bias $b$. This can be done by many types of optimization algorithms (say, steepest descent or by pseudoinverse). Don’t worry about how the weights are obtained; just realize that, one, we can obtain them and, two, once we have the weights and bias we have a functional relationship among our data.

So for example, if $y$ is housing price, $x_1$ is square footage, and $x_2$ is number of bedrooms, then we can predict the housing price given any house area and bedroom number via $y = w_1 x_1 + w_2 x_2 + b$. So far so good.

Now we take things a bit further. The next step in our regression “evolution” is to pass the whole above linear regression model to a function. In the example below, we pass it to a sigmoid or logistic function $\sigma(z) = 1/(1 + e^{-z})$. This is what logistic regression looks like as a graph:

This is known as a “generalized linear model.” For our purposes, the exact function you pass the linear model into doesn’t matter so much. The important part is that we can make a generalized linear model by passing the linear model into a function. If you pass it into a sigmoid function, like we did above, you get a logistic regression model which can be useful for classification problems.

Like if I tell you the color and size of an animal, the logistic model can predict whether the animal is either a flamingo or an elephant.

So to recap, we started with a linear model, then by adding a function to the output, we got a generalized linear model. What’s the next step?

Glad you asked.

The next step is to take our linear model, pass it into a function (generalized linear model), then take that output and use it as an input for another linear model. If you do this, as depicted in the graph below, you will have obtained what is called a shallow neural network.

At this point, we have obtained a nonlinear regression model. That’s really all neural networks are: models for doing nonlinear regression.

At the end of the day all these models are doing the same thing, namely, finding relationships among data. Nonlinear regression allows you to find more complex relationships among the data. This doesn’t mean nonlinear regression is better, rather it is just a more flexible model. For many real-world problems, simpler linear regression is better! As long as it is suitable to handle your problem, it will be faster and easier to interpret.

So to recap, we went from linear regression, to logistic regression (a generalized linear model), to shallow neural network (nonlinear regression). The final step to get to deep neural networks is to pass the output of the shallow network into another function and make a linear model based off that output. This is depicted below:

To reiterate, all these deep networks do is take data, make a linear model, transform with a function, then make another linear model with that output, transform that with a function, then make another linear model…and so on and so forth.

Data → linear model → function → linear model → function → linear model → …

It’s a series of nested functions that give you even more flexibility to handle strange nonlinear relationships among your data. And if that’s what your problem requires, that’s what it’s there for!
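To make the nesting concrete, here is a toy NumPy sketch of the three stages (linear model, generalized linear model, shallow network). The weights are random placeholders just to show the structure; a real model would fit them to data.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1200.0, 3.0])   # e.g. square footage and number of bedrooms

# linear regression: y = w1*x1 + w2*x2 + b
w, b = np.array([0.3, 20.0]), 5.0
y_linear = w @ x + b

# generalized linear model (logistic regression): pass the linear model into a function
y_logistic = sigmoid(w @ x + b)

# shallow neural network: pass those transformed outputs into another linear model
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), rng.normal(size=4)   # first linear layer (4 hidden units)
w2, b2 = rng.normal(size=4), 0.1                       # second linear layer
y_network = w2 @ sigmoid(W1 @ x + b1) + b2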

Obviously, modern developments of neural networks are far more complicated. I’m simplifying a lot of the details, but the general idea holds: neural networks are models to do nonlinear regression, and they are built up from linear models. They take one of the workhorses of scientific analysis, linear regression, and generalize it to handle complex, nonlinear relationships among data.

But as long as you keep in mind that it can be broken down to the pattern of passing the output of linear regression to a function, and then passing that output to another linear model, you can get pretty far.

Hope that helps!

Real-Time Time-Dependent Electronic Structure Theory

I have a new review on (and titled) real-time time-dependent electronic structure theory out now, which has been one of my active research areas over the past five years or so. Like other electronic structure methods, real-time time-dependent (RT-TD) methods seek to understand molecules through a quantum treatment of the motions of their electrons. What sets RT-TD theories and methods apart is that they explore how electrons evolve in time, especially as they respond to things like lasers, magnetic fields, X-rays, and so on. Real-time methods look at molecular responses to these perturbations explicitly in time, and give an intuitive and dynamic view of how molecules behave under all sorts of experimental conditions. From this, we can predict and explain why certain molecules make better candidates for solar energy conversion, molecular electronics, or nanosensors.

The truth is that there is a wealth of information that can be obtained by real-time time-dependent methods. Because of this, RT-TD methods are often criticized as inefficient and overkill for many predictive applications. In some cases this is true, for example when computing absorption spectra of small organic molecules (linear response methods are usually a better choice). However, to dismiss all RT-TD methods is a mistake, as the methods comprise a general technique for studying extremely complex phenomena. RT-TD methods allow for complete control over the number and strength of interacting external perturbations (multiple intense lasers, for example) in the study of molecular responses. In the review, we address many of these unique applications, ranging from non-equilibrium solvent dynamics to dynamic hyperpolarizability to the emerging real-time quantum electrodynamics (QED), where even the photon field is quantized.

The article was extremely fun to write, and I hope you find something useful and interesting in it. RT-TD comprises a broad field with extremely talented scientists working on its development. You can find the article here.

The Magnus Expansion


The goal of all real-time electronic dynamics methods is to solve the time-dependent Schrödinger equation (TDSE)

$$ i\hbar \frac{\partial \psi(t)}{\partial t} = H(t)\,\psi(t) $$

where $H(t)$ is the time-dependent Hamiltonian and $\psi(t)$ is the time-dependent wave function. The goal of the Magnus expansion is to find a general solution for the time-dependent wave function in the case where $H$ is time-dependent, and, more crucially, when $H$ does not commute with itself at different times, e.g. when $[H(t_1), H(t_2)] \neq 0$. In the following we will follow closely the notation of Blanes, et al.

First, for simplicity we redefine $\tilde{H}(t) = -\tfrac{i}{\hbar}H(t)$ and introduce a scalar $\lambda$ as a bookkeeping device, so that

$$ \psi'(t) = \lambda\,\tilde{H}(t)\,\psi(t) $$

At the heart of the Magnus expansion is the idea of solving the TDSE by using the quantum propagator $U(t,t_0)$ that connects wave functions at different times, e.g. $\psi(t) = U(t,t_0)\,\psi(t_0)$. Furthermore, the Magnus expansion assumes that $U$ can be represented as an exponential, $U(t,t_0) = \exp\big(\Omega(t,t_0)\big)$. This yields the modified TDSE

$$ U'(t,t_0) = \lambda\,\tilde{H}(t)\,U(t,t_0), \qquad U(t_0,t_0) = I $$

Now, for scalar $U$ and $\tilde{H}$, the above has a simple solution, namely

$$ U(t,t_0) = \exp\left(\lambda \int_{t_0}^{t} \tilde{H}(t')\, dt'\right) $$

However, if $U$ and $\tilde{H}$ are matrices this is not necessarily true. In other words, for a given matrix $\Omega(t)$ the following expression does not necessarily hold:

$$ \frac{d}{dt}\, e^{\Omega(t)} = \Omega'(t)\, e^{\Omega(t)} $$

because the matrix $\Omega(t)$ and its derivatives do not necessarily commute. Instead, Magnus proved that $\Omega$ in general satisfies

$$ \frac{d\Omega}{dt} = \sum_{n=0}^{\infty} \frac{B_n}{n!}\,\mathrm{ad}_{\Omega}^{\,n}\big(\lambda\tilde{H}(t)\big), \qquad \mathrm{ad}_{\Omega}^{\,n}(X) = [\Omega, \mathrm{ad}_{\Omega}^{\,n-1}(X)], \quad \mathrm{ad}_{\Omega}^{\,0}(X) = X $$

where $B_n$ are the Bernoulli numbers. This equation may be solved by integration and iterative substitution of $\Omega$. While it may appear that we are worse off than when we started, collecting like powers of $\lambda$ (and setting $\lambda = 1$) allows us to obtain a power-series expansion for $\Omega$,

$$ \Omega(t,t_0) = \int_{t_0}^{t}\! dt_1\, \tilde{H}_1 + \frac{1}{2}\int_{t_0}^{t}\! dt_1 \int_{t_0}^{t_1}\! dt_2\, [\tilde{H}_1,\tilde{H}_2] + \frac{1}{6}\int_{t_0}^{t}\! dt_1 \int_{t_0}^{t_1}\! dt_2 \int_{t_0}^{t_2}\! dt_3\, \big([\tilde{H}_1,[\tilde{H}_2,\tilde{H}_3]] + [\tilde{H}_3,[\tilde{H}_2,\tilde{H}_1]]\big) + \cdots $$

This is the Magnus expansion, and here we have given up to the third-order terms. We have also made the notational simplification $\tilde{H}_k = \tilde{H}(t_k)$. This is the basis for nearly all numerical methods to integrate the many-body TDSE in molecular physics. Each subsequent order in the Magnus expansion is a correction that accounts for the proper time-ordering of the Hamiltonian.

The Magnus expansion immediately suggests a route to many numerical integrators. The simplest would be to approximate the first term by

$$ \Omega(t+\Delta t, t) \approx \int_{t}^{t+\Delta t}\! dt'\,\tilde{H}(t') \approx -\frac{i}{\hbar} H(t)\,\Delta t $$

leading to a forward-Euler-like time integrator of

$$ \psi(t+\Delta t) = \exp\!\left(-\frac{i}{\hbar} H(t)\,\Delta t\right)\psi(t) $$

which we can re-write as

$$ \psi_{k+1} = \exp\!\left(-\frac{i}{\hbar} H_{k}\,\Delta t\right)\psi_{k} $$

where the subscript $k$ gives the node of the time-step stencil. This gives a first-order method, with local error on the order of $\Delta t^2$. A more accurate second-order method can be constructed by approximating the first term in the Magnus expansion by the midpoint rule, leading to a time integrator

$$ \psi_{k+1} = \exp\!\left(-\frac{i}{\hbar} H_{k+1/2}\,\Delta t\right)\psi_{k} $$

Modifying the stencil to eliminate the need to evaluate the Hamiltonian at fractional time steps (e.g. changing the time step from $\Delta t$ to $2\Delta t$) leads to the modified midpoint unitary transformation (MMUT) method

$$ \psi_{k+1} = \exp\!\left(-\frac{i}{\hbar} H_{k}\,2\Delta t\right)\psi_{k-1} $$

which is a leapfrog-type unitary integrator. Note that the midpoint method assumes $\tilde{H}$ is linear over its time interval, and the higher order terms (containing the commutators) in this approximation go to zero. There are many other types of integrators based off the Magnus expansion that can be found in the literature. The key point for all of these integrators is that they are symplectic, meaning they preserve phase-space relationships. This has the practical effect of conserving energy (within some error bound) in long-time dynamics, whereas non-symplectic methods such as Runge-Kutta will experience energetic “drift” over long times. A final note: in each of these schemes it is necessary to evaluate the exponential of the Hamiltonian. In real-time methods, this requires computing a matrix exponential. This is not a trivial task, and, aside from the construction of the Hamiltonian itself, is often the most expensive step in the numerical solution of the TDSE. However, many elegant solutions to the construction of the matrix exponential can be found in the literature.
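As a rough illustration of the MMUT-style step (not taken from the review; the two-level Hamiltonian and oscillating coupling here are invented purely to show the structure, with hbar set to one), here is a sketch using SciPy’s matrix exponential:

import numpy as np
from scipy.linalg import expm

def hamiltonian(t):
    # toy two-level Hamiltonian with a weak oscillating coupling (arbitrary units)
    H0 = np.diag([0.0, 1.0])
    V = 0.05 * np.cos(2.0 * t) * np.array([[0.0, 1.0], [1.0, 0.0]])
    return H0 + V

dt = 0.01
psi_prev = np.array([1.0, 0.0], dtype=complex)           # psi(0)
psi_curr = expm(-1j * dt * hamiltonian(0.0)) @ psi_prev   # one Euler-like step to start the leapfrog

for k in range(1, 1000):
    t = k * dt
    U = expm(-2j * dt * hamiltonian(t))   # MMUT: propagate from t - dt to t + dt using H(t)
    psi_prev, psi_curr = psi_curr, U @ psi_prev

print(np.vdot(psi_curr, psi_curr).real)   # norm stays ~1, since each step is unitary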

References

Blanes, S., Casas, F., Oteo, J.A. and Ros, J., 2010. A pedagogical approach to the Magnus expansion. European Journal of Physics, 31(4), p.907.

Magnus, W., 1954. On the exponential solution of differential equations for a linear operator. Communications on pure and applied mathematics, 7(4), pp.649-673.

A (hopefully) gentle guide to the computer implementation of molecular integrals over Gaussian basis functions.

Note: You can find the following integral routines implemented in a working Hartree-Fock code here.

Special thank you to Joy Alamgir for pointing out several corrections to this post (and PDF)!


In quantum chemistry, we are often interested in evaluating integrals over Gaussian basis functions. Here I am going to take much of the theory for granted and talk about how one might actually implement the integrals for quantum chemistry by using recursion relationships with Hermite Gaussians. Our goal is to have clean, readable code. I’m writing this for people who are comfortable with the mathematics behind the Gaussian integrals but want to see a readable computer implementation. I’ll talk a bit about some computational considerations at the end, but my goal is to convert equations to code. In fact, I’ve tried to structure the equations and the code in such a way that the two look very similar.

For a very good overview of integral evaluation, please see:

Helgaker, Trygve, and Peter R. Taylor. “Gaussian basis sets and molecular integrals.” Modern Electronic Structure (1995).

I will try and follow the notation used in the above reference.

Mathematical preliminaries

Let’s start with some of the basics. First, we have our 3D Gaussian functions

$$ G_{lmn}(\mathbf{r};\alpha,\mathbf{A}) = x_A^{\,l}\, y_A^{\,m}\, z_A^{\,n}\, e^{-\alpha r_A^2} $$

with orbital exponent $\alpha$, electronic coordinates $\mathbf{r}$, origin $\mathbf{A}$, and

$$ \mathbf{r}_A = \mathbf{r} - \mathbf{A} $$

also, $l$, $m$, $n$ are the angular quantum numbers (e.g. $l+m+n = 0$ is $s$-type, $l+m+n = 1$ is $p$-type, etc.). Cartesian Gaussians are separable in 3D along $x$, $y$, $z$ so that

$$ G_{lmn}(\mathbf{r};\alpha,\mathbf{A}) = G_l(x;\alpha,A_x)\, G_m(y;\alpha,A_y)\, G_n(z;\alpha,A_z) $$

with the 1D Gaussian

$$ G_i(x;\alpha,A_x) = x_A^{\,i}\, e^{-\alpha x_A^2} $$

So far, so good. Let’s consider the overlap integral of two 1D Gaussians, $a$ and $b$,

$$ S_{ij} = \int_{-\infty}^{\infty} G_i(x;\alpha,A_x)\, G_j(x;\beta,B_x)\, dx = K_{AB} \int_{-\infty}^{\infty} x_A^{\,i}\, x_B^{\,j}\, e^{-p x_P^2}\, dx $$

where we used the Gaussian product theorem so that

$$ K_{AB} = e^{-q Q_x^2}, \qquad p = \alpha + \beta, \qquad q = \frac{\alpha\beta}{p}, \qquad Q_x = A_x - B_x, \qquad P_x = \frac{\alpha A_x + \beta B_x}{p} $$

When using Hermite Gaussians, we can express $S_{ij}$ as

$$ S_{ij} = \sum_{t=0}^{i+j} E_t^{ij} \int_{-\infty}^{\infty} \Lambda_t(x_P)\, dx = E_0^{ij}\, \sqrt{\frac{\pi}{p}} $$

where $E_t^{ij}$ are expansion coefficients (to be determined recursively) and $\Lambda_t(x_P)$ is a Hermite Gaussian centered on the product center of the two Gaussians $a$ and $b$. Its integral has a simple expression that kills the sum via the Kronecker delta, $\int \Lambda_t\, dx = \delta_{t0}\sqrt{\pi/p}$. It can be shown that the expansion coefficients can be defined using the following recursive definitions

$$ E_t^{i+1,j} = \frac{1}{2p}E_{t-1}^{ij} - \frac{qQ_x}{\alpha}E_t^{ij} + (t+1)E_{t+1}^{ij} $$
$$ E_t^{i,j+1} = \frac{1}{2p}E_{t-1}^{ij} + \frac{qQ_x}{\beta}E_t^{ij} + (t+1)E_{t+1}^{ij} $$
$$ E_0^{00} = K_{AB} = e^{-qQ_x^2} $$
$$ E_t^{ij} = 0 \quad \text{if} \quad t < 0 \ \text{or}\ t > i+j $$

The first equation gives us a way to reduce the index $i$ and the second gives us a way to reduce the index $j$, so that we can get to the third equation, which is our base case. The last equation tells us what to do if we go out of index bounds.

Overlap integrals

The first thing we need to do is implement a function E which returns our expansion coefficients $E_t^{ij}$. Aside from the angular momentum $i$ and $j$ from the Gaussian functions, we also need the distance between Gaussians $Q_x = A_x - B_x$ and the orbital exponents $\alpha$ and $\beta$ as inputs.

def E(i,j,t,Qx,a,b):
    ''' Recursive definition of Hermite Gaussian coefficients.
        Returns a float.
        a: orbital exponent on Gaussian 'a' (e.g. alpha in the text)
        b: orbital exponent on Gaussian 'b' (e.g. beta in the text)
        i,j: orbital angular momentum number on Gaussian 'a' and 'b'
        t: number of nodes in Hermite (depends on type of integral, 
           e.g. always zero for overlap integrals)
        Qx: distance between origins of Gaussian 'a' and 'b'
    '''
    p = a + b
    q = a*b/p
    if (t < 0) or (t > (i + j)):
        # out of bounds for t  
        return 0.0
    elif i == j == t == 0:
        # base case
        return np.exp(-q*Qx*Qx) # K_AB
    elif j == 0:
        # decrement index i
        return (1/(2*p))*E(i-1,j,t-1,Qx,a,b) - \
               (q*Qx/a)*E(i-1,j,t,Qx,a,b)    + \
               (t+1)*E(i-1,j,t+1,Qx,a,b)
    else:
        # decrement index j
        return (1/(2*p))*E(i,j-1,t-1,Qx,a,b) + \
               (q*Qx/b)*E(i,j-1,t,Qx,a,b)    + \
               (t+1)*E(i,j-1,t+1,Qx,a,b)

This is simple enough! So for a 1D overlap between two Gaussians we would just need to evaluate $E_0^{ij}$ and multiply it by $\sqrt{\pi/p}$. Overlap integrals in 3D are just a product of the 1D overlaps. We could imagine a 3D overlap function like so

import numpy as np

def overlap(a,lmn1,A,b,lmn2,B):
    ''' Evaluates overlap integral between two Gaussians
        Returns a float.
        a:    orbital exponent on Gaussian 'a' (e.g. alpha in the text)
        b:    orbital exponent on Gaussian 'b' (e.g. beta in the text)
        lmn1: int tuple containing orbital angular momentum (e.g. (1,0,0))
              for Gaussian 'a'
        lmn2: int tuple containing orbital angular momentum for Gaussian 'b'
        A:    list containing origin of Gaussian 'a', e.g. [1.0, 2.0, 0.0]
        B:    list containing origin of Gaussian 'b'
    '''
    l1,m1,n1 = lmn1 # shell angular momentum on Gaussian 'a'
    l2,m2,n2 = lmn2 # shell angular momentum on Gaussian 'b'
    S1 = E(l1,l2,0,A[0]-B[0],a,b) # X
    S2 = E(m1,m2,0,A[1]-B[1],a,b) # Y
    S3 = E(n1,n2,0,A[2]-B[2],a,b) # Z
    return S1*S2*S3*np.power(np.pi/(a+b),1.5) 

Note that we are using the NumPy package in order to take advantage of np.pi and the fractional power np.power(np.pi/(a+b),1.5), which supplies the $(\pi/p)^{3/2}$ prefactor for the 3D overlap. The above two functions overlap and E are enough to get us the overlap between two Gaussian functions (primitives), but most basis functions are contracted, meaning they are the sum of multiple Gaussian primitives. It is not too difficult to account for this, and we can finally wrap up our evaluation of overlap integrals with a function S(a,b) which returns the overlap integral between two contracted Gaussian functions.

def S(a,b):
    '''Evaluates overlap between two contracted Gaussians
       Returns float.
       Arguments:
       a: contracted Gaussian 'a', BasisFunction object
       b: contracted Gaussian 'b', BasisFunction object
    '''
    s = 0.0
    for ia, ca in enumerate(a.coefs):
        for ib, cb in enumerate(b.coefs):
            s += a.norm[ia]*b.norm[ib]*ca*cb*\
                     overlap(a.exps[ia],a.shell,a.origin,
                     b.exps[ib],b.shell,b.origin)
    return s

Basically, this is just a sum over primitive overlaps, weighted by normalization and contraction coefficient. A word is in order for the arguments, however. In order to keep the number of arguments we have to pass into our functions manageable, we have created BasisFunction objects that contain all the relevant data for the basis function, including exponents, normalization, etc. A BasisFunction class looks like

class BasisFunction(object):
    ''' A class that contains all our basis function data
        Attributes:
        origin: array/list containing the coordinates of the Gaussian origin
        shell:  tuple of angular momentum
        exps:   list of primitive Gaussian exponents
        coefs:  list of primitive Gaussian coefficients
        norm:   list of normalization factors for Gaussian primitives
    '''
    def __init__(self,origin=[0.0,0.0,0.0],shell=(0,0,0),exps=[],coefs=[]):
        self.origin = np.asarray(origin)
        self.shell = shell
        self.exps  = exps
        self.coefs = coefs
        self.norm = None
        self.normalize()

    def normalize(self):
        ''' Routine to normalize the basis functions, in case they
            do not integrate to unity.
        '''
        l,m,n = self.shell
        L = l+m+n
        # self.norm is a list of length equal to number primitives
        # normalize primitives first (PGBFs)
        self.norm = np.sqrt(np.power(2,2*(l+m+n)+1.5)*
                        np.power(self.exps,l+m+n+1.5)/
                        fact2(2*l-1)/fact2(2*m-1)/
                        fact2(2*n-1)/np.power(np.pi,1.5))

        # now normalize the contracted basis functions (CGBFs)
        # Eq. 1.44 of Valeev integral whitepaper
        prefactor = np.power(np.pi,1.5)*\
            fact2(2*l - 1)*fact2(2*m - 1)*fact2(2*n - 1)/np.power(2.0,L)

        N = 0.0
        num_exps = len(self.exps)
        for ia in range(num_exps):
            for ib in range(num_exps):
                N += self.norm[ia]*self.norm[ib]*self.coefs[ia]*self.coefs[ib]/\
                         np.power(self.exps[ia] + self.exps[ib],L+1.5)

        N *= prefactor
        N = np.power(N,-0.5)
        for ia in range(num_exps):
            self.coefs[ia] *= N

So, for example if we had a STO-3G Hydrogen 1s at origin (1.0, 2.0, 3.0), we could create a basis function object for it like so

myOrigin = [1.0, 2.0, 3.0]
myShell  = (0,0,0) # p-orbitals would be (1,0,0) or (0,1,0) or (0,0,1), etc.
myExps   = [3.42525091, 0.62391373, 0.16885540] 
myCoefs  = [0.15432897, 0.53532814, 0.44463454]
a = BasisFunction(origin=myOrigin,shell=myShell,exps=myExps,coefs=myCoefs)

Where we used the EMSL STO-3G definition

!  STO-3G  EMSL  Basis Set Exchange Library 
! Elements                             References
! --------                             ----------
!  H - Ne: W.J. Hehre, R.F. Stewart and J.A. Pople, J. Chem. Phys. 2657 (1969).

****
H     0 
S   3   1.00
      3.42525091             0.15432897       
      0.62391373             0.53532814       
      0.16885540             0.44463454       
****

So S(a,a) = 1.0, since the overlap of an (appropriately normalized) basis function with itself is one.

Kinetic energy integrals

Having finished the overlap integrals, we move on to the kinetic energy integrals. The kinetic energy integrals can be written in terms of overlap integrals,

$$ \langle a \vert -\tfrac{1}{2}\nabla^2 \vert b \rangle = \beta\,[2(l_2+m_2+n_2)+3]\,\langle a \vert b \rangle - 2\beta^2\big(\langle a \vert b_{+2x} \rangle + \langle a \vert b_{+2y} \rangle + \langle a \vert b_{+2z} \rangle\big) - \tfrac{1}{2}\big(l_2(l_2-1)\langle a \vert b_{-2x} \rangle + m_2(m_2-1)\langle a \vert b_{-2y} \rangle + n_2(n_2-1)\langle a \vert b_{-2z} \rangle\big) $$

where $\beta$ is the orbital exponent on Gaussian $b$, $(l_2,m_2,n_2)$ is its angular momentum, and $b_{\pm 2x}$ denotes Gaussian $b$ with its angular momentum incremented or decremented by two in the $x$ component (similarly for $y$ and $z$).

For a 3D primitive, we can form a kinetic function analogous to overlap,

def kinetic(a,lmn1,A,b,lmn2,B):
    ''' Evaluates kinetic energy integral between two Gaussians
        Returns a float.
        a:    orbital exponent on Gaussian 'a' (e.g. alpha in the text)
        b:    orbital exponent on Gaussian 'b' (e.g. beta in the text)
        lmn1: int tuple containing orbital angular momentum (e.g. (1,0,0))
              for Gaussian 'a'
        lmn2: int tuple containing orbital angular momentum for Gaussian 'b'
        A:    list containing origin of Gaussian 'a', e.g. [1.0, 2.0, 0.0]
        B:    list containing origin of Gaussian 'b'
    '''
    l1,m1,n1 = lmn1
    l2,m2,n2 = lmn2
    term0 = b*(2*(l2+m2+n2)+3)*\
                            overlap(a,(l1,m1,n1),A,b,(l2,m2,n2),B)
    term1 = -2*np.power(b,2)*\
                           (overlap(a,(l1,m1,n1),A,b,(l2+2,m2,n2),B) +
                            overlap(a,(l1,m1,n1),A,b,(l2,m2+2,n2),B) +
                            overlap(a,(l1,m1,n1),A,b,(l2,m2,n2+2),B))
    term2 = -0.5*(l2*(l2-1)*overlap(a,(l1,m1,n1),A,b,(l2-2,m2,n2),B) +
                  m2*(m2-1)*overlap(a,(l1,m1,n1),A,b,(l2,m2-2,n2),B) +
                  n2*(n2-1)*overlap(a,(l1,m1,n1),A,b,(l2,m2,n2-2),B))
    return term0+term1+term2

and for contracted Gaussians we make our final function T(a,b)

def T(a,b):
    '''Evaluates kinetic energy between two contracted Gaussians
       Returns float.
       Arguments:
       a: contracted Gaussian 'a', BasisFunction object
       b: contracted Gaussian 'b', BasisFunction object
    '''
    t = 0.0
    for ia, ca in enumerate(a.coefs):
        for ib, cb in enumerate(b.coefs):
            t += a.norm[ia]*b.norm[ib]*ca*cb*\
                     kinetic(a.exps[ia],a.shell,a.origin,\
                     b.exps[ib],b.shell,b.origin)
    return t

Nuclear attraction integrals

The last one-body integral I want to consider here is the nuclear attraction integral. These differ from the overlap and kinetic energy integrals in that the nuclear attraction operator is Coulombic, meaning we cannot easily factor the integral into a product of Cartesian components $x$, $y$, $z$.

To evaluate these integrals, we need to set up an auxiliary Hermite Coulomb integral $R^{n}_{tuv}(p,\mathbf{P},\mathbf{C})$ that handles the Coulomb interaction between a Gaussian charge distribution centered at $\mathbf{P}$ and a nucleus centered at $\mathbf{C}$. The Hermite Coulomb integral, like its counterpart $E_t^{ij}$, is defined recursively:

$$ R^{n}_{t+1,u,v} = t\,R^{n+1}_{t-1,u,v} + X_{PC}\,R^{n+1}_{t,u,v} $$
$$ R^{n}_{t,u+1,v} = u\,R^{n+1}_{t,u-1,v} + Y_{PC}\,R^{n+1}_{t,u,v} $$
$$ R^{n}_{t,u,v+1} = v\,R^{n+1}_{t,u,v-1} + Z_{PC}\,R^{n+1}_{t,u,v} $$
$$ R^{n}_{0,0,0} = (-2p)^{n}\,F_{n}\!\left(p\,R_{PC}^2\right) $$

where $X_{PC}$, $Y_{PC}$, $Z_{PC}$ are the Cartesian components of $\mathbf{P}-\mathbf{C}$, $R_{PC}$ is the distance between $\mathbf{P}$ and $\mathbf{C}$, and $F_n(T)$ is the Boys function

$$ F_n(T) = \int_{0}^{1} e^{-T x^2}\, x^{2n}\, dx $$

which is a special case of the Kummer confluent hypergeometric function ${}_1F_1(a,b,x)$,

$$ F_n(T) = \frac{{}_1F_1\!\left(n+\tfrac{1}{2},\; n+\tfrac{3}{2},\; -T\right)}{2n+1} $$

which is convenient for us, since SciPy has an implementation of ${}_1F_1$ as a part of scipy.special. So for R we can code up the recursion like so

def R(t,u,v,n,p,PCx,PCy,PCz,RPC):
    ''' Returns the Coulomb auxiliary Hermite integrals 
        Returns a float.
        Arguments:
        t,u,v:   order of Coulomb Hermite derivative in x,y,z
                 (see defs in Helgaker and Taylor)
        n:       order of Boys function 
        PCx,y,z: Cartesian vector distance between Gaussian 
                 composite center P and nuclear center C
        RPC:     Distance between P and C
    '''
    T = p*RPC*RPC
    val = 0.0
    if t == u == v == 0:
        val += np.power(-2*p,n)*boys(n,T)
    elif t == u == 0:
        if v > 1:
            val += (v-1)*R(t,u,v-2,n+1,p,PCx,PCy,PCz,RPC)
        val += PCz*R(t,u,v-1,n+1,p,PCx,PCy,PCz,RPC)
    elif t == 0:
        if u > 1:
            val += (u-1)*R(t,u-2,v,n+1,p,PCx,PCy,PCz,RPC)
        val += PCy*R(t,u-1,v,n+1,p,PCx,PCy,PCz,RPC)
    else:
        if t > 1:
            val += (t-1)*R(t-2,u,v,n+1,p,PCx,PCy,PCz,RPC)
        val += PCx*R(t-1,u,v,n+1,p,PCx,PCy,PCz,RPC)
    return val

and we can define our boys(n,T) function as

from scipy.special import hyp1f1

def boys(n,T):
    return hyp1f1(n+0.5,n+1.5,-T)/(2.0*n+1.0) 

There are other definitions of the Boys function of course, in case you do not want to use the SciPy built-in. Note that R requires knowledge of the composite center $\mathbf{P}$ of the two Gaussians centered at $\mathbf{A}$ and $\mathbf{B}$. We can determine $\mathbf{P}$ using the Gaussian product center rule

$$ \mathbf{P} = \frac{\alpha\mathbf{A} + \beta\mathbf{B}}{\alpha + \beta} $$

which is very simply coded up as

def gaussian_product_center(a,A,b,B):
    return (a*A+b*B)/(a+b)

Now that we have the Coulomb auxiliary Hermite integrals $R^{n}_{tuv}$, we can form the nuclear attraction integrals with respect to a given nucleus centered at $\mathbf{C}$ via the expression

$$ \langle a \vert \frac{1}{r_C} \vert b \rangle = \frac{2\pi}{p} \sum_{t,u,v} E_t^{l_1 l_2}\, E_u^{m_1 m_2}\, E_v^{n_1 n_2}\, R^{0}_{tuv}(p,\mathbf{P},\mathbf{C}) $$

def nuclear_attraction(a,lmn1,A,b,lmn2,B,C):
    ''' Evaluates nuclear attraction integral between two Gaussians
         Returns a float.
         a:    orbital exponent on Gaussian 'a' (e.g. alpha in the text)
         b:    orbital exponent on Gaussian 'b' (e.g. beta in the text)
         lmn1: int tuple containing orbital angular momentum (e.g. (1,0,0))
               for Gaussian 'a'
         lmn2: int tuple containing orbital angular momentum for Gaussian 'b'
         A:    list containing origin of Gaussian 'a', e.g. [1.0, 2.0, 0.0]
         B:    list containing origin of Gaussian 'b'
         C:    list containing origin of nuclear center 'C'
    '''
    l1,m1,n1 = lmn1 
    l2,m2,n2 = lmn2
    p = a + b
    P = gaussian_product_center(a,A,b,B) # Gaussian composite center
    RPC = np.linalg.norm(P-C)

    val = 0.0
    for t in range(l1+l2+1):
        for u in range(m1+m2+1):
            for v in range(n1+n2+1):
                val += E(l1,l2,t,A[0]-B[0],a,b) * \
                       E(m1,m2,u,A[1]-B[1],a,b) * \
                       E(n1,n2,v,A[2]-B[2],a,b) * \
                       R(t,u,v,0,p,P[0]-C[0],P[1]-C[1],P[2]-C[2],RPC)
    val *= 2*np.pi/p 
    return val

And, just like all the other routines, we can wrap it up to treat contracted Gaussians like so:

def V(a,b,C):
    '''Evaluates nuclear attraction between two contracted Gaussians
       Returns float.
       Arguments:
       a: contracted Gaussian 'a', BasisFunction object
       b: contracted Gaussian 'b', BasisFunction object
       C: center of nucleus
    '''
    v = 0.0
    for ia, ca in enumerate(a.coefs):
        for ib, cb in enumerate(b.coefs):
            v += a.norm[ia]*b.norm[ib]*ca*cb*\
                     nuclear_attraction(a.exps[ia],a.shell,a.origin,
                     b.exps[ib],b.shell,b.origin,C)
    return v

Important: Note that this is the nuclear attraction integral contribution from a single nucleus centered at $\mathbf{C}$. To get the full nuclear attraction contribution, you must sum over all the nuclei, as well as scale each term by the appropriate nuclear charge!
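For example, a hypothetical helper that assembles the full attraction term might look like the sketch below, assuming molecule is a list of (nuclear charge, center) pairs; the negative sign gives the attractive electron-nucleus interaction as it usually enters the core Hamiltonian.

molecule = [(1, np.array([0.0, 0.0, 0.0])),    # H at the origin
            (1, np.array([0.0, 0.0, 1.4]))]    # H at ~1.4 bohr along z

def V_total(a, b, molecule):
    '''Sum the nuclear attraction over all nuclei, each scaled by -Z.'''
    return sum(-Z * V(a, b, center) for Z, center in molecule)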

Two electron repulsion integrals

We are done with the necessary one-body integrals (for a basic Hartree-Fock energy code, at least) and are ready to move on to the two-body terms: the electron-electron repulsion integrals. Thankfully, much of the work has been done for us on account of the nuclear attraction one-body integrals.

In terms of Hermite integrals, to evaluate the two-electron repulsion terms, we must evaluate the summation

$$ \langle ab \vert cd \rangle = \frac{2\pi^{5/2}}{pq\sqrt{p+q}} \sum_{t,u,v} E_t^{l_1 l_2} E_u^{m_1 m_2} E_v^{n_1 n_2} \sum_{\tau,\nu,\phi} E_\tau^{l_3 l_4} E_\nu^{m_3 m_4} E_\phi^{n_3 n_4}\, (-1)^{\tau+\nu+\phi}\, R^{0}_{t+\tau,\,u+\nu,\,v+\phi}(\alpha,\mathbf{P},\mathbf{Q}) $$

which looks terrible, and it is. However, letting $p = a + b$ and $q = c + d$ (that is, the sums of the Gaussian exponents on $a$ and $b$, and on $c$ and $d$) and defining $\alpha = pq/(p+q)$, we can write the equation in a form very similar to the nuclear attraction integrals

def electron_repulsion(a,lmn1,A,b,lmn2,B,c,lmn3,C,d,lmn4,D):
    ''' Evaluates electron repulsion integral between four primitive Gaussians
        Returns a float.
        a,b,c,d:   orbital exponent on Gaussian 'a','b','c','d'
        lmn1,lmn2
        lmn3,lmn4: int tuple containing orbital angular momentum
                   for Gaussian 'a','b','c','d', respectively
        A,B,C,D:   list containing origin of Gaussian 'a','b','c','d'
    '''
    l1,m1,n1 = lmn1
    l2,m2,n2 = lmn2
    l3,m3,n3 = lmn3
    l4,m4,n4 = lmn4
    p = a+b # composite exponent for P (from Gaussians 'a' and 'b')
    q = c+d # composite exponent for Q (from Gaussians 'c' and 'd')
    alpha = p*q/(p+q)
    P = gaussian_product_center(a,A,b,B) # A and B composite center
    Q = gaussian_product_center(c,C,d,D) # C and D composite center
    RPQ = np.linalg.norm(P-Q)

    val = 0.0
    for t in range(l1+l2+1):
        for u in range(m1+m2+1):
            for v in range(n1+n2+1):
                for tau in range(l3+l4+1):
                    for nu in range(m3+m4+1):
                        for phi in range(n3+n4+1):
                            val += E(l1,l2,t,A[0]-B[0],a,b) * \
                                   E(m1,m2,u,A[1]-B[1],a,b) * \
                                   E(n1,n2,v,A[2]-B[2],a,b) * \
                                   E(l3,l4,tau,C[0]-D[0],c,d) * \
                                   E(m3,m4,nu ,C[1]-D[1],c,d) * \
                                   E(n3,n4,phi,C[2]-D[2],c,d) * \
                                   np.power(-1,tau+nu+phi) * \
                                   R(t+tau,u+nu,v+phi,0,\
                                       alpha,P[0]-Q[0],P[1]-Q[1],P[2]-Q[2],RPQ)

    val *= 2*np.power(np.pi,2.5)/(p*q*np.sqrt(p+q))
    return val

And, for completeness’ sake, we wrap the above to handle contracted Gaussians

def ERI(a,b,c,d):
    '''Evaluates electron repulsion integral between four contracted Gaussians
        Returns float.
        Arguments:
        a: contracted Gaussian 'a', BasisFunction object
        b: contracted Gaussian 'b', BasisFunction object
        c: contracted Gaussian 'c', BasisFunction object
        d: contracted Gaussian 'd', BasisFunction object
    '''
    eri = 0.0
    for ja, ca in enumerate(a.coefs):
        for jb, cb in enumerate(b.coefs):
            for jc, cc in enumerate(c.coefs):
                for jd, cd in enumerate(d.coefs):
                    eri += a.norm[ja]*b.norm[jb]*c.norm[jc]*d.norm[jd]*\
                             ca*cb*cc*cd*\
                             electron_repulsion(a.exps[ja],a.shell,a.origin,\
                                                b.exps[jb],b.shell,b.origin,\
                                                c.exps[jc],c.shell,c.origin,\
                                                d.exps[jd],d.shell,d.origin)
    return eri

And there you have it! All the integrals necessary for a Hartree-Fock SCF code.
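As a quick sanity check, here is a sketch of calling the routines above, reusing the STO-3G hydrogen basis function a defined earlier; the second center is arbitrary, and this assumes all the functions above are in scope.

another_center = [1.0, 2.0, 4.4]
b = BasisFunction(origin=another_center, shell=(0,0,0), exps=myExps, coefs=list(myCoefs))

print(S(a, a))                      # ~1.0: normalized overlap with itself
print(S(a, b))                      # overlap between the two 1s functions
print(T(a, a))                      # kinetic energy integral
print(V(a, a, np.array(myOrigin)))  # attraction to a nucleus sitting on a's origin
print(ERI(a, a, b, b))              # one of the two-electron repulsion integrals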

Computational efficiency considerations

Our goal here has been to eliminate some of the confusion when it comes to connecting mathematics to actual computer code. So the code that we have shown is hopefully clear and looks nearly identical to the mathematical equations it is supposed to represent. This is one reason we chose Python as the language to implement the integrals in: it emphasizes readability.

If you use the code as is, you’ll find that you can only really handle small systems. To that end, I’ll give a few ideas on how to improve the integral code to actually be usable.

First, I would recommend not using pure Python. The problem is that we have some pretty deep loops in the code, and nested for loops will kill your speed if you insist on sticking with Python. Now, you can code in another language, but I would suggest rewriting some of the lower level routines with Cython (http://www.cython.org). Cython statically compiles your code to eliminate many of the Python calls that slow your loops down. In my experience, you will get several orders of magnitude speed up. This brings me to my second point. One of the problems with Python is that you have the global interpreter lock (GIL) which basically means you cannot run things in parallel. Many of the integral evaluation routines would do well if you could split the work up over multiple CPUs. If you rewrite some of the low level routines such that they do not make Python calls anymore, you can turn off the GIL and wrap the code in OpenMP directives, or even use Cython’s cython.parallel module. This will take some thought, but can definitely be done. Furthermore, removing the explicit recursion in the E and R functions and making them iterative would go a long way to speed up the code.

A couple other thoughts: be sure to exploit the permutational symmetry of the integrals. The two-electron repulsion integrals, for example, can be done in roughly one-eighth of the time just by exploiting these symmetries, which are unrelated to point group symmetry. Also, you can exploit many of the integral screening routines in the literature, since many of the two-electron integrals are effectively zero. There are a lot of tricks out there in the literature, go check them out!
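As an illustration of that eight-fold symmetry, here is a small sketch (names are mine) of generating only the unique index quadruples using the usual canonical ordering, i >= j, k >= l, and compound index ij >= kl:

def unique_eri_indices(nbasis):
    '''Yield only the canonically unique (i,j,k,l) quadruples for the ERIs.'''
    for i in range(nbasis):
        for j in range(i + 1):
            ij = i * (i + 1) // 2 + j          # compound index for the (i,j) pair
            for k in range(nbasis):
                for l in range(k + 1):
                    kl = k * (k + 1) // 2 + l  # compound index for the (k,l) pair
                    if ij >= kl:
                        yield i, j, k, l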