
What does integration do?

Integration calculates XRD data from a diffraction image. PDFstream integration has four steps: background subtraction, auto masking, histogram calculation with corrections, and visualization.

Background subtraction subtracts the background diffraction image, multiplied by a scalar that is one by default, from the diffraction image of the sample. The background image is usually the diffraction from the air and the sample container. The two images must have the same dimensions.
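
The arithmetic of this step can be sketched with NumPy. This is a minimal illustration on synthetic arrays, not PDFstream's own code; the function name subtract_background and its bg_scale argument are hypothetical names chosen to mirror the description above.

```python
import numpy as np


def subtract_background(sample, background, bg_scale=1.0):
    """Return sample - bg_scale * background for two images of equal shape."""
    sample = np.asarray(sample, dtype=float)
    background = np.asarray(background, dtype=float)
    if sample.shape != background.shape:
        raise ValueError("sample and background images must have the same shape")
    return sample - bg_scale * background


# Tiny synthetic images stand in for real diffraction data.
sample_img = np.array([[10.0, 12.0], [14.0, 16.0]])
background_img = np.array([[1.0, 2.0], [3.0, 4.0]])
subtracted = subtract_background(sample_img, background_img, bg_scale=2.0)
print(subtracted)
```

The shape check reflects the requirement that the two images have the same dimensions.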

Auto masking masks the background-subtracted image automatically. With the default settings, three kinds of pixels are masked out: pixels at the margin of the image, pixels whose intensity is below the low threshold or above the high threshold, and pixels whose intensity is too far from the median value of the pixels in the same ring. Masked pixels are not counted in the histogram calculation.
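
The three rules can be illustrated with a simplified NumPy sketch. It is not PDFstream's implementation: rings are taken as concentric bands of pixel radius around a given center, and "too far from the median" is assumed to mean more than alpha standard deviations of the ring. The function name, the ring_width parameter, and these exact semantics are assumptions made for illustration.

```python
import numpy as np


def sketch_auto_mask(img, center, edge=1, lower_thresh=0.0, upper_thresh=1e5,
                     alpha=3.0, ring_width=1.0):
    """Return an integer mask (1 = masked out, 0 = kept) built from three rules."""
    img = np.asarray(img, dtype=float)
    mask = np.zeros(img.shape, dtype=int)

    # Rule 1: mask a margin of `edge` pixels around the image.
    if edge > 0:
        mask[:edge, :] = 1
        mask[-edge:, :] = 1
        mask[:, :edge] = 1
        mask[:, -edge:] = 1

    # Rule 2: mask pixels outside the intensity thresholds.
    mask[(img < lower_thresh) | (img > upper_thresh)] = 1

    # Rule 3 (assumed semantics): group pixels into rings around `center` and
    # mask those more than `alpha` standard deviations from the ring median.
    rows, cols = np.indices(img.shape)
    radius = np.hypot(rows - center[0], cols - center[1])
    ring_index = (radius / ring_width).astype(int)
    for ring in np.unique(ring_index):
        in_ring = (ring_index == ring) & (mask == 0)
        if in_ring.sum() < 3:
            continue  # too few pixels for meaningful statistics
        values = img[in_ring]
        deviation = np.abs(img - np.median(values))
        mask[in_ring & (deviation > alpha * values.std())] = 1
    return mask


img = np.full((21, 21), 100.0)
img[10, 15] = 1000.0  # an outlier on an otherwise uniform ring
mask = sketch_auto_mask(img, center=(10, 10), edge=2)
print(mask[10, 15], mask[10, 10])
```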

The histogram calculation bins the pixels into rings and calculates the mean intensity of each bin. The rings are mapped to momentum transfer, two theta, or radius according to the user's settings. The result is the XRD data, which is saved in .chi files. This step is based on pyFAI.azimuthalIntegrator. Before the histogram is calculated, polarization correction and other corrections are applied according to the settings.
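
The idea of binning pixels into rings and averaging each bin can be sketched as follows. This is a deliberately simplified illustration, not what pyFAI does internally: it works in pixel radius only, ignores the detector geometry, and applies no corrections. The function name ring_mean is hypothetical.

```python
import numpy as np


def ring_mean(img, mask, center, npt):
    """Bin unmasked pixels by radius (in pixels) and return bin centers and mean intensity."""
    img = np.asarray(img, dtype=float)
    rows, cols = np.indices(img.shape)
    radius = np.hypot(rows - center[0], cols - center[1])
    good = np.asarray(mask) == 0  # pyFAI convention: 0 = good pixel

    edges = np.linspace(0.0, radius.max(), npt + 1)
    total, _ = np.histogram(radius[good], bins=edges, weights=img[good])
    count, _ = np.histogram(radius[good], bins=edges)

    centers = 0.5 * (edges[:-1] + edges[1:])
    mean = np.full(npt, np.nan)  # bins without pixels stay NaN
    np.divide(total, count, out=mean, where=count > 0)
    return centers, mean


img = np.full((21, 21), 5.0)
img[3, 4] = 500.0                      # a bad pixel ...
mask = np.zeros(img.shape, dtype=int)
mask[3, 4] = 1                         # ... that the mask removes from the average
r, intensity = ring_mean(img, mask, center=(10, 10), npt=10)
print(intensity)
```

Because the bad pixel is masked, it does not contribute to the mean of its ring.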

The visualization shows the masked, background-subtracted image and the result of the histogram calculation. Users can tune the visualization settings to get the effect they want.

How to do integration?

The Python examples below show how to do the integration. Since there is a one-to-one relationship between the Python functions in pdfstream and the command line, the same tasks can be done from the command line.

Import the function to start.

from pdfstream.cli import integrate

A simple integration

For example, we are going to calculate the XRD data I(Q) from the diffraction image “sample_diffraction.tiff”. We have already done the calibration and obtained the .poni file “geometry.poni”.

We run the following line to calculate XRD data I(Q).

integrate("geometry.poni", "sample_diffraction.tiff")

After it finishes, we will find a file “sample_diffraction.chi” in the same folder where we run the script.

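If we would like to integrate another image “another_sample_diffraction.tiff” using the same .poni file, we can add it after the first image.

integrate("geometry.poni", "sample_diffraction.tiff", "another_sample_diffraction.tiff")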

We can add an arbitrary number of image files after the first argument. The configuration for the integration is all in the keyword arguments described in the following sections.
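
When there are many images, typing every file name is impractical. A minimal sketch, assuming the images sit in one folder and share the .tiff extension, collects them with the standard library; the helper collect_images is a hypothetical name and not part of pdfstream.

```python
from pathlib import Path


def collect_images(folder, pattern="*.tiff"):
    """Return the sorted paths (as strings) of the files in `folder` matching `pattern`."""
    return sorted(str(path) for path in Path(folder).glob(pattern))


img_files = collect_images(".")
print(img_files)
# With pdfstream installed, the list could be unpacked after the .poni file:
# integrate("geometry.poni", *img_files)
```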

Output directory

If we would like to write the output files to a directory called data, we can use the key output_dir.

integrate("geometry.poni", "sample_diffraction.tiff", output_dir="data")

If the folder data doesn’t exist, it will be created.

Background subtraction

Continuing with the last example, we would like to subtract the background scattering from the air and the sample container, which was measured and saved in “background_diffraction.tiff”. We run the following line. Remember that all the arguments except the .poni file and the image files must be keyword arguments.

integrate("geometry.poni", "sample_diffraction.tiff", output_dir="data", bg_img_file="background_diffraction.tiff")

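If the background image was measured with a beam 10 times more intense, we can use bg_scale to scale the background image down by a factor of 10.

integrate("geometry.poni", "sample_diffraction.tiff", output_dir="data", bg_img_file="background_diffraction.tiff", bg_scale=0.1)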

Auto masking

By default, auto masking is applied with the default settings.

If we would like to tune the settings, we can use the key mask_setting.

integrate(
    "geometry.poni", "sample_diffraction.tiff",
    mask_setting={"alpha": 1.5, "lower_thresh": 1., "upper_thresh": 1e5, "edge": 50}
)

If we would like to use our own mask “user_mask.npy” combined with the automatically generated mask, we can use the key mask_file.

integrate("geometry.poni", "sample_diffraction.tiff", mask_file="user_mask.npy")

Note that PDFstream uses the pyFAI masking convention. The mask is an array of integers: pixels with value 0 are good, and pixels with value 1 are bad and will be masked out.
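
A mask file following this convention can be prepared with NumPy. The sketch below is only an illustration: the detector shape and the locations of the bad pixels are made up, and the real mask must have the same shape as the diffraction image.

```python
import numpy as np

# Hypothetical detector shape; use the shape of the actual diffraction image.
detector_shape = (512, 512)

# pyFAI convention: 0 = good pixel, 1 = bad pixel to be masked out.
user_mask = np.zeros(detector_shape, dtype=int)
user_mask[100:110, :] = 1   # e.g. a dead band of rows (made-up location)
user_mask[:, 30] = 1        # e.g. a single bad column (made-up location)

np.save("user_mask.npy", user_mask)
```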

If we don’t want auto masking, we can set mask_setting to "OFF".

integrate("geometry.poni", "sample_diffraction.tiff", mask_file="user_mask.npy", mask_setting="OFF")

This way only our own mask is used. We can also run without any mask using the following line.

integrate("geometry.poni", "sample_diffraction.tiff", mask_setting="OFF")

Histogram Calculation

By default, the histogram calculation is applied with the default settings.

The configuration can be tuned with the key integ_setting. The example below calculates a histogram of I(2theta) with 2048 points using the numpy method.

integrate(
    "geometry.poni", "sample_diffraction.tiff",
    integ_setting={"npt": 2048, "unit": "2th_deg", "method": "numpy"}
)

For details of the configuration, please see the pyFAI documentation.

Visualization
By default, the visualization is applied with the default settings.

We can use the key img_setting to tune how the image is shown. The keys are the same as those of matplotlib.axes.Axes.matshow. An additional key is z_score. It determines the minimum and maximum values of the color map: vmin = mean - z_score * std and vmax = mean + z_score * std, where mean is the mean value of the image and std is its standard deviation. If we would like to show the image with higher contrast, we can lower z_score, for example to 1.

integrate("geometry.poni", "sample_diffraction.tiff", img_setting={"z_score": 1})
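
The effect of z_score on the color limits can be checked with a few lines of NumPy. This sketch only evaluates the two formulas above on a synthetic image. It assumes the population standard deviation (NumPy's default) over all pixels; whether PDFstream excludes masked pixels from these statistics is not stated here. The function name z_score_limits is hypothetical.

```python
import numpy as np


def z_score_limits(img, z_score):
    """Return (vmin, vmax) = (mean - z_score * std, mean + z_score * std) of the image."""
    img = np.asarray(img, dtype=float)
    mean, std = img.mean(), img.std()
    return mean - z_score * std, mean + z_score * std


img = np.array([[0.0, 2.0], [2.0, 4.0]])  # synthetic image with mean 2
for z in (2, 1):
    vmin, vmax = z_score_limits(img, z)
    print(f"z_score={z}: vmin={vmin:.3f}, vmax={vmax:.3f}")
```

A smaller z_score gives a narrower color range, so intensities near the mean are spread over more of the color map.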

We can use the key plot_setting to tune how the result of the integration is shown. The keys are the same as those of matplotlib.axes.Axes.plot. For example, we would like to plot a line with green circle markers.

integrate("geometry.poni", "sample_diffraction.tiff", plot_setting={"marker": "o", "color": "green"})

Both img_setting and plot_setting can be set to "OFF" to skip the visualization steps.

integrate("geometry.poni", "sample_diffraction.tiff", img_setting="OFF", plot_setting="OFF")

Parallel Computing

The integrate function supports parallel computing for multiple images. If we would like to integrate a long list of images in parallel, we can use the key parallel.

integrate("geometry.poni", "sample_diffraction.tiff", "another_sample_diffraction.tiff", parallel=True)

The efficiency depends on how many cores our machine has. It is recommended to turn off the visualization if there are a large number of images.