ptyrax.dataset


Functions

add_constant_sample_shift(ptychogram, ...)

Add a constant offset to all sample orientation representations.

add_gaussian_noise(ptychogram, noise_mean, ...)

Add Gaussian noise to the diffraction patterns and clamp to non-negative.

add_poisson_noise(ptychogram[, ...])

Add Poisson-distributed shot noise to the diffraction patterns.

amplitude_to_intensity(ptychogram)

Convert amplitude-valued diffraction patterns to intensity by squaring.

apply_orientation(ptychogram[, orientation, ...])

Apply a geometric orientation transformation to diffraction patterns.

center_scan_positions(ptychogram)

Translate sample positions so their centroid is at the origin.

clip_low_intensity(ptychogram[, pixel_ratio])

Set pixels below a percentile threshold to zero.

cut_center(ptychogram[, ratio])

Crop each diffraction pattern to a central sub-region.

exclude_positions_by_distance(ptychogram[, ...])

Remove scan positions outside a distance range from the mean position.

experiment_folder_to_ptychogram_cxi(...[, ...])

Convert a raw experiment folder into a CXI-format HDF5 file.

experiment_folder_to_ptychogram_hdf5(...[, ...])

Convert a raw experiment folder into a flat ptychogram HDF5 file.

fftshift_ptychogram(ptychogram)

Apply an FFT-shift to all diffraction patterns along the spatial axes.

find_missing_folders(darkframe_folder, ...)

Auto-detect missing folder and file paths from the experiment layout.

flip_scan_axis(ptychogram, axis)

Negate a component of all sample positions.

flip_scan_x(ptychogram)

Negate the x-component of all sample positions.

flip_scan_y(ptychogram)

Negate the y-component of all sample positions.

flip_scanning_positions(ptychogram)

Reverse the order of coordinates within each sample position vector.

from_cxi(cxi_path[, background_path])

Load a CXI file (as created by experiment_folder_to_ptychogram_cxi()) and return a Ptychogram object.

from_hdf5(ptychogram_path[, key_converter, ...])

Load a Ptychogram from an HDF5 file or remote URL, autodetecting the format.

intensity_to_amplitude(ptychogram)

Convert intensity-valued diffraction patterns to amplitude by taking the square root.

load_raw_images(folder_path[, precision, ...])

Load all images from a folder into a single numpy array.

load_spe_from_files(filepaths)

Load multiple SPE files at once.

make_constant_tilt_angle(ptychogram, tilt_angle)

Set a uniform sample tilt angle and recompute all geometry accordingly.

make_multiwavelength(ptychogram[, ...])

Tile the wavelength array to simulate multi-wavelength illumination.

mirror_coordinates(ptychogram)

Mirror the geometry about the y-z plane (negate x-coordinates).

non_negative(ptychogram)

Clamp negative pixel values in the diffraction patterns to zero.

normalize_by_max(ptychogram[, new_max])

Rescale diffraction patterns so the global maximum equals new_max.

normalize_by_mean(ptychogram)

Rescale diffraction patterns so their global mean value becomes unity.

normalize_by_mean_intensity(ptychogram)

Rescale diffraction patterns by the mean of the per-pattern L2 norms.

old_tud_key_converter(key)

plot_dataset_dynamic_range(dataset, output_path)

Plot and save a pixel intensity histogram for the dataset.

propagation_distance_to_full_position(n, local)

Convert a scalar or per-position propagation distance to 3-D detector positions.

quantize_diffraction_patterns(ptychogram[, ...])

Quantize diffraction patterns to simulate a finite bit-depth detector.

read_at(file, pos, size, ntype)

Reads SPE source file at specific byte position.

read_excel_columns(excel_path[, ...])

Read columns from an Excel file by matching column headers with regex patterns.

read_excel_scan_pos(excel_path[, precision])

Read scan positions from an Excel file using fixed column indices.

read_image(file_path[, precision])

Read an image file, dispatching to the appropriate reader by extension.

read_mat_metadata(file_path)

Load and simplify the metaData struct from a MATLAB .mat file.

read_mat_scan_pos(file_path, **kwargs)

Read scan positions from a MATLAB .mat metadata file.

read_png(file_path[, precision])

Read a PNG image file and return it as a numpy array.

read_scan_pos_file(scan_pos_file, **kwargs)

Dispatch scan position loading based on file extension.

read_spe(file_path[, precision])

Read a Princeton Instruments SPE file and return its image data.

remove_zeros(ptychogram)

Replace exact-zero pixels with the minimum non-zero value.

save_all_in_hdf5(data_list, ...)

Save multiple arrays to a single HDF5 file.

scale(ptychogram, scale)

Multiply all diffraction pattern values by a constant factor.

scale_camera_distance(ptychogram, scale)

Multiply all detector positions by a constant factor.

scale_diffraction_pattern_maximum(...)

Rescale diffraction patterns so the global maximum equals maximum.

scale_length_unit(ptychogram[, scale])

Multiply all length quantities by a constant factor.

scale_scan_positions(ptychogram, scale)

Scale sample positions by a per-axis factor in the local sample frame.

scale_wavelength(ptychogram, scale)

Multiply the wavelength array by a constant factor.

set_constant_detector_orientations(...)

Set all detector orientations to a single constant rotation.

set_constant_detector_positions(ptychogram, ...)

Set all detector positions to a single constant value.

set_constant_sample_orientations(ptychogram, ...)

Set all sample orientations to a single constant rotation.

shift_to_center_of_mass(ptychogram[, order])

Shift all diffraction patterns so the intensity center-of-mass is at the array center.

sort_images_by_timestamp(image_paths)

Sort image file paths chronologically by embedded timestamp.

standardize_hdf5_shapes(n, local)

Normalize raw HDF5 data fields to the canonical shapes expected by Ptychogram.

subtract_background(ptychogram, background_path)

Subtract a background image loaded from file from all diffraction patterns.

subtract_low_intensity(ptychogram[, pixel_ratio])

Subtract a percentile-based threshold from all patterns and clamp to zero.

wavelength_units(ptychogram)

Rescale all length quantities so that the first wavelength becomes unity.

Classes

ImageDataset()

Ptychogram(diffraction_patterns, pixel_size, ...)

Core dataset class for ptychographic reconstruction experiments.

SimpleImageDataset(images)

A simple implementation of ImageDataset that just wraps a single array of images.

SpeFile(filepath)

Reader for Princeton Instruments SPE v3.x format files.

class ptyrax.dataset.ImageDataset[source]#

Bases: ABC

property image_shape: int#

The shape of a single image in the dataset.

Returns:

The shape of a single image.

Return type:

tuple[int, int]

abstract property images: Shaped[Array, 'd m n']#

Gets all images in the dataset as a single array.

Returns:

The full (d, m, n) array of images.

Return type:

ArrayLike

classmethod load(path)[source]#

A function to load the dataset from disk. By default, loads all fields of the dataclass from hdf5. If the dataset contains fields which are not supported by hdf5, this method should be overridden.

Parameters:

path (pathlib.Path) – The path where to load the dataset from.

Return type:

ImageDataset

property n: int#

The number of positions in the dataset.

Returns:

The number of positions in the dataset.

Return type:

int

save(path)[source]#

A function to save the dataset to disk. By default, saves all fields of the dataclass to hdf5. If the dataset contains fields which are not supported by hdf5, this method should be overridden.

Parameters:

path (pathlib.Path) – The path where to save the dataset.

Return type:

None

abstractmethod to_gpu()[source]#
Return type:

None

class ptyrax.dataset.Ptychogram(diffraction_patterns, pixel_size, sample_positions, sample_orientations, propagation_distance, wavelength, detector_positions, detector_orientations, loaded_from='Not specified', diffraction_pattern_scale=1.0, detector_darkframe=None, mask=None)[source]#

Bases: ImageDataset

Core dataset class for ptychographic reconstruction experiments.

A Ptychogram holds diffraction patterns along with the full geometric metadata (scan positions, orientations, detector geometry, wavelength) needed for forward-model-based reconstruction.

Coordinates follow the CXI convention: z is along the incoming beam, y is vertical. 2-D arrays are indexed (x, y) with 'ij' indexing.

Parameters:
  • diffraction_patterns (Integer[Array, 'n h w'] | Float[Array, 'n h w']) – Measured intensity or amplitude patterns with shape (n, h, w).

  • pixel_size (Float[Array, '2']) – Detector pixel pitch as [dx, dy].

  • sample_positions (Float[Array, 'n 3']) – Per-position sample translation vectors (n, 3) in global coordinates.

  • sample_orientations (Float[Array, 'n 6']) – Per-position 6-D orientation representations (n, 6).

  • propagation_distance (Float[Array, 'n']) – Per-position sample-to-detector distance (n,).

  • wavelength (Float[Array, 'm']) – One or more illumination wavelengths (m,).

  • detector_positions (Float[Array, 'n 3']) – Per-position detector translation (n, 3).

  • detector_orientations (Float[Array, 'n 6']) – Per-position detector orientation (n, 6).

  • loaded_from (str) – Human-readable string indicating the data source.

  • diffraction_pattern_scale (float) – Cumulative scaling factor applied to patterns.

  • detector_darkframe (ndarray) – Detector dark-current image subtracted during preprocessing.

  • mask (Array | ndarray | bool | number | bool | int | float | complex | LiteralArray | None) – Optional boolean or float mask for invalid detector pixels.

Example

>>> from ptyrax.dataset import from_hdf5
>>> ptychogram = from_hdf5("data/lenspaper.hdf5")
>>> ptychogram.n
256
__plot__(*args, **kwargs)[source]#
Return type:

None

batch(batch_size=1, shuffle_mode='random')[source]#

Yield batches of diffraction patterns and their indices.

Parameters:
  • batch_size (int) – Number of samples per batch.

  • shuffle_mode (str) – One of ‘random’, ‘by_distance’, or ‘clustered’ to select batching order.

Returns:

Generator yielding a tuple of (indices, diffraction_pattern_batch).

Return type:

Generator[tuple[ndarray, ndarray], None, None]
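
The random batching order can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the method's implementation: iterate_batches and its seed argument are hypothetical, and the 'by_distance' and 'clustered' modes are not covered.

```python
import numpy as np

def iterate_batches(patterns, batch_size=1, seed=0):
    """Yield (indices, pattern_batch) tuples in random order (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(patterns.shape[0])  # shuffled position indices
    for start in range(0, len(order), batch_size):
        indices = order[start:start + batch_size]
        # Fancy indexing selects the patterns that belong to this batch.
        yield indices, patterns[indices]

patterns = np.arange(5 * 2 * 2, dtype=float).reshape(5, 2, 2)
batches = list(iterate_batches(patterns, batch_size=2))
```

With five positions and a batch size of two, the last batch holds a single pattern.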

detector_darkframe: ndarray = None#
detector_orientations: Float[Array, 'n 6']#
detector_positions: Float[Array, 'n 3']#
diffraction_pattern_scale: float = 1.0#
diffraction_patterns: Integer[Array, 'n h w'] | Float[Array, 'n h w']#
property image_shape: int#

The shape of a single image in the dataset.

Returns:

The shape of a single image.

Return type:

tuple[int, int]

property images: Integer[Array, 'n h w'] | Float[Array, 'n h w']#

Gets all images in the dataset as a single array.

Returns:

The full (d, m, n) array of images.

Return type:

ArrayLike

classmethod load(path)#

A function to load the dataset from disk. By default, loads all fields of the dataclass from hdf5. If the dataset contains fields which are not supported by hdf5, this method should be overridden.

Parameters:

path (pathlib.Path) – The path where to load the dataset from.

Return type:

ImageDataset

classmethod load_from(path)[source]#
Parameters:

path (Path)

Return type:

Ptychogram

loaded_from: str = 'Not specified'#
mask: Array | ndarray | bool | number | bool | int | float | complex | LiteralArray | None = None#
property n: int#

The number of positions in the dataset.

Returns:

The number of positions in the dataset.

Return type:

int

property pixel_number: tuple[int, ...]#

The number of pixels along each spatial dimension of a single diffraction pattern.

Returns:

A tuple (height, width) giving the detector pixel count.

Return type:

tuple[int, …]

pixel_size: Float[Array, '2']#
propagation_distance: Float[Array, 'n']#
sample_orientations: Float[Array, 'n 6']#
sample_positions: Float[Array, 'n 3']#
save(path)[source]#

A function to save the dataset to disk. By default, saves all fields of the dataclass to hdf5. If the dataset contains fields which are not supported by hdf5, this method should be overridden.

Parameters:

path (pathlib.Path) – The path where to save the dataset.

Return type:

None

to_cxi(cxi_path)[source]#

Saves a Ptychogram object into a .cxi file (inverse of from_cxi()).

Parameters:

cxi_path (str)

Return type:

None

to_gpu()[source]#
Return type:

None

to_hdf5(output_path)[source]#

Serialize the ptychogram to a flat HDF5 file.

Each field is stored as a top-level dataset inside the file. The inverse operation is from_hdf5().

Parameters:

output_path (str) – Filesystem path for the output .h5 / .hdf5 file.

Return type:

None

wavelength: Float[Array, 'm']#
class ptyrax.dataset.SimpleImageDataset(images)[source]#

Bases: ImageDataset

A simple implementation of ImageDataset that just wraps a single array of images.

This can be used for simple cases where no additional metadata is needed.

Parameters:

images (Array | ndarray | bool | number | bool | int | float | complex | LiteralArray)

property image_shape: int#

The shape of a single image in the dataset.

Returns:

The shape of a single image.

Return type:

tuple[int, int]

property images: Shaped[Array, 'd m n']#

Gets all images in the dataset as a single array.

Returns:

The full (d, m, n) array of images.

Return type:

ArrayLike

classmethod load(path)#

A function to load the dataset from disk. By default, loads all fields of the dataclass from hdf5. If the dataset contains fields which are not supported by hdf5, this method should be overridden.

Parameters:

path (pathlib.Path) – The path where to load the dataset from.

Return type:

ImageDataset

property n: int#

The number of positions in the dataset.

Returns:

The number of positions in the dataset.

Return type:

int

save(path)#

A function to save the dataset to disk. By default, saves all fields of the dataclass to hdf5. If the dataset contains fields which are not supported by hdf5, this method should be overridden.

Parameters:

path (pathlib.Path) – The path where to save the dataset.

Return type:

None

to_gpu()[source]#
Return type:

None

class ptyrax.dataset.SpeFile(filepath)[source]#

Bases: object

Reader for Princeton Instruments SPE v3.x format files.

Parses the binary header, XML footer, and raw image data from SPE files produced by Princeton Instruments cameras (e.g. via LightField software). Supports multiple frames and regions of interest.

Parameters:

filepath (str) – Path to the .spe file to read.

Variables:
  • filepath – Path to the source file.

  • header_version – SPE format version number from the binary header.

  • nframes – Number of frames in the file.

  • footer – Parsed XML footer as an untangle.Element tree.

  • dtype – Numpy dtype of the stored image data.

  • xdim – List of x-dimensions for each region of interest.

  • ydim – List of y-dimensions for each region of interest.

  • roi – List of region-of-interest metadata elements.

  • nroi – Number of regions of interest.

  • wavelength – Wavelength calibration array (if available).

  • data – Nested list [frame][roi] of image arrays.

  • metadata – Per-frame metadata array or None.

  • metanames – List of metadata field names or None.

Raises:

ValueError – If filepath is not a string or the SPE version is < 3.0.

ptyrax.dataset.add_constant_sample_shift(ptychogram, constant_shift)[source]#

Add a constant offset to all sample orientation representations.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • constant_shift (Array | ndarray | bool | number | bool | int | float | complex | LiteralArray) – 6-element array added element-wise to each sample orientation vector.

Returns:

The ptychogram with shifted sample orientations.

Return type:

Ptychogram

ptyrax.dataset.add_gaussian_noise(ptychogram, noise_mean, noise_variance)[source]#

Add Gaussian noise to the diffraction patterns and clamp to non-negative.

Draws from a normal distribution with the specified mean and standard deviation and adds it element-wise. Resulting negative values are clipped to zero.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • noise_mean (float) – Mean of the Gaussian noise distribution.

  • noise_variance (float) – Standard deviation of the Gaussian noise (documented as a standard deviation, despite the parameter name).

Returns:

The ptychogram with Gaussian noise added.

Return type:

Ptychogram
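
The add-then-clamp behaviour described above can be sketched on a bare array. This is a minimal NumPy illustration, not the library code: gaussian_noise_sketch, noise_std, and seed are hypothetical names, and the spread is treated as a standard deviation as the description states.

```python
import numpy as np

def gaussian_noise_sketch(patterns, noise_mean, noise_std, seed=0):
    """Add normally distributed noise and clip negatives to zero (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=noise_mean, scale=noise_std, size=patterns.shape)
    # Clamp from below only; bright pixels are left untouched.
    return np.clip(patterns + noise, 0.0, None)

patterns = np.zeros((2, 4, 4))
noisy = gaussian_noise_sketch(patterns, noise_mean=0.0, noise_std=1.0)
```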

ptyrax.dataset.add_poisson_noise(ptychogram, photons_per_count=None, total_photon_count=None, total_power=None, wavelength=None, exposure_time=None, diffraction_pattern_normalized=True)[source]#

Add Poisson-distributed shot noise to the diffraction patterns.

Scales patterns to photon counts, draws Poisson samples, and rescales back. Exactly one of the following must be specified to determine the noise level:

  • photons_per_count: direct conversion factor from pixel value to expected photon count.

  • total_photon_count: total number of photons across all patterns.

  • total_power with wavelength and exposure_time: computes total photon count from physical beam parameters.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • photons_per_count (float) – Conversion factor from detector counts to photons.

  • total_photon_count (float) – Total number of photons summed over all patterns.

  • total_power (float) – Beam power in watts (requires wavelength and exposure_time).

  • wavelength (float) – Wavelength in metres (used with total_power).

  • exposure_time (float) – Exposure time in seconds (used with total_power).

  • diffraction_pattern_normalized (bool) – If True, assume patterns are already normalized when computing scale from total_power.

Returns:

The ptychogram with Poisson noise applied.

Raises:

ValueError – If none of the scaling parameters are specified.

Return type:

Ptychogram
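
The scale, sample, and rescale steps, together with the conversion from beam power to photon count, can be sketched as follows. This is an assumption-laden illustration rather than the library implementation: photons_from_power and poisson_noise_sketch are hypothetical helpers, and the photon count is taken as power times exposure time divided by the photon energy hc/λ.

```python
import numpy as np

PLANCK = 6.62607015e-34       # Planck constant in J s
LIGHT_SPEED = 299_792_458.0   # speed of light in m / s

def photons_from_power(total_power, wavelength, exposure_time):
    """Number of photons delivered by a beam (hypothetical helper)."""
    photon_energy = PLANCK * LIGHT_SPEED / wavelength  # energy per photon in J
    return total_power * exposure_time / photon_energy

def poisson_noise_sketch(patterns, photons_per_count, seed=0):
    """Scale to photon counts, draw Poisson samples, scale back (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    photons = rng.poisson(patterns * photons_per_count)
    return photons / photons_per_count

patterns = np.full((2, 4, 4), 0.5)
total_photon_count = 3200.0
# Assumed conversion: spread the total photon budget over the summed signal.
photons_per_count = total_photon_count / patterns.sum()
noisy = poisson_noise_sketch(patterns, photons_per_count)
n_photons = photons_from_power(total_power=1e-3, wavelength=500e-9, exposure_time=1.0)
```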

ptyrax.dataset.amplitude_to_intensity(ptychogram)[source]#

Convert amplitude-valued diffraction patterns to intensity by squaring.

Applies element-wise squaring to the diffraction patterns.

Parameters:

ptychogram (Ptychogram) – Input ptychogram with amplitude values.

Returns:

The ptychogram with intensity-valued patterns.

Return type:

Ptychogram

ptyrax.dataset.apply_orientation(ptychogram, orientation=0, darkframe_orientation=None)[source]#

Apply a geometric orientation transformation to diffraction patterns.

The orientation code follows the convention:

  • 0: identity

  • 1: flip along y (last axis)

  • 2: flip along x (second-to-last axis)

  • 3: flip both axes

  • 4: transpose x and y

  • 5: transpose then flip y

  • 6: transpose then flip x

  • 7: transpose then flip both

Also transforms the mask, darkframe, and pixel_size accordingly.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • orientation (Literal[0, 1, 2, 3, 4, 5, 6, 7]) – Integer code (0–7) specifying the desired transformation.

  • darkframe_orientation (Literal[0, 1, 2, 3, 4, 5, 6, 7]) – Separate orientation for the darkframe. Defaults to the same value as orientation.

Returns:

The ptychogram with reoriented diffraction patterns.

Raises:

ValueError – If orientation is not in the range 0–7.

Return type:

Ptychogram
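
One way to read the eight orientation codes is as an optional transpose followed by optional flips. The sketch below applies that reading to a bare NumPy array; orient_sketch is a hypothetical helper, and it leaves out the mask, darkframe, and pixel_size handling that the function also performs.

```python
import numpy as np

def orient_sketch(patterns, orientation=0):
    """Apply orientation codes 0-7 to an (n, h, w) array (hypothetical helper)."""
    if orientation not in range(8):
        raise ValueError("orientation must be an integer from 0 to 7")
    out = patterns
    if orientation >= 4:
        out = np.swapaxes(out, -1, -2)   # codes 4-7: transpose x and y first
    if orientation % 4 in (1, 3):
        out = np.flip(out, axis=-1)      # flip along y (last axis)
    if orientation % 4 in (2, 3):
        out = np.flip(out, axis=-2)      # flip along x (second-to-last axis)
    return out

a = np.arange(6).reshape(1, 2, 3)
flipped_y = orient_sketch(a, 1)
transposed = orient_sketch(a, 4)
transposed_flipped_y = orient_sketch(a, 5)
```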

ptyrax.dataset.center_scan_positions(ptychogram)[source]#

Translate sample positions so their centroid is at the origin.

Subtracts the mean of all sample positions, centering the scan around (0, 0, 0).

Parameters:

ptychogram (Ptychogram) – Input ptychogram.

Returns:

The ptychogram with zero-mean sample positions.

Return type:

Ptychogram
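
The centering step amounts to subtracting the per-axis mean. A minimal NumPy sketch on a standalone (n, 3) array, with made-up positions:

```python
import numpy as np

# Three illustrative scan positions as (x, y, z) rows.
positions = np.array([[1.0, 2.0, 0.0],
                      [3.0, 4.0, 0.0],
                      [5.0, 6.0, 3.0]])

# Subtract the centroid so the positions average to (0, 0, 0).
centered = positions - positions.mean(axis=0)
```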

ptyrax.dataset.clip_low_intensity(ptychogram, pixel_ratio=0.9)[source]#

Set pixels below a percentile threshold to zero.

Computes the (pixel_ratio * 100)-th percentile of all pixel values and zeros out any pixel at or below that threshold. This removes low-intensity background while preserving the bright signal.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • pixel_ratio (float) – Fraction (0–1) of pixels to zero out, specified as a percentile threshold.

Returns:

The ptychogram with low-intensity pixels clipped to zero.

Return type:

Ptychogram
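
The percentile thresholding can be sketched with NumPy as below. clip_low_intensity_sketch is a hypothetical helper operating on a bare array, and the global percentile over all patterns follows the description above.

```python
import numpy as np

def clip_low_intensity_sketch(patterns, pixel_ratio=0.9):
    """Zero every pixel at or below the global percentile threshold (hypothetical helper)."""
    threshold = np.percentile(patterns, pixel_ratio * 100)
    return np.where(patterns <= threshold, 0.0, patterns)

patterns = np.arange(1, 11, dtype=float).reshape(1, 2, 5)  # values 1..10
clipped = clip_low_intensity_sketch(patterns, pixel_ratio=0.5)
```

Here the 50th percentile is 5.5, so the values 1 to 5 become zero and 6 to 10 are kept.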

ptyrax.dataset.cut_center(ptychogram, ratio=0.5)[source]#

Crop each diffraction pattern to a central sub-region.

Keeps a fraction ratio of the total extent around the center along each spatial dimension.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • ratio (float) – Fraction (0–1) of the original extent to retain in each dimension.

Returns:

The ptychogram with cropped diffraction patterns.

Return type:

Ptychogram
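
A central crop of this kind can be sketched with array slicing. The helper name and the rounding (truncation towards zero, crop window centered by integer division) are assumptions; the library may round differently.

```python
import numpy as np

def cut_center_sketch(patterns, ratio=0.5):
    """Keep the central `ratio` fraction along both spatial axes (hypothetical helper)."""
    h, w = patterns.shape[-2:]
    new_h, new_w = int(h * ratio), int(w * ratio)
    top, left = (h - new_h) // 2, (w - new_w) // 2
    return patterns[..., top:top + new_h, left:left + new_w]

patterns = np.arange(2 * 8 * 6).reshape(2, 8, 6)
cropped = cut_center_sketch(patterns, ratio=0.5)
```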

ptyrax.dataset.exclude_positions_by_distance(ptychogram, min_distance=-1.0, max_distance=9999.0)[source]#

Remove scan positions outside a distance range from the mean position.

Filters out positions whose Euclidean distance from the centroid of all sample positions falls below min_distance or above max_distance. Corresponding diffraction patterns and orientations are also removed.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • min_distance (float) – Minimum distance from the mean position to keep.

  • max_distance (float) – Maximum distance from the mean position to keep.

Returns:

The ptychogram with outlier positions removed.

Return type:

Ptychogram
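
The distance filter can be sketched with a boolean mask. exclude_sketch is a hypothetical helper on bare arrays, and treating both bounds as inclusive is an assumption.

```python
import numpy as np

def exclude_sketch(positions, patterns, min_distance=-1.0, max_distance=9999.0):
    """Keep positions whose distance to the centroid lies within the range (hypothetical helper)."""
    centroid = positions.mean(axis=0)
    distances = np.linalg.norm(positions - centroid, axis=1)
    keep = (distances >= min_distance) & (distances <= max_distance)
    # The same mask is applied to every per-position array.
    return positions[keep], patterns[keep]

positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [-1.0, 0.0, 0.0],
                      [0.0, 10.0, 0.0]])   # the last position is an outlier
patterns = np.zeros((4, 2, 2))
kept_positions, kept_patterns = exclude_sketch(positions, patterns, max_distance=5.0)
```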

ptyrax.dataset.experiment_folder_to_ptychogram_cxi(output_path, experiment_folder, extra_fields, darkframe_folder=None, raw_folder=None, frames_folder=None, scan_pos_file=None, use_raw=False, filename_filter_fn=<function <lambda>>, column_matcher={'phi': 'AOI \\[.*', 'phi_prime': 'CamRot.*', 'theta': 'AZI \\[.*', 'x': 'x \\[.*', 'y': 'y \\[.*', 'z': 'z \\[.*'}, calibration={'PHI_C_OFF': 9.9146, 'PHI_OFF': 5.837, 'PHI_ZSTAGE': -84.8, 'Z_BEAM_ALIGNED': 0.00277})[source]#

Convert a raw experiment folder into a CXI-format HDF5 file.

Reads raw images and scan position metadata from a TU Delft experiment folder layout, computes full 3-D sample/detector geometry using the provided calibration parameters, and writes a standards-compliant CXI file.

Parameters:
  • output_path (str) – Output path for the .cxi file.

  • experiment_folder (str) – Root folder of the experiment data.

  • extra_fields (dict) – Dictionary containing at least camera_pixel_size, propagation_distance, and wavelength.

  • darkframe_folder (str | None) – Path to the darkframe images folder. Auto-detected if None.

  • raw_folder (str | None) – Path to raw image folder. Auto-detected if None.

  • frames_folder (str | None) – Path to processed frames folder. Auto-detected if None.

  • scan_pos_file (str | None) – Path to scan position file (.xlsx or .mat). Auto-detected if None.

  • use_raw (bool) – If True, read from raw_folder instead of frames_folder.

  • filename_filter_fn (Callable) – Filter function applied to image filenames.

  • column_matcher (Callable) – Regex patterns for matching scan position columns.

  • calibration (dict) – Dictionary of instrument calibration parameters (offsets and geometry constants).

Return type:

None

ptyrax.dataset.experiment_folder_to_ptychogram_hdf5(output_ptychogram_hdf5, experiment_folder, extra_fields, darkframe_folder=None, raw_folder=None, scan_pos_file=None, filter=None)[source]#

Convert a raw experiment folder into a flat ptychogram HDF5 file.

A simpler alternative to experiment_folder_to_ptychogram_cxi() that writes a flat HDF5 file loadable by from_hdf5().

Parameters:
  • output_ptychogram_hdf5 (str) – Output path for the .hdf5 file.

  • experiment_folder (str) – Root folder of the experiment data.

  • extra_fields (dict) – Dictionary containing at least tilt_angle, camera_pixel_size, propagation_distance, and wavelength.

  • darkframe_folder (str | None) – Path to darkframe folder. Auto-detected if None.

  • raw_folder (str | None) – Path to raw image folder. Auto-detected if None.

  • scan_pos_file (str | None) – Path to scan position file. Auto-detected if None.

  • filter (Callable | None) – Optional filename filter function.

Raises:

KeyError – If required fields are missing from extra_fields.

Return type:

None

ptyrax.dataset.fftshift_ptychogram(ptychogram)[source]#

Apply an FFT-shift to all diffraction patterns along the spatial axes.

Swaps quadrants so that the zero-frequency component moves to the center of each pattern. This is required when raw detector data stores the DC component in the corner.

Parameters:

ptychogram (Ptychogram) – Input ptychogram with unshifted patterns.

Returns:

The same ptychogram with shifted diffraction_patterns.

Return type:

Ptychogram
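
The quadrant swap corresponds to NumPy's fftshift restricted to the two spatial axes. A minimal sketch on a bare (n, h, w) array, independent of the library:

```python
import numpy as np

patterns = np.zeros((2, 4, 4))
patterns[:, 0, 0] = 1.0   # DC component stored in the corner

# Shift only the spatial axes so the position axis is left alone.
shifted = np.fft.fftshift(patterns, axes=(-2, -1))
```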

ptyrax.dataset.find_missing_folders(darkframe_folder, experiment_folder, extra_fields, frames_folder, raw_folder, scan_pos_file)[source]#

Auto-detect missing folder and file paths from the experiment layout.

Inspects the experiment_folder for standard sub-directories (frames, RAW, darkframe) and scan position files (.xlsx, .mat) to fill in any arguments left as None.

Parameters:
  • darkframe_folder (str | None) – Explicit darkframe folder path or None to auto-detect.

  • experiment_folder (str | None) – Root folder of the experiment.

  • extra_fields (dict) – Dictionary of extra metadata fields (passed through).

  • frames_folder (str | None) – Explicit frames folder or None.

  • raw_folder (str | None) – Explicit raw folder or None.

  • scan_pos_file (str | None) – Explicit scan position file or None.

Returns:

Tuple of (darkframe_folder, extra_fields, frames_folder, raw_folder, scan_pos_file) with resolved paths.

Raises:

NotADirectoryError – If darkframe folder cannot be found.

Return type:

tuple[str, dict, str, str, str]

ptyrax.dataset.flip_scan_axis(ptychogram, axis)[source]#

Negate a component of all sample positions.

Mirrors the scan pattern along the specified axis.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • axis (int) – Index of the axis to flip (0 for x, 1 for y, 2 for z).

Returns:

The ptychogram with the specified axis flipped.

Raises:

ValueError – If axis is not 0, 1, or 2.

Return type:

Ptychogram

ptyrax.dataset.flip_scan_x(ptychogram)[source]#

Negate the x-component of all sample positions.

Parameters:

ptychogram (Ptychogram)

Return type:

Ptychogram

ptyrax.dataset.flip_scan_y(ptychogram)[source]#

Negate the y-component of all sample positions.

Parameters:

ptychogram (Ptychogram)

Return type:

Ptychogram

ptyrax.dataset.flip_scanning_positions(ptychogram)[source]#

Reverse the order of coordinates within each sample position vector.

Flips the last axis of sample_positions, so each (x, y, z) vector becomes (z, y, x): the x and z coordinates are swapped and y stays in place.

Parameters:

ptychogram (Ptychogram) – Input ptychogram.

Returns:

The ptychogram with flipped sample position vectors.

Return type:

Ptychogram
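
Reversing the last axis can be illustrated on a bare (n, 3) array; the values are made up.

```python
import numpy as np

positions = np.array([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0]])

# Reverse the coordinate order within each row: (x, y, z) -> (z, y, x).
flipped = positions[:, ::-1]
```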

ptyrax.dataset.from_cxi(cxi_path, background_path=None)[source]#

Load a CXI file (as created by experiment_folder_to_ptychogram_cxi()) and return a Ptychogram object.

Parameters:
  • cxi_path (str)

  • background_path (str)

Return type:

Ptychogram

ptyrax.dataset.from_hdf5(ptychogram_path, key_converter=<function _convert_old_key_names>, convert_to_standard=True)[source]#

Load a Ptychogram from an HDF5 file or remote URL, autodetecting the format.

This function accepts local paths or HTTP(S) URLs. If given a URL the file is downloaded to a temporary file and inspected. The loader will dispatch to the appropriate sub-loader based on file attributes and keys.

Parameters:
  • ptychogram_path (str) – Local filesystem path or HTTP(S) URL to the dataset.

  • key_converter (callable) – Function to normalize legacy key names.

  • convert_to_standard (bool) – If True, coerce legacy files to canonical fields.

Returns:

The loaded ptychogram object.

Return type:

Ptychogram

ptyrax.dataset.intensity_to_amplitude(ptychogram)[source]#

Convert intensity-valued diffraction patterns to amplitude by taking the square root.

Applies sqrt element-wise to both the diffraction patterns and the detector darkframe.

Parameters:

ptychogram (Ptychogram) – Input ptychogram with intensity values.

Returns:

The ptychogram with amplitude-valued patterns.

Return type:

Ptychogram

ptyrax.dataset.load_raw_images(folder_path, precision=None, filter_fn=<function <lambda>>)[source]#

Load all images from a folder into a single numpy array.

Images are sorted by timestamp (or filename as fallback) before stacking.

Parameters:
  • folder_path (str) – Path to the directory containing image files.

  • precision (type) – Numpy dtype to cast image data to.

  • filter_fn (Callable) – Callable that receives a file path and returns True to include it.

Returns:

Array of shape (n_images, height, width) containing all loaded images.

Return type:

ndarray

ptyrax.dataset.load_spe_from_files(filepaths)[source]#

Load multiple SPE files at once.

Each file is read into an SpeFile object, and the objects are collected in the returned list.

Parameters:

filepaths (list[str])

Return type:

list[SpeFile] | None

ptyrax.dataset.make_constant_tilt_angle(ptychogram, tilt_angle, detector_tilt_angle=None)[source]#

Set a uniform sample tilt angle and recompute all geometry accordingly.

Overrides the per-position sample orientations with a single rotation defined by tilt_angle (interpreted as rotation about y-axis in degrees). Sample positions are transformed to the new local frame, and detector orientations and positions are recomputed assuming specular geometry.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • tilt_angle (float) – Rotation angle about the y-axis in degrees.

  • detector_tilt_angle (float | None) – Optional separate tilt for the detector orientation. If None, the detector orientation follows from the sample tilt via specular reflection.

Returns:

The ptychogram with updated orientations, positions, and propagation distances.

Return type:

Ptychogram
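
For orientation, a rotation about the y-axis by an angle given in degrees can be written as a 3×3 matrix. The sketch below assumes a right-handed rotation; the sign convention, the conversion to the 6-D orientation representation, and the specular detector geometry used by the function are not shown, and rotation_about_y is a hypothetical helper.

```python
import numpy as np

def rotation_about_y(angle_degrees):
    """Right-handed rotation matrix about the y-axis (hypothetical helper)."""
    theta = np.deg2rad(angle_degrees)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

rotation = rotation_about_y(90.0)
rotated_x = rotation @ np.array([1.0, 0.0, 0.0])   # x-axis after the tilt
```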

ptyrax.dataset.make_multiwavelength(ptychogram, wavelength_amount_factor=1)[source]#

Tile the wavelength array to simulate multi-wavelength illumination.

Repeats the existing wavelength entries wavelength_amount_factor times.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • wavelength_amount_factor (int) – Number of times to tile the wavelength array.

Returns:

The ptychogram with an expanded wavelength array.

Return type:

Ptychogram
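
Tiling in this sense matches NumPy's tile on a 1-D array. A small sketch with made-up wavelengths:

```python
import numpy as np

wavelength = np.array([500e-9, 600e-9])
wavelength_amount_factor = 3

# Repeat the whole array end to end, not element by element.
tiled = np.tile(wavelength, wavelength_amount_factor)
```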

ptyrax.dataset.mirror_coordinates(ptychogram)[source]#

Mirror the geometry about the y-z plane (negate x-coordinates).

Flips the x-component of both sample and detector positions and adjusts the corresponding orientation representations to maintain consistency.

Parameters:

ptychogram (Ptychogram) – Input ptychogram.

Returns:

The ptychogram with mirrored coordinate geometry.

Return type:

Ptychogram

ptyrax.dataset.non_negative(ptychogram)[source]#

Clamp negative pixel values in the diffraction patterns to zero.

Negative values can appear after background subtraction or due to detector artifacts. This ensures all intensities are non-negative. Use with caution, as clamping may introduce bias if the negative values are significant. It is generally better to address the underlying issue in the loss function.

Parameters:

ptychogram (Ptychogram) – Input ptychogram.

Returns:

The ptychogram with all negative pixel values set to zero.

Return type:

Ptychogram

ptyrax.dataset.normalize_by_max(ptychogram, new_max=1.0)[source]#

Rescale diffraction patterns so the global maximum equals new_max.

Also rescales the detector darkframe and the stored diffraction_pattern_scale factor accordingly.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • new_max (float) – Target value for the brightest pixel across all patterns.

Returns:

The ptychogram with rescaled intensities.

Return type:

Ptychogram
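
The rescaling can be sketched as a single multiplicative factor applied to the patterns, the darkframe, and the stored scale. normalize_by_max_sketch is a hypothetical helper, and multiplying the stored scale by the same factor is an assumption about how the bookkeeping works.

```python
import numpy as np

def normalize_by_max_sketch(patterns, darkframe, pattern_scale=1.0, new_max=1.0):
    """Rescale so the brightest pixel equals new_max (hypothetical helper)."""
    factor = new_max / patterns.max()
    # Apply the same factor everywhere so the quantities stay consistent.
    return patterns * factor, darkframe * factor, pattern_scale * factor

patterns = np.array([[[0.0, 2.0], [4.0, 8.0]]])
darkframe = np.ones((2, 2))
scaled, scaled_dark, scale = normalize_by_max_sketch(patterns, darkframe)
```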

ptyrax.dataset.normalize_by_mean(ptychogram)[source]#

Rescale diffraction patterns so their global mean value becomes unity.

Also rescales the detector darkframe and stored scale factor.

Parameters:

ptychogram (Ptychogram) – Input ptychogram.

Returns:

The ptychogram with mean-normalized intensities.

Return type:

Ptychogram

ptyrax.dataset.normalize_by_mean_intensity(ptychogram)[source]#

Rescale diffraction patterns by the mean of the per-pattern L2 norms.

Computes the average Frobenius norm across all patterns and divides all intensities (and the darkframe/scale) by that value.

Parameters:

ptychogram (Ptychogram) – Input ptychogram.

Returns:

The ptychogram normalized by mean per-pattern intensity.

Return type:

Ptychogram
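The divisor described above can be sketched with NumPy as follows. The helper name mean_frobenius_norm is hypothetical, and the patterns are assumed to be stacked along the first axis with shape (n, height, width).

```python
import numpy as np

def mean_frobenius_norm(patterns):
    """Average the Frobenius norm of each 2-D pattern in the stack."""
    # Flatten each pattern so the vector 2-norm equals the Frobenius norm.
    flat = patterns.reshape(patterns.shape[0], -1)
    return np.linalg.norm(flat, axis=1).mean()

# Two 2x2 patterns with norms 2 and 6, so the divisor is 4.
patterns = np.stack([np.full((2, 2), 1.0), np.full((2, 2), 3.0)])
divisor = mean_frobenius_norm(patterns)
normalized = patterns / divisor
```

After the division, the mean per-pattern norm of the stack is one, although individual patterns keep their relative brightness.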

ptyrax.dataset.old_tud_key_converter(key)[source]#
Parameters:

key (str)

Return type:

str

ptyrax.dataset.plot_dataset_dynamic_range(dataset, output_path, dpi=200)[source]#

Plot and save a pixel intensity histogram for the dataset.

Reads diffraction patterns from an HDF5 file, computes a histogram of all pixel values, and saves the resulting plot to output_path.

Parameters:
  • dataset (str) – Path to the HDF5 file containing a diff_pat dataset.

  • output_path (str) – Directory where the pixel_histogram.png file will be saved.

  • dpi (int) – Resolution of the saved figure in dots per inch.

Return type:

None

ptyrax.dataset.propagation_distance_to_full_position(n, local)[source]#

Convert a scalar or per-position propagation distance to 3-D detector positions.

The propagation distance is interpreted as a displacement along the detector z-axis. If detector orientations are available they are used to transform the local z-offset into global coordinates.

Parameters:
  • n (int) – Number of scan positions.

  • local (dict) – Mutable dictionary containing at least propagation_distance and optionally detector_orientations. The computed detector_positions key is added in place.

Returns:

The updated dictionary with a detector_positions entry of shape (n, 3).

Raises:

KeyError – If propagation_distance is not present in local.

Return type:

None
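The geometry can be sketched on plain arrays as below. This is an illustration under assumptions: orientations are taken to be 3x3 rotation matrices that map local detector coordinates to global ones (the library may store a different representation), and the helper name detector_positions_from_distance is hypothetical.

```python
import numpy as np

def detector_positions_from_distance(n, propagation_distance, rotations=None):
    """Expand a scalar or length-n distance into (n, 3) detector positions."""
    # Broadcast a scalar to one distance per scan position.
    distance = np.broadcast_to(np.asarray(propagation_distance, dtype=float), (n,))
    # In the local detector frame the offset is purely along z.
    local = np.zeros((n, 3))
    local[:, 2] = distance
    if rotations is None:
        return local
    # Assumption: rotations has shape (n, 3, 3) and maps local to global.
    return np.einsum("nij,nj->ni", rotations, local)

straight = detector_positions_from_distance(3, 0.05)

# A 90 degree rotation about x sends the local z-axis to global -y.
rot_x = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]])
tilted = detector_positions_from_distance(2, [0.05, 0.10], np.stack([rot_x, rot_x]))
```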

ptyrax.dataset.quantize_diffraction_patterns(ptychogram, dynamic_range_bits=14, overexpose_fraction=0.0)[source]#

Quantize diffraction patterns to simulate a finite bit-depth detector.

Scales patterns to fill the dynamic range defined by dynamic_range_bits, rounds to integer counts, and clips to the maximum value. An optional overexpose_fraction allows a fraction of the brightest pixels to saturate.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • dynamic_range_bits (int) – Number of bits defining the detector dynamic range (e.g. 14 gives 2^14 = 16384 levels, i.e. counts 0–16383).

  • overexpose_fraction (float) – Fraction of pixels allowed to saturate (0–1).

Returns:

The ptychogram with quantized diffraction patterns.

Return type:

Ptychogram
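A possible reading of the scale, round, and clip steps is sketched below. It assumes the saturation count is 2**dynamic_range_bits - 1 and that overexpose_fraction is implemented by mapping the corresponding upper quantile to that count; both are assumptions, and the helper name quantize is hypothetical.

```python
import numpy as np

def quantize(patterns, dynamic_range_bits=14, overexpose_fraction=0.0):
    """Map non-negative patterns onto integer counts of a finite bit depth."""
    max_count = 2 ** dynamic_range_bits - 1
    # Intensity that is mapped onto the saturation count. With a fraction of
    # zero this is simply the global maximum (assumed to be positive).
    reference = np.quantile(patterns, 1.0 - overexpose_fraction)
    counts = np.round(patterns / reference * max_count)
    # Anything brighter than the reference saturates at max_count.
    return np.clip(counts, 0, max_count)

two_bit = quantize(np.array([0.0, 0.4, 1.0]), dynamic_range_bits=2)
overexposed = quantize(np.arange(11.0), dynamic_range_bits=4, overexpose_fraction=0.2)
```

In the second call the brightest pixels are clipped to the saturation count of 15 rather than setting the scale.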

ptyrax.dataset.read_at(file, pos, size, ntype)[source]#

Read data from an SPE source file at a specific byte position.

Adapted from https://scipy.github.io/old-wiki/pages/Cookbook/Reading_SPE_files.html

Parameters:
  • file (IO)

  • pos (int)

  • size (int)

  • ntype (type)

Return type:

ndarray
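The seek-then-read pattern this helper is based on can be sketched as follows. The function name read_at_sketch and the temporary demo file are illustrative only, and the sketch assumes size counts items of ntype rather than bytes.

```python
import os
import tempfile

import numpy as np

def read_at_sketch(file, pos, size, ntype):
    """Read `size` items of dtype `ntype` starting at byte offset `pos`."""
    file.seek(pos)
    return np.fromfile(file, dtype=ntype, count=size)

# Write eight uint16 values (0..7) to a scratch file for the demo.
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as handle:
    handle.write(np.arange(8, dtype=np.uint16).tobytes())
    demo_path = handle.name

# Byte offset 4 skips the first two 2-byte values.
with open(demo_path, "rb") as handle:
    values = read_at_sketch(handle, 4, 3, np.uint16)
os.remove(demo_path)
```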

ptyrax.dataset.read_excel_columns(excel_path, column_matcher=None, data_filters=None)[source]#

Read columns from an Excel file by matching column headers with regex patterns.

Each key in column_matcher maps a logical field name to a regex that is matched against the Excel column headers. Optional filters can restrict rows to specific values.

Parameters:
  • excel_path (str) – Path to the .xlsx file.

  • column_matcher (dict[str, str]) – Dictionary mapping field names to regex patterns for matching column headers.

  • data_filters (dict) – Optional dictionary of {column_name: value} pairs used to filter rows.

Returns:

Dictionary mapping field names to pandas Series of matched column data.

Raises:

ValueError – If a regex matches zero or more than one column.

Return type:

dict
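The header-matching rule, where each regex must hit exactly one column, can be sketched without pandas. The helper name match_columns and the example headers are hypothetical, and the use of re.search (rather than a full match) is an assumption.

```python
import re

def match_columns(headers, column_matcher):
    """Map each logical field name to the single header its regex matches."""
    matched = {}
    for field, pattern in column_matcher.items():
        hits = [header for header in headers if re.search(pattern, header)]
        # Zero hits or several hits are both treated as errors.
        if len(hits) != 1:
            raise ValueError(
                f"Pattern {pattern!r} for field {field!r} matched {len(hits)} columns"
            )
        matched[field] = hits[0]
    return matched

headers = ["Frame", "X position (mm)", "Y position (mm)"]
mapping = match_columns(headers, {"x": r"^X\b", "y": r"^Y\b"})
```

Requiring a unique match catches both typos in the pattern and patterns that are too loose, such as "position" in this example.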

ptyrax.dataset.read_excel_scan_pos(excel_path, precision=None)[source]#

Read scan positions from an Excel file using fixed column indices.

Expects columns at indices 3–7 to contain x, y, z, phi, and theta respectively.

Parameters:
  • excel_path (str) – Path to the .xlsx file.

  • precision (type) – Optional numpy dtype to cast position values to.

Returns:

Dictionary with keys x, y, z, phi, theta, phi_prime as numpy arrays.

Return type:

dict

ptyrax.dataset.read_image(file_path, precision=None)[source]#

Read an image file, dispatching to the appropriate reader by extension.

Supported formats: .png, .spe.

Parameters:
  • file_path (str) – Path to the image file.

  • precision (type) – Numpy dtype to cast the image data to.

Returns:

Image as a numpy array.

Return type:

ndarray

ptyrax.dataset.read_mat_metadata(file_path)[source]#

Load and simplify the metaData struct from a MATLAB .mat file.

Recursively converts MATLAB structs and structured arrays into plain Python dictionaries and lists.

Parameters:

file_path (str) – Path to the .mat file containing a metaData variable.

Returns:

Nested dictionary representation of the MATLAB metadata struct.

Raises:

KeyError – If the file does not contain a metaData variable.

Return type:

dict

ptyrax.dataset.read_mat_scan_pos(file_path, **kwargs)[source]#

Read scan positions from a MATLAB .mat metadata file.

Extracts position arrays (x, y, z, phi, theta, phi_camera) from the metaData struct in the file.

Parameters:
  • file_path (str) – Path to the .mat file.

  • **kwargs – Unused; accepted for interface compatibility.

Returns:

Dictionary with keys x, y, z, phi, theta, phi_camera as numpy arrays.

Return type:

array

ptyrax.dataset.read_png(file_path, precision=None)[source]#

Read a PNG image file and return it as a numpy array.

Parameters:
  • file_path (str) – Path to the .png file.

  • precision (type) – Numpy dtype to cast the image data to.

Returns:

Tuple of (image array, None). The second element is a placeholder for interface compatibility with other readers.

Return type:

ndarray

ptyrax.dataset.read_scan_pos_file(scan_pos_file, **kwargs)[source]#

Dispatch scan position loading based on file extension.

Supports .xlsx (Excel) and .mat (MATLAB) formats.

Parameters:
  • scan_pos_file (str) – Path to the scan positions file.

  • **kwargs – Additional keyword arguments passed to the format-specific reader.

Returns:

Dictionary with keys such as x, y, z, phi, theta containing arrays of scan positions.

Return type:

dict

ptyrax.dataset.read_spe(file_path, precision=None)[source]#

Read a Princeton Instruments SPE file and return its image data.

Extracts the last frame from the SPE file along with the exposure time metadata.

Parameters:
  • file_path (str) – Path to the .spe file.

  • precision (type) – Numpy dtype to cast the image data to.

Returns:

Tuple of (image array, exposure_time_string).

Return type:

tuple[ndarray, float]

ptyrax.dataset.remove_zeros(ptychogram)[source]#

Replace exact-zero pixels with the minimum non-zero value.

This avoids division-by-zero or log-of-zero issues during reconstruction while preserving the dynamic range of the data.

Parameters:

ptychogram (Ptychogram) – Input ptychogram.

Returns:

The ptychogram with zero pixels replaced by the minimum non-zero pixel value.

Return type:

Ptychogram
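A minimal sketch of the replacement is given below. It assumes non-negative data and a single global minimum taken over all non-zero pixels; the helper name replace_zeros is hypothetical.

```python
import numpy as np

def replace_zeros(patterns):
    """Replace exact zeros with the smallest non-zero value in the data."""
    # Assumption: one global floor value shared by all patterns.
    floor = patterns[patterns != 0].min()
    return np.where(patterns == 0, floor, patterns)

patterns = np.array([[0.0, 2.0], [0.5, 0.0]])
filled = replace_zeros(patterns)
```

Because the floor equals the faintest measured value, the ratio between the brightest and faintest pixels is unchanged.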

ptyrax.dataset.save_all_in_hdf5(data_list, dataset_name_list, hdf5_path)[source]#

Save multiple arrays to a single HDF5 file.

Each array is stored as a top-level dataset with the corresponding name. Existing datasets with the same name are overwritten.

Parameters:
  • data_list (list) – Iterable of arrays to save.

  • dataset_name_list (list[str]) – Iterable of dataset names (one per array).

  • hdf5_path (str) – Output HDF5 file path.

Return type:

None

ptyrax.dataset.scale(ptychogram, scale)[source]#

Multiply all diffraction pattern values by a constant factor.

Also updates the stored diffraction_pattern_scale.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • scale (float) – Multiplicative scaling factor.

Returns:

The ptychogram with scaled diffraction patterns.

Return type:

Ptychogram

ptyrax.dataset.scale_camera_distance(ptychogram, scale)[source]#

Multiply all detector positions by a constant factor.

Effectively changes the sample-to-detector distance without modifying orientations.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • scale (float) – Multiplicative factor for detector positions.

Returns:

The ptychogram with rescaled detector positions.

Return type:

Ptychogram

ptyrax.dataset.scale_diffraction_pattern_maximum(ptychogram, maximum)[source]#

Rescale diffraction patterns so the global maximum equals maximum.

First normalizes to [0, 1] by dividing by the current maximum, then multiplies by maximum.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • maximum (float) – Desired maximum pixel value.

Returns:

The ptychogram with rescaled patterns.

Return type:

Ptychogram

ptyrax.dataset.scale_length_unit(ptychogram, scale=1.0)[source]#

Multiply all length quantities by a constant factor.

Applies the same scaling to wavelengths, sample positions, detector positions, pixel sizes, and propagation distances. Use this to convert between unit systems (e.g. metres to micrometres).

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • scale (Float) – Multiplicative factor applied to all length fields.

Returns:

The ptychogram with rescaled length quantities.

Return type:

Ptychogram

ptyrax.dataset.scale_scan_positions(ptychogram, scale)[source]#

Scale sample positions by a per-axis factor in the local sample frame.

Positions are first transformed to the local frame defined by sample_orientations, scaled by the given factor, and then transformed back to global coordinates.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • scale (Float[Array, '3']) – Array of length 3 giving the multiplicative scale factor for each local coordinate axis (x, y, z).

Returns:

The ptychogram with rescaled sample positions.

Return type:

Ptychogram
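The transform, scale, transform-back sequence can be sketched on plain arrays. The sketch assumes each orientation is available as a 3x3 rotation matrix that maps local sample coordinates to global ones; the library's own orientation representation may differ, and the helper name scale_in_local_frame is hypothetical.

```python
import numpy as np

def scale_in_local_frame(positions, rotations, scale):
    """Scale (n, 3) global positions along the axes of their local frames."""
    # Global -> local: multiply by the transpose (inverse) of each rotation.
    local = np.einsum("nji,nj->ni", rotations, positions)
    # Per-axis scaling in the local frame.
    local = local * np.asarray(scale)
    # Local -> global.
    return np.einsum("nij,nj->ni", rotations, local)

# Frame rotated 90 degrees about z: the local x-axis points along global y.
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
positions = np.array([[0.0, 1.0, 0.0]])
scaled = scale_in_local_frame(positions, rot_z[None], [2.0, 1.0, 1.0])
```

The point lies on the local x-axis, so doubling the local x scale moves it to (0, 2, 0) in global coordinates.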

ptyrax.dataset.scale_wavelength(ptychogram, scale)[source]#

Multiply the wavelength array by a constant factor.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • scale (float) – Multiplicative factor for wavelength.

Returns:

The ptychogram with rescaled wavelength.

Return type:

Ptychogram

ptyrax.dataset.set_constant_detector_orientations(ptychogram, euler_angles)[source]#

Set all detector orientations to a single constant rotation.

The rotation is specified as extrinsic Euler angles in the xyz convention (in degrees) and is converted to a 6-D representation.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • euler_angles (Array | ndarray | bool | number | int | float | complex | LiteralArray) – 3-element array [rx, ry, rz] of Euler angles in degrees.

Returns:

The ptychogram with uniform detector orientations.

Return type:

Ptychogram
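The conversion from Euler angles to a 6-D representation can be sketched as follows. Two assumptions are made: extrinsic xyz means rotating about the fixed x, then y, then z axis (R = Rz · Ry · Rx), and the 6-D vector is the first two columns of the rotation matrix. The library's actual layout may differ, and all helper names are hypothetical.

```python
import numpy as np

def euler_xyz_to_matrix(euler_degrees):
    """Rotation matrix for extrinsic x, then y, then z rotations (degrees)."""
    rx, ry, rz = np.deg2rad(euler_degrees)
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    rot_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    rot_z = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    # Extrinsic rotations compose right to left: x is applied first.
    return rot_z @ rot_y @ rot_x

def matrix_to_6d(rotation):
    """Assumed 6-D layout: first column followed by second column."""
    return np.concatenate([rotation[:, 0], rotation[:, 1]])

def constant_orientations(n, euler_degrees):
    """Repeat one 6-D orientation for all n frames, giving shape (n, 6)."""
    return np.tile(matrix_to_6d(euler_xyz_to_matrix(euler_degrees)), (n, 1))

identity_6d = constant_orientations(4, [0.0, 0.0, 0.0])
quarter_turn_6d = constant_orientations(2, [0.0, 0.0, 90.0])
```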

ptyrax.dataset.set_constant_detector_positions(ptychogram, constant_position)[source]#

Set all detector positions to a single constant value.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • constant_position (Array | ndarray | bool | number | int | float | complex | LiteralArray) – 3-element array [x, y, z] specifying the detector position applied uniformly to all frames.

Returns:

The ptychogram with uniform detector positions.

Return type:

Ptychogram

ptyrax.dataset.set_constant_sample_orientations(ptychogram, euler_angles)[source]#

Set all sample orientations to a single constant rotation.

The rotation is specified as extrinsic Euler angles in the xyz convention (in degrees) and is converted to a 6-D representation.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • euler_angles (Array | ndarray | bool | number | int | float | complex | LiteralArray) – 3-element array [rx, ry, rz] of Euler angles in degrees.

Returns:

The ptychogram with uniform sample orientations.

Return type:

Ptychogram

ptyrax.dataset.shift_to_center_of_mass(ptychogram, order=2)[source]#

Shift all diffraction patterns so the intensity center-of-mass is at the array center.

Computes the intensity-weighted centroid of the mean diffraction pattern raised to the given power and applies a sub-pixel shift via interpolation.

The center of mass is computed as:

\[\mathbf{c} = \frac{\sum_{\mathbf{r}} \mathbf{r}\, I(\mathbf{r})^p} {\sum_{\mathbf{r}} I(\mathbf{r})^p}\]

where p is order.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • order (int) – Power to which the mean pattern is raised before computing the centroid. Higher values emphasize the bright peak.

Returns:

The ptychogram with recentered diffraction patterns.

Return type:

Ptychogram
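The centroid formula can be evaluated with NumPy as sketched below. The sketch only computes the required shift and does not perform the sub-pixel interpolation; it assumes the array center is at (shape - 1) / 2 in pixel-index coordinates, which may differ from the library's convention, and the helper name center_of_mass_shift is hypothetical.

```python
import numpy as np

def center_of_mass_shift(patterns, order=2):
    """Return the (row, col) shift that moves the centroid to the array center."""
    # Weights are the mean pattern raised to the power p = order.
    weights = patterns.mean(axis=0) ** order
    rows, cols = np.indices(weights.shape)
    total = weights.sum()
    centroid = np.array([(rows * weights).sum(), (cols * weights).sum()]) / total
    # Assumption: the center lies halfway between the first and last index.
    center = (np.array(weights.shape) - 1) / 2.0
    return center - centroid

# Two 5x5 patterns with a single bright pixel at row 1, column 3.
patterns = np.zeros((2, 5, 5))
patterns[:, 1, 3] = 1.0
shift = center_of_mass_shift(patterns, order=2)
```

Here the peak must move one pixel down and one pixel left to reach the center at (2, 2).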

ptyrax.dataset.sort_images_by_timestamp(image_paths)[source]#

Sort image file paths chronologically by embedded timestamp.

Parses timestamps from filenames (format: YYYY Month DD HH_MM_SS) and returns the paths sorted from earliest to latest. Handles Dutch month names.

Parameters:

image_paths (list[str]) – List of image file paths to sort.

Returns:

The same paths sorted by the timestamp extracted from filenames.

Raises:

ValueError – If a filename cannot be parsed into a valid timestamp.

Return type:

list[str]
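The parsing and sorting can be sketched with the standard library. The regular expression, the month table, and the helper names are assumptions made for illustration; only the YYYY Month DD HH_MM_SS layout and the Dutch month names come from the description above.

```python
import re
from datetime import datetime

DUTCH = ["januari", "februari", "maart", "april", "mei", "juni", "juli",
         "augustus", "september", "oktober", "november", "december"]
ENGLISH = ["january", "february", "march", "april", "may", "june", "july",
           "august", "september", "october", "november", "december"]
# Lower-case month name -> month number, for both languages.
MONTHS = {name: i + 1 for names in (DUTCH, ENGLISH) for i, name in enumerate(names)}

TIMESTAMP = re.compile(r"(\d{4}) ([A-Za-z]+) (\d{1,2}) (\d{2})_(\d{2})_(\d{2})")

def parse_timestamp(path):
    """Extract a datetime from a 'YYYY Month DD HH_MM_SS' fragment in `path`."""
    match = TIMESTAMP.search(path)
    if match is None:
        raise ValueError(f"No timestamp found in {path!r}")
    year, month_name, day, hour, minute, second = match.groups()
    month = MONTHS.get(month_name.lower())
    if month is None:
        raise ValueError(f"Unknown month name {month_name!r} in {path!r}")
    return datetime(int(year), month, int(day), int(hour), int(minute), int(second))

def sort_by_timestamp(paths):
    """Return the paths ordered from earliest to latest timestamp."""
    return sorted(paths, key=parse_timestamp)

paths = ["scan 2023 mei 02 10_00_00.spe",
         "scan 2023 maart 15 09_30_00.spe",
         "scan 2023 May 01 23_59_59.spe"]
ordered = sort_by_timestamp(paths)
```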

ptyrax.dataset.standardize_hdf5_shapes(n, local)[source]#

Normalize raw HDF5 data fields to the canonical shapes expected by Ptychogram.

Handles legacy key names, converts 2-D scan positions to 3-D, infers missing orientation and detector position fields from tilt angles and propagation distances, and renames the background key to detector_darkframe.

Parameters:
  • n (int) – Number of diffraction pattern frames in the dataset.

  • local (dict) – Mutable dictionary of dataset fields loaded from HDF5. Modified in place and also returned.

Returns:

The updated dictionary with standardized shapes and keys.

Raises:

KeyError – If n is None (cannot determine frame count) or required position/distance fields are missing.

Return type:

None

ptyrax.dataset.subtract_background(ptychogram, background_path, orientation=0)[source]#

Subtract a background image loaded from file from all diffraction patterns.

The background image is loaded from background_path, optionally reoriented, and then subtracted element-wise from each pattern.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • background_path (str) – Path to the background image file (e.g. .png or .spe).

  • orientation (int) – Orientation code (0–7) to apply to the background before subtraction.

Returns:

The ptychogram with background-subtracted patterns.

Return type:

Ptychogram

ptyrax.dataset.subtract_low_intensity(ptychogram, pixel_ratio=0.9)[source]#

Subtract a percentile-based threshold from all patterns and clamp to zero.

Computes the (pixel_ratio × 100)-th percentile across all pixel values, subtracts it from every pixel, and sets any resulting negative values to zero.

Parameters:
  • ptychogram (Ptychogram) – Input ptychogram.

  • pixel_ratio (float) – Fraction (0–1) specifying the percentile used as the subtraction threshold.

Returns:

The ptychogram with the baseline subtracted.

Return type:

Ptychogram
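The baseline subtraction can be sketched with NumPy as follows. The helper name subtract_percentile is hypothetical, and np.percentile with its default linear interpolation is assumed to stand in for whatever percentile routine the library uses.

```python
import numpy as np

def subtract_percentile(patterns, pixel_ratio=0.9):
    """Subtract the (pixel_ratio * 100)-th percentile and clamp at zero."""
    # One threshold for the whole stack, computed over all pixel values.
    threshold = np.percentile(patterns, pixel_ratio * 100.0)
    return np.maximum(patterns - threshold, 0.0)

# Values 0..10: the median is 5, so everything at or below 5 becomes zero.
patterns = np.arange(11.0)
baseline_removed = subtract_percentile(patterns, pixel_ratio=0.5)
```

With the default pixel_ratio of 0.9, roughly 90% of the pixels end up at zero, so the function is best suited to sparse patterns with a broad noise floor.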

ptyrax.dataset.wavelength_units(ptychogram)[source]#

Rescale all length quantities so that the first wavelength becomes unity.

Divides wavelengths, sample positions, detector positions, pixel sizes, and propagation distances by the first wavelength entry. Useful for working in dimensionless (wavelength-normalized) coordinates.

Parameters:

ptychogram (Ptychogram) – Input ptychogram with physical-unit lengths.

Returns:

The ptychogram with all lengths expressed in units of the first wavelength.

Return type:

Ptychogram
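The conversion to wavelength-normalized lengths can be sketched on a plain dictionary. The field names used here are placeholders rather than the attributes of Ptychogram, and the helper name to_wavelength_units is hypothetical.

```python
import numpy as np

def to_wavelength_units(lengths):
    """Divide every length field by the first wavelength entry."""
    reference = np.asarray(lengths["wavelengths"]).ravel()[0]
    return {name: np.asarray(value) / reference for name, value in lengths.items()}

# Hypothetical length fields, all in the same physical unit (metres).
lengths = {
    "wavelengths": np.array([0.5e-6, 1.0e-6]),
    "pixel_size": 2.0e-6,
    "sample_positions": np.array([[1.0e-6, 0.0, 0.0]]),
}
dimensionless = to_wavelength_units(lengths)
```

After the conversion the first wavelength is exactly one and every other length reads as a multiple of it, for example a pixel size of four wavelengths in this example.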