The Slide class

The main class of PathML for performing operations on a whole-slide image. Here is a brief summary of some of the crucial functions of Slide and when to use them:

_images/api_table.png
class pathml.slide.Slide(slideFilePath, newSlideFilePath=False, level=0, verbose=False)

The main class of PathML: a representation of a whole-slide image containing a dictionary of tiles. Further analyses, including but not limited to tissue detection and annotation, are added to this representation, and tiles from the whole-slide image can be extracted from it.

Parameters
  • slideFilePath (str) – path to a WSI (to make from scratch) or to a .pml file (to reload a saved Slide object, see Slide.save())

  • newSlideFilePath (str, optional) – if loading a .pml file and the location of the WSI has changed, the new path to WSI can be inputted here

  • level (int, optional) – the level of the WSI pyramid at which to operate. Level 0 is the highest resolution and is the default; how many levels are present above it depends on the WSI.

  • verbose (Bool, optional) – whether to print verbose output. Default is False.

addAnnotations(self, annotationFilePath, classesToAdd=False, negativeClass=False, level=0, overwriteExistingAnnotations=False, mergeOverlappingAnnotationsOfSameClass=True, acceptMultiPolygonAnnotations=True)

A function that adds the overlap between all (desired) classes present in an annotation file and each tile in the tile dictionary. Annotations within groups in ASAP are taken to be within one class, where the name of the ASAP group is the name of the class; similarly, annotations within classes in QuPath are taken to be within one class, where the name of the QuPath class is the name of the class. The exception is the negativeClass, if one is specified, which is subtracted from the other annotations. Acceptable ASAP annotation tools to make annotations for this function include the RectangleAnnotation, PolyAnnotation, and SplineAnnotation tools; in QuPath, acceptable tools include the Rectangle, Ellipse, Polygon, and Brush tools. Annotations should be polygons, i.e. closed regions that do not self-overlap at any point. Annotations of different classes are expected to never overlap, and annotations of the same class can only overlap (and will be merged into one polygon) if mergeOverlappingAnnotationsOfSameClass is set to True.

Parameters
  • annotationFilePath (str) – the path to the file containing the annotation. The file must be either an xml file from the ASAP software or a GeoJSON file from the QuPath software.

  • classesToAdd (list of str, optional) – a list of classes to add from the annotation file. Default is that all annotation classes will be used.

  • negativeClass (str, optional) – the name of the class of negative annotations (donut holes) to subtract from the other annotations. Default is not to consider any class to be a negative space class.

  • level (int, optional) – the level of the WSI pyramid to make use of. Default is 0.

  • overwriteExistingAnnotations (Bool, optional) – whether to overwrite any preexisting annotations in the tile dictionary. Default is False.

  • mergeOverlappingAnnotationsOfSameClass (Bool, optional) – whether to automatically merge annotations of the same class that overlap into one polygon. Default is True.

  • acceptMultiPolygonAnnotations (Bool, optional) – whether or not to accept annotations that parse into MultiPolygons using Shapely. Multipolygons tend to arise when there are small self-overlapping regions like loops in annotations. If this argument is True, then the polygon with the largest area among those constituting the multipolygon created from one annotation will be retained, and the others will not be used. Default is True.

Example

pathml_slide.addAnnotations("/path/to/annotations.xml", negativeClass="negative")
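
The tile-annotation overlap described above can be pictured with a simplified sketch. This is not PathML's implementation, which parses arbitrary polygons with Shapely; it assumes axis-aligned rectangles only, and every name in it (rect_intersection_area, tile_overlap_fraction, the coordinates) is hypothetical.

```python
def rect_intersection_area(a, b):
    """Area of overlap between two axis-aligned rectangles (x0, y0, x1, y1)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def tile_overlap_fraction(tile, positive, negative=None):
    """Fraction of the tile covered by `positive` after removing `negative`.

    Assumes the negative rectangle lies entirely inside the positive one
    (a "donut hole"), so the two overlap areas can simply be subtracted.
    """
    tile_area = (tile[2] - tile[0]) * (tile[3] - tile[1])
    covered = rect_intersection_area(tile, positive)
    if negative is not None:
        covered -= rect_intersection_area(tile, negative)
    return covered / tile_area


tile = (0, 0, 100, 100)      # hypothetical 100 x 100 pixel tile
tumor = (50, 0, 200, 100)    # annotation covering the right half of the tile
hole = (50, 0, 100, 20)      # negative-class region inside the annotation

overlap_without_hole = tile_overlap_fraction(tile, tumor)
overlap_with_hole = tile_overlap_fraction(tile, tumor, hole)
print(overlap_without_hole, overlap_with_hole)
```

In this toy case, the negative region lowers the tile's overlap with the class, which is the effect of passing negativeClass.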

appendTag(self, tileAddress, key, val)

A function to add a key-value pair of data to a certain tile in the tile dictionary.

Parameters
  • tileAddress (Tuple[int, int]) – the (x, y) coordinate tuple of the tile to save the data to.

  • key (str) – the key to store at the tile address.

  • val – the value to store at the key at the tile address.

Example

pathml_slide.appendTag((15,20), "brightness_level", 0.7)
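
As a mental model only, the tile dictionary can be thought of as a mapping from (x, y) tile addresses to per-tile dictionaries of data. The sketch below illustrates that idea; the exact layout of PathML's internal tile dictionary is an assumption here, and append_tag and tile_dictionary are hypothetical names.

```python
# Hypothetical tile dictionary: (x, y) tile address -> dict of tile data.
tile_dictionary = {
    (15, 20): {"x": 15, "y": 20},
    (16, 20): {"x": 16, "y": 20},
}


def append_tag(tiles, tile_address, key, val):
    """Store val under key for the tile at tile_address."""
    if tile_address not in tiles:
        raise KeyError(f"No tile at address {tile_address}")
    tiles[tile_address][key] = val


append_tag(tile_dictionary, (15, 20), "brightness_level", 0.7)
print(tile_dictionary[(15, 20)])
```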

classifierMetricAtThreshold(self, classToThreshold, probabilityThresholds, tileAnnotationOverlapThreshold=0.5, metric='accuracy', assignZeroToTilesWithoutAnnotationOverlap=True)

A function to return the tile-level metric of a class probability threshold (or list of thresholds) compared to the ground truth, where a tile with ground truth annotation overlap greater than or equal to tileAnnotationOverlapThreshold is considered to be ground truth positive for that class. Ground truth annotations are expected to have been added to each tile in the tile dictionary by addAnnotations(). Class probability labels are expected to have been added to each tile in the tile dictionary by Slide.inferClassifier(). Metrics include ‘accuracy’, ‘balanced_accuracy’, ‘f1’, ‘precision’, or ‘recall’.

Parameters
  • classToThreshold (str) – the class to threshold the tiles by. The class must be already present in the tile dictionary from Slide.inferClassifier().

  • probabilityThresholds (float or list of floats) – the probability threshold or list of probability thresholds (in the range 0 to 1) to check. If a float is provided, only that threshold is used as the cutoff at which the model considers a tile positive for the class, and the metric at that threshold is returned as a float. If a list of floats is provided, a list of metric values is returned in the same order as the inputted threshold list.

  • tileAnnotationOverlapThreshold (float, optional) – the class annotation overlap threshold at or above which a tile is considered ground truth positive for that class. Default is 0.5.

  • metric (str, optional) – which metric to compute. Options are ‘accuracy’, ‘balanced_accuracy’, ‘f1’, ‘precision’, or ‘recall’. Default is ‘accuracy’.

  • assignZeroToTilesWithoutAnnotationOverlap (Bool, optional) – whether to assign a ground truth metric value of 0 (or else throw an error) for tiles that lack an overlap with classToThreshold in the ground truth annotations. Default is True.

Returns

The specified classifier metric of the Slide’s tiles at the specified threshold; if several thresholds are provided, a list of performance values corresponding to the inputted list of thresholds will be returned instead

Return type

float (or list of floats if several thresholds are provided)

Example

pathml_slide.classifierMetricAtThreshold('tumor', [0.85, 0.9, 0.95], metric='balanced_accuracy')
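
The tile-level comparison described above can be sketched in plain Python. This is an illustration rather than PathML's code: it assumes a tile counts as predicted positive when its probability is at or above the threshold, and metric_at_threshold and the sample data are hypothetical.

```python
def metric_at_threshold(tiles, probability_threshold,
                        overlap_threshold=0.5, metric="accuracy"):
    """tiles: list of (annotation_overlap, predicted_probability) pairs."""
    tp = fp = tn = fn = 0
    for overlap, probability in tiles:
        truth = overlap >= overlap_threshold               # ground truth positive
        predicted = probability >= probability_threshold   # assumed cutoff rule
        if truth and predicted:
            tp += 1
        elif truth:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if metric == "accuracy":
        return (tp + tn) / len(tiles)
    if metric == "precision":
        return tp / (tp + fp) if tp + fp else 0.0
    if metric == "recall":
        return tp / (tp + fn) if tp + fn else 0.0
    raise ValueError(f"Unsupported metric: {metric}")


# (overlap with the class annotation, predicted class probability)
tiles = [(0.9, 0.95), (0.6, 0.80), (0.1, 0.90), (0.0, 0.10)]
accuracies = [metric_at_threshold(tiles, t) for t in [0.5, 0.85]]
print(accuracies)
```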

detectForeground(self, level=4, overwriteExistingForegroundDetection=False, threshold=None)

A function to implement traditional foreground filtering methods on the tile dictionary to exclude background tiles from subsequent operations.

Parameters
  • level (int, optional) – the level of the WSI pyramid to detect foreground on. Default is 4. Not all WSIs will have a 4th level, so alter if necessary. If memory runs out, increase the level to detect foreground on a lower-resolution image.

  • overwriteExistingForegroundDetection (Bool, optional) – whether to overwrite old foreground detection if it is present in the tile dictionary already. Default is False.

  • threshold (str or int) – Legacy argument, avoid using. Default is to put the results of all tissue detection methods (Otsu, triangle, simple thresholding) in the tile dictionary. Can be set to ‘otsu’, ‘triangle’ or an int to do simple darkness thresholding at that int value (tiles with a 0-100 foregroundLevel value less than or equal to the set value are considered foreground, where 0 is a pure black tile and 100 is a pure white tile).

Example

pathml_slide.detectForeground()
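
The simple darkness thresholding mentioned for the threshold argument can be sketched as follows. How PathML actually computes foregroundLevel is not stated here, so the mean-grayscale formula below is an assumption, and foreground_level and is_foreground are hypothetical names.

```python
def foreground_level(gray_pixels):
    """Mean 8-bit grayscale intensity rescaled to 0 (black) .. 100 (white)."""
    return 100 * sum(gray_pixels) / (255 * len(gray_pixels))


def is_foreground(gray_pixels, threshold):
    """A tile is foreground when it is at least as dark as the threshold."""
    return foreground_level(gray_pixels) <= threshold


tissue_tile = [120, 130, 110, 150]       # darker, stained tissue
background_tile = [250, 255, 252, 255]   # near-white glass

print(foreground_level(tissue_tile), is_foreground(tissue_tile, 80))
print(is_foreground(background_tile, 80))
```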

detectTissue(self, tissueDetectionLevel=1, tissueDetectionTileSize=512, tissueDetectionTileOverlap=0, tissueDetectionUpsampleFactor=4, batchSize=20, numWorkers=16, overwriteExistingTissueDetection=False, modelStateDictPath='../pathml/pathml/models/deep-tissue-detector_densenet_state-dict.pt', architecture='densenet')

A function to apply PathML’s built-in deep tissue detector to assign artifact, background, and tissue probabilities that sum to one to each tile in the tile dictionary. The raw tissue detection map for a WSI is saved in the Slide’s rawTissueDetectionMap attribute, which can be loaded into a new Slide object with detectTissueFromRawTissueDetectionMap() to save inference time. For this reason, calling Slide.save() after Slide.detectTissue() finishes is recommended.

Parameters
  • tissueDetectionLevel (int, optional) – the level of the WSI pyramid at which to perform the tissue detection. Default is 1.

  • tissueDetectionTileSize (int, optional) – the edge length in pixels of the tiles that the deep tissue detector will be inferred on. Default is 512.

  • tissueDetectionTileOverlap (float, optional) – the fraction of a tile’s edge length that overlaps the left, right, above, and below tiles. Default is 0.

  • tissueDetectionUpsampleFactor (int, optional) – the factor by which the WSI should be upsampled when performing tissue detection. Default is 4.

  • batchSize (int, optional) – the number of tiles per minibatch when inferring on the deep tissue detector. Default is 20.

  • numWorkers (int, optional) – the number of workers to use when detecting tissue. Default is 16.

  • overwriteExistingTissueDetection (Bool, optional) – whether to overwrite any existing deep tissue detector predictions if they are already present in the tile dictionary. Default is False.

  • modelStateDictPath (str, optional) – the path to the state dictionary of the deep tissue detector; it must be a 3-class classifier, with the class order as follows: background, artifact, tissue. Default is the path to the state dict of the deep tissue detector built into PathML.

  • architecture (str, optional) – the name of the architecture that the state dict belongs to. Currently supported architectures include resnet18, inceptionv3, vgg16, vgg16_bn, vgg19, vgg19_bn, densenet, alexnet, and squeezenet. Default is “densenet”, which is the architecture of PathML’s built-in deep tissue detector.

Example

pathml_slide.detectTissue()

detectTissueFromRawTissueDetectionMap(self, rawTissueDetectionMap, overwriteExistingTissueDetection=False)

Function to load a raw tissue detection map from a previous application of Slide.detectTissue() to a slide.

Parameters
  • rawTissueDetectionMap (np.array) – the raw tissue detection map numpy array saved in a Slide object’s rawTissueDetectionMap attribute.

  • overwriteExistingTissueDetection (Bool, optional) – whether to overwrite any existing deep tissue detection predictions in the tile dictionary if they are present. Default is False.

Example

pathml_slide.detectTissueFromRawTissueDetectionMap(Slide('/path/to/old_pathml_slide.pml').rawTissueDetectionMap)

extractAnnotationTiles(self, outputDir, slideName=False, numTilesToExtractPerClass='all', classesToExtract=False, otherClassNames=False, extractSegmentationMasks=False, tileAnnotationOverlapThreshold=0.5, foregroundLevelThreshold=False, tissueLevelThreshold=False, returnTileStats=True, returnOnlyNumTilesFromThisClass=False, seed=False)

A function to extract tiles that overlap with annotations into a directory structure amenable to torch.utils.data.ConcatDataset.

Parameters
  • outputDir (str) – the path to the directory where the tile directory will be stored

  • slideName (str, optional) – the name of the slide to be used in the file names of the extracted tiles and masks. Default is Slide.slideFileName.

  • numTilesToExtractPerClass (dict or int or 'all', optional) – how many suitable tiles to extract from the slide for each class; if more suitable tiles are available than are requested, tiles will be chosen at random. Expected to be a positive integer, a dictionary with class names as keys and positive integers as values, or ‘all’ to extract all suitable tiles for each class. Default is ‘all’.

  • classesToExtract (str or list of str, optional) – defaults to extracting all classes found in the annotations, but if defined, must be a string or a list of strings of class names.

  • otherClassNames (str or list of str, optional) – if defined, creates an empty class directory alongside the unannotated class directory for each class name in the list (or string) for torch ImageFolder purposes. If set to ‘discernFromClassesToExtract’, empty class directories will be created for all classes not found in annotations. Default is False.

  • extractSegmentationMasks (Bool, optional) – whether to extract a ‘masks’ directory that is exactly parallel to the ‘tiles’ directory, and contains binary segmentation mask tiles for each class desired. Pixel values of 255 in these masks appear as white and indicate the presence of the class; pixel values of 0 appear as black and indicate the absence of the class. Default is False.

  • tileAnnotationOverlapThreshold (float, optional) – a number greater than 0 and less than or equal to 1, or a dictionary of such values, with a key for each class to extract. The numbers specify the minimum fraction of a tile’s area that overlaps a given class’s annotations for it to be extracted. Default is 0.5.

  • foregroundLevelThreshold (str or int or float, optional) – if defined as an int, only extracts tiles with a 0-100 foregroundLevel value less than or equal to the set value (0 is a black tile, 100 is a white tile). Only includes Otsu’s method-passing tiles if set to ‘otsu’, or triangle algorithm-passing tiles if set to ‘triangle’. Default is not to filter on foreground at all.

  • tissueLevelThreshold (float, optional) – if defined, only extracts tiles with a 0 to 1 tissueLevel probability greater than or equal to the set value. Default is False.

  • returnTileStats (Bool, optional) – whether to return the 0-1 normalized sum of channel values, the sum of the squares of channel values, and the number of tiles extracted for use in global mean and variance computation. Default is True.

  • returnOnlyNumTilesFromThisClass (str, optional) – if a string is provided, only the number of suitable tiles for the specified class in the slide is returned, and no tile images are created. Default is False.

  • seed (int, optional) – the random seed to use for reproducible analyses. Default is not to use a seed when randomly selecting tiles.

Returns

A dictionary containing the Slide’s name, 0-1 normalized sum of channel values, the sum of the squares of channel values, and the number of tiles extracted for use in global mean and variance computation; if returnTileStats is set to False, True will be returned

Return type

dict

Example

channel_data = pathml_slide.extractAnnotationTiles('/path/to/directory', numTilesToExtractPerClass=200, tissueLevelThreshold=0.995)
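
The returned tile statistics are intended for global mean and variance computation across slides. A minimal sketch of that computation follows. The dictionary keys (channel_sums, channel_squared_sums, num_tiles) are hypothetical stand-ins rather than PathML's actual keys, and the sketch assumes the sums are taken over every pixel of every extracted tile with a fixed number of pixels per tile.

```python
import math


def global_channel_stats(per_slide_stats, pixels_per_tile, num_channels=3):
    """Combine per-slide sums into dataset-wide channel means and std devs."""
    total_sum = [0.0] * num_channels
    total_squared_sum = [0.0] * num_channels
    total_pixels = 0
    for stats in per_slide_stats:
        total_pixels += stats["num_tiles"] * pixels_per_tile
        for c in range(num_channels):
            total_sum[c] += stats["channel_sums"][c]
            total_squared_sum[c] += stats["channel_squared_sums"][c]
    means = [s / total_pixels for s in total_sum]
    # Var[X] = E[X^2] - E[X]^2, clamped at 0 to absorb rounding error.
    stds = [math.sqrt(max(sq / total_pixels - m * m, 0.0))
            for sq, m in zip(total_squared_sum, means)]
    return means, stds


# Toy data: one 4-pixel tile per slide, pixel values of 0.5 and 1.0.
slide_a = {"channel_sums": [2.0] * 3, "channel_squared_sums": [1.0] * 3, "num_tiles": 1}
slide_b = {"channel_sums": [4.0] * 3, "channel_squared_sums": [4.0] * 3, "num_tiles": 1}

means, stds = global_channel_stats([slide_a, slide_b], pixels_per_tile=4)
print(means, stds)
```

Values like these are typically what a normalization transform needs as its per-channel mean and standard deviation.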

extractAnnotationTilesMultiClassSegmentation(self, outputDir, slideName=False, numTilesToExtract=100, classesToExtract=False, tileAnnotationOverlapThreshold=0.5, foregroundLevelThreshold=False, tissueLevelThreshold=False, returnTileStats=True, seed=False)

A function to extract tiles that overlap with annotations and their corresponding segmentation masks, where annotation masks are returned as .npy files containing ndarray stacks (each array in the stack being one class’s segmentation mask) for use in multi-class segmentation problems.

Parameters
  • outputDir (str) – the path to the directory where the tile directory will be stored

  • slideName (str, optional) – the name of the slide to be used in the file names of the extracted tiles and masks. Default is Slide.slideFileName.

  • numTilesToExtract (int or 'all', optional) – how many suitable tiles to extract from the slide; if more suitable tiles are available than are requested, tiles will be chosen at random. Expected to be a positive integer or ‘all’ to extract all suitable tiles. Default is 100.

  • classesToExtract (str or list of str, optional) – which classes to consider when selecting tiles and making mask stacks; defaults to extracting all classes found in the annotations, but if defined, must be a string or a list of strings of class names.

  • tileAnnotationOverlapThreshold (float, optional) – a number greater than 0 and less than or equal to 1, or a dictionary of such values, with a key for each class to extract. The numbers specify the minimum fraction of a tile’s area that overlaps the annotations of the classesToExtract for that tile to be extracted. The overlaps with all classesToExtract classes are summed together, and if this sum is greater than or equal to tileAnnotationOverlapThreshold, the tile is extracted. Default is 0.5.

  • foregroundLevelThreshold (str or int or float, optional) – if defined as an int, only extracts tiles with a 0-100 foregroundLevel value less than or equal to the set value (0 is a black tile, 100 is a white tile). Only includes Otsu’s method-passing tiles if set to ‘otsu’, or triangle algorithm-passing tiles if set to ‘triangle’. Default is not to filter on foreground at all.

  • tissueLevelThreshold (float, optional) – if defined, only extracts tiles with a 0 to 1 tissueLevel probability greater than or equal to the set value. Default is False.

  • returnTileStats (Bool, optional) – whether to return the 0-1 normalized sum of channel values, the sum of the squares of channel values, and the number of tiles extracted for use in global mean and variance computation. Default is True.

  • seed (int, optional) – the random seed to use for reproducible analyses. Default is not to use a seed when randomly selecting tiles.

Returns

A dictionary containing the class order that the class masks appear in ndarray mask stacks, the Slide’s name, 0-1 normalized sum of channel values, the sum of the squares of channel values, and the number of tiles extracted for use in global mean and variance computation; if returnTileStats is set to False, only the class mask order will be returned (a list of strings)

Return type

dict

Example

channel_data = pathml_slide.extractAnnotationTilesMultiClassSegmentation('/path/to/directory', numTilesToExtract=200, classesToExtract=['lymphocyte', 'normal', 'tumor'], tileAnnotationOverlapThreshold=0.6, tissueLevelThreshold=0.995)

extractRandomUnannotatedTiles(self, outputDir, slideName=False, numTilesToExtract=100, unannotatedClassName='unannotated', otherClassNames=False, extractSegmentationMasks=False, foregroundLevelThreshold=False, tissueLevelThreshold=False, returnTileStats=True, seed=False)

A function to extract randomly selected tiles that don’t overlap any annotations into a directory structure amenable to torch.utils.data.ConcatDataset.

Parameters
  • outputDir (str) – the path to the directory where the tile directory will be stored

  • slideName (str, optional) – the name of the slide to be used in the file names of the extracted tiles and masks. Default is Slide.slideFileName.

  • numTilesToExtract (int, optional) – the number of random unannotated tiles to extract. Default is 100.

  • unannotatedClassName (str, optional) – the name that the unannotated “class” directory should be called. Default is “unannotated”.

  • otherClassNames (str or list of str, optional) – if defined, creates an empty class directory alongside the unannotated class directory for each class name in the list (or string) for torch ImageFolder purposes

  • extractSegmentationMasks (Bool, optional) – whether to extract a ‘masks’ directory that is exactly parallel to the ‘tiles’ directory, and contains binary segmentation mask tiles for each class desired (these tiles will of course all be entirely black, pixel values of 0). Default is False.

  • foregroundLevelThreshold (str or int or float, optional) – if defined as an int, only extracts tiles with a 0-100 foregroundLevel value less than or equal to the set value (0 is a black tile, 100 is a white tile). Only includes Otsu’s method-passing tiles if set to ‘otsu’, or triangle algorithm-passing tiles if set to ‘triangle’. Default is not to filter on foreground at all.

  • tissueLevelThreshold (float, optional) – if defined, only extracts tiles with a 0 to 1 tissueLevel probability greater than or equal to the set value. Default is False.

  • returnTileStats (Bool, optional) – whether to return the 0-1 normalized sum of channel values, the sum of the squares of channel values, and the number of tiles extracted for use in global mean and variance computation. Default is True.

  • seed (int, optional) – the random seed to use for reproducible analyses. Default is not to use a seed when randomly selecting tiles.

Returns

A dictionary containing the Slide’s name, 0-1 normalized sum of channel values, the sum of the squares of channel values, and the number of tiles extracted for use in global mean and variance computation; if returnTileStats is set to False, True will be returned

Return type

dict

Example

channel_data = pathml_slide.extractRandomUnannotatedTiles('/path/to/directory', numTilesToExtract=200, unannotatedClassName="non_metastasis", tissueLevelThreshold=0.995)
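
The role of the seed argument in random tile selection can be illustrated with the standard library. This is a sketch of seeded sampling in general, not PathML's selection code; choose_tiles and the address grid are hypothetical.

```python
import random


def choose_tiles(suitable_addresses, num_tiles_to_extract, seed=None):
    """Pick tiles at random when more are suitable than were requested."""
    if num_tiles_to_extract >= len(suitable_addresses):
        return list(suitable_addresses)
    rng = random.Random(seed)  # seed=None draws a fresh, unseeded sample
    return rng.sample(suitable_addresses, num_tiles_to_extract)


# Hypothetical (x, y) addresses of 20 unannotated tiles.
addresses = [(x, y) for x in range(5) for y in range(4)]

first = choose_tiles(addresses, 5, seed=42)
second = choose_tiles(addresses, 5, seed=42)
print(first == second)
```

Using the same seed on the same list of suitable tiles gives the same selection, which is what makes an extraction reproducible.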

fetchTile(self, patchWidth, patchHeight, patchX, patchY)
getAnnotationTileMask(self, tileAddress, maskClass, writeToNumpy=False, verbose=False, acceptTilesWithoutClass=False)

A function that returns the PIL Image of the binary mask of a tile-annotation class overlap. Note that the output values are 0 (black) to 255 (white).

Parameters
  • tileAddress (Tuple[int, int]) – the (x, y) coordinate tuple of the desired tile to get the annotation mask for.

  • maskClass (str) – the class to extract a segmentation mask for.

  • writeToNumpy (Bool, optional) – whether to return the annotation tile mask in the form of a numpy array instead of a PIL Image. Default is False.

  • acceptTilesWithoutClass (Bool, optional) – whether to allow the input of tiles that lack either annotations or annotations with maskClass present. Default is False. If set to True, in cases where tiles lack either annotations or annotations with maskClass present, a blank mask will be returned.

  • verbose (Bool, optional) – whether to output verbose messages. Default is False.

Returns

An image depicting the binary mask of an annotation class overlapping the specified tile; if writeToNumpy is set to True, an np.array is returned instead

Return type

PIL.Image

Example

pathml_slide.getAnnotationTileMask((15,20), "metastasis")

getNonOverlappingSegmentationInferenceArray(self, className, aggregationMethod='mean', probabilityThreshold=None, dtype='int', folder=os.getcwd(), verbose=False)

A function to extract the pixel-wise inference result (from Slide.inferClassifier()) of a Slide. Tile overlap is “stitched together” to produce one mask with the same pixel dimensions as the WSI. The resulting mask will be saved to a .npz file as a scipy.sparse.lil_matrix.

Parameters
  • className (str) – the name of the class to extract the binary mask for. Must be present in the tile dictionary from Slide.inferClassifier().

  • aggregationMethod (str, optional) – the method used to combine inference results on a pixel when two inference tiles overlap on that pixel. Default is ‘mean’ and no other options are currently supported.

  • probabilityThreshold (float, optional) – if defined, this is used as the cutoff above which a pixel is considered part of the class className. This will result in a binary mask of Trues and Falses being created. Default is to return a mask of 0-255 int predictions.

  • dtype (str, optional) – the data type to store in the output matrix. Options are ‘int’ for numpy.uint8 (the default), ‘float’ for numpy.float32. To get a Boolean output using a probability threshold, set a value for probabilityThreshold.

  • folder (str, optional) – the path to the directory where the scipy.sparse.lil_matrix will be saved. Default is the current working directory.

  • verbose (Bool, optional) – whether to output verbose messages. Default is False.

Example

pathml_slide.getNonOverlappingSegmentationInferenceArray('metastasis', folder='path/to/folder')
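
The ‘mean’ aggregation of overlapping tiles can be sketched on a tiny grid. This is a pure-Python illustration under simplifying assumptions (dense lists instead of a sparse matrix, probabilities instead of 0-255 ints); stitch_mean and the tile layout are hypothetical.

```python
def stitch_mean(tiles, height, width):
    """Average per-pixel predictions from tiles that may overlap.

    tiles: list of (x0, y0, mask), where mask is a 2D list of probabilities
    and (x0, y0) is the pixel position of the mask's top-left corner.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for x0, y0, mask in tiles:
        for dy, row in enumerate(mask):
            for dx, value in enumerate(row):
                total[y0 + dy][x0 + dx] += value
                count[y0 + dy][x0 + dx] += 1
    return [[total[r][c] / count[r][c] if count[r][c] else 0.0
             for c in range(width)] for r in range(height)]


# Two 2 x 2 tiles that share the middle column of a 2 x 3 image.
left_tile = (0, 0, [[0.2, 0.2], [0.2, 0.2]])
right_tile = (1, 0, [[0.8, 0.8], [0.8, 0.8]])

stitched = stitch_mean([left_tile, right_tile], height=2, width=3)
binary = [[value > 0.4 for value in row] for row in stitched]
print(stitched[0], binary[0])
```

The last line mimics the effect of probabilityThreshold: pixels above the cutoff become True and the rest become False.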

getTile(self, tileAddress, writeToNumpy=False, useFetch=False)

A function to return a desired tile in the tile dictionary in the form of a pyvips Image.

Parameters
  • tileAddress (Tuple[int, int]) – the (x, y) coordinate tuple of the desired tile to extract.

  • writeToNumpy (Bool, optional) – whether to return a numpy array of the tile (otherwise a pyvips Image object will be returned). Default is False.

  • useFetch (Bool, optional) – whether to use pyvips’s fetchTile() function to extract the tile, which is purported to be faster than extractArea(). Default is False.

Returns

Tile image from the specified address; if writeToNumpy is set to True, a np.ndarray will be returned instead

Return type

pyvips.Image

Example

pathml_slide.getTile((15,20))

getTileCount(self, foregroundLevelThreshold=False, tissueLevelThreshold=False, foregroundOnly=False)

A function that returns the number of tiles in the tile dictionary. Arguments can be used to find the number of tiles with desired characteristics in the tile dictionary.

Parameters
  • foregroundLevelThreshold (str or int, optional) – returns the number of tiles considered foreground if ‘otsu’ or ‘triangle’ is used, or the number of tiles at or above the minimum threshold specified if simple average darkness intensity foreground filtering was used (0 is a pure black tile, 100 is a pure white tile). Default is not to filter the tile count this way. Slide.detectForeground() must be run first.

  • tissueLevelThreshold (float, optional) – returns the number of tiles at or above the deep tissue detector tissue probability specified. Default is not to filter the tile count this way. Slide.detectTissue() must be run first.

  • foregroundOnly (Bool, optional) – Legacy argument, avoid using. Whether to return the count of only the number of foreground tiles found with detectForeground(). Only available if threshold argument was used when Slide.detectForeground() was called.

Returns

The number of tiles in the Slide’s tile dictionary

Return type

int

Example

pathml_slide.getTileCount()

getTileDiceScore(self, tileAddress, className, pixelBinarizationThreshold=0.5)

A function that returns the Dice coefficient by comparing the tile’s ground truth segmentation mask from Slide.addAnnotations() with the tile’s inference segmentation mask output by a trained model via Slide.inferSegmenter().

Parameters
  • tileAddress (Tuple[int, int]) – the tile dictionary address of the tile to compute the Dice score for.

  • className (str) – the name of the class to compute the Dice score for. This class name must be present in the tile dictionary in both the annotations (added to the tile dictionary via Slide.addAnnotations()) and the segmentation inference output (added to the tile dictionary via Slide.inferSegmenter()).

  • pixelBinarizationThreshold (float, optional) – the 0-1 threshold above which a pixel in the segmentation probability mask output by the trained model is considered a member of the class. Default is 0.5.

Returns

The Sorensen-Dice similarity of the specified tile’s ground truth segmentation mask and the segmentation mask predicted by a segmentation model

Return type

float

Example

tile_dice_score = pathml_slide.getTileDiceScore(pathml_slide.suitableTileAddresses[0], 'metastasis')
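
For reference, the Sorensen-Dice coefficient of two binary masks A and B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it on flattened masks after binarizing the predicted probabilities. It is an illustration, not PathML's code; dice_score is a hypothetical name, and returning 1.0 when both masks are empty is an assumption.

```python
def dice_score(ground_truth, probabilities, pixel_binarization_threshold=0.5):
    """Dice coefficient between a 0/1 mask and thresholded probabilities."""
    predicted = [p > pixel_binarization_threshold for p in probabilities]
    intersection = sum(1 for g, p in zip(ground_truth, predicted) if g and p)
    total = sum(ground_truth) + sum(predicted)
    if total == 0:
        return 1.0  # assumption: two empty masks agree perfectly
    return 2 * intersection / total


ground_truth = [1, 1, 0, 0]            # flattened ground truth mask
probabilities = [0.9, 0.4, 0.7, 0.1]   # flattened model output

score = dice_score(ground_truth, probabilities)
print(score)
```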

hasAnnotations(self)

A function that returns a Boolean of whether annotations have been added to the tile dictionary by Slide.addAnnotations() yet.

Returns

Whether the Slide’s tile dictionary contains any annotations

Return type

Bool

Example

pathml_slide.hasAnnotations()

hasTileDictionary(self)

A function that returns a Boolean of whether the Slide object has had a tile dictionary created by Slide.setTileProperties() yet.

Returns

Whether the Slide contains a tile dictionary

Return type

Bool

Example

pathml_slide.hasTileDictionary()

hasTissueDetection(self)

A function that returns a Boolean of whether deep tissue detections have been added to the tile dictionary by Slide.detectTissue() yet.

Returns

Whether the Slide contains tissue detections from the deep tissue detector

Return type

Bool

Example

pathml_slide.hasTissueDetection()

ind2sub(self, tileIndex, foregroundOnly=False)
inferClassifier(self, trainedModel, classNames, dataTransforms=None, batchSize=30, numWorkers=16, foregroundLevelThreshold=False, tissueLevelThreshold=False, overwriteExistingClassifications=False)

A function to infer a trained classifier on a Slide object using PyTorch.

Parameters
  • trainedModel (torchvision.models) – A PyTorch torchvision model that has been trained for the classification task desired for inference.

  • classNames (list of str) – an alphabetized list of class names.

  • dataTransforms (torchvision.transforms.Compose) – a PyTorch torchvision.transforms.Compose object with the desired data transformations.

  • batchSize (int, optional) – the number of tiles to use in each inference minibatch. Default is 30.

  • numWorkers (int, optional) – the number of workers to use when inferring the model on the WSI. Default is 16.

  • foregroundLevelThreshold (str or int or float, optional) – if defined as an int, only infers trainedModel on tiles with a 0-100 foregroundLevel value less than or equal to the set value (0 is a black tile, 100 is a white tile). Only infers on Otsu’s method-passing tiles if set to ‘otsu’, or triangle algorithm-passing tiles if set to ‘triangle’. Default is not to filter on foreground at all.

  • tissueLevelThreshold (float, optional) – if defined, only infers trainedModel on tiles with a 0 to 1 tissueLevel probability greater than or equal to the set value. Default is False.

  • overwriteExistingClassifications (Bool, optional) – whether to overwrite any existing classification inferences if they are already present in the tile dictionary. Default is False.

inferSegmenter(self, trainedModel, classNames, dataTransforms=None, dtype='int', batchSize=1, numWorkers=16, foregroundLevelThreshold=False, tissueLevelThreshold=False, overwriteExistingSegmentations=False)

A function to infer a trained segmentation model on a Slide object using PyTorch.

Parameters
  • trainedModel (torchvision.models) – A PyTorch segmentation model that has been trained for the segmentation task desired for inference.

  • classNames (list of str) – a list of class names. The first class name is expected to correspond with the first channel of the output mask image, the second with the second, and so on.

  • dataTransforms (torchvision.transforms.Compose) – a PyTorch torchvision.transforms.Compose object with the desired data transformations.

  • dtype (str, optional) – if ‘float’, saves the pixel probabilities as 0-1 numpy.float32 values; if ‘int’, saves the pixel probabilities as 0-255 numpy.uint8 values (these make for much more memory-efficient Slide objects). Default is ‘int’.

  • batchSize (int, optional) – the number of tiles to use in each inference minibatch. Default is 1.

  • numWorkers (int, optional) – the number of workers to use when inferring the model on the WSI. Default is 16.

  • foregroundLevelThreshold (str or int or float, optional) – if defined as an int, only infers trainedModel on tiles with a 0-100 foregroundLevel value less than or equal to the set value (0 is a black tile, 100 is a white tile). Only infers on Otsu’s method-passing tiles if set to ‘otsu’, or triangle algorithm-passing tiles if set to ‘triangle’. Default is not to filter on foreground at all.

  • tissueLevelThreshold (float, optional) – if defined, only infers trainedModel on tiles with a 0 to 1 tissueLevel probability greater than or equal to the set value. Default is False.

  • overwriteExistingSegmentations (Bool, optional) – whether to overwrite any existing segmentation inferences if they are already present in the tile dictionary. Default is False.

Example

pathml_slide.inferSegmenter(trained_model, classNames=class_names, batchSize=6, tissueLevelThreshold=0.995)

iterateTiles(self, tileDictionary=False, includeImage=False, writeToNumpy=False)

A generator function to iterate over all tiles in the tile dictionary, returning the tile address or the tile address and the tile image if specified with includeImage.

Parameters
  • tileDictionary (dict, optional) – the tile dictionary to iterate over. Default is the Slide’s own tile dictionary.

  • includeImage (Bool, optional) – whether to return a numpy array of the tile alongside its address. Default is False.

  • writeToNumpy (Bool, optional) – when includeImage is True, whether to return the tile image as a numpy array; if False, a pyvips Image object is returned instead. Default is False.

Example

for address in pathml_slide.iterateTiles(): print(address)
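
The generator behaviour described above can be sketched without PathML installed. This is an illustration only: iterate_tiles, load_image, and the toy tile dictionary are hypothetical stand-ins, and the real method reads tile images from the WSI rather than from a callback.

```python
def iterate_tiles(tile_dictionary, include_image=False, load_image=None):
    """Yield each tile address, or (address, image) when include_image is True."""
    for address in tile_dictionary:
        if include_image:
            # load_image is a hypothetical callback standing in for WSI access.
            yield address, load_image(address)
        else:
            yield address

# Hypothetical tile dictionary keyed by (x, y) address tuples.
tiles = {(0, 0): {}, (1, 0): {}}

addresses = list(iterate_tiles(tiles))
pairs = list(
    iterate_tiles(tiles, include_image=True, load_image=lambda a: f"image at {a}")
)

print(addresses)
print(pairs)
```

Because tiles are produced lazily, only one tile image needs to be held in memory at a time while looping.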

numTilesAboveClassPredictionThreshold(self, classToThreshold, probabilityThresholds)

A function to return the number of tiles at or above one or a list of probability thresholds for a classification class added to each tile in the tile dictionary by Slide.inferClassifier().

Parameters
  • classToThreshold (str) – the class to threshold the tiles by. The class must be already present in the tile dictionary from Slide.inferClassifier().

  • probabilityThresholds (float or list of floats) – the probability threshold or list of probability thresholds (in the range 0 to 1) to check. If a float is provided, just that probability threshold will be used, and an int of the number of tiles at or above that threshold will be returned. If a list of floats is provided, a list of ints of the number of tiles at or above each of those thresholds in respective order to the inputted threshold list will be returned.

Returns

The number of tiles at or above the specified class prediction probability threshold; if a list rather than a single float is provided for probabilityThresholds, a list of integer tile counts at or above the corresponding thresholds in the probabilityThresholds list is returned instead

Return type

int or list of int

Example

pathml_slide.numTilesAboveClassPredictionThreshold('tumor', [0.85, 0.9, 0.95])
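
The float-versus-list return behaviour can be sketched as follows. The function name and the "probabilities" key are hypothetical; the layout PathML uses to store classifier predictions in the tile dictionary is not described here and may differ.

```python
def num_tiles_at_or_above(tile_dictionary, class_name, thresholds):
    """Count tiles whose class probability is at or above each threshold."""
    single = not isinstance(thresholds, list)
    threshold_list = [thresholds] if single else thresholds
    counts = [
        sum(
            1
            for tile in tile_dictionary.values()
            if tile["probabilities"][class_name] >= threshold
        )
        for threshold in threshold_list
    ]
    # A single float gives back one int; a list gives back a list of ints.
    return counts[0] if single else counts

# Hypothetical tile dictionary; "probabilities" is an assumed key name.
tiles = {
    (0, 0): {"probabilities": {"tumor": 0.97}},
    (1, 0): {"probabilities": {"tumor": 0.90}},
    (2, 0): {"probabilities": {"tumor": 0.40}},
}

counts_for_list = num_tiles_at_or_above(tiles, "tumor", [0.85, 0.9, 0.95])
count_for_float = num_tiles_at_or_above(tiles, "tumor", 0.9)
print(counts_for_list, count_for_float)
```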

run(self, num)
save(self, fileName=False, folder=os.getcwd())

A function to save a pickled PathML Slide object to a .pml file for re-use later (re-loading is performed by providing the path to the .pml file when initializing a Slide object). This function should be re-run after each major step in an analysis on a Slide.

Parameters
  • fileName (str, optional) – the name of the file where the pickled Slide will be stored, excluding an extension. Default is the slideFileName attribute.

  • folder (str, optional) – the path to the directory where the pickled Slide will be saved. Default is the current working directory.

Example

pathml_slide.save("pathml_slide", folder="/path/to/pathml_slides")
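
The save-and-reload cycle relies on pickling, which can be sketched with the standard library alone. This is not PathML's implementation: save_pickled is a hypothetical helper, and a plain dict stands in for the Slide object's state.

```python
import os
import pickle
import tempfile

def save_pickled(obj, file_name, folder):
    """Pickle obj to folder/file_name.pml and return the path written."""
    path = os.path.join(folder, file_name + ".pml")
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)
    return path

with tempfile.TemporaryDirectory() as folder:
    # Hypothetical stand-in for the state held by a Slide object.
    state = {"slideFileName": "example", "tileDictionary": {(0, 0): {}}}
    saved_path = save_pickled(state, "example", folder)
    with open(saved_path, "rb") as handle:
        reloaded = pickle.load(handle)

print(os.path.basename(saved_path))
```

As with any pickle file, a .pml file should only be loaded if it comes from a trusted source.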

saveTile(self, tileAddress, fileName, folder=os.getcwd())

A function to save a specific tile image to an image file.

Parameters
  • tileAddress (Tuple[int, int]) – the (x, y) coordinate tuple of the desired tile to save.

  • fileName (str) – the name of the image file including an image extension.

  • folder (str, optional) – the path to the directory where the tile image will be saved. Default is the current working directory.

Example

pathml_slide.saveTile((15, 20), "tile_15x_20y.jpg", folder="/path/to/tiles_directory")

saveTileDictionary(self, fileName=False, folder=os.getcwd())

A function to save just the tileDictionary attribute of a Slide object into a pickled file. Note that these pickled files cannot be used as an input when initializing a Slide object; please use Slide.save() instead.

Parameters
  • fileName (str, optional) – the name of the file where the pickled tile dictionary (a dict) will be stored, excluding an extension. Default is the slideFileName attribute.

  • folder (str, optional) – the path to the directory where the pickled tile dictionary will be saved. Default is the current working directory.

Example

pathml_slide.saveTileDictionary(folder="path/to/pathml_tile_dictionaries")

segmenterMetricAtThreshold(self, classToThreshold, probabilityThresholds, metric='dice_coeff')

A function to return the pixel-level metric of a class probability threshold (or list of thresholds) compared to the ground truth, where a pixel with ground truth annotation overlap greater than or equal to probabilityThresholds is considered to be ground truth positive for that class. Ground truth annotations are expected to have been added to each tile in the tile dictionary by addAnnotations(). Class probability labels are expected to have been added to each tile in the tile dictionary by inferSegmenter(). The only metric currently available is the Dice coefficient. The metric will be applied to all tiles with predictions added by Slide.inferSegmenter() and the average of that metric across those tiles will be returned to give one metric per slide per threshold in probabilityThresholds.

Parameters
  • classToThreshold (str) – the class to threshold the pixels by. The class must be already present in the tile dictionary from Slide.inferSegmenter().

  • probabilityThresholds (float or list of floats) – the probability threshold or list of probability thresholds (in the range 0 to 1) to check. If a float is provided, just that probability threshold will be used as the cutoff at which the model considers a pixel positive for the class, and a float of the segmenter’s metric value at that threshold will be returned. If a list of floats is provided, a list of floats of the metric values for those thresholds, in the same order as the inputted threshold list, will be returned.

  • metric (str, optional) – which metric to compute. The only option currently available is ‘dice_coeff’. Default is ‘dice_coeff’.

Returns

The Dice coefficient performance of the Slide’s tiles at the specified threshold; if several thresholds are provided, a list of performance values corresponding to the inputted list of thresholds will be returned instead

Return type

float or list of float

Example

pathml_slide.segmenterMetricAtThreshold('tumor', [0.85, 0.9, 0.95])
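
For reference, the Dice coefficient of a predicted mask P and a ground truth mask T is 2|P ∩ T| / (|P| + |T|). The sketch below binarizes per-tile probabilities at a threshold and averages Dice across tiles, as the description above outlines. It is a simplified illustration, not PathML's code: masks are flat Python lists, the ground truth is assumed to be already binary, and treating two empty masks as a perfect match is an assumed convention.

```python
def dice_coefficient(predicted, truth):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two equal-length binary masks."""
    intersection = sum(p * t for p, t in zip(predicted, truth))
    total = sum(predicted) + sum(truth)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2 * intersection / total

def mean_dice_at_threshold(tiles, threshold):
    """Binarize each tile's probabilities at threshold, then average Dice."""
    scores = []
    for probabilities, truth in tiles:
        predicted = [int(p >= threshold) for p in probabilities]
        scores.append(dice_coefficient(predicted, truth))
    return sum(scores) / len(scores)

# Hypothetical tiles: (pixel probabilities, binary ground truth).
tiles = [
    ([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]),
    ([0.9, 0.6, 0.3, 0.1], [1, 0, 0, 0]),
]

mean_dice = mean_dice_at_threshold(tiles, 0.5)
print(mean_dice)
```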

setTileProperties(self, tileSize, tileOverlap=0, unit='px')

A function to set the properties of the tile dictionary in a Slide object. Should be the first function called on a newly created Slide object.

Parameters
  • tileSize (int) – the edge length of each square tile in the requested unit

  • tileOverlap (float, optional) – the fraction of a tile’s edge length that overlaps the left, right, above, and below tiles. Default is 0.

  • unit (str, optional) – the unit to measure tileSize by. Default is ‘px’ for pixels; no other units are currently supported.

Example

pathml_slide.setTileProperties(400)
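
To make the tileOverlap fraction concrete, here is a minimal sketch of how tile positions along one axis could follow from tileSize and tileOverlap. It assumes the step between neighbouring tiles is tileSize minus tileSize * tileOverlap and that partial tiles at the edge are dropped; PathML's own tiling arithmetic is not described here and may differ. tile_origins is a hypothetical helper.

```python
def tile_origins(slide_length, tile_size, tile_overlap=0.0):
    """Return pixel offsets of tiles along one axis (assumes 0 <= tile_overlap < 1)."""
    # Assumption: overlap shortens the step between neighbouring tiles.
    stride = tile_size - int(round(tile_size * tile_overlap))
    # Assumption: tiles that would extend past the slide edge are dropped.
    return list(range(0, slide_length - tile_size + 1, stride))

no_overlap = tile_origins(2000, 400)
half_overlap = tile_origins(2000, 400, tile_overlap=0.5)

print(no_overlap)
print(len(half_overlap))
```

Under these assumptions, a 50% overlap roughly doubles the number of tiles per axis, which multiplies the work of every later per-tile step.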

square_int(self, i)
suitableTileAddresses(self, tissueLevelThreshold=False, foregroundLevelThreshold=False)

A function that returns a list of the tile address tuples that meet set tissue and foreground thresholds. All addresses will be returned if neither tissueLevelThreshold nor foregroundLevelThreshold is defined.

Parameters
  • foregroundLevelThreshold (str or int or float, optional) – if defined as an int or float, only includes the tile addresses of tiles with a 0-100 foregroundLevel value less than or equal to the set value (0 is a black tile, 100 is a white tile). Only includes Otsu’s method-passing tiles if set to ‘otsu’, or triangle algorithm-passing tiles if set to ‘triangle’. Default is not to filter on foreground at all.

  • tissueLevelThreshold (int or float, optional) – if defined, only includes the tile addresses of tiles with a 0 to 1 tissueLevel probability greater than or equal to the set value. Default is False.

Returns

List of tile addresses (tuples of integers) meeting the specified conditions

Return type

list

Example

suitable_tile_addresses = pathml_slide.suitableTileAddresses(tissueLevelThreshold=0.995, foregroundLevelThreshold=88)
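
The numeric filtering rules can be sketched as below. This is an illustration, not PathML's implementation: suitable_addresses is hypothetical, the 'tissueLevel' and 'foregroundLevel' keys are assumed names based on the descriptions above, and the ‘otsu’ and ‘triangle’ options are left out.

```python
def suitable_addresses(tile_dictionary, tissue_threshold=False, foreground_threshold=False):
    """Return addresses of tiles passing the numeric thresholds that are defined."""
    selected = []
    for address, tile in tile_dictionary.items():
        # False means "not defined"; `is not False` keeps 0 usable as a threshold.
        if tissue_threshold is not False and tile["tissueLevel"] < tissue_threshold:
            continue
        if foreground_threshold is not False and tile["foregroundLevel"] > foreground_threshold:
            continue
        selected.append(address)
    return selected

# Hypothetical tile dictionary with assumed key names.
tiles = {
    (0, 0): {"tissueLevel": 0.999, "foregroundLevel": 70},
    (1, 0): {"tissueLevel": 0.500, "foregroundLevel": 60},
    (2, 0): {"tissueLevel": 0.999, "foregroundLevel": 95},
}

filtered = suitable_addresses(tiles, tissue_threshold=0.995, foreground_threshold=88)
unfiltered = suitable_addresses(tiles)
print(filtered, unfiltered)
```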

thumbnail(self, level)
visualizeClassifierInference(self, classToVisualize, fileName=False, folder=os.getcwd(), level=4)

A function to create an inference map image of a Slide after running Slide.inferClassifier() on it. The resulting image is saved at the following path: folder/fileName/fileName_classification_of_classToVisualize.png

Parameters
  • classToVisualize (str) – the class to make an inference map image for. This class must be present in the tile dictionary from Slide.inferClassifier().

  • fileName (str, optional) – the name of the file where the classification inference map image will be saved, excluding an extension. Default is self.slideFileName

  • folder (str, optional) – the path to the directory where the map will be saved. Default is the current working directory.

  • level (int, optional) – the level of the WSI pyramid to make the inference map image from.

Example

pathml_slide.visualizeClassifierInference("metastasis", folder="path/to/folder")

visualizeForeground(self, foregroundLevelThreshold, fileName=False, folder=os.getcwd(), colors=['#04F900', '#0000FE'])

A function to create a map image of a Slide after running Slide.detectForeground() on it. The resulting image is saved at the following path: folder/fileName/fileName_foregroundLevelThreshold_thresholded_foregrounddetection.png

Parameters
  • foregroundLevelThreshold (str or int) – applies Otsu’s method to find the threshold if set to ‘otsu’, the triangle algorithm to find the threshold if set to ‘triangle’, or simply uses the tiles at or above the minimum darkness intensity threshold specified if set as an int (0 is a pure black tile, 100 is a pure white tile). Slide.detectForeground() must be run first.

  • fileName (str, optional) – the name of the file where the foreground map image will be saved, excluding an extension. Default is self.slideFileName

  • folder (str, optional) – the path to the directory where the map will be saved. Default is the current working directory.

  • colors (list, optional) – a list of length two containing the color for the background followed by the color for the foreground in the map image. Colors must be defined for use in matplotlib.imshow’s cmap argument. Default is a light green (#04F900) for background and a dark blue (#0000FE) for foreground.

Example

pathml_slide.visualizeForeground('otsu', folder='path/to/folder')

visualizeSegmenterInference(self, classToVisualize, probabilityThreshold=None, fileName=False, folder=os.getcwd(), level=4)

A function to create an inference map image of a Slide after running Slide.inferSegmenter() on it. Tiles are shown with the average of the probabilities of all their pixels. To get a pixel-level probability matrix, use Slide.getNonOverlappingSegmentationInferenceArray(). The resulting image is saved at the following path: /folder/fileName/fileName_segmentation_of_classToVisualize.png

Parameters
  • classToVisualize (str) – the class to make an inference map image for. This class must be present in the tile dictionary from Slide.inferSegmenter().

  • probabilityThreshold (float, optional) – before plotting the map, binarize the inference matrix’s predictions at this 0 to 1 probability threshold so that only pixels at or above the threshold will be considered positive for the class of interest, and the others negative. Default is to plot the raw values in the inference matrix without thresholding.

  • fileName (str, optional) – the name of the file where the segmentation inference map image will be saved, excluding an extension. Default is self.slideFileName

  • folder (str, optional) – the path to the directory where the map will be saved. Default is the current working directory.

  • level (int, optional) – the level of the WSI pyramid to make the inference map image from.

Example

pathml_slide.visualizeSegmenterInference('metastasis', folder='path/to/folder')

visualizeThumbnail(self, fileName=False, folder=False, level=4)

A function to create a low-resolution image of the WSI stored in a Slide.

Parameters
  • fileName (str, optional) – the name of the file where the thumbnail image will be saved, excluding an extension.

  • folder (str, optional) – the path to the directory where the thumbnail image will be saved; if it is not defined, then the thumbnail image will only be shown with matplotlib.pyplot and not saved.

  • level (int, optional) – the level of the WSI pyramid to make the thumbnail image from. Higher numbers will result in a lower resolution thumbnail. Default is 4.

Example

pathml_slide.visualizeThumbnail(folder='path/to/folder')

visualizeTissueDetection(self, fileName=False, folder=os.getcwd())

A function to generate a 3-color tissue detection map showing where on a WSI the deep tissue detector applied with Slide.detectTissue() found artifact (red), background (green), and tissue (blue). The resulting image is saved at the following path: folder/fileName/fileName_tissuedetection.png

Parameters
  • fileName (str, optional) – the name of the file where the deep tissue detection inference map image will be saved, excluding an extension. Default is self.slideFileName

  • folder (str, optional) – the path to the directory where the deep tissue detection inference map image will be saved. Default is the current working directory.

Example

pathml_slide.visualizeTissueDetection(folder="/path/where/tissue_detection_map_will_be_saved")