{ "cells": [ { "cell_type": "code", "execution_count": 1, "id": "13be5b08-e744-464f-9ab0-f71f391b743a", "metadata": { "ExecuteTime": { "end_time": "2026-03-23T13:55:53.297384Z", "start_time": "2026-03-23T13:55:53.285863Z" }, "editable": true, "slideshow": { "slide_type": "" }, "tags": [ "remove-input" ] }, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2\n", "%config InlineBackend.figure_format = 'retina'\n", "\n", "import warnings\n", "\n", "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "markdown", "id": "144130d7e5f8217d", "metadata": {}, "source": [ "# Accessors for WSIData\n", "\n", "The accessor is a concept that use attributes to extend the capabilities of a class.\n", "\n", "There are three in-built accessors in the WSIData class:\n", "\n", "- {py:class}`fetch `: Fetch information about the WSI\n", "- {py:class}`iter `: Iterate over the content of the WSI\n", "- {py:class}`ds `: Create deep learning datasets from the WSI" ] }, { "cell_type": "markdown", "id": "8ee0fdfeb325ce74", "metadata": {}, "source": [ "Here, we will load a WSI that have already been processed with tissue detection, tissue tiling, and feature extraction.\n", "\n", "In your case, you can easily run these steps with `LazySlide` package." ] }, { "cell_type": "code", "execution_count": 2, "id": "initial_id", "metadata": { "ExecuteTime": { "end_time": "2026-03-23T13:55:57.982132Z", "start_time": "2026-03-23T13:55:56.290842Z" } }, "outputs": [], "source": [ "from pathlib import Path\n", "from zipfile import ZipFile\n", "\n", "from huggingface_hub import hf_hub_download\n", "\n", "slide = hf_hub_download(\n", " \"RendeiroLab/LazySlide-data\", \"GTEX-1117F-0526.svs\", repo_type=\"dataset\"\n", ")\n", "slide_zarr_zip = hf_hub_download(\n", " \"RendeiroLab/LazySlide-data\", \"GTEX-1117F-0526.zarr.zip\", repo_type=\"dataset\"\n", ")\n", "if not Path(slide_zarr_zip.replace(\".zip\", \"\")).exists():\n", " with ZipFile(slide_zarr_zip, \"r\") as zip_ref:\n", " zip_ref.extractall(Path(slide).parent)" ] }, { "cell_type": "code", "execution_count": 4, "id": "a68c0dbc143b6fed", "metadata": { "ExecuteTime": { "end_time": "2026-03-23T13:56:13.832880Z", "start_time": "2026-03-23T13:56:13.467075Z" } }, "outputs": [], "source": [ "from wsidata import open_wsi\n", "\n", "wsi = open_wsi(slide)" ] }, { "cell_type": "code", "execution_count": 5, "id": "72ffd54c-4614-4993-8187-370f10a58957", "metadata": { "ExecuteTime": { "end_time": "2026-03-23T13:56:15.042649Z", "start_time": "2026-03-23T13:56:14.784268Z" } }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " \n", "
\n", " WSI: /Users/yzheng/.cache/huggingface/hub/datasets--RendeiroLab--LazySlide-data/snapshots/17ac84a3ea1ff81aa65d8751a1ed7aa40a6b624d/GTEX-1117F-0526.svs
\n", " Reader: openslide
\n", " Dimensions: 19958×19919 (h×w), 3 Pyramids
\n", " Pixel physical size: 0.49 MPP (20X)
\n", "
SpatialData object\n",
       "└── Images\n",
       "      └── 'wsi_thumbnail': DataArray[cyx] (3, 1996, 1992)\n",
       "with coordinate systems:\n",
       "    ▸ 'global', with elements:\n",
       "        wsi_thumbnail (Images)
\n", "
\n", "
" ], "text/plain": [ "WSI: /Users/yzheng/.cache/huggingface/hub/datasets--RendeiroLab--LazySlide-data/snapshots/17ac84a3ea1ff81aa65d8751a1ed7aa40a6b624d/GTEX-1117F-0526.svs\n", "Reader: openslide\n", "Dimensions: 19958×19919 (h×w), 3 Pyramids\n", "Pixel physical size: 0.4942 MPP\n", "SpatialData object\n", "└── Images\n", " └── 'wsi_thumbnail': DataArray[cyx] (3, 1996, 1992)\n", "with coordinate systems:\n", " ▸ 'global', with elements:\n", " wsi_thumbnail (Images)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "wsi" ] }, { "cell_type": "markdown", "id": "fa7a735d54850ff7", "metadata": {}, "source": [ "# Fetch accessor\n", "\n", "Fetch accessor allows you to fetch essential information from WSIData." ] }, { "cell_type": "markdown", "id": "569b7ef9-6927-49a2-8f6f-c1669bc10629", "metadata": {}, "source": [ "## 1. Pyramids information" ] }, { "cell_type": "code", "execution_count": null, "id": "f8f80f1adba130ff", "metadata": {}, "outputs": [], "source": [ "wsi.fetch.pyramids()" ] }, { "cell_type": "markdown", "id": "f53412ac-6aa6-4109-bed7-51232271342f", "metadata": {}, "source": [ "## 2. Retrive the features as AnnData" ] }, { "cell_type": "code", "execution_count": null, "id": "dcb0decfc03c354", "metadata": {}, "outputs": [], "source": [ "wsi.fetch.features_anndata(\"resnet50\")" ] }, { "cell_type": "markdown", "id": "40f4a95b0015ff04", "metadata": {}, "source": [ "# Iter accessor\n", "\n", "Like the name, the iter accessor will always return an iterator, and the iterator will always return data containers.\n", "\n", "The data container usually implements a plot method for inspection." ] }, { "cell_type": "markdown", "id": "1a688650-b3f2-4472-b8dd-bfb1cd89590e", "metadata": {}, "source": [ "## 1. Tissue contours" ] }, { "cell_type": "code", "execution_count": null, "id": "1309db0fc74f4d8e", "metadata": {}, "outputs": [], "source": [ "d = next(wsi.iter.tissue_contours(\"tissues\"))\n", "d" ] }, { "cell_type": "markdown", "id": "b7679189-99b3-44be-a7c6-3e7c9fb1ab0d", "metadata": {}, "source": [ "It's also possible to visualize what's inside." ] }, { "cell_type": "code", "execution_count": null, "id": "e5d130ea-06b3-44e6-a8c5-86ff64f07180", "metadata": {}, "outputs": [], "source": [ "d.plot()" ] }, { "cell_type": "markdown", "id": "1ce94f6a-64be-46e3-9a6d-4a0fd104de28", "metadata": {}, "source": [ "You can use a for loop to iterate every tissue" ] }, { "cell_type": "code", "execution_count": null, "id": "5d14ad5a-aefc-4b45-af22-b8cc3f4aede9", "metadata": {}, "outputs": [], "source": [ "for d in wsi.iter.tissue_contours(\"tissues\"):\n", " d.contour" ] }, { "cell_type": "markdown", "id": "c004d5f0-aaf5-46d2-8490-b25e3b18a1b7", "metadata": {}, "source": [ "## 2. Tissue images\n", "\n", "Iterate through tissue images" ] }, { "cell_type": "code", "execution_count": null, "id": "4f9df14532dc74a", "metadata": {}, "outputs": [], "source": [ "no_mask = next(wsi.iter.tissue_images(\"tissues\"))\n", "with_mask = next(wsi.iter.tissue_images(\"tissues\", mask_bg=True))" ] }, { "cell_type": "code", "execution_count": null, "id": "f5a4b218-2146-4dfd-9a3e-0913377ddb43", "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "\n", "_, (ax1, ax2) = plt.subplots(ncols=2)\n", "\n", "no_mask.plot(ax=ax1)\n", "with_mask.plot(ax=ax2)" ] }, { "cell_type": "markdown", "id": "07db4f30-ff41-41f4-a238-2ade911559be", "metadata": {}, "source": [ "## 3. Tile images" ] }, { "cell_type": "markdown", "id": "5ee0d5b8-b8c3-4ac3-bc6a-2e01853a5f09", "metadata": {}, "source": [ "You can also iterate over all tile images" ] }, { "cell_type": "code", "execution_count": null, "id": "7199fdf948a9291d", "metadata": {}, "outputs": [], "source": [ "d = next(wsi.iter.tile_images(\"tiles\"))\n", "d" ] }, { "cell_type": "markdown", "id": "b4f0dcf3-b1c9-4fca-b122-19f8f3f3c126", "metadata": {}, "source": [ "You can include pathological annotations, this is useful to prepare dataset for training segmentation model." ] }, { "cell_type": "code", "execution_count": null, "id": "f8f5c89e-5d43-4296-b082-33beeb36c8ba", "metadata": {}, "outputs": [], "source": [ "for d in wsi.iter.tile_images(\n", " \"tiles\", annot_key=\"annotations\", annot_names=\"name\", annot_labels={\"sclerosis\": 1}\n", "):\n", " if len(d.annot_shapes) > 1:\n", " d.plot()\n", " break" ] }, { "cell_type": "markdown", "id": "848023462fdb28e1", "metadata": {}, "source": [ "## Dataset accessor" ] }, { "cell_type": "code", "execution_count": null, "id": "7fa272a92f630eaa", "metadata": {}, "outputs": [], "source": [ "dataset = wsi.ds.tile_images()" ] }, { "cell_type": "markdown", "id": "53039756c91f51e4", "metadata": {}, "source": [ "The dataset is a torch dataset that can be used to train a deep learning model. You can load it in the DataLoader and train the model." ] }, { "cell_type": "code", "execution_count": null, "id": "9f26c7a5f16b1447", "metadata": {}, "outputs": [], "source": [ "from torch.utils.data import DataLoader\n", "\n", "dl = DataLoader(dataset, batch_size=36, shuffle=True)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.8" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }