Gerda Bortsova
Erasmus Medical Center
PhD candidate
E-mail: g.bortsova@erasmusmc.nl
Phone: +31-10-7038875
LinkedIn
Gerda received a Bachelor of Science degree in Information Systems from Kazakh-British Technical University (Almaty, Kazakhstan) in 2014 and a Master of Science degree in Informatics from the Technical University of Munich (TUM) in 2017. Her Master's program focused on computer vision, machine learning, and artificial intelligence. As part of her studies, she was involved in several projects on the development of novel machine learning algorithms for biomedical image analysis under the supervision of Dr. Tingying Peng and Prof. Dr. Nassir Navab (Computer Aided Medical Procedures, TUM) and Prof. Marleen de Bruijne (BIGR, Erasmus MC).
In April 2017, Gerda started her PhD at the Biomedical Imaging Group Rotterdam (BIGR) at Erasmus Medical Center. Her PhD project focuses on weakly labeled deep learning, with applications to medical image analysis.
2020
Gerda Bortsova, Daniel Bos, Florian Dubost, Meike W. Vernooij, M. Kamran Ikram, Gijs van Tulder, Marleen de Bruijne. Automated Assessment of Intracranial Carotid Artery Calcification in Non-Contrast CT Using Deep Learning. Journal article, forthcoming, 2020.
2019
Ruwan Tennakoon, Gerda Bortsova, Silas Ørting, Amirali K. Gostar, Mathilde M. W. Wille, Zaigham Saghir, Reza Hoseinnezhad, Marleen de Bruijne, Alireza Bab-Hadiashar. Classification of Volumetric Images Using Multi-Instance Learning and Extreme Value Theorem. IEEE Transactions on Medical Imaging, 39(4), pp. 854-865, 2019. doi: 10.1109/TMI.2019.2936244

Abstract: Volumetric imaging is an essential diagnostic tool for medical practitioners. The use of popular techniques such as convolutional neural networks (CNNs) for the analysis of volumetric images is constrained by the availability of detailed (locally annotated) training data and by GPU memory. In this paper, the volumetric image classification problem is posed as a multi-instance classification problem, and a novel method is proposed to adaptively select positive instances from positive bags during the training phase. The method uses extreme value theory to model the feature distribution of images without a pathology and uses that model to identify positive instances of an imaged pathology. Experimental results on three separate image classification tasks (classifying retinal OCT images according to the presence or absence of fluid build-ups, detecting emphysema in pulmonary 3D CT images, and detecting cancerous regions in 2D histopathology images) show that the proposed method produces classifiers with performance similar to fully supervised methods and achieves state-of-the-art performance in all examined test cases.
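To make the instance-selection idea concrete, below is a minimal Python sketch of EVT-based selection, a toy under stated assumptions rather than the authors' implementation: a generalized extreme value (GEV) distribution is fit to the maximum instance scores of pathology-free bags, and instances in positive bags exceeding a high quantile of that distribution are flagged as positives. The synthetic scores and the 0.95 quantile are illustrative choices.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)

# Synthetic instance-level scores standing in for a CNN's outputs.
# neg_bags: bags known to contain no pathology; pos_bags: weakly labeled positives.
neg_bags = [rng.normal(0.0, 1.0, size=50) for _ in range(200)]
pos_bags = [np.concatenate([rng.normal(0.0, 1.0, size=45),
                            rng.normal(3.0, 1.0, size=5)]) for _ in range(20)]

# Extreme value theory: the maximum score of a pathology-free bag is modeled
# with a GEV distribution fit on the negative-bag maxima.
neg_maxima = np.array([bag.max() for bag in neg_bags])
shape, loc, scale = genextreme.fit(neg_maxima)

# Instances whose score is implausibly high under the "no pathology" model
# are selected as positive instances of the imaged pathology.
threshold = genextreme.ppf(0.95, shape, loc=loc, scale=scale)
for i, bag in enumerate(pos_bags[:3]):
    selected = np.flatnonzero(bag > threshold)
    print(f"bag {i}: selected positive instances at indices {selected}")
```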
Gerda Bortsova, Florian Dubost, Laurens Hogeweg, Ioannis Katramados, Marleen de Bruijne. Semi-Supervised Medical Image Segmentation via Learning Consistency Under Transformations. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 810-818, Springer, Cham, 2019. doi: 10.1007/978-3-030-32226-7_90, arXiv: 1911.01218

Abstract: The scarcity of labeled data often limits the application of supervised deep learning techniques for medical image segmentation. This has motivated the development of semi-supervised techniques that learn from a mixture of labeled and unlabeled images. In this paper, we propose a novel semi-supervised method that, in addition to supervised learning on labeled training images, learns to predict segmentations that are consistent under a given class of transformations on both labeled and unlabeled images. More specifically, we explore learning equivariance to elastic deformations. We implement this through (1) a Siamese architecture with two identical branches, each of which receives a differently transformed image, and (2) a composite loss function with a supervised segmentation loss term and an unsupervised term that encourages segmentation consistency between the predictions of the two branches. We evaluate the method on a public dataset of chest radiographs with segmentations of anatomical structures using 5-fold cross-validation. The proposed method reaches significantly higher segmentation accuracy than supervised learning, owing to learning transformation consistency on both labeled and unlabeled images, with the latter contributing the most. We achieve performance comparable to state-of-the-art chest X-ray segmentation methods while using substantially fewer labeled images.
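The composite loss can be sketched in a few lines of PyTorch. This is a hypothetical simplification: `model` stands for any segmentation network, a horizontal flip stands in for the elastic deformations used in the paper, and only one branch receives a transformed image (in the actual method both branches do).

```python
import torch
import torch.nn.functional as F

def flip(x):
    """Stand-in spatial transform; the paper uses elastic deformations."""
    return torch.flip(x, dims=[-1])

def composite_loss(model, labeled, masks, unlabeled, lam=1.0):
    # Supervised segmentation term on labeled images.
    sup = F.binary_cross_entropy_with_logits(model(labeled), masks)

    # Unsupervised consistency term: segmenting a transformed image should
    # yield the transformed segmentation of the original image (equivariance).
    pred = torch.sigmoid(model(unlabeled))
    pred_of_transformed = torch.sigmoid(model(flip(unlabeled)))
    cons = F.mse_loss(pred_of_transformed, flip(pred))

    return sup + lam * cons
```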
2018
G. Bortsova, F. Dubost, S. Ørting, I. Katramados, L. Hogeweg, L. Thomsen, M. Wille, M. de Bruijne. Deep Learning from Label Proportions for Emphysema Quantification. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp. 768-776, Springer, Cham, 2018. doi: 10.1007/978-3-030-00934-2_85, arXiv: 1807.08601

Abstract: We propose an end-to-end deep learning method that learns to estimate emphysema extent from proportions of diseased tissue. These proportions were visually estimated by experts using a standard grading system in which grades correspond to intervals (for example, 1-5% of diseased tissue). The proposed architecture encodes the knowledge that the labels represent a volumetric proportion, and a custom loss is designed for learning with intervals. Thus, during training, our network learns to segment the diseased tissue such that its proportion fits the ground-truth interval. The architecture and loss combined improve performance substantially (8% ICC) compared to a more conventional regression network. We outperform traditional lung densitometry and two recently published methods for emphysema quantification by a large margin (at least 7% AUC and 15% ICC) and achieve near-human-level performance. Moreover, our method generates emphysema segmentations that predict the spatial distribution of emphysema at human level.
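The interval loss can be sketched as below. This is one plausible reading of the idea (the predicted diseased-tissue proportion is penalized only when it falls outside the ground-truth interval), not the authors' exact formulation, and all names are illustrative.

```python
import torch

def interval_loss(seg_logits, lower, upper):
    """Zero loss when the predicted proportion lies inside [lower, upper].

    seg_logits: (B, 1, D, H, W) voxel-wise logits from a segmentation network.
    lower, upper: (B,) interval bounds as fractions, e.g. 0.01 and 0.05 for "1-5%".
    """
    proportion = torch.sigmoid(seg_logits).flatten(1).mean(dim=1)
    below = torch.clamp(lower - proportion, min=0.0)  # predicted too little disease
    above = torch.clamp(proportion - upper, min=0.0)  # predicted too much disease
    return ((below + above) ** 2).mean()              # quadratic penalty outside
```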
2020
Suzanne C. Wetstein, Cristina González-Gonzalo, Gerda Bortsova, Bart Liefers, Florian Dubost, Ioannis Katramados, Laurens Hogeweg, Bram van Ginneken, Josien P. W. Pluim, Marleen de Bruijne, Clara I. Sánchez, Mitko Veta. Adversarial Attack Vulnerability of Medical Image Analysis Systems: Unexplored Factors. Conference paper, 2020. arXiv: 2006.06356

Abstract: Adversarial attacks are considered a potentially serious security threat for machine learning systems. Medical image analysis (MedIA) systems have recently been argued to be particularly vulnerable to adversarial attacks because of strong financial incentives. In this paper, we study several previously unexplored factors affecting the adversarial attack vulnerability of deep learning MedIA systems in three medical domains: ophthalmology, radiology, and pathology. First, we study the effect of varying the degree of adversarial perturbation on attack performance and its visual perceptibility. Second, we study how pre-training on a public dataset (ImageNet) affects the models' vulnerability to attacks. Third, we study the influence of data and model-architecture disparity between target and attacker models. Our experiments show that the degree of perturbation significantly affects both the performance and the human perceptibility of attacks. Pre-training may dramatically increase the transfer of adversarial examples: the larger the performance gain achieved by pre-training, the larger the transfer. Finally, disparity in data and/or model architecture between target and attacker models substantially decreases the success of attacks. We believe these factors should be considered when designing cybersecurity-critical MedIA systems and kept in mind when evaluating their vulnerability to adversarial attacks.
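The abstract does not spell out the attack used; a standard one-step attack such as the fast gradient sign method (FGSM) illustrates how the degree of perturbation enters. Below is a minimal PyTorch sketch, where `epsilon` controls both attack strength and visual perceptibility; it is not necessarily the paper's attack.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon):
    """One-step FGSM: perturb each pixel by epsilon along the loss gradient's sign."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adversarial = images + epsilon * images.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()  # keep pixels in a valid range
```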
C. González-Gonzalo, S. C. Wetstein, G. Bortsova, B. Liefers, B. van Ginneken, C. I. Sánchez. Are Adversarial Attacks an Actual Threat for Deep Learning Systems in Real-World Eye Disease Screening Settings? European Society of Retina Specialists (EURETINA), 2020. https://www.euretina.org/congress/amsterdam-2020/virtual-2020-freepapers/

Abstract:

Purpose: Deep learning (DL) systems that perform image-level classification with convolutional neural networks (CNNs) have been shown to provide high-performance solutions for automated screening of eye diseases. Nevertheless, adversarial attacks have recently been identified as a potential threat to such systems. This study assesses whether these attacks pose an actual threat in real-world screening settings, where there is restricted access to the systems and limited knowledge about certain factors, such as their CNN architecture or the data used for development.

Setting: Deep learning for automated screening of eye diseases.

Methods: We used the Kaggle dataset for diabetic retinopathy detection. It contains 88,702 manually labelled color fundus images, which we split into test (12%) and development (88%) sets. Development data were split into two equally sized sets (d1 and d2); a third set (d3) was generated using half of the images in d2. In each development set, 80%/20% of the images were used for training/validation. All splits were done randomly at patient level. As the attacked system, we developed a randomly initialized CNN based on the Inception-v3 architecture using d1. We performed the attacks (1) in a white-box (WB) setting, with full access to the attacked system to generate the adversarial images, and (2) in black-box (BB) settings, without access to the attacked system, using a surrogate system to craft the attacks. We simulated different BB settings, sequentially decreasing the available knowledge about the attacked system: same architecture, using d1 (BB-1); different architecture (randomly initialized DenseNet-121), using d1 (BB-2); same architecture, using d2 (BB-3); different architecture, using d2 (BB-4); different architecture, using d3 (BB-5). In each setting, adversarial images containing non-perceptible noise were generated by applying the fast gradient sign method to each image of the test set and were processed by the attacked system.

Results: The performance of the attacked system in detecting referable diabetic retinopathy, without attacks and under the different attack settings, was measured on the test set using the area under the receiver operating characteristic curve (AUC). Without attacks, the system achieved an AUC of 0.88. In each attack setting, the relative decrease in AUC with respect to the original performance was computed. In the WB setting, there was a 99.9% relative decrease in performance. In the BB-1 setting, the relative decrease in AUC was 67.3%; in BB-2, 40.2%; in BB-3, 37.9%; in BB-4, 34.1%. Lastly, in the BB-5 setting, performance decreased by only 3.8% relative to the original.

Conclusions: The results obtained in the different settings show a drastic decrease in the attacked DL system's vulnerability to adversarial attacks when access to and knowledge about it are limited. The impact on performance is greatly reduced when direct access to the system is restricted (from the WB to the BB-1 setting). The attacks become slightly less effective when the attacker lacks access to the same development data (BB-3) than when they lack the same CNN architecture (BB-2). The attacks' effectiveness decreases further when both factors are unknown (BB-4). If the amount of development data is additionally reduced (BB-5), the original performance barely deteriorates. This last setting is the most similar to realistic screening settings, since most systems are currently closed source and use additional large private datasets for development. In conclusion, these factors should be acknowledged in the future development of robust DL systems and considered when evaluating the vulnerability of currently available systems to adversarial attacks. Limited access to and knowledge about the systems determines the actual threat these attacks pose. We believe awareness of this matter will increase experts' trust and facilitate the integration of DL systems in real-world settings.
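The "relative decrease in AUC" reported above is straightforward to reproduce; here is a small helper, with hypothetical naming, using scikit-learn:

```python
from sklearn.metrics import roc_auc_score

def relative_auc_decrease(y_true, scores_clean, scores_attacked):
    """Percent drop in AUC under attack, relative to the clean AUC."""
    auc_clean = roc_auc_score(y_true, scores_clean)
    auc_attacked = roc_auc_score(y_true, scores_attacked)
    return 100.0 * (auc_clean - auc_attacked) / auc_clean
```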