A newly developed AI algorithm could improve the success rates of in vitro fertilization (IVF) treatment. The procedure has been helping people improve their reproductive odds since the first baby conceived through it was born in 1978.
While several advances in technology have improved the process, some aspects of IVF treatment remain time-consuming and relatively inaccurate. One of these is a process called “grading”.
The task typically requires an embryologist to examine embryos under a microscope, check their morphological features, and assign each a quality score. Embryos with round cells in even numbers score highly, while those with fractured and fragmented cells score poorly.
The highest-scoring embryos are implanted first. The process demands experience and can be inaccurate, since it relies purely on visual attributes. Accuracy at this stage can be improved by removing a cell from the embryo and testing it for abnormalities, a procedure known as preimplantation genetic screening.
This additional step makes the IVF process more expensive and time-consuming. So, until now, visual grading of embryos has remained the best option.
However, that may be about to change thanks to an algorithm that has learned to grade embryos better than its human counterparts. Researchers have trained a Google deep learning algorithm to classify IVF embryos as good, fair, or poor based on the likelihood that each one would successfully implant.
The algorithm training began back in 2011 when the embryology lab at Weill Cornell Medicine installed a time-lapse imaging system inside its embryo incubators. This meant that technicians could watch and record their embryos as they developed.
The resulting videos of anonymized embryos were then freeze-framed and fed into a neural network. Nikica Zaninovic (Director of the lab) teamed up with Olivier Elemento (Director of Cornell’s Englander Institute for Precision Medicine) to take the project to the next step.
The two researchers thought of using AI to automate a process that was notoriously inaccurate and time-consuming. To test their trained network, nicknamed STORK, they recruited five embryologists from clinics on three continents to grade 394 embryos based on images taken from different labs.
The recruited experts reached the same conclusion on only 89 embryos, or less than a quarter of the total. To get around this lack of agreement, the embryologists were then told to use a majority voting procedure: at least three of the five embryologists needed to agree to classify an embryo as good, fair, or poor.
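The majority voting rule described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function name `majority_grade`, the `threshold` parameter, and the example votes are all hypothetical.

```python
from collections import Counter

def majority_grade(votes, threshold=3):
    """Return the grade picked by at least `threshold` graders, else None.

    `votes` is a list of grades ("good", "fair", or "poor"), one per grader.
    With five graders and a threshold of three, at most one grade can win.
    """
    grade, count = Counter(votes).most_common(1)[0]
    return grade if count >= threshold else None

# Hypothetical votes from five embryologists (not data from the study).
print(majority_grade(["good", "good", "fair", "good", "poor"]))
print(majority_grade(["good", "good", "fair", "fair", "poor"]))
```

In the first call three graders agree on "good", so that grade is returned. In the second call no grade reaches three votes, so the function returns None.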
STORK, by contrast, looked at the same images graded by the embryologists and predicted the majority voting decision with 95.7 percent accuracy. Although more research is needed before STORK can be rolled out in clinics around the world, its initial results look promising and may eventually help improve IVF success rates.
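An accuracy figure of this kind is the share of embryos for which the model's grade matches the embryologists' majority grade. The Python sketch below shows that calculation on made-up labels. The function name `agreement_accuracy` is hypothetical, and skipping embryos that have no majority grade is an assumption, since the article does not say how such cases were handled.

```python
def agreement_accuracy(predicted, consensus):
    """Fraction of embryos where the predicted grade equals the consensus.

    Embryos whose consensus is None (no majority) are skipped; this is an
    assumption for illustration, not a detail taken from the study.
    """
    pairs = [(p, c) for p, c in zip(predicted, consensus) if c is not None]
    if not pairs:
        raise ValueError("no embryos with a consensus grade")
    matches = sum(1 for p, c in pairs if p == c)
    return matches / len(pairs)

# Made-up labels for four embryos (not data from the study).
model_grades = ["good", "fair", "poor", "good"]
majority_grades = ["good", "fair", "fair", "good"]
print(agreement_accuracy(model_grades, majority_grades))  # 0.75
```

Here the model matches the majority grade on three of four embryos, giving 0.75.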