Research Article

Is Artificial Intelligence the Most Reliable Way to Predict Mortality After Liver Transplantation?

Marcos Bruna Esteban1*, Eva Montalvá2, Antonio J. Serrano-López3, Joan Vila-Francés3, Javier Maupoey2 and Juan Vila J2
1Department of General and Digestive Surgery, General University Hospital of Valencia, Spain
2University and Polytechnic Hospital of La Fe, Spain
3Department of Data Analysis Laboratory, School of Engineering at the University of Valencia, Spain


*Corresponding author: Marcos Bruna Esteban, Department of General and Digestive Surgery, General University Hospital of Valencia, Spain


Published: 20 Sep, 2018
Cite this article as: Esteban MB, Montalvá E, Serrano-López AJ, Vila-Francés J, Maupoey J, Juan Vila J. Is Artificial Intelligence the Most Reliable Way to Predict Mortality After Liver Transplantation? Clin Surg. 2018; 3: 2122.

Abstract

Introduction: Graft allocation in Liver Transplantation (LT) should be based on the greatest survival benefit to the patients awaiting transplantation. This study developed a predictive model to determine recipient mortality 1 year after LT.
Materials and Methods: We developed Artificial Neural Network (ANN) and Logistic Regression (LR) models and compared their results with the Balance of Risk (BAR), Survival Outcomes Following Transplantation (SOFT), and Model for End-stage Liver Disease (MELD) scores. The Development Group used to create the predictive models included 1235 valid cases, while 200 consecutive transplant patients since January 2009 were included in the Generalization Group for internal validation.
Results: The area under the curve (AUC) of the ANN model (0.82) was higher than that of the LR model (0.68). For the Generalization Group, the MELD, SOFT, and BAR scores had AUCs of 0.56, 0.57, and 0.62, respectively. The ANN model had a significantly higher AUC than that of each score (MELD, p=0.005; SOFT, p=0.009; BAR, p=0.02).
Conclusion: The ANN model was superior to the LR model and to other scales currently used to predict mortality during the first year after LT and to match a particular graft with potential recipients.
Keywords: Artificial Intelligence; Liver Transplantation; Recipient Survival; Waitlist Management


Abbreviations

ANN: Artificial Neural Network; AUC: Area Under the Curve; BAR: Balance of Risk; DRI: Donor Risk Index; HBV: Hepatitis B Virus; HCV: Hepatitis C Virus; IQR: Interquartile Range; LR: Logistic Regression; LT: Liver Transplantation; MELD: Model for End-Stage Liver Disease; NPV: Negative Predictive Value; PPV: Positive Predictive Value; SOFT: Survival Outcomes Following Transplantation


Introduction

Liver Transplantation (LT) has improved markedly and recipient survival has increased progressively [1], with survival rates close to 90% 1 year after transplantation [2]. Although Spain has the highest organ donation rates in the world [3], the supply remains insufficient because of growing demand and the increasing number of patients on the waiting list. Therefore, it is necessary to construct predictive mortality and allocation models to increase the efficiency of this process. Being able to predict the expected survival with a specific graft is a very important step in selecting the best recipient in each case [4]. Although many risk scales and predictive models, based on large numbers of cases and years of experience and follow-up, have demonstrated relationships between different variables and liver recipient mortality, none of them constitutes a universal model able to predict the result. However, there is evidence that the accumulation of risk factors increases the likelihood of a worse post-transplant outcome [5]. Therefore, this study developed a predictive model to determine recipient mortality 1 year after LT, based on known preoperative donor, recipient, and graft variables. For this purpose, we developed Artificial Neural Network (ANN) and Logistic Regression (LR) models and compared their results with other predictive scores.


Figure 1
Flowchart of included cases in the Development group.

Materials and Methods

This was a retrospective, descriptive, analytical study of data collected prospectively from a cohort of 1235 adult hepatic recipients followed for the first year after transplantation. Using defined recipient, donor, and graft variables known before transplantation, this cohort was used to develop two predictive models of recipient [5] mortality 1 year after LT: one based on LR and the other based on an ANN. The study was approved by the local research ethics committee. To create the predictive models, we included in the Development Group all patients who underwent orthotopic LT in the Hepatobiliopancreatic Surgery and Transplantation Unit of the Hospital Universitario y Politécnico La Fe (Valencia, Spain) from November 1994 to December 2008. We excluded patients younger than 14 years; recipients of a partial liver graft or a graft from a non-heartbeating donor; cases of domino transplantation, combined transplantation, or retransplantation; and recipients followed for less than 1 year (Figure 1). We applied these predictive models to a population of 200 consecutive patients transplanted in the same unit since January 2009 (Generalization Group), using the same inclusion and exclusion criteria. We considered recipient mortality to be related to the procedure in the following situations: surgical complications, graft dysfunction, recurrence of the primary disease, or complications due to immunosuppressive therapy.
We used data for 29 variables to create the predictive models (Table 1). A univariate, descriptive, comparative analysis was used to compare the Development and Generalization Groups, using the t-test for independent samples for continuous variables and the chi-square test for qualitative variables. The statistical analysis and LR model were performed using SPSS® ver. 20.0 for Windows, including all variables that were selected in the study and using a sequential withdrawal procedure based on the likelihood ratio. MATLAB® ver. 2010 was used to develop the predictive model based on the ANN, consisting of a multilayer perceptron with forward connections and supervised learning. To maximize the information available to develop and compare predictive models, missing values in the Development Group were completed using the hot-deck pairing method [6], filling incomplete cases using the values for the most similar cases in our database. The predictive capacity of the model was assessed using the area under the receiver operating characteristic curve (AUC). Finally, the probability thresholds of the predictive models were determined by a multidisciplinary team to maximize their value and thereby improve allocation efficiency and transplantation results. For this purpose, utility values were assigned for each test result (Table 2), establishing the maximum value for the prediction of recipient mortality. All other values were agreed on in relation to the maximum.
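The hot-deck step can be illustrated with a minimal sketch. It assumes a nearest-neighbour variant in which each incomplete case borrows its missing values from the complete case closest to it on the variables both have. The distance measure, the field names, and the toy records are illustrative assumptions and are not taken from the study.

```python
import math

def hot_deck_impute(records):
    """Fill None entries in each record from the most similar complete record.

    Similarity is the Euclidean distance over the fields observed in the
    incomplete record (an assumption; the study does not state its metric).
    """
    donors = [r for r in records if all(v is not None for v in r.values())]
    if not donors:
        raise ValueError("hot-deck imputation needs at least one complete case")
    completed = []
    for rec in records:
        observed = [k for k, v in rec.items() if v is not None]
        # Pick the complete case closest to this record on the observed fields.
        donor = min(
            donors,
            key=lambda d: math.sqrt(sum((d[k] - rec[k]) ** 2 for k in observed)),
        )
        completed.append(
            {k: (v if v is not None else donor[k]) for k, v in rec.items()}
        )
    return completed

# Hypothetical toy data: donor age (years), recipient age (years), MELD score.
cases = [
    {"donor_age": 45, "recipient_age": 52, "meld": 14},
    {"donor_age": 70, "recipient_age": 60, "meld": 22},
    {"donor_age": 68, "recipient_age": 61, "meld": None},
]
filled = hot_deck_impute(cases)
print(filled[2])
```

In a real dataset the variables would normally be rescaled before computing distances, so that a variable measured in large units does not dominate the match.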
Once the LR and ANN models were created from the Development Group, we applied them to the Generalization Group and analyzed their predictive capacity. The results were evaluated by a comparative analysis of the AUCs using the Hanley and McNeil test [7]. Similarly, the Model for End-stage Liver Disease (MELD), Survival Outcomes Following Transplantation (SOFT), and Balance of Risk (BAR) [8,9] scores were applied to the Generalization Group and compared with the predictive models created.
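As a rough illustration of how two AUCs obtained from the same cases can be compared, the sketch below uses the Hanley and McNeil standard error of an AUC and the z statistic for correlated curves. In the original method the correlation r between the two AUCs is looked up from the correlations between the two scores in patients who died and in those who survived; here it is simply passed in as a parameter. The value r=0.3 in the usage lines is a placeholder, so the output is not expected to reproduce the p-values reported in this article.

```python
import math

def auc_standard_error(auc, n_pos, n_neg):
    """Hanley-McNeil standard error of an AUC.

    n_pos: number of cases with the event (here, deaths)
    n_neg: number of cases without the event (here, survivors)
    """
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)

def compare_correlated_aucs(auc1, auc2, n_pos, n_neg, r):
    """z statistic and two-sided p-value for two AUCs from the same cases.

    r is the correlation between the two AUC estimates (assumed known).
    """
    se1 = auc_standard_error(auc1, n_pos, n_neg)
    se2 = auc_standard_error(auc2, n_pos, n_neg)
    z = (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p

# AUCs and group sizes as reported (ANN vs. MELD, 19 deaths among 200 cases);
# r=0.3 is a hypothetical placeholder.
z, p = compare_correlated_aucs(0.82, 0.56, n_pos=19, n_neg=181, r=0.3)
print(f"z = {z:.2f}, p = {p:.4f}")
```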


Figure 2
Comparison of AUROCs for ANN vs. LR models in the Development group (a) and in the Generalization group (b). AUROCs for ANN and LR models and BAR, SOFT and MELD scores in the Generalization group (c).

Figure 3
Importance of the variables and AUC of ANN model when we remove each of them from the model.

Table 1
Descriptive and comparative analysis of the variables in the Development and Generalization groups.

Table 2
Utility values.

Table 3
Causes of recipient mortality related to the procedure during first year after LT in the Development and Generalization groups.

Results

From November 1994 to December 2008, 1435 transplants were performed at our center. After applying the exclusion criteria, 1235 valid cases were included in the Development Group (Figure 1). Since January 2009, 200 cases were included in the Generalization Group [7]. During the first year after transplantation, 12.1% of the recipients (150 patients) in the Development Group died, with a median survival of 11 (IQR 10-12) months. In the Generalization Group, 19 patients (9.5%) died, with a median survival of 11 (IQR 10-12) months. The main causes of mortality were infections and recurrence of the primary liver disease, and the incidence of each was similar in both groups (Table 3). Univariate analysis (Table 4) showed that the variables significantly (p<0.05) related to recipient mortality during the year after transplantation were donor age, recipient age, etiological diagnosis of cirrhosis, and the presence of nephropathy, hepatocellular carcinoma, and portal thrombosis in the recipient. The LR model was obtained after 22 steps (Additional Information 1). The AUC of the LR model was 0.72 (95% CI 0.68–0.76) in the Development Group (Figure 2), and an AUC of 0.68 (95% CI 0.54–0.82) was obtained when we applied this model to the Generalization Group (Figure 2). For the assigned utility values, the sensitivity, specificity, and accuracy were 42.1, 84.5, and 80.5%, respectively, with a Positive Predictive Value (PPV) of 22.2% and a Negative Predictive Value (NPV) of 93.2% (Additional Information 2). The ANN model was based on a fully connected neural network with 42 input neurons, 27 hidden neurons, and one output neuron, which gives the recipient's risk of mortality during the first year after LT. The analyzed variables had different levels of importance in the model, and Figure 3 shows the AUCs of the model after removing each of them.
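The shape of such a network can be sketched as a single forward pass through a 42-27-1 multilayer perceptron. Only the layer sizes come from the article. The tanh hidden activation, the sigmoid output, and the random untrained weights are assumptions made for illustration, and the article does not say how the 29 variables map onto 42 inputs (expanding categorical variables into indicator columns would be one possibility).

```python
import math
import random

N_INPUT, N_HIDDEN = 42, 27  # layer sizes reported in the article

def init_weights(seed=0):
    """Small random weights; a trained model would load fitted values instead."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUT)]
                for _ in range(N_HIDDEN)]
    b_hidden = [0.0] * N_HIDDEN
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out

def predict_risk(x, params):
    """Forward pass: 42 inputs -> 27 tanh units -> 1 sigmoid output in (0, 1)."""
    w_hidden, b_hidden, w_out, b_out = params
    if len(x) != N_INPUT:
        raise ValueError(f"expected {N_INPUT} inputs, got {len(x)}")
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))

params = init_weights()
example_input = [0.5] * N_INPUT  # hypothetical, already-scaled input vector
print(predict_risk(example_input, params))
```

The output is read as a predicted probability of death within the first year, which is then compared with a decision threshold.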
Based on the previously defined utility values, a threshold of 0.28 (0.2 for the LR model) was obtained to discriminate between positive and negative tests. We applied the model created in the Development Group and obtained an AUC of 0.81 (95% CI 0.77-0.85) (Figure 2). In the Generalization Group (Figure 2), we obtained an AUC of 0.82 (95% CI 0.68-0.96), and the sensitivity, specificity, and accuracy were 68.4, 86.1, and 84.5%, respectively, with a PPV of 34.2% and an NPV of 96.3% (Additional Information 3). When we compared the AUCs of the predictive models in both groups, the ANN model was superior to the LR model, with a statistically significant difference (p<0.001) in the Development Group (Figure 3).
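The relationship between these figures can be checked with a short calculation. The confusion-matrix counts below are not reported in the article. They are back-calculated, as an inference, from the 19 deaths among the 200 Generalization Group patients and the percentages given above for the ANN model.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic-test metrics from confusion-matrix counts.

    A positive test means the model predicts death within 1 year.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Inferred counts (assumption): 13 of the 19 deaths flagged, 156 of the
# 181 survivors correctly classified as low risk.
ann = diagnostic_metrics(tp=13, fn=6, fp=25, tn=156)
for name, value in ann.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these inferred counts every metric agrees with the reported value to within 0.1 percentage point. The specificity comes out at 156/181, about 86.2%, against the reported 86.1%, which would be consistent with the published figures being truncated rather than rounded.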
The MELD, SOFT, and BAR scores were applied in the Generalization Group, obtaining AUCs of 0.56 (95% CI 0.41-0.71), 0.57 (95% CI 0.42-0.71), and 0.62 (95% CI 0.48–0.75), respectively. The AUC of the ANN model was significantly higher than each of these: MELD, p=0.005; SOFT, p=0.009; BAR, p=0.02 (Figure 2).


Table 4
Univariate analysis.

Discussion

Artificial intelligence and models based on an ANN can give predictive results using large databases, such as a liver transplantation series. These models are inspired by the structure of the brain, and their plasticity allows them to detect complex and, in many cases, non-linear relationships among variables. The use of functional and hepatic status assessment scores such as MELD, SOFT, and BAR has helped to optimize waiting list prioritization and to decrease short- and long-term mortality after LT. Nevertheless, about 10% of liver recipients die within 1 year of the intervention. Therefore, it is necessary to continue searching for a method to make graft allocation to a specific recipient as efficient as possible [10,11]. In this study, ANN methodology was applied to build a model that proved to be superior at [9] predicting recipient mortality during the first year after transplantation in comparison with a model created with LR and other classic scores. The ANN model included 29 variables related to recipient mortality after transplantation. All of them are easily collected during the preoperative period. Only two of them are subjective variables: graft macroscopic steatosis and donor atherosclerosis. However, this subjectivity is reduced because multivisceral extraction is usually performed by highly experienced surgeons, and only five surgeons in our group perform the extraction and the macroscopic evaluation of the graft and aorta in the donor, systematically using specific protocols and scales validated by a multidisciplinary team.
The characteristics of the liver recipients in the Development Group were similar to those published in other studies and in the European Liver Transplantation Registry [12], with a median age of 55 years [13]. More than 60% of them were male and they had some cardiovascular risk factors, as in other series [14]. The main indications for transplantation were cirrhosis due to chronic Hepatitis C Virus (HCV) infection and alcoholic cirrhosis [15]. Overall, 40% were classified in Child group C, as in other groups [15]. On the other hand, the donor characteristics have changed over time. The donors are now older, with more associated diseases and more frequent deaths due to cerebrovascular injury [16]. Although the data used in these models were collected over a long period of time, future advances in treatments and changes in the indications and characteristics of these patients could alter the utility of these predictive models. Our recipient survival is close to 90% 1 year after transplantation, which is similar to published values [17]. Despite the fact that the quality of implanted grafts has [10] been worsening, the range of recipient survival has been maintained over time. The variable with the greatest prognostic relevance was the recipient's liver disease. The ages of the donor and recipient [18] and the causes of the donors' death were also important, as other studies have shown [19].
The predictive results of these models are clinically correct. However, their sensitivity is low compared with their high specificity (close to 90%) due to the assigned utility values, supported by the main objective of avoiding recipient deaths. Similarly, the positive predictive values are low in both models and the negative predictive values exceed 90%, to create a test that does not fail to predict recipient mortality. Consequently, when the test is positive, the best option is not to perform the transplantation in that recipient, and to assign the graft to another recipient with better expectations. We noted the superior predictive capacity of the ANN model versus the LR model and the currently used BAR, SOFT, or MELD scores. This better performance of ANN could be justified by the characteristics of the clinical data used and the limitations present in the LR and other models, which are based on a small number of variables, to simplify their applicability. In this way, an ANN can establish more complex relationships among variables and obtain more reliable results with large databases than models created with LR [20]. An ANN establishes interconnections among input variables that give a result that we can use as a risk predictor after the training and learning process.
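How a decision threshold can follow from utility values is sketched below. The utilities and the toy predictions are hypothetical, because the values in Table 2 are not reproduced in the text. The only property carried over from the article is that correctly predicting a death receives the maximum utility. The search keeps the threshold that gives the highest mean utility.

```python
def expected_utility(probs, died, threshold, utility):
    """Mean utility when cases with probability >= threshold test positive."""
    total = 0.0
    for p, d in zip(probs, died):
        positive = p >= threshold
        if positive and d:
            key = "tp"   # death correctly predicted
        elif positive and not d:
            key = "fp"   # survivor flagged as high risk
        elif not positive and d:
            key = "fn"   # death missed
        else:
            key = "tn"   # survivor correctly classified
        total += utility[key]
    return total / len(probs)

def best_threshold(probs, died, utility, candidates=None):
    """Return the first candidate threshold with the highest mean utility."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    return max(candidates,
               key=lambda t: expected_utility(probs, died, t, utility))

# Hypothetical utilities: predicting a death correctly is worth the most,
# missing a death the least.
utility = {"tp": 1.0, "tn": 0.8, "fp": 0.3, "fn": 0.0}

# Hypothetical predicted risks and observed 1-year outcomes (1 = died).
probs = [0.05, 0.10, 0.15, 0.30, 0.35, 0.60]
died = [0, 0, 0, 1, 0, 1]

threshold = best_threshold(probs, died, utility)
print(threshold)
```

Because a missed death is penalized more heavily than a false alarm, the chosen threshold stays low enough to flag both deaths in the toy data, at the cost of one false positive. This is the same trade-off that produces a high NPV and a low PPV.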
A few publications have used this mathematical model in LT. Briceño et al. [21] published a multicenter study with more than 1000 transplantations based on this methodology. A total of 57 variables for each donor-recipient pair were used to create 211 predictive models that predicted the probability of graft survival (AUC: 0.80) and graft loss (AUC: 0.82) 3 months after transplantation. The end point of our model is different, namely recipient survival 1 year after transplantation, and we obtain only a dichotomous result (dead or alive), with an accuracy similar to that of Briceño's model. Their model was also superior to the MELD, D-MELD, P-SOFT, BAR, and DRI scales [21]. Ibáñez et al. [22] developed LR and ANN models to predict transplant failure 3 months after transplantation. They selected 19 variables from donor, recipient, and operative data and obtained an AUC of 0.81 in the validation cohort with the ANN model, similar to our results. Our predictive model based on an ANN maintains the accuracy of these models 1 year after transplantation. To facilitate the use of our predictive model, we have developed an application available at http://emac.uv.es/liver (Figure 4). After all variables are entered, a predictive result is obtained. This result could facilitate decision-making when allocating available grafts, based on the expected mortality for each particular recipient.
The model based on the ANN was superior to the LR model and other scales currently used to predict mortality during the first year after LT and to match a particular donor with a potential recipient. However, multicenter validation of this model is necessary before it can be considered a valid tool for improving the efficiency of the management of LT wait lists [12].


Figure 4
Web application (http://emac.uv.es/liver/index_eng.php).

References

  1. Schrem H, Till N, Becker T, Bektas H, Manns MP, Strassburg CP, et al. Long-term results after liver transplantation. Chirurg. 2008;79:121-9.
  2. Spanish Registry from Hepatic Transplantation. Results Report. 2010.
  3. Donation and transplant statistics.
  4. Marino IR. Are we ready to match donor and recipient in liver transplantation? Liver Transpl. 2006;12(11):1574-6.
  5. Moore DE, Feurer ID, Speroff T, Gorden DL, Wright JK, Chari RS, et al. Impact of donor, technical, and recipient risk factors on survival and quality of life after liver transplantation. Arch Surg. 2005;140(3):273-7.
  6. Rivero G. In: Análisis de datos incompletos en Ciencias Sociales. Centro de Investigaciones Sociológicas. 2011;75-9.
  7. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 1983;148(3):839-43.
  8. Zaman MB, Hoti E, Qasim A, Maguire D, McCormick PA, Hegarty JE, et al. MELD score as a prognostic model for listing acute liver failure patients for liver transplantation. Transplant Proc. 2006;38(7):2097-8.
  9. Ma Y, Wang Q, Yang J, Yan L. Comparison of different scoring systems based on both donor and recipient characteristics for predicting outcome after living donor liver transplantation. PLoS One. 2015;10(9):e0136604.
  10. Kim WR, Therneau TM, Benson JT, Kremers WK, Charles B, Rosen CB, et al. Deaths on the liver transplant waiting list: an analysis of competing risks. Hepatology. 2006;43(2):345-51.
  11. Briceño J, Ciria R, De la Mata M. Donor-recipient matching: Myths and realities. J Hepatol. 2013;58(4):811-20.
  12. European Liver Transplant Registry. Data analysis booklet. Paris 2007. Accessed August 2008.
  13. Registro español de trasplante hepático. Organización Nacional de Trasplantes. Ministerio de Sanidad y Consumo. Octava memoria de resultados. 1984-2005.
  14. Bilbao I, Armadans L, Lazaro JL, Hidalgo E, Castells L, Margarit C. Predictive factors for early mortality following liver transplantation. Clin Transplant. 2003;17(5):401-11.
  15. Adam R, Hoti E. Liver Transplantation: The Current Situation. Semin Liver Dis. 2009;29(1):3-18.
  16. Bruna M, López R, Montalvá E, Redondo C, Moya A, San Juan F, et al. Donantes de ayer y hoy: ¿han cambiado las características de los donantes de hígado en los últimos 15 años? Cir Esp. 2008;83:192-6.
  17. US Scientific Registry (UNOS). Richmond (VA): United Network for Organ Sharing. Accessed February 12, 2006.
  18. Adam R, Cailliez V, Majno P, Karam V, McMaster P, Calne RY, et al. Normalised intrinsic mortality risk in liver transplantation: European Liver Transplant Registry study. Lancet. 2000;356(9230):621-7.
  19. Schrem H, Till N, Becker T, Bektas H, Manns MP, Strassburg CP. Long-term results after liver transplantation. Chirurg. 2008;79:121-9.
  20. Schumacher M, Roßner R, Vach W. Neural networks and logistic regression: Part I. Computational Statistics & Data Analysis. 1996;21:661-82.
  21. Briceño J, Cruz-Ramírez M, Prieto M, Navasa M, Ortiz de Urbina J, Orti R, et al. Use of artificial intelligence as an innovative donor-recipient matching model for liver transplantation: results from a multicenter Spanish study. J Hepatol. 2014;61(5):1020-8.
  22. Ibáñez V, Pareja E, Serrano AJ, Vila JJ, Pérez S, Martín JD, et al. Predicting early transplant failure: neural network versus logistic regression models. Open Transplant J. 2009;3:14-21.