Deep Learning Based Classification of Focal Liver Lesions with 3 and 4 Phase Contrast-Enhanced CT Protocols

Three-phase and four-phase contrast-enhanced computed tomography protocols are standard examinations for diagnosing liver tumors. Many patients also require periodic follow-up, which entails significant radiation exposure. Advances in image processing facilitate automated liver lesion segmentation. However, classifying small lesions remains challenging for doctors, especially when the liver contains different types of lesions with very little intensity difference. Deep learning can therefore be utilized for the classification of liver lesions. The present work introduces a CNN-based module for the classification of liver lesions. The module consists of four stages: data acquisition, preprocessing, modeling, and evaluation. The proposed system achieved an accuracy of 96 % and 97 % for the three-phase and four-phase protocols, respectively. Moreover, according to the dose report, the three-phase protocol outperforms the four-phase protocol in terms of dose, with only a 1 % loss of accuracy. This loss did not alter the multiclassification process. Thus, a three-phase protocol is recommended as a diagnostic tool for detecting focal liver lesions.


Introduction
Liver cancer is the second-leading cause of cancer-related mortality globally. Africa and Asia had the highest incidence rates (Ponnoprat et al., 2020). Hepatocellular carcinoma (HCC) is the most frequent kind of primary liver cancer. HCC incidence rates continue to rise in contrast to several other cancer forms (Ahn et al., 2021; Fahmy et al., 2022; Hamm et al., 2019). It is predicted that a total of 21.6 million additional cancer cases will be identified every year in developed countries by 2030. HCC is a worldwide problem, and local epidemiology data indicated regional differences. Egypt ranks third in HCC rates in Africa. HCC is the Egyptians' fourth-most common cancer. Over the past 10 years, the total number of HCC patients has doubled, and as a result, Egyptian health officials consider HCC to be the most serious health issue (El-Shqnqery et al., 2023).
HCC regions, surrounded by parenchyma, are difficult to detect with the naked eye. Computerized technologies could replace needle biopsy, which may potentially cause tumor spread (Brehar et al., 2020). It is essential to correctly classify focal liver lesions (FLLs) to determine choices for therapy and predict prognosis.
Because of their noninvasive nature, rapid scanning speed, and high-density resolution, three-phase and four-phase computed tomography with contrast (3P-CT-WC and 4P-CT-WC) remain the preferred methods for the exact categorization of FLLs (Cao et al., 2020). The only difference between them is the noncontrast phase. In both examinations, patients receive injections of contrast material. HCC lesions have low attenuation in the noncontrast stage, an early peak of enhancement in the arterial stage, and a continual reduction in attenuation in the portal venous and delayed stages. The accuracy of the scanning approaches varies according to the size of the tumors (Ponnoprat et al., 2020). When the lesion is large, the data retrieved from it is enough for a diagnosis. Conversely, when the lesion is relatively small, too little information may be extractable from the image for an accurate classification.
The major contributions of the research work are divided into segmentation and classification. The accuracy of segmenting the liver from CT images varies according to the technique and the data used (Amritha and Manimegalai, 2023; Diao et al., 2023; Ö et al., 2023). The Dice similarity measured for these techniques varies across studies. In Kwiatkowski and Dziubich (2023), after comparing four state-of-the-art (SOTA) models with UNet, the Dice similarity measure for the considered images was found to be 75 %; in Pattwakkar and Kamath (2023), the accuracy of segmentation reached 96.46 ± 0.48 %. In Heidari et al. (2023), the segmentation accuracy of the liver and tumor was 95.8 % and 89.3 %, respectively. In Anil and Dayananda (2023), Dice similarity averaged 0.89 in training and 0.86 in testing. In Nallasivan and Ramachandran, after decreasing the input data dimensionality, a CNN was used to extract features and detect liver cancer. In Anand et al. (2023), after semantic pixel-wise segmentation and applying a CNN, the classification performance did not differentiate clearly at the level of the segmented area. Saha Roy et al. (2023) classified identified liver tumors into three categories with an average accuracy of only 87.8 %. Midya et al. (2023), using a modified Inception v3 network-based classification model to classify HCC, ICC, CRLM, and benign tumors from only one phase of CT scans, achieved an overall accuracy of 96 %. Phan et al. (2023) used four phases to distinguish between cysts, hemangiomas, and HCC, achieving a classification accuracy of 95.1 %. High accuracy can also be achieved when differentiating between only two classes: 98.61 % and 98.2 % in Manjunath et al. (2023) and Wang et al. (2023), respectively. Using only a small dataset, Gedeon and Liu (2023) reported a multiclass tumor classification accuracy of 97 %.
In most of the research, authors used manual segmentation to verify their work, which may be error-prone. Besides, the final accuracy would not reach 100 %. After all, the classification accuracy depends on the accuracy of segmentation; therefore, it can also be called into question. Considering the variety and difficulty of liver masses, this study aimed to evaluate four-phase CT for solving the multiclassification problem. The input layer was designed to accept a variety of data inputs without depending on the accuracy of the segmentation. A new approach was developed in which classification relies only on pointing at the center of the region of interest (RoI). This classification was applied to small liver lesions to compare the three-phase and four-phase contrast CT protocols. Few studies have discussed the classification accuracy of data from the 3P-CT-WC and 4P-CT-WC protocols on small-sized lesions.
It was assumed that deep learning could solve the classification problem of small lesions using the output images from the two examination techniques, 3P-CT-WC and 4P-CT-WC. To that end, a deep CNN is constructed to compare the accuracy of the three-phase and four-phase CT protocols. This is done using a codeless platform called KNIME (KNIME, 2023) to differentiate small segments of only 10 mm diameter and to answer two questions: (a) Is it a lesion or normal tissue (NORM)? (b) Is it a malignant lesion (HCC) or another, benign lesion such as a cyst (CYST) or hemangioma (HEM)? The result is therefore four classes: NORM, HCC, HEM, and CYST. The importance of the data extracted from the 3P-CT-WC and 4P-CT-WC protocols is then discussed.

Data specs and number
Data were collected from the Egyptian Liver Hospital (ELH; Sherbin, Egypt) between January 2015 and December 2016. All patients were over the age of 30 years and were scanned on a Philips 16-slice Brilliance device with a 4P-CT-WC protocol. All had FLLs with a minimum size of about 10 mm in diameter, and the output images were manually diagnosed by at least two abdominal specialist radiologists as one of the following classes: HCC, CYST, and HEM. The examinations had previously been saved at full resolution in DICOM format on a hard drive for research. The clinical diagnosis reports were reviewed to exclude: (a) examinations that were not in a phase sequence, (b) examinations that did not have four phases, (c) examinations in which contrast material was not given on time, and (d) cases injected with LIPIODOL. Thus, four 512 × 512 pixel, one-layer, grayscale images were acquired for each case, representing the four anatomically sequential phases. The class labels take the numbers from 0 to 3, and 1548 images were collected across four classes: CYST, hemangioma, HCC, and normal. The four-phase scan has the following phases: noncontrast phase (N), arterial phase (A), portal phase (P), and delayed phase (D).

Computed tomography specs and the examination factors
The 4P-CT-WC examination was executed on a Philips Brilliance 16-slice scanner with one tube, with the following factors: 120 kV, 500 mA/slice, 3 mm thickness, and computed tomography dose index (CTDI) = 17.5 mGy. The dose length product follows from Eq. (1):

\[ \mathrm{DLP} = \mathrm{CTDI} \times \mathrm{SL} \tag{1} \]

where DLP is the dose length product in mGy × cm, CTDI is the computed tomography dose index in mGy, and SL is the scan length in cm.
According to case dose reports, the dose length product ranged from 350 to 850 mGy × cm. Fig. 1 shows a sample dose report, and the examination protocol follows the timeline shown in Fig. 2.
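The relation between DLP, CTDI, and scan length can be illustrated with a short calculation. This is a minimal sketch, assuming each reported DLP value is a single CTDI × SL product at the quoted CTDI of 17.5 mGy; the function names are hypothetical and the implied scan lengths are not figures taken from the dose reports.

```python
# Sketch of the dose length product relation: DLP = CTDI x SL.
CTDI_MGY = 17.5  # CTDI value quoted in the text (mGy)


def dose_length_product(ctdi_mgy: float, scan_length_cm: float) -> float:
    """Return DLP in mGy x cm for a given CTDI (mGy) and scan length (cm)."""
    return ctdi_mgy * scan_length_cm


def implied_scan_length(dlp_mgy_cm: float, ctdi_mgy: float) -> float:
    """Invert the relation to get the scan length (cm) implied by a DLP."""
    return dlp_mgy_cm / ctdi_mgy


if __name__ == "__main__":
    # Reported DLP range from the case dose reports: 350 to 850 mGy x cm.
    for dlp in (350.0, 850.0):
        length = implied_scan_length(dlp, CTDI_MGY)
        print(f"DLP {dlp:.0f} mGy x cm -> scan length {length:.1f} cm")
```

Under this assumption, the lower and upper ends of the reported DLP range correspond to scan lengths of about 20 cm and 49 cm.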

Synchronizing and centering data
The start point of the four stages, according to the device table coordinates, was fixed, and thus, it is easy to take images of the four stages of the lesion.
By testing images extracted from the CT workstation and taking many measurements, it was found that, without the zooming tool, a single pixel represents 1 mm of the lesion. After using the zooming tool, the smallest lesion in the data spans only 20 pixels. To avoid taking sample data outside the lesion and to capture the difference between the lesion and healthy cells, a RoI of only 14 pixels from the lesion and 14 pixels from healthy liver cells is taken. The result is only 28 × 28 pixels. In the absence of a lesion, 28 × 28 pixels are taken from healthy liver cells only. Based on the arterial phase, after pointing into the lesion manually, the Auto-Contour tool is used to draw a clear outline around the lesion. To integrate all the model inputs, a 200 % zoom is applied first. The essential step here is to catch the first appearance of the intravenous contrast material on the lesion surface line. The centering process therefore facilitates and best prepares the data to be compiled in the model. This can be done by putting the cursor pointer on the boundary between the lesion and the healthy cells of the liver.

Data augmentation and adaptation
A total of 1548 images were collected for the four phases; all images were rotated by 90 degrees to double the data and to avoid treating orientation as a feature in later learning. The final data set has 3096 images, with 774 images for each phase. The images were split into four files named pre, arterial, portal, and delayed. Images of each phase were numbered from 001 to 774. An Excel file was generated in which, as there are four classes, the appropriate diagnosis for each image was inserted in the cell next to its row as one of CYST, HCC, HEM, or NORM. The classes are later numbered as integers from 0 to 3.
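The rotation step can be sketched in a few lines of NumPy. This is an illustration only: the study used KNIME rather than Python, the rotation direction (counterclockwise here) is an assumption because the text does not state it, and the array below is blank stand-in data with a reduced image size.

```python
import numpy as np


def augment_with_rotation(images: np.ndarray) -> np.ndarray:
    """Append a 90-degree rotated copy of every image.

    images has shape (n, height, width). Square images are assumed so that
    the rotated copies keep the same shape as the originals.
    """
    rotated = np.rot90(images, k=1, axes=(1, 2))  # counterclockwise (assumed)
    return np.concatenate([images, rotated], axis=0)


# Stand-in data: 387 blank images for one phase (not real CT data).
phase_images = np.zeros((387, 32, 32), dtype=np.uint8)
augmented = augment_with_rotation(phase_images)
print(augmented.shape)  # (774, 32, 32)
```

With 387 cases per phase, the doubled set has 774 images per phase, which matches the counts given above.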

Modeling
At this stage, four different model configurations are created: the first includes all four phases, the second covers the different three-phase combinations, the third covers all possible two-phase combinations, and the fourth uses each phase separately. In each case of the previous model configuration, the required images

Processing data in the model
A free, codeless, open-source platform called KNIME is used to create and process the model, as shown in Figs. 4 and 5.

Reading, normalizing, and partitioning data
As the data are grayscale images, their three layers are identical; hence, only the first layer is read. The images are then placed in a list, with the image number next to each.
The value of any pixel in the image data ranges from 0 to 255. These values are converted to the range 0–1 by dividing by 255 and converting the data type to float, which makes the images better suited to the training process. Then, the data is divided into: (1) 70 % for learning.
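The scaling and the deterministic split can be sketched as follows. This is a hedged illustration, not the KNIME workflow itself: the helper names are hypothetical, the data are random stand-ins, and the "linear" partitioning is read here as an evenly spaced, non-random selection of rows, which is an assumption about the node's behavior.

```python
import numpy as np


def normalize(images: np.ndarray) -> np.ndarray:
    """Scale 8-bit pixel values from [0, 255] to floats in [0, 1]."""
    return images.astype(np.float32) / 255.0


def linear_sample_split(n_rows: int, fraction: float = 0.7):
    """Pick about `fraction` of the rows at evenly spaced positions.

    The selection is deterministic, so repeated runs of the same model see
    the same learning rows. Returns (selected indices, remaining indices).
    """
    k = int(n_rows * fraction)
    picked = np.unique(np.round(np.linspace(0, n_rows - 1, k)).astype(int))
    mask = np.zeros(n_rows, dtype=bool)
    mask[picked] = True
    return np.flatnonzero(mask), np.flatnonzero(~mask)


# Stand-in data: 774 random 28 x 28 8-bit images (not real CT data).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(774, 28, 28), dtype=np.uint8)
scaled = normalize(images)
learn_idx, rest_idx = linear_sample_split(len(scaled))
print(len(learn_idx), len(rest_idx))  # 541 233
```

Because no random shuffling is involved, the same rows are chosen on every run, which is what allows different models to be compared on identical data.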

Cropping region of interest
As the images were centered as previously stated, the RoI was taken using the cropping node of the KNIME platform, and the resulting cropped image is the 28 × 28 pixels at the center of the collected image. Fig. 6 shows the RoI selection process for different classes. Fig. 7 shows the RoI cropping process for different classes.
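The centered crop described above can be illustrated with a small NumPy helper. This is a sketch under the assumption that the RoI is already centered in a 512 × 512 image; `center_crop` is a hypothetical name and does not correspond to a KNIME node.

```python
import numpy as np


def center_crop(image: np.ndarray, size: int = 28) -> np.ndarray:
    """Return the size x size window at the center of a 2D image."""
    height, width = image.shape[:2]
    if height < size or width < size:
        raise ValueError("image is smaller than the requested crop")
    top = (height - size) // 2
    left = (width - size) // 2
    return image[top:top + size, left:left + size]


# Stand-in 512 x 512 image whose pixel values encode their own position.
slice_512 = np.arange(512 * 512, dtype=np.float32).reshape(512, 512)
roi = center_crop(slice_512)
print(roi.shape)  # (28, 28)
```

For a 512 × 512 input, the window starts at row and column 242, so the 28 × 28 RoI straddles the image center.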

CNN structure and component
For this purpose, a CNN model was constructed. This model consists of two main parts: (a) a feature extraction part with convolution and pooling layers, and (b) a classification part that predicts each category. First, three KERAS two-dimensional (2D) convolution layers are used, with the following specs: kernel size 3 × 3, activation function (RELU), and filter numbers 10, 20, and 30. A KERAS Max Pooling 2D layer with a pool size of 2 × 2 is used after each convolution layer. Second, for prediction, a flatten layer is used. According to the number of input lines (three or four phases), the flattened layers are concatenated to form the input for two dense layers: a KERAS dense layer with activation function (RELU) and an output of 100 neurons, and a second KERAS dense layer with activation function (SOFTMAX) and a four-neuron output. Then, the KERAS Network Learner node is used with ADAM as the optimizer and the built-in categorical cross-entropy loss function, because it is the standard loss for classification tasks (Makram et al., 2023; Nakata and Siina, 2023; Panduri and Rao, 2023). The CNN deep learning model is set as follows.

Model A
The deep learning network comprises four concurrent input lines of images. Each line carries one of the following imaging stages: N, A, P, and D. Each contains previously processed images, so they are 28 × 28 pixels in size.
Each line consists of multiple layers as follows (Fig. 8).
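To make the layer stack concrete, the feature-map sizes and parameter counts can be traced by hand. The sketch below assumes unpadded ("valid") convolutions with stride 1, a single-channel 28 × 28 input, and separate (unshared) weights for each phase line; none of these settings is stated in the text, so the numbers are illustrative rather than the authors' reported figures.

```python
def conv_out(size: int, kernel: int = 3) -> int:
    """Spatial size after an unpadded convolution with stride 1."""
    return size - kernel + 1


def pool_out(size: int, pool: int = 2) -> int:
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


def branch_summary(input_size: int = 28, filters=(10, 20, 30)):
    """Return (flattened feature count, parameter count) for one phase line."""
    size, channels, params = input_size, 1, 0
    for n_filters in filters:
        params += (3 * 3 * channels + 1) * n_filters  # kernel weights + biases
        size = pool_out(conv_out(size))
        channels = n_filters
    return size * size * channels, params


def model_params(n_phases: int, hidden: int = 100, n_classes: int = 4) -> int:
    """Total parameters for n_phases lines joined by concatenation."""
    features, branch_params = branch_summary()
    merged = n_phases * features
    dense = (merged + 1) * hidden + (hidden + 1) * n_classes
    return n_phases * branch_params + dense


print(branch_summary())  # (30, 7350)
print(model_params(4))   # 41904
print(model_params(3))   # 31554
```

Under these assumptions each 28 × 28 input shrinks to a 1 × 1 × 30 feature map before flattening, so the concatenation layer receives 120 features for four phases and 90 for three.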

Model B
The same as Model A, except that one phase is removed from the input layers: the N, D, A, and P phases are removed to create, respectively, Model 2 (A P D), Model 3 (N A P), Model 4 (N P D), and Model 5 (N A D).

Model C
Models 1–6 are created using only two parallel phases as input: (N A), (N P), (N D), (A P), (A D), and (P D).

Model D
Four models (A), (B), (C), and (D) are built with only one input phase.
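The phase subsets behind Models A to D can be enumerated mechanically. This is a small sketch using the phase letters N, A, P, and D from the text; the helper name is hypothetical.

```python
from itertools import combinations

PHASES = ("N", "A", "P", "D")  # noncontrast, arterial, portal, delayed


def phase_subsets(k: int):
    """List every combination of k phases, keeping the N, A, P, D order."""
    return [" ".join(combo) for combo in combinations(PHASES, k)]


for k in (4, 3, 2, 1):
    print(k, phase_subsets(k))
```

This yields one four-phase, four three-phase, six two-phase, and four single-phase input configurations, 15 in total.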

Running and evaluating model
After completing the models, random splitting of the data into learning and testing sets is avoided so that the same data are selected for learning and testing in every run of the same model, which makes different models easy to compare. The input data for each case is therefore arranged and then partitioned linearly by choosing the linear mode of the partitioning node in the KNIME platform. The four-phase model was used for comparison by testing some  watching the learning monitor, it was noticed that the epoch number needed to be increased to 150 to achieve above 90 % of the model fitting; the batch size used was 32. The ADADELTA optimizer was used in the previous model (Qurri and Almekkawy, 2023), and when the optimizer was changed to ADAM (Azar et al., 2023), the resulting accuracy was above 90 % with only 50 epochs. Initially, 32 filters and their multiples were used, and the training time for each model exceeded 20 min. With only 10 filters, accuracy above 90 % was achieved after 5 min of training. Models C and D gave highly variable accuracy, ranging from 10 to 70 %, under the same specs used for models A and B, so they are not considered in the final result.

Results
A total of 387 cases, aged from 30 to 70 years, were split into four classes: 65 CYST, 114 HCC, 94 hemangiomas, and 114 normal; the first three categories had a FLL with a minimum size of 10 mm. After settling on the final form of the models, the data is entered into the five models for classification. The result varies according to the stochastic nature of the process and the basic model structure, so each model is executed five times. For the final form of the model, the training performance in terms of accuracy, loss curves, and confusion matrix is shown in Figs. 9–13.
According to the previous data, when using 70 epochs for learning, the best accuracy reached was 97.854 % in the four-phase model, with the three-phase (APD) model a close second at 96.137 %. In this situation, the four-phase and three-phase models are both trained to a suitable fit. When using only 55 epochs, the three-phase (APD) model achieves an adequate fit, while the four-phase model accuracy goes down. So, the two cases with 55 and 70 epochs are considered. Figs. 14 and 15 show the learning, loss function, and confusion matrix when using only 55 epochs.
A box plot in Fig. 16 shows the average accuracy score resulting from the five models using 55 and 70 epochs, after running the model five times separately.
In this paper, three-phase and four-phase CT protocols are compared in terms of the classification accuracy for different liver lesions. The aim is to highlight the benefit of each phase and to make an accurate classification using the minimum data, and hence the minimum dose, which is valuable in the early detection of different diseases. The box plot in Fig. 17 compares only the accuracy scores of the three-phase (APD) and four-phase models. With 55 epochs, the three-phase (APD) model learns faster, reaches the fit condition sooner, and achieves slightly higher accuracy than the four-phase model. With 70 epochs, by contrast, both models reach the fit condition, and the four-phase model achieves slightly higher accuracy than the three-phase (APD) model.
Table 1. A brief analysis of all seven models, listed using the upper and lower bounds.
A comparison of all seven models is made in Table 1, with upper and lower bounds given for each model. It shows that the CNN models using 70 epochs with four phases and with three phases (APD) were the best. For the lower bound, they demonstrated a precision of 0.95 and 0.95, a sensitivity of 0.97 and 0.92, a specificity of 0.98 and 0.98, and an accuracy of 0.97 and 0.96, respectively, which are very similar. The true positive and true negative rates approached 100 % for the lower bound, as shown in Fig. 18. Looking more closely at the classification process, the classification error for any model, such as that in Fig. 19, was found to be due to the unequal distribution of information among the four phases; if any of the APD phases is changed, the model may give misleading results.
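Precision, sensitivity, specificity, and accuracy can all be derived from a confusion matrix. The sketch below uses a one-vs-rest reading for each class; the text does not state how the reported values were computed or averaged, so this is one common convention rather than the authors' procedure, and the matrix is made-up illustrative data, not the study's results.

```python
CLASSES = ("CYST", "HCC", "HEM", "NORM")


def one_vs_rest_metrics(matrix, index):
    """Metrics for one class; rows are true labels, columns are predictions."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[index][index]
    fn = sum(matrix[index]) - tp
    fp = sum(row[index] for row in matrix) - tp
    tn = total - tp - fn - fp
    return {
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
    }


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


# Hypothetical confusion matrix for illustration only.
example = [
    [18, 1, 1, 0],
    [0, 33, 1, 0],
    [1, 1, 26, 0],
    [0, 0, 0, 34],
]

for i, name in enumerate(CLASSES):
    scores = one_vs_rest_metrics(example, i)
    print(name, {key: round(value, 3) for key, value in scores.items()})
print("overall accuracy:", round(overall_accuracy(example), 3))
```

Reporting the per-class values alongside the overall accuracy shows which lesion types drive the errors, which the single accuracy figure alone does not reveal.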

Discussion
In this study, a CNN was designed using different specifications and settings to determine the best output for the workflow. First, consider the number of filters. The results show that using only a few filters (10, 20, and 30) reduces the processing time and can lead to the same classification accuracy as using 16, 32, and 64. Second, it was found that the ADAM optimizer lets the network reach the learning fit faster than the ADADELTA optimizer, with only 55–70 epochs compared to 150 epochs.
In the produced model, it would not be a fair comparison to let training reach the fitting condition with an unequal amount of information in the classification process. Two conditions are therefore compared, one at the first fit (using 55 epochs) and the other with all data reaching the fit condition (using 70 epochs); this comparison indicates the weight of the input data for each model.
As a result, the APD model has a very acceptable accuracy, sometimes higher than that of the four-phase model, so the APD phases can be used to classify the four different classes: CYST, HCC, HEM, and NORM. This reduces the total dose the patient is exposed to by about 25 %. Fig. 1 shows an example of a dose report for the four-phase exam.
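The figure of about 25 % follows directly if every phase is assumed to deliver the same dose. That equal-dose assumption is made here for illustration and is not taken from the dose reports; the function name and the relative dose units are hypothetical.

```python
def dose_saving(phase_doses: dict, dropped: str) -> float:
    """Fraction of the total dose saved by omitting one phase."""
    total = sum(phase_doses.values())
    return phase_doses[dropped] / total


# Assumed equal per-phase dose, in arbitrary relative units.
equal_doses = {"N": 1.0, "A": 1.0, "P": 1.0, "D": 1.0}
print(dose_saving(equal_doses, "N"))  # 0.25
```

If the phases in a given protocol deliver unequal doses, the same function gives the actual saving once per-phase values are supplied.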
In the produced model, taking only sample data from the lesion image replaces the invasive process of taking a biopsy from the liver, which can lead to serious side effects. With this model, the phases that carry little information for separating different lesions can be removed to accelerate the examination. This method can classify tiny lesions, which are very difficult to diagnose by eye, with high accuracy. It can be applied to other lesion classifications, especially when data are scarce. The idea of simulating a biopsy can be generalized to all cases that require taking a sample for analysis. The method is well suited to multiclass classification problems, answering two questions: first, is it an actual lesion or not, and second, what type of lesion is it? It can assist radiologists with fast lesion classification and help them make final diagnoses. The model is also adequate for cases whose images simultaneously contain different types of lesions. Many patients need periodic checks, so this model will benefit them by reducing the overall CT dose to which they are exposed. The RoI selection in this model allows the classification process to disregard unmarked regions in decision-making. There is no need to segment all the lesions, so the classification does not depend on segmentation accuracy; it requires only manually pointing at the area of the image that needs to be classified. In the future, this method should be applied to cancers in other organs.

Conclusion
In this paper, a deep CNN model is produced for the accurate classification of different liver lesions using the minimum data. The novelty of this study is that its accuracy is segmentation-independent and that it has a high potential for liver disease differentiation. It is also very useful for classifying multiple different lesion types. A comparison between three-phase and four-phase CT protocols is then made in terms of classification accuracy. By highlighting the benefits of each phase, the total dose can be minimized. Compared with state-of-the-art research, a very acceptable classification accuracy of about 96 % is obtained with the APD model of the 3P-CT-WC protocol, which is very close to the 97 % of the 4P-CT-WC protocol and sometimes greater than the four-phase result under the same conditions. It is therefore recommended to use just the APD phases to identify the four separate classes (CYST, HCC, HEM, and NORM).

Fig. 3. Four-step sequence of collecting the images and specifying the RoI: (a) standard image; (b) drawing a clear outline around the lesion; (c) zooming by 200 %; and (d) selection of the RoI.

Fig. 5. Nodes used to prepare input image data in the preprocessing stage.

Fig. 8. CNN model structure for only one phase; the other images of the three-phase and four-phase protocols can be added, after being flattened, to the concatenation layer.

Fig. 16. The average accuracy score for the five models using 55 and 70 epochs.