Deep learning is a subset of machine learning based on artificial neural networks. It uses representation learning to model high-level abstractions in data through a deep graph of multiple linear and nonlinear processing layers. Deep learning architectures include deep neural networks (DNN), deep belief networks (DBN), recurrent neural networks (RNN), convolutional neural networks (CNN), restricted Boltzmann machines (RBM), and autoencoder sparse coding (ASC)24, all of which have an exceptional ability to learn varied patterns in data. The DNN (also known as dense structural learning) is an artificial neural network (ANN) with multiple layers between the input and output layers. It finds the mathematical transformation that turns the input into the output, whether the relationship is linear or non-linear25. The network passes the input through its layers and calculates the probability of each output. The user can review the results, select which probabilities the network should display, and return the proposed label. Each mathematical transformation is considered a layer, and a complex DNN has many layers, hence the name ‘deep’ networks. Deep architectures include many variants of a few basic approaches, each of which has found success in specific domains24.

DNN is a deep learning approach used for highly accurate classification or prediction based on features extracted from input data (the basic or primary dataset). Adding neural layers increases the learning depth of the network and, with it, can increase the accuracy of the analysis. The input data feed the first layer of the DNN as a data matrix in which each element holds a specific feature value. The input layer is followed by further layers, each composed of units that extract different features from the input data. The output layer gives the classified or predicted values for the input data, and the middle layers are the calculation layers of the DNN. Combining these layers in sequence extracts the desired features and thereby classifies the input data into the desired classes.
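The layered computation described above can be illustrated with a minimal sketch. It assumes a fully connected network with ReLU activations in the middle layers and a softmax output layer; the layer sizes, weight values, and function names (`dense`, `relu`, `softmax`, `forward`) are hypothetical and are not taken from the model used in this study.

```python
import math

def dense(x, weights, biases):
    """One fully connected layer: y_j = sum_i x_i * weights[i][j] + biases[j]."""
    return [sum(xi * row[j] for xi, row in zip(x, weights)) + biases[j]
            for j in range(len(biases))]

def relu(values):
    """Nonlinear activation used here for the middle (calculation) layers."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn output-layer scores into probabilities that sum to 1."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, layers):
    """Pass x through each (weights, biases) pair; softmax on the last layer."""
    for index, (weights, biases) in enumerate(layers):
        x = dense(x, weights, biases)
        x = softmax(x) if index == len(layers) - 1 else relu(x)
    return x

# Hypothetical toy network: 3 input features -> 2 hidden units -> 2 classes.
layers = [
    ([[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1]], [0.0, 0.1]),
    ([[1.0, -1.0], [-0.5, 0.5]], [0.0, 0.0]),
]
probs = forward([1.0, 2.0, 3.0], layers)
label = probs.index(max(probs))  # class with the highest probability
```

Here `probs` holds one probability per class, and `label` is the class the network would propose for this input.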

Study location

Iran is one of the countries most sensitive to the COVID-19 outbreak, which spread very quickly through most parts of the country26,27. The growing number of COVID-19 infections in Iran led to an unpredictable rise in the production of pollution-prone waste. In addition, the lack of suitable landfill sites gave this pollution a social and trans-social dimension. As COVID-19 spread and the number of infected patients grew, the volume of pandemic-related waste increased rapidly. To determine the impacts of the COVID-19 outbreak on the environment and to solve problems in the waste management sector during the pandemic, appropriate information is needed about the state of solid waste in Iran’s metropolises. Because the Ministry of Health and Medical Education of Iran (MHME) publishes official statistics only on infected cases, recovered cases, and mortality, it was necessary to contact the municipalities to assess the current state of solid waste management in metropolitan areas. Extensive field studies were therefore conducted to find the relationship between the number of infected cases and the amount of plastic waste generated in metropolitan areas, in both households and hospital wards, and the corresponding graphs were prepared.

Data resources and preparation

To implement the proposed DNN-based model, the basic or primary dataset must first be prepared. This dataset is used to train and test the DNN in order to reach the prediction goal. The dataset was prepared from 8 Iranian metropolises: Tehran, Mashhad, Esfahan, Karaj, Shiraz, Tabriz, Qum, and Ahvaz. Data on infected cases were gathered per day in these metropolises from February 27, 2020, to October 10, 2021, based on updates from the website “Worldometer.” These data mostly reflect the spread of COVID-19 in the cities. A field survey of household and hospital plastic waste in each city was used to refine the primary dataset. During the field survey of these megacities, basic information was gathered from hospitals, healthcare centers, and city landfills regarding the volume and type of SUPs and PPEs. Table 2 provides information about the data resources that were used to enrich the dataset. The resulting database categorizes infection cases, PPEs, SUPs, test kits, and used medical package volumes over time in order to investigate pandemic plastic pollution in Iran. All data are organized in rows and columns for each city separately.

Table 2 The information about dataset preparation regarding the field survey.

After the main dataset was prepared, it was divided into training and testing sets (80% and 20% of the data, respectively). The training set was used to train the DNN model, and the test set was used to evaluate the performance and accuracy of the proposed model.
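A random 80/20 split of this kind can be sketched as follows. This is an illustration only: the record layout, the fixed seed, and the helper name `train_test_split` are assumptions made for the example and do not describe the study's actual data or tooling.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and return (train, test) lists."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical records: (day index, infected cases, plastic waste volume).
records = [(day, 100 + 3 * day, 5.0 + 0.1 * day) for day in range(50)]
train, test = train_test_split(records)
```

With 50 records, 40 end up in `train` and 10 in `test`, and every record appears in exactly one of the two sets.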

The number of newly infected cases in Iran’s metropolises from the start of the first wave on February 27, 2020, to October 10, 2021, was plotted using Microsoft Excel (Fig. 1). As shown in Fig. 1, the number of infected cases was low in all eight metropolises at the beginning of the Coronavirus outbreak, but because the virus spreads quickly and its behavior was poorly understood, it spread rapidly through all the cities. This led to the first wave of COVID-19 in Iran, which peaked on March 30, 2020. After the first wave, the government's preventive measures, growing public awareness, and observance of health protocols brought a decrease in the number of infected cases in Iran. However, with the reopening of businesses and poor observance of health protocols, the second wave began on May 16, 2020, and the number of infected cases peaked on June 4, 2020. The third wave of COVID-19 was related to the onset of autumn and the cooling of the weather. In this wave, the number of infected cases increased exponentially, and more patients needed hospitalization and intensive care. The fourth wave of COVID-19 coincided with the emergence of the new coronavirus mutation known as the British Variant and lasted in Iran until June 4, 2021. Because of the spread of the Delta Variant, Iran is currently in the fifth wave of COVID-19. The number of infected cases in this wave reached a record 50,228 cases on October 10, 2021.

Figure 1

The number of new daily infected cases of COVID-19 in Iran’s metropolises.

Field survey and ground investigations

Iran is one of the countries with a high prevalence of COVID-19. As of October 10, 2021, there had been 5,754,047 confirmed cases and 123,498 deaths in Iran, which ranked the country 8th in the world (Worldometer website). Along with other issues such as non-compliance with health protocols, failure to implement social distancing, and unsafe travel during the COVID-19 pandemic, improper handling of corona waste in developing countries, including Iran, increases the possibility of Coronavirus propagation. Iran, with more than 85 million people, generates over 18 million tonnes of municipal solid waste (MSW) annually. Only 8% of MSW is recycled within the legal framework because of the poor separation programs implemented across the country. As a result, hazardous household waste, including medical waste, is mixed with general household waste and can cause health and environmental problems28. The present study therefore carried out an extensive field survey of the main hospitals and municipal waste management units in the megacities to collect the information used in the primary dataset. After pre-processing, this information is used in the prediction process. During the pre-processing stage, data not relevant to the pandemic, such as the volumes of non-organic waste, food waste, and metals, were removed from the database. The main focus was on pandemic plastics such as PPEs, SUPs, and medical wastes that are potentially prone to pollution.

DNN model implementation

After the primary dataset was prepared as a basic database of the COVID-19 spread and pandemic plastic usage in various megacities in Iran, it was randomly divided into testing and training sets. In the next stage, the model was trained and tested with attention to the learning rate. Alongside the test/train ratio, the learning rate is an important setting: it determines how much the model weights change in response to the estimated error each time they are updated. In effect, the learning rate controls how quickly the model adapts to the problem. Lower learning rates require more training epochs because smaller changes are made to the weights at each update, whereas larger learning rates result in rapid changes and require fewer training epochs. Specifically, the learning rate is a configurable hyperparameter used in the training of neural networks that has a small positive value, often in the range between 0.0 and 1.0. In this study, the learning rate was set through the optimizer to 0.01 with no momentum and was scheduled via callbacks in Keras. The DNN model was then run for 700 iterations (epochs) using the training and validation datasets.
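The trade-off between learning rate and number of epochs can be shown with a small sketch. It applies plain gradient descent (no momentum) to a hypothetical one-parameter quadratic loss rather than to the DNN itself; the loss function, the tolerance, and the function name `epochs_to_converge` are assumptions made for illustration, and only the values 0.01 and 700 come from the text.

```python
def epochs_to_converge(learning_rate, target=3.0, tol=1e-3, max_epochs=700):
    """Gradient descent on loss(w) = (w - target)**2, starting from w = 0.

    Returns the first epoch at which w is within tol of the target,
    or max_epochs if that never happens.
    """
    w = 0.0
    for epoch in range(1, max_epochs + 1):
        gradient = 2.0 * (w - target)   # d/dw of (w - target)**2
        w -= learning_rate * gradient   # weight update scaled by the rate
        if abs(w - target) < tol:
            return epoch
    return max_epochs

slow = epochs_to_converge(0.01)  # small steps, many epochs
fast = epochs_to_converge(0.1)   # larger steps, fewer epochs
```

On this toy problem the run with a learning rate of 0.01 needs roughly ten times as many epochs as the run with 0.1, while both finish within the 700-epoch budget.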

The database was randomly divided into testing and training sets, which cover 20% and 80% of the primary database, respectively. Figure 2 illustrates the processing flowchart of the DNN model implementation. The DNN-based predictive model is used to forecast the risk of pandemic plastic pollution in future events.

Figure 2

The processing flowchart of the DNN predictive model.

Performance evaluations

The performance of the proposed methodology was evaluated using both the confusion matrix and statistical error estimators such as mean squared error (MSE), root mean square error (RMSE), and mean absolute percentage error (MAPE). The performance matrix is a table that visualizes the performance of a prediction algorithm by comparing its predicted values with the actual values, and it yields the sensitivity, specificity, and 1-specificity parameters. For classification tasks, the terms true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) compare the results of the classifier in question with trusted external judgments25. Precision (TP/[TP + FP]), also called the positive predictive value, is the fraction of retrieved instances that are relevant. Recall (TP/[TP + FN]), also called sensitivity, is the fraction of relevant instances that were retrieved.

Both precision and recall, therefore, are based on measures of relevance29. Accuracy can be a misleading metric for imbalanced datasets. For example, for a prediction set with 90 positive and 10 negative values, classifying all values as positive gives a 0.90 (90%) accuracy score even though every negative value is misclassified. The F1-score (F1 = 2 × [precision × recall]/[precision + recall]) is the harmonic mean of precision and recall, and it is approximately their average when the two values are close.

The overall accuracy represents the probability that an individual will be correctly classified by a test; that is, the sum of TP and TN divided by the total number of individuals tested. The application of the performance matrix helps to characterize the trustworthiness of the classifiers in question24.
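The classification measures defined above follow directly from the four confusion-matrix counts. The sketch below computes them for a hypothetical binary result; the counts and the function name `classification_metrics` are illustrative assumptions, not values from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, specificity, F1, and accuracy from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }

# Hypothetical counts for 100 tested samples.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)

# Imbalanced case: 90 positives, 10 negatives, everything predicted positive.
all_positive = classification_metrics(tp=90, tn=0, fp=10, fn=0)
```

In the imbalanced case the accuracy is 0.90 while the specificity is 0, which is why accuracy alone is not reported.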

To complement the confusion matrix, the mean squared error (MSE), root mean square error (RMSE), and mean absolute percentage error (MAPE) were used to measure the model accuracy. In statistics, MSE is the average of the squared errors between the estimated values and the actual values, RMSE is the square root of the MSE, and MAPE is the average absolute error expressed as a percentage of the actual values. In machine learning, these errors represent the empirical risk, that is, the average loss on an observed dataset, which indicates the accuracy of the predictive model.
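The three error estimators can be written out in a few lines. The sketch below uses their standard textbook definitions; the sample values and function names are hypothetical, and `mape` assumes that no actual value is zero.

```python
import math

def mse(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Square root of the MSE, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute error as a percentage of the actual values.

    Assumes every actual value is non-zero.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)

# Hypothetical observed and predicted waste volumes.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]

errors = {
    "mse": mse(actual, predicted),
    "rmse": rmse(actual, predicted),
    "mape": mape(actual, predicted),
}
```

Lower values of all three estimators indicate predictions that lie closer to the observations.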

Verifications

Common machine learning classifiers were used to verify the applied DNN model. For this purpose, the k-nearest neighbors (k-NN), decision tree (DT), random forest (RF), support vector machine (SVM), Gaussian naïve Bayes (GNB), logistic regression (LR), and multilayer perceptron (MLP) methods were selected as comparison models, and a confusion matrix was prepared for each. In machine learning, and specifically in statistical classification, a confusion matrix is used to investigate the performance of an algorithm, especially a supervised learning one. The confusion table allows more detailed analysis than the mere proportion of correct classifications (accuracy). From the matrix, precision (the positive predictive value), recall (the sensitivity of a binary classification), and F1-score (the harmonic mean of precision and recall) are derived. These classifiers served as the basis of comparison30 and were used to verify the DNN-based method through comparative confusion tables. The receiver operating characteristic (ROC) curve was also used to check the performance of the predictive models. The ROC curve is a graphical description that shows the diagnostic ability of a binary classifier system as its discrimination threshold is varied. The overall accuracy from the confusion matrix and the area under the curve (AUC) from the ROC curve together represent the accuracy of the classifiers. All models, from the DNN to the verification classifiers, were tested with both the confusion matrix and the ROC curve to obtain the performance status of the methods.
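The AUC used for this comparison can be computed without drawing the curve, because it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below uses that pairwise definition; the labels, scores, and function name `roc_auc` are hypothetical and do not come from the classifiers in this study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Counts, over all positive/negative pairs, how often the positive sample
    has the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical true labels and predicted probabilities for six samples.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 1.0 corresponds to a classifier that ranks every positive above every negative, and 0.5 corresponds to random ranking, so the same function can be applied to each compared model's scores.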
