Sentiment Analysis Using Convolutional Neural Network Method to Classify Reviews on Zoom Cloud Meetings Application Based on Reviews on Google Playstore

Zoom Cloud Meetings is an application used to conduct video conferencing. On the Google Play Store, the Zoom Cloud Meetings application received a rating of 3.8, with more than 500 million downloads as of March 2021. The application has many advantages, such as being relatively undisturbed by pauses in conversation and having good video and audio quality. These advantages require continuous development so that the application's services keep improving. For this reason, user reviews are needed to gauge user satisfaction with the application and to determine which services can be developed in the future. Based on this, this research builds a web-based application that classifies user reviews of the Zoom Cloud Meetings application using the Convolutional Neural Network (CNN) method and calculates the accuracy value. The application is built using the Flask framework and the Python programming language. Model training is carried out using the TensorFlow library. The application is then tested in two stages, namely black box system testing and data testing. Based on system testing, the website runs well, and data testing on the test data yields an accuracy of 91.5%.


Introduction
Indonesia has now entered the era of the Industrial Revolution 4.0. In this era, technology is increasingly sophisticated and emphasizes digitization by utilizing information technology. This technology is used in daily activities such as communication, information dissemination, data retrieval, and data management. As a result, information technology has become very important in human life, and the growing need for it requires people to be technology literate. The devices used include laptops, computers, and mobile phones. In addition, supporting applications are needed for face-to-face communication. One such application is Zoom Cloud Meetings, which students, workers, and the public can use to hold meetings online.
Zoom Cloud Meetings is an application used to conduct video conferences. It is compatible with laptops, computers, and mobile phones, and it is popular because it is easy to use and easy to install. It can be installed from the Google Play Store, the App Store, or Zoom's official website. On the Google Play Store, the Zoom Cloud Meetings application received a rating of 3.8 with over 500 million downloads as of March 2021. According to information compiled by KompasTekno from CNBC, Zoom CFO Kelly Steckelberg said that Zoom's usability and reliability were the reasons behind its very high adoption rate. The application has low latency, so it is relatively undisturbed by pauses in conversation and can maintain video and audio quality even if the internet connection is unstable. The application can be used either for free or with a paid plan. In the free version, meetings are limited to 100 participants and a duration of 40 minutes. The paid version, on the other hand, offers an unlimited number of participants and a longer meeting duration of up to 24 hours.
These advantages require continuous development so that the application's services keep improving. For this reason, user reviews are needed to gauge user satisfaction with the application and to determine which services can be developed in the future according to the wishes and needs of users. One platform that provides user reviews of applications is the Google Play Store. The reviews on the Google Play Store contain a variety of positive and negative comments. In addition, the reviews are very numerous and often contain spelling errors that are difficult to decipher. Based on this, a system is needed that performs sentiment analysis to classify these reviews quickly and automatically.
Sentiment analysis is a method used to extract opinion data and to understand and process textual data automatically in order to identify the sentiment contained in an opinion [2,4,6,7,8,11]. Sentiment analysis can be done using machine learning or deep learning techniques. Deep learning is a learning method that aims to create a multilevel (abstract) representation of data using several data processing layers. Deep learning has several neural network-based algorithms, one of which is the Convolutional Neural Network (CNN). A CNN is a neural network that uses perceptrons for supervised learning and for analyzing data [3,5,9]. Based on the description above, this research performs sentiment analysis to classify reviews of the Zoom Cloud Meetings application from the Google Play Store using the Convolutional Neural Network (CNN) method. This research is expected to be useful and to serve as a benchmark for companies in improving the service quality of the Zoom Cloud Meetings application.

Method
The workflow of this research consists of several stages: planning, data analysis, application design, implementation, and testing. Figure 1 shows the stages of the research method.

Planning
This research was conducted to create a web-based application that can classify the sentiment of user reviews of the Zoom Cloud Meetings application. The web-based application shows the classification result for each review as positive or negative. The review data used in this research consist of 1000 reviews written in Indonesian and posted up to March 2021. The review data were obtained from the Google Play Store using a web scraper.

Data Analysis
At the data analysis stage, the review data obtained from the Google Play Store are collected in Microsoft Excel. After that, negative and positive labels are manually assigned to the collected reviews. The negative label is represented by the number 0 (zero) and is given to negative reviews. The positive label is represented by the number 1 (one) and is given to positive reviews. The review data are divided into two types, training data and test data, with a distribution of 80% training data and 20% test data. The review data in Microsoft Excel are then saved in Comma Separated Values (CSV) format. This CSV file is used in the data preprocessing stage.
A. Preprocessing:
The collected review data still contain noise. This stage removes the noise from the data so that the sentiment recognition process becomes more accurate. The process at this stage is carried out using Python's Natural Language Toolkit.
a. Punctuation and Special Character Removal: This step removes punctuation marks and characters that are not needed.
b. Case Folding: This step makes the case of the review data uniform by changing uppercase letters into lowercase letters.
c. Stopwords Removal: This step removes words that are not needed by using the Sastrawi module for Python. The Sastrawi module provides a list of Indonesian stopwords.
d. Stemming: This step changes a word into its basic form by removing the affixes attached to the word.
e. Tokenizing: This step separates the text into parts called tokens. The separation is done using Python's tokenizer module.
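The preprocessing steps above can be illustrated with a minimal, standard-library-only sketch. It is not the code used in this research: the tiny `STOPWORDS` set is a hypothetical stand-in for the Sastrawi stopword list, the function names are illustrative, stemming is omitted because it depends on Sastrawi, and tokenizing is done before stopword removal purely for simplicity.

```python
import re

# Hypothetical, tiny stopword list for illustration only; the research
# relies on the Indonesian stopword list provided by the Sastrawi module.
STOPWORDS = {"yang", "dan", "di", "ini", "itu", "untuk"}


def remove_punctuation(text):
    """Replace every character that is not a letter, digit, or whitespace with a space."""
    return re.sub(r"[^A-Za-z0-9\s]", " ", text)


def case_fold(text):
    """Convert all uppercase letters to lowercase."""
    return text.lower()


def tokenize(text):
    """Split the text into tokens on whitespace."""
    return text.split()


def remove_stopwords(tokens):
    """Drop every token that appears in the stopword list."""
    return [token for token in tokens if token not in STOPWORDS]


def preprocess(review):
    """Apply punctuation removal, case folding, tokenizing, and stopword removal."""
    cleaned = case_fold(remove_punctuation(review))
    return remove_stopwords(tokenize(cleaned))


tokens = preprocess("Aplikasi ini BAGUS, dan mudah digunakan!!!")
print(tokens)
```

In this sketch the stopwords "ini" and "dan" are dropped, while the remaining words are kept in lowercase and without punctuation.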

B. CNN Architectural Modeling:
The next stage is CNN architectural modeling. The Convolutional Neural Network (CNN) has an architecture that can recognize predictive information in objects such as images, text, and sound snippets. A CNN works in a similar way to a Multi-Layer Perceptron (MLP). This research uses a simple architecture with a small number of layers. The layers used to perform the sentiment classification are word embedding, convolution operations, Rectified Linear Unit (ReLU) activation functions, pooling operations, dropout regularization, and fully connected layers [1,10].
a. Word Embedding: The word embedding process uses GloVe with a dimension of 100. GloVe is an algorithm used to obtain a vector representation of text data. The word embedding process produces a matrix of vector values, which is the input to the convolution operation.
b. Convolution Operation: The convolution operation works by shifting a convolution kernel (filter) of a certain size over the input. The kernel shifts from the top left corner to the bottom right. At each shift, a dot product is computed between the input and the filter values. In this research, the kernel used is 5x5 in size. The convolution operation produces an output known as an activation map or feature map.
c. ReLU Activation Function: The activation map obtained from the convolution is passed through the Rectified Linear Unit (ReLU) function. In this layer, all values less than zero are changed to zero.
d. Pooling Operation: This layer consists of a filter of a certain size that reduces the dimensions of each feature map. The pooling operation can help control overfitting. In this research, the pooling operation used is max pooling. The output of this layer is then forwarded to the dropout regularization process.
e. Dropout Regularization: Dropout regularization is a neural network regularization technique that randomly selects several neurons that will not be used in the training process. The dropout rate used in this research is 20%.
f. Fully Connected Layer: The last layer is the fully connected layer. In this layer, all active neurons from the previous layer are connected to the neurons in the next layer. The purpose of this layer is to process the data so that they can be classified. In this layer the filter used is 5x5 in size. In addition, this research uses a batch parameter of 200 and 20 epochs. These parameters are defined using the dense method available in TensorFlow.
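To make the convolution, ReLU, and max pooling steps concrete, the following is a minimal pure-Python sketch on a toy 5x5 input with a 2x2 kernel. It is an illustration under simplifying assumptions, not the TensorFlow model used in this research: the function names, the toy input, and the 2x2 kernel and pooling sizes are all chosen only to keep the arithmetic easy to follow.

```python
def convolve2d_valid(matrix, kernel):
    """Slide the kernel over the matrix (no padding) and sum the elementwise products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(matrix) - kernel_h + 1
    out_w = len(matrix[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for a in range(kernel_h):
                for b in range(kernel_w):
                    total += matrix[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Set every value below zero to zero."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the largest value of each non-overlapping size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[i + a][j + b]
                      for a in range(size) for b in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Toy input and kernel (assumed values, for illustration only).
toy_input = [
    [1, 2, 0, 1, 3],
    [0, 1, 3, 2, 1],
    [2, 0, 1, 1, 0],
    [1, 1, 0, 2, 4],
    [0, 3, 1, 0, 1],
]
toy_kernel = [[1, 0], [0, -1]]

feature_map = relu(convolve2d_valid(toy_input, toy_kernel))
pooled = max_pool(feature_map, size=2)
print(feature_map)
print(pooled)
```

The 5x5 input shrinks to a 4x4 feature map after the convolution, and max pooling reduces it further to 2x2, which is the dimension reduction described for the pooling layer.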

C. Model Testing:
After the classification results are obtained, the model is tested to measure the accuracy of the classification results. The model is tested using a confusion matrix, which provides comparative information on the classification results. Accuracy is measured to find out how well the model classifies the data, using the following formula.
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
Here TP (true positive) is positive data detected as positive, FP (false positive) is negative data detected as positive, FN (false negative) is positive data detected as negative, and TN (true negative) is negative data detected as negative.
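As a rough sketch of how the confusion matrix counts and the accuracy can be computed, the following example compares manual labels with predicted labels, using 1 for positive and 0 for negative as in the labeling scheme above. The function names and the six sample labels are illustrative assumptions, not data from this research.

```python
def confusion_counts(manual_labels, predicted_labels):
    """Count TP, TN, FP, and FN for binary labels (1 = positive, 0 = negative)."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(manual_labels, predicted_labels):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of correctly classified items among all classified items."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Illustrative labels only (not taken from the review dataset).
manual = [1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0]

counts = confusion_counts(manual, predicted)
print(counts, accuracy(*counts))
```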

D. Model Conversion:
The next stage is to convert the model. Model conversion changes the model that has been built into JavaScript Object Notation (JSON) form. The result of this conversion is a .json file.

Application Design
This stage covers the design of the application. The application is web-based, and its system design is made using Unified Modeling Language (UML) diagrams, specifically a use case diagram [14]. A use case diagram describes a series of activities carried out by a system with one or more users (actors). It shows who is involved in the system and which actor activities receive a response from the system. The following figure 2 is a use case diagram for the web-based sentiment classification application.
In designing this web-based application, a navigation structure is used to describe the flow of access relationships between web pages. The type of navigation structure used is a mixed structure, which is a combination of linear and non-linear navigation structures, as shown in figure 3.

Implementation of the Convolutional Neural Network (CNN) Model

Creating Application
At this stage the application is built using the Flask framework [12,13]. The first step in building the web application is to install Flask through the command prompt on the laptop. After that, a file named web.py is created. This is the main file containing the program code used to build the web application. The code in this file includes module imports, the creation of the app object with the `__name__` variable, the routing used to access every page on the web, and various other program code. The next step is to build the interface using Hypertext Markup Language (HTML).
A. Dashboard page: The dashboard page is the initial view of the website. This page shows an explanation of the Zoom Cloud Meetings application and sentiment analysis.

Embedding Model
The model is embedded in the website by loading the model and network weights of the Convolutional Neural Network (CNN) that was built previously. The model used is stored in files with the JSON and h5 extensions. The h5 file stores the weights and configuration of the trained model. Before the model can be used, it must be compiled. The compilation process is carried out using the optimizer, loss, and metrics variables.

Testing
After the website has been built successfully, the testing phase is carried out. Testing covers both the system and the data. System testing uses the black box method to observe the results of execution on test data. Based on the results of the system test, the system runs well and as expected. Data testing measures the level of accuracy using a confusion matrix. The dataset used consists of 200 reviews taken from the 1000 user reviews of the Zoom Cloud Meetings application. The classification results obtained by the system are compared with the results of the manual classification.
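The 80/20 division that yields 200 test reviews out of 1000 can be sketched as follows. This is only an assumed procedure: the research does not state how the reviews were assigned to each set, so the random shuffle, the fixed seed, the placeholder review strings, and the function name `split_reviews` are all illustrative.

```python
import random


def split_reviews(reviews, test_fraction=0.2, seed=42):
    """Shuffle a copy of the reviews and split it into training and test sets."""
    shuffled = list(reviews)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder (text, label) pairs standing in for the 1000 labeled reviews.
reviews = [("review %d" % i, i % 2) for i in range(1000)]
train_data, test_data = split_reviews(reviews)
print(len(train_data), len(test_data))
```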

Fig 4. Flowchart
According to Table 1, the model training process at the 20th epoch resulted in an accuracy value of 0.9685, or 96.85%. These results were obtained in 12 minutes 23 seconds. In addition to the accuracy, the model training process produced a loss value of 0.0244, or 2.44%. The accuracy and loss produced by the training of this model are shown in figure 5 and figure 6.

Fig 7. Dashboard Page
B. CNN classification Zoom Cloud Meetings page: The CNN classification Zoom Cloud Meetings page is a page that users can use to classify sentiments. A sentiment is classified by entering a review in the available text area and then clicking the process button.

Fig 8. CNN Classification Zoom Cloud Meetings Page

Fig 9. Upload Data Page
D. Result data page: The result data page is a page that users can use to view the results of classifying the sentiments of previously uploaded datasets. The classification results, positive and negative, are presented in tabular form and also as a pie chart.

Fig 10. Result Data Page
E. Punctuation and special character removal page: This page is a page that users can use to view the results of punctuation and special character removal. The displayed results are datasets that have gone through the process of removing punctuation and special characters. Users can see these results in the table provided.

Fig 11. Punctuation And Special Character Removal Page
F. Case folding page: This page is a page that users can use to view the results of case folding. The displayed results are datasets that have gone through the case folding process. Users can see these results in the table provided.

Fig 12. Case Folding Page
G. Stopwords removal page: This page is a page that users can use to view the stopwords removal results. The displayed results are datasets that have gone through the process of deleting words that are included in the stopword list. Users can see the results of stopwords removal in the table provided.

To implement the Convolutional Neural Network (CNN) model, several stages were carried out, namely importing the data, preprocessing, dividing the data into test and training data, and word embedding. After these stages, the model is trained. The model used in this research is a Convolutional Neural Network (CNN) model. The training process uses a batch parameter of 200 and 20 epochs. The filter used in this process is a 5x5 filter. In addition, the model uses 4 convolution layers, 3 dropout layers, and 4 pooling layers. The SoftMax function is used in the training process so that the resulting output represents a sentiment category or class. The following is the accuracy and loss resulting from the model training process.
Table 1. Model Training Results
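The role of the SoftMax function mentioned above can be sketched in a few lines of standard-library Python. This is a generic illustration rather than the TensorFlow implementation used in this research; the raw scores are assumed values, and the class order (index 0 for negative, index 1 for positive) follows the labeling scheme of this research.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(score - shift) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


# Assumed raw scores for the classes [negative, positive].
probabilities = softmax([0.3, 2.1])
predicted_class = probabilities.index(max(probabilities))
print(probabilities, predicted_class)
```

The class with the highest probability is taken as the predicted sentiment, so in this sketch the review would be classified as positive (class 1).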

Table 2. Data Testing
Based on the data testing, the system classified 183 test data correctly and 17 test data incorrectly. Classification errors can occur because the model does not recognize some of the words contained in the test data, such as slang, abbreviations, and misspelled words. The level of accuracy of the experiments was then obtained using a confusion matrix table.

Table 3. Confusion Matrix (number of review data: 200)

| | Positive Classification by System | Negative Classification by System |
|---|---|---|
| Manual Positive Classification | True Positive: 116 | False Negative: 9 |
| Manual Negative Classification | False Positive: 8 | True Negative: 67 |
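As a worked check, and assuming accuracy is defined as the share of correctly classified reviews among all test reviews, the values in Table 3 reproduce the reported accuracy of 91.5% and the 183 correct and 17 incorrect classifications:

```latex
\[
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
= \frac{116 + 67}{116 + 67 + 8 + 9}
= \frac{183}{200} = 0.915 = 91.5\%
\]
```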