Sentiment Analysis of Online Lectures Tweets using Naïve Bayes Classifier Analisis Sentimen Tweet Kuliah Online menggunakan Naïve Bayes Classifier

Online lectures are an alternative learning method during the Covid-19 pandemic, and opinions about this method are divided between pro and contra. The purpose of this study is to evaluate the opinions or sentiments expressed in tweets about online lectures among the Indonesian community. Twint is used to collect the tweet data, and Jupyter Notebook is used for text preprocessing and classification. The process consists of scraping data from Twitter, text preprocessing, and text classification. The Naïve Bayes classifier achieves a precision of 100%, an accuracy of 70.8%, an F-measure of 10.2%, and a recall of 5.4%. The performance can be affected by the dataset used for modeling. The analysis covers positive and negative sentiments toward online lectures, and the result shows 69% negative sentiments and 31% positive sentiments, so negative sentiments outnumber positive ones. The results are also supported by the word cloud, which shows a high frequency of negative words such as sleep problems, bored, tired,

The related research mentioned previously discussed methods and sentiment results that can be compared with this research. The purpose of this study is to collect a sample of responses from Twitter users in Indonesia over a specified period and to analyze the sentiment toward online lectures using Naïve Bayes. The Naïve Bayes classifier is a simple and fast classification algorithm for large data and has been applied successfully in various applications, including spam filtering, text classification, sentiment analysis, and recommendation systems [15]. This research uses Naïve Bayes because of its simplicity and good performance in classifying text.

Sentiment Analysis
Sentiment analysis is the process of analyzing text to understand the emotions that someone expresses in a sentence, grouped into positive, neutral, and negative emotions [16].
In education and academia, it is necessary to evaluate and improve the quality of learning and of the activities held at an institution. Evaluation involves gathering responses from outside parties involved with the institution, so that it becomes known what is needed, what changes must be made, and what steps must be taken. Sentiment analysis is one option for this evaluation: responses are collected and the opinions in them are extracted. The results show whether other people's views of the institution are positive or negative, and from them it can be concluded what changes need to be made [17].
Sentiment analysis is also an analytical technique used in the business world. Business people need to monitor social media to find out public and consumer opinions about their business. In addition to social media monitoring, sentiment analysis is used in business for brand monitoring, customer feedback, customer service, and market research [18]. The results of the analysis can then be used in decision-making.
Several types of sentiment analysis are distinguished by the focus of analysis or function:
1. Fine-grained Sentiment Analysis: This type uses five separate classes, from very negative (the lowest), through negative, neutral, and positive, to very positive [19]. It is often used in e-commerce to evaluate customer feedback [18].
2. Emotion Detection: Social media today is full of feelings, emotions, and opinions expressed by users from all over the world. This type of sentiment analysis focuses on the emotions expressed on social media, such as sadness, happiness, fear, anger, or other emotions [20].
3. Aspect-based Sentiment Analysis: As the name suggests, this type finds the aspects mentioned in a document or sentence and then determines the sentiment for each aspect. It is also called feature-based opinion mining and has several stages, namely aspect extraction, which extracts and evaluates the opinion, and aspect sentiment classification, which categorizes the sentiment as positive, neutral, or negative [21].
4. Multilingual Sentiment Analysis: This type conducts sentiment analysis in more than one language, such as on social media content in Indonesian and English. The main problem is the lack of resources, namely the need for a word list for every language used in the analysis [22].

Social Media
Social media can be used by anyone, anywhere, and anytime. Its users can openly contribute, give feedback, comment, and share information about what is happening around them and what they are thinking, quickly and without limits [23]. Social media has become a trend in today's society, where people around the world cannot be separated from the habit of using technology such as computers, smartphones, or tablets. These devices serve as the medium for accessing social media and interacting virtually with other people, both those who are close and those who are far away [24].
Many definitions of social media have been put forward by researchers, influencers, and experts. According to Kotler and Keller, social media is used to share messages and digital information with anyone, while according to Lewis, social media allows people to connect and share messages [25]. Based on these definitions, it can be concluded that social media is a medium that connects people virtually so that they can interact and share information without being limited by space, distance, and time [24].
Twitter is a micro-blogging (small blog) social media service co-founded by Jack Dorsey and launched on July 15, 2006. Twitter is a messaging service that shares many characteristics with commonly used communication tools such as email, SMS, and blogging. However, several factors make Twitter unique and slightly different [26,27]. Tweets were originally limited to 140 characters [27], a limit that was raised to 280 characters in 2017. Tweets are also public, so anyone can read them without permission from other Twitter users, which allows users to meet new people [27]. Based on data uploaded by Databoks in October 2019, Twitter had the highest rating count among news and magazine applications, at 13.68 million, and had been downloaded 500 million times [28]. Twitter is commonly used in Indonesia and was among the top five social media platforms in 2020 [29].

Twint
Twint is a helpful tool for scraping data from Twitter. It is easy to use and can retrieve a large amount of tweet data, up to the last 3200 tweets in a single scraping session [30]. Twint is written in the Python programming language, and the Twint package can be installed via Anaconda or pip [31]. Besides scraping up to 3200 tweets, Twint has several other advantages: it scrapes tweets without the Twitter API, it can be used without registering for or owning a Twitter account, and its initial setup is fast and free of rate limits [30].

Online Lectures ("Kuliah Online")
The internet has changed how people interact in everyday life. Education has also developed, and learning interactions have become flexible with the presence of e-learning. One form of e-learning is the online lecture. An online lecture is a teaching and learning process supported by information technology and the internet, which facilitate the delivery of lessons and the interaction between students and teachers during learning activities [32].
Online lectures have been increasingly practiced to carry out the learning process during the pandemic. The purpose of online lectures is to provide a practical teaching and learning process for lecturers and students without being limited by distance, so that the learning process can continue [33].

Text Preprocessing
The global population speaks many languages, and their different writing and grammatical rules make text in a particular language less regular [34]. For sentiment analysis, text data is often collected from social media networks, where writing is frequently informal and irregular. Text preprocessing is therefore needed so that the text becomes more orderly before the sentiment analysis process is run on the collected data [35].
Text preprocessing has several stages, namely case folding, cleansing, tokenizing, stopword removal, and stemming. Case folding converts all letters to the same case so that the data is easier to search and faster to process. Cleansing removes data that is not needed. Tokenizing cuts sentences into single words before the next process. Stopword removal deletes words that appear often in a sentence or paragraph but do not affect its meaning, because they are noise. Finally, stemming removes the affixes of a word to obtain its basic form, which produces better accuracy [34].
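The first four stages described above can be sketched in plain Python. This is a minimal illustration, not the authors' notebook code: the function names, the regular expressions, the five-word `STOPWORDS` set, and the sample tweet are all assumptions made for the example, and stemming is left out because it needs a dictionary-based stemmer.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real run would use a full Indonesian list.
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}

def case_fold(text):
    """Case folding: convert every letter to lowercase."""
    return text.lower()

def cleanse(text):
    """Cleansing: drop URLs, emails, usernames, hashtags, the RT marker, and symbols."""
    text = re.sub(r"https?://\S+", " ", text)   # URLs
    text = re.sub(r"\S+@\S+", " ", text)        # email addresses
    text = re.sub(r"[@#]\w+", " ", text)        # usernames and hashtags
    text = re.sub(r"\brt\b", " ", text)         # retweet marker (text is already lowercase)
    text = re.sub(r"[^a-z\s]", " ", text)       # digits, punctuation, emoticons
    return re.sub(r"\s+", " ", text).strip()

def tokenize(text):
    """Tokenizing: cut the sentence into single words."""
    return text.split()

def remove_stopwords(tokens):
    """Stopword removal: drop frequent words that carry no topic information."""
    return [t for t in tokens if t not in STOPWORDS]

def preprocess(text):
    return remove_stopwords(tokenize(cleanse(case_fold(text))))

sample = "RT @teman: Kuliah online ini bikin BOSAN banget :( https://t.co/abc #kuliahonline"
print(preprocess(sample))
```

The order matters in this sketch: case folding runs first so that the cleansing patterns only have to match lowercase text.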

Library Sastrawi
The Sastrawi library is a stemming library built on the NA (Nazief and Adriani) algorithm, which follows the rules of Indonesian grammar that govern word affixes. Affixes are grouped by placement: prefixes at the beginning of a word, suffixes at the end, infixes inserted within the word, and confixes, which combine a prefix and a suffix. The library has been updated several times. The first update introduced the CS (Confix Stripping) algorithm, in which words are reduced to their basic form by looking up the candidate in a dictionary of basic words [36]. It was then updated with the ECS (Enhanced Confix Stripping) algorithm, which addresses an issue where some words could not be stemmed. The last update, Modified ECS, overcomes over-stemming and under-stemming with a corpus-based method [36].
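The idea of stripping affixes and checking the result against a dictionary of basic words can be illustrated with a toy example. This sketch is not the Nazief and Adriani algorithm and not Sastrawi's implementation: `ROOTS`, `SUFFIXES`, `PREFIXES`, and `toy_stem` are hypothetical names, the affix lists are incomplete, and none of the real algorithm's recoding rules are included.

```python
# Hypothetical mini dictionary of basic words; a real stemmer ships a much larger one.
ROOTS = {"ajar", "tidur", "tugas", "bosan"}

# Small, incomplete affix lists chosen only for this illustration.
SUFFIXES = ["nya", "kan", "an", "i"]
PREFIXES = ["meng", "mem", "men", "me", "ber", "ter", "di", "pe"]

def toy_stem(word):
    """Return the dictionary root of `word`, or `word` unchanged if none is found."""
    if word in ROOTS:
        return word
    # Candidate forms: the word itself, then the word with one suffix removed.
    candidates = [word] + [word[:-len(s)] for s in SUFFIXES if word.endswith(s)]
    for cand in candidates:
        if cand in ROOTS:
            return cand
        # Try removing one prefix and accept only results found in the dictionary.
        for p in PREFIXES:
            if cand.startswith(p) and cand[len(p):] in ROOTS:
                return cand[len(p):]
    return word

for w in ["membosankan", "tertidur", "tugasnya", "kuliah"]:
    print(w, "->", toy_stem(w))
```

Accepting a stripped form only when it appears in the dictionary is what keeps a word such as "kuliah" from being cut incorrectly, which is the over-stemming problem the paragraph above mentions.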

Naïve Bayes Classifier
Naïve Bayes is a simple form of probabilistic classifier. Naïve Bayes is often used as a text classifier because of its simplicity and good performance in classifying documents and text [37]. Naïve Bayes is commonly used for its speed, easy implementation, and effectiveness [38]. Naïve Bayes calculates the probability of each class with formula (1), and the class with the highest probability is selected.
$$P(c \mid x) = \frac{P(x \mid c)\,P(c)}{P(x)} \tag{1}$$
where:
x: data with an unknown class
c: the hypothesis that the data belongs to a specific class
P(c|x): the posterior probability of the class (target) given the predictor
P(c): the prior probability of the class
P(x|c): the likelihood, which is the probability of the predictor given the class
P(x): the prior probability of the predictor
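Formula (1) can be made concrete with a small from-scratch classifier. The sketch below assumes a multinomial word-count model with add-one (Laplace) smoothing, which the text does not specify. The three training documents and the names `train` and `posterior` are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (tokens, label). Returns priors P(c), per-class word counts, vocabulary."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {t for tokens, _ in docs for t in tokens}
    priors = {c: n / len(docs) for c, n in class_counts.items()}
    return priors, word_counts, vocab

def posterior(tokens, priors, word_counts, vocab):
    """Return P(c|x) for every class c, with add-one smoothing for P(x|c)."""
    log_joint = {}
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        score = math.log(prior)                      # log P(c)
        for t in tokens:
            if t in vocab:                           # words never seen in training are ignored
                score += math.log((word_counts[c][t] + 1) / (total + len(vocab)))
        log_joint[c] = score                         # log of P(x|c) * P(c)
    # Divide by P(x), the sum of P(x|c) * P(c) over all classes.
    m = max(log_joint.values())
    evidence = sum(math.exp(v - m) for v in log_joint.values())
    return {c: math.exp(v - m) / evidence for c, v in log_joint.items()}

# Invented toy training data (already preprocessed tokens).
docs = [
    (["bosan", "tugas"], "negative"),
    (["lelah", "tugas"], "negative"),
    (["semangat", "kuliah"], "positive"),
]
priors, word_counts, vocab = train(docs)
probs = posterior(["bosan", "kuliah"], priors, word_counts, vocab)
print(max(probs, key=probs.get), probs)
```

Working in log space avoids numerical underflow when a tweet contains many words. The final normalization corresponds to the division by P(x) in formula (1).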

Research Method
The research method is based on the Naïve Bayes classifier. The stages, shown in Figure 1, start with scraping data using Twint, followed by text preprocessing and text classification using Jupyter Notebook. The stages are explained as follows.
1. Data collection. The dataset is collected by scraping Twitter using Twint via Google Colaboratory, taking tweets from the period of August 1, 2020, to May 31, 2021. The scraping process searches for data with the appropriate keywords. The keywords or queries used to retrieve the data are #kuliahonline, #kuliahdaring, and #kulon. After scraping, some tweets were found to be duplicates, which repeated the same content in the dataset. Irrelevant and duplicated data from the scraping stage were removed so that the collected dataset is relevant and free of repeated data.
2. Text preprocessing [39]. The case folding stage changes all text, which initially contains capital letters, into lowercase letters so that the data is easier to search. The cleansing [40] stage removes elements that are not needed in the dataset, such as usernames, emails, URLs, hashtags, emoticons, symbols, or RT (retweet) markers. Tokenizing cuts a sentence into independent word fragments [41]. The stopword removal [39] stage removes words that appear often but are not related to the topic. Stemming [39] changes a word that has an affix into its basic form.
3. Text classification. This stage labels each data item as positive or negative and so determines the sentiment.
Figure 2 shows the flowchart of data analysis, starting with scraping data and removing duplicate and irrelevant data, followed by text preprocessing, text classification, and evaluation, and ending with the conclusion of the sentiment analysis.

Data Collection
Data was collected by scraping Twitter using Twint with the queries or keywords #kuliahonline, #kulon, and #kuliahdaring, which pulled 4146 tweets. The scraped data was then cleaned of duplicate and irrelevant data. The relevant data without repetition that is ready to be processed amounts to 597 tweets.
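Duplicate removal of the kind described above can be sketched as follows. The text does not state how duplicates were detected, so comparing tweets after lowercasing and collapsing whitespace is an assumption, and `drop_duplicates` and the sample tweets are hypothetical. Removing irrelevant tweets is a separate judgment step that this sketch does not cover.

```python
def drop_duplicates(tweets):
    """Keep the first occurrence of each tweet, ignoring case and extra whitespace."""
    seen = set()
    unique = []
    for text in tweets:
        key = " ".join(text.lower().split())   # normalized form used only for comparison
        if key not in seen:
            seen.add(key)
            unique.append(text)                # the original text is kept unchanged
    return unique

scraped = [
    "Kuliah online bikin ngantuk #kuliahonline",
    "kuliah online  bikin ngantuk #kuliahonline",
    "Tugas numpuk terus #kuliahdaring",
]
print(drop_duplicates(scraped))
```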

Text Preprocessing
The text preprocessing stage prepares the data so that it can be processed in the next stage. The data collected before text preprocessing is raw data, which is often noisy and inconsistent [42]. In this research, text preprocessing consists of case folding, cleansing, tokenizing, stopword removal, and stemming. After preprocessing, the data is clean and proceeds to the sentiment classification stage.

Word Cloud
A word cloud is a technique for presenting text frequency data in a visual form that is more attractive and easier to read. It displays the most frequently used words, where a larger word means the word is used more often [43]. The word cloud in this study was created using the TagCrowd web application [44]. The word cloud built from the tweets, shown in Figure 3, shows the frequency of words in the opinions, so that it can be seen which words Indonesian people often use to communicate about and respond to online lectures. The word cloud shows the chosen keywords such as "kuliahonline", "kulon", and "kuliahdaring". Other words reveal a high frequency of negative words in Bahasa Indonesia such as "tidur" (sleep), "tugas" (assignments), "bosan" (bored), "lelah" (tired), "pusing" (dizzy), "malas" (lazy), "sakit" (sick), "sedih" (sad), and "susah" (difficult). Only a few positive words appear, such as "semangat" (spirit) and "pintar" (smart).
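The frequency data behind a word cloud is a count of how often each word occurs. The study used the TagCrowd web application, so the sketch below only illustrates the underlying counting. `word_frequencies` and the token lists are invented for the example.

```python
from collections import Counter

def word_frequencies(token_lists, top_n=2):
    """Count every token across all tweets and return the top_n most frequent."""
    counts = Counter(t for tokens in token_lists for t in tokens)
    return counts.most_common(top_n)

# Invented preprocessed tweets.
tweets = [
    ["tugas", "bosan"],
    ["tugas", "lelah"],
    ["tugas", "bosan", "tidur"],
]
print(word_frequencies(tweets))
```

A word cloud renderer would then scale the font size of each word by its count.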

Data Classification
To perform data classification, the data is split into training data and testing data. The training data is used to train the classifier model to recognize the data and produce a probability model, while the testing data is used to test the sentiment classification. The ratio of training data to testing data is 80:20, so from 597 tweets, 477 are used for training and 120 for testing. Classification is done with the Naïve Bayes formula discussed in the Naïve Bayes Classifier section. Figure 4 shows the classification process on the testing data to determine the probability of the prediction results. Only the top five testing data are shown, labeled data 1 to data 5, as listed in Table 1 and Table 2.
Figure 4 Process classification of data testing
Table 1 describes the probability values of the top five testing data, expressed as decimals that can be converted to percentages. Data 1 has a probability of 0.87440072 (around 87.4%), data 2 has 0.90901682 (around 90.9%), data 3 has 0.58742644 (around 58.7%), data 4 has 0.77800838 (around 77.8%), and data 5 has 0.53997977 (around 54.0%). The class with the higher probability value determines whether the data is classified as negative or positive. For example, data 1 in Table 1 has a probability of 0.87440072 (87.4%), and it is classified as negative sentiment. Table 2 shows the classification results for the testing data, which consists of 120 tweets.
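The 80:20 split can be reproduced with simple arithmetic. The sketch below is an assumption about the procedure, not the authors' code: the text does not say whether the tweets were shuffled or how the fractional count was rounded, and `split_80_20` and its fixed seed are hypothetical.

```python
import random

def split_80_20(items, seed=42):
    """Shuffle deterministically, then put the first 80% in training and the rest in testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * 0.8)   # 597 * 0.8 = 477.6, truncated to 477
    return shuffled[:n_train], shuffled[n_train:]

train_set, test_set = split_80_20(range(597))
print(len(train_set), len(test_set))
```

Truncating the training count leaves 120 tweets for testing, which matches the sizes reported above.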
Figure 5 shows that, of the 120 tweets in the testing data, 83 tweets had negative sentiments and 37 tweets had positive sentiments, or 69% and 31% respectively. This research found that from August 1, 2020, to May 31, 2021, the sentiment toward online lectures was 69% negative and 31% positive. This is consistent with the word cloud in Figure 3, which shows a high frequency of negative words.
As described in the introduction section, several related studies report findings about online lecture sentiment. In line with that related research, this research shows that negative sentiments are more dominant than positive sentiments.

Evaluation
The evaluation tests and measures the classification results using a confusion matrix. Precision, recall, F-measure, and accuracy are calculated so that the performance of the classification can be measured. Table 3 shows that the true negatives (TN), negative predictions with negative actual labels, are 83 tweets. The false positives (FP), positive predictions with negative actual labels, are 0 tweets. The false negatives (FN), negative predictions with positive actual labels, are 35 tweets. The true positives (TP), positive predictions with positive actual labels, are 2 tweets. From these counts, the percentage values of precision, recall, F-measure, and accuracy are calculated. Precision is the proportion of positive predictions that are actually positive, TP / (TP + FP). Recall (sensitivity or true positive rate) measures how frequently the model correctly predicts a positive outcome when the actual class is positive, TP / (TP + FN). F-measure (F1-score) is the harmonic mean of precision and recall. Accuracy is the proportion of all data that the model classifies correctly, (TP + TN) / (TP + TN + FP + FN).
Table 4 gives the calculation results: the precision is 100%, the recall is 5.4%, the F-measure is 10.2%, and the accuracy is 70.8%.
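The four measures can be recomputed from the confusion matrix counts in Table 3. The sketch below uses the standard definitions, and `metrics` is a hypothetical helper name. Computed from the unrounded counts, the F-measure is 4/39, about 10.3%. The reported 10.2% is what results when recall is first rounded to 5.4%, which is presumably how the published figure was obtained.

```python
def metrics(tp, fp, fn, tn):
    """Return precision, recall, F1, and accuracy from confusion matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# Counts from Table 3: TP = 2, FP = 0, FN = 35, TN = 83.
p, r, f, a = metrics(tp=2, fp=0, fn=35, tn=83)
print(f"precision={p:.1%} recall={r:.1%} f1={f:.1%} accuracy={a:.1%}")
```

For context, 83 of the 120 testing tweets are negative, so a model that labels every tweet as negative would reach an accuracy of about 69.2%. The low recall shows that only 2 of the 37 positive tweets were recognized.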

CONCLUSION
The research has several conclusions. The text preprocessing stage greatly affected the results of the Naïve Bayes text classification. With 120 testing data, the Naïve Bayes classification reaches a precision of 100%, which means that every tweet predicted as positive was actually positive. Accuracy describes how often the model classifies the data correctly. The accuracy of the sentiment analysis with Naïve Bayes is quite good, with a value of 70.8%.
From 597 tweets, there were 393 tweets with negative sentiments and 204 tweets with positive sentiments. The 120 tweets taken as testing data contain 83 tweets with negative sentiments (69%) and 37 tweets with positive sentiments (31%). The word cloud shows more negative responses to online lectures, related to sleep problems, boredom, tiredness, and laziness, and only a few respondents expressed a positive opinion. Comparing the results with other related research conducted in Indonesia shows the same trend of negative sentiment toward online lectures.

FUTURE RESEARCH
Future research should improve the preprocessing stage so that the data obtained from Twitter is of better quality, for example by normalizing slang and informal Indonesian terms into standard Indonesian, because some Twitter users were found to use slang, informal terms, and abbreviations in their tweets. In the scraping process, it is suggested to collect only relevant tweets by avoiding queries or keywords that coincide with Twitter usernames, so that irrelevant tweets containing conversations between users are not collected. Another suggestion is to add data so that the results are more accurate, by extending the scraping period or adding other keywords related to online lectures. The research also found that many responses regarding online lectures were expressed with emojis, emoticons, symbols, stickers, pictures, and videos. Future research could therefore provide a mechanism for converting those kinds of responses into words or sentences that carry sentiment, so that more of the public opinions submitted on Twitter can be collected.
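The suggested slang normalization could take the form of a dictionary lookup applied after tokenizing. The sketch below is only an illustration of the idea: the `SLANG` mapping holds a handful of common Indonesian abbreviations chosen for the example, and `normalize` is a hypothetical function name. A real mapping would need to be compiled and validated for the dataset.

```python
# Hypothetical mapping from informal abbreviations to standard Indonesian words.
SLANG = {
    "yg": "yang",
    "gak": "tidak",
    "ga": "tidak",
    "tdk": "tidak",
    "dgn": "dengan",
    "krn": "karena",
}

def normalize(tokens):
    """Replace each token found in the slang dictionary; leave other tokens unchanged."""
    return [SLANG.get(t, t) for t in tokens]

print(normalize(["kuliah", "online", "gak", "seru"]))
```

Running this step before stopword removal and stemming would let those stages see the standard word forms.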