This article was published as a part of the Data Science Blogathon
Introduction
Monte Carlo simulation is a technique that uses repeated random sampling to estimate the likelihood of a range of outcomes of an unknown quantity. Sounds difficult? Don't worry, we will explore this in depth in this article.
A Brief History
The Monte Carlo method was invented by John von Neumann and Stanislaw Ulam to improve decision-making under uncertain conditions. It was named after Monte Carlo, a well-known casino town in Monaco, since the element of chance is core to the modelling approach, much as it is in a game of roulette.
In simple words, Monte Carlo simulation is a method of estimating the value of an unknown quantity with the help of inferential statistics. You need not dive deep into inferential statistics to get a strong grasp of how Monte Carlo simulation works; this article will only go through the points of inferential statistics that are relevant to Monte Carlo simulation.
Inferential statistics deals with a population, which is our full set of examples, and a sample, which is a proper subset of that population. The key point to note is that a random sample tends to exhibit the same characteristics/properties as the population from which it is drawn.
What is Monte Carlo Simulation in Python?
Monte Carlo simulation is a computational technique used to model and analyze complex systems or processes through the use of random sampling. It is named after the famous Monte Carlo casino in Monaco, as the simulation relies on generating random numbers.
In Python, Monte Carlo simulation can be implemented using various libraries such as NumPy and random. The basic steps involved in performing a Monte Carlo simulation are as follows:
Define the problem: Clearly state the problem you want to model or analyze using Monte Carlo simulation. This could involve anything from estimating probabilities to evaluating financial risks.
Set up the model: Create a mathematical or computational model that represents the system or process under consideration. This model should include all relevant variables, inputs, and assumptions.
Generate random inputs: Identify the input variables in your model that exhibit uncertainty or randomness. Randomly sample values for these variables according to their probability distributions. This is often done using Python’s random or NumPy’s random functions.
Run simulations: Execute the model multiple times using the randomly generated inputs. Each run of the model is called an iteration. Record the output or results of interest for each iteration.
Analyze the results: With the recorded outputs from the simulations, analyze and summarize the data. This may involve calculating summary statistics, estimating probabilities, or constructing confidence intervals.
Draw conclusions: Based on the analysis of the simulation results, draw conclusions about the behavior, performance, or characteristics of the system or process being modeled. These conclusions can help make informed decisions or gain insights into the problem.
Monte Carlo simulation is a powerful tool that can handle complex problems where analytical or deterministic solutions are difficult or impossible to obtain. It allows for the exploration of a wide range of scenarios and provides a probabilistic understanding of the system under study. Python provides a convenient environment to implement Monte Carlo simulations due to its versatility and the availability of libraries that facilitate random number generation and numerical computations.
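To make these steps concrete, here is a minimal sketch of the workflow using only Python's built-in random module. The problem in step 1 is an assumed toy example, estimating the probability that the sum of two dice is at least 10, chosen purely to show the structure; any model and input distribution could be substituted.

import random

def run_model(d1, d2):
    # Step 2 (set up the model): a deterministic computation, here just the sum of two dice
    return d1 + d2

n_iterations = 100_000
hits = 0
results = []
for _ in range(n_iterations):
    # Step 3 (generate random inputs): sample each die from its distribution
    d1, d2 = random.randint(1, 6), random.randint(1, 6)
    # Step 4 (run simulations): evaluate the model and record the output
    total = run_model(d1, d2)
    results.append(total)
    hits += total >= 10

# Step 5 (analyze the results): a summary statistic and an estimated probability
print("Average sum:", sum(results) / n_iterations)
print("Estimated P(sum >= 10):", hits / n_iterations)  # true value is 6/36, about 0.167

Step 6 is then a matter of interpreting these numbers in the context of the original problem.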
We will go through an example to understand how Monte Carlo simulation works. We aim to estimate how likely it is to get a head, i.e. the fraction of heads we would see if we could flip a coin an infinite number of times.
1. Let's say we flip the coin once and get a head. Would we be confident in saying that our answer is 1?
2. Now we flip the coin again and it again comes up heads. Are we sure that the next flip will also be a head?
3. We flip it over and over again, let's say 100 times, and strangely a head appears every time. Do we now have to accept that the next flip will result in another head?
4. Let us change the scenario and assume that out of 100 flips, 52 resulted in heads and the remaining 48 came up tails. Is the probability of the next flip resulting in a head 52/100? Given the observations it is our best estimate, but our confidence in it is still fairly low.
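Here is a minimal, illustrative sketch of this coin-flip experiment in Python; the sample sizes are my own choices for demonstration.

import random

def estimate_heads(num_flips):
    # simulate num_flips fair coin flips and return the observed fraction of heads
    heads = sum(random.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

for n in (2, 100, 10_000, 1_000_000):
    print(f"{n:>9} flips -> estimated P(head) = {estimate_heads(n):.4f}")

Re-running the loop a few times shows the 2-flip estimate jumping between 0, 0.5 and 1, while the million-flip estimate stays very close to 0.5; this is exactly the effect of sample size on confidence discussed above.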
Why is there a difference in confidence level?
It is important to know that our estimate depends upon two things:
1. Size: the size of the sample (e.g., 2 flips in case 2 vs 100 flips in case 4).
2. Variance: as the variance of the observations grows (compare cases 3 and 4), we need a larger sample (as in case 4 relative to case 2) to reach the same degree of confidence.
Law of Large Numbers
In repeated independent trials, each with the same probability p of a particular outcome, the fraction of trials in which that outcome occurs converges to p as the number of trials goes to infinity; equivalently, the probability that the observed fraction differs from p by any fixed amount converges to zero.
In loose terms, it means that when the observed behaviour deviates from the expected behaviour (probability p), the overall proportion still settles back towards p over a large number of future trials; the deviation gets diluted rather than cancelled by an opposite deviation.
Now let's talk about an interesting incident that took place on 18 August 1913 at a casino in Monte Carlo. In roulette, black came up a record twenty-six times in succession, and a panic arose to bet on red (in the belief that the deviation from expected behaviour would be evened out).
Let’s analyze this situation mathematically
1. Probability of 26 consecutive blacks = (1/2)^26 = 1/67,108,864 (treating black and red as equally likely and ignoring the green zero)
2. Probability of the 26th spin being black, given that the previous 25 spins were black = 1/2
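The two probabilities differ so drastically because the spins are independent: the law of large numbers evens deviations out by diluting them over many future trials, not by making the opposite outcome more likely on the next spin. A quick check of the arithmetic, again assuming an idealised wheel where black has probability 1/2:

p_26_in_a_row = (1 / 2) ** 26   # about 1 in 67,108,864
p_next_given_25 = 1 / 2         # independence: the previous spins do not matter
print(p_26_in_a_row, 1 / p_26_in_a_row, p_next_given_25)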
Regression to the Mean
1. Following an extreme random event, the next random event is likely to be less extreme, so that the long-run mean is maintained.
2. For example, if the roulette wheel is spun 10 times and red comes up every time, that is an extreme event with probability 1/1024, and it is likely that in the next 10 spins we will get fewer than 10 reds; the expected number is only 5.
So if we look at the mean of all 20 spins, it will be closer to the expected mean of 50% reds than to the 100% observed in the first 10 spins.
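A small illustrative simulation of this point, assuming a simplified fair wheel where red has probability 1/2: we take the first 10 spins as given (all red), simulate the next 10, and look at the 20-spin average.

import random

trials = 100_000
total_reds_in_20 = 0
for _ in range(trials):
    first_10 = 10                                             # given: 10 reds already observed
    next_10 = sum(random.random() < 0.5 for _ in range(10))   # the next 10 independent spins
    total_reds_in_20 += first_10 + next_10

print("Average reds over 20 spins:", total_reds_in_20 / trials)  # about 15, i.e. 75%

The 20-spin average comes out around 75% red, closer to the expected 50% than the 100% of the first 10 spins, even though nothing in the next 10 spins compensates for the streak.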
Now time to face some reality.
Sampling Space of Possible Outcomes
1. It is never possible to guarantee perfect accuracy through sampling, nor can we ever be certain that an estimate is exactly correct.
This raises a question: how many samples do we need to look at before we can have significant confidence in our answer?
It depends upon the variability of the underlying distribution.
Confidence Levels and Confidence Intervals
Since in real life we cannot be certain that an unknown parameter estimated from a sample holds for the whole population, we make use of confidence levels and confidence intervals.
A confidence interval provides a range within which the unknown value is likely to lie, together with a confidence level quantifying how sure we are that the value falls inside that range.
For example, the return for betting on a slot 1000 times in roulette is -3% with a margin of error of +/- 4% at a 95% level of confidence.
This can be decoded as follows: if we conducted the 1000-bet experiment an infinite number of times,
the expected average/mean return would be -3%, and
the return would lie roughly between +1% and -7% in about 95% of those experiments.
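These numbers can be reproduced approximately by simulation. The sketch below assumes an even-money bet on red on a European wheel (18 red pockets out of 37), which is my assumption rather than the exact bet the example has in mind; with this bet the mean return comes out near -3%, while the margin of error is closer to +/- 6% than +/- 4%, since the margin depends on the specific bet.

import random
import statistics

def return_of_1000_bets():
    # bet 1 unit on red, 1000 times, on a 37-pocket European wheel (18 red pockets)
    winnings = sum(1 if random.randrange(37) < 18 else -1 for _ in range(1000))
    return winnings / 1000   # net return per unit staked

returns = sorted(return_of_1000_bets() for _ in range(10_000))
mean = statistics.mean(returns)
low = returns[int(0.025 * len(returns))]
high = returns[int(0.975 * len(returns))]
print(f"mean return = {mean:+.1%}, 95% interval roughly [{low:+.1%}, {high:+.1%}]")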
Probability Density Function (PDF)
A distribution is usually described by its probability density function (PDF), which determines how likely the random variable is to fall within any given interval.
The area under the PDF curve between two points is the probability of the random variable falling within that range.
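As a concrete illustration, using the standard normal distribution as an assumed example (and SciPy for the numerical integration), the probability that the variable falls between two points is the area under the PDF over that interval:

from scipy.stats import norm
from scipy.integrate import quad

# probability that a standard normal variable falls between -1 and 1
area, _ = quad(norm.pdf, -1, 1)    # integrate the PDF over the interval
print(area)                        # about 0.6827
print(norm.cdf(1) - norm.cdf(-1))  # the same answer via the CDF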
Let's conclude our learning with an example.
Suppose there is a shuffled deck of cards and we need to find the probability of getting two consecutive kings when the cards are laid down in the order in which they are placed.
Analytical method:
P(at least 2 consecutive kings) = 1 - P(no consecutive kings)
= 1 - (49! × 48!) / (45! × 52!) = 0.217376
By Monte Carlo Simulation:
Steps
1. Repeatedly sample random data points: here, the random element is the shuffling of the deck.
2. Perform a deterministic computation: for each shuffle, check whether any two kings end up next to each other.
3. Combine the results: aggregate the outcomes of all the shuffles to arrive at our estimate.
With the Monte Carlo method we obtain a result very close to the analytical solution; a minimal simulation sketch follows.
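A minimal sketch of that simulation, assuming a standard 52-card deck with 4 kings; the number of trials is an arbitrary choice:

import random

def has_consecutive_kings(deck):
    # Step 2: deterministic computation - are any two kings adjacent in this shuffle?
    return any(deck[i] == 'K' and deck[i + 1] == 'K' for i in range(len(deck) - 1))

deck = ['K'] * 4 + ['x'] * 48   # 4 kings and 48 other cards
trials = 200_000
hits = 0
for _ in range(trials):
    random.shuffle(deck)        # Step 1: a random shuffle of the deck
    hits += has_consecutive_kings(deck)

print(hits / trials)            # Step 3: combine the results; about 0.2174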
Advantages of Monte Carlo Simulation
It is easy to implement and provides statistical sampling for numerical experiments on a computer.
It provides satisfactory approximate solutions to mathematical problems that are computationally expensive to solve exactly.
It can be used for deterministic as well as stochastic problems.
Disadvantages of Monte Carlo Simulation
It can be time-consuming, as a large number of samples must be generated to obtain a satisfactory output.
The results obtained from this method are only an approximation of the true solution, not the exact solution.
I am Dinesh Junjariya, a B.Tech student at IIT Jodhpur.
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
A Complete Guide To Cubase Shortcuts
Introduction to Cubase Shortcuts
Shortcuts for the Tools in the Cubase Tool Panel
Given below are the shortcuts for the tools in the Cubase tool panel:
Drumstick Tool (0): Press the 0 key to activate the Drumstick tool.
Select Tool (1): Press the 1 key to activate the Select tool.
Range Tool (2): Press the 2 key to quickly access the Range tool.
Split Tool (3): Press the 3 key to activate the Split tool.
Glue Tool (4): Press the 4 key to access the Glue tool.
Erase Tool (5): Press the 5 key to activate the Erase tool.
Zoom Tool (6): Press the 6 key to activate the Zoom tool.
Mute Tool (7): Press the 7 key to activate the Mute tool.
Draw Tool (8): Press the 8 key to activate the Draw tool.
Play Tool (9): Press the 9 key to activate the Play tool.
Previous Tool/Next Tool (10): The 10 key switches to the previous or next tool.
Shortcut Keys for Audio Settings
Adjust Fades to Range (A): Press the A key to adjust fades to the selected range.
Crossfade/Fade (X): Press the X key to apply a crossfade or fade while editing.
Direct Offline Processing (F7): Press the F7 function key to open Direct Offline Processing and apply your edits offline.
Shortcut Keys for Automation
Given below are the shortcut keys for automation:
Toggle Read Automation for All Tracks On/Off (Alt + R): Press Alt + R to toggle read automation on or off for all tracks.
Toggle Write Automation for All Tracks On/Off (Alt + W): Press Alt + W to toggle write automation on or off for all tracks.
Automation Panel (F6): Press the F6 function key to open the automation panel.
Shortcut Keys for Chords
Given below are the shortcut keys for chords:
Shortcut Keys for Device
Given below are the shortcut keys for device:
MixConsole Lower Zone (Alt + F3): Press Alt + F3 to open the MixConsole in the lower zone.
Mixer (F3): Press the F3 function key to open the Mixer.
Virtual Keyboard (Alt + K): Press Alt + K to open the Virtual Keyboard.
VST Connections (F4): Press the F4 function key to open VST Connections.
VST Instruments (F11): As with VST Connections, there is a shortcut for VST Instruments: the F11 function key.
VST Performance (F12): The F12 function key opens the VST Performance panel.
Shortcut Keys for Cut, Copy, Paste, Undo and Redo
Cut (Ctrl + X): Press Ctrl + X to cut the selected element during audio editing.
Copy (Ctrl + C): As with the cut command, you can copy the desired element by pressing Ctrl + C.
Paste (Ctrl + V): To paste a copied or cut element, press Ctrl + V.
Shortcut Keys for Edit Commands
Given below are the shortcut keys for edit commands:
Activate or Deactivate Focused Object (Alt + A): Press Alt + A to activate the focused object; pressing the same keys again deactivates it.
Auto-Scroll On/Off (F): Press the F key to enable or disable the auto-scroll feature.
Delete (Delete): To delete the selected element, simply press the Delete key.
Duplicate (Ctrl + D): To duplicate an element while editing, select it and press Ctrl + D.
Expand/Reduce (Alt + E): Press Alt + E to expand or reduce the length of a layer.
Insert Silence (Ctrl + Shift + E): Press Ctrl + Shift + E to insert silence in the selected area.
Invert (Alt + F): Press Alt + F for the invert command.
Left Selection Side to Cursor (E): Press the E key to move the left side of the selection to the cursor.
Right Selection Side to Cursor (D): Press the D key to move the right side of the selection to the cursor.
Mute (M): Press the M key to mute the audio you are working with.
Mute/Unmute Objects (Alt + M): As with Mute, there is a command for muting or unmuting objects: press Alt + M.
Conclusion – Cubase Shortcuts
These are some of the important shortcut keys for the tools and commands of this software, and you can start using them while working on any project. These shortcuts will help you enhance your working skills and let you work more efficiently.
Recommended Articles
This is a guide to Cubase Shortcuts. Here we discussed the shortcuts for the tools in the Cubase tool panel, as well as the shortcut keys for audio settings, devices, and edit commands. You may also have a look at the following articles to learn more –
A Guide To Building An End-to-End Multi-Class Text Classification Model
This article was published as a part of the Data Science Blogathon.
Knock! Knock!
Who’s there?
It’s Natural Language Processing!
Today we will implement a multi-class text classification model on an open-source dataset and explore more about the steps and procedure. Let’s begin.
Table of Contents
Dataset
Loading the data
Feature Engineering
Text processing
Exploring Multi-classification Models
Compare Model performance
Evaluation
Prediction
Dataset for Text Classification
The dataset consists of real-world complaints received from customers regarding financial products and services, with each complaint labeled with a specific product. Hence, we can conclude that this is a supervised problem statement, where we have the input and the target output. We will play with different machine learning algorithms and check which algorithm works better.
Our aim is to classify the complaints of the consumer into predefined categories using a suitable classification algorithm. For now, we will be using the following classification algorithms.
Linear Support Vector Machine (LinearSVM)
Random Forest
Multinomial Naive Bayes
Logistic Regression.
Loading the Data
Download the dataset from the link given in the section above. Since I am using Google Colab, if you want to do the same you can use the Google Drive link given here and import the dataset from your Google Drive. The code below will mount the drive and unzip the data to the current working directory in Colab.
from google.colab import drive
drive.mount('/content/drive')
!unzip /content/drive/MyDrive/rows.csv.zip
First, we will install the required modules.
pip install numpy
pip install pandas
pip install seaborn
pip install scikit-learn
pip install scipy
Once everything is successfully installed, we will import the required libraries.
import os
import pandas as pd
import numpy as np
from scipy.stats import randint
import seaborn as sns  # used for plotting interactive graphs
import matplotlib.pyplot as plt
from io import StringIO
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import chi2
from IPython.display import display
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score
from sklearn.metrics import confusion_matrix
from sklearn import metrics
Now let us load the dataset and see the shape of the loaded data.
# loading data
df = pd.read_csv('/content/rows.csv')
print(df.shape)
From the output of the above code, we can see that the dataset is very large and has 18 columns. Let us see what the data looks like. Execute the code below.
df.head(3).T
Now, for our multi-class text classification task, we will use only two of these 18 columns: 'Product' and 'Consumer complaint narrative'. Let us create a new DataFrame to store only these two columns, and since we have enough rows, we will remove all the missing (NaN) values. To make it easier to understand, we will rename the second column of the new DataFrame to 'Consumer_complaint'.
# Create a new dataframe with two columns
df1 = df[['Product', 'Consumer complaint narrative']].copy()
# Remove missing values (NaN)
df1 = df1[pd.notnull(df1['Consumer complaint narrative'])]
# Renaming second column for a simpler name
df1.columns = ['Product', 'Consumer_complaint']
print(df1.shape)
df1.head(3).T
We can see that after discarding all the missing values, we have around 383k rows and 2 columns; this will be our training data. Now let us check how many unique products there are.
pd.DataFrame(df1.Product.unique()).values
There are 18 categories of products. To make the training process easier, we will make some changes to the category names.
# Because the computation is time consuming (in terms of CPU), the data was sampled
df2 = df1.sample(10000, random_state=1).copy()

# Renaming categories
df2.replace({'Product':
    {'Credit reporting, credit repair services, or other personal consumer reports': 'Credit reporting, repair, or other',
     'Credit reporting': 'Credit reporting, repair, or other',
     'Credit card': 'Credit card or prepaid card',
     'Prepaid card': 'Credit card or prepaid card',
     'Payday loan': 'Payday loan, title loan, or personal loan',
     'Money transfer': 'Money transfer, virtual currency, or money service',
     'Virtual currency': 'Money transfer, virtual currency, or money service'}},
    inplace=True)
pd.DataFrame(df2.Product.unique())
The 18 categories are now reduced to 13; for instance, 'Credit card' and 'Prepaid card' have been combined into a single class, and so on.
Now, we will map each of these categories to a number so that our model can handle them more easily, and we will save this in a new column named 'category_id', where each of the 13 categories is represented by a number.
# Create a new column 'category_id' with encoded categories
df2['category_id'] = df2['Product'].factorize()[0]
category_id_df = df2[['Product', 'category_id']].drop_duplicates()

# Dictionaries for future use
category_to_id = dict(category_id_df.values)
id_to_category = dict(category_id_df[['category_id', 'Product']].values)

# New dataframe
df2.head()
Let us visualize the data and see how many complaints there are per category. We will use a bar chart here.
fig = plt.figure(figsize=(8,6))
colors = ['grey','grey','grey','grey','grey','grey','grey','grey','grey',
          'grey','darkblue','darkblue','darkblue']
df2.groupby('Product').Consumer_complaint.count().sort_values().plot.barh(
    ylim=0, color=colors, title='NUMBER OF COMPLAINTS IN EACH PRODUCT CATEGORY\n')
plt.xlabel('Number of occurrences', fontsize=10);
The above graph shows that most of the customers complained regarding:
Credit reporting, repair, or other
Debt collection
Mortgage
Text Processing
The text needs to be preprocessed so that we can feed it to the classification algorithm. Here we will transform the texts into vectors using Term Frequency-Inverse Document Frequency (TF-IDF) and evaluate how important a particular word is in the collection of words. For this we need to remove punctuation and lower-case the text; the importance of a word is then determined in terms of its frequency.
We will be using the TfidfVectorizer function with the following parameters:
min_df: ignore words that occur in fewer than 'min_df' documents.
sublinear_tf: if True, scale the term frequency on a logarithmic scale.
stop_words: remove the stop words predefined for 'english'.
tfidf = TfidfVectorizer(sublinear_tf=True, min_df=5, ngram_range=(1, 2), stop_words='english')

# We transform each complaint into a vector
features = tfidf.fit_transform(df2.Consumer_complaint).toarray()
labels = df2.category_id
print("Each of the %d complaints is represented by %d features (TF-IDF score of unigrams and bigrams)" % (features.shape))
Now, we will find the most correlated terms with each of the defined product categories. Here we are finding only the three most correlated terms.
# Finding the three most correlated terms with each of the product categories N = 3 for Product, category_id in sorted(category_to_id.items()): features_chi2 = chi2(features, labels == category_id) indices = np.argsort(features_chi2[0]) feature_names = np.array(tfidf.get_feature_names())[indices] unigrams = [v for v in feature_names if len(v.split(' ')) == 1] bigrams = [v for v in feature_names if len(v.split(' ')) == 2] print(" * Most Correlated Unigrams are: %s" %(', '.join(unigrams[-N:]))) print(" * Most Correlated Bigrams are: %s" %(', '.join(bigrams[-N:])))* Most Correlated Unigrams are: overdraft, bank, scottrade * Most Correlated Bigrams are: citigold checking, debit card, checking account * Most Correlated Unigrams are: checking, branch, overdraft * Most Correlated Bigrams are: 00 bonus, overdraft fees, checking account * Most Correlated Unigrams are: dealership, vehicle, car * Most Correlated Bigrams are: car loan, vehicle loan, regional acceptance * Most Correlated Unigrams are: express, citi, card * Most Correlated Bigrams are: balance transfer, american express, credit card * Most Correlated Unigrams are: report, experian, equifax * Most Correlated Bigrams are: credit file, equifax xxxx, credit report * Most Correlated Unigrams are: collect, collection, debt * Most Correlated Bigrams are: debt collector, collect debt, collection agency * Most Correlated Unigrams are: ethereum, bitcoin, coinbase * Most Correlated Bigrams are: account coinbase, coinbase xxxx, coinbase account * Most Correlated Unigrams are: paypal, moneygram, gram * Most Correlated Bigrams are: sending money, western union, money gram * Most Correlated Unigrams are: escrow, modification, mortgage * Most Correlated Bigrams are: short sale, mortgage company, loan modification * Most Correlated Unigrams are: meetings, productive, vast * Most Correlated Bigrams are: insurance check, check payable, face face * Most Correlated Unigrams are: astra, ace, payday * Most Correlated Bigrams are: 00 loan, applied payday, payday loan * Most Correlated Unigrams are: student, loans, navient * Most Correlated Bigrams are: income based, student loan, student loans * Most Correlated Unigrams are: honda, car, vehicle * Most Correlated Bigrams are: used vehicle, total loss, honda financial
Exploring Multi-Classification Models
The classification models which we are using:
Random Forest
Linear Support Vector Machine
Multinomial Naive Bayes
Logistic Regression.
For more information regarding each model, you can refer to their official guide.
Now, we will split the data into train and test sets. We will use 75% of the data for training and the rest for testing. The column 'Consumer_complaint' will be our X, or the input, and 'Product' will be our Y, or the output.
X = df2['Consumer_complaint']  # Collection of documents
y = df2['Product']  # Target or the labels we want to predict (i.e., the 13 different complaint products)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
We will keep all the models in a list and loop through the list for each model to get its mean accuracy and standard deviation, so that we can calculate and compare the performance of each of these models. Then we can decide which model to move forward with.
models = [
    RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0),
    LinearSVC(),
    MultinomialNB(),
    LogisticRegression(random_state=0),
]

# 5-fold cross-validation
CV = 5
cv_df = pd.DataFrame(index=range(CV * len(models)))
entries = []
for model in models:
    model_name = model.__class__.__name__
    accuracies = cross_val_score(model, features, labels, scoring='accuracy', cv=CV)
    for fold_idx, accuracy in enumerate(accuracies):
        entries.append((model_name, fold_idx, accuracy))
cv_df = pd.DataFrame(entries, columns=['model_name', 'fold_idx', 'accuracy'])
The above code will take some time to complete its execution.
Compare Text Classification Model Performance
Here, we will compare the mean accuracy and standard deviation of each of the four classification algorithms.
mean_accuracy = cv_df.groupby('model_name').accuracy.mean()
std_accuracy = cv_df.groupby('model_name').accuracy.std()
acc = pd.concat([mean_accuracy, std_accuracy], axis=1, ignore_index=True)
acc.columns = ['Mean Accuracy', 'Standard deviation']
acc
From the above table, we can clearly see that the Linear Support Vector Machine outperforms the other classification algorithms, so we will use LinearSVC to train the model for our multi-class text classification task.
plt.figure(figsize=(8,5))
sns.boxplot(x='model_name', y='accuracy', data=cv_df, color='lightblue', showmeans=True)
plt.title("MEAN ACCURACY (cv = 5)\n", size=14);
Evaluation of the Text Classification Model
Now, let us train our model using the Linear Support Vector Machine, so that we can evaluate and check its performance on unseen data.
X_train, X_test, y_train, y_test, indices_train, indices_test = train_test_split(features, labels, df2.index, test_size=0.25, random_state=1)
model = LinearSVC()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
We will generate a classification report to get more insight into the model's performance.
# Classification report
print('\t\t\t\tCLASSIFICATION METRICS\n')
print(metrics.classification_report(y_test, y_pred, target_names=df2['Product'].unique()))
From the above classification report, we can observe that the classes with a greater number of occurrences tend to have a better f1-score than the other classes. The categories which yield the best classification results are 'Student loan', 'Mortgage' and 'Credit reporting, repair, or other'. Classes like 'Debt collection' and 'Credit card or prepaid card' also give good results. Now let us plot the confusion matrix to check the misclassified predictions.
conf_mat = confusion_matrix(y_test, y_pred)
fig, ax = plt.subplots(figsize=(8,8))
sns.heatmap(conf_mat, annot=True, cmap="Blues", fmt='d',
            xticklabels=category_id_df.Product.values,
            yticklabels=category_id_df.Product.values)
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.title("CONFUSION MATRIX - LinearSVC\n", size=16);
From the above confusion matrix, we can say that the model is doing a pretty decent job; it has classified most of the categories accurately.
Prediction
Let us make some predictions on unseen data and check the model's performance.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
tfidf = TfidfVectorizer(sublinear_tf=True, min_df=5, ngram_range=(1, 2), stop_words='english')
fitted_vectorizer = tfidf.fit(X_train)
tfidf_vectorizer_vectors = fitted_vectorizer.transform(X_train)
model = LinearSVC().fit(tfidf_vectorizer_vectors, y_train)
Now run the prediction.
complaint = """I have received over 27 emails from XXXX XXXX who is a representative from Midland Funding LLC. From XX/XX/XXXX I received approximately 6 emails. From XX/XX/XXXX I received approximately 6 emails. From XX/XX/XXXX I received approximately 9 emails. From XX/XX/XXXX I received approximately 6 emails. All emails came from the same individual, XXXX XXXX. It is becoming a nonstop issue of harassment.""" print(model.predict(fitted_vectorizer.transform([complaint]))) complaint = """Respected Sir/ Madam, I am exploring the possibilities for financing my daughter 's XXXX education with private loan from bank. I am in the XXXX on XXXX visa. My daughter is on XXXX dependent visa. As a result, she is considered as international student. I am waiting in the Green Card ( Permanent Residency ) line for last several years. I checked with Discover, XXXX XXXX websites. While they allow international students to apply for loan, they need cosigners who are either US citizens or Permanent Residents. I feel that this is unfair. I had been given mortgage and car loans in the past which I closed successfully. I have good financial history. print(model.predict(fitted_vectorizer.transform([complaint]))) complaint = """They make me look like if I was behind on my Mortgage on the month of XX/XX/2023 & XX/XX/XXXX when I was not and never was, when I was even giving extra money to the Principal. The Money Source Web site and the managers started a problem, when my wife was trying to increase the payment, so more money went to the Principal and two payments came out that month and because I reverse one of them thru my Bank as Fraud they took revenge and committed slander against me by reporting me late at the Credit Bureaus, for 45 and 60 days, when it was not thru. Told them to correct that and the accounting department or the company revert that letter from going to the Credit Bureaus to correct their injustice. The manager by the name XXXX requested this for the second time and nothing yet. I am a Senior of XXXX years old and a Retired XXXX Veteran and is a disgraced that Americans treat us that way and do not want to admit their injustice and lies to the Credit Bureau.""" print(model.predict(fitted_vectorizer.transform([complaint])))The model is not perfect, yet it is performing very good.
The notebook is available here.
Conclusion
We have implemented a basic multi-class text classification model. You can play with other models like XGBoost, or you can try comparing the performance of multiple models on this dataset using an AutoML framework. This is not all; there are still more complex problems within multi-class text classification, and you can always explore further and pick up new concepts and ideas about this topic. That's it!
Thank you!
All images are created by the author.
The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion
A Paranoid Buyer’s Guide To Shopping Online
The Internet can be a very intimidating place, with many people using the anonymity it provides to do nefarious things. Since its inception, millions of people have fallen victim to scams and hackers that have stolen their identities and made purchases in their name.
What Makes Safety So Challenging
However, hackers are always trying to stay one step ahead of these methods and sometimes even succeed in stealing customer information from companies, making it difficult to actually make the Internet a safe place for people to make online purchases.
It’s also probably worth mentioning the fact that hundreds of millions of people around the world have their credit and debit card data somewhere on the Web. In America alone, this total reaches 94 million, which is a bit under a third of the entire country’s population.
Looking for HTTPS Isn't Enough
While it's imperative that you look for the "HTTPS://" before the URL on your address bar to ensure that your data is encrypted while you make a transaction, it's not enough to tell whether you are being scammed or not. To get the certificate necessary to use HTTPS on your website, you only need to prove that you own the domain but not that you're a legitimate business (read more on this here).
While it may be safe to make a purchase online from a retailer you know with absolute certainty is legitimate, unknown retailers can still scam you and use an encryption (HTTPS) certificate on their site. The authority that gave them the certification will often try to combat this, but you may still fall victim to scams regardless.
Diversify Your Credentials
The problem with credit and debit cards on the Internet is that they are just one number. And that number is the sole thing that stands between any entity and your bank account. Once it is revealed, every penny you have at the bank is vulnerable and fair game to anyone.
PayPal is similar in that you have one account tied to all your money. But there’s one crucial difference here: changing your PayPal password is easy, but doing the same to your debit card number is a process that requires interacting with your bank. It could get complicated rather quickly.
Instead of giving out your credit card info to every online retailer, it is better to use a "throwaway" number that you can invalidate at your whim. There are startups like Privacy that offer services like this, and Visa has also recently rolled out a token service that does something similar.
Retailers Don't Need a Lot of Info About You
There are two things an online store needs before you complete a purchase: a way to send you their product and a way to receive your payment. This includes your address, your name, your phone number (in case they need to contact you about the delivery), and your debit card credentials. Any other information they ask for is superfluous and you should never give it away.
So things like your passport number, your ID number, your SSN, and any other identifying information should never be in the hands of a simple retailer. This is reserved only for government institutions, banks, and other entities that actually require this data to ensure that you’re not an identity thief. Assume the worst if some Amazon wannabe asks you for this information.
Other Things You Should Avoid
When parting with your money, you should always make sure that the transaction is as private as possible. Avoid making purchases in public, at a public computer, or with any sort of unencrypted WiFi. Yes, that means that even if you make a purchase from your home under an unencrypted WiFi connection, you might as well be doing it at an airport. The idea here is to lock down everything as much as you possibly can.
Miguel Leiva-Gomez
Miguel has been a business growth and technology expert for more than a decade and has written software for even longer. From his little castle in Romania, he presents cold and analytical perspectives to things that affect the tech world.
Beginner’s Guide To Become A Web3 Developer
A beginner's guide to becoming a Web3 developer: a primer on backend and frontend Web3 development
Web3 development has several components. As with Web2, the primary components are the front-end and back-end. Furthermore, because smart contracts are an important component of dApps, they merit their own category. Of course, there is the establishment and growth of new programmable blockchains.
This involves understanding which tools to use and which languages to acquire for Web3 development. Furthermore, you should concentrate on providing tangible results that you can demonstrate to your colleagues, prospective clients, or investors. As a result, you must discover how to build an excellent dApp UI, an interface that is clearly critical. On the other hand, you can’t implement good functionalities without a suitable backend and smart contracts.
Blockchain Fundamentals: Blockchain is the first concept you must grasp in order to become a web3 coder. This will allow you to quickly design and optimize smart contracts. Let’s go over what blockchain is and how it functions.
Blockchains run on computers called nodes all over the globe. To tamper with a record, you would need to alter it on the majority of these computers and repeat the procedure for every subsequent block. It is nearly impossible for a machine to do all of this quickly enough for the network not to detect it and kick the fraudster off the blockchain.
Decentralized Applications: DApps, or Decentralized Applications, are blockchain-based applications.
Frontend: JavaScript frameworks like React, Vue, Angular
Backend: Rust and Solana or Solidity and Ethereum
Frontend Site Programming Fundamentals: The backbone of DApps may be powered by blockchain technology, but the interface is built with standard web technologies:
HTML — Common HTML tags
The front end is written in JavaScript
CSS — Basic Properties, Flex, Grid
CSS Frameworks [Optional] — Bootstrap, Semantic UI, Tailwind, etc
JavaScript — Variables, Functions, Classes, ES6, etc.
JavaScript Frameworks
Learn the Web2 backend as well, as a fallback in case Web3 does not work out. Here is what you should know about the backend:
NodeJS — Event loop, I/O
API Architecture — Express
Databases — MongoDB, SQL, PostgreSQL
Ethereum Fundamentals: Ethereum is a network used to execute smart contracts. As of 2023, Ethereum is by far the most common blockchain for creating smart contracts, and Solidity is the language used to write them.
Smart Contracts: Smart contracts are agreements executed by immutable code on the network; they are comparable to JS classes, and they power DApps. Solidity is a high-level, object-oriented programming language designed especially for writing smart contracts. Because Solidity is so new, there are few resources for learning it; the best way to learn is to build projects and solve problems by consulting the documentation. The resources below follow a similar approach.
Buildspace: Buildspace is a cohort-based learning platform and an excellent resource for studying Web3. You can use Solana, Polygon, and Ethereum to create DApps, NFT collections, NFT web games, DAOs, and much more.
LearnWeb3DAO: LearnWeb3 DAO is yet another fantastic Web3 resource. It has four distinct tracks for developers of varying skill levels: Freshman, Sophomore, Junior, and Senior. You will learn how to create DApps, NFT collections, ICO coins, DAOs, DeFi networks, and many other things.
CryptoZombies: CryptoZombies is a gamified programming tutorial in which you use smart contracts to build an undead (zombie) factory.
Nader Dabit: Nader Dabit creates content about:
1. React
2. Web3
3. Serverless
4. Blockchain
5. DeFi
Connect Your Smart Contract with Your Frontend: Now that you understand how to create smart contracts, you must put them to use. There are two primary tools for doing this: web3.js and ethers.js. Let's look at why ethers.js is often considered superior to web3.js:
Much smaller size
Less buggy
Better documentation
More Popular
Easier for beginners
Extra Features
Alchemy: Alchemy is a set of developer tools for prototyping, debugging, and shipping products more quickly. Alchemy supports a variety of networks, including Ethereum, Polygon, StarkNet, Flow, and others. It has an excellent NFT API that enables you to get an NFT collection up and running quickly, a supercharged cryptocurrency API, and support for Web3 push notifications.
Remix: Remix is a browser-based editor designed especially for creating Ethereum smart contracts with Solidity. There is no setup needed; you can begin writing code right away. It compiles your code and allows you to test it quickly. Not only that, your smart contract can also be deployed easily.
Hardhat: Hardhat makes it simple to publish contracts, perform tests, and debug Solidity code. You can launch your contract on a variety of networks, including Ropsten, Rinkeby, and Mainnet. Yes, and it supports TypeScript as well.
Truffle: Truffle is my preferred smart contract development tool. It allows you to quickly build smart contracts and use them in your front-end code. Truffle also comes with Ganache, which simulates a blockchain and provides test accounts, among other things; this is extremely helpful for testing without spending real funds.
How To Use Kodi – A Beginner’s Guide
Kodi is a popular media server application that can not only help you manage your offline media library but also stream online content. What’s surprising to me is that despite being so useful, apart from its dedicated core users, no one seems to know what it does. I mean people who use Kodi sing paeans about it and those who don’t use it know nothing about it. We want to change that perception with this article. Kodi is a free and open-source tool that can organize local media files into a single, cohesive interface. Among many things, Kodi has a powerful media player that can stream a wide range of media formats. And in this article, we are going to tell you everything about it and help you get started with this awesome app.
A Definitive Guide to Kodi (Updated July 2023)
Before we move on to the article, I want to outline all the topics covered in this explainer. You can easily move back and forth using the links below.
What Exactly is Kodi?
About 10 years back, when internet speeds were abysmal and online streaming platforms were close to non-existent, there was a culture of owning physical copies of movies, music, etc. Users preferred to download media content over the internet instead of streaming it because the internet speed was so poor.
This meant that they had to maintain a large media collection and organizing and maintaining large libraries of media content like movies, music or images in separate directories was a tiresome exercise.
You had to run through hoops to find the content you wanted to play. Sometimes, the native media player wouldn’t even play it because it didn’t support the media format. Kodi came as a solution to all these problems. It streamlined every single thing and brought the entire media-consumption experience under one roof.
You could just open Kodi and all your local media files were right there, ready to be played in a sleek and accessible interface. So, Kodi basically is a media server application which can not only help you organize and manage your media but also access and play it from any device on your home network.
The State of Kodi in 2023
Kodi has become a powerful media center app. It has gone through many iterations of development and by now has amassed many useful features, including the ability to stream content, support for add-ons and repositories, theme support, and more.
Of course, you can still connect your local media library to Kodi and it will organize everything with proper categorization, album art, metadata, synopsis, etc. Further, you can enable subtitles, track your movie, and show progress with Trakt, record Live TV, and much more.
Kodi is an open-source app with huge community support. It's available on all major platforms, including Windows, macOS, Android, Linux, iOS, and a host of other devices. There is so much more to Kodi than I can fit in a single paragraph here.
Anyway, if you want to use Kodi then you will have to start with the installation first and that is where we will begin.
Install Kodi
How to Use Kodi
Now that you have gone through Kodi's installation process, let us proceed with some basic-level information on how to use Kodi. In this section, I will start with the interface and then go deeper into settings and add-ons to explain Kodi's useful features.
Understanding the Kodi User Interface
To use Kodi, you will have to understand its user interface first. Thankfully, the default Kodi interface is pretty basic and clean. The latest Kodi version is 18.2 (codenamed Leia) and the below screenshots are grabbed from the same version. You have multiple menus on the left side which are categorized based on the type of content.
There is a search button on the top which lets you search your local content, add-ons, and directly into streaming services like YouTube.
The power button on the left side lets you exit Kodi, reboot and offers other similar functions.
To make your Kodi home screen visually pleasing, you can add weather information or you can use different Kodi skins which basically overhaul Kodi’s UI.
Plunging into the World of Kodi Add-ons
Kodi has something called Kodi Add-ons which are basically apps built for Kodi. They are very similar to Android or iOS apps which can be installed on top of Kodi to bring extra functionality, content, and features. Just like App Store or Play Store, Kodi has an in-built, official repository that hosts thousands of add-ons.
There are also third-party add-ons that are massively popular among the Kodi community. If you want to install third-party Kodi add-ons, you can check our list of latest Kodi add-ons. Further, if you want add-ons specifically to stream movies, you can check our recommended movie add-ons here.
If you are not satisfied with the official repo, you can also install third-party repositories as well. You can check our list of best third-party repository available for Kodi. The linked article also mentions the steps to install a repository so you don’t have to worry about that.
Warning: Some of the websites hosting the Kodi addons contain tracking pixels. If you don’t want to give away your personal information like IP address, you should use VPNs. Remember to check out our article on the best free VPNs for the same.
How to Install Addons on Kodi
3. Upon selecting the repository, choose “All repositories” and then move to “Video add-ons“ if you want to watch movies and TV shows. You can also go through other categories.
4. Here, you will find all the video add-ons available on Kodi’s server.
How to Use Kodi Addons to Watch Movies, Shows, Live TV, and More
You just need to follow the steps that we have mentioned above on how to install video addons. From there, you can find many addons for watching movies, TV shows, streaming live TV and more. That said, my personal recommendation would be the Exodus addon as it’s a complete package and brings all kinds of content, be it movies, TV shows, or live TV channels.
You can follow our guide and learn how to install Exodus on Kodi. Apart from that, if you want more Kodi addons for video content then you can head over to our dedicated list of best Kodi addons. Here, you can find top Kodi addons for anime, cartoons, music, sports, and much more.
What is Kodi Build?
Unlike the default Kodi installation, which is just barebones, a Kodi Build is a total package. After you install a Build, you will not have to look for addons or other tools individually; everything you need is already there. Some of the popular Kodi Builds are Xanax and Titanium, but my favourite remains Xanax for its amazing UI and library of content. You can find more information in our list of the best Kodi Builds.
How to Use a Kodi Build?
1. First off, you need to download the ZIP file of your favorite Kodi Build.
6. Now you will be offered multiple versions of the build. Choose the latest one and proceed ahead.
9. After the installation is complete, force-close Kodi to make the core changes.
How to Add Addons to a Build?
The process to install an addon on a Build is pretty similar to the one on the default Kodi setup. You just have to move to the “Add-on” tab and from there, you can either choose the ZIP file or download the addon from an online source. Despite being on a third-party build, you will still have access to official Kodi addons so that is great.
Keep in mind, every Kodi Build has a different user interface so the location of menus may change from one skin to another. That said, you will always find the option to install addons under the “Add-ons” tab.
Going Deep into Settings
Shortcuts
While I have mentioned the major features of Kodi, there are still a few useful tricks you should know about. Speaking of usefulness, keyboard shortcuts make things a whole lot easier on Kodi.