Is Calling A Software Engineer A ‘Programmer’ An Insult?
Updated November 2023
“That’s an insult!”
That was feedback I received from my last article when I referred to someone who writes software code for a living as an “engineer.” It seems that many people who graduated with a computer science or engineering degree take umbrage at those who use the title “engineer” loosely when referring to someone who develops software.
The common theme from these highly educated folks was that not everyone who writes code is educated enough in proper engineering techniques and methodologies to warrant the lofty title of “engineer.”
I seemed to have hit a nerve, which raised the question, “Does a title really matter to a software professional?”
And I’m not only referring to official job titles. If a manager is presenting a software deliverable to a business unit and refers to “my team of engineers” versus “my team of programmers,” is the audience either positively or negatively inclined to pass judgment on the deliverable?
Or if a headhunter refers to someone as an “engineer” or a “programmer” in an initial conversation with a hiring manager, does it impact that manager’s perception of the candidate? Does it possibly result in a lower salary offer even after the interview process?
I asked some technical people in my network about this and received varying, but interesting, responses.
Basile went on to stress the importance of creating your own brand outside of the title your company assigns you.
“You should be seeking to earn a name for yourself by creating your own unique branding,” Basile said. “So for example, at the top of your LinkedIn profile, everyone should create a professional “headline” that sums up your professional identity.”
That made sense to me, so I went out on LinkedIn to see how people who write software for a living brand themselves. Here’s a representative sampling of what I found after sifting through a few hundred:
Senior Software Engineer and Architect
Lead Software Engineer
Computer Software Consultant and Professional
Information Systems Engineer
Granted, this wasn’t a methodical scientific survey. (After all, I only have an Information Science degree, so what did you expect?)
That said, did you notice that “developer” and “programmer” were not used – at all?
Does “software developer” have a different capabilities connotation than “programmer”? I personally always thought “developer” sounded better than “programmer,” so on my resume back in the day when I wrote code for a living, I would change my “Programmer II” title to “Senior Engineer.”
Is there really a difference between developing software and programming software?
I don’t believe so, but I didn’t want to risk a less positive first impression when applying for a software development position.
And many others feel that “programmer” is, frankly, a slap in the face.
John Otroba is an HR Director at CadenceQuest, in charge of creating job descriptions. He said that most of his technical staff prefers “Software Engineer” or “Software Developer” as a title.
“Using the title Programmer is like the ‘S’ word for Secretaries who’d rather be referred to as an Executive Assistant or Office Specialist. It is simply no longer politically and socially acceptable.”
When I asked him if it impacted hiring decisions or salaries, he said absolutely not, adding, “It is more of a vanity thing” for the employee.
When I talked to those who had jobs writing software, the ones with computer science or engineering degrees had a common theme, summed up best by Justin Pihony, who writes code for an IT department in Pittsburgh (home of the Super Bowl champion Steelers! Wahoo!).
Sorry, my hometown roots forced me to digress. Back to Pihony, who has a computer science degree yet has the title Programmer Analyst. He and others I talked to feel that having “Analyst” in their title makes up for the Programmer part, making it more respectable.
Pihony went on to say that whether or not someone designs the software should make a difference in how they’re labeled.
“A software engineer is kind of like an architect in construction who creates the blueprints, realizing that a bad design could result in the whole building collapsing. Whereas, the programmer is like a construction worker who takes the blueprints and uses them to create the building,” said Pihony.
“Designing requires much more knowledge than coding, where you just need to know the programming language and implement the design.”
Wow, so programmers are blue collar and engineers are white collar?
How Do I Become An Expert Python Programmer?
Learning Beginners Topics
In this section, any new beginner should concentrate on fundamental programming concepts and properly grasp the basic components of programming.
The following is the list of recommendations for a beginner to become an expert Python programmer −
Variables − You must understand how variables work, the different types of variables, the scope of variables, and why we even need variables in programming. You can also learn about mutable data types (which can be changed in place, such as lists) and immutable ones (which cannot, such as tuples and strings).
Operators − Operators are important in programming since they are the tools that are used for computation, variable assignment, condition evaluation, and loops.
Conditions − When it comes to decision-making, conditions are dominant. You will need to understand Boolean conditions, conditional chaining, and the statements used to check conditions. Conditions are commonly associated with loops and iteration, so you must also be familiar with the loops available in the language, such as for and while loops.
Basic Data Structures − Data structures play a key role in every program. There are various other data structures to learn, but the primary focus should be on lists, sets, dictionaries, and tuples.
Functions − Functions are important in any program. The combination of many functions within your programs is what causes the program to behave as expected.
Basic understanding of IO operations − This is not a difficult task. How to read from a text file is the concept to understand. How do you save a text file? Can you read CSV files? These are things you may need to accomplish, especially if you want to create real-world apps or store something in a file. So it will be the fundamental section for you.
Unit testing − You must know how to perform test-driven development in Python or any other programming language.
It is essential that you practice the skills in this area since mastering or having a thorough understanding of the basics will make your Python journey easier.
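Two of the beginner topics above, mutability and basic file I/O, can be tried out in a few lines. The following is only a minimal sketch: the variable names and the file name `scores.csv` are made up for illustration, and the file is written to a temporary directory so nothing on your machine is overwritten.

```python
import csv
import os
import tempfile

# Lists are mutable: two names bound to the same list see the same change.
scores = [90, 85]
alias = scores
alias.append(70)

# Tuples are immutable: "changing" one really builds a new object.
point = (1, 2)
moved = point + (3,)

# Write a small CSV file, then read it back (the file name is arbitrary).
rows = [["name", "score"], ["Ada", "90"], ["Linus", "85"]]
path = os.path.join(tempfile.mkdtemp(), "scores.csv")

with open(path, "w", newline="") as handle:
    csv.writer(handle).writerows(rows)

with open(path, newline="") as handle:
    loaded = list(csv.reader(handle))

print(scores)          # the append made through `alias` is visible here
print(moved)           # a new tuple; `point` itself is unchanged
print(loaded == rows)  # the CSV round trip preserved the rows
```

Note that the `csv` module reads every value back as a string, which is why the scores are written as strings in this sketch.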
Learning Intermediate Topics
The following are the main concepts that an intermediate Python programmer must learn −
Object-oriented programming (OOP)
Yes, it seems to be a very popular word. It requires a thorough understanding of classes, objects, and many concepts like instantiation, inheritance, abstraction, properties, and others. Learning this will help you a lot. If you only remember one thing from this intermediate level, it’s that you need a solid foundation in object-oriented programming in order to understand anything above this level.
Design patterns
When it comes to object-oriented programming, design patterns and best practices are essential.
Data structures
After you’ve mastered object-oriented programming, you must learn about data structures. Explore topics like queues, hash maps, and stacks, to name a few. When studying these subjects, knowing their efficiency and time complexity in big O notation is crucial. Don’t be frightened if you don’t grasp some of the terminology. You will make it.
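As a rough illustration of the three structures named above, here is a small sketch using only the standard library. The complexity notes in the comments are the usual average-case figures for CPython's built-in types; the sample data is invented.

```python
from collections import deque

# Stack (last in, first out): a list gives O(1) append and pop at the end.
stack = []
stack.append("a")
stack.append("b")
top = stack.pop()

# Queue (first in, first out): deque gives O(1) appends and pops at both ends,
# whereas list.pop(0) is O(n) because every remaining item shifts left.
queue = deque()
queue.append("first")
queue.append("second")
front = queue.popleft()

# Hash map: dict lookups, inserts, and deletes are O(1) on average.
ages = {"ada": 36, "linus": 28}
ages["grace"] = 45
has_ada = "ada" in ages

print(top, front, has_ada)
```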
Comprehensions
List and dictionary comprehensions are extremely cool, fancy-looking things in Python. They are techniques for writing one-liners (writing a whole independent statement in one line).
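A short sketch of both kinds of comprehension follows, with the equivalent explicit loop for comparison. The sample lists are invented for illustration.

```python
numbers = [1, 2, 3, 4, 5, 6]

# List comprehension: squares of the even numbers only.
even_squares = [n * n for n in numbers if n % 2 == 0]

# Dictionary comprehension: map each word to its length.
words = ["list", "dict", "tuple"]
lengths = {word: len(word) for word in words}

# The same list built with an explicit loop, for comparison.
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n * n)

print(even_squares)
print(lengths)
```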
Lambda functions
These are anonymous functions. They are commonly passed to tools that work on collections, such as sorted(), map(), and filter(), but they are not limited to them. Learn more about the lambda function and its best applications.
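The following sketch shows a lambda used as a sort key and as a filter predicate. The `people` data is made up for the example.

```python
people = [("Ada", 36), ("Linus", 28), ("Grace", 45)]

# A lambda as a sort key: order the tuples by age (the second field).
by_age = sorted(people, key=lambda person: person[1])

# A lambda with filter(): keep only people older than 30.
over_30 = list(filter(lambda person: person[1] > 30, people))

# A lambda is just a function object, so it can be bound to a name,
# although a regular def is usually clearer for anything reused.
double = lambda x: x * 2

print(by_age)
print(over_30)
print(double(4))
```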
Inheritance
Dunder Methods
If you’ve ever seen def __init__ or used a method whose name is wrapped in double underscores in the same way, that’s an example of a Python special ("dunder") method. These will be simple to learn once you’ve mastered the functions in the beginner section.
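As a minimal sketch, the hypothetical `Vector` class below defines a handful of special methods so that instances work with `+`, `==`, `len()`, and `repr()`.

```python
class Vector:
    """A tiny 2-D vector used only to illustrate special methods."""

    def __init__(self, x, y):
        # Called when Vector(x, y) is instantiated.
        self.x = x
        self.y = y

    def __repr__(self):
        # Used by repr() and the interactive prompt.
        return f"Vector({self.x}, {self.y})"

    def __eq__(self, other):
        # Used by the == operator.
        return isinstance(other, Vector) and (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        # Used by the + operator.
        return Vector(self.x + other.x, self.y + other.y)

    def __len__(self):
        # Used by len(); a 2-D vector always has two components.
        return 2


total = Vector(1, 2) + Vector(3, 4)
print(total)
```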
Pip
This is one of Python’s best features because pip is a package manager that allows you to include third-party modules in your code. This is related to learning about Python environments such as Anaconda and how to use them. You will also learn how to design and use your own modules in this section.
Learning Advanced Topics
Decorators
Decorators are often used alongside object-oriented programming. In layman’s terms, they decorate a function or method: they wrap it to add behavior without changing its body.
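A small sketch of the idea: the hypothetical `count_calls` decorator below wraps a function and records how often it is called, while the decorated function's own code stays untouched.

```python
import functools


def count_calls(func):
    """Hypothetical decorator that counts how many times func is called."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def greet(name):
    return f"Hello, {name}!"


first = greet("Ada")
second = greet("Linus")
print(first, second, greet.calls)
```

Writing `@count_calls` above the definition is shorthand for `greet = count_calls(greet)`.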
Generators
Generators are a way to use memory efficiently in Python. Suppose you are working through a collection but do not need the whole thing in memory at once. If you just need access to one item from that collection at a time, you can generate one item at a time. It does not have to be one item; it may be two or three. A generator can be used for this purpose.
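The sketch below shows both cases mentioned above: producing one item at a time, and producing a few items at a time. The function names `squares` and `batches` are invented for the example.

```python
def squares(limit):
    """Yield square numbers one at a time instead of building a full list."""
    for n in range(limit):
        yield n * n


def batches(items, size):
    """Yield consecutive chunks of items, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Creating the generator computes nothing yet; values appear on demand.
gen = squares(1_000_000)
first_three = [next(gen) for _ in range(3)]

chunks = list(batches([1, 2, 3, 4, 5], 2))

print(first_three)
print(chunks)
```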
Context managers
Context managers usually signal a setup step followed by a cleanup operation that runs when you leave the with block, whether you exit normally or because of an error.
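Here is a minimal sketch using `contextlib.contextmanager` from the standard library. The `managed_resource` function and the `events` log are hypothetical and only exist to show that the cleanup step runs in both the normal and the error case.

```python
from contextlib import contextmanager

events = []


@contextmanager
def managed_resource(name):
    events.append(f"open {name}")       # setup, runs on entering the block
    try:
        yield name.upper()              # value bound by "as"
    finally:
        events.append(f"close {name}")  # cleanup, runs even after an error


with managed_resource("db") as resource:
    events.append(f"use {resource}")

try:
    with managed_resource("file"):
        raise ValueError("boom")
except ValueError:
    events.append("handled error")

print(events)
```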
Metaclasses
Concurrency and parallelism
This needs its own article because it is a really long subject to get into.
Concurrency is the task of managing numerous computations whose execution overlaps in time, possibly by interleaving them on a single core, whereas parallelism is the task of executing several computations at literally the same instant, for example on multiple cores.
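The sketch below illustrates concurrency only: three simulated I/O waits overlap when run on a thread pool, so the total time is close to one wait rather than three. The `fake_download` function and the delays are invented for the example. For CPU-bound work, the standard library's `ProcessPoolExecutor` offers the same executor interface and runs tasks in separate processes, which is one way to get true parallelism.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(seconds):
    """Stand-in for an I/O-bound task such as a network request."""
    time.sleep(seconds)
    return seconds


delays = [0.2, 0.2, 0.2]

# One after another: the waits add up.
start = time.perf_counter()
sequential = [fake_download(d) for d in delays]
sequential_time = time.perf_counter() - start

# Concurrently on threads: the waits overlap.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    threaded = list(pool.map(fake_download, delays))
threaded_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```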
Cython
This could arguably be classified as expert or master level, but Cython is essentially a way to write C-speed code that interacts with Python. So, if I have a really performance-heavy piece of code or operation that needs to run quickly and I don’t trust Python to do it for me, I can write it in C, or in Cython’s Python-like language that compiles to C, and then connect it to Python using Cython.
Learning Expert Topics
You most likely have a vision of what you want to do at this stage. Consider it specialization. You can pursue a career in data science, machine learning, artificial intelligence (AI), or full-time web development. To be honest, each specialization delves deeper into the specific path chosen. There isn’t anything to write about that is relevant at the expert level. Every particular route necessitates more involvement, which you, as the developer, will select.
We cannot offer you a timetable for when you will arrive. It usually all comes down to your dedication and passion.
Conclusion
Lamborghini Is Auctioning A Supercar With An Exclusive NFT
Frank Sinatra once said, “You buy a Ferrari when you want to be somebody. You buy a Lamborghini when you are somebody.” Known for their stunning looks, incredible speeds, and powerful combustion engines, the one-of-a-kind luxury cars from Lamborghini are currently going through a big change. The Italian manufacturer previously announced that it will now focus on the development of hybrid and pure electric cars. Therefore, 2023 is going to be the last year for its V-12 combustion engine cars.
To mark the occasion, Lamborghini will auction its last gas-powered Aventador LP 780-4 Ultimae Coupé along with an exclusive NFT.
The company has partnered with DJ Steve Aoki, artist and innovator Krista Kim, and digital content studio INVNT Group for this “Ultimate” drop. There will be only one owner of the 1:1 NFT Lamborghini collectible, and the company also claims that “the drop is the world’s first NFT ever to be auctioned with a physical super sports car.”
Why is Lamborghini getting into NFTs?
While making the 1:1 NFT announcement, CEO of Automobili Lamborghini, Stephan Winkelmann, noted the synergy that exists between the car company and NFT community. “Lamborghini and the NFT community fit together very well, as we share many values. We are both young-spirited innovators, looking out for unexpected projects and technological solutions,” he said. In an email exchange with nft now, Winkelmann clarified his sentiments. “Lamborghini is more than just a super sports car manufacturer, it is an attitude, a lifestyle… the NFT community shares many of our values, consisting of innovative, young-spirited people seeking unexpected yet authentic ways to interact,” he said.
Winkelmann added that the project is particularly notable because Lamborghini is one of the first car companies to enter the space in this way. “This project is very special for us as it is a true first, a path nobody has ever taken,” he said.
What’s so special about Lamborghini’s “Ultimate” NFT drop?
The auction of Lamborghini’s last V-12 engine-equipped Aventador LP 780-4 Ultimae and the attached NFT will take place at RM Sotheby’s on April 19th at 6:00 PM CET. The lucky collector will get to attend a virtual meet-and-greet with Krista Kim and Steve Aoki. They will enjoy access to Lamborghini’s VIP utilities and receive a private tour of Museo Lamborghini. The NFT holder will also be eligible for a preview of future limited edition Lamborghini cars and other VIP benefits.
Although the 1:1 Lamborghini “Ultimate” NFT is going to be a unique drop, it’s not the first NFT project from the Italian car manufacturer. The company debuted in the NFT space in January with its “Space Time Memory” collection, a pack of five eye-catching Lamborghini Ultimae photos created by Swiss artist Fabian Oefner. This time artist Krista Kim is designing the NFT artwork. Her special signature gradient will appear on both the NFT and the real Aventador.
Meanwhile, Steve Aoki is creating the music for the NFT and he is also working on an exclusive track for the car. Excited about this partnership, Aoki said, “I’m honored to be partnering with Lamborghini and Krista Kim on this historic project! The drop signifies the ultimate intersection – where the physical world, digital art, and music come together as one. Every design element of this car is purposeful. It truly has its own story, and therefore I wanted my music track to reflect its soulful energy – the vibe, the spirit, and the power.”
The auction of its last combustion engine-based Aventador car will be a historic moment for Lamborghini. It will also be a proud moment for the NFT community. “This event will likely be one of the most prolific NFT drops this year and will certainly be one of the most historic automobile auctions ever,” said Scott Cullather, CEO of INVNT Group.
This article has been updated to include additional statements from Stephan Winkelmann.
What Is A Stock? A Beginner’s Guide
Stock
A claim over a company’s assets and its ownership
Written by
Andrew Loo
Published March 3, 2023
Updated May 13, 2023
What is a Stock?
When a person owns stock in a company, the individual is called a shareholder and is eligible to claim part of the company’s residual assets and earnings (should the company ever have to dissolve). A shareholder may also be referred to as a stockholder. The terms “stock,” “shares,” and “equity” are used interchangeably in modern financial language. The stock market consists of exchanges where investors can buy and sell individual shares of a company.
Benefits of Owning Stocks
There are many potential benefits to owning stocks or shares in a company.
1. Claim on assets
A shareholder has a claim on the assets of a company it has stock in. However, the claims on assets are relevant only when the company faces liquidation. In that event, all of the company’s assets and liabilities are counted, and after all creditors are paid, the shareholders can claim what is left. This is the reason that equity (stocks) investments are considered higher risk than debt (credit, loans, and bonds): creditors are paid before equity holders, and if there are no assets left after the debt is paid, the equity holders may receive nothing.
2. Dividends and capital gains
A stockholder may also receive earnings, which are paid in the form of dividends. The company can decide the amount of dividends to be paid in one period (such as one quarter or one year), or it can decide to retain all of the earnings to expand the business further. Aside from dividends, the stockholder can also enjoy capital gains from stock price appreciation.
3. Power to vote
Another powerful feature of stock ownership is that shareholders are entitled to vote for management changes if the company is mismanaged. The executive board of a company will hold annual meetings to report overall company performance. They disclose plans for future period operations and management decisions. Should investors and stockholders disagree with the company’s current operation or future plans, they have the power to negotiate changes in management or business strategy.
4. Limited liability
Lastly, when a person owns shares of a company, the nature of ownership is limited. Should the company go bankrupt, shareholders are not personally liable for any loss.
Risks of Owning Stock
Along with the benefits of stock ownership, there are also risks that investors have to consider.
1. Loss of capital
There is no guarantee that a stock’s price will move up. For example, an investor may buy shares at $50 during an IPO, but find that the shares move down to $20 as the company begins to perform badly.
2. No liquidation preference
When a company liquidates, creditors are paid before equity holders. In most cases, a company will only liquidate when it has very few assets left to operate, which usually means that there will be no assets left for equity holders once creditors are paid off.
3. Irrelevant power to vote
While retail investors technically have voting rights in executive board meetings, in practice they usually have very limited influence or power. The majority shareholder typically determines the outcome of all votes at shareholder meetings.
Modern Stock Trading
What Affects Share Prices on the Stock Market?
There are many factors that affect share prices. These may include the global economy, sector performance, government policies, natural disasters, and other factors. Investor sentiment, meaning how investors feel about the company’s future prospects, often plays a large part in dictating the price. If investors are confident about a company’s ability to rapidly grow and eventually produce large returns on investment, then the company’s stock price may be well above its current intrinsic, or actual, value.
Two of the most examined financial ratios used to evaluate stocks are the following:
Revenue growth
Earnings growth
Revenue growth tells analysts about the sales performance of the company’s products or services and generally indicates whether or not its customers love what it does. Earnings reveal how efficiently the company manages its operations and resources to produce profits. Both are very high-level indicators that can be used as references on whether or not to purchase shares. However, stock analysts also use many other financial ratios and tools to help investors profit from equity trading.
No matter what your job in the financial industry, you will be involved with stocks in one way or another.
A Guide To Building An End
This article was published as a part of the Data Science Blogathon.
Knock! Knock!
Who’s there?
It’s Natural Language Processing!
Today we will implement a multi-class text classification model on an open-source dataset and explore more about the steps and procedure. Let’s begin.
Table of Contents
Dataset
Loading the data
Feature Engineering
Text processing
Exploring Multi-classification Models
Compare Model performance
Evaluation
Prediction
Dataset for Text Classification
The dataset consists of real-world complaints received from customers regarding financial products and services. Each complaint is labeled with a specific product. Hence, we can conclude that this is a supervised problem, where we have both the input and the target output. We will try different machine learning algorithms and check which one works better.
Our aim is to classify the complaints of the consumer into predefined categories using a suitable classification algorithm. For now, we will be using the following classification algorithms.
Linear Support Vector Machine (LinearSVM)
Random Forest
Multinomial Naive Bayes
Logistic Regression.
Loading the Data
Download the dataset from the link given in the above section. Since I am using Google Colab, if you want to do the same you can use the Google Drive link given here and import the dataset from your Google Drive. The code below mounts the drive and unzips the data into the current working directory in Colab.

from google.colab import drive
drive.mount('/content/drive')
!unzip /content/drive/MyDrive/rows.csv.zip

First, we will install the required modules.
pip install numpy
pip install pandas
pip install seaborn
pip install scikit-learn
pip install scipy
Once everything is successfully installed, we will import the required libraries.

import os
import pandas as pd
import numpy as np
from scipy.stats import randint
import seaborn as sns  # used for plotting graphs
import matplotlib.pyplot as plt
from io import StringIO
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import chi2
from IPython.display import display
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score
from sklearn.metrics import confusion_matrix
from sklearn import metrics

Now let us load the dataset and see the shape of the loaded dataset.

# loading data
df = pd.read_csv('/content/rows.csv')
print(df.shape)

From the output of the above code, we can see that the dataset is very large and has 18 columns. Let us see what the data looks like by executing the code below.

df.head(3).T

Now, for our multi-class text classification task, we will be using only two of these 18 columns: the column named ‘Product’ and the column ‘Consumer complaint narrative’. Let us create a new DataFrame to store only these two columns and, since we have enough rows, remove all the missing (NaN) values. To make it easier to understand, we will rename the second column of the new DataFrame to ‘Consumer_complaint’.

# Create a new dataframe with two columns
df1 = df[['Product', 'Consumer complaint narrative']].copy()
# Remove missing values (NaN)
df1 = df1[pd.notnull(df1['Consumer complaint narrative'])]
# Renaming second column for a simpler name
df1.columns = ['Product', 'Consumer_complaint']
print(df1.shape)
df1.head(3).T

We can see that after discarding all the missing values, we have around 383k rows and 2 columns. This will be our data for training. Now let us check how many unique products there are.

pd.DataFrame(df1.Product.unique()).values

There are 18 product categories. To make the training process easier, we will make some changes to the category names.

# Because the computation is time consuming (in terms of CPU), the data was sampled
df2 = df1.sample(10000, random_state=1).copy()
# Renaming categories
df2.replace({'Product':
             {'Credit reporting, credit repair services, or other personal consumer reports':
              'Credit reporting, repair, or other',
              'Credit reporting': 'Credit reporting, repair, or other',
              'Credit card': 'Credit card or prepaid card',
              'Prepaid card': 'Credit card or prepaid card',
              'Payday loan': 'Payday loan, title loan, or personal loan',
              'Money transfer': 'Money transfer, virtual currency, or money service',
              'Virtual currency': 'Money transfer, virtual currency, or money service'}},
            inplace=True)
pd.DataFrame(df2.Product.unique())

The 18 categories are now reduced to 13: we have combined ‘Credit card’ and ‘Prepaid card’ into a single class, and so on.
Now, we will map each of these categories to a number, so that our model can understand it in a better way, and we will save this in a new column named ‘category_id’, where each of the 13 categories is represented numerically.

# Create a new column 'category_id' with encoded categories
df2['category_id'] = df2['Product'].factorize()[0]
category_id_df = df2[['Product', 'category_id']].drop_duplicates()
# Dictionaries for future use
category_to_id = dict(category_id_df.values)
id_to_category = dict(category_id_df[['category_id', 'Product']].values)
# New dataframe
df2.head()

Let us visualize the data and see how many complaints there are per category. We will use a bar chart here.

fig = plt.figure(figsize=(8,6))
colors = ['grey','grey','grey','grey','grey','grey','grey','grey','grey',
          'grey','darkblue','darkblue','darkblue']
df2.groupby('Product').Consumer_complaint.count().sort_values().plot.barh(
    ylim=0, color=colors, title='NUMBER OF COMPLAINTS IN EACH PRODUCT CATEGORY\n')
plt.xlabel('Number of occurrences', fontsize=10);

The graph above shows that most of the customers complained about:
Credit reporting, repair, or other
Debt collection
Mortgage
Text processing
The text needs to be preprocessed so that we can feed it to the classification algorithm. Here we will transform the texts into vectors using Term Frequency-Inverse Document Frequency (TF-IDF) and evaluate how important a particular word is in the collection of words. For this we need to remove punctuation and convert the text to lower case; the word importance is then determined in terms of frequency.
We will be using the TfidfVectorizer class with the parameters below:
min_df: removes the words which occur in fewer than ‘min_df’ documents.
sublinear_tf: if True, scales the term frequency on a logarithmic scale.
stop_words: removes the stop words predefined for ‘english’.
ngram_range: with (1, 2), both unigrams and bigrams are used as features.

tfidf = TfidfVectorizer(sublinear_tf=True, min_df=5,
                        ngram_range=(1, 2),
                        stop_words='english')
# We transform each complaint into a vector
features = tfidf.fit_transform(df2.Consumer_complaint).toarray()
labels = df2.category_id
print("Each of the %d complaints is represented by %d features (TF-IDF score of unigrams and bigrams)" %(features.shape))

Now, we will find the terms most correlated with each of the defined product categories. Here we find only the three most correlated terms.

# Finding the three most correlated terms with each of the product categories
N = 3
for Product, category_id in sorted(category_to_id.items()):
    features_chi2 = chi2(features, labels == category_id)
    indices = np.argsort(features_chi2[0])
    # get_feature_names_out() replaces get_feature_names() in recent scikit-learn releases
    feature_names = np.array(tfidf.get_feature_names_out())[indices]
    unigrams = [v for v in feature_names if len(v.split(' ')) == 1]
    bigrams = [v for v in feature_names if len(v.split(' ')) == 2]
    print(" * Most Correlated Unigrams are: %s" %(', '.join(unigrams[-N:])))
    print(" * Most Correlated Bigrams are: %s" %(', '.join(bigrams[-N:])))

* Most Correlated Unigrams are: overdraft, bank, scottrade
* Most Correlated Bigrams are: citigold checking, debit card, checking account
* Most Correlated Unigrams are: checking, branch, overdraft
* Most Correlated Bigrams are: 00 bonus, overdraft fees, checking account
* Most Correlated Unigrams are: dealership, vehicle, car
* Most Correlated Bigrams are: car loan, vehicle loan, regional acceptance
* Most Correlated Unigrams are: express, citi, card
* Most Correlated Bigrams are: balance transfer, american express, credit card
* Most Correlated Unigrams are: report, experian, equifax
* Most Correlated Bigrams are: credit file, equifax xxxx, credit report
* Most Correlated Unigrams are: collect, collection, debt
* Most Correlated Bigrams are: debt collector, collect debt, collection agency
* Most Correlated Unigrams are: ethereum, bitcoin, coinbase
* Most Correlated Bigrams are: account coinbase, coinbase xxxx, coinbase account
* Most Correlated Unigrams are: paypal, moneygram, gram
* Most Correlated Bigrams are: sending money, western union, money gram
* Most Correlated Unigrams are: escrow, modification, mortgage
* Most Correlated Bigrams are: short sale, mortgage company, loan modification
* Most Correlated Unigrams are: meetings, productive, vast
* Most Correlated Bigrams are: insurance check, check payable, face face
* Most Correlated Unigrams are: astra, ace, payday
* Most Correlated Bigrams are: 00 loan, applied payday, payday loan
* Most Correlated Unigrams are: student, loans, navient
* Most Correlated Bigrams are: income based, student loan, student loans
* Most Correlated Unigrams are: honda, car, vehicle
* Most Correlated Bigrams are: used vehicle, total loss, honda financial
Exploring Multi-classification Models
The classification models which we are using:
Random Forest
Linear Support Vector Machine
Multinomial Naive Bayes
Logistic Regression.
For more information regarding each model, you can refer to their official guide.
Now, we will split the data into train and test sets. We will use 75% of the data for training and the rest for testing. Column ‘Consumer_complaint’ will be our X, the input, and the product is our y, the output.

X = df2['Consumer_complaint']  # Collection of documents
y = df2['Product']  # Target or the labels we want to predict (i.e., the 13 different complaints of products)
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.25,
                                                    random_state=0)

We will keep all the models in a list and loop through it to get the mean accuracy and standard deviation of each model, so that we can compare their performance. Then we can decide which model to move forward with.

models = [
    RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0),
    LinearSVC(),
    MultinomialNB(),
    LogisticRegression(random_state=0),
]
# 5-fold cross-validation
CV = 5
cv_df = pd.DataFrame(index=range(CV * len(models)))
entries = []
for model in models:
    model_name = model.__class__.__name__
    accuracies = cross_val_score(model, features, labels, scoring='accuracy', cv=CV)
    for fold_idx, accuracy in enumerate(accuracies):
        entries.append((model_name, fold_idx, accuracy))
cv_df = pd.DataFrame(entries, columns=['model_name', 'fold_idx', 'accuracy'])

The above code will take some time to complete its execution.
Compare Text Classification Model performance
Here, we will compare the ‘Mean Accuracy’ and ‘Standard Deviation’ for each of the four classification algorithms.

mean_accuracy = cv_df.groupby('model_name').accuracy.mean()
std_accuracy = cv_df.groupby('model_name').accuracy.std()
acc = pd.concat([mean_accuracy, std_accuracy], axis=1, ignore_index=True)
acc.columns = ['Mean Accuracy', 'Standard deviation']
acc

From the above table, we can clearly say that ‘Linear Support Vector Machine’ outperforms all the other classification algorithms. So, we will use LinearSVC to train the model for our multi-class text classification task.

plt.figure(figsize=(8,5))
sns.boxplot(x='model_name', y='accuracy',
            data=cv_df,
            color='lightblue',
            showmeans=True)
plt.title("MEAN ACCURACY (cv = 5)\n", size=14);

Evaluation of Text Classification Model
Now, let us train our model using ‘Linear Support Vector Machine’, so that we can evaluate and check its performance on unseen data.
X_train, X_test, y_train, y_test,indices_train,indices_test = train_test_split(features, labels, df2.index, test_size=0.25, random_state=1) model = LinearSVC() model.fit(X_train, y_train) y_pred = model.predict(X_test)We will generate claasifiaction report, to get more insights on model performance.
# Classification report
print('\t\t\t\tCLASSIFICATION METRICS\n')
print(metrics.classification_report(y_test, y_pred, target_names=df2['Product'].unique()))

From the above classification report, we can observe that the classes with a greater number of occurrences tend to have a better f1-score than the other classes. The categories that yield the best classification results are ‘Student loan’, ‘Mortgage’ and ‘Credit reporting, repair, or other’. Classes like ‘Debt collection’ and ‘credit card or prepaid card’ also give good results. Now let us plot the confusion matrix to check the misclassified predictions.
conf_mat = confusion_matrix(y_test, y_pred)
fig, ax = plt.subplots(figsize=(8,8))
sns.heatmap(conf_mat, annot=True, cmap="Blues", fmt='d',
            xticklabels=category_id_df.Product.values,
            yticklabels=category_id_df.Product.values)
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.title("CONFUSION MATRIX - LinearSVC\n", size=16);

From the above confusion matrix, we can say that the model is doing a pretty decent job. It has classified most of the categories accurately.
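The confusion matrix can also be read programmatically. The sketch below assumes the same layout as the plot above (rows are actual classes, columns are predicted classes) and uses a small made-up 3x3 matrix rather than the real results. The helper names `per_class_f1` and `top_confusions` are hypothetical and are not part of scikit-learn.

```python
def per_class_f1(conf_mat):
    """Return the F1 score of each class from a square confusion matrix.

    Rows are actual classes and columns are predicted classes.
    """
    n = len(conf_mat)
    scores = []
    for k in range(n):
        tp = conf_mat[k][k]
        fn = sum(conf_mat[k]) - tp                       # actual k, predicted other
        fp = sum(conf_mat[r][k] for r in range(n)) - tp  # predicted k, actual other
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def top_confusions(conf_mat, labels, top_n=3):
    """Return the largest off-diagonal cells as (count, actual, predicted)."""
    n = len(conf_mat)
    pairs = [
        (conf_mat[i][j], labels[i], labels[j])
        for i in range(n)
        for j in range(n)
        if i != j and conf_mat[i][j] > 0
    ]
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs[:top_n]


# Made-up counts for three categories, for illustration only
labels = ["Mortgage", "Debt collection", "Student loan"]
toy_conf_mat = [
    [50, 4, 1],
    [6, 30, 2],
    [0, 3, 20],
]

f1_scores = per_class_f1(toy_conf_mat)
worst_pairs = top_confusions(toy_conf_mat, labels, top_n=2)

for label, score in zip(labels, f1_scores):
    print(f"{label}: f1={score:.3f}")
for count, actual, predicted in worst_pairs:
    print(f"{count} '{actual}' complaints were predicted as '{predicted}'")
```

Listing the largest off-diagonal cells is a quick way to see which pairs of categories the model mixes up most often, which is harder to spot by eye in a 13x13 heatmap.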
Prediction
Let us make some predictions on unseen data and check the model's performance.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
tfidf = TfidfVectorizer(sublinear_tf=True, min_df=5, ngram_range=(1, 2), stop_words='english')
fitted_vectorizer = tfidf.fit(X_train)
tfidf_vectorizer_vectors = fitted_vectorizer.transform(X_train)
model = LinearSVC().fit(tfidf_vectorizer_vectors, y_train)

Now run the prediction.
complaint = """I have received over 27 emails from XXXX XXXX who is a representative from Midland Funding LLC. From XX/XX/XXXX I received approximately 6 emails. From XX/XX/XXXX I received approximately 6 emails. From XX/XX/XXXX I received approximately 9 emails. From XX/XX/XXXX I received approximately 6 emails. All emails came from the same individual, XXXX XXXX. It is becoming a nonstop issue of harassment."""
print(model.predict(fitted_vectorizer.transform([complaint])))

complaint = """Respected Sir/ Madam, I am exploring the possibilities for financing my daughter 's XXXX education with private loan from bank. I am in the XXXX on XXXX visa. My daughter is on XXXX dependent visa. As a result, she is considered as international student. I am waiting in the Green Card ( Permanent Residency ) line for last several years. I checked with Discover, XXXX XXXX websites. While they allow international students to apply for loan, they need cosigners who are either US citizens or Permanent Residents. I feel that this is unfair. I had been given mortgage and car loans in the past which I closed successfully. I have good financial history."""
print(model.predict(fitted_vectorizer.transform([complaint])))

complaint = """They make me look like if I was behind on my Mortgage on the month of XX/XX/2023 & XX/XX/XXXX when I was not and never was, when I was even giving extra money to the Principal. The Money Source Web site and the managers started a problem, when my wife was trying to increase the payment, so more money went to the Principal and two payments came out that month and because I reverse one of them thru my Bank as Fraud they took revenge and committed slander against me by reporting me late at the Credit Bureaus, for 45 and 60 days, when it was not thru. Told them to correct that and the accounting department or the company revert that letter from going to the Credit Bureaus to correct their injustice. 
The manager by the name XXXX requested this for the second time and nothing yet. I am a Senior of XXXX years old and a Retired XXXX Veteran and is a disgraced that Americans treat us that way and do not want to admit their injustice and lies to the Credit Bureau."""
print(model.predict(fitted_vectorizer.transform([complaint])))

The model is not perfect, yet it performs very well.
The notebook is available here.
Conclusion
We have implemented a basic multi-class text classification model. You can experiment with other models such as XGBoost, or compare the performance of multiple models on this dataset using an AutoML framework. This is not the end: multi-class text classification still involves complex problems, and you can always explore further and pick up new concepts and ideas on the topic. That’s it!
Thank you!
All images are created by the author.
The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion
Pensacola With A Purpose: An ASB Diary
Part Three of a five-part series.
Wednesday, March 8, 2006
Wednesday gets off to a bad, bad start.
The morning is typical — a frantic rush to get breakfast eaten, lunches made, and everybody out the door on time. We arrive at the site and the first two jobs are easy, thanks to our new expertise: we clear a patch of weeds from a telephone pole and we tear down a rotted porch at the back of a trailer.
Then the plywood arrives.
Our job, we learn, is to lay down new plywood walls and floors throughout the wrecked trailer, using four hammers, two crowbars, a circular saw and a gas generator, and moldy eight-by-four slabs that have been donated from an amusement park in Washington, D.C. It’s about 9:30, and we’re supposed to finish at noon. Unlikely, we think, and even more unlikely as we watch Daryl, our coordinator from Rebuilding Northwest Florida, get ready to leave. Katie flags him down and asks if we could maybe have a tape measure and a pencil.
Cursing isn’t allowed on site, because we are, after all, representing Boston University, but much of it is done internally. We don’t have any nails. We don’t know if we are supposed to remove the linoleum from the kitchen before laying the plywood floor. And only two people in our crew have used a circular saw before, and neither of them feels particularly expert.
But the good thing about RNF, we’ve discovered, is that when you have to, you get to make your own rules. And luckily, Vernon Doucette, a photographer for BU Photo Services, who arrived on Tuesday to shoot pictures for BU Today and Bostonia, has some ideas about rules to get us through the day. It so happens that Vernon is a serious kayaker and a former Outward Bound instructor. Also, we are pleased to learn, he’s a pretty good construction manager.
The orders begin: sweep up everything off the floor or the plywood won’t lie even. Bring two pieces inside and then measure how much you need to cut. Wear goggles when you use a circular saw. At first, he goes back and forth between shooting and sawing, but eventually, he hands the camera over to Karen and dives full-force into the project. We set down the living room floor — or Vernon, Dan, and Katie do, fitting the pieces together like a jigsaw puzzle — and two-thirds of the bedroom. We don’t get to the walls or the living room, but Daryl doesn’t seem to mind. We are finished by 12:30.
Kendrick, a group leader, feels responsible for the morning’s confusion, and in the van on the way home, she apologizes. No big deal. At this point, everybody is thinking about lunch and our afternoon at the beach.
But the waves, which are enormous, feel amazing; Amy, Katie, Dan, Matt, and I spend nearly an hour diving through them. Vernon takes a well-deserved lunch break and comes back to snap pictures of us getting knocked around by six-foot swells. The images, I think, would make a good news headline: Boston University students drown — photographer captures it all.
We’re not sure what we’ll be doing tomorrow. It might be demolition, or it might be back to today’s trailer to finish the floors and walls. We’ve got our fingers collectively crossed for the former, but if we go back to today’s site at least we’ll be better off than where we started.