Using The Power Of Deep Learning For Cyber Security (Part 2)


Introduction

We are in the midst of a deep learning revolution. Unprecedented success is being achieved in designing deep neural network models for building computer vision and Natural Language Processing (NLP) applications.

State-of-the-art benchmarks are disrupted and updated on a regular basis in tasks like object detection, language translation, and sentiment analysis. It’s a great time to work with deep learning!

But there’s still one field that isn’t quite riding this success wave. The application of deep learning in Information Security (InfoSec) is still very much in its nascent stages. The flashy nature of other applications attracts newcomers and investors – but InfoSec is one of the most crucial fields every data scientist should pay attention to.

Here’s the good news – Malware detection and network intrusion detection are two areas where deep learning has shown significant improvements over the rule-based and classic machine learning-based solutions [3].

This article is the second part of our deep learning for cyber security series. We will demonstrate the power of deep neural networks, built using TensorFlow and Keras, to detect obfuscated PowerShell scripts. As we mentioned, this is a must-read for anyone interested in this field.

Table of Contents

What is PowerShell?

Understanding the Problem

Gathering and Building the PowerShell Scripts Dataset

Data Experiments

What is PowerShell?

PowerShell is a task automation and configuration management framework consisting of a command-line shell and an associated scripting language. Microsoft open sourced it and made it cross-platform compatible in August 2016.

PowerShell has been a heavily exploited tool in various cyber attack scenarios. According to a research study by Symantec, 95.4% of all PowerShell scripts analyzed by the Symantec Blue Coat Sandbox were malicious [4].

The Odinaff hacker group leveraged malicious PowerShell scripts as part of its attacks on banks and other financial institutions [5]. We can find many tools like PowerShell Empire [6] and PowerSploit [7] on the internet that can be used for reconnaissance, privilege escalation, lateral movement, persistence, defense evasion, and exfiltration.

Adversaries typically use two techniques to evade detection:

First, by running fileless malware, they load malicious scripts downloaded from the internet directly into memory, thereby evading Antivirus (AV) file scanning

Then, they use obfuscation to make their code challenging to decode. This makes it more difficult for any AV or analyst to figure out the intent of the script

Obfuscation of PowerShell scripts for malicious intent is on the rise. The task of analyzing them is made even more difficult by the high flexibility of PowerShell's syntax. In Acalvio high-interaction decoys, we can monitor the PowerShell logs, commands, and scripts that an attacker tries to execute in the decoy. We collect these logs, analyze them in real time, and detect whether a script is obfuscated or not.

Understanding the Problem

Microsoft PowerShell is an ideal attacker's tool on a Windows operating system. There are two primary reasons behind this:

It is installed by default in Windows

Attackers are better off using existing tools that allow them to blend well and possibly evade Antivirus (AV) software

Microsoft has enhanced PowerShell logging considerably since they launched PowerShell 3.0. If Script Block Logging is enabled, we can capture commands and scripts executed through PowerShell in the event logs. These logs can be analyzed to detect and block malicious scripts.

Obfuscation is typically used to evade detection. Daniel and Holmes address the problem of detecting obfuscated scripts in their Black Hat paper [8]. They used a Logistic Regression classifier trained with gradient descent and achieved reasonable classification accuracy in separating obfuscated scripts from clean scripts.

However, a deep feed-forward neural network (FFN) might improve other performance metrics, such as precision and recall. Hence, in this blog, we decided to use a deep neural network and compare its performance metrics with those of different machine learning (ML) classifiers.

Gathering and Building the PowerShell Scripts Dataset

We used the PowerShellCorpus dataset published and open sourced by Daniel [9] for our data experiments. The dataset consists of around 300k PowerShell scripts scraped from various sources on the internet, such as GitHub, PowerShell Gallery, and TechNet.

We also scraped PowerShell scripts from PoshCode [10] and added them to the corpus. In total, we had nearly 3 GB of script data consisting of 300k clean scripts. We used the Invoke-Obfuscation [11] tool to obfuscate the scripts. Once all the scripts were obfuscated, we labeled the dataset with two class labels: clean and obfuscated.

Data Experiments

Consider a simple PowerShell command such as Get-Process | Where {$_.Handles -gt 600} | Sort Handles | Format-Table (this is the command encoded by the format string below). It may be obfuscated as:

(((“{2}{9}{12}{0}{3}{10}{13}{4}{18}{8}{17}{11}{5}{16}{1}{15}{14}{7}{19}{6}"-f'-P','es','G', 'rocess8Dy Whe',' {','S','le','-Ta','-gt','e','r','y ','t','e','t',' 8Dy Forma','ort Handl',' 600} 8D','RYl_.Handles ','b'))

This looks suspicious and noisy. Here is another example of a subtle obfuscation for the same command:

This obfuscation makes it hard to detect the intent of the PowerShell command/script. Most of the malicious PowerShell scripts in the wild have these kinds of subtle variations that help them to evade anti-virus software easily.

It is nearly impossible for a security analyst to review every PowerShell script to determine whether it is obfuscated or not. Therefore, we need to automate obfuscation detection. We could use a rule-based approach; however, it may miss many obfuscation types, and a domain expert would need to manually write a large number of rules. Therefore, a machine learning/deep learning-based solution is an ideal answer to this problem.

Typically, the first step of machine learning is data cleaning and preprocessing. For the obfuscation detection dataset, preprocessing consists of removing Unicode characters from each script.

Obfuscated scripts look different from normal scripts. Some combinations of characters used in obfuscated scripts are not found in normal scripts. So, we use a character-level representation for all PowerShell scripts instead of a word-based representation.

Also, in the case of PowerShell scripting, sophisticated obfuscation can sometimes completely blur the boundary between words/tokens/identifiers, rendering it useless for any word-based tokenization. In fact, character-based tokenization is also used by security researchers to detect PowerShell obfuscated scripts.

Lee Holmes from Microsoft has explored character frequency-based representation and cosine similarity to detect obfuscated scripts in his blog [12].

There are multiple ways in which characters can be vectorized. One-hot encoding of characters represents every character by a bit, set to 1 or 0 depending on whether that character is present in the script or not. Classifiers trained on this single-character one-hot encoding perform reasonably well.

However, this can be improved by capturing sequences of characters. For example, a command like New-Object may be obfuscated as ('Ne'+'w-'+'Objec'+'t'). The plus (+) character is common in any PowerShell script; however, a plus (+) followed by a single quote (') or double quote (") is not as common. Therefore, we use tf-idf character bigrams as the input features for the classifiers.
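As a quick illustration (a minimal sketch, not our exact pipeline), character-bigram tf-idf features can be extracted with scikit-learn; here scripts is assumed to be a list of PowerShell script strings:

from sklearn.feature_extraction.text import TfidfVectorizer

# character-level bigrams instead of word tokens
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(2, 2), lowercase=True)

# X is a sparse matrix of shape (number of scripts, number of distinct bigrams)
X = vectorizer.fit_transform(scripts)

# the learned bigram vocabulary is available in vectorizer.vocabulary_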

Here are the 20 bigrams with the highest tf-idf scores from the training dataset:

Clean script:

['er', 'te', 'in', 'at', 're', 'pa', 'st', 'on', 'me', 'en', 'ti', 'le', 'th', 'am', 'nt', 'es', 'se', 'or', 'ro', 'co']

Obfuscated script:

["'+", "+'", '}{', ",'", "',", 'er', 'te', 'in', 're', 'me', 'st', 'et', 'se', 'ar', 'on', 'at', 'ti', 'am', 'es', '{1']

Each script is represented using these character bigrams. We process the features using a deep feed-forward neural network (FFN) with N hidden layers, built using Keras and TensorFlow.

Figure 1: Obfuscation Detection data flow diagram using deep FFN

The data flow diagram above shows the training and prediction flow for obfuscation detection. We varied the number of hidden layers in the deep FFN and found N = 6 to be optimal.

The ReLU activation function is used for all the hidden layers. Each hidden layer is dense, with 1,000 units and a dropout rate of 0.5. For the last layer, sigmoid is used as the activation function. Figure 2 below shows the deep FFN network representation for obfuscation detection:

Figure 2: FFN Network Representation for Obfuscation Detection
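For reference, a minimal Keras sketch of the architecture described above might look like the following (this is illustrative only and assumes the input is the tf-idf bigram vector of dimension input_dim; it is not the exact training code used in our experiments):

from tensorflow import keras
from tensorflow.keras import layers

def build_ffn(input_dim, hidden_layers=6, width=1000, dropout=0.5):
    model = keras.Sequential()
    model.add(keras.Input(shape=(input_dim,)))
    for _ in range(hidden_layers):
        # dense hidden layer with ReLU activation, followed by dropout
        model.add(layers.Dense(width, activation="relu"))
        model.add(layers.Dropout(dropout))
    # sigmoid output for the binary clean-vs-obfuscated decision
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model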

We see a validation accuracy of nearly 92%, which indicates that the model has generalized well beyond the training set.

Next, we test our model on the test set. We get an accuracy of 93% with 0.99 recall for the obfuscated class. Figure 3 below shows the classification accuracy and classification loss plots for training and validation data for each epoch:

Figure 3: Classification Accuracy and Loss plots for Training and Validation Phase

Check out the results of our deep FFN compared to other ML models in the table below. Precision and recall were used to measure the efficacy of the various models:

Classifier Used                              Precision   Recall
Random Forest                                0.92        0.97
Logistic Regression                          0.91        0.87
Deep Feed-forward Neural Network (FFN)       0.89        0.99

Our objective is to correctly detect as many obfuscated scripts as possible. In other words, we would like to minimize the false negative rate for the obfuscated class. Recall is the appropriate metric in this case.

Table 1 shows that the deep FFN model achieves higher recall than the other classifiers. The dataset used in our experiments is of medium scale; datasets in the wild are typically much larger, and the deep FFN performs even better relative to the other ML classifiers as the data grows.

End Notes

In this blog, we demonstrated the power of deep learning to detect obfuscated PowerShell scripts. In our next blog of this series, we will share some more use cases where AI and deception can be leveraged for information security.

About the Authors

Santosh Kosgi, Data Scientist – Acalvio Technologies

Santosh is a member of the Data Science team at Acalvio Technologies. He holds a Masters in Computer Science from IIIT Hyderabad. He is interested in solving real-world problems using Machine Learning.

Arunabha Choudhury, Data Scientist – Acalvio Technologies

Arunabha is a member of the Data Science team at Acalvio Technologies. He holds a Masters degree in CS from the University of Kansas with a minor in Machine Learning. In his 6+ years of experience as a Data Scientist, he has worked with companies like Tresata (now considered a Big Data unicorn based out of Charlotte, NC) and Samsung R&D. He has 3 patents and 4 conference publications to his credit. He is primarily interested in Machine Learning at scale.

Waseem Mohd., Data Scientist – Acalvio Technologies

Waseem is a member of the Data Science team at Acalvio. He is a Computer Science graduate from IIT Delhi and has previously worked with companies like Samsung and Microsoft. He is interested in real world applications of Deep Learning.

Dr. Satnam Singh, Chief Data Scientist – Acalvio Technologies

Dr. Satnam Singh is currently leading security data science development at Acalvio Technologies. He has more than a decade of experience successfully building data products from concept to production in multiple domains. He was named one of the top 10 data scientists in India. He has 25+ patents and 30+ journal and conference publications to his credit.

Apart from holding a PhD in ECE from the University of Connecticut, Satnam also holds a Masters in ECE from the University of Wyoming. Satnam is a senior IEEE member and a regular speaker at various Big Data and Data Science conferences.


You're reading Using The Power Of Deep Learning For Cyber Security (Part 2) – Must

Fundamentals Of Deep Learning – Starting With Artificial Neural Network

Introduction

Did you know that the first neural network was developed in the early 1950s?

Deep Learning (DL) and Neural Networks (NN) are currently driving some of the most ingenious inventions of this century. Their incredible ability to learn from data and their environment makes them the first choice of machine learning scientists.

Deep Learning and Neural Networks lie at the heart of products such as self-driving cars, image recognition software, and recommender systems. Evidently, being powerful algorithms, they are also highly adaptive to various data types.

People think neural networks are an extremely difficult topic to learn. As a result, some don't use them at all, while others use them only as a black box. Is there any point in doing something without knowing how it is done? No!

Note: This article is best suited for intermediate users in data science & machine learning. Beginners might find it challenging.

What is a Neural Network?

A Neural Network (NN), also called an Artificial Neural Network (ANN), is named after the way it artificially represents the working of the human nervous system. Remember this diagram? Most of us were taught it in high school!

Flashback recap: Let's start by understanding how our nervous system works. The nervous system comprises millions of nerve cells, or neurons. A neuron has the following structure:

The major components are:

Dendrites – take input from other neurons in the form of an electrical impulse

Cell Body – generates inferences from those inputs and decides what action to take

Axon terminals – transmit outputs in the form of an electrical impulse

In simple terms, each neuron takes input from numerous other neurons through the dendrites. It then performs the required processing on the input and sends another electrical pulse through the axon to the terminal nodes, from where it is transmitted to numerous other neurons.

An ANN works in a very similar fashion. The general structure of a neural network looks like this:

This figure depicts a typical neural network, with the working of a single neuron explained separately. Let's understand this.

The inputs to each neuron are like the dendrites. Just as in the human nervous system, a neuron (an artificial one, though!) collates all the inputs and performs an operation on them. It then transmits the output to all the neurons (of the next layer) to which it is connected. A neural network is divided into layers of 3 types:

Input Layer: The training observations are fed through these neurons

Hidden Layers: These are the intermediate layers between input and output which help the Neural Network learn the complicated relationships involved in data.

Output Layer: The final output is extracted from the previous two layers. For example, in the case of a classification problem with 5 classes, the output layer will have 5 neurons.

Let's start by looking into the functionality of each neuron with examples.

How Does a Single Neuron Work?

In this section, we will explore the working of a single neuron with easy examples. The idea is to give you some intuition on how a neuron computes outputs from its inputs. A typical neuron looks like this:

The different components are:

x1, x2,…, xN: Inputs to the neuron. These can either be the actual observations from input layer or an intermediate value from one of the hidden layers.

x0: Bias unit. This is a constant value added to the input of the activation function. It works similar to an intercept term and typically has +1 value.

w0, w1, w2, …, wN: Weights on each input. Note that even the bias unit has a weight.

a: Output of the neuron, which is calculated as the activation function applied to the weighted sum of the inputs: a = f( w0*x0 + w1*x1 + … + wN*xN )

Here f is known as the activation function. This makes a neural network extremely flexible and imparts the capability to estimate complex non-linear relationships in the data. It can be a gaussian function, a logistic function, a hyperbolic function, or even a linear function in simple cases.

Let's implement 3 fundamental functions – OR, AND, and NOT – using neural networks. This will help us understand how they work. You can think of these as classification problems where we predict the output (0 or 1) for different combinations of inputs.

We will model these like linear classifiers with a simple threshold activation function: f(x) = 1 if x >= 0, and 0 otherwise.

Example 1: AND

The AND function can be implemented as:

The output of this neuron is:

a = f( -1.5 + x1 + x2 )

The truth table for this implementation is:

Here we can see that the AND function is successfully implemented. Column 'a' complies with 'X1 AND X2'. Note that the bias unit weight here is -1.5, but it's not a fixed value. Intuitively, we can understand it as anything which makes the total value positive only when both x1 and x2 are positive. So any value between -1 and -2 would work.

Example 2: OR

The OR function can be implemented as:

The output of this neuron is:

a = f( -0.5 + x1 + x2 )

The truth table for this implementation is:

Column ‘a’ complies with ‘X1 OR X2’. We can see that, just by changing the bias unit weight, we can implement an OR function. This is very similar to the one above. Intuitively, you can understand that here, the bias unit is such that the weighted sum will be positive if any of x1 or x2 becomes positive.

Example 3: NOT

Just like the previous cases, the NOT function can be implemented as:

The output of this neuron is:

a = f( 1 - 2*x1 )

The truth table for this implementation is:

Again, the compliance with the desired values proves the functionality. I hope these examples give you some intuition into how a neuron inside a neural network works. Here I have used a very simple activation function.
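As a quick check of the three examples above, here is a minimal Python sketch of such a threshold neuron (using the same step activation and weights as above):

def f(x):
    # simple step activation: 1 if x >= 0, else 0
    return 1 if x >= 0 else 0

def AND(x1, x2):
    return f(-1.5 + x1 + x2)   # bias weight -1.5

def OR(x1, x2):
    return f(-0.5 + x1 + x2)   # bias weight -0.5

def NOT(x1):
    return f(1 - 2 * x1)       # bias weight +1, input weight -2

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", AND(x1, x2), "OR:", OR(x1, x2), "NOT x1:", NOT(x1))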

Note: Generally, a logistic function is used in place of the step function I used here, because it is differentiable and makes determination of a gradient possible. There's just one catch: it outputs floating-point values rather than exactly 0 or 1.

Why are multi-layer networks useful?

After understanding the working of a single neuron, let's try to understand how a neural network can model complex relations using multiple layers. To understand this further, we will take the example of the XNOR function. Just as a recap, the truth table of the XNOR function looks like this:

Here we can see that the output is 1 when both inputs are the same, and 0 otherwise. This sort of relationship cannot be modeled using a single neuron. (Don't believe me? Give it a try!) Thus we will use a multi-layer network. The idea behind using multiple layers is that complex relations can be broken into simpler functions and combined.

Let's break down the XNOR function.

X1 XNOR X2 = NOT ( X1 XOR X2 )
           = NOT [ (A+B).(A'+B') ]      (Note: here '+' means OR and '.' means AND, with A = X1 and B = X2)
           = (A+B)' + (A'+B')'
           = (A'.B') + (A.B)

Now we can implement it using any of the simplified cases. I will show you how to implement this using 2 cases.

Case 1: X1 XNOR X2 = (A’.B’) + (A.B)

Here the challenge is to design a neuron to model A’.B’ . This can be easily modeled using the following:

The output of this neuron is:

a = f( 0.5 - x1 - x2 )

The truth table for this function is:

Now that we have modeled the individual components, we can combine them using a multi-layer network. First, let's look at the semantic diagram of that network:

Here we can see that in layer 1, we will determine A’.B’ and A.B individually. In layer 2, we will take their output and implement an OR function on top. This would complete the entire Neural Network. The final network would look like this:

If you notice carefully, this is nothing but a combination of the different neurons which we have already drawn. The different outputs represent different units:

a1: implements A’.B’

a2: implements A.B

a3: implements OR which works on a1 and a2, thus effectively (A’.B’ + A.B)

The functionality can be verified using the truth table:

I think now you can get some intuition into how multi-layer networks work. Let's do another implementation of the same case.

Case 2: X1 XNOR X2 = NOT [ (A+B).(A’+B’) ]

In the above example, we had to separately calculate A'.B'. What if we want to implement the function using just the basic AND, OR, and NOT functions? Consider the following semantic diagram:

Here you can see that we had to use 3 hidden layers. The working will be similar to what we did before. The network looks like:

Here the neurons perform following actions:

a1: same as A

a2: implements A’

a3: same as B

a4: implements B’

a5: implements OR, effectively A+B

a6: implements OR, effectively A’+B’

a7: implements AND, effectively (A+B).(A’+B’)

a8: implements NOT, effectively NOT [ (A+B).(A’+B’) ] which is the final XNOR

Note that, typically, a neuron feeds into every neuron of the next layer except the bias unit. In this case, I've omitted a few connections from layer 1 to layer 2. This is because their weights are 0, and adding them would make the diagram visually cumbersome to grasp.

The truth table is:

General Structure of a Neural Network

Now that we have had a look at some basic examples, let's define a generic structure in which every neural network falls. We will also see the equations to be followed to determine the output for a given input. This is known as forward propagation.

A generic Neural Network can be defined as:

It has L layers, with 1 input layer, 1 output layer, and L-2 hidden layers. Terminology:

L: number of layers

Ni: number of neurons in the ith layer, excluding the bias unit, where i = 1, 2, …, L

Since the output of each layer forms the input of the next layer, let's define the equation to determine the output of the (i+1)th layer using the output of the ith layer as input.

The input to the (i+1)th layer is the output of the ith layer, together with the bias unit:

A(i) = [ a0(i), a1(i), ..., aNi(i) ]        Dimension: 1 x (Ni + 1)

The weight matrix from the ith layer to the (i+1)th layer is (here Wjk(i) is the weight from node j of layer i to node k of layer i+1, and Ni+1 denotes the number of neurons in the (i+1)th layer):

W(i) = [ w0,1(i)   w0,2(i)   ...   w0,Ni+1(i)
         w1,1(i)   w1,2(i)   ...   w1,Ni+1(i)
         ...
         wNi,1(i)  wNi,2(i)  ...   wNi,Ni+1(i) ]        Dimension: (Ni + 1) x Ni+1

The output of the (i+1)th layer can then be calculated as:

A(i+1) = f( A(i) . W(i) )        Dimension: 1 x Ni+1

Using these equations for each subsequent layer, we can determine the final output. The number of neurons in the output layer depends on the type of problem: it can be 1 for regression or binary classification problems, or more than 1 for multi-class classification problems.
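As an illustration (not part of the original article), the forward propagation equations above translate into a few lines of numpy; here weights is assumed to be a list of weight matrices laid out as described, and a sigmoid activation is used:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward_propagate(x, weights, activation=sigmoid):
    # x: 1-D input vector; weights: list of matrices of shape (Ni + 1, Ni+1),
    # where the extra first row holds the bias weights
    a = np.asarray(x, dtype=float)
    for W in weights:
        a = np.concatenate(([1.0], a))   # prepend the bias unit a0 = +1
        a = activation(a @ W)            # A(i+1) = f( A(i) . W(i) )
    return a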

But this only determines the output from a single run. The ultimate objective is to update the weights of the model in order to minimize the loss function. The weights are updated using the back-propagation algorithm, which we'll study next.

Back-Propagation

The back-propagation (BP) algorithm works by determining the loss (or error) at the output and then propagating it back into the network. The weights are updated to minimize the error resulting from each neuron. I will not go into the details of the algorithm, but I will try to give you some intuition into how it works.

The first step in minimizing the error is to determine the gradient of each node with respect to the final output. Since it is a multi-layer network, determining the gradients is not very straightforward.

Let's understand the gradients for multi-layer networks. Let's take a step back from neural networks and consider a very simple system like the following:

Here there are 3 inputs, which undergo simple processing as follows:

d = a - b
e = d * c = (a - b) * c

Now we need to determine the gradients of a, b, c, and d with respect to the output e. The following cases are straightforward:

However, to determine the gradients for a and b, we need to apply the chain rule.

And, this way the gradient can be computed by simply multiplying the gradient of the input to a node with that of the output of that node. If you’re still confused, just read the equation carefully 5 times and you’ll get it!
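Concretely, for the simple system above (assuming the formulas shown as images in the original post expressed the same result), the gradients work out as:

de/dd = c and de/dc = d   (straightforward, since e = d * c)

de/da = (de/dd) * (dd/da) = c * 1 = c

de/db = (de/dd) * (dd/db) = c * (-1) = -c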

But actual cases are not that simple. Let's take another example. Consider a case where a single input is being fed into multiple nodes in the next layer, as this is almost always the case with neural networks.

In this case, the gradients of all the other nodes will be very similar to the above example, except for 'm', because m is being fed into 2 nodes. Here, I'll show how to determine the gradient for m; the rest you should calculate on your own.

Here you can see that the gradient is simply the summation of the two different gradients. I hope the cloud cover is slowly vanishing and things are becoming lucid. Just understand these concepts and we’ll come back to this.

Before moving forward, let’s sum up the entire process behind optimization of a neural network. The various steps involved in each iteration are:

Select a network architecture, i.e. the number of hidden layers, the number of neurons in each layer, and the activation function

Initialize weights randomly

Use forward propagation to determine the output node

Find the error of the model using the known labels

Back-propagate the error into the network and determine the error for each node

Update the weights using the gradients to minimize the error

Till now we have covered #1 – #3, and we have some intuition into #5. Now let's cover #4 – #6. We'll use the same generic structure of the NN as described above.

#4- Find the error

Here y(i) is the actual outcome from the training data.
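The error formula itself appeared as an image in the original post; a common choice consistent with this description is the squared error over the output-layer nodes, something like:

E = (1/2) * sum over i of ( y(i) - aL(i) )^2,   for i = 1, …, NL

where aL(i) is the predicted output at the ith output node.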

#5- Back-propagating the error into the network

The error for layer L-1 should be determined first using the following:

where i = 0, 1, 2, …, NL-1 (the number of nodes in the (L-1)th layer)
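The formula referenced above was shown as an image in the original; based on the intuition described next, the error at node i of layer L-1 takes roughly the form:

eL-1(i) = f'( xL-1(i) ) * sum over k of Wik(L-1) * eL(k),   summed over the nodes k of layer L

i.e. the weighted sum of the next layer's errors, multiplied by the derivative of the activation at the current node.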

Some intuition from the concepts discussed in the former half of this section:

We saw that the gradient of a node is a function of the gradients of all the nodes in the next layer. Here, the error at a node is based on the weighted sum of the errors at all the nodes of the next layer which take the output of this node as input. Since the errors are calculated using the gradients of each node, the f'(x) factor comes into the picture.

f'(x)(i) refers to the derivative of the activation function for the inputs coming into that node. Note that x refers to weighted sum of all inputs in present node before application of activation function.

The chain rule is followed here by multiplying the gradient of the current node, i.e. f'(x(i)), with that of the subsequent nodes, which comes from the first half of the RHS of the equation.

This process has to be repeated consecutively from the (L-1)th layer down to the 2nd layer. Note that the first layer is just the inputs.

#6- Update the weights to minimize the error

Use the following update rule for the weights:

Wik(l) = Wik(l) + a(i) . el+1(k)

where,

Wik(l) refers to the weight of the connection from the ith node of the lth layer to the kth node of the (l+1)th layer

With this we have successfully understood how a neural network works. Please feel free to discuss further if needed.
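To tie the forward and backward passes together, here is a small, self-contained numpy sketch (an illustration of the ideas above, not this article's code) that trains a tiny one-hidden-layer network on the XNOR truth table using a squared-error loss and sigmoid activations:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# XNOR truth table: output is 1 when both inputs are equal
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[1], [0], [0], [1]], dtype=float)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 4))   # input -> hidden weights
b1 = np.zeros((1, 4))          # hidden-layer bias
W2 = rng.normal(size=(4, 1))   # hidden -> output weights
b2 = np.zeros((1, 1))          # output bias
lr = 0.5

for epoch in range(20000):
    # forward propagation
    a1 = sigmoid(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)

    # back-propagation of the error (derivative of 0.5 * (y - a2)^2)
    delta2 = (a2 - y) * a2 * (1 - a2)          # error at the output layer
    delta1 = (delta2 @ W2.T) * a1 * (1 - a1)   # error propagated to the hidden layer

    # gradient-descent weight updates
    W2 -= lr * a1.T @ delta2
    b2 -= lr * delta2.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ delta1
    b1 -= lr * delta1.sum(axis=0, keepdims=True)

print(np.round(a2, 2))   # should approach [[1], [0], [0], [1]]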

End Notes

This article focused on the fundamentals of a neural network and how it works. I hope you now understand the working of a neural network and won't ever use it as a black box again. It becomes really easy once you also practice it hands-on.

Therefore, in my upcoming article, I'll explain the applications of neural networks in Python. Rather than the theory, I'll focus on the practical aspects of neural networks. Two applications come to mind immediately:

Image Processing

Natural Language Processing

You want to apply your analytical skills and test your potential? Then participate in our Hackathons and compete with Top Data Scientists from all over the world.


Using IsAfterToday For Power BI Time Intelligence Functions

I’m going to show you how to use IsAfterToday in extended date tables for Power BI time intelligence scenarios.

When it comes to DAX functions and other tools, I assign them certain jobs or personas in my head so that I can easily remember what they do. For example, I see SWITCH as an air traffic controller. I think of FILTER as a bouncer at a club who decides who gets in and who doesn’t.

As for IsAfterToday, I see it as the Terminator who sweeps through my data, tables and visuals to take out everything I don’t need. You may watch the full video of this tutorial at the bottom of this blog.

To access IsAfterToday, I need to use an extended date table. That's because IsAfterToday is actually not a DAX function; it's a column that is part of the extended date table itself.

If you go into your table’s data view, the IsAfterToday column looks like this.

The logic behind IsAfterToday is simple. If the date falls after today, it’s TRUE. If it’s before today, it’s FALSE.

To further show you what IsAfterToday can do, I’m going go through two use cases that perfectly showcase its relevance in Power BI time intelligence scenarios.

The first use case is about terminating cumulative totals. It’s a case that’s often asked about in the Enterprise DNA Forum.

This case involves a basic structure.

I have the Quarter & Year, the Total Sales and the Cumulative Sales.

The Total Sales runs from past periods up to the present, and then continues with some forecast data through the end of the year.

As for the Cumulative Sales, it’s just the basic Cumulative Sales pattern with ALLSELECTED applied on the Dates.

The visualization shows me that there’s a problem somewhere because the data becomes questionable after the Total Sales drops out.

Looking back at the table, there's no Total Sales data for the later periods.

So in the chart, the last figure is just repeated over and over until the end of the date range.

So how do we clean up the Total Sales and the Cumulative Sales?

First, I’ll use the DAX approach. I’ll start off by dropping this column for Alt Cumulative Sales into the table.

This is what the Alt Cumulative Sales measure looks like.

It shows an IF condition that says: if IsAfterToday is TRUE, then BLANK is assigned; if not, the Cumulative Sales value is assigned.

So, if I go back to the table and check, it does show that it returns the right value row by row.

The problem, however, is in the Total.

The Total is showing 73 million when it should show the last value, which is 59 million.

So, I’ll drop the Alt2 Cumulative Total into the table to show you what the correct data should look like.

Now, I’ll show you the difference between the Alt Cumulative Sales column and the Alt2 Cumulative Sales.

Here’s the same Cumulative Sales measure.

Then here's the DAX filter where IsAfterToday is applied.

In the measure I used earlier, nothing came after that. That’s why the values were correct for each row, but there was no way the calculation could tell if it had reached the final row or not.

In this case, there’s a whole new structure that does that.

There’s an IF ISINSCOPE function being applied so that if I reach the total row, I automatically force a total of all the Cumulative Sales up to that point.

This is the approach that a lot of Power BI users apply, mostly because it actually works. But again, it also means having to write over 30 lines of DAX.

Now let's look at the simpler, non-DAX approach, which uses IsAfterToday directly in the filter pane. This is what that's going to look like once the filters are showing.

Then under Fields, I’ll search for IsAfterToday.

I’ll drag and drop that into my filters.

Then, under that IsAfterToday filter, I’ll tick False, which means I’m referring to today or earlier days.

Look at how that cleans things up. All the values are in the right places now. The totals are also correct.

The same thing can be said for the visualization and the slicers.

Evidently, this approach is much more efficient than purely using a DAX approach.

I’ll now work on the second use case, which involves taking the Total Sales field and splitting the data into current data and forecast data using IsAfterToday.

I also want to make it dynamic so that over time it puts more data into the actual and less data in the forecast until it reaches the end of the forecast period and everything becomes actual.

I still have the Total Sales from the last example. As mentioned earlier, it has data from the past and the present, plus data towards the end of the year. So it’s a mix of actual data and forecast data.

I also have my basic Cumulative Sales measure on top of that and a Cumulative Sales visual.

And let's say what we want to do is take that Total Sales in our visual and decompose it into actual and forecast.

So, for Actual Sales, this is what the DAX measure is going to look like.

Basically, this states that if the date returns a FALSE for IsAfterToday, the Cumulative Actual value should be used. If not, a BLANK should be returned.

Now I’m going to take the measure for Cumulative Forecast Sales and add it under my Values as well.

Looking at the DAX, however, it checks for TRUE on IsAfterToday to return the forecast values.

Under visualizations, I’ll remove Cumulative Sales.

Then, I’ll drop the Cumulative Actual Sales and Cumulative Forecast Sales there instead.

Now, the visualization shows the Cumulative Actual Sales and Cumulative Forecast Sales clearly decomposed in the visualization.

This was also a good way to showcase that although DAX is truly powerful, there are also cases where a quicker alternative is needed. It’s all about mastering the pros and cons of using DAX in any given situation so that you can always choose which approach you want to apply.

All the best,

Brian

Become An Evernote Power User: 10 Must

You can already bend Evernote’s notes, notebooks, and stacks to your will. And maybe you’ve directed your team to use Evernote Business. Evernote is friendly when you’re getting started with it, but the more you use it, the more your notes can pile up, threatening your productivity.

Now that you've excelled at the basics, it's time to dig into Evernote's arsenal and charge ahead like a true note-taking, to-do-list-tackling warrior.

Checkboxes

One of the more popular uses for Evernote is to create lists, such as to-do lists, shopping lists, enemies lists, and the like. A simple text list is fine, but you can enhance the utility of your list by adding checkboxes to the items on it.

Save frequently used searches


Instructions for saving searches on mobile devices are similar. Look for the magnifying-glass icon any time you’ve completed a successful search (that is, a search with at least one result) to save it.

Clip Web pages with Evernote Mobile

Because of the vagaries of smartphone Web browsers, Evernote's mobile app can't clip Web pages by default, dulling its utility. There's no easy workaround for the iPhone, but Android users have a couple of options to make Web clipping possible. Dolphin: Evernote is a free add-on that lets you grab Web pages and pull them into Evernote, although it can only grab entire pages, not partial selections. EverWebClipper ($2.88) gives you more flexibility in what you can snag, if you're willing to pay for the privilege.

The offline option on an iPhone

If you’re an Evernote Premium user, you can configure individual notebooks to be accessible offline, whether you have an Internet connection or not. In your device’s Evernote Settings panel under the Offline Notebooks option, just select the notebooks you want to keep stored on your phone or tablet.

One important caveat: Evernote does not save a copy of every version of every note, but rather makes a backup of your notes on a schedule that runs every few hours. If you make multiple changes to a note over a short amount of time, only the most recent version is likely to be saved. Don't rely on note history to save you if you accidentally erase your entire document 10 minutes after you create it.

Web Clipper

Emailing a webpage or its URL to yourself for later retrieval never seems to work right. This task is especially difficult if you’re trying to save a password-protected webpage or a news story that may simply vanish at a later date.

Evernote’s Web Clipper lets you copy webpages in full to Evernote, but power users know that you don’t have to grab the entire screen. When you use Web Clipper, it will automatically attempt to determine where the “meat” of a webpage is, encircling it in a yellow border and graying out the detritus. Use the arrow keys on your keyboard to grab more of the page (Up Arrow) or less of the page (Down Arrow), or to pick a different selection on the page (Left or Right Arrow). When you’re done, press Enter to finalize your clipping and save it.

Master Evernote’s search tool

Evernote has search tags and much more to help you unearth your best notes.

As your Evernote database begins to fill up, you’ll have to rely more and more on searches to find what you’re looking for. You can search for simple keywords, but this tactic will start to turn up a larger number of results, especially if you tend to use Evernote to save lots of information about a narrow set of topics.

To search only within your tags, type tag:tagname or tag:"multiple-word tagname" into Evernote’s search field. To find an exact phrase that comprises multiple words, use quotation marks just as you would in a Web search.

You can use a structure similar to the tag search above to search only for notes within notebooks that contain specified terms in their names. Type notebook:notebookname or notebook:"multiple-word notebookname" into the field.

If you want to find notes that contain your term in the note’s title, try either intitle:term or intitle:"multiple-word term" in your search.

To return notes that contain any of the specified terms inside, type any:term1 term2 term3 in the field. (A standard search for term1 term2 term3 would return only notes containing all three terms.)

If you’d rather get results based on the last time a note was revised, type updated:yyyymmdd into the search field.

Visit Account Info in the desktop app for your Evernote email address.

A quick and easy way to get something into Evernote is simply to email it to your Evernote address. The problem: If you don’t specify where the email should go, it will create a note in your default notebook, with no tags.

When sending an email to Evernote, you can manipulate the subject line to determine where it should end up. Here’s an example of a subject line that covers all the bases:

The Hobbit @Movies #review #4stars #dwarves

This creates a note called “The Hobbit” in your Movies notebook, with tags of “review,” “4stars,” and “dwarves.” Note that you must put the notebook (@) and tags (#) identifiers in the above order. Also, the notebook and tags must already exist before you attempt to use them in an email to Evernote.

Transcribe voice notes

Although Evernote can now convert voice recordings directly to text on Android devices, it can’t do that trick on iOS devices or via recordings made on your PC. You can get around this and make audio notes searchable through a couple of methods. First, you can use a smartphone app like Dragon Dictation to record a voice memo, and then copy the text into Evernote.

Alternatively, you can use a third-party add-in called Voice2Note to do the translation for you directly from Evernote. Just register for Voice2Note online, and record voice notes within the Evernote app normally. They’ll be transcribed and saved behind the scenes. You can also call a special Voice2Note number to create new notes via a simple phone call—something that you can’t do without an add-in on any platform. (Voice2Note is free for five transcriptions per month, or $3 per month if you need more.)

Only a rube uses the mouse to get around desktop apps. The following keyboard shortcuts help you use Evernote even more efficiently on a PC. (The commands are similar on a Mac. And you’ll find even more shortcuts on Evernote’s site.)

Ctrl-Alt-N: Start a new note. (In Windows, this is a global shortcut, meaning that it works from any application as long as Evernote is open.)

Windows-A: Pastes selected text into a new or open note. (Global shortcut.)

F9: Synchronize.

Ctrl-N: New note.

Ctrl-Shift-N: New notebook.

Ctrl-Shift-T: New tag.

Ctrl-Shift-E: Send a note or notes by email.

Ctrl-Shift-C: Insert a checkbox.

Alt-Shift-D: Insert the current time and date.

A/B Testing For Data Science Using Python – A Must

Overview

A/B testing is a popular way to test your products and is gaining steam in the data science field

Here, we’ll understand what A/B testing is and how you can leverage A/B testing in data science using Python

Introduction

Statistical analysis is our best tool for predicting outcomes we don’t know, using the information we know.

Picture this scenario – You have made certain changes to your website recently. Unfortunately, you have no way of knowing with full accuracy how the next 100,000 people who visit your website will behave. That is the information we cannot know today, and if we were to wait until those 100,000 people visited our site, it would be too late to optimize their experience.

This seems to be a classic Catch-22 situation!

This is where a data scientist can take control. A data scientist collects and studies the data available to help optimize the website for a better consumer experience. And for this, it is imperative to know how to use various statistical tools, especially the concept of A/B Testing.

A/B Testing is a widely used concept in most industries nowadays, and data scientists are at the forefront of implementing it. In this article, I will explain A/B testing in-depth and how a data scientist can leverage it to suggest changes in a product.

Table of contents:

What is A/B testing?

How does A/B testing work?

Statistical significance of the Test

Mistakes we must avoid while conducting the A/B test

When to use A/B test

What is A/B testing?

A/B testing is a basic randomized controlled experiment. It is a way to compare two versions of a variable to find out which performs better in a controlled environment.

For instance, let’s say you own a company and want to increase the sales of your product. Here, either you can use random experiments, or you can apply scientific and statistical methods. A/B testing is one of the most prominent and widely used statistical tools.

In the above scenario, you may divide the products into two parts – A and B. Here A will remain unchanged while you make significant changes in B’s packaging. Now, on the basis of the response from customer groups who used A and B respectively, you try to decide which is performing better.


It is a hypothesis-testing methodology for making decisions, in which we estimate population parameters based on sample statistics. The population refers to all the customers buying your product, while the sample refers to the number of customers that participated in the test.

How does A/B Testing Work?

The big question!

In this section, let’s understand through an example the logic and methodology behind the concept of A/B testing.

Let's say there is an e-commerce company XYZ. It wants to make some changes to its newsletter format to increase the traffic on its website. It takes the original newsletter and marks it A, makes some changes to the language, and calls the new version B. Both newsletters are otherwise the same in color, headlines, and format.

Objective

Our objective here is to check which newsletter brings higher traffic to the website, i.e. which has the higher conversion rate. We will use A/B testing and collect data to analyze which newsletter performs better.

1.  Make a Hypothesis

Before making a hypothesis, let’s first understand what is a hypothesis.

A hypothesis is a tentative insight into the natural world; a concept that is not yet verified but if true would explain certain facts or phenomena.

It is an educated guess about something in the world around you. It should be testable, either by experiment or observation. In our example, the hypothesis can be “By making changes in the language of the newsletter, we can get more traffic on the website”.

In hypothesis testing, we have to make two hypotheses, i.e. the null hypothesis and the alternative hypothesis. Let's have a look at both.

Null hypothesis or H0:

The null hypothesis is the one that states that the sample observations result purely from chance. From an A/B test perspective, the null hypothesis states that there is no difference between the control and variant groups. It states the default position to be tested, or the situation as it is now, i.e. the status quo. Here our H0 is: "There is no difference in the conversion rate between customers receiving newsletter A and those receiving newsletter B."

Alternative Hypothesis or Ha:

The alternative hypothesis challenges the null hypothesis and is basically a hypothesis that the researcher believes to be true. The alternative hypothesis is what you might hope that your A/B test will prove to be true.

In our example, the Ha is: "The conversion rate of newsletter B is higher than that of newsletter A."

Now, we have to collect enough evidence through our tests to reject the null hypothesis.

2. Create Control Group and Test Group

Once we are ready with our null and alternative hypothesis, the next step is to decide the group of customers that will participate in the test. Here we have two groups – The Control group, and the Test (variant) group.

The Control Group is the one that will receive newsletter A and the Test Group is the one that will receive newsletter B.

For this experiment, we randomly select 1000 customers – 500 each for our Control group and Test group.

Randomly selecting the sample from the population is called random sampling. It is a technique where each sample in a population has an equal chance of being chosen. Random sampling is important in hypothesis testing because it eliminates sampling bias, and it’s important to eliminate bias because you want the results of your A/B test to be representative of the entire population rather than the sample itself.

Another important aspect we must take care of is the sample size. We must determine the minimum sample size for our A/B test before conducting it so that we can eliminate undercoverage bias, which is the bias from sampling too few observations.
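As an illustration (not part of the original discussion), the minimum sample size per group can be estimated with a power analysis; a quick sketch using statsmodels, with assumed values for effect size, significance level, and power, might look like this:

from statsmodels.stats.power import TTestIndPower

# hypothetical inputs: how many observations per group are needed to detect
# an assumed standardized effect size of 0.3 at alpha = 0.05 with 80% power?
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.3, alpha=0.05, power=0.8, ratio=1.0)
print(round(n_per_group))   # minimum sample size per group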

3. Conduct the A/B Test and Collect the Data

One way to perform the test is to calculate daily conversion rates for both the treatment and the control groups. Since the conversion rate in a group on a certain day represents a single data point, the sample size is actually the number of days. Thus, we will be testing the difference between the mean of daily conversion rates in each group across the testing period.

When we ran our experiment for one month, we noticed that the mean conversion rate for the Control group was 16%, whereas that for the Test group was 19%.

Statistical significance of the Test

Now, the main question is – Can we conclude from here that the Test group is working better than the control group?

The answer is a simple no! To reject our null hypothesis, we have to prove the statistical significance of our test.

There are two types of errors that may occur in our hypothesis testing:

Type I error: We reject the null hypothesis when it is true. That is we accept the variant B when it is not performing better than A

Type II error: We fail to reject the null hypothesis when it is false. It means we conclude variant B is not good when it actually performs better than A

To avoid these errors we must calculate the statistical significance of our test.

An experiment is considered to be statistically significant when we have enough evidence to prove that the result we see in the sample also exists in the population.

That means the difference between your control version and the test version is not due to some error or random chance. To prove the statistical significance of our experiment we can use a two-sample T-test.


To understand this, we must be familiar with a few terms:

Significance level (alpha):

The significance level, also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is true. Generally, we use the significance value of 0.05

P-Value: It is the probability that the difference between the two values is just due to random chance. The p-value is evidence against the null hypothesis: the smaller the p-value, the stronger the evidence to reject H0. For a significance level of 0.05, if the p-value is less than 0.05, we can reject the null hypothesis

Confidence interval: The confidence interval is an observed range in which a given percentage of test outcomes fall. We manually select our desired confidence level at the beginning of our test. Generally, we take a 95% confidence interval

Next, we can calculate our t statistic using the formula below:
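(The formula appeared as an image in the original article; for two independent samples it is the standard two-sample t statistic, which in Welch's form is:)

t = ( mean(B) - mean(A) ) / sqrt( s_A^2 / n_A + s_B^2 / n_B )

where s_A and s_B are the sample standard deviations and n_A and n_B the sample sizes of the two groups.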

Let’s Implement the Significance Test in Python

Let's see a Python implementation of the significance test. Here, we have dummy data containing the results of an A/B test run for 30 days. Now we will run a two-sample t-test on the data using Python to verify the statistical significance of the results. You can download the sample data here.

Python Code:



import pandas as pd
import seaborn as sns
import scipy.stats as ss

# "ab_data.csv" is a placeholder name for the downloaded sample data
data = pd.read_csv("ab_data.csv")

sns.distplot(data.Conversion_A)
sns.distplot(data.Conversion_B)

Finally, we perform the t-test:

t_stat, p_val = ss.ttest_ind(data.Conversion_B, data.Conversion_A)
print(t_stat, p_val)
# Output: (3.78736793091929, 0.000363796012828762)

For our example, the observed value, i.e. the mean of the Test group, is 0.19. The hypothesized value (the mean of the Control group) is 0.16. On calculating the t-score, we get a t-score of 3.787, and the p-value is 0.00036.

So what does all this mean for our A/B testing?

Here, our p-value is less than the significance level, i.e. 0.05. Hence, we can reject the null hypothesis. This means that in our A/B test, newsletter B is performing better than newsletter A. So our recommendation would be to replace the current newsletter with B to bring more traffic to our website.

What Mistakes Should we Avoid While Conducting A/B Testing?

There are a few key mistakes I’ve seen data science professionals making. Let me clarify them for you here:

Invalid hypothesis: The whole experiment depends on one thing, i.e. the hypothesis: what should be changed, why it should be changed, what the expected outcome is, and so on. If you start with the wrong hypothesis, the probability of the test succeeding decreases

Testing too Many Elements Together: Industry experts caution against running too many tests at the same time. Testing too many elements together makes it difficult to pinpoint which element influenced the success or failure. Thus, prioritization of tests is indispensable for successful A/B testing

Ignoring Statistical Significance: It doesn’t matter what you feel about the test. Irrespective of everything, whether the test succeeds or fails, allow it to run through its entire course so that it reaches its statistical significance

Not considering external factors: Tests should be run in comparable periods to produce meaningful results. For example, it is unfair to compare website traffic on the days when it gets the highest traffic to the days when it witnesses the lowest traffic because of external factors such as sales or holidays

When Should We Use A/B Testing?

A/B testing works best when testing incremental changes, such as UX changes, new features, ranking, and page load times. Here you may compare pre and post-modification results to decide whether the changes are working as desired or not.

A/B testing doesn’t work well when testing major changes, like new products, new branding, or completely new user experiences. In these cases, there may be effects that drive higher than normal engagement or emotional responses that may cause users to behave in a different manner.

End Notes

To summarize, A/B testing is at least a 100-year-old statistical methodology, but in its current form it emerged in the 1990s. It has now become more prominent with the online environment and the availability of big data. It is easier for companies to conduct tests and utilize the results for better user experience and performance.

There are many tools available for conducting A/B testing, but as a data scientist you must understand the factors working behind it. You must also be aware of the statistics in order to validate the test and prove its statistical significance.

To know more about hypothesis testing, I will suggest you read the following article:


Cyber Monday: Last Chance For The Best Amazon Deals

We may earn revenue from the products available on this page and participate in affiliate programs.

Cyber Monday is almost over, but there are still a ton of discounts available on new electronics, high-quality headphones, smart TVs, games for kids, appliances for your kitchen, and more. We’ve already spent the day scouring through all the deals available to find the ones that are actually worth it.

You can expect new deals all week, but Cyber Monday is the day you can get the most-coveted products for the biggest discount. Whether you’re trying to add to your smart home device collection, upgrade your home entertainment system, or add to your kitchen gadgets, you can still get what you need on a discount right now.

We’ll be updating this page regularly with new links and deals, but if you see something you think you want, we suggest you jump on it. With literally millions of people sifting through these discounts, some will inevitably sell out. The most recent batch adds some attractive smart home accessories and household items.

This smart speaker is already over 50 percent off. Then, you get a free smart, variable-color light bulb on top of it. If you’re stocking up on Secret Santa gifts for the office party, this can handle both in one affordable swoop.

This may not seem like the hugest deal, but Sony’s flagship ANC earbuds are some of the best we’ve ever tested and this is the cheapest we have ever seen them from a big-name seller like Amazon. They’re a little chunky, but the sound is exceptional and the noise canceling is very impressive.

Amazon Deals

Amazon typically offers serious price drops on its own devices.

Echo Show 8 $60 (Was $110)

Amazon Echo Buds wireless earbuds $70 (Was $120)

Amazon Kindle Paperwhite e-reader $105 (Was $140)

Blink Mini indoor security camera $20 (Was $35)

Kindle Oasis e-reader $175 (Was $250)

Kindle Paperwhite Kids e-reader $115 (Was $160)

Echo (4th Gen) $60 (Was $100)

Amazfit Band 5 Fitness Tracker with Alexa Built-in $30 (Was $40)

Amazon Fire TV 65″ Omni Series 4K UHD smart TV $600 (Was $830)

Smart Home & Electronics

Razer Basilisk High-Speed Wireless Gaming Mouse $80 (Was $150)

Furbo Dog Camera $118 (Was $169)

Samsung 43-inch QLED 4K TV $497.11 (Was $600)

Samsung 75-inch Q70A 4K QLED TV $1,498 (Was $2,300)

LG OLED C1 55-inch TV $1,297 (Was $1,500)

Amazon Fire TV 55-inch smart TV $380 (Was $520)

iRobot Roomba s9+ (9550) Robot Vacuum & Braava Jet m6 (6112) Robot Mop Bundle $1,299 (Was $1,600)

August Wi-Fi, (4th Generation) Smart Lock $170 (Was $230)

Up to 45 percent off GE Smart Plugs and Grow Light

Big Ass Fans Haiku L Smart Ceiling Fan $559.30 (Was $835)

BLACK+DECKER Works with Alexa Smart Under Cabinet Lighting $71 (Was $100)

myQ Chamberlain Smart Garage Control $17 (Was $30)

Shark AV1010AE IQ Robot Vacuum with XL Self-Empty Base $300 (Was $600)

Samsung Galaxy Tab S7 tablet $530 (Was $730)

Logitech C920x webcam $60 (Was $70)

Beats Studio Buds wireless earbuds $100 (Was $150)

SAMSUNG 49-inch Odyssey G9 Gaming Monitor $980 (Was $1,400)

2024 Apple iMac $799 (Was $1,099)

Kitchen & Home Deals

Vitamix Immersion Blender $120 (Was $150)

Blueair Blue Pure 211+ Air Purifier $200 (Was $300) *more deals here*

Coway Airmega AP-1512HH(W) True HEPA Purifier $164 (Was $230)

SodaStream Fizzi One Touch Sparkling Water Maker Bundle $125 (Was $190)

SodaStream Jet Sparkling Water Maker $55 (Was $80)

Vitamix FoodCycler FC-50 $280 (Was $400)

Vitamix 64-Ounce Professional Series 750 Blender $385 (Was $599)

Save up to 45 percent on coffees and teas

Rubbermaid Brilliance Plastic Food Storage Pantry Set of 14 Containers $67.71 (Was $100) *more options here*

Instant Vortex Plus 10 quart air fryer $90 (Was $140)

Instant Pot Duo pressure cooker $120 (Was $200)

Breville Barista Express Espresso Machine $600 (Was $700)

ThermoPro TP03 instant read thermometer $15 (Was $30)

Outdoor, Tool, and Auto Deals

Intex 18ft X 9ft X 52in Ultra XTR Rectangular Pool Set $800 (Was $2,000)

BLACK+DECKER 20V MAX Cordless Drill $39 (Was $99) *more BLACK+DECKER deals here*

beyond by BLACK+DECKER Home Tool Kit with 20V MAX Drill/Driver, 83-Piece $63 (Was $90)

Up to 40 percent off NOCO Jump Starters and Battery Chargers

Jackery Portable Power Station Explorer 300 $210 (Was $300) *Solar panels and accessories here*

Fitness & Health Deals

NordicTrack T Series Treadmill $454 (Was $649)

Garmin Forerunner 45 Smartwatch $120 (Was $200) *more models here*

Kids and Games Deals

Get up to 30 percent off Strategy Games like Pandemic, Settlers of Catan, and Ticket to Ride

Up to 40 percent off Hasbro Games

Up to 45 percent off Osmo Educational Kits and Games

Up to 40 percent off educational toys from brands like hand2mind, Educational Insights, and Learning Resources

Osmo Genius Starter Kit for Tablet $84 (Was $132.54)

Clothing & Accessories Deals

Save up to 40 percent off Levi's clothing

Crocs Ralen Lined Fuzzy Clog $42 (Was $60) *editor tested and approved*
