For the Summer 2019 semester at the University of Michigan – Flint, I ended up taking SCM 512 – Applied Quantitative Analysis; in other words, Business Analytics. The course was taught by Professor Muhammed Usman Ahmed, probably one of the better teachers I’ve ever had.
I’m nearing the end of the jam-packed 7-week course, and I can say it has probably been one of my favorite classes. I wish I had been able to take it over a traditional full-length semester; that way, I probably would have absorbed even more. In any case, here’s a summary of the topics that stood out in the class so far:
Descriptive Analytics
This was a basic introduction covering data summary techniques such as the geometric mean, percentiles, standard deviation, data visualizations, and general Excel skills. Nothing too important here.
Descriptive Data Mining
Here we started to use the Analytic Solver plugins we were required to purchase, and this is when things became interesting. We worked with data mining techniques such as text mining, association rules, and cluster analysis using both k-means and hierarchical approaches.
Our main focus this week was cluster analysis. Here is a little review of it:
Hierarchical Clustering
Hierarchical clustering starts with each observation as its own cluster and then iteratively merges the two clusters that are most similar. There are different ways to measure how dissimilar clusters are, such as Euclidean distance, matching coefficients, or Jaccard’s coefficient. Our main focus was Euclidean distance measured with single linkage, complete linkage, group average linkage, and centroid linkage.
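For anyone who wants to try this outside of Analytic Solver, here’s a minimal sketch of agglomerative clustering in Python using SciPy. The observations are made up purely for illustration, and the linkage method can be swapped among the options mentioned above.

```python
# A minimal hierarchical (agglomerative) clustering sketch,
# standing in for the Analytic Solver workflow described above.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical observations: two numeric features each
X = np.array([
    [1.0, 2.0], [1.5, 1.8], [5.0, 8.0],
    [8.0, 8.0], [1.0, 0.6], [9.0, 11.0],
])

# Build the merge tree using Euclidean distance.
# 'method' can be 'single', 'complete', 'average', or 'centroid',
# mirroring the linkage options mentioned above.
Z = linkage(X, method="complete", metric="euclidean")

# Cut the tree into a chosen number of clusters (here, 2).
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # cluster assignment for each observation
```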
K-means Clustering
In k-means clustering, the analyst supplies the number of clusters the algorithm should end up with. The algorithm initially assigns each observation to one of the clusters at random, and once all observations have been assigned, the cluster centroids are calculated. Observations are then reassigned to the nearest centroid and the centroids are recomputed, repeating until the assignments stop changing. The centroid is the “means” in k-means clustering. This is what we used in our group project because it was easier to do our data analysis with it.
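Again, we did this in Analytic Solver, but here’s roughly what the same idea looks like in Python with scikit-learn, using the same kind of made-up numeric data.

```python
# A minimal k-means sketch using scikit-learn (data is invented).
import numpy as np
from sklearn.cluster import KMeans

X = np.array([
    [1.0, 2.0], [1.5, 1.8], [5.0, 8.0],
    [8.0, 8.0], [1.0, 0.6], [9.0, 11.0],
])

# The analyst supplies k up front; the algorithm then alternates between
# assigning observations to the nearest centroid and recomputing centroids.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(kmeans.labels_)           # cluster assignment per observation
print(kmeans.cluster_centers_)  # the centroids ("means") of each cluster
```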
In general, if you have a small data set and want to easily examine the outputs with increasing numbers of clusters, use hierarchical. If you know how many clusters you want to end up with and have a larger data set, k-means clustering may be the best option.
Probability and Statistical Inference
We focused on probability as the foundation for discrete and continuous probability distributions. For example, we solved conditional probability problems involving independent events, the multiplication law, and Bayes’ Theorem. The topic that stood out the most was Bayes’ Theorem because of how interesting it was to see how often it applies in the real world.
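As a quick illustration of why Bayes’ Theorem is so interesting, here is the classic disease-screening example worked out in Python. The rates below are hypothetical numbers I picked for the example, not anything from the course.

```python
# Bayes' Theorem: P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
p_disease = 0.01            # prior: 1% of the population has the condition (hypothetical)
p_pos_given_disease = 0.95  # test sensitivity (hypothetical)
p_pos_given_healthy = 0.05  # false-positive rate (hypothetical)

# Total probability of a positive test (law of total probability)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Update the prior with the test evidence
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # ~0.161, far lower than most people guess
```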
Regression Analysis
Here is where we learned to take the data we had been clustering and find patterns we could use for future analysis. It was really eye-opening to see how easy it is to generate regression equations in Excel, either through trendlines on visualizations or with the Regression tool. The Regression tool also reports the coefficients and other details about the data you are looking at.
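For comparison, here’s a rough Python equivalent of what Excel’s Regression tool produces, using statsmodels. The x and y values are invented just to show the mechanics.

```python
# Simple linear regression, roughly analogous to Excel's Regression tool output.
import numpy as np
import statsmodels.api as sm

x = np.array([10, 12, 15, 18, 20, 25, 28, 30], dtype=float)  # e.g., ad spend (made up)
y = np.array([25, 30, 34, 41, 44, 56, 60, 65], dtype=float)  # e.g., sales (made up)

X = sm.add_constant(x)          # adds the intercept term
model = sm.OLS(y, X).fit()      # ordinary least squares fit

print(model.params)             # intercept and slope (the regression equation)
print(model.summary())          # coefficients, R-squared, p-values, etc.
```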
What-If/Monte Carlo Simulations
Here is where we learned how to account for “unknown” factors when making business decisions by running simulations. For instance, if a business did not know the demand for a certain product, a Monte Carlo simulation draws many random demand scenarios and averages the resulting outcomes. It was very informative to see how this can be used to estimate outcomes that could never have been determined without running the simulations.
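Here’s a small sketch of that uncertain-demand idea in Python. The demand distribution, price, cost, and order quantity are all assumptions I made up for the example, not numbers from the class.

```python
# A minimal Monte Carlo sketch for profit under uncertain demand.
import numpy as np

rng = np.random.default_rng(42)
n_trials = 10_000

price, unit_cost, order_qty = 20.0, 12.0, 1_000   # hypothetical business inputs

# Assume demand is normally distributed around 950 units (made-up parameters).
demand = rng.normal(loc=950, scale=150, size=n_trials)

units_sold = np.minimum(demand, order_qty)        # can't sell more than was stocked
profit = price * units_sold - unit_cost * order_qty

print(profit.mean())              # average profit across all simulated scenarios
print(np.percentile(profit, 5))   # a downside estimate (5th percentile)
```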
I’ll be honest, I became a tad lazy while writing this, so some sections are a little vague compared to others. However, the point is that I learned an incredible amount from this class using Excel and Analytic Solver. I’m looking forward to using this information in my daily work in a professional environment.