About this course
Statistics is an essential component of the ever-expanding field of data science,
playing an invaluable role in making informed business decisions.
Statistical functions are applied to large sets of data to draw conclusions, make
predictions, and minimize loss.
Therefore, if you want to enjoy a successful data science career, you need
a solid grasp of core statistical concepts, all covered in the statistics
course notes. We start off with descriptive statistics, diving into the
associated graphs and tables for numerical and categorical data. Then we take a
look at inferential statistics, the different types of distributions, confidence
intervals, and their respective formulas.
We finish off with the process of hypothesis testing, going into the types,
examples, formulas, and the p-value.
Statistics for Data Science and Analytics is the foundation for extracting meaningful insights from the ever-growing ocean of data. It equips you with the tools and knowledge to summarize, analyze, and interpret data effectively. You'll delve into descriptive statistics, learning how to calculate measures of central tendency (such as the average) and variability (spread) to understand the basic characteristics of your data. Inferential statistics takes you further, allowing you to draw conclusions about a larger population based on a sample, helping you assess the significance of patterns and make data-driven decisions.
The journey doesn't stop there. Exploring probability distributions helps you understand how data is likely to be distributed, while hypothesis testing provides a framework to test assumptions and identify relationships between variables. Finally, statistical modeling allows you to build predictive models that can forecast future trends or classify new data points.
By mastering these statistical techniques, you'll be well-equipped to transform raw data into actionable knowledge, empowering you to solve complex problems and make informed choices across various data science and analytics applications.
Data is divided into two types. Categorical data represents groups or categories. Numerical data represents numbers.
Graphs and Tables that Represent Categorical Variables
Here, you'll see various techniques
Graphs and Tables that Represent Numerical Variables
Graphs and Tables for Relationships Between Variables
Mean, Median, Mode
Variance and Standard Deviation
Covariance and Correlation
Distributions
The Central Limit Theorem
Estimators and Estimates
Confidence Intervals and the Margin of Error
Student’s T Distribution
Formulas for Confidence Intervals
The ‘scientific method’ is a procedure that has characterized natural science
since the 17th century. It consists in systematic observation, measurement,
experiment, and the formulation, testing and modification of hypotheses.
A hypothesis is “an idea that can be tested”.
Decisions You Can Take
Statistical Errors (Type I Error and Type II Error)
The p-value is the smallest level of significance at which we can still reject the null hypothesis, given the observed sample statistic.
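The definition above can be illustrated with a minimal Python sketch. It assumes a two-sided, one-sample z-test with a known population standard deviation. The function name and the sample figures are hypothetical, chosen only for illustration, and are not taken from the course notes.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, hypothesized_mean, population_sd, n):
    """Return (z, p_value) for a two-sided one-sample z-test.

    Assumes the population standard deviation is known.
    """
    standard_error = population_sd / sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    # Probability of a statistic at least this extreme in either tail,
    # assuming the null hypothesis is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: sample mean 52 from n = 100, testing H0: mean = 50.
z, p_value = two_sided_z_test(sample_mean=52.0, hypothesized_mean=50.0,
                              population_sd=10.0, n=100)

# Reject the null hypothesis at every significance level at or above the p-value.
reject_at_5_percent = p_value <= 0.05
reject_at_1_percent = p_value <= 0.01

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0 at 5%:", reject_at_5_percent)
print("Reject H0 at 1%:", reject_at_1_percent)
```

With these made-up numbers the p-value falls between 0.01 and 0.05, so the null hypothesis is rejected at the 5% level but not at the 1% level.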
Formulae for Hypothesis Testing
In statistical modeling, regression analysis is used to estimate the relationships between two or
more variables:
Linear regression equation
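For the simple case of one independent variable, the equation is commonly written in the form below. The symbols are standard textbook notation assumed here, not taken from the course notes: y is the dependent variable, x the independent variable, a the intercept, b the slope, and ε the error term.

```latex
y = a + b x + \varepsilon
```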
This example shows how to run regression in Excel by using a special tool included with the
Analysis ToolPak add-in.
Interpret regression analysis output
If you need to quickly visualize the relationship between the two variables, draw a linear
regression chart.
Microsoft Excel has a few statistical functions that can help you to do linear regression
analysis, such as LINEST, SLOPE, INTERCEPT, and CORREL.
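For readers working outside Excel, here is a rough Python sketch of the quantities that SLOPE, INTERCEPT, and CORREL return for a single predictor, computed with the usual least-squares formulas. The helper name and the data are hypothetical, and this is not a reimplementation of Excel's functions.

```python
from math import sqrt
from statistics import mean


def simple_linear_regression(x, y):
    """Return (slope, intercept, r) for a least-squares line through (x, y)."""
    x_bar, y_bar = mean(x), mean(y)
    # Sums of squares and cross-products around the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx                   # comparable to Excel's SLOPE
    intercept = y_bar - slope * x_bar   # comparable to Excel's INTERCEPT
    r = sxy / sqrt(sxx * syy)           # comparable to Excel's CORREL
    return slope, intercept, r


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r = simple_linear_regression(x, y)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r = {r:.4f}")
```

In Excel the corresponding calls take the y range first, for example SLOPE(known_y's, known_x's).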