“*There are three kinds of lies: lies, damned lies and statistics*”, as the folk saying goes. My favourite example is the one with the babies and the stork!

If you wish, you can make a *quantitative statistical* calculation to find the p-value for the correlation between the arrival of babies and the arrival of storks. It might be small, but I am sure it is there.

Now *qualitative statistics* look at the baby/stork challenge slightly differently and make the variables interchangeable, asking: is the stork bringing the babies, or are the babies bringing the stork? I am sure we can calculate both p-values.

Both quantitative and qualitative statistics use effect size calculations, and this is where it is revealed: there **could** be one or more unregistered variables that are better predictors for the arrival of both storks and babies, because the effect size is not impressive. If you correlate storks arriving in Denmark with the season, and the arrival of babies with pregnancy, the effect size will tell you that you are on to something!

When it gets more complicated than storks, babies, seasons and pregnancies, statistics can show relationships between variables, show differences between groups of data, describe data, and sometimes even make predictions. Qualitative statistics are well suited for studying diversity in a population; quantitative statistics are well suited for studying distribution.

Parameters define the observed population, e.g. “storks” and “babies”. In another survey with a larger population that includes the first one, the same parameter becomes a variable, e.g. “migratory birds” or “newborn mammals”.

Variables are the **what** we observe. We use two different kinds: independent and dependent variables. In qualitative statistics, you can choose freely which variable is dependent and which is independent in relation to the other (babies bringing storks, or storks bringing babies).

Levels of measurement describe **how** we measure:

The qualitative levels of measurement (data that cannot be measured, only observed or counted) are based on **categories of data**: binary categories (e.g. “pregnant or not”), nominal categories (unordered; e.g. mammals, birds, reptiles) and ordinal categories, which are like nominal categories but can be ordered logically (e.g. babies -> toddlers -> children).
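A small Python sketch of the three category types (the category names are taken from the examples above; the observation list is invented for illustration):

```python
# Hypothetical categorical variables at the three qualitative levels
binary = ["pregnant", "not pregnant"]         # exactly two categories
nominal = ["mammals", "birds", "reptiles"]    # no inherent order
ordinal = ["babies", "toddlers", "children"]  # logically ordered

# An ordinal scale carries its order with it, so observations can be sorted
order = {level: rank for rank, level in enumerate(ordinal)}
observations = ["children", "babies", "children", "toddlers"]
print(sorted(observations, key=order.get))
# -> ['babies', 'toddlers', 'children', 'children']
```

A nominal variable has no such ordering, which is why sorting it alphabetically would be meaningless rather than wrong.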

## Descriptive statistics

Qualitative data can be quantified and described through basic features such as ranked frequencies (in %) and the central tendency “mode”, the most frequently observed value, instead of the “mean”, the calculated average, or the “median”, the middle value.
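As a small illustration in Python: the mode works directly on categories, while the mean and median need numbers (all data below are invented):

```python
import statistics

# Responses observed in a hypothetical qualitative survey
responses = ["stork", "stork", "heron", "stork", "heron", "crane"]

# Mode: the most frequently observed category, and the only sensible
# "central tendency" for nominal data
print(statistics.mode(responses))  # -> 'stork'

# Ranked frequencies in %
total = len(responses)
for value in sorted(set(responses), key=responses.count, reverse=True):
    print(value, f"{100 * responses.count(value) / total:.0f}%")

# Mean and median require numeric data, e.g. counts per observation site
counts = [3, 1, 4, 1, 5]
print(statistics.mean(counts))    # -> 2.8
print(statistics.median(counts))  # -> 3
```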

Descriptive statistics are very well suited for graphical presentations.

**Testing a hypothesis (asking: are you right in assuming the population looks like “this”?)**

You assume the world can be described in a certain way; you have a **model** of the world, so to speak. Now you want to compare your data to the model to see if it fits, or at least how well it fits:

**Null Hypothesis** is the hypothesis “there is no difference”.

**Significance testing** uses the p-value (“p” for probability) to determine whether the distribution of the variables is statistically significant.

Qualitatively defined samples (groups) are not normally distributed; therefore I would like to test for independence between groups and rank the samples.

I use the chi-square test to test for independence between variables (my null hypothesis being that there is no dependency between the variables: they are distributed the same way across the different groups in my dataset).

If the probability is less than 5% (p < 0.05), the result is unlikely under the null hypothesis and we can reject it. The test indicates that the differences found (rejected null hypothesis) are unlikely to have occurred by chance, but it does not indicate the size of the effect.
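As a sketch of the procedure, here is a chi-square test of independence computed by hand in Python on an invented 2x2 table (roofs with or without a stork nest versus households with or without a newborn; all counts are made up for illustration):

```python
# Observed counts: rows = stork nest on roof (yes/no),
# columns = newborn in household (yes/no). Numbers are invented.
observed = [
    [30, 10],   # nest: baby, no baby
    [20, 40],   # no nest: baby, no baby
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under the null hypothesis of independence:
# expected[i][j] = row_total[i] * col_total[j] / grand_total
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

# The critical value for df = 1 at the 5% level is about 3.841
print(round(chi2, 2), "reject H0" if chi2 > 3.841 else "keep H0")
# -> 16.67 reject H0
```

In practice a library routine (e.g. `scipy.stats.chi2_contingency`) would also return the p-value directly; the hand calculation is only meant to show what the test actually compares.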

Or, explained simply: there is a statistically significant difference between two or more groups in what they have reported regarding their use of digital tools, though **we do not know why**.

**Effect size** is a calculation of how strong a significant relationship is. Until the effect size is 100% (e.g. all twins have or have had at least one biological sibling), there is **always** a reason to ask more questions! Any statistical statement with an effect size of less than 100% is a “provisional best explanation”.
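A common effect-size measure to pair with a chi-square test is Cramér's V. A minimal sketch in Python, assuming a hypothetical chi-square statistic of 16.67 from a 2x2 table with 100 observations (all numbers invented):

```python
import math

# Cramér's V: effect size for a chi-square test of independence.
# V = sqrt(chi2 / (n * (k - 1))), where k is the smaller of the
# number of rows and columns. All inputs here are hypothetical.
chi2 = 16.67
n = 100
k = 2

v = math.sqrt(chi2 / (n * (k - 1)))
print(round(v, 2))  # -> 0.41
```

A V of about 0.41 would be a moderate association: clearly significant, yet far from 100%, so there is still every reason to ask more questions.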

**One-sample t-test**
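As a sketch, the one-sample t-test statistic can be computed by hand in Python; it asks whether a sample mean differs from a hypothesized population mean (the sample values and hypothesized mean below are invented for illustration):

```python
import math
import statistics

# Hypothetical measurements and a hypothesized population mean
sample = [7.2, 6.8, 7.5, 7.1, 6.9, 7.4]
mu0 = 7.0

# t = (sample mean - mu0) / (sample stdev / sqrt(n))
n = len(sample)
t = (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))
print(round(t, 2))  # -> 1.34
```

The t statistic is then compared to a t distribution with n - 1 degrees of freedom to obtain a p-value; a library routine such as `scipy.stats.ttest_1samp` does both steps at once.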

Test-retest **reliability** lies in the description of the procedure.

**Triangulation** is a very pragmatic mixed-methods approach to **validating** findings by applying several qualitative methods to the same subject.

## Data mining for learning analytics

Data mining means having one or more sets of data and asking the data what answers it can come up with.

In data mining, statistical **variables** are called **features**.

**The true art** is to find relevant features to examine to find new interesting answers.

In the world of Big Data, you can get data from all sorts of sources. You choose! But is the data relevant? Is the data of good quality? Is the data available?

Can Twitter scrapings be used as a source for learning analytics? Which features are signs of learning?

All this will be continued.