FIELD DIARIES #6

Date: 11 - 02 - 2021

Topic: Analysis of one-dimensional variables

Objective: Construction of box-and-whisker diagrams.

Shared Resources:

Statistics in comics

Numerical Summary

Now we go from graphics to formulas. Our goal is to make simple calculations of the basic characteristics of a data set.

Each data set has two main properties: the central or typical value, and the dispersion of the data around that value.

High dispersion: The more dispersed a data distribution is, the larger its standard deviation.

Lower dispersion: The smaller the standard deviation, the more homogeneous the data, that is, the less dispersed they are. An increase in the standard deviation indicates greater variability in the data.
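To see this relationship in numbers, here is a small sketch in Python. The two data sets are made up for illustration: both have the same mean, but one is much more spread out than the other. The population standard deviation (`pstdev`) is used here as an assumption; the notes do not say which version of the formula was used in class.

```python
from statistics import mean, pstdev

# Two hypothetical data sets with the same mean but different spread.
tight = [9, 10, 10, 10, 11]
wide = [2, 6, 10, 14, 18]

# Both means are equal, so the center alone cannot tell the sets apart.
print("means:", mean(tight), mean(wide))

# The more dispersed set has the larger standard deviation.
print("standard deviations:", pstdev(tight), pstdev(wide))
```

The center is identical in both cases, so only the standard deviation reveals that the second set is less homogeneous.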

Average

The mean is represented by the symbol x̄ and is obtained by dividing the sum of all the data by the number of observations.
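As a quick illustration, the definition can be written in a couple of lines of Python. The values in `data` are hypothetical and only serve as an example.

```python
# Hypothetical observations, used only for illustration.
data = [4, 8, 15, 16, 23, 42]

# Mean: sum of all the data divided by the number of observations.
x_bar = sum(data) / len(data)

print(x_bar)  # 18.0
```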

The median

To find the median of a data set, we order the data from least to greatest. The median is the value in the center. When the number of data is even, there are two central values, and the median is their average.
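The procedure can be sketched in Python as follows. The function name `median_of` and the sample values are made up for this example. For an even number of data, the sketch averages the two central values, which is the usual convention.

```python
def median_of(values):
    """Return the median: the central value of the ordered data."""
    if not values:
        raise ValueError("the data set is empty")
    ordered = sorted(values)          # order from least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single central value
    # even count: average of the two central values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([7, 1, 5]))      # 5
print(median_of([7, 1, 5, 3]))   # 4.0
```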

Measures of dispersion

In addition to knowing the central point of a data set, we also want to describe its dispersion, that is, how far from the center the data are.

Interquartile range

It involves dividing the data into four equal groups and measuring the distance between the first and third quartiles, which bound the middle half of the data (IQR = Q3 - Q1).

1. Order the data numerically.

2. Divide the data at the median into two equal groups. (If the median coincides with a data point, include it in both groups.)

3. Find the median of the lower group. That is the first quartile.

4. The median of the upper group is the third quartile.
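The four steps above can be sketched in Python. The function name `quartiles` and the sample data are made up for this example. The sketch assumes the median coincides with a data point exactly when the number of data is odd, and in that case it includes that point in both groups, as step 2 says. Other textbooks and software use slightly different quartile rules, so results may not match them exactly.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves steps from the notes."""
    ordered = sorted(values)              # step 1: order the data
    n = len(ordered)
    half = n // 2
    if n % 2 == 1:
        # step 2, odd count: the median is a data point, keep it in both groups
        lower = ordered[: half + 1]
        upper = ordered[half:]
    else:
        # step 2, even count: the data split cleanly into two halves
        lower = ordered[:half]
        upper = ordered[half:]
    q1 = median(lower)                    # step 3: first quartile
    q3 = median(upper)                    # step 4: third quartile
    return q1, q3


data = [13, 1, 9, 5, 11, 3, 7]            # hypothetical data
q1, q3 = quartiles(data)
iqr = q3 - q1                             # interquartile range
print(q1, q3, iqr)                        # 4.0 10.0 6.0
```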

Standard deviation

Unlike the IQR, which is calculated from medians, the standard deviation measures the spread of the data around the mean. An intuitive way of looking at it is as roughly the average distance between the data and the mean x̄.
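A small Python sketch may make the calculation concrete. The data are hypothetical. The notes do not state whether the sum of squared distances is divided by n or by n - 1, so the sketch shows both: dividing by n gives the population standard deviation, and dividing by n - 1 gives the sample standard deviation.

```python
from math import sqrt
from statistics import pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]           # hypothetical data
n = len(data)
x_bar = sum(data) / n                     # the mean

# Squared distance of each data point from the mean.
squared = [(x - x_bar) ** 2 for x in data]

population_sd = sqrt(sum(squared) / n)        # divide by n
sample_sd = sqrt(sum(squared) / (n - 1))      # divide by n - 1

print(population_sd)                      # 2.0
print(sample_sd)

# The standard library gives the same two values.
print(pstdev(data), stdev(data))
```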

A rule of thumb

In nearly symmetric, mountain-shaped data sets, about 68% of the data lie less than one standard deviation from the mean, and about 95% lie less than two standard deviations from the mean.
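These two percentages can be checked with Python, under the assumption that "mountain-shaped" means an ideal normal distribution. Real data will only match them approximately. `NormalDist` comes from Python's standard `statistics` module (Python 3.8 or later).

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
curve = NormalDist(mu=0, sigma=1)

# Share of the data within one and within two standard deviations of the mean.
within_one = curve.cdf(1) - curve.cdf(-1)
within_two = curve.cdf(2) - curve.cdf(-2)

print(f"within 1 standard deviation:  {within_one:.1%}")
print(f"within 2 standard deviations: {within_two:.1%}")
```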

In summary, this topic covered:

1. Different ways of representing data.

2. Two different concepts of the data center, the median and the mean.

3. Two ways to calculate the dispersion of the data around the center.

4. Mountain-shaped histograms and z, a variable that indicates how many standard deviations from the mean an observation is.
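The variable z from the last point can be sketched in Python. The function name `z_scores` and the data are made up for this example, and the population standard deviation is used as an assumption. The sketch does not handle the case where every value is the same, because the standard deviation is then zero.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return how many standard deviations each value is from the mean."""
    x_bar = mean(values)
    sd = pstdev(values)       # assumption: population standard deviation
    return [(x - x_bar) / sd for x in values]


data = [2, 4, 4, 4, 5, 5, 7, 9]           # hypothetical data
for x, z in zip(data, z_scores(data)):
    print(x, z)
```

With this data set the mean is 5 and the standard deviation is 2, so the observation 9 has z = 2: it is two standard deviations above the mean.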

Personal conclusions

We learned about the numerical summary, the mean, the median, the measures of dispersion, and the interquartile range. It was also explained how each is derived from the mean or the median, together with their formulas.

Task

In the attached file, solve the exercise on the SOLVE sheet (marked in red).

Portafolio de Estadística por Cristina Huiracocha (D2-2020)
