Introduction to Mixed Models (Introdución aos Modelos Mixtos)
Abstract
In Statistics, regression models are the main tool for estimating the relationship between random variables. In particular, they describe how one or several variables (the explanatory variables) influence another variable (the response variable). Many data sets in Education, Medicine and the Environmental Sciences are hierarchically organized by their very nature, so that individuals are nested in groups; for example, a set of students grouped by school. It is natural to expect individuals in the same group to behave more alike than individuals from different groups, who have less information in common. In these cases classical regression models, which treat observations as independent, are no longer adequate, and the effect of the grouping on the response variable must be taken into account. The first proposals for studying this type of data without ignoring the existing groups are the analysis of variance (ANOVA) and analysis of covariance (ANCOVA) models; however, these models are only of interest when the goal is to apply statistical inference to certain features of the specific groups present in the data set.

Going one step further in the analysis of hierarchical data, the groups present in the data set can be regarded as a random sample from a larger population of groups, so that inference can be made about groups in general. In this case ANOVA and ANCOVA models are no longer appropriate, and the so-called mixed models, or multilevel models, arise. This work introduces mixed models and shows how, by incorporating random effects, they can be used to study data sets with a two-level structure, where individuals are at the first level and are nested in groups at the second level. The illustration uses a real data set, analysed with the statistical software R.
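As a minimal sketch of the kind of two-level model the abstract refers to, assume a single explanatory variable and a random intercept only. The notation below is illustrative and is not taken from the thesis: $y_{ij}$ is the response of individual $i$ in group $j$, $x_{ij}$ is the explanatory variable, $u_j$ is the random effect of group $j$, and $\varepsilon_{ij}$ is the individual-level error.

```latex
\[
  y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij},
  \qquad
  u_j \sim N(0, \sigma_u^2),
  \quad
  \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2),
\]
\[
  \text{with } u_j \text{ and } \varepsilon_{ij} \text{ mutually independent, so that }
  \operatorname{Corr}(y_{ij}, y_{i'j} \mid x_{ij}, x_{i'j})
  = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}
  \quad (i \neq i').
\]
```

Under these assumptions, two individuals in the same group share the term $u_j$, which is what makes their responses correlated. Treating $u_j$ as a draw from a population of groups, rather than as a fixed group coefficient, is what allows inference about groups in general.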
Description
Bachelor's thesis (Traballo Fin de Grao) in Mathematics. Academic year 2021-2022
Rights
Attribution-NonCommercial-ShareAlike 4.0 International



