La estimación no paramétrica de la densidad [Nonparametric density estimation]
Abstract
Density estimation is one of the areas of greatest interest in Statistics, both in its own right and because of the many applications that derive from it. This work focuses on the kernel density estimator, one of the most widely used nonparametric methods in the literature. We begin by introducing the concepts essential to its construction, and then state and derive its main properties. Despite the flexibility that nonparametric methods provide, the key issue when computing the kernel density estimator is the choice of the so-called bandwidth parameter. This choice is not straightforward, and a variety of methods based on different ideas have been developed over the years. Here we focus on the three most widely used: Silverman's method, the Sheather and Jones method, and cross-validation.
One of the objectives is to compare the behaviour of these selectors both with one another and with the theoretical optimum. To this end, we carry out a comprehensive simulation study using a statistical software package. The study analyses the finite-sample behaviour of the procedures for several sample sizes and four theoretical models, which cover a wide range of the features a density function may exhibit. Finally, we illustrate the use of the kernel estimator and the selectors studied with a real biomedical data set containing several measurements useful for detecting cardiomegaly (a condition characterised by an abnormally large heart).
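As a rough illustration of the objects the abstract refers to, the following minimal sketch evaluates a Gaussian kernel density estimator, f_h(x) = (1 / (n h)) * sum_i K((x - X_i) / h), with the bandwidth h chosen by Silverman's rule of thumb, h = 0.9 * min(sd, IQR / 1.34) * n^(-1/5). It is not taken from the thesis: the function names, the Gaussian kernel, the simulated standard normal sample, and the evaluation grid are all assumptions made for this example.

```python
import math
import random
import statistics


def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(sample)
    sd = statistics.stdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)  # sample quartiles
    scale = min(sd, (q3 - q1) / 1.34)
    return 0.9 * scale * n ** (-1 / 5)


def kernel_density(sample, x, h):
    """Gaussian kernel density estimate at the point x with bandwidth h."""
    n = len(sample)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)
    return total / (n * h * math.sqrt(2 * math.pi))


# Hypothetical data: 200 draws from a standard normal distribution.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]

h = silverman_bandwidth(sample)

# Evaluate the estimate on an evenly spaced grid over [-5, 5].
step = 0.02
grid = [-5 + step * i for i in range(501)]
density = [kernel_density(sample, x, h) for x in grid]

# A Riemann sum over the grid should be close to 1 for a density estimate.
area = sum(density) * step
print(f"bandwidth h = {h:.4f}, approximate area = {area:.4f}")
```

Rerunning the sketch with a much smaller or much larger h shows why bandwidth selection matters: small values give a spiky estimate, large values oversmooth it.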
Description
Bachelor's thesis (Traballo Fin de Grao) in Mathematics. Academic year 2021-2022
Rights
Attribution-NonCommercial-ShareAlike 4.0 International







