Please use this identifier to cite or link to this item: http://hdl.handle.net/10259/9825
Title
Dataset of the paper “Variable selection for linear regression in large databases: exact methods” Applied Intelligence, 51(6), 3736-3756
Publisher
Universidad de Burgos
Publication date
2020
DOI
10.36443/10259/9825
Abstract
The variable selection problem in the context of Linear Regression for large databases is analysed. The problem consists of selecting a small subset of independent variables that can perform the prediction task optimally. This problem has a wide range of applications. One important type of application is the design of composite indicators in various areas (sociology and economics, for example). Other important applications of variable selection in linear regression can be found in fields such as chemometrics, genetics, and climate prediction, among many others. For this problem, we propose a Branch & Bound method. This is an exact method and therefore guarantees optimal solutions. We also provide strategies that enable this method to be applied to very large databases (with hundreds of thousands of cases) in moderate computation time. A series of computational experiments shows that our method performs well compared with well-known methods in the literature and with commercial software.
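To make the idea of exact subset selection concrete, the following is a minimal sketch of a generic Branch & Bound search for the best subset of `k` variables. It is not the method from the paper, whose bounding and scaling strategies are not described on this page. It relies only on the standard fact that adding variables to a least-squares fit never increases the residual sum of squares (RSS), so the RSS of the fit on all variables still available at a node is a lower bound for every subset below that node. The function names `rss` and `best_subset_bnb` and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of the least-squares fit on `cols` plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def best_subset_bnb(X, y, k):
    """Return (columns, rss) of the size-k subset with minimum RSS (hypothetical helper)."""
    best = {"rss": np.inf, "cols": None}

    def search(included, candidates):
        # Leaf: a complete subset of size k, so evaluate it exactly.
        if len(included) == k:
            value = rss(X, y, included)
            if value < best["rss"]:
                best["rss"], best["cols"] = value, tuple(included)
            return
        # Not enough variables left to reach size k.
        if len(included) + len(candidates) < k:
            return
        # Bound: no subset below this node can beat the fit on all
        # variables still available, because RSS never increases
        # when variables are added.
        if rss(X, y, included + candidates) >= best["rss"]:
            return
        first, rest = candidates[0], candidates[1:]
        search(included + [first], rest)  # branch: include `first`
        search(included, rest)            # branch: exclude `first`

    search([], list(range(X.shape[1])))
    return best["cols"], best["rss"]


# Synthetic example (assumed data, not the dataset in this item).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 8))
y_demo = (3.0 * X_demo[:, 1] - 2.0 * X_demo[:, 4] + 1.5 * X_demo[:, 6]
          + 0.1 * rng.normal(size=200))
print(best_subset_bnb(X_demo, y_demo, 3))
```

On this synthetic data the search is expected to recover columns 1, 4, and 6, since those are the only variables used to generate the response. The sketch refits every model from scratch, which is far too slow for databases with hundreds of thousands of cases; it is meant only to show the bounding principle.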
Keywords
Variable selection
Linear regression
Branch & Bound methods
Heuristics
Subject
Operations research
Databases
Files in this item
Size: 84.98 MB, Format: zip
Size: 97.10 MB, Format: zip
Size: 82.64 MB, Format: zip