         Type:  Article
      
    
          CAMDA 2023: Finding patterns in urban microbiomes
         Haydeé Contreras-Peruyero;  Imanol Nuñez;  Mirna Vazquez-Rosas-Landa;  Daniel Santana-Quinteros;  Antón Pashkov;  Mario E. Carranza-Barragán;  Rafael Perez-Estrada;  Shaday Guerrero-Flores;  Eugenio Balanzario;  Víctor Muñiz Sánchez;  Miguel Nakamura;  L. Leticia Ramírez-Ramírez;  Nelly Sélem-Mojica
     
    
    
          
         Abstract: 
The Critical Assessment of Massive Data Analysis (CAMDA) addresses the complexities of harnessing Big Data in life sciences by hosting annual competitions that inspire research groups to develop innovative solutions. In 2023, the Forensic Challenge focused on identifying the city of origin for 365 metagenomic samples collected from public transportation systems and identifying associations between bacterial distribution and other covariates. For microbiome classification, we incorporated both taxonomic and functional annotations as features. To identify the most informative Operational Taxonomic Units (OTUs), we selected features by fitting negative binomial models. We then implemented supervised models, conducting 5-fold cross-validation (CV) with a 4:1 training-to-validation ratio. After variable selection, which reduced the dataset to fewer than 300 OTUs, the Support Vector Classifier achieved the highest F1 score (0.96). When using functional features from MIFASER, the Neural Network model outperformed the other models. When considering climatic and demographic variables of the cities, Dirichlet regression over the abundances of Escherichia, Enterobacter, and Klebsiella suggests that population increase is associated with a rise in the mean of Escherichia, while decreasing temperature is linked to higher proportions of Klebsiella. This study validates microbiome classification using taxonomic features and, to a lesser extent, functional features. It shows that demographic and climatic factors influence urban microbial distribution. A Docker container and a Conda environment are available in the project's GitHub repository, facilitating broader adoption and validation of these methods by the scientific community.
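
The evaluation setup mentioned in the abstract can be illustrated with a minimal sketch. With 365 samples, 5-fold cross-validation leaves 73 samples for validation and 292 for training in every fold, which is the stated 4:1 ratio. The code below is not the authors' implementation: the helper names `k_fold_indices` and `macro_f1` are hypothetical, the shuffling seed is arbitrary, and macro averaging of the F1 score is an assumption, since the abstract does not say how the score was averaged or whether the folds were stratified by city.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for shuffled k-fold cross-validation.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, val_idx


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# 365 is the number of samples stated in the abstract.
folds = list(k_fold_indices(365, k=5))
fold_sizes = [(len(train), len(val)) for train, val in folds]
print(fold_sizes)  # five folds of (292, 73), i.e. a 4:1 split
```

In practice a classifier would be fitted on each `train_idx` subset and scored with `macro_f1` on the matching `val_idx` subset, and the five scores would then be averaged.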
    
   
         Journal: Frontiers in Genetics
      
    
      ISSN:  1664-8021
      
     
         Year:  2024
        
      
        Volume:  15
      
     
         Revision:  1
      
    
    
    
          
    
          
    
    
    
    
 Created:  2025-05-21 14:07:20
       Modified: 2025-06-30 13:16:42
       Reference reviewed
        
    Institutional Authors Associated with the Reference:
    
      
      
    