Seventeen new datasets from R packages (intkrige, RSDA, MAINT.Data, GPCSIV, HistDAWass, symbolicDA), the Billard & Diday (2007) textbook, and the QualAr Portuguese air quality network:
- utsnow.int: 415 Utah weather stations with snow load prediction intervals plus coordinates/elevation (from intkrige).
- lynne1.int: 10 observations with pulse rate, systolic, and diastolic pressure intervals (from RSDA).
- loans_by_risk_quantile.int: 35 Lending Club loan groups (A1-G5) with 4 quantile-based financial intervals (from MAINT.Data).
- judge1.int: 6 regions rated by Judge 1 on 4 interval variables (from GPCSIV).
- judge2.int: 6 regions rated by Judge 2 on 4 interval variables (from GPCSIV).
- judge3.int: 6 regions rated by Judge 3 on 4 interval variables (from GPCSIV).
- video1.int: 10 user groups with 5 video engagement interval metrics (from GPCSIV).
- video2.int: 10 user groups with 5 video engagement interval metrics (from GPCSIV).
- video3.int: 10 user groups with 5 video engagement interval metrics (from GPCSIV).
- lisbon_air_quality.int: 1096 daily observations of 8 pollutant concentration intervals from Lisbon (QualAr).
- blood.hist: 14 gender-age groups with cholesterol, hemoglobin, and hematocrit histograms (from HistDAWass).
- china_climate_month.hist: 60 Chinese weather stations with 168 monthly climate histograms (from HistDAWass).
- china_climate_season.hist: 60 Chinese stations with 56 seasonal climate histograms (from HistDAWass).
- exchange_rate_returns.hist: 108 monthly exchange rate return histograms (from HistDAWass).
- polish_cars.mix: 30 Polish car models with 9 interval + 3 multinomial variables (from symbolicDA).
- hierarchy.hist: 10 observations with hierarchical categories, conditional histograms, and cholesterol interval (Table 6.20).
- bird_color_taxonomy.hist: 20 birds with density/size histograms, tone, and fuzzy shade taxonomy (Tables 6.9/6.14).

Nineteen new datasets from Billard & Diday (2020) Clustering Methodology for Symbolic Data, the HistDAWass R package, and other R packages:
- genome_abundances.int: 14 genome classes with 10 dinucleotide abundance intervals (Table 3-16).
- china_temp_monthly.int: 15 Chinese weather stations with 12 monthly temperature intervals + elevation (Table 7-9).
- ecoli_routes.int: 9 E. coli transport routes with 5 biochemical interval variables (Table 8-10).
- loans_by_risk.int: 35 Lending Club loan groups by risk level (A-G) with 4 financial intervals (from MAINT.Data).
- polish_voivodships.int: 18 Polish voivodships with 9 socio-economic interval variables (from clusterSim).
- iris_species.hist: 3 iris species with 4 morphological histogram variables (Table 4-10).
- flights_detail.hist: 16 airlines with 5 flight performance histograms (Table 5-1).
- cover_types.hist: 7 forest cover types with 4 topographic histograms (Table 7-21).
- glucose.hist: 4 regions with blood glucose histograms (Table 4-14).
- state_income.hist: 6 US states with 4 income distribution histograms (Table 7-18).
- simulated.hist: 5 simulated observations with 2 histogram variables (Table 7-26).
- age_pyramids.hist: 229 countries with 3 age pyramid histograms (from HistDAWass).
- ozone.hist: 84 daily observations with 4 weather histograms (from HistDAWass).
- french_agriculture.hist: 22 French regions with 4 agricultural histograms (from HistDAWass).
- household_characteristics.distr: 12 counties with 3 categorical distribution variables (Table 6-1).
- county_income_gender.hist: 12 counties with gendered income histograms + sample sizes (Table 6-16).
- joggers.mix: 10 jogger groups with pulse rate intervals + running time histograms (Table 2-5).
- census.mix: 10 census regions with 6 mixed-type variables: histograms, distributions, multi-valued sets, and intervals (Table 7-23).
- mtcars.mix: 5 car groups with 7 interval + 4 modal variables (from ggESDA).

Thirteen new datasets extracted from R packages and the Billard & Diday (2006) textbook:
- cardiological.int: 44 patients with 5 interval-valued physiological measurements (from RSDA).
- prostate.int: 97 prostate cancer patients with 9 clinical interval variables (from RSDA).
- uscrime.int: 46 US states with 102 interval-valued crime statistics (from RSDA).
- hardwood.hist: 5 hardwood tree species with 4 histogram-valued climate variables (from RSDA).
- synthetic_clusters.int: 125 observations in 5 clusters with 6 interval variables (from symbolicDA).
- environment.mix: 14 EPA state groups with mixed interval/modal environmental data (from ggESDA).
- weight_age.hist: 7 age groups with histogram-valued weight distributions (Table 3.10).
- hospital.hist: 15 hospitals with histogram-valued cost distributions (Table 3.12).
- cholesterol.hist: 14 gender-age groups with cholesterol histograms (Table 4.5).
- hemoglobin.hist: 14 gender-age groups with hemoglobin histograms (Table 4.6).
- hematocrit.hist: 14 gender-age groups with hematocrit histograms (Table 4.14).
- hematocrit_hemoglobin.hist: 10 observations with bivariate 2-bin histograms (Table 6.8).
- energy_usage.distr: 10 towns with categorical fuel/heating distributions (Table 3.7).

Seven new interval-valued benchmark datasets from recent SDA papers (2020-2025):
- freshwater_fish.int: 12 freshwater fish species with 13 heavy metal bioaccumulation variables, 4 feeding classes (Andrade et al., 2025).
- fungi.int: 55 fungi specimens with 5 morphological variables, 3 genera: Amanita, Agaricus, Boletus (Andrade et al., 2025).
- iris.int: 30 interval observations of Fisher's iris data, 4 sepal/petal variables, 3 species (Andrade et al., 2025).
- water_flow.int: 316 water flow sensor readings with 47 interval features, 2 classes (Andrade et al., 2025).
- wine.int: 33 wine samples with 9 chemical/physical property variables, 2 classes (Andrade et al., 2025).
- car_models.int: 33 Italian car models with 8 specification variables, 4 categories (Andrade et al., 2025).
- hdi_gender.int: 183 countries with 2 World Bank gender indicator intervals and ordinal HDI classification (Alcacer et al., 2023).

- MM_to_RSDA(): convert MM format (_min/_max columns) to RSDA format (symbolic_tbl with complex-encoded intervals).
- iGAP_to_RSDA(): convert iGAP format to RSDA format via iGAP_to_MM → MM_to_RSDA.
- int_list_conversions() now returns 8 conversions (was 6), including MM_to_RSDA and iGAP_to_RSDA.
- int_convert_format() now supports to = "RSDA" for MM and iGAP sources with auto-detection.

Adopted snake_case naming with type suffixes (.int, .hist, .mix, .distr, .iGAP) for all datasets. Renamed 10 existing datasets:

| Old name | New name |
|---|---|
| Abalone | abalone.int |
| Abalone.iGAP | abalone.iGAP |
| Cars.int | cars.int |
| ChinaTemp.int | china_temp.int |
| Face.iGAP | face.iGAP |
| LoansbyPurpose.int | loans_by_purpose.int |
| bird.int | bird.mix |
| soccer.bivar.int | soccer_bivar.int |
| airline_flights | airline_flights.hist |
| health_insurance | health_insurance.mix |
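
The MM and RSDA formats named in the conversion notes above differ mainly in how an interval is stored: MM keeps separate _min/_max columns, while the RSDA format is described as complex-encoded. The sketch below illustrates that idea in base R only. The helper mm_to_complex() and the toy data are hypothetical, the real part is assumed to hold the lower bound and the imaginary part the upper bound, and this is not the package's MM_to_RSDA() implementation.

```r
# Hypothetical helper (not the package's MM_to_RSDA()): collapse each
# <name>_min / <name>_max column pair into one complex column.
# Assumption: real part = lower bound, imaginary part = upper bound.
mm_to_complex <- function(df) {
  min_cols <- grep("_min$", names(df), value = TRUE)
  vars <- sub("_min$", "", min_cols)
  out <- lapply(vars, function(v) {
    lo <- df[[paste0(v, "_min")]]
    hi <- df[[paste0(v, "_max")]]
    # Every _min column needs a matching _max, and bounds must be ordered.
    stopifnot(!is.null(hi), all(lo <= hi))
    complex(real = lo, imaginary = hi)
  })
  names(out) <- vars
  as.data.frame(out)
}

# Toy data in MM layout (values are made up for illustration).
mm <- data.frame(
  pulse_min = c(60, 70),     pulse_max = c(72, 90),
  systolic_min = c(90, 110), systolic_max = c(120, 140)
)

enc <- mm_to_complex(mm)
Re(enc$pulse)  # lower bounds of the pulse intervals
Im(enc$pulse)  # upper bounds of the pulse intervals
```

Packing both bounds into one complex value keeps each interval variable in a single column, so the result has one column per variable instead of two.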
- acid_rain.int, bats.int, credit_card.int, employment.int, oils.int, teams.int, tennis.int, temperature_city.int, trivial_intervals.int, world_cup.int (interval-valued)
- bird_species.mix, bird_species_extended.mix, town_services.mix (mixed symbolic)
- lung_cancer.hist (histogram-valued)
- energy_consumption.distr (distribution-valued)
- bank_rates, mushroom_fuzzy (other)

interval_format_conversions.R):
- int_detect_format(): automatically detect interval data format (RSDA, MM, iGAP, SODAS).
- int_list_conversions(): list available format conversion functions, with optional filtering by source/target format.
- int_convert_format(): unified interface for all interval format conversions with auto-detection.
- R CMD check: 0 errors, 0 warnings, 0 notes.

- interval_dist.R): int_dist(), int_dist_matrix(), int_pairwise_dist(), int_dist_all(). 14 distance measures (GD, IY, L1, L2, CB, HD, EHD, nEHD, snEHD, TD, WD, minkowski, ichino, de_carvalho) with method aliases (euclidean, hausdorff, manhattan, city_block, wasserstein).
- interval_geometry.R): int_width(), int_radius(), int_center(), int_overlap(), int_containment(), int_midrange().
- interval_position.R): int_median(), int_quantile(), int_range(), int_iqr(), int_mad(), int_mode(). All support 8 methods (CM, VM, QM, SE, FV, EJD, GQ, SPT).
- interval_robust.R): int_trimmed_mean(), int_winsorized_mean(), int_trimmed_var(), int_winsorized_var().
- interval_shape.R): int_skewness(), int_kurtosis(), int_symmetry(), int_tailedness().
- interval_similarity.R): int_jaccard(), int_dice(), int_cosine(), int_overlap_coefficient(), int_tanimoto(), int_similarity_matrix().
- interval_uncertainty.R): int_entropy(), int_cv(), int_dispersion(), int_imprecision(), int_granularity(), int_uniformity(), int_information_content().

- R/interval_format_conversions.R, organized by target format.
- R/interval_utils.R: consolidated 7 internal interval helpers, format-preparation functions (RSDA_format, set_variable_format, clean_colnames), and their internal helpers.
- R/histogram_utils.R: 8 internal helpers for histogram statistics.
- utils_validation.R to validation.R.
- R CMD check: 0 errors, 0 warnings, 0 notes. All 399 tests pass.

- R/utils_validation.R: 11 internal validation helpers centralizing all checks.
- RSDA_format fix: replaced 4 return("Error") with proper stop() calls.
- R CMD check: 0 errors, 0 warnings.

- NEWS.md with changelog for all versions.
- int_mean, int_var, int_cov, int_cor (8 methods).
- hist_mean, hist_var, hist_cov, hist_cor.
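
The RSDA_format fix listed above replaces return("Error") with stop(). The sketch below shows why that change matters, using two hypothetical validators (check_bounds_old() and check_bounds_new()) that are not taken from the package. A returned string is an ordinary value that the caller can overlook, whereas stop() signals a condition that halts execution or can be caught with tryCatch().

```r
# Hypothetical validators, for illustration only (not the package's code).

# Old style: a failure is reported as an ordinary character value.
check_bounds_old <- function(lo, hi) {
  if (any(lo > hi)) return("Error")
  TRUE
}

# New style: a failure signals an error condition.
check_bounds_new <- function(lo, hi) {
  if (any(lo > hi)) stop("lower bound exceeds upper bound", call. = FALSE)
  TRUE
}

# The second interval is invalid (5 > 3).
lo <- c(1, 5)
hi <- c(2, 3)

# Old style: execution continues unless the caller compares against "Error".
res_old <- check_bounds_old(lo, hi)

# New style: the error stops execution unless handled explicitly.
res_new <- tryCatch(
  check_bounds_new(lo, hi),
  error = function(e) conditionMessage(e)
)
```

With the stop() variant, invalid input cannot flow into later computations unnoticed, and test code can assert on the error directly.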