Submission to the 7th LifeCLEF plant identification challenge. Best public Kaggle F1 0.41826, private F1 0.40283, good for 7th place on the private leaderboard.






Pipeline
Quadrat in, species set out. The same i002 BioCLIP 2.5 checkpoint runs at 224 and 336 pixels over a 4 by 4 tile grid, the per tile softmax distributions are averaged, class prior logit adjustment is applied, and an adaptive probability threshold emits the final species set.
Overview
We train on isolated single plant photographs and predict every species visible in a 1m² in situ vegetation quadrat. Training and test data live in different visual worlds.
Our pipeline is a single fine tuned BioCLIP 2.5 ViT H/14 with only the last 4 of 32 transformer blocks unfrozen, per task MLP heads at species, genus and family, and a 4 by 4 tile inference at two resolutions. More clean training data without a per species cap (the 2.65 million image i001 manifest combining PlantCLEF 2024 with research grade iNaturalist) is what gave the best result; an explicit cap and tail augmentation in i003 lowered public F1.
Across more than 20 experiments and 600+ Kaggle submissions, the bottleneck kept being the single plant to multi species distribution shift, not model capacity. Held out single plant validation accuracy correlates with quadrat F1 at only r ≈ 0.4. Best public F1 of 0.41826, private F1 of 0.40283, from a 224 plus 336 px ensemble of our i002 checkpoint with class prior logit adjustment (τ = 0.25) and an adaptive probability threshold (T = 0.03, k ∈ [2, 10]).
The task
PlantCLEF 2026 is a single multi label track over 7,806 species. Training images are single plant close ups; test images are multi species LUCAS quadrats.
Given a top down photograph of a LUCAS vegetation plot, predict every vascular plant species visible. Plots contain 1 to 10 species at varying scale.
Models train on isolated single plant photos and then compose those representations into a multi label prediction on plots where species co occur and occlude each other. The 7,806 species axis is fixed.
Datasets
All splits share the canonical 7,806 species axis. Taxonomy joins on gbif_species_id.
Pl@ntNet close up photographs across 7,806 species. The original supervised training set.
Research grade iNaturalist images, de duplicated and image verified, added to the manifest during the i001 data rebuild.
What the deployed i002 model trains on: stratified 90/10 by species (seed 42), species with under 5 images held entirely in train.
In situ 1m² quadrats with 1 to 10 species each, expert annotated. The distribution we transfer to.
7,806 species into 1,446 genera and 181 families. Drives the auxiliary taxonomy heads at training time.
Results
Scores pulled from the Kaggle public API for plantclef-2026 and merged with our local submission logs (240 submissions across 18 experiments).
Figure 1
Each dot is the best public Kaggle F1 from one experiment. The dashed line tracks best so far. 010 is the partial unfreeze breakthrough; i002 is long tail capping with adaptive selection.
Figure 2
Holding every other knob fixed, sweeping the number of unfrozen transformer blocks. Val top 5 rises monotonically; Kaggle F1 peaks sharply at n=4. The bottleneck is train versus test distribution, not capacity.
Figure 3
Each dot is one training run. Validation top 1 (x) and Kaggle public F1 (y) are only weakly correlated, Pearson r = 0.41. Within the 014b sweep the relationship inverts: n=5 has the best val and the worst Kaggle.
Figure 4
Every taxonomic level rises smoothly across the 5 epoch PC24, iNat run. Yet Kaggle public F1 went from 0.37506 at ep1 to 0.37956 at ep5, still below the 0.38333 010 anchor without iNat. Adding capacity to the wrong distribution.
Figure 5
PC24 is heavily right skewed. 2,263 species have 501 to 1,000 training images while 145 have a single image. The 500 cap from i003 collapses the right tail but leaves 1,644 species with under 100 images unchanged.
Submissions
Ranked by public F1. Click any column to re sort. Highest public was 0.41826; highest private was 0.40600.
| # | Exp | Recipe | Public F1↓ | Private F1 |
|---|---|---|---|---|
| 01 | i002Cap, extra | ensemble 224 + 336 px, LA tau 0.25, probT 0.03best public | 0.41826 | 0.40283 |
| 02 | i002Cap, extra | 224 + 336 + GBIF DOY phenology Boltzmann prior | 0.41346 | 0.38766 |
| 03 | i002Cap, extra | 224 + 336, LA tau 0.25, genus rerank, probT 0.035 | 0.41246 | 0.39946 |
| 04 | i002Cap, extra | last_blocks 4, softmax mean, probT 0.05 | 0.41165 | 0.38132 |
| 05 | i002Cap, extra | last_blocks 336 px (pos_embed bicubic resize) | 0.41117 | 0.38376 |
Resources
Code repository, method paper, and per experiment notes.
Training, inference, and analysis code for the ARM Wision submission, covering the BioCLIP 2.5 partial fine tune, the tiled inference pipeline, and the full experiment ladder.
Partial unfreeze sweet spot, long tail capping, validation versus leaderboard decoupling, and ablations across the experiment ladder. Will be linked here when published.
Per experiment summaries from 001 through i003 with recipe, validation metrics, Kaggle public and private scores, and the lessons that carried forward.
The deployed i002 model trains on the 2.65 million image i001 manifest combining PlantCLEF 2024 with a research grade iNaturalist pull, stratified 90/10 by species with seed 42.
Team
ANU Deep Learning, Semester 1 2026.
Cite this work
If you build on this work, please also cite the PlantCLEF 2026 overview paper and the Pl@ntNet observation pipeline.