S23DR 2026: Submission v2 (Tuned Post-Processing)

Learned-model submission for the S23DR 2026 wireframe-estimation challenge.

What changed vs v1

Same checkpoint as the original submission (checkpoint.pt, 170k steps, 4096-pt), but with post-processing hyperparameters tuned via grid search on the local 1024-scene validation set:

Parameter                 v1 (default)  v2 (tuned)  Effect
MERGE_THRESH (merge_end)  0.55          0.40        Less aggressive vertex merge → better recall
SNAP_RADIUS               0.50          0.70        Larger point-cloud snap radius → better precision

Local val results (1024 scenes, 4096-pt):

Version                                    HSS@0.5  vF1     eIoU
v1: original submission (default params)   0.376    0.428   0.323
v2: tuned post-processing                  0.390    0.457   0.353
Δ                                          +0.014   +0.029  +0.030

Pipeline

raw multi-view sample
  -> point fusion          per-view depth unprojection + COLMAP, labeled
                           by ADE/Gestalt              (point_fusion.py)
  -> priority sampling     4096 pts: 3072 COLMAP + 1024 depth
                                                       (make_sampled_cache.py)
  -> Perceiver             hidden=256, 256 latents x 7 layers, 64 segment
                           queries; (midpoint, direction, length) + confidence
                                                       (model.py)
  -> postprocess           conf > 0.5 -> segments -> iterative vertex merge
                           (end=0.40) -> snap to point cloud (r=0.70)
                           -> horizontal snap
  -> {order_id, wf_vertices, wf_edges}                 -> submission.json
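
The merge and snap steps of the post-processing stage can be sketched roughly as follows. This is a minimal illustration, not the code in script.py: the function names `merge_vertices` and `snap_to_cloud` are hypothetical, the merge is a single greedy pass at the final threshold (the actual pipeline is described as iterative, ending at 0.40), and the horizontal snap is omitted.

```python
import numpy as np

def merge_vertices(endpoints, merge_thresh=0.40):
    """Greedy merge of segment endpoints [K, 2, 3] into shared vertices.

    Simplification (assumption): one pass at the final threshold instead of
    the iterative schedule used by the real pipeline.
    Returns (vertices [V, 3], sorted list of (i, j) edges).
    """
    pts = endpoints.reshape(-1, 3)
    clusters, centers = [], []
    assign = np.empty(len(pts), dtype=int)
    for i, p in enumerate(pts):
        if centers:
            dists = np.linalg.norm(np.asarray(centers) - p, axis=1)
            j = int(np.argmin(dists))
            if dists[j] < merge_thresh:
                # Join the nearest cluster and update its centroid.
                clusters[j].append(i)
                centers[j] = pts[clusters[j]].mean(axis=0)
                assign[i] = j
                continue
        # Otherwise start a new cluster at this point.
        clusters.append([i])
        centers.append(p.copy())
        assign[i] = len(centers) - 1
    vertices = np.asarray(centers).reshape(-1, 3)
    edges = set()
    for a, b in assign.reshape(-1, 2):
        if a != b:  # drop segments whose two ends collapsed into one vertex
            edges.add((int(min(a, b)), int(max(a, b))))
    return vertices, sorted(edges)

def snap_to_cloud(vertices, cloud, snap_radius=0.70):
    """Move each vertex onto its nearest cloud point if within snap_radius."""
    # Brute-force [V, N] distances; a KD-tree would suit large clouds better.
    d = np.linalg.norm(vertices[:, None, :] - cloud[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    within = d[np.arange(len(vertices)), nearest] <= snap_radius
    out = vertices.copy()
    out[within] = cloud[nearest[within]]
    return out
```

With a smaller `merge_thresh`, fewer nearby endpoints collapse into one vertex, while a larger `snap_radius` lets more vertices move onto observed points. This matches the direction of the v2 changes.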

Run inference

python script.py

The competition harness provides params.json and runs script.py. The script downloads the dataset to /tmp/data, loads checkpoint.pt, iterates over validation + test splits, and writes submission.json.
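
As a rough illustration of the final step, the sketch below packs one scene's wireframe into the {order_id, wf_vertices, wf_edges} record named in the pipeline and writes a list of such records as JSON. The helper names `make_entry` and `write_submission` are hypothetical, and the assumption that submission.json holds a flat list of records is mine, not confirmed by the harness.

```python
import json

def make_entry(order_id, vertices, edges):
    """Build one record with the three keys named in the pipeline."""
    return {
        "order_id": order_id,
        # Plain floats and ints so the record is JSON-serializable
        # even when the inputs are NumPy scalars.
        "wf_vertices": [[float(c) for c in v] for v in vertices],
        "wf_edges": [[int(a), int(b)] for a, b in edges],
    }

def write_submission(entries, path="submission.json"):
    """Write all records to disk (assumed layout: a JSON list of records)."""
    with open(path, "w") as f:
        json.dump(entries, f)
```

A caller would collect one `make_entry(...)` result per scene across the validation and test splits, then call `write_submission` once at the end.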

Requirements: CUDA GPU, torch, huggingface_hub, datasets, numpy, opencv-python, scipy, tqdm.

Model details

Property      Value
Architecture  Perceiver IO (TokenTransformerSegments)
Parameters    ~8.85 M
Input         [B, 4096, 95]: xyz + normals + local PCA features
Output        [B, 64] segment predictions
Checkpoint    checkpoint.pt (170k steps, same as v1)
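
To make the segment parameterization concrete, here is a small sketch of how (midpoint, direction, length) plus confidence could be turned into endpoint pairs for one scene. The function name `decode_segments` and the layout of the inputs as separate arrays are assumptions for illustration. The actual tensor layout in model.py is not documented here.

```python
import numpy as np

def decode_segments(midpoints, directions, lengths, conf, conf_thresh=0.5):
    """Convert per-query predictions into segment endpoints.

    midpoints, directions: [Q, 3]; lengths, conf: [Q] (Q = 64 queries).
    Returns an array of shape [K, 2, 3] for the K queries with conf > conf_thresh.
    """
    keep = conf > conf_thresh
    mid = midpoints[keep]
    d = directions[keep]
    # Normalize defensively; the model may already emit unit vectors.
    d = d / np.maximum(np.linalg.norm(d, axis=1, keepdims=True), 1e-8)
    half = 0.5 * lengths[keep][:, None] * d
    # Endpoints lie half a length on either side of the midpoint.
    return np.stack([mid - half, mid + half], axis=1)
```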

Post-processing sweep

Grid search results (200 scenes per config):

merge_end  snap_radius  HSS@0.5
0.4        0.7          0.379  ← chosen
0.4        0.5          0.376
0.5        0.7          0.372
0.5        0.5          0.371
0.6        0.7          0.368
0.5        0.3          0.365
0.6        0.5          0.364
0.7        0.5          0.359
0.6        0.3          0.359
0.7        0.3          0.354
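
The selection logic behind a sweep like this can be sketched in a few lines. The `rank_configs` helper and the `evaluate` callback are hypothetical. In a real run, `evaluate` would post-process the 200 scenes with the given parameters and return HSS@0.5. Here the reported scores from the table above stand in for it, so the example only reproduces the ranking.

```python
def rank_configs(evaluate, configs):
    """Score each (merge_end, snap_radius) pair and sort best-first."""
    results = []
    for merge_end, snap_radius in configs:
        score = evaluate(merge_end=merge_end, snap_radius=snap_radius)
        results.append((score, merge_end, snap_radius))
    # Higher HSS@0.5 is better; ties keep their original order.
    results.sort(key=lambda r: r[0], reverse=True)
    return results

# Stand-in for a real evaluation: the scores reported in the sweep table.
REPORTED = {
    (0.4, 0.7): 0.379, (0.4, 0.5): 0.376, (0.5, 0.7): 0.372,
    (0.5, 0.5): 0.371, (0.6, 0.7): 0.368, (0.5, 0.3): 0.365,
    (0.6, 0.5): 0.364, (0.7, 0.5): 0.359, (0.6, 0.3): 0.359,
    (0.7, 0.3): 0.354,
}

def lookup_score(merge_end, snap_radius):
    return REPORTED[(merge_end, snap_radius)]

ranked = rank_configs(lookup_score, list(REPORTED))
best_score, best_merge_end, best_snap_radius = ranked[0]
```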
