Model card update #2
opened by pcuenq (HF Staff)
README.md CHANGED
@@ -16,6 +16,10 @@ widget:
 
 # DepthPro: Monocular Depth Estimation
 
+
+
+This is the transformers version of DepthPro, a foundation model for zero-shot metric monocular depth estimation, designed to generate high-resolution depth maps with remarkable sharpness and fine-grained details. For the checkpoint compatible with the original codebase, please check [this repo](https://huggingface.co/apple/DepthPro).
+
 ## Table of Contents
 
 - [DepthPro: Monocular Depth Estimation](#depthpro-monocular-depth-estimation)
@@ -34,8 +38,6 @@ widget:
 
 ## Model Details
 
-
-
 DepthPro is a foundation model for zero-shot metric monocular depth estimation, designed to generate high-resolution depth maps with remarkable sharpness and fine-grained details. It employs a multi-scale Vision Transformer (ViT)-based architecture, where images are downsampled, divided into patches, and processed using a shared Dinov2 encoder. The extracted patch-level features are merged, upsampled, and refined using a DPT-like fusion stage, enabling precise depth estimation.
 
 The abstract from the paper is the following:
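The Model Details paragraph describes a multi-scale scheme: the image is downsampled, each downsampled copy is divided into patches, and every patch goes through one shared encoder. The sketch below only illustrates that bookkeeping and is not the model's implementation. The sizes (1536, 384), the scale factors, and the function names `patch_boxes` and `multi_scale_patches` are hypothetical, and the tiles here do not overlap, which the actual model may handle differently.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right)


def patch_boxes(image_size: int, patch_size: int) -> List[Box]:
    """Tile a square image of side image_size with non-overlapping square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    steps = range(0, image_size, patch_size)
    return [(t, l, t + patch_size, l + patch_size) for t in steps for l in steps]


def multi_scale_patches(
    base_size: int, patch_size: int, scales: List[float]
) -> Dict[float, List[Box]]:
    """Map each scale factor to the patch boxes of the image downsampled by that factor."""
    plan: Dict[float, List[Box]] = {}
    for scale in scales:
        scaled_size = int(base_size * scale)  # side length after downsampling
        plan[scale] = patch_boxes(scaled_size, patch_size)
    return plan


# Hypothetical sizes, chosen for illustration only.
plan = multi_scale_patches(base_size=1536, patch_size=384, scales=[1.0, 0.5, 0.25])
for scale, boxes in plan.items():
    print(f"scale {scale}: {len(boxes)} patches of 384x384")

# Every patch has the same size, so one shared encoder can process all of them.
total_patches = sum(len(boxes) for boxes in plan.values())
print("patches sent through the shared encoder:", total_patches)
```

Because all patches share one size regardless of scale, a single encoder can run over them as one batch. The coarse scales contribute global context and the full-resolution scale contributes fine detail before the features are merged in the fusion stage.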