
Auto-labeling with Grounded Segment Anything (Grounded SAM) using autodistill

ssun-g 2023. 6. 23. 20:43
Let's use Grounded Segment Anything to auto-label some images, then train YOLOv8 on the resulting data (a training sketch is at the end of the post).

Official GitHub

https://github.com/autodistill/autodistill

 



Description

Grounding DINO + Segment Anything (SAM) = Grounded SAM

Grounding DINO is an open-vocabulary detector: given a text prompt, it proposes bounding boxes. SAM then converts each box into a pixel-accurate mask. Composed, the two produce segmentation labels from nothing but a text prompt.
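
As a quick sanity check before labeling a whole folder, the composed model can be run on a single image. A minimal sketch, assuming autodistill's single-image predict() API and the same ./input/cat.jpg used in the code below:

from autodistill.detection import CaptionOntology
from autodistill_grounded_sam import GroundedSAM

# Grounding DINO is prompted with the dictionary key;
# detections are reported under the dictionary value.
base_model = GroundedSAM(
    ontology=CaptionOntology({"cat with ribbon": "cat"})
)

# Grounding DINO proposes boxes from the prompt; SAM refines them into masks.
# predict() returns a supervision Detections object (boxes, masks, class ids).
detections = base_model.predict("./input/cat.jpg")
print(detections)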

Code

import cv2
import numpy as np
from autodistill.detection import CaptionOntology
from autodistill_grounded_sam import GroundedSAM

# Load the test image (OpenCV loads BGR, so convert to RGB)
img = cv2.imread("./input/cat.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Map each text prompt to the class label it should produce
ontology = CaptionOntology({
    "cat with ribbon": "cat"  # "prompt": "class label"
})

base_model = GroundedSAM(ontology=ontology)

# Label every .jpg in the input folder; writes a YOLO-format dataset
dataset = base_model.label(
    input_folder="./input/",
    extension=".jpg",
    output_folder="./output/")

# Load the generated label file
# (the image may land in train/ or valid/ depending on the split)
with open("./output/valid/labels/cat.txt", "r") as f:
    data = f.read().rstrip()

# Masking & visualization
# Each label line: class_id followed by normalized polygon coordinates
labels = data.split('\n')
mask = np.zeros(img.shape, dtype=np.uint8)
h, w = img.shape[:2]
for line in labels:
    fields = line.split(' ')
    class_id = int(fields[0])
    points = np.array(list(map(float, fields[1:]))).reshape(-1, 2)

    # Scale normalized coordinates back to pixel coordinates
    points[:, 0] *= w
    points[:, 1] *= h
    points = points.astype(np.int32)

    # Accumulate each polygon into the mask
    mask = cv2.fillPoly(mask, [points], (255, 255, 255))

# Keep only the masked region of the image
masked_image = cv2.bitwise_and(img, mask)

# Save result (convert RGB back to BGR for OpenCV)
masked_image = cv2.cvtColor(masked_image, cv2.COLOR_RGB2BGR)
cv2.imwrite('result.jpg', masked_image)
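
For reference, label() writes a YOLO-format dataset under ./output/ (train/valid splits of images and labels, plus a data.yaml listing the classes); the labeled image may land in either split, so check both labels folders. Each line of a segmentation label file is a class index followed by the normalized (x, y) vertices of one polygon, which is exactly what the loop above parses. An illustrative, made-up line:

0 0.4821 0.1250 0.5103 0.1406 0.5312 0.1719 0.5289 0.2031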

Results

Original image

 

Result image


Overall review

  • Tested with a variety of images; the results were quite good.
  • This should cut down labeling time considerably.
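
YOLOv8 training

As mentioned in the intro, the auto-labeled dataset can be fed straight into YOLOv8. A minimal sketch, assuming the autodistill-yolov8 package and the data.yaml written by the labeling step; the checkpoint and epoch count are arbitrary, and since the labels above are polygons, a segmentation checkpoint such as yolov8n-seg.pt is the natural choice:

from autodistill_yolov8 import YOLOv8

# Train a YOLOv8 target model on the auto-labeled dataset
target_model = YOLOv8("yolov8n-seg.pt")
target_model.train("./output/data.yaml", epochs=50)

# Spot-check the trained model on an image
pred = target_model.predict("./input/cat.jpg", confidence=0.5)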