---
library_name: transformers
license: mit
language:
- en
pipeline_tag: object-detection
base_model:
- microsoft/conditional-detr-resnet-50
tags:
- object-detection
- fashion
- search
---
This model is a fine-tuned version of microsoft/conditional-detr-resnet-50.

Details of the model are in this GitHub repo: [fashion-visual-search](https://github.com/yainage90/fashion-visual-search)

The fashion image feature extractor model is available here: [yainage90/fashion-image-feature-extractor](https://huggingface.co/yainage90/fashion-image-feature-extractor)

This model was trained on a combination of two datasets: [ModaNet](https://github.com/eBay/modanet) and [Fashionpedia](https://fashionpedia.github.io/home/).

The labels are ['bag', 'bottom', 'dress', 'hat', 'shoes', 'outer', 'top'].
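When working with these labels in code, a name-to-id lookup can be convenient. The sketch below assumes that class ids follow the order of the list above, which this card does not confirm; the authoritative mapping is `model.config.id2label` on the loaded checkpoint. The helper name `label_name` is hypothetical.

```python
# Label names as listed in this model card.
LABELS = ['bag', 'bottom', 'dress', 'hat', 'shoes', 'outer', 'top']

# ASSUMPTION: class ids follow list order (0 = 'bag', 1 = 'bottom', ...).
# Check against model.config.id2label before relying on this.
id2label = dict(enumerate(LABELS))
label2id = {name: idx for idx, name in id2label.items()}


def label_name(class_id: int) -> str:
    """Return the label for a class id, or 'unknown' if the id is not mapped."""
    return id2label.get(class_id, 'unknown')
```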

The best score, mAP 0.7542, was reached at epoch 96 of 100. Because the best epoch came so close to the end of training, the model may not have fully converged, so there is probably a little room left for improvement.
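The card does not state which mAP variant or IoU thresholds were used. As background only: mAP scores rest on intersection over union (IoU), which decides whether a predicted box matches a ground-truth box. A minimal sketch of that measure, assuming boxes in `(x_min, y_min, x_max, y_max)` corner format and using the hypothetical helper name `box_iou`:

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero for boxes that do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```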

```python
from PIL import Image
import torch
from transformers import AutoImageProcessor, AutoModelForObjectDetection

# Pick the best available device: CUDA GPU, then Apple MPS, then CPU.
device = torch.device('cpu')
if torch.cuda.is_available():
    device = torch.device('cuda')
elif torch.backends.mps.is_available():
    device = torch.device('mps')

ckpt = 'yainage90/fashion-object-detection'
image_processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModelForObjectDetection.from_pretrained(ckpt).to(device)

image = Image.open('<path/to/image>').convert('RGB')

with torch.no_grad():
    inputs = image_processor(images=[image], return_tensors="pt")
    outputs = model(**inputs.to(device))
    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([[image.size[1], image.size[0]]])
    results = image_processor.post_process_object_detection(outputs, threshold=0.4, target_sizes=target_sizes)[0]

# Collect detections as (score, label id, [x_min, y_min, x_max, y_max]) tuples.
items = []
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    score = score.item()
    label = label.item()
    box = [i.item() for i in box]
    print(f"{model.config.id2label[label]}: {round(score, 3)} at {box}")
    items.append((score, label, box))
```

![sample_image](sample_image.png)
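The boxes produced by the usage snippet are floats and can extend slightly past the image edge. If the detections are going to be cropped out, for example before passing them to a feature extractor such as the one linked above, they first need to become integer pixel coordinates inside the image. This is a sketch of one way to do that, not part of the original pipeline; the helper name `to_crop_box` is hypothetical, and rounding outward is a design assumption.

```python
import math


def to_crop_box(box, image_width, image_height):
    """Convert a float (x_min, y_min, x_max, y_max) box to an integer crop box.

    Coordinates are rounded outward so the crop does not cut into the item,
    then clamped to the image bounds. Returns None if nothing is left.
    """
    x_min, y_min, x_max, y_max = box
    left = max(0, math.floor(x_min))
    top = max(0, math.floor(y_min))
    right = min(image_width, math.ceil(x_max))
    bottom = min(image_height, math.ceil(y_max))

    # A box that lies entirely outside the image has no valid crop.
    if right <= left or bottom <= top:
        return None
    return (left, top, right, bottom)
```

With a PIL image, `image.size` is `(width, height)`, so each entry of `items` can be cropped with `image.crop(to_crop_box(box, *image.size))` after checking that the result is not `None`.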