Skip to content

Trying to implement YoloV4 in ML.NET but running into some trouble (I'm quite new to all of this) #5466

Closed
@devedse

Description

@devedse

System information

  • OS version/distro: Windows 10 20H2
  • .NET Version (eg., dotnet --info): 3.1.9

Introduction

I'm currently trying to use the YoloV4 model inside ML.NET but whenever I try to put an image through the model I get very strange results from the model. (It mainly finds backpacks, handbags, some apples and some other unexpected results). The image I'm using:
dog

I created a repository to reproduce my issue as well:
https://github.com/devedse/DeveMLNetTest

Disclaimer: I'm quite new to all of this 😄

Steps I've taken

I first wanted to obtain a pre-trained model which I was sure worked so I did the following steps:

  1. Cloned: https://github.com/Tianxiaomo/pytorch-YOLOv4
  2. Installed requirements + pytorch + cuda etc.
  3. Downloaded the Yolov4_epoch1.pth model and put it in checkpoints\Yolov4_epoch1.pth
  4. Ran demo_pytorch2onnx.py with debug enabled.

When you do this you can step through the code where the output from the model is parsed (this shows the labels for all boxes with a confidence higher then 0.6):
image

When we look up these labels, we can see that (all +1 since it's a 0-index array):

1: bicycle
8: truck
16: dog

So the results in Python seem to be correct. This python script first converts the model to an ONNX model and then uses that model to do the inference.

When I now start using the model inside C# though and try to parse through the results in exactly the same way I find completely different results:
image

Could it be that the model is somehow creating garbage data?

Anyway, here's the code that I'm

MLContext:

var pipeline = mlContext.Transforms.LoadImages(outputColumnName: "image", imageFolder: "", inputColumnName: nameof(ImageNetData.ImagePath))
                .Append(mlContext.Transforms.ResizeImages(outputColumnName: "image", imageWidth: ImageNetSettings.imageWidth, imageHeight: ImageNetSettings.imageHeight, inputColumnName: "image"))
                .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "input", inputColumnName: "image"))
                .Append(mlContext.Transforms.ApplyOnnxModel(
                    modelFile: modelLocation,
                    outputColumnNames: new[] { "boxes", "confs" },
                    inputColumnNames: new[] { "input" }
                    ));

Code to parse model output:

public IList<YoloBoundingBox> ParseOutputs(float[] boxes, float[] confs, float threshold = .6F)
{
	var boxesUnflattened = new List<BoundingBoxDimensions>();
	for (int i = 0; i < boxes.Length; i += 4)
	{
		boxesUnflattened.Add(new BoundingBoxDimensions()
		{
			X = boxes[i],
			Y = boxes[i + 1],
			Width = boxes[i + 2] - boxes[i],
			Height = boxes[i + 3] - boxes[i + 1],
			OriginalStuff = boxes.Skip(i).Take(4).ToArray()
		});
	}

	var confsUnflattened = new List<float[]>();
	for (int i = 0; i < confs.Length; i += 80)
	{
		confsUnflattened.Add(confs.Skip(i).Take(80).ToArray());
	}

	var maxConfidencePerBox = confsUnflattened.Select(t => t.Select((n, i) => (Number: n, Index: i)).Max()).ToList();
	var boxesNumbered = boxesUnflattened.Select((b, i) => (Box: b, Index: i)).ToList();

	var boxesIndexWhichHaveHighConfidence = maxConfidencePerBox.Where(t => t.Number > threshold).ToList();
	var allBoxesThemselvesWithHighConfidence = boxesIndexWhichHaveHighConfidence.Join(boxesNumbered, t => t.Index, t => t.Index, (l, r) => (Box: r, Conf: l)).ToList();


	Console.WriteLine("I would expect a bike, dog and car here");
	Console.WriteLine("Instead we got:");

	foreach (var b in allBoxesThemselvesWithHighConfidence)
	{
		var startString = $"{b.Conf.Number}: {labels[b.Conf.Index]}";
		Console.WriteLine($"{startString.PadRight(30, ' ')}({string.Join(",", b.Box.Box.OriginalStuff)})");
	}

	throw new InvalidOperationException("Everything below this doesn't work anyway");

Can someone give me some hints / ideas on why I could be running into this issue?

Also feel free to clone the repo and press F5. It should simply build / run.
https://github.com/devedse/DeveMLNetTest

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions