Issues encountered with ML.NET

hello all,

a few days ago, we started looking into bringing machine learning to vl. discussions and tests are happening in this element channel.

one of the options we have at hand is to use Microsoft’s ML.NET lib. Hayden and I did some tests and unfortunately, we could not succeed. this topic aims to highlight the issues we encountered and try to figure out what could be done in vl to be able to wrap such a lib.

i tried to patch this example in which they teach you how to predict if a user review is aggressive or not (more info in the previous link). here are the issues I faced. please note i lack the C# vocabulary to clearly describe the things they do in code that I could not manage to patch:

finding the right overloads and default values

  • in their example, they’re using the LoadFromTextFile<T> function like so :
IDataView dataView = mlContext.Data.LoadFromTextFile<SentimentIssue>(DataPath, hasHeader: true);

i actually had a hard time finding which overload of this function to use (the lib has many), and most importantly could not find the “generic” one. I would have expected a generic LoadFromTextFile node waiting for its output to be annotated to LoadFromTextFile<MyClass>, but i could not find it in the node browser. is this something VL does not handle?

image

also, using the “Options” version makes the node red, with VL complaining the reference is ambiguous, not sure why.
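for reference, here is a minimal C# sketch of the two generic LoadFromTextFile<T> overloads discussed above: the one taking named arguments (as in their example) and the one taking a TextLoader.Options object. the column indices and the tab separator are assumptions made for illustration, not taken from the dataset, and the LoadExamples helper class is hypothetical.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Input row type. The column indices below are assumptions for illustration.
public class SentimentIssue
{
    [LoadColumn(0)] public bool Label { get; set; }
    [LoadColumn(2)] public string Text { get; set; }
}

// Hypothetical helper class, only here to hold the two calls side by side.
public static class LoadExamples
{
    // Generic overload with named arguments, as in the original example.
    public static IDataView LoadWithNamedArguments(MLContext mlContext, string dataPath)
    {
        return mlContext.Data.LoadFromTextFile<SentimentIssue>(dataPath, hasHeader: true);
    }

    // Generic overload taking a TextLoader.Options object instead.
    // The tab separator is an assumption about the file format.
    public static IDataView LoadWithOptions(MLContext mlContext, string dataPath)
    {
        var options = new TextLoader.Options
        {
            HasHeader = true,
            Separators = new[] { '\t' }
        };
        return mlContext.Data.LoadFromTextFile<SentimentIssue>(dataPath, options);
    }
}
```

in C# the type argument is always written explicitly at the call site, which is probably why there is no single "untyped" entry point to look for.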

  • they create a trainer this way
var trainer = mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: "Label", featureColumnName: "Features");

when using the SdcaLogisticRegression node that has the input they’re using in code, the node throws an error saying name cannot be null or empty. this is confusing since they’re only setting two params here, which are by the way the defaults the node already has.

defaults

… but using the node that just takes an Options input, no error is thrown :

image

attributes
also, they heavily rely on attributes (you have to annotate your class with those to tell which column of your dataset a specific property of your class corresponds to). without proper attributes support in VL, it seems the lib is not usable right now.
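one possible way around attributes, at least for loading, might be the non-generic LoadFromTextFile overload that takes the column schema explicitly. the sketch below is untested in VL; the column names, indices and tab separator are assumptions for illustration, and the SchemaWithoutAttributes class is hypothetical.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical helper: describes the columns in code instead of via attributes.
public static class SchemaWithoutAttributes
{
    public static IDataView Load(MLContext mlContext, string dataPath)
    {
        // Each entry plays the role of a [LoadColumn(index)] attribute:
        // column name, data kind, and source column index (indices assumed).
        var columns = new[]
        {
            new TextLoader.Column("Label", DataKind.Boolean, 0),
            new TextLoader.Column("Text", DataKind.String, 2),
        };

        // Non-generic overload: no annotated class is needed to load the data.
        return mlContext.Data.LoadFromTextFile(
            dataPath, columns, separatorChar: '\t', hasHeader: true);
    }
}
```

this only covers loading. typed APIs such as CreatePredictionEngine<TSrc, TDst> still map columns to class members, so it would not remove the need for attributes (or matching member names) everywhere.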

loading a trained model

  • tried a second approach trying to load a model trained in visual studio in C#, which they do like that :
var predEngine = mlContext.Model.CreatePredictionEngine<SentimentIssue, SentimentPrediction>(trainedModel);

in this case, I can find a generic CreatePredictionEngine<T1, T2>, but VL complains that TDst must have a default constructor, which SentimentPrediction does not have. @tonfilm pointed out it was related to type constraints, which is not something I’m familiar with.

image
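to illustrate what a type constraint is, here is a small plain C# sketch that does not use ML.NET at all. EngineFactory and both prediction classes are hypothetical stand-ins; the point is only the `where TDst : class, new()` clause, which is the kind of constraint the error message above describes.

```csharp
using System;

// Stand-in for a generic API whose output type must be a class
// with a public parameterless ("default") constructor.
public static class EngineFactory
{
    public static TDst CreateEmpty<TDst>() where TDst : class, new()
    {
        // Only legal because of the new() constraint.
        return new TDst();
    }
}

// Satisfies the constraint: no constructor is declared,
// so C# supplies a public parameterless one.
public class PredictionWithDefaultCtor
{
    public bool Prediction { get; set; }
    public float Score { get; set; }
}

// Does not satisfy the constraint: its only constructor takes an argument.
public class PredictionWithoutDefaultCtor
{
    public PredictionWithoutDefaultCtor(bool prediction) { Prediction = prediction; }
    public bool Prediction { get; }
}

public static class Program
{
    public static void Main()
    {
        var ok = EngineFactory.CreateEmpty<PredictionWithDefaultCtor>();
        Console.WriteLine(ok.Score);

        // Uncommenting the next line gives compile error CS0310,
        // because the type has no public parameterless constructor:
        // EngineFactory.CreateEmpty<PredictionWithoutDefaultCtor>();
    }
}
```

so if the type defined in VL ends up without a public parameterless constructor on the .NET side, a constraint like this would reject it, whatever its fields are. whether that is what happens here is a guess on my part.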


sorry for being so verbose, I just wanted to report things i stumbled upon to see how those issues could be addressed, or if I missed something obvious :)

thanks in advance for your comments and tips
seb

Hello @sebescudie , you have made a lot of progress there and you are being more analytical than I would be. I have also tried to implement ML from time to time, but I never had the appropriate background to even pose the proper questions and ask for the proper answers.
I started many times with almost the same housing example by watching video tutorials, and I was always hitting the same wall.
The last time I tried something was recently, after Hayden’s presentation and your workshop on RunwayML.
I believed that with strong motivation and a clear mind I would be more productive, and that turned out to be partially true.

  1. LoadFromTextFile

image

  2. The housing data class

image

My problem at the moment is a few lines further on, at the model evaluation, where I didn’t manage to find a proper way to add / exploit the additional labels “Features” and “Score”.

image

Since I am not completely aware of what I am doing, I am providing the whole patch so you can look at it and maybe come to some useful conclusions.

cheers and happy new year!

test VL.ML.7z (332.9 KB)

hey, thanks (again) for sharing! could it be that those are attributes that ML.NET looks for? this is what i understood from their taxi fare prediction tutorial :

The TaxiTripFarePrediction class represents predicted results. It has a single float field, FareAmount , with a Score ColumnNameAttribute attribute applied. In case of the regression task, the Score column contains predicted label values.
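in C#, the class described in that quote would look roughly like the first class below. the second class is a hypothetical variant added for comparison: as far as I understand, ML.NET falls back to the member name when there is no ColumnName attribute, so a field literally called Score should map to the same column.

```csharp
using Microsoft.ML.Data;

// As described in the tutorial quote: a single float field,
// mapped to the "Score" output column through the attribute.
public class TaxiTripFarePrediction
{
    [ColumnName("Score")]
    public float FareAmount;
}

// Hypothetical variant without any attribute: the field name itself
// matches the "Score" column, so no annotation is needed.
public class TaxiTripFarePredictionByName
{
    public float Score;
}
```

if that fallback holds, naming the members after the columns (“Score”, “Features”, “Label”) might be a way to get by while attributes are not supported in VL.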

I gave it a spin again and got the same results … this time though, I managed to clean and comment my attempts, explaining what was not clear, and what did not work, if anyone wants to look into it :) this is easier to explain in a patch than in a forum post…

ml_net_taxi_price_upload.7z (210.2 KB)

seb