Machine learning, being the buzz of 2018, can be beautifully integrated with visual programming concepts. The libraries exist; what's needed are ideas for what to use them for.
A log of what happened:
Hayden introduced us to the matter and to his CNTK library, and moderated the session.
Here is a rough breakdown of the architecture of VL.CNTK:
Model
the black box that comes with neurons and weights, composed of several layers
I/O
numbers, vectors, strings, images, …
Layers
e.g.: Dense = fully connected layer of neurons that can be tweaked
Trainers
e.g.: Stochastic Gradient Descent:
error = bad score
minimize error
backpropagation
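The trainer bullet points above can be sketched in a few lines of plain Python (a toy, not the actual VL.CNTK/CNTK API): fit a single weight w so that w*x matches y, minimizing the squared error by stepping against its gradient, one random sample at a time.

```python
import random

# toy data: y = 3*x plus a little noise
data = [(x, 3.0 * x + random.uniform(-0.1, 0.1)) for x in range(1, 11)]

w = 0.0      # the single "weight" we train
lr = 0.001   # learning rate

for epoch in range(100):
    random.shuffle(data)       # "stochastic": random sample order
    for x, y in data:
        pred = w * x
        error = pred - y       # error = bad score
        grad = 2 * error * x   # d(error^2)/dw, i.e. backpropagation
        w -= lr * grad         # minimize error: step against the gradient

print(w)  # ends up close to 3.0
```

A real trainer does exactly this, just over millions of weights at once and with batches of samples instead of one at a time.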
But for easier access we’d like to have ready-made nodes that cover high-level tasks:
Utils
e.g. classification problem -> here are my images sorted into categories -> train
“instruments”: premade modules that have proved valuable for certain tasks
again, image classification as an example. There are pretrained models we can use:
SqueezeNet: 1000 image categories, ~5 MB
ImageNet: ~200 MB
Question: what do you want to achieve? Ideas?
anton: pattern recognition
motzi: gestures, time-based data
wekinator
marco: hallucination, video synthesis, style transfer
timpernagel: mario klingemann: fluid simulation without proper math
art, randomness
importing external models
(If I didn’t cover your thoughts properly please feel free to edit here)
VL.CNTK Library
representation of data:
videos: 4d
strings are special: words get indexed. Language adds sentences, grammar/syntax, semantics.
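The word-indexing idea can be sketched like this (a hypothetical helper, not a VL.CNTK node): each distinct word gets an integer index, so a sentence becomes a vector of numbers a network can consume.

```python
def build_vocab(sentences):
    """Map each distinct word to an integer index, in order of appearance."""
    vocab = {}
    for s in sentences:
        for word in s.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Turn a sentence into a list of word indices."""
    return [vocab[w] for w in sentence.lower().split()]

vocab = build_vocab(["the cat sat", "the dog sat down"])
print(encode("the dog sat", vocab))  # [0, 3, 2]
```

Grammar and semantics are of course not captured by bare indices; that is where the extra layers of the model come in.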
GAN: Hayden is thinking about putting up a prerelease and using the time at LINK to work on it.
Demo
Transfer Learning:
General idea: take a highly pretrained model like SqueezeNet, strip away its last layer, and put your own classification in as the last layer.
Patch Layout: Inputs, Model, Classification (yielding Loss Function and Evaluation Function), Training
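The transfer-learning idea can be sketched in plain Python (invented toy features, not the actual patch): the pretrained model is treated as a frozen feature extractor, and only the new last layer, here a tiny logistic-regression classifier, is trained.

```python
import math
import random

random.seed(1)

# Pretend these are features from a frozen pretrained model (e.g. the layer
# before the stripped-away classifier). Class 0 features cluster near 0,
# class 1 features cluster near 1.
def frozen_features(label):
    return [label + random.uniform(-0.3, 0.3) for _ in range(4)]

data = [(frozen_features(y), y) for y in [0, 1] * 50]

# The new last layer: one weight per feature plus a bias.
w = [0.0] * 4
b = 0.0
lr = 0.5

for _ in range(200):
    for feats, y in data:
        z = sum(wi * f for wi, f in zip(w, feats)) + b
        p = 1.0 / (1.0 + math.exp(-z))   # sigmoid: probability of class 1
        g = p - y                        # gradient of the log loss w.r.t. z
        w = [wi - lr * g * f for wi, f in zip(w, feats)]
        b -= lr * g

correct = 0
for feats, y in data:
    z = sum(wi * f for wi, f in zip(w, feats)) + b
    correct += (z > 0) == (y == 1)
print(correct / len(data))  # accuracy, close to 1.0
```

Only w and b are updated; the "pretrained" part never changes, which is why transfer learning trains so quickly compared to training a whole network.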
we had a look inside a Dense layer: it flattens your image and holds parameters that can get trained (weights, biases)
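What a Dense layer does inside can be sketched in plain Python (not the CNTK implementation): flatten the 2-D image into a vector, then compute weights · input + bias once per neuron.

```python
import random

random.seed(0)

def dense(image_2d, weights, biases):
    """A Dense (fully connected) layer: flatten, then w.x + b per neuron."""
    x = [px for row in image_2d for px in row]        # flatten 2-D -> 1-D
    return [sum(w * xi for w, xi in zip(row, x)) + b  # one output per neuron
            for row, b in zip(weights, biases)]

image = [[0.1, 0.2],
         [0.3, 0.4]]                                   # a tiny 2x2 "image"
n_out, n_in = 3, 4                                     # 3 neurons, 4 inputs
weights = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
biases = [0.0] * n_out

print(dense(image, weights, biases))                   # 3 output values
```

The weights and biases here are random; they are exactly the parameters the trainer adjusts.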
Trainer:
allows the user to adjust the learning rate
formatted input
live input
Data
Structured data is a text file with image file paths and labels (integers)
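Such a label file can be parsed with a few lines of Python (the tab-separated `path<TAB>label` layout here is an assumption; the actual VL.CNTK format may differ):

```python
def read_labels(text):
    """Parse 'image_path<TAB>label' lines into (path, int label) pairs."""
    pairs = []
    for line in text.strip().splitlines():
        path, label = line.split("\t")
        pairs.append((path, int(label)))
    return pairs

sample = "images/cat01.png\t0\nimages/dog01.png\t1"
print(read_labels(sample))  # [('images/cat01.png', 0), ('images/dog01.png', 1)]
```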
Strategies: after classification, throw in some false positives.
GAN (generative adversarial network) = conflicting networks:
the discriminator (1st network) has examples of real images and learns what is real about these images.
the generator network (2nd network) generates images from a random input.
the discriminator judges whether the generator’s images look “real” or not.
the generator refines its output based on the discriminator’s judgment, creating images that look more “real”.
at some point the discriminator can’t tell which image is a generated “fake” and which is “real”.
Fixed a few more bugs and pruned a few nodes, but there is still a little work to be done on the development side. What became clear from the workshop is that there is a lot more work to be done on documentation: basic guidelines for setting up the network and training without errors.
This will be enough for a pre-release, but for a full release every node will need to be documented, perhaps with a help file.
There also need to be more easy-to-use modules and example networks to show how different takes can be interpreted.
That said, if you are familiar with machine learning terminology you may find that the library is quite easy to use.
One very good use-case example we came across was signal processing, so for the pre-release I’ll develop some examples of how that is achievable.