As I continue with dataset acquisition, I keep thinking about which machine learning framework would be best for building my object detector. I have always liked the TensorFlow.js approach and was seriously considering building a web application that leverages its features. In this video, an IBM lead developer talks about how he builds a prediction model using Node.js; I was leaning toward not using Node.js, and I am still searching for a solution that works for me.
Hey everybody, welcome to the second video of our introductory AI in Node.js tutorial series. I am Paul Van Eck, and I am with the Cognitive OpenTech group at IBM. In the last video, we talked about getting started by using pre-trained TensorFlow.js models that already exist out there. In this video, we are going to learn how to build and train a simple deep learning model from scratch in JavaScript. Here we expect users to have basic knowledge of Node.js and some familiarity with AI and machine learning concepts.

To outline what will be covered: first, we'll go over some programming concepts with TensorFlow.js and how to convert existing Python models. Then we're going to build and train a neural network to help classify fashion and clothing items using the Fashion-MNIST dataset. Lastly, to top it all off, we're going to show how you can apply transfer learning to use a trained model to classify new classes. Links to the full tutorial and source code can be found in the description below. Now, let's get started.
Conceptually, a neural network consists of many layers of weights and computations, which are represented as nodes and edges in a graph. To help implement these neural networks, TensorFlow.js offers two APIs: the low-level Core API and the high-level Layers API. At the low level, we typically have tensors combined with operations. Operations include linear algebra and machine learning computations on some input tensors, producing new tensors as results. These tensors and operations are strung together to produce neural networks.
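For instance, a minimal sketch of the Core API might look like the following; the specific tensors and operations here are purely illustrative:

```js
// Core API sketch: tensors combined with operations that produce new tensors.
const tf = require('@tensorflow/tfjs-node');

const x = tf.tensor2d([[1, 2], [3, 4]]);   // a 2x2 input tensor
const w = tf.tensor2d([[0.5], [-0.5]]);    // a 2x1 weight tensor
const y = tf.matMul(x, w).relu();          // ops chain together into new tensors
y.print();
```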
Now, looking at the high-level Layers API, you have something that pretty much imitates the Keras programming style in Python, just in JavaScript syntax. The main concept here is that you create a model object, which represents your deep learning model, and add any number of layers to the model to implement your model architecture. This is by far the most popular way of constructing neural networks due to its ease of use.
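As a quick, hedged illustration of that Keras-like style (the layer sizes here are arbitrary, not the architecture we build later):

```js
// Layers API sketch: a sequential model built by stacking layers.
const tf = require('@tensorflow/tfjs-node');

const model = tf.sequential();
model.add(tf.layers.dense({ inputShape: [784], units: 64, activation: 'relu' }));
model.add(tf.layers.dense({ units: 10, activation: 'softmax' }));
model.summary();
```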
We'll see this in action in just a moment, but first, let's go over conversion real quick. As we saw in the last tutorial, there are many open-source pre-trained models for TensorFlow.js. However, more models are trained and available in TensorFlow and Keras Python formats, and for those models, conversion is necessary before they can be used in TensorFlow.js. All you really need is a Python environment: simply pip install tensorflowjs to install the TensorFlow.js converter, then you can run its command-line tool to do conversions. For example, there is a command for converting a Keras HDF5 model to the TensorFlow.js graph-model format (a hedged sketch follows this paragraph). Take note, however, that if the model includes operations which are not supported by TensorFlow.js, conversion will fail. You can learn more about the TensorFlow.js converter and supported ops in the tfjs-converter repository.
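The exact command is not shown in the transcript; the following is a hedged example with placeholder file names, so check the tfjs-converter documentation for the flags supported by your version:

```sh
# Install the converter, then convert a Keras HDF5 model to a TF.js graph model.
pip install tensorflowjs

tensorflowjs_converter \
  --input_format=keras \
  --output_format=tfjs_graph_model \
  ./my_keras_model.h5 \
  ./tfjs_model_dir
```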
So, to build our own model, we're going to use the high-level Layers API, and we'll go through the following steps in the process: loading the data, defining a model, training the model, and then finally testing or evaluating the model. As mentioned before, we will train a model to help classify some of the classes in the Fashion-MNIST dataset.
First things first, let's create a new project directory, initialize it with npm init, and then install tfjs-node.
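For reference, a hedged sketch of those setup commands (the directory name is just a placeholder):

```sh
mkdir tfjs-fashion-mnist && cd tfjs-fashion-mnist
npm init -y
npm install @tensorflow/tfjs-node
```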
Next, let's download the dataset, which can be found on the IBM Data Asset Exchange. The Data Asset Exchange provides a curated list of free and open datasets for you to use. After the download, extract the tar file into the project directory. Once that is done, you should have a fashion-mnist directory containing two CSV files, fashion-mnist_train and fashion-mnist_test; these represent the training and testing sets. The first column of each CSV file represents the label for an item, or rather the index that corresponds to a label. The remaining columns represent the pixel values (0 through 255) for the image, which is 28 by 28 pixels, or 784 total pixels per item.
Now we need to create a loader for all of the CSV data. Let's create a new file called build-model.js and populate it with some starter variables. Take note of the trainDataUrl and testDataUrl variables to ensure that they are set to the proper paths of the extracted data files. Here we have the labels array containing all of the labels in the Fashion-MNIST dataset. We only want to use the first five classes in this array, that is, only half of the dataset, so that in the next section we can take the model we train here and see how transfer learning works with the second half of the dataset. The rest of the variables pertain to image properties, batch size, and the epochs value, our training hyperparameters. These can be configured to your liking, but the current values should work well.
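A hedged sketch of what those starter variables might look like; the variable names, paths, and hyperparameter values are illustrative, so adjust them to match your setup:

```js
// build-model.js: starter variables for loading and training on Fashion-MNIST.
const tf = require('@tensorflow/tfjs-node');

const trainDataUrl = 'file://./fashion-mnist/fashion-mnist_train.csv';
const testDataUrl = 'file://./fashion-mnist/fashion-mnist_test.csv';

// All ten Fashion-MNIST labels; only the first five classes are used in this script.
const labels = [
  'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
  'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'
];
const numOfClasses = 5;

// Image properties and training hyperparameters.
const imageWidth = 28;
const imageHeight = 28;
const imageChannels = 1;
const batchSize = 100;
const epochsValue = 5;
```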
Next, let's add a bit of code to load the data with a function called loadData, which will accept the data URL and batch size as input parameters. We will be using the TensorFlow.js data API, which helps with loading and preparing the data for usage in our models. The primary loading is done with several chained methods. First, we use the csv method to load and parse the CSV file, specifying that the label column is, of course, the label. Then we apply a normalize function to each data entry using map; this normalizes the pixel values, which are 0 to 255, to be between 0 and 1. Then we use the filter method to get the first five classes, as denoted by the numOfClasses variable. Next, we run each data entry through a transform function to convert each image's pixel values into a 3D tensor and also convert the label representation into a one-hot vector, which is commonly used for categorical classification problems. Lastly, we call the batch method to make sure all the data is grouped into batches of size batchSize, which is 100.
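Here is a hedged sketch of what that loadData pipeline could look like, building on the starter variables above; details may differ from the official tutorial code:

```js
// Loads a Fashion-MNIST CSV, normalizes pixels, filters to the first five
// classes, converts entries to tensors, and batches the result.
const loadData = (dataUrl, batchSize) => {
  // Normalize pixel values from 0-255 to 0-1 and unwrap the label value.
  const normalize = ({ xs, ys }) => ({
    xs: Object.values(xs).map(v => v / 255),
    ys: Object.values(ys)[0]
  });

  // Convert pixels into a 28x28x1 tensor and the label into a one-hot vector.
  const transform = ({ xs, ys }) => {
    const zeros = new Array(numOfClasses).fill(0);
    return {
      xs: tf.tensor(xs, [imageHeight, imageWidth, imageChannels]),
      ys: tf.tensor1d(zeros.map((z, i) => (i === ys ? 1 : 0)))
    };
  };

  return tf.data
    .csv(dataUrl, { columnConfigs: { label: { isLabel: true } } })
    .map(normalize)
    .filter(f => f.ys < numOfClasses)  // keep only the first five classes
    .map(transform)
    .batch(batchSize);
};
```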
Next, at the bottom, let's add a run function to call the loadData function and then print out the first batch in the dataset.
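A hedged sketch of that first run function might be:

```js
// Load the training data and log the first batch to verify the pipeline.
const run = async () => {
  const trainData = loadData(trainDataUrl, batchSize);
  const firstBatch = await trainData.take(1).toArray();
  console.log(firstBatch[0]);
};

run();
```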
Let's save and run the app from the command line. When the code runs, the training data is loaded, normalized, and turned into tensors. Here we see the label values in one-hot encoding and the normalized pixel values for the first batch of images. So, now that we have the data loading mechanism, it's time to build our model. For image classification, convolutional neural networks have been shown to be effective at extracting useful features from images so that the model can learn, so let's use that.
Back in our build-model.js file, let's create a buildModel function. The model architecture here simply consists of two layers of 2D convolution, along with computing the max pool after each layer. Feel free to make changes to the model, like adding more layers or changing activation functions, but this is what we'll go with for this tutorial. As you can see, constructing the model is pretty straightforward using the TF.js Layers API. You first create an empty model using the sequential function. We then build the model by arranging layers in a linear order, with one layer feeding the next. The tensors between the layers are allocated automatically, so you only need to manage the tensor that feeds the first layer; that is why we only specify the input shape in the first convolutional layer. At the end, we flatten the input for the fully connected output layer. Since we are doing classification, we use the commonly used softmax activation function, which gives us a probability distribution representing the likelihood of an element belonging to each of the classes. After the model is built, and before we can use it, we must configure it with the compile function. Here we specify the optimizer, for which we choose the Adam optimizer, and the loss function, for which we use categorical cross-entropy. We also specify accuracy as a metric to be evaluated during training and testing. We then return this model.
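A hedged sketch of such a buildModel function is below; the filter counts and kernel sizes are illustrative choices, not necessarily the exact values used in the official tutorial:

```js
// Builds a small CNN with the Layers API: two conv + max-pool blocks, then
// a flatten layer and a softmax classifier.
const buildModel = () => {
  const model = tf.sequential();

  // First convolution + max pooling; only the first layer needs inputShape.
  model.add(tf.layers.conv2d({
    inputShape: [imageHeight, imageWidth, imageChannels],
    filters: 8,
    kernelSize: 5,
    activation: 'relu'
  }));
  model.add(tf.layers.maxPooling2d({ poolSize: 2 }));

  // Second convolution + max pooling.
  model.add(tf.layers.conv2d({ filters: 16, kernelSize: 5, activation: 'relu' }));
  model.add(tf.layers.maxPooling2d({ poolSize: 2 }));

  // Flatten and classify with a softmax output layer.
  model.add(tf.layers.flatten());
  model.add(tf.layers.dense({ units: numOfClasses, activation: 'softmax' }));

  model.compile({
    optimizer: 'adam',
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy']
  });

  return model;
};
```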
Now, let's update the run code to call this. Here, we call the summary method of the model so that when we run the script we can view the architectural details. Let's save and go ahead and run the updated script. Now we see a summary of the architecture, with the shape and parameter count at each layer, and also a count of the trainable params at the end. These trainable params are the weights that will be adjusted to minimize loss during training.
Now, let's add a function to handle training in our script. Let's create a trainModel function. This function will take in the model to be trained, the training dataset, and an optional epochs argument to denote how many times we should loop over the dataset for training. We specify two callback functions to help us get information during training, like the current epoch and the training set accuracy after each epoch completes. The fitDataset function is a great function that will handle the training for you, given the training configuration and dataset; very convenient.
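A hedged sketch of that trainModel function, using fitDataset with epoch callbacks:

```js
// Trains the model on a tf.data dataset, logging progress after each epoch.
const trainModel = async (model, trainingData, epochs = epochsValue) => {
  const options = {
    epochs,
    callbacks: {
      onEpochBegin: async (epoch) => {
        console.log(`Epoch ${epoch + 1} of ${epochs} ...`);
      },
      onEpochEnd: async (epoch, logs) => {
        console.log(`  training set accuracy: ${logs.acc}`);
      }
    }
  };
  return model.fitDataset(trainingData, options);
};
```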
Now, before we kick off the training, let's create the function to evaluate the model when the training completes, using the testing dataset. It is important that we check how the model performs on data it hasn't seen yet. Here we call the evaluateDataset function, which will collect the accuracy for us on the specified dataset.
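A hedged sketch of that evaluation step might look like this:

```js
// Evaluates the trained model on the test dataset and logs the accuracy.
const evaluateModel = async (model, testingData) => {
  const result = await model.evaluateDataset(testingData);
  const testAccuracy = result[1].dataSync()[0]; // result is [loss, accuracy]
  console.log(`Test set accuracy: ${testAccuracy}`);
};
```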
So now, let's update the run code to call the train and evaluate functions we just made. But before we run the script, let's add one more bit of code to save the model, so that we can use the saved model for inference later and even for transfer learning. To do that, we need to call model.save, specifying a directory path. Here we are saying that we want the model files to be saved to the fashion-mnist-tfjs directory.
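Putting it together, a hedged sketch of the updated run function (the directory name is illustrative):

```js
// Build, train, evaluate, and save the model.
const run = async () => {
  const trainData = loadData(trainDataUrl, batchSize);
  const testData = loadData(testDataUrl, batchSize);

  const model = buildModel();
  model.summary();

  await trainModel(model, trainData);
  await evaluateModel(model, testData);

  // Save the trained model for later inference and transfer learning.
  await model.save('file://./fashion-mnist-tfjs');
};

run();
```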
Finally, let's save and run the script. The model goes through training using the training dataset and iterates for the number of epochs defined. Each iteration will display the loss and accuracy values, and you should see the accuracy improving after each epoch. This may take several minutes, so just give it a bit. When the training completes, the testing set is evaluated on the trained model. We are hoping for around 90% test accuracy, and it looks like we got it. After the evaluation is complete, the model is saved to the directory we specified in the code, which is fashion-mnist-tfjs; there we can see the model.json and weights files.
Now, let's use the model we just trained for inference on arbitrary fashion images. Since the dataset we trained on had only 28-by-28 grayscale images, any image we want to run through the model will need to be converted to this format. For that, we will use Jimp, an image manipulation program; simply install it with npm install. Next, create a new test-model.js file and add initialization code to it like we had in the build-model script; this time we also require Jimp. Then, using Jimp, we make a toPixelData function that will convert an input image into the needed format. This function will return an array of the pixel values, normalized between 0 and 1 like before.
We can then add the code that runs the prediction. This function calls the toPixelData function to get the pixel array, then converts it to a tensor with the image shape, then uses expandDims to get the tensor into the expected shape, and finally feeds it into the model's predict function. The rest is just finding the index of the highest score in the output array and matching it with the indices in the labels array. Finally, we create a run function to put all the pieces together. We take in an image path as a command-line argument, then load the model using tf.loadLayersModel, and then run the prediction function, logging the result.
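A hedged sketch of that prediction and run code; the model path is illustrative, and the same starter variables as build-model.js are assumed:

```js
// Run a single image through the saved model and report the best-scoring label.
const runPrediction = async (model, imagePath) => {
  const pixelData = await toPixelData(imagePath);
  const imageTensor = tf.tensor(pixelData, [imageWidth, imageHeight, imageChannels]);
  const inputTensor = imageTensor.expandDims();  // add a batch dimension
  const prediction = model.predict(inputTensor);
  const scores = await prediction.data();

  const maxScore = Math.max(...scores);
  const maxScoreIndex = scores.indexOf(maxScore);
  return { label: labels[maxScoreIndex], score: maxScore };
};

const run = async () => {
  const imagePath = process.argv[2];  // image path passed on the command line
  const model = await tf.loadLayersModel('file://./fashion-mnist-tfjs/model.json');
  console.log(await runPrediction(model, imagePath));
};

run();
```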
Let's try it out on one of the sample images. The output will be an object containing the prediction and score for each available label, and as you can see here, it correctly predicted the label for the given image. Now, moving on to the next part of the tutorial, I think it'd be pretty useful to go over how we can perform transfer learning. Training models can take large amounts of time, especially for extensive models with millions of trainable parameters. Transfer learning shortcuts a lot of this training work by taking a model trained on one task and repurposing it for a second, related task. We do this by replacing the final layer or layers of the pre-trained model with new layers and then training them with new data. A major advantage of this technique is that much less training data is needed to train an effective model for new classes. In our case, the two tasks are very similar: we just want to use the model we trained on the first five classes of the Fashion-MNIST dataset and train a classifier for the remaining five classes.
So, to get started, let's make a copy of the build-model.js file we created earlier and call it transfer-learn.js. There are a few things that need to be altered in this file. First, we need to make data loading adjustments: we alter the filter function so that we now get the remaining classes of the dataset we didn't use previously, and then we alter the transform function to accurately map the final five labels to one-hot indices. For example, Sandal has a label number of five, so we subtract the number of classes from it to get the hot index in the one-hot array. Note that we only have to make these data loading changes because we are splitting data from the same CSV file; you wouldn't typically have to do this.
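A hedged sketch of those altered filter and transform steps in transfer-learn.js, again building on the same starter variables:

```js
// Load the remaining five Fashion-MNIST classes and remap their labels to 0-4.
const loadData = (dataUrl, batchSize) => {
  const normalize = ({ xs, ys }) => ({
    xs: Object.values(xs).map(v => v / 255),
    ys: Object.values(ys)[0]
  });

  const transform = ({ xs, ys }) => {
    const zeros = new Array(numOfClasses).fill(0);
    return {
      xs: tf.tensor(xs, [imageHeight, imageWidth, imageChannels]),
      // 'Sandal' has label 5, so subtracting numOfClasses maps it to one-hot index 0.
      ys: tf.tensor1d(zeros.map((z, i) => (i === ys - numOfClasses ? 1 : 0)))
    };
  };

  return tf.data
    .csv(dataUrl, { columnConfigs: { label: { isLabel: true } } })
    .map(normalize)
    .filter(f => f.ys >= numOfClasses)  // keep the last five classes this time
    .map(transform)
    .batch(batchSize);
};
```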
Next, since we are no longer building a model from scratch and are going to rely on the model built before, let's change the buildModel function. We can remove the previous layer definitions, and this time buildModel will require an argument for the base model. First, let's remove the last layer of the base model; if you recall, this is the softmax classification layer used for classifying the first five classes of Fashion-MNIST. This leaves us with the flatten layer as the new final layer. Next, we loop through the base model's remaining layers and mark them as not trainable. We want to freeze the weights in the base model layers so they don't change when we train the new model; the idea here is that the features extracted in these lower layers can be applied to the new classes and don't need adjustment. Following that, we can now create the new sequential model, specifying the base model layers as the starting layers, and then add a softmax dense layer to the model. This layer will have the trainable parameters for classifying the new classes.
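A hedged sketch of that transfer-learning buildModel function:

```js
// Strip the old softmax layer, freeze the remaining base layers, and add a
// new trainable softmax classifier for the new classes.
const buildModel = (baseModel) => {
  // Remove the final softmax layer used for the original five classes.
  baseModel.layers.pop();

  // Freeze the remaining layers so their weights don't change during training.
  for (const layer of baseModel.layers) {
    layer.trainable = false;
  }

  // Reuse the frozen base layers and add a new softmax output layer on top.
  const model = tf.sequential({ layers: baseModel.layers });
  model.add(tf.layers.dense({ units: numOfClasses, activation: 'softmax' }));

  model.compile({
    optimizer: 'adam',
    loss: 'categoricalCrossentropy',
    metrics: ['accuracy']
  });

  return model;
};
```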
Next, we update the run code to load the pre-trained model from before and use it as the base for the new model. We're going to train on only a subset of the dataset this time; in this case, we're only training on 10% of the available training images from the new set of classes. We make sure to save this model to a new directory as well.
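A hedged sketch of that updated run code; the take count and directory names are illustrative:

```js
// Load the base model, build the transfer model, train on a small subset, and save.
const run = async () => {
  const baseModel = await tf.loadLayersModel('file://./fashion-mnist-tfjs/model.json');
  const model = buildModel(baseModel);
  model.summary();

  // With a batch size of 100, taking 30 batches is roughly 10% of the
  // ~30,000 training images available for the new five classes.
  const trainData = loadData(trainDataUrl, batchSize).take(30);
  const testData = loadData(testDataUrl, batchSize);

  await trainModel(model, trainData);
  await evaluateModel(model, testData);

  await model.save('file://./fashion-mnist-tfjs-transfer');
};

run();
```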
Now, let's run the script to perform the transfer learning. As you should see, training completes much quicker this time, with remarkably similar test accuracy to what we got training a model from scratch; we trained on less data, and only the weights in the last layer are being adjusted. Pretty cool, I think. To try out this model, we can make some small adjustments to test-model.js: changing the model URL to the new model and commenting out the first five classes of the labels array. Now we can run it using an image from one of the new five classes, and yep, it looks like it correctly predicted that one. Again, feel free to make any adjustments you want; as an exercise, you can try training longer on the base model, adding dropout layers, or maybe changing the transfer learning setup to classify not just the last five classes but all ten classes instead. Up to you.
So, that about covers it for this video. Here, you took a deeper dive into programming and training a deep learning model in Node using TensorFlow.js. You also learned about converting models, building models, training and saving models, and even reusing models, all common practices in AI. Stay tuned for more videos in this series to learn more about AI in Node.js. Thanks for watching.
Disclaimer:
My Machine Learning Blog may not own some of the content presented.
Copyright Disclaimer under section 107 of the Copyright Act of 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, education, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing.
All posts on this video blog are my personal opinion and do not in any way reflect the opinions of my employer. All materials, posts, and advice from this site are for informational, research, and testing purposes only; use them at your own risk. I am not in any way responsible for any damage done by following posts, advice, tutorials, and articles from this video blog.
References
1. Originally published on YouTube on Apr 17, 2020.