In this article, I share a great video I came across from freeCodeCamp.org explaining how to use TensorFlow.js.
What's up, guys! In this video, we're going to introduce the concept of client-side artificial neural networks, which will lead us to deploying and running models, along with our full deep learning applications, in the browser. To implement this cool capability, we'll be using TensorFlow.js, TensorFlow's JavaScript library, which will allow us to build and access models in JavaScript. I'm super excited to cover this, so let's get to it!
Whether you're a complete beginner to neural networks and deep learning, or you just want to up your skills in this area, you'll definitely want to check out all the resources available on the deeplizard YouTube channel, as well as deeplizard.com. There you'll find several complete video series and blogs where you'll learn everything from absolute machine learning basics all the way to developing and deploying your own deep learning projects from scratch. Here's a quick breakdown of the sections that make up this video; go ahead and take a look to get an idea of the content we'll be covering together.

So, client-side neural networks: running models in the browser. To be able to appreciate the coolness factor of this, we're going to need some context, so that we can contrast what we've historically been able to do from a deployment perspective with what we can now do with client-side neural networks. All right, so what are we used to being able to do? As we've seen in our previous series on deploying neural networks, in order to deploy a deep learning application, we need to bring two things together: the model and the data.
To make this happen, we'll normally see something like this: we have a front-end application, say a web app running in the browser, that a user can interact with to supply data. Then we have a back-end application, a web service, where our model is loaded and running. When the user supplies data to the front-end application, that web app makes a call to the back-end application and posts the data to the model. The model then does its thing, like making a prediction, and returns its prediction back to the web application, which then supplies it to the user.

So that's the usual story of how we bring the model and the data together, and we've already gone over all the details of how to actually implement this type of deployment, so be sure to check out that series I mentioned earlier if you haven't already. Links are in the description.

Now, with client-side neural network deployment, we have no back-end. The model isn't sitting on a server somewhere waiting to be called by front-end apps; rather, the model is embedded inside of the front-end application. What does this mean? Well, for a web app, that means the model is running right inside the browser. So in the example scenario we went over a couple of moments ago, rather than the data being sent across the network, or across the internet, from the front-end application to the model in the back-end, the data and the model are together from the start, within the browser. Okay, so this is cool and everything, but what's the point?
For one, users get to keep their data local to their machines or devices, since the model is running on the local device as well. If user data isn't having to travel across the internet, we can say that's a plus, and there's more on this concern in the previous series I mentioned. Additionally, when we develop something that runs in a front-end application, like in the browser, that means that anyone can use it. Anyone can interact with it, as long as they can browse to the link our application is running on. There are no prerequisites or installs needed, and in general, a browser application is highly interactive and easy to use. So from a user perspective, it's really simple to get started and engaged with the app.

This is all great stuff, but just as there were considerations to address with the traditional deployment implementation, there are a couple of caveats for the client-side deployment implementation as well. Don't let that run you off, though; there are just a few things to consider. As we're talking about these, I'm going to play some demos of open source projects that have been created with TensorFlow.js, just so you can get an idea of some of the cool things we can do with client-side neural networks. Links to all of them are in the description.
Now, on to the caveats. For one, we have to consider the size of our models. Since our models will be loaded and running in the browser, you can imagine that loading a massive model into our web app might cause some issues. TensorFlow.js suggests using models that are 30 megabytes in size or less. To give some perspective, the VGG16 model is over 500 megabytes, so what would happen if we tried to run that sucker in the browser? We're actually going to demo that in a future section, so stay tuned.

All right then, what types of models are good to run in the browser? Well, smaller ones: ones that have been created with the intent to run on smaller or lower-powered devices, like phones. And what type of model have we seen that's incredibly powerful for this? MobileNets! So we'll also be seeing how a MobileNet model holds up to being deployed in the browser.
Now, once we have a model up and running in the browser, can we do anything we'd ordinarily otherwise do with the model using TensorFlow.js? Pretty much. But would we want to do everything with a model running in the browser? That's another question, and the answer is: probably not. Having our models run in the browser is best suited for tasks like fine-tuning pre-trained models or, most popularly, inference: using a model to get predictions on new data. This task is exactly what we saw with the project we worked on in the Keras model deployment series using Flask. While building new models and training models from scratch can also be done in the browser, these tasks are usually better addressed using other APIs, like Keras or standard TensorFlow in Python.

So, now that we have an idea of what it means to deploy our model to a client-side application, why we'd want to do this, and what types of specific things we'd likely use this for, let's get set to start coding and implementing applications for this in the next sections of this video. Let me know in the comments if you plan to follow along, and I'll see you in the next section!
In this section, we'll continue getting acquainted with the idea of client-side neural networks, and we'll kick things off by seeing how we can use TensorFlow's model converter tool to convert Keras models into TensorFlow.js models. This will allow us to take models that have already been built and trained with Keras and make use of them in the browser with TensorFlow.js. So let's get to it!

TensorFlow.js has what they call the Layers API, which is a high-level neural network API inspired by Keras, and we'll see that what we can do with this API, and how we use it, is super similar to what we've historically been able to do with Keras. Given this, it makes sense that we should be able to take a model that we've built or trained in Keras and port it over to TensorFlow.js to use it in the browser with the Layers API, right? Otherwise, the alternative would be to build a model from scratch and train it from scratch in the browser, and as we discussed in the last section, that's not always going to be ideal. So having the ability and convenience to convert pre-built or pre-trained Keras models to run in the browser is definitely going to come in handy.

All right, now let's see how we can convert a Keras model to a TensorFlow.js model. First, we need to install the TensorFlow.js model converter tool. From a Python environment, probably one where Keras is already installed, we run pip install tensorflowjs from the terminal. Once we have this, we can convert a Keras model into a TensorFlow.js model. There are two ways to do the conversion, and we'll demo both.
The first way is making use of the converter through the terminal or command line. We'd want to use this method for Keras models that we've already saved to disk as an h5 file. If you've watched the deeplizard Keras series, you know we have multiple ways we can save a model, or save parts of a model, like just the weights or just the architecture. To convert a Keras model into a TensorFlow.js model, though, we need to have saved the entire model, with the weights, the architecture, everything, in an h5 file. Currently, that's done using the Keras model.save function.

Given this, I already have a sample model I've created with Keras and saved to disk. If you don't already have access to these Keras model files, don't worry; I've included links to a GitHub repo where you can just download these files. Once you have them, you can follow this conversion process using those h5 files when they're needed later in this video.

I'm in the terminal now, where we'll run the tensorflowjs_converter program. We run tensorflowjs_converter and specify what kind of input the converter should expect, so we supply --input_format keras. Then we supply the path to the saved h5 file and the path to the output directory where we want our converted model to be placed. The output directory needs to be a directory that's solely for holding the converted model; there will be multiple files, so don't just specify your desktop or something like that. When we run this, we get a warning regarding a deprecation, but it's not hurting anything we're doing here. And that's it for the first method.
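Here's a minimal sketch of both terminal commands; the file and directory paths are placeholders you'd swap for your own:

```shell
# Install the converter into a Python environment where Keras is installed
pip install tensorflowjs

# Convert a saved h5 Keras model into a TensorFlow.js model
tensorflowjs_converter --input_format keras path/to/model.h5 path/to/output_dir
```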
We'll see in a few moments what the format of the converted model looks like, but before we do that, let's demo the second way to convert a Keras model. This one is done directly in Python, and this method is for when we're working with a Keras model and want to go ahead and convert it on the spot to a TensorFlow.js model, without necessarily needing to save it to an h5 file first. We're in a Jupyter notebook, where we're importing Keras and the TensorFlow.js library, and I'm going to demo this with the VGG16 model, because we'll be making use of this one in a future section anyway; this conversion will work for any model you build with Keras, though. We have this VGG16 model that's created by calling keras.applications.vgg16.VGG16(), and then we call the tensorflowjs converters' save_keras_model function. To this function, we supply the model that we're converting, as well as the path to the output directory where we want the converted TensorFlow.js model to be placed. And that's it for the second method.
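A minimal sketch of the in-Python conversion; the output path is a placeholder:

```python
import keras
import tensorflowjs as tfjs

# Build the pretrained VGG16 model (weights download on first use)
vgg16 = keras.applications.vgg16.VGG16()

# Convert it on the spot to a TensorFlow.js model
tfjs.converters.save_keras_model(vgg16, 'tfjs-models/VGG16')
```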
So let's go check out what the output from these conversions looks like. We're going to look at the smaller model that we converted from the terminal, so we're inside this directory called simple model, which is the output directory I specified when we converted the first model. We have a few files here. There's this one file called model.json, which contains the model architecture and metadata for the weight files, and those corresponding weight files are these sharded files that contain all the weights from the model, stored in binary format. The larger and more complex the model is, the more weight files there will be. This model was small, with only a couple of dense layers and about 640 learnable parameters, but the VGG16 model we converted, on the other hand, with over 140 million learnable parameters, has 144 corresponding weight files.

All right, so that's how we can convert our existing Keras models into TensorFlow.js models. We'll see how these models and their corresponding weights are loaded in the browser in a future section, when we start building our browser application to run these models. I'll see you there!
In this section, we'll go through the process of getting a web server set up to host deep learning web applications and serve deep learning models, with Express for Node.js. So let's get to it!
To build deep learning applications that run in the browser, we need a way to host these applications and a way to host the models; so really, we just need a way to serve static files. If you followed the deeplizard YouTube series on deploying Keras models, then you know that we already have a relatively easy way of hosting static files, and that's with Flask. Flask, though, is written in Python, and while it would work perfectly fine to host the TensorFlow.js applications we'll be developing, it makes sense that we might want to use a JavaScript-based technology to host our apps, since we're kind of breaking away from Python and embracing JavaScript in this series.

So, enter Express for Node.js. Express is a minimalist web framework, very similar to Flask, but it's for Node.js, not Python. If you're not already familiar with Node.js, then you're probably wondering what it is as well. Node.js, which we'll refer to most of the time as just Node, is an open source runtime environment that executes JavaScript on the server side. See, historically, JavaScript has been used mainly for client-side applications, like browser applications, for example, but Node allows us to write server-side code using JavaScript. We'll specifically be making use of Express to host our web applications and serve our models, so let's see how we can do that now.
First things first: we need to install Node.js. I'm here on the downloads page of Node's website, so you just need to navigate to this page, choose the installation for your operating system, and get it installed. I've installed Node on a Windows machine, but you'll still be able to follow the demos we'll see in a few moments even if you're running another operating system.

All right, after we've got Node installed, we need to create a directory that will hold all of our project files. We have this directory here I've called tensorflowjs. Within this directory, we'll create a subdirectory called local-server, which is where the Express code that will run our web server will reside, and we'll also create a static directory, which is where our web pages, and eventually our models, will reside. Within local-server, we create a package.json file, which is going to allow us to specify the packages that our project depends on. Let's go ahead and open this file. I've opened it with Visual Studio Code, which is a free, open source code editor developed by Microsoft that can run on Windows, Linux, and macOS. This is what we'll be using to write our code, so you can download it and use it yourself as well, or you can use any other editor that you'd like. All right, back to the package.json file.
Within package.json, we're going to specify a name for our project, which we're calling tensorflowjs, all lowercase per the requirements of this file. We'll also specify the version of our project. There are some specs that the format of this version has to meet, but most simplistically, it has to be in an x.x.x format, so we're just going to go with the default of 1.0.0. Name and version are the only two requirements for this file, but there are several other optional items we can add, like a description, the author, and a few others. We're not going to worry about this stuff, but we are going to add one more thing: the dependencies. This specifies the dependencies that our project needs to run. We're specifying Express here, since that's what we'll be using to host our web apps, and we're also specifying the version.
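Here's a sketch of what the resulting package.json might look like; the exact Express version shown is an assumption, so pin whichever release you're working with:

```json
{
  "name": "tensorflowjs",
  "version": "1.0.0",
  "dependencies": {
    "express": "^4.16.3"
  }
}
```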
Now we're going to open PowerShell, and we have the ability to open it right from within this editor by navigating to View and then Integrated Terminal. You should be able to open the terminal of your choice that's appropriate for your operating system, if you're running on Linux, for example, and don't have PowerShell; otherwise, you can just open the terminal outside of the editor if you'd like. From within PowerShell, we make sure we're inside the local-server directory, where the package.json file is, and we run npm install. npm stands for Node Package Manager, and by running npm install, npm will download and install the dependencies listed in our package.json file. So let's run npm install, and we'll see it installs Express. When this is finished, you can see that we now have an added node_modules directory that contains the downloaded packages, and we additionally have this package-lock.json file that we didn't have before, which contains information about the downloaded dependencies. Don't delete these things!
All right, so at this point, we have Node, and we have Express. Now we need to write a Node program that will start the Express server and host the files that we specify. To do this, we'll create this file called server.js. Inside server.js, we first import Express using require. Using require like this imports the Express module and gives our program access to it. You can think of a module in Node as being analogous to a library in JavaScript or Python: just a group of functions that we want to have access to from within our program. We then create an Express application using the Express module, which we assign to app.

An Express app is essentially a series of calls to functions that we call middleware functions. Middleware functions have access to the HTTP request and response objects, as well as the next function in the application's request-response cycle, which just passes control to the next handler. Within this app, when a request comes in, we're doing two things: we first log information about the request to the terminal where the Express server is running, and we then pass control to the next handler, which responds by serving any static files that we've placed in this directory called static, right within the root directory of our tensorflowjs project. So in our case, the middleware functions I mentioned are here and here. Note that the calls to app.use are only called once, and that's when the server is started; the app.use calls specify the middleware functions, and calls to those middleware functions will be executed each time a request comes into the server. Lastly, we call app.listen to specify which port Express should listen on. I've specified port 81 here, but you can specify whichever unused port you'd like. When the server starts up and starts listening on this port, this function will be called, which will log this message letting us know that the server is up and running.
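Here's a minimal sketch of the server.js described above; the exact logging format is an assumption, but the structure matches what we just walked through:

```js
// server.js
const express = require('express');
const app = express();

// Middleware 1: log information about each incoming request,
// then pass control to the next handler
app.use((req, res, next) => {
    console.log(`${new Date()} - ${req.method} request for ${req.url}`);
    next();
});

// Middleware 2: serve any static files placed in the static directory
app.use(express.static('../static'));

// Start listening on port 81 and log a message once the server is up
app.listen(81, () => {
    console.log('Serving static files on port 81');
});
```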
All right, we're all set up! Let's drop a sample HTML file into our static directory, then start up the Express server and see if we can browse to the page. We're actually just going to place the web application called predict.html, which we created in the Keras deployment series, into this directory as a proof of concept. So we place that here; you can use any HTML file you'd like to test this, though. Now, to start Express, we use PowerShell. Let's make sure we're inside the local-server directory, and we run node server.js. We get our output message letting us know that Express is serving files from our static directory on port 81. So now let's browse to localhost, or whatever the IP address is that you're running Express on, port 81, slash predict.html, which is the name of the file we put into the static directory. And here we go: this is indeed the web page we wanted to be served. We can also check out the output from this request in PowerShell to view the logging that we specified.

So, good! We now have Node and Express set up to be able to serve our models and host the TensorFlow.js apps we'll be developing coming up. Give me a signal in the comments if you were able to get everything up and running, and I'll see you in the next section!
In this section, we're going to start building the UI for our very first client-side neural network application using TensorFlow.js. So let's get to it!

Now that we have Express set up to host a web app for us, let's start building one. The first app we'll build is going to be similar in nature to the predict app we built in the Flask series with Keras. Recall, this was the app we built in that previous series: we had a fine-tuned VGG16 model running in the back-end as a web service, and as a user, we would select an image of a cat or dog, submit the image to the model, and receive a prediction. The idea of the app we'll develop with TensorFlow.js will be similar, but let's discuss the differences. Can we see the source code that's generating the response? Yeah, we can, and we will, but first, know that our model will be running entirely in the browser; our app will therefore consist only of a front-end application, developed with HTML and JavaScript.

Here's what the new app will do. The general layout will be similar to the one we just went over, where a user will select an image, submit it to the model, and get a prediction. We won't be restricted to choosing only cat and dog images this time, though, because we won't be using fine-tuned models; instead, we'll be using the original pre-trained models that were trained on ImageNet, so we'll have a much wider variety of images we can choose from. Once we submit our selected image to the model, the app will give us back the top five predictions for that image from the ImageNet classes. So, which model will we be using?
Well, remember how we discussed earlier that the models best suited for running in the browser are smaller models, and how TensorFlow recommends using models that are 30 megabytes or less in size? Well, we're first going to go against this recommendation and use VGG16 as our model, which is over 500 megabytes in size. We'll see how that works out for us, but you can imagine that it may be problematic. No worries, though: we'll have MobileNet to the rescue, coming in at only about 16 megabytes. So we'll get to see how these two models compare to each other, performance-wise, in the browser. It'll be interesting!
All right, let's get set up. From within the static directory we created last time, we need to create a few new resources. We need to create a file called predict-with-tfjs.html, which will be our web app. Then we also need to create a file called predict.js, which will hold all the JavaScript logic for our app. Then we need a directory to hold our TensorFlow.js models, so we have this one, which we're calling tfjs-models. Navigating inside, we have two subdirectories: one for MobileNet and one for VGG16, since these are the two models we'll be using. Each of these directories will contain the model.json and the corresponding weight files for each model, as we can see navigating inside the VGG16 directory. To get these files here, I simply went through the conversion process in Python of loading VGG16 and MobileNet with Keras and then converting the models with the TensorFlow.js converter we previously discussed, so follow that earlier section to get this same output to place in your model directories.

Navigating back to the static directory, the last resource is this imagenet_classes.js file. This is simply a file that contains all the ImageNet classes, which we'll be making use of later. You can also find all of these ordered ImageNet classes in the TensorFlow.js blogs at deeplizard.com. Let's open it up and take a look at the structure: we just have this JavaScript object, called IMAGENET_CLASSES, that contains the key-value pairs of the ImageNet classes with associated IDs.
All right, now let's open the predict-with-tfjs.html file and jump into the code. We're starting off in the head by specifying the title of our web page and importing the styling from this CSS file. For all the styling on the page, we'll be using Bootstrap, which is an open source library for developing with HTML, CSS, and JavaScript that uses design templates to format elements on the page. Bootstrap is really powerful, but we'll simply be using it just to make our app look a little nicer. Bootstrap uses a grid layout, where you can think of the web page as having containers that can be interpreted as grids, and then UI elements on the page are organized into the rows and columns that make up those grids. By setting the elements' class attributes, Bootstrap knows what type of styling to apply to them.

Given this, here's how we're organizing our UI elements. Within the body, we're putting all the UI elements within this main tag. You can see that our first div is what's considered to be a container on the page, and then within the container we have three rows, and each row has columns. The columns are where our actual UI elements reside. Our UI elements are the image selector, the predict button, the prediction list, and the selected image. We'll explore this grid layout interactively in just a moment, but first, let's finish checking out the remainder of the HTML. All we have left to do is import the required libraries and resources that our app needs. First we import jQuery, then we import TensorFlow.js with this line; this single line is all it takes to get TensorFlow.js into our app. Then we import the imagenet_classes.js file we checked out earlier, and lastly, we import our predict.js file, which, as mentioned earlier, contains all the logic for what our app does when a user supplies an image to it. All right, so that's it for the HTML.
explore the grid layout first we start
up our Express server which we learned
how to do in the last section then in
our browser we’ll navigate to the IP
address where I express servers running
port 81 predict with T fjs HTML and
here’s our page it’s pretty empty right
now because we haven’t selected an image
but once we write the JavaScript logic
to handle what to do when we select an
image then the name of the selected
image file will be displayed here the
image will be displayed in the image
section and upon clicking the predict
button the predictions for the image
from the model will be displayed in this
prediction section if we open the
developer tools by right-clicking on the
page and then click again inspect then
from the elements tab but we can explore
the grid layout let’s expand the body
then main then this first div that acts
as the container and hovering over this
div you can see that the blue on the
page is what’s considered to be the
container or the grid so now that we’ve
expanded this div we have access to all
the rows so hovering over the first row
we can see what that map’s to in the UI
from this blue section and we can do the
same for the second and third rows as
well then if we expand the rows we have
access to the columns that house the
individual UI elements so hovering over
this first column in the first row we
can see that the image selector is here
and the predict button is within the
second column in the first row and the
same idea applies for the remaining
elements on the page as well so
hopefully that sheds a bit of light on
the grid layout of that bootstrap is
making use of alright in the
next section will explore all of the
JavaScript that handles the predictions
and actually makes use of tensorflow Jas
I’ll see you there in this section we’ll
In this section, we'll continue the development of the client-side deep learning application we started last time. So let's get to it!

In the last section, we built the UI for our image classification web app. Now we'll focus on the JavaScript that handles all the logic for this app, and we'll also start getting acquainted with the TensorFlow.js API. Without further ado, let's get right into the code. Recall, in the last section, we created this predict.js file within the static directory but left it empty. This file now contains the JavaScript logic that handles what will happen when a user submits an image to the application, so let's look at the specifics of what's going on with this code.
We first specify what should happen when an image file is selected with the image selector. When a new image is selected, the change event will be triggered on the image selector, and when this happens, we first create this FileReader object called reader, to allow the web app to read the contents of the selected file. We then set the onload handler for reader, which will be triggered when reader successfully reads the contents of a file. When this happens, we first initialize the dataURL variable as reader.result, which will contain the image data as a URL that represents the file's data as a base64-encoded string. We then set the source attribute of the selected image to the value of dataURL. Lastly, within the onload handler, we need to get rid of any previous predictions that were being displayed for previous images, and we do this by calling empty on the prediction list element. Next, we get the selected file from the image selector and load the image by calling readAsDataURL on reader, passing in the selected image file.
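Here's a sketch of that change handler; the element IDs are assumptions based on the UI elements we created last time:

```js
$('#image-selector').change(function () {
    let reader = new FileReader();
    reader.onload = function () {
        let dataURL = reader.result;    // image data as a base64-encoded URL
        $('#selected-image').attr('src', dataURL);
        $('#prediction-list').empty();  // clear predictions from any previous image
    };
    let file = $('#image-selector').prop('files')[0];
    reader.readAsDataURL(file);
});
```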
We then instantiate this model variable, and we're going to define it directly below. Now, this section below may look a little freaky if you're not already a JavaScript whiz, so let's see what the deal is. Here we have what's called an IIFE, or immediately invoked function expression. An IIFE is a function that runs as soon as it's defined; we can see how this is structured by placing the function within parentheses and then specifying the call to the function with the parentheses that immediately follow. Within this function, we load the model by calling the TensorFlow.js function tf.loadModel, which accepts a string containing the URL to the model.json file. Recall from the last section, we showed how the model.json file and corresponding weight files should be organized within our static directory that's being served by Express. We're first going to be working with VGG16 as our model, so I've specified the URL to where the model.json file for VGG16 resides.

Now, tf.loadModel returns a promise, meaning that this function promises to return the model at some point in the future. The await keyword pauses the execution of the wrapping function until the promise is resolved and the model is loaded. This is why we use the async keyword when defining this function: if we want to use the await keyword, then it has to be contained within an async function. I've also added a progress bar to the UI to indicate to the user when the model is loading; as soon as the promise is resolved, we hide the progress bar from the UI, which indicates the model is loaded.
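A sketch of that IIFE; note that tf.loadModel was the API at the time of this video (newer TensorFlow.js releases renamed it tf.loadLayersModel), and the URL is just the one our Express server serves the model from:

```js
let model;
(async function () {
    // Load the converted VGG16 model from our static directory
    model = await tf.loadModel('http://localhost:81/tfjs-models/VGG16/model.json');
    $('.progress-bar').hide();  // model is loaded, so hide the progress bar
})();
```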
Before moving on, let's quickly jump over to the HTML we developed last time so I can show you where I inserted this progress bar. Here we are in predict-with-tfjs.html, and you can see that right within the first div, the container, I've inserted this row where the progress bar is embedded. We'll see it in action within the UI at the end of this video.
All right, jumping back over to the JavaScript, we now need to write the logic for what happens when the predict button is clicked. When a user clicks the predict button, we first get the image from the selected image element. Then we need to transform the image into a rank-4 tensor object of floats, with height and width dimensions of 224 by 224, since that's what the model expects. To do this, we create a tensor object from the image by calling the TensorFlow.js function tf.fromPixels and passing our image to it. We then resize the image to 224 by 224, cast the tensor's type to float32, and expand the tensor's dimensions to be of rank 4. We're doing all of this because the model expects the image data to be organized in this way, and note that all of these transformations occur with calls to functions from the TensorFlow.js API.
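A sketch of that transformation; tf.fromPixels was the API at the time (newer releases use tf.browser.fromPixels), and the resize method shown is one reasonable choice:

```js
let image = $('#selected-image').get(0);
let tensor = tf.fromPixels(image)
    .resizeNearestNeighbor([224, 224])  // height/width the model expects
    .toFloat()                          // cast int32 pixel values to float32
    .expandDims();                      // add a batch dimension: rank 3 -> rank 4
```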
All right, we have the tensor object of image data that the model expects. Now, VGG16 actually wants the image data to be further preprocessed in a specific way, beyond the basics we just completed. There are transformations to the underlying pixel data that need to happen for this preprocessing that VGG16 wants. In other libraries, like Keras, preprocessing functions for specific models are included in the API; currently, though, TensorFlow.js does not have these preprocessing functions included, so we need to build them ourselves. We're going to build a preprocessing function in the next section to handle this, so for right now, what we'll do is pass the image data contained in our tensor object as-is to the model. The model will still accept the data as input; it just won't do a great job with its predictions, since the data hasn't been processed in the same way as the images that VGG16 was originally trained on. So we'll go ahead and get this app functional now, and then we'll circle back around to handle the preprocessing in the next section and insert it appropriately.
So: a user clicks the predict button, we transform the image data into a tensor, and now we can pass the image to the model to get a prediction. We do that by calling predict on the model and passing our tensor to it; predict returns a tensor of the output predictions for the given input. We then call data on the prediction tensor, which asynchronously loads the values from the tensor and returns a promise of a typed array after the computation completes. Notice the await and async keywords here that we discussed earlier.

This predictions array is going to be made up of 1,000 elements, each of which corresponds to the prediction probability for an individual ImageNet class; each index in the array maps to a specific ImageNet class. Now, we want to get the top five highest predictions out of all of these, since that's what we'll be displaying in the UI, and we'll store these top five predictions in this top5 variable. Before we sort and slice the array to get the top five, we need to map the prediction values to their corresponding ImageNet classes. For each prediction in the array, we return a JavaScript object that contains the probability and the ImageNet class name; notice how we use the index of each prediction to obtain the class name from the IMAGENET_CLASSES object that we imported from the imagenet_classes.js file. We then sort the list of JavaScript objects by prediction probability in descending order and obtain the first five from the sorted list using the slice function. Lastly, we iterate over the top five predictions and store the class names and corresponding prediction probabilities in the prediction list of our UI. And that's it!
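Here's a sketch of that logic inside the async click handler; IMAGENET_CLASSES and the element IDs are assumptions based on the files we set up earlier:

```js
$('#predict-button').click(async function () {
    // ... the image-to-tensor transformation from above goes here ...
    let predictions = await model.predict(tensor).data();

    // Map each probability to its ImageNet class, then take the top five
    let top5 = Array.from(predictions)
        .map((probability, i) => ({ probability, className: IMAGENET_CLASSES[i] }))
        .sort((a, b) => b.probability - a.probability)
        .slice(0, 5);

    top5.forEach(p => {
        $('#prediction-list').append(`<li>${p.className}: ${p.probability.toFixed(6)}</li>`);
    });
});
```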
Let's now start up our Express server and browse to our app. All right, we're here, and we've got an indication that our model is loading. I paused the video while this model was continuing to load, and it ended up taking about 40 seconds to complete. Not great! It may even take longer for you, depending on your specific computing resources. Remember, though, I said we'd run into some less-than-ideal situations with running such a large model like VGG16 in the browser. Mind you, the time it takes to load the model is the first issue: we've got over 500 megabytes of files to load into the browser for this model, hence the long loading time.

All right, well, our model is loaded, so let's choose an image and predict on it. Hmm, about a five-second wait time to get a prediction on a single image; again, not great. Oh, and yeah, the displayed prediction isn't accurate, but that doesn't have anything to do with the model size or anything like that. It's just because we didn't include the preprocessing for VGG16. Remember, we're going to handle that in the next section, where we'll get further exposure to the TensorFlow.js API by exploring the tensor operations we'll need to work with to do the preprocessing. So we've got that coming up, and then afterwards, we'll solve all these latency issues attributed to using a large model by substituting MobileNet in for VGG16. Let me know in the comments if you were able to get your app up and running, and I'll see you in the next section!
In this section, we're going to explore several tensor operations by preprocessing image data to be passed to a neural network running in our web app. So let's get to it!

Recall that last time, we developed our web app to accept an image, pass it to our TensorFlow.js model, and obtain a prediction. For the time being, we're working with VGG16 as our model, and in the previous section, we temporarily skipped over the image preprocessing that needed to be done for VGG16. We're going to pick up with that now. We'll get exposure to what specific preprocessing needs to be done for VGG16, yes, but perhaps more importantly, we'll get exposure to working with and operating on tensors, and we'll be further exploring the TensorFlow.js library in order to do these tensor operations.

All right, let's get into the code. We're back inside our predict.js file, and we're going to insert the VGG16 preprocessing code right within the handler for the click event on the predict button. We're getting the image in the same way we covered last time: converting it into a tensor object using tf.fromPixels, resizing it to the appropriate 224 by 224 dimensions, and casting the type of the tensor to float32. No change here so far.
All right, now let's discuss the preprocessing that needs to be done for VGG16. This paper, authored by the creators of VGG16, discusses the details, the architecture, and the findings of this model. We're interested in finding out what preprocessing they did on the image data. Jumping to the architecture section of the paper, the authors state, quote: "The only preprocessing we do is subtracting the mean RGB value, computed on the training set, from each pixel." Let's break this down a bit.

We know that ImageNet was the training set for VGG16, so ImageNet is the dataset for which the mean RGB values are calculated. To do this calculation for a single color channel, say red, we compute the average red value of all the pixels across every ImageNet image; the same goes for the other two color channels, green and blue. Then, to preprocess each image, we subtract the mean red value from the original red value in each pixel, and we do the same for the green and blue values as well. This technique is called zero centering, because it forces the mean of the given dataset to be zero. So, we're zero centering each color channel with respect to the ImageNet dataset.

Now, aside from zero centering the data, we also have one more preprocessing step that's not mentioned here. The authors trained VGG16 using the Caffe library, which uses a BGR color scheme for reading images, rather than RGB. So, as a second preprocessing step, we need to reverse the order of each pixel from RGB to BGR.
Now that we know what we need to do, let's jump back into the code and implement it. We first define a JavaScript object, meanImageNetRGB, which contains the mean red, green, and blue values from ImageNet. We've then defined this list we're calling indices; the name will make sense in a minute. This list is made up of one-dimensional tensors of integers, created with tf.tensor1d: the first tensor in the list contains the single value 0, the second tensor contains the single value 1, and the third tensor contains the single value 2. We'll be making use of these tensors in the next step.

Here we have this JavaScript object we're calling centeredRGB, which contains the centered red, green, and blue values for each pixel in our selected image. Let's explore how we're doing this centering. Recall that we have our image data organized into a 224 by 224 by 3 tensor object. To get the centered red values for each pixel in our tensor, we first use the TensorFlow.js function tf.gather to gather all the red values from the tensor. Specifically, tf.gather is gathering each value from the 0th index along the tensor's second axis. Each element along the second axis of our 224 by 224 by 3 tensor represents a pixel containing a red, green, and blue value, in that order, so the 0th index in each of these pixels is the red value of the pixel.

After gathering all the red values, we need to center them by subtracting the mean ImageNet red value from each red value in our tensor. To do this, we use the TensorFlow.js sub function, which subtracts the value passed to it from each red value in the tensor and then returns a new tensor with those results. The value we're passing to sub is the mean red value from our meanImageNetRGB object, but we're first transforming this raw value into a scalar object by using the tf.scalar function.

All right, so now we've centered all the red values, but at this point, the tensor we've created that contains all of these red values is of shape 224 by 224 by 1. We want to reshape this tensor to just be a one-dimensional tensor containing all 50,176 red values, so we do that by specifying this shape to the reshape function. Great, now we have a one-dimensional tensor containing all the centered red values from every pixel in our original tensor. We now need to go through this same process again to get the centered green and blue values. At a brief glance, you can see the code is almost exactly the same as what we went through for the red values; the only exceptions are the indices we're passing to tf.gather and the mean ImageNet values we're passing to tf.scalar.
this point we now have this centered RGB
object that contains a 1 inch
joltin sir of centered read values a
one-dimensional tensor of centered green
values and a one-dimensional tensor of
centered blue values we now need to
create another tensor object that brings
all of these individual red green and
blue tensors together into a 224 by 224
by 3 tensor this will be the
pre-processed image so we create this
processed tensor by stacking the
centered red centered green and centered
blue tensors along axis 1 the shape of
this new tensor is going to be of 50000
176 by 3 this centre represents 50
thousand 176 pixels each containing a
red green and blue value we need to
reshape this tensor to be in the form
that the model expects which is 224 by
224 by 3 now remember at the start we
said that we’d need to reverse the order
of the color channels of our image from
RGB to BGR so we do that using the
tensor flow Jas function reverse to
reverse our tensor along the specified
axis
lastly we expand the dimensions to
transform the tensor from rank 3 to rank
4 since that’s what the model expects
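Pulling all of that together, here's a sketch of the preprocessing described in this section; tensor is the 224 by 224 by 3 float32 tensor we created from the image, 123.68 is the mean red value quoted later in this section, and the green and blue means shown are the standard ImageNet values:

```js
let meanImageNetRGB = { red: 123.68, green: 116.779, blue: 103.939 };

let indices = [tf.tensor1d([0], 'int32'),
               tf.tensor1d([1], 'int32'),
               tf.tensor1d([2], 'int32')];

// Center each color channel by subtracting its ImageNet mean,
// then flatten it to a rank-1 tensor of 224 * 224 = 50,176 values
let centeredRGB = {
    red: tf.gather(tensor, indices[0], 2)
        .sub(tf.scalar(meanImageNetRGB.red))
        .reshape([50176]),
    green: tf.gather(tensor, indices[1], 2)
        .sub(tf.scalar(meanImageNetRGB.green))
        .reshape([50176]),
    blue: tf.gather(tensor, indices[2], 2)
        .sub(tf.scalar(meanImageNetRGB.blue))
        .reshape([50176])
};

let processedTensor = tf.stack([centeredRGB.red, centeredRGB.green, centeredRGB.blue], 1)
    .reshape([224, 224, 3])  // back to height x width x channels
    .reverse(2)              // RGB -> BGR along the channel axis
    .expandDims();           // rank 3 -> rank 4 for the model
```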
OK, now we have our preprocessed image data in the form of this processedTensor object, so we can pass this preprocessed image to our model to get a prediction. Before we do that, though, note that we handled these tensor operations in a specific way, in a specific order, to preprocess the image. It's important to know that this isn't the only way we could have achieved this; in fact, there's a much simpler way, through a process called broadcasting, that could achieve this same processed tensor at the end. Don't worry, we're going to be covering broadcasting in a future section, but I thought that, for now, doing these kinds of exhaustive tensor operations would be a good opportunity for us to explore the TensorFlow.js API further and get more comfortable with tensors in general.

Checking out our app, using the same image as last time, we can now see that the model gives us an accurate prediction on the image, since the image has now been processed appropriately. Now, I don't know about you, but tensor operations like the ones we worked with here are always a lot easier for me to grasp when I can visualize what the tensor looks like before and after the transformation. So, in the next section, we're going to step through this code using the debugger to visualize each tensor transformation that occurs during preprocessing. I'll see you there!
In this section, we're going to continue our exploration of tensors. Here, we'll be stepping through the code we developed last time with the debugger to see the exact transformations that are happening to our tensors in real time. So let's get to it!

Last time, we went through the process of writing the code to preprocess images for VGG16, and through that process, we gained exposure to working with tensors: transforming and manipulating them. We're now going to step through these tensor operations with the debugger so that we can see these transformations occur in real time as we interact with our app. If you're not already familiar with using a debugger, don't worry, you'll still be able to follow. We'll first go through this process using the debugger in Visual Studio Code, then we'll demo the same process using the debugger built into the Chrome browser.

We're here within our predict.js file, within the click event for the predict button, where all the preprocessing code is written. We're placing a breakpoint in our code where our first tensor is defined; remember, this is where we're getting the selected image and transforming it into a tensor using tf.fromPixels. The expectation around this breakpoint is: when we browse to our app, the model will load, we'll select an image, and we'll click the predict button. Once we click predict, this click event will be triggered, and we'll hit this breakpoint. When this happens, the code execution will be paused until we tell it to continue to the next step. This means that, while we're paused, we can inspect the tensors we're working with and see how they look before and after any given operation. Let's see! We'll start our debugger in the top left of the window, which will launch our app in Chrome.
All right, we can see our model is loading. Okay, the model's loaded; let's select an image. Now let's click the predict button, and when we do this, we'll see our breakpoint get hit and the app will pause. And here we go: our code execution is now paused. We'll minimize the browser and expand our code window, since this is where we'll be debugging. We're currently paused at this line where we define our tensor object. We're going to click this step-over icon, which will execute this code where we're defining tensor, and we'll pause at the next step. Let's see.

All right, we're now paused at the next step. Now that tensor has been defined, let's inspect it a bit. First, we have this Variables panel over on the left, where we can check out information about the variables in our app, and we can see our tensor variable is here in this list. Clicking tensor, we can see we have all types of information about this object: for example, we can see the dtype is float32, the tensor is of rank 3, the shape is 224 by 224 by 3, and the size is 150,528. So we get a lot of information describing this guy.
Additionally, in the debug console, we can play with this tensor further. For example, let's print it using the TensorFlow.js print function. We'll scroll up a bit, and we can see that this kind of lets us get a summary of the data contained in this tensor. Remember, we made this tensor have shape 224 by 224 by 3, so looking at this output, we can visualize this tensor as an object with 224 rows, each of which is 224 pixels across, and each of those pixels contains a red, green, and blue value. So what's selected here represents one of those 224 rows, each one of these is one of the 224 pixels in this row, and each of these pixels contains first a red, then a green, then a blue value. Make sure you have a good grip on this idea, so you can follow all the transformations this tensor is about to go through.
All right, our debugger is paused on defining the meanImageNetRGB object. Let's go ahead and step over this so that it gets defined; again, we can now inspect this object over in the local Variables panel. We're not doing any tensor operations here, so let's go ahead and move on. We're now paused on our list of rank-1 tensors called indices, which we'll make use of later, so let's execute this. We can see indices now shows up in our local Variables panel. Let's inspect this one a bit from the debug console. If we just print out this list using console.log(indices), we get back that this is an array with three things in it. We know that each element in this array is a tensor, so let's access one of them; let's get the first tensor. And it might help if we spell indices correctly, so let's try that again. We get back that this object is a tensor, and we can see what it looks like: just a one-dimensional tensor with the single value 0. We can easily do the same thing for the second and third elements in the list, too.
All right, we're going to minimize this panel on the left now and scroll up some in our code. We're now paused where we're defining the centeredRGB object, and from last time, we know that's where the bulk of our tensor operations are occurring. If we execute this block, then we'll skip over being able to inspect each of these transformations, so what we'll do is stay paused here, but in the debugger console, we'll mimic each of these individual transformations one by one, so we can see the before and after versions of the tensor.

For example, we're first going to mimic what's happening here with the creation of the tensor that contains all the centered red values within our centeredRGB object. In the console, we'll create this variable called red and set it equal to just the first call to tf.gather, and see what it looks like. So we'll go ahead and copy this call, create a variable red, and set it equal to that. Before we do any other operations, let's see what this looks like. Let's first check the shape of red. Okay: 224 by 224 by 1. So, similar to what we saw from the original tensor of 224 by 224 by 3, but rather than the last dimension containing all three pixel values, red, green, and blue, our new red tensor only contains the red pixel values. Let's print red, and let's scroll up so that we can see the start of the tensor. Just to hit the point home, let's compare this to the original tensor. The first three values in red are 56, 58, and 59. Now let's scroll up and check out the original tensor to see if this lines up: and yep, our original tensor has the red values of 56, 58, and 59 in the first three 0th indices along the second axis, so red is just made up of each of these values.
All right, let's scroll back down in our debug console and see what the next operation on red is. This is where we're centering each red value by subtracting the mean red value from ImageNet, using this sub function. Let's make a new variable called centeredRed and mimic this operation, so we'll define centeredRed equal to red and then call the sub function. Now let's print centeredRed and scroll up to the top. Okay, so about -67, -65, and -64 for the first three values along the second axis. Let's compare this to the original red tensor now by scrolling up to look at that: these are 56, 58, and 59 as the first three values along the second axis. If we do the quick math of subtracting the mean red value of 123.68 (and remember, we can see that by looking here: 123.68 is our mean red value in the meanImageNetRGB object) from the first three values of our original red tensor, we do indeed end up with the centered red values in the new centeredRed tensor we just looked at. Now, centeredRed still has the same shape as red, which, recall, is 224 by 224 by 1.
which recall is 224 by 224 by one the
next step is to reshape this tensor to
be a rank one tensor of size 50,000 176
so we just want to bring all the
centered red values together which are
currently each residing in their own
individual tensors so to mimic this
reshape call we’ll make a new variable
called reshaped red so we’ll scroll back
down in our debugger console and we’ll
copy this reshape call and we’ll define
reshaped red equal to centered red and
then call reshape on that alright let’s
check the shape on this new object to
get confirmation and we see it is indeed
the shape that we specified let’s now
look at the printout of reshaped red
okay and we see all the red values are
now listed out here in this
one-dimensional tensor alright so that’s
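For reference, here are the console commands we just mimicked, gathered in one place; they assume we're paused at the breakpoint, so tensor, indices, and meanImageNetRGB are already defined:

```js
let red = tf.gather(tensor, indices[0], 2);
red.shape;                                            // [224, 224, 1]
let centeredRed = red.sub(tf.scalar(meanImageNetRGB.red));
centeredRed.print();                                  // first values around -67, -65, -64
let reshapedRed = centeredRed.reshape([50176]);
reshapedRed.shape;                                    // [50176]
```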
All right, so that's it for getting all the centered red values. As mentioned last time, we go through the same process to get all the greens and blues as well, so we're not going to go through that in the debugger. We'll now execute this block of code to create this centeredRGB object and move on to the next step. This is where we're bringing our centered red, green, and blue values all together into a new processed tensor.

From the console, let's run this first stack operation by creating a variable called stackedTensor, so I'll create stackedTensor and set that equal to this stack call. Remember, we just saw that reshapedRed ended up being a rank-1 tensor of shape 50,176; the green and blue tensors have the same shape and size, so when we stack them along axis 1, we should now have a 50,176 by 3 tensor. You may think the result of the stack operation would look like this, where we have the centered red tensor, with its 50,176 values, stacked on top of the green tensor, stacked on top of the blue tensor. That's how it would look if we were stacking along axis 0. Because we're stacking along axis 1, though, we'll get something that looks like this, where we have 50,176 rows, each of which is made up of a single pixel with a red, green, and blue value. Let's check the shape now in the console, to be sure we get the 50,176 by 3 we expect. Yep, we do!
Let's also print it to get a visual. Okay, so we have 50,176 rows, each containing a red, green, and blue value. Now we need to reshape this guy to be of shape 224 by 224 by 3 before we can pass it to the model, so let's do that now with a new variable we'll call reshapedTensor. We'll copy the reshape call from over here and define reshapedTensor equal to our stackedTensor.reshape. Okay, let's print this reshapedTensor and scroll up to the top. Again, this shape means we have 224 rows, each containing 224 pixels, which each contain a red, green, and blue value.

Now we need to reverse the values in this tensor along the second axis, from RGB to BGR, for the reasons we mentioned last time. So we'll copy this reverse call here, make a new object called reversedTensor, and set that equal to our reshapedTensor.reverse. We'll scroll down in our debug console, print this one out, and scroll up to the top of it. Okay, so we see the first BGR values: -99, -87, -67. Let's scroll up to our last tensor to make sure this is the reverse of the RGB values we had there, and comparing the first pixel against the same pixel in the previous tensor, the order is indeed reversed, so our new tensor has the reversed values.
Let's scroll back down in our debugger. Our last operation is expanding the dimensions of our tensor to make it go from a rank-3 to a rank-4 tensor, which is what our model requires. So we'll create a new tensor called expandedTensor, set that equal to reversedTensor, copy the expandDims call from over here, and call that on our reversedTensor. All right, now let's check the shape of this guy to make sure it's what we expect. We have this inserted dimension at the start now, making our tensor rank 4 with shape 1 by 224 by 224 by 3, rather than the 224 by 224 by 3 we had last time. And if we print this out and scroll up to the start, we can see this extra dimension added around our previous tensor. That sums up all the tensor operations.
Quickly, though, in case you're not using Visual Studio Code, I did want to also show this same setup directly within the Chrome browser, so that you can do your debugging there instead if you'd prefer. In Chrome, we can right-click on our page, click Inspect, and then go to the Sources tab. Here we have access to the source code for our app; predict.js is currently being shown in the open window, so now we have access to the exact code we were displaying in Visual Studio Code, and we can insert breakpoints here in the same way as well. Let's go ahead and put a breakpoint in the same place as we did earlier. Now let's select an image and click the predict button. We see that our app is paused at our breakpoint, and then we can step over the code just as we saw earlier. We have our variables all showing in this panel here, and we also have our console down here, so I can do indices[0].print(), for example, to get that same output that we got in the Visual Studio debugger, and from this console, I can run all the same code that we ran in Visual Studio Code as well.

All right, hopefully now you have a decent grasp on tensors and tensor operations. Let me know what you thought of going through this practice in the debugger, to see how the tensors changed over time with each operation, and I'll see you in the next section!
next section in this section we’ll learn
about broadcasting and illustrate its
importance and major convenience when it
comes to tensor operations so let’s get
to it over the last couple of sections
we’ve immerse ourselves in tensors and
hopefully now we have a good
understanding of how to work with
transform and operate on them if you
recall a couple sections back I
mentioned the term broadcasting and said
that we would layer make use of it to
vastly simplify our vgg 16
pre-processing code before we get into
the details about what broadcasting is
though let’s get a sneak peek of what
our transformed code will look like once
we’ve introduced broadcasting because
I’m using git for source management I
can see the disk between our original
predict jas file and the modified
version of this file that uses
broadcasting on the left we have our
original predict J’s file within the
click event recall this is where we
transformed our image into a tensor then
the rest of this code was all created to
do the appropriate pre-processing for
vgg 16 where we centered and reversed
the RGB values now on the right this is
our new and improved predict J’s file
that makes use of broadcasting in
place of all the explicit 1×1 tenser
operations on the left so look all of
this code in red has now been replaced
with what’s shown in green that’s a
pretty massive reduction of code before
we show how this happened we need to
understand what broadcasting is
Broadcasting describes how tensors with different shapes are treated during arithmetic operations. For example, it might be relatively easy to look at these two rank-2 tensors and figure out what their sum would be. They have the same shape, so we just take the element-wise sum of the two tensors, where we calculate the sum element by element, and here we go, we have our resulting tensor. Now, since these two tensors have the same shape, 1 by 3, no broadcasting is happening here. Remember, broadcasting comes into play when we have tensors with different shapes.

Alright, so what would happen if our two rank-2 tensors instead looked like this, and we wanted to sum them? We have one with shape 1 by 3 and the other with shape 3 by 1. Well, here's where broadcasting will come into play. Before we cover how this is done, go ahead and pause the video and just see intuitively what comes to mind as the resulting tensor from adding these two together. Give it a go, write it down, and keep what you write handy, because we'll circle back around to what you wrote later in the video.

Alright, we're first going to look at the result, and then we'll go over how we arrived there. The result from summing these two tensors is a 3 by 3 tensor. So here's how broadcasting works. We have two tensors with different shapes, and the goal of broadcasting is to make the tensors have the same shape so we can perform element-wise operations on them. First, we have to see if the operation we're trying to do is even possible between the given tensors; based on the tensors' original shapes, there may not be a way to reshape them to force them to be compatible, and if we can't do that, then we can't use broadcasting. The rule to see if broadcasting can be used is this: we compare the shapes of the two tensors, starting at their last dimensions and working backwards. Our goal is to determine whether each dimension between the two tensor shapes is compatible. In our example, we have shapes 3 by 1 and 1 by 3, so we first compare the last dimensions. The dimensions are compatible when either (a) they're equal to each other, or (b) one of them is 1. Comparing the last dimensions of the two shapes, we have a 1 and a 3. Are these compatible? Well, let's check the rule. Are they equal? No, 1 doesn't equal 3. Is one of them 1? Yes. Great, the last dimensions are compatible. Working our way to the front, for the next dimension we have a 3 and a 1. Similar story, just switched order, right? So are these compatible? Yes. Okay, that's the first step: we've confirmed that each dimension between the two shapes is compatible. If, however, while comparing the dimensions we had found that at least one dimension wasn't compatible, then we would cease our efforts there, because the arithmetic would not be possible between the two tensors.

Now, since we've confirmed that our two tensors are compatible, we can sum them and use broadcasting to do it. When we sum two tensors, the result of this sum will be a new tensor, and our next step is to find out the shape of this resulting tensor. We do that by again comparing the shapes of the original tensors. Let's see exactly how this is done. Comparing the shape of 1 by 3 to 3 by 1, we first calculate the max of the last dimensions. The max of 3 and 1 is 3, so 3 will be the last dimension of the shape of the resulting tensor. Moving on to the next dimension, again the max of 1 and 3 is 3, so 3 will be the next dimension of the shape of the resulting tensor. We've now stepped through each dimension of the shapes of the original tensors, and we can conclude that the resulting tensor will have shape 3 by 3. The original tensors, of shape 1 by 3 and 3 by 1, will now be expanded to shape 3 by 3 as well, in order to do the element-wise operation. Broadcasting can be thought of as copying the existing values within the original tensor and expanding that tensor with these copies until it reaches the required shape. The values in our 1 by 3 tensor will now be broadcast to this 3 by 3 tensor, and the values in our 3 by 1 tensor will now be broadcast to this 3 by 3 tensor. We can now easily take the element-wise sum of these two to get this resulting 3 by 3 tensor.
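If you'd like to verify this yourself, here's a quick sketch you can run in the console once TensorFlow.js is loaded; the values are made up purely for illustration:

```javascript
const a = tf.tensor2d([[1, 2, 3]]);        // shape [1, 3]
const b = tf.tensor2d([[10], [20], [30]]); // shape [3, 1]

// Both tensors are broadcast to shape [3, 3] before the element-wise sum:
// [[11, 12, 13],
//  [21, 22, 23],
//  [31, 32, 33]]
a.add(b).print();
```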
Let's do another example. What if we wanted to multiply this rank-2 tensor of shape 1 by 3 with this rank-0 tensor, better known as a scalar? We can do this, since there's nothing in the broadcasting rules preventing us from operating on two tensors of different ranks. Let's see. We first compare the last dimensions of the two shapes. When we're in a situation where the ranks of the two tensors aren't the same, like what we have here, then we simply substitute a 1 in for the missing dimensions of the lower-ranked tensor. In our example, we substitute a 1 here. Then we ask: are these two dimensions compatible? The answer will always be yes in this type of situation, since one of them will always be a 1. Alright, all the dimensions are compatible, so what will the resulting tensor look like from multiplying these two together? Again, go ahead and pause here and try it yourself before getting the answer. Well, the max of 3 and 1 is 3, and the max of 1 and 1 is 1, so our resulting tensor will be of shape 1 by 3. Our first tensor is already this shape, so it gets left alone. Our second tensor is now expanded to this shape by broadcasting its value, like this. Now we can do our element-wise multiplication to get this resulting 1 by 3 tensor.
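In code, the scalar case looks like this sketch (again, illustrative values only):

```javascript
const c = tf.tensor2d([[1, 2, 3]]); // shape [1, 3]
const s = tf.scalar(2);             // rank 0

// The scalar is broadcast across the tensor, giving [[2, 4, 6]] with shape [1, 3].
c.mul(s).print();
```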
Let's do one more example. What if we wanted to sum this rank-3 tensor of shape 1 by 2 by 3 and this rank-2 tensor of shape 3 by 3? Before covering any of the incremental steps, go ahead and give it a shot yourself and see what you find out. Alright, assuming you've now paused and resumed the video: the deal with these two tensors is that we can't operate on them. Why? Well, comparing the second-to-last dimensions of the shapes, they're not equal to each other, and neither one of them is 1, so we stop there.
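To tie the rule together, here's a small helper that implements the compatibility check we just walked through. To be clear, this is not a TensorFlow.js API; it's just a sketch of the rule itself:

```javascript
// Computes the broadcast result shape, or throws if the shapes are incompatible.
function broadcastShape(shapeA, shapeB) {
  const result = [];
  const rank = Math.max(shapeA.length, shapeB.length);
  for (let i = 1; i <= rank; i++) {
    // Walk the shapes backwards from the last dimension, substituting
    // a 1 for the missing dimensions of the lower-ranked tensor.
    const a = i <= shapeA.length ? shapeA[shapeA.length - i] : 1;
    const b = i <= shapeB.length ? shapeB[shapeB.length - i] : 1;
    if (a !== b && a !== 1 && b !== 1) {
      throw new Error('Shapes are not compatible for broadcasting');
    }
    result.unshift(Math.max(a, b)); // each result dimension is the max of the pair
  }
  return result;
}

broadcastShape([1, 3], [3, 1]);    // [3, 3]
broadcastShape([1, 3], []);        // [1, 3] -- the scalar case
broadcastShape([1, 2, 3], [3, 3]); // throws: 2 vs 3 fails the rule
```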
Alright, and now we should have a good grip on broadcasting, so let's go see how we're able to make use of it in our VGG16 preprocessing code. First, we can see we're changing our meanImageNetRGB object into a rank-1 tensor, which makes sense, right? Because we're going to be making use of broadcasting, which is going to require us to work with tensors, not arbitrary JavaScript objects. Alright, now get a load of this remaining code. All of this code was written to handle the centering of the RGB values. This has now all been replaced with this single line, which is simply the result of subtracting the meanImageNetRGB tensor from the original tensor.

Okay, so why does this work, and where is the broadcasting? Let's see. Our original tensor is a rank-3 tensor of shape 224 by 224 by 3. Our meanImageNetRGB tensor is a rank-1 tensor of shape 3. Our objective is to subtract each mean RGB value from each RGB value along the second axis of the original tensor. From what we've learned about broadcasting, we can do this really easily. We compare the dimensions of the shapes from each tensor and confirm they're compatible: the last dimensions are compatible because they're equal to each other, and the next two dimensions are compatible because we substitute a 1 in for the missing dimensions in our rank-1 tensor. Taking the max across each dimension, our resulting tensor will be of shape 224 by 224 by 3. Our original tensor already has that shape, so we leave it alone. Our rank-1 tensor will be expanded to this shape of 224 by 224 by 3 by copying its three values along the second axis. So now we can easily do the element-wise subtraction between these two tensors.
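Concretely, that single centering line looks something like the sketch below. The mean values shown are the standard ImageNet means commonly used for VGG16 preprocessing, so double-check them against the ones defined in your own code:

```javascript
// Rank-1 tensor of per-channel means, shape [3].
const meanImageNetRGB = tf.tensor1d([123.68, 116.779, 103.939]);

// tensor has shape [224, 224, 3]; the rank-1 tensor is broadcast
// across the first two dimensions during the subtraction.
const centeredTensor = tensor.sub(meanImageNetRGB);
```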
Exiting out of this diff and looking at the modified predict.js file alone, we have this. The reversing and the expanding of the dims at the end is still occurring in the same way after the centering. Now, actually, if we wanted to make this code even more concise, rather than creating two tensor objects, our original one and our preprocessed one, we can chain all these calls together to condense the two separate tensors into one. We would first need to bring our meanImageNetRGB definition above our tensor definition. Then we need to move our sub, reverse, and expandDims calls up and chain them to the original tensor. Lastly, we replace this reference to processedTensor with just tensor, and that's it.
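After that refactor, the preprocessing boils down to a single chain. Here's a sketch of how it might read; selectedImage and the resize call are assumptions carried over from how we built the tensor in earlier sections:

```javascript
const meanImageNetRGB = tf.tensor1d([123.68, 116.779, 103.939]);

const tensor = tf.fromPixels(selectedImage) // tf.browser.fromPixels in newer TF.js versions
  .resizeNearestNeighbor([224, 224])
  .toFloat()
  .sub(meanImageNetRGB) // center the RGB values via broadcasting
  .reverse(2)           // RGB -> BGR
  .expandDims();        // add the batch dimension
```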
So, if you took the time to truly understand the tensor operations we went through step by step in the last couple of sections, then you should now be pretty blown away by how much easier broadcasting can make our lives and our code. Given this, do you see the value in broadcasting? Let me know in the comments. Oh, and also, remember all those times I asked you to pause the video and record your answers to the examples we were going through? Let me know what you got, and don't be embarrassed if you were wrong. I was wrong when I tried to figure out examples like these when I first started learning broadcasting, so no shame. Let me know, and I'll see you in the next section.

In this section we'll be adding new functionality to our deep learning web application to increase its speed and performance. Specifically, we'll see how we can do this by switching models, so let's get to it.

We currently have a web app that allows users to select and submit an image and subsequently receive a prediction for the given image. Up to this point we've been using VGG16 as our model. VGG16 gets the job done when it comes to giving accurate predictions on the submitted images; however, as we've previously discussed, a model of its size, at over 500 megabytes, is not ideal for running in the browser. Because of this, we've seen a decent time delay in both loading the model and obtaining predictions from it. Well, we're in luck, because we'll now make use of a much smaller model: MobileNet, which at around 16 megabytes is pretty ideal size-wise for running in the browser. With MobileNet we'll see a vast decrease in time for both loading the model and obtaining predictions. Let's go ahead and get into the code to see what modifications we need to make.

Alright, we're here in our predict-with-tfjs.html file, and we're going to make a model selector where the user has the ability to choose which model to use. For now we'll have VGG16 and MobileNet as the available options. Currently, the call to load the model occurs immediately when the web page is requested, but now we'll change that functionality so that the model will be loaded once a user selects which model they'd like to use. Our model selector will take the form of an HTML select element, so the first thing we need to do is add this element to our HTML within the same row as the image selector and the predict button. We're adding this new select element within a column to the left of both of the previously mentioned elements. When a user shows up to the page, the model selector will be set to the option that states "Select Model," and they'll have the option to choose either MobileNet or VGG16.

Now, also recall how we mentioned that until now the model was being loaded immediately when a user arrived at the page, and during that time the progress bar would show to indicate the loading. Since we'll be changing the functionality so that the model isn't loaded until a user chooses which model they want to use, we won't need the progress bar to show until a model is selected. So, navigating to the progress bar element, we're going to set the display style attribute to none, which will hide the progress bar until we explicitly instruct it to be shown in the JavaScript code. Alright, that's it for the changes to our HTML.
Jumping to predict.js, we'll now specify what should happen once a user selects a model. When a model is selected, a change event is triggered on the model selector. We're handling this event by calling a new function, which we'll discuss in a moment, called loadModel. loadModel essentially does what it sounds like it does: we pass this function the value from the model selector, which is either going to be mobilenet or vgg16.

Do you remember how previously we were loading the model using an immediately invoked function expression, or IIFE? Well, now that we don't want to load the model until we explicitly call loadModel, like we just specified, we no longer want this loading to happen within an IIFE. The code for loadModel is actually super similar to the IIFE we had before, just with some minor adjustments. loadModel accepts the name of the model to be loaded. Once called, the progress bar will be shown to indicate the model is loading. We initially set the model to undefined so that, in case we're switching from one model to another, the previous model can be cleared from memory. Afterwards, we set model to the result of calling the TensorFlow.js function tf.loadModel. Remember, this function accepts the URL to the given model's model.json file. The models reside in folders that were given the names of the actual models themselves; for example, the VGG16 files reside within a directory called vgg16, and the MobileNet files reside within a directory called mobilenet. So when we give the URL to the model.json, we use the name of the selected model to point to the correct location where the corresponding JSON file resides. Once the model is loaded, we then hide the progress bar.
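Put together, the selector handler and loadModel look roughly like the sketch below; the jQuery-style selectors and element IDs are assumptions, so adapt them to your own markup:

```javascript
let model;

// Fires when the user picks a model from the select element.
$('#model-selector').change(function () {
  loadModel($(this).val());
});

async function loadModel(modelName) {
  $('.progress-bar').show(); // indicate that loading has started
  model = undefined;         // clear any previously loaded model from memory

  // The model.json files live in folders named after the models,
  // e.g. vgg16/model.json and mobilenet/model.json.
  model = await tf.loadModel(`${modelName}/model.json`); // tf.loadLayersModel in newer TF.js versions

  $('.progress-bar').hide();
}
```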
Alright, now let's navigate to the click event for the predict button. Previously, within this handler function, we would get the selected image, do all of the preprocessing for VGG16, and get a prediction. Well, now, since we have two different models that preprocess images differently, we're putting the preprocessing code into its own function, called preProcessImage. So now, once a user clicks the predict button, we get the selected image, we get the model name from the value of the model selector, and then we create a tensor, which is set to the result of our new preProcessImage function. We pass the function both the image and the model name.
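That portion of the click handler might look something like this sketch (the element IDs are assumptions):

```javascript
$('#predict-button').click(async function () {
  const image = $('#selected-image').get(0);    // the <img> element the user chose
  const modelName = $('#model-selector').val(); // 'vgg16' or 'mobilenet'
  const tensor = preProcessImage(image, modelName);
  // ...pass the preprocessed tensor to model.predict() as before
});
```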
Let's go check out this function. Alright, as just touched on, preProcessImage accepts an image and the model name. It then creates a tensor using tf.fromPixels, passing the given image to it, resizes this tensor to have height and width dimensions of 224 by 224, and casts the tensor's type to float. All of this should look really familiar, because we had this exact same code within the predict button's click event before. This code won't change regardless of whether we're using VGG16 or MobileNet. Now, in case later we want to add another model for which we only want the base generic preprocessing that we just covered, then in that case we won't pass a model name, and we'll catch that case with this if statement that just returns the tensor with expanded dimensions.

If VGG16 is the selected model, then we need to do the remaining preprocessing that we went over together in earlier sections. So we have our meanImageNetRGB tensor that we defined last time here, and we subtract the meanImageNetRGB tensor from the original tensor, reverse the RGB values, and expand the dimensions of the tensor. We then return this final tensor as the result of the function. If MobileNet is selected, on the other hand, then our preprocessing will be a bit different. Unlike VGG16, the images that MobileNet was originally trained on were preprocessed so that the RGB values were scaled down from a scale of 0 to 255 to a scale of -1 to 1. We do this by first creating a scalar value of 127.5, which is exactly one half of 255. We then subtract the scalar from the original tensor and divide that result by the scalar. This will put all the values on a scale of -1 to 1, and notice the use of broadcasting that's going on with these operations behind the scenes. Lastly, we again expand the dimensions and then return the resulting tensor. Also, in this last case, if a model name is passed to this function that isn't one of the available ones already here, then we'll throw this exception.
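Here's a sketch of the whole preProcessImage function as described; the resize method and the mean RGB values are assumptions carried over from earlier sections, so verify them against your own code:

```javascript
function preProcessImage(image, modelName) {
  // Base preprocessing shared by every model.
  let tensor = tf.fromPixels(image) // tf.browser.fromPixels in newer TF.js versions
    .resizeNearestNeighbor([224, 224])
    .toFloat();

  if (modelName === undefined) {
    // No model name given: return just the generic preprocessing.
    return tensor.expandDims();
  } else if (modelName === 'vgg16') {
    const meanImageNetRGB = tf.tensor1d([123.68, 116.779, 103.939]);
    return tensor
      .sub(meanImageNetRGB) // center the RGB values via broadcasting
      .reverse(2)           // RGB -> BGR
      .expandDims();
  } else if (modelName === 'mobilenet') {
    const offset = tf.scalar(127.5); // exactly half of 255
    return tensor
      .sub(offset)
      .div(offset)   // values now fall on a scale of -1 to 1
      .expandDims();
  } else {
    throw new Error(`Invalid model name: ${modelName}`);
  }
}
```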
Alright, we've made all the necessary code changes, so let's now browse to our app and see the results. We've arrived at our application, and we now have the new model selector we added. Clicking on the selector, we can choose either MobileNet or VGG16. Let's go ahead and select MobileNet, and you can see that it loaded pretty fast. Remember, when we loaded VGG16 in previous sections, I had to pause and resume the video since it took so long to load, but MobileNet was speedy. Alright, cool. Now we'll select an image and click predict, and again, MobileNet was super fast relative to VGG16 in returning a prediction to us. So hopefully this exercise has illustrated the practicality of using MobileNet in situations like these.

And that's it! Congratulations for making it all the way to the end, and give yourself a pat on the back. If you're looking to learn more, be sure to check out deeplizard on YouTube and the available resources on deeplizard.com. Thanks for watching, and I hope to see you later.
Disclaimer:
My Machine Learning Blog may not own some of the content presented.
Copyright Disclaimer under section 107 of the Copyright Act of 1976, allowance is made for “fair use” for purposes such as criticism, comment, news reporting, teaching, scholarship, education, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing.
All posts on this video blog are my personal opinion, and they don't in any way reflect the opinions of my employer. All materials, posts, and advice from this site are for informational, research, and testing purposes only. Use them at your own risk. I'm not in any way responsible for any damage done by following the posts, advice, tutorials, and articles from this video blog.
References
1. Originally published on YouTube on Nov 13, 2018.