Skip to content
For most of my development, I use Jupyter notebooks which are fantastic for iterative development, but running in managed environments such as Google’s ML Engine require python scripts. You can obviously run these locally with python in the terminal or your IDE, but the loop from debug, terminate, change and re-run is rather slow (from what I can tell due to import speed of tensorflow and other imported packages). I wanted to be able to keep these imports in memory (like Jupyter) and just re-run a single function during development.
In 2017, there seems no doubt that if you aren’t running your ML training on a GPU you just aren’t doing things right ?. At home, my only computer is my MacBook Pro which is great to develop on, but would take an extremely long time to train something such as an image classification task. Saying this, I’d love to have a GPU machine at home, but I also love the opportunity to use this hardware in the cloud without having to power it, upgrade it and generally take care of something you actually own.
I’ve had a few attempts at getting TensorFlow estimators into a serving host and a client I can use to query them. Finally getting it working, I thought I’d write up the steps for reproduction.
Assumptions and Prerequisites The first assumption is that you have already trained your estimator (say the tf.estimator.DNNRegressor) and this is now in the variable estimator. You also have a list of feature columns as is standard in a variable feature_columns.
Embeddings can be used in machine learning to represent data and take advantage of reducing the dimensionality of the dataset and learning some latent factors between data points. Commonly this is used with words to say, reduce a 400,000 word vector to a 50 dimensional vector, but could equally be used to map post codes or other token encoded data. Another use case might be in recommender systems GloVe (Global Vectors for Word Representation) was developed at Stanford and more information can be found here.
An MNIST classifier is the go-to introduction for machine learning. Tensorflow is no different, and evolves to the Deep MNIST for Experts to include convolution, max pooling, dense layers and dropout: a good overview of ML layers for image problems. The downside of this is it doesn’t make use of Tensorflow’s new tf.estimator high level APIs. These provide all sorts of benefits for free than the usual sess.run TensorFlow tutorials you see online.
A common format for storing images and labels is a tree directory structure with the data directory containing a set of directories named by their label and each containing samples for said label. Often transfer learning that is used for image classification may provide data in this structure.
Update May 2018: If you would like an approach that doesn’t prepare into TFRecords, utilising tf.data and reading directly from disk, I have done this in when making the input function for my Dogs vs Cats transfer learning classifier.
TFRecords are TensorFlow’s native binary data format and is the recommended way to store your data for streaming data. Using the TFRecordReader is also a very convenient way to subsequently get these records into your model.
The data We will use the well known MNIST dataset for handwritten digit recognition as a sample. This is easily retrieved from tensorflow via:
from tensorflow.examples.tutorials.mnist import input_data mnist = input_data.read_data_sets( "/tmp/tensorflow/mnist/input_data", reshape=False ) We then have mnist.
Azure functions can be very cheap and very easy to manage. Hosting a single page app (SPA) that does very few requests to the host and is basically just a few files to be delivered to the client seems perfect for using the consumption pricing of azure functions. Basically only pay for a request vs a monthly fee.
Azure Resources We’ll need a few resources in Azure, they are:
It was great to see you can now use Ghost as an NPM module. The upgrade process used to be a pain prior to the 1.0 release. Saying this, I ran into a few bumps trying to host this on Azure.
Follow the docs on Ghost as an NPM module to get started Set server port at runtime Node runs behind IIS and is reverse proxied so the port we are assigned is not static.
Simple script to bump the build number of my Xcode project on each Archive. The version number I am leaving as a manual change for the moment, but each time I release a build to testers, I want the build to change and a commit message for the change. In Xcode from the menu go to Product -> Scheme -> Edit Scheme On the side, expand the Arhive build and select Pre-actions.