Tuning Recurrent Neural Networks with Reinforcement Learning

By | machinelearning, ML, TensorFlow | No Comments

We are excited to announce our new RL Tuner algorithm, a method for enhancing the performance of an LSTM trained on data using Reinforcement Learning (RL). We create an RL reward function that teaches the model to follow certain rules, while still allowing it to retain information learned from data. We use RL Tuner to teach concepts of music theory to an LSTM trained to generate melodies. The two videos below show samples from the original LSTM model, and the same model enhanced using RL Tuner.
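The core idea can be sketched as a reward that adds a rule-based music-theory term to the data-trained model's log probability, so the policy learns the rules without forgetting the data. The rule set, scaling constant, and function names below are illustrative assumptions, not the paper's exact formulation:

```python
# Pitch classes of the C major scale (0 = C).
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}

def theory_reward(prev_note, note):
    """Toy music-theory reward: +1 for a note in C major, -1
    otherwise, with an extra -1 penalty for immediately repeating
    the previous note. Illustrative rules only."""
    r = 1.0 if note % 12 in C_MAJOR else -1.0
    if note == prev_note:
        r -= 1.0
    return r

def combined_reward(log_p_note, prev_note, note, c=0.5):
    """RL Tuner-style total reward: the data-trained model's log
    probability of the chosen note plus a scaled rule-based term."""
    return log_p_note + c * theory_reward(prev_note, note)
```

The log-probability term anchors the tuned policy to what the LSTM already learned from data; the rule term nudges it toward music-theoretic behavior.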



Multistyle Pastiche Generator


Vincent Dumoulin, Jonathon Shlens, and Manjunath Kudlur have extended image style transfer by creating a single network that performs more than one stylization of an image. The paper[1] has also been summarized in a Research Blog post. The source code and trained models behind the paper are being released here.

The model creates a succinct description of a style. These descriptions can be combined to create new mixtures of styles. Below is a picture of Picabo[5] stylized with a mixture of three different styles. Adjust the sliders below the image to create more styles.
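The mixing can be sketched as a convex combination of per-style parameters. In the paper the network is conditioned on a small set of per-style normalization parameters; the flat lists of floats below are a deliberately simplified stand-in for those:

```python
def blend_styles(style_params, weights):
    """Blend N styles by taking a convex combination of their
    per-style parameters. `style_params` is a list of equal-length
    float lists (one per style, simplified for illustration);
    `weights` are the raw slider values."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    w = [x / total for x in weights]  # normalize to sum to 1
    n = len(style_params[0])
    return [sum(w[i] * style_params[i][j] for i in range(len(w)))
            for j in range(n)]
```

With weights [1, 1] two styles are mixed evenly; skewed weights move the result toward one style, which is what the sliders do.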




(The original post embeds an interactive demo here: three sliders adjust the blend of the three styles, and the page swaps in a precomputed image for each normalized combination of weights.)



Magenta MIDI Interface


The Magenta team is happy to announce our first step toward providing an easy-to-use interface between musicians and TensorFlow. This release makes it possible to connect a TensorFlow model to a MIDI controller and synthesizer in real time.

Don’t have your own MIDI keyboard? There are many free software components you can download and use with our interface. Find out more about setting up your own TensorFlow-powered MIDI rig in the documentation.


Generating Long-Term Structure in Songs and Stories


One of the difficult problems in using machine learning to generate sequences, such as melodies, is creating long-term structure. Long-term structure comes very naturally to people, but it’s very hard for machines. Basic machine learning systems can generate a short melody that stays in key, but they have trouble generating a longer melody that follows a chord progression, or follows a multi-bar song structure of verses and choruses. Likewise, they can produce a screenplay with grammatically correct sentences, but not one with a compelling plot line. Without long-term structure, the content produced by recurrent neural networks (RNNs) often seems wandering and random.

But what if these RNN models could recognize and reproduce longer-term structure?

Music, Art and Machine Intelligence (MAMI) Conference


This past June, Magenta, in partnership with the Artists and Machine Intelligence group, hosted the Music, Art and Machine Intelligence (MAMI) Conference in San Francisco.
MAMI brought together artists and researchers to share their work and explore
new ideas in the burgeoning space intersecting art and machine learning.

AMI has posted a wonderful summary
of the event on their blog, which we encourage you to read.

Many of the lectures have also been made available on YouTube,
including talks by Google ML researchers Samy Bengio
and Blaise Aguera y Arcas,
Wekinator creator Rebecca Fiebrink,
and artist Mario Klingemann.

We hope you will find the content of the conference as stimulating as we did
and take part in the ongoing conversation in our discussion group.

– Magenta


Welcome to Magenta!


We’re happy to announce Magenta, a project from the Google Brain team
that asks: Can we use machine learning to create compelling art and
music? If so, how? If not, why not? We’ll use TensorFlow, and we’ll
release our models and tools in open source on our GitHub. We’ll also
post demos, tutorial blog postings and technical papers. Soon we’ll
begin accepting code contributions from the community at large. If
you’d like to keep up with Magenta as it grows, you can follow us on
our GitHub and join our discussion group.
What is Magenta?

Magenta has two goals. First, it’s a research project to advance the
state of the art in machine intelligence for music and art
generation. Machine learning has already been used extensively to
understand content, as in speech recognition or translation. With
Magenta, we want to explore the other side—developing algorithms that
can learn how to generate art and music, potentially creating
compelling and artistic content on their own.

Second, Magenta is an attempt to build a community of artists, coders
and machine learning researchers. The core Magenta team will build
open-source infrastructure around TensorFlow for making art and music.
We’ll start with audio and video support, tools for working with
formats like MIDI, and platforms that help artists connect to machine
learning models. For example, we want to make it super simple to play
music along with a Magenta performance model.

We don’t know what artists and musicians will do with these new tools,
but we’re excited to find out. Look at the history of creative
tools. Daguerre and later Eastman didn’t imagine what Annie Leibovitz
or Richard Avedon would accomplish in photography. Surely Rickenbacker
and Gibson didn’t have Jimi Hendrix or St. Vincent in mind. We believe
that the models that have worked so well in speech
recognition, translation and image annotation will seed an exciting
new crop of tools for art and music creation.

To start, Magenta is being developed by a small team of researchers
from the Google Brain team. If you’re a researcher or a coder, you
can check out our alpha-version
code. Once we have a
stable set of tools and models, we’ll invite external contributors to
check in code to our GitHub. If you’re a musician or an artist (or
aspire to be one—it’s easier than you might think!), we hope you’ll
try using these tools to make some noise or images or videos… or
whatever you like.

Our goal is to build a community where the right people are there to
help out. If the Magenta tools don’t work for you, let us know. We
encourage you to join our discussion list and shape how Magenta
evolves. We’d love to know what you think of our work—as an artist,
musician, researcher, coder, or just an aficionado. You can follow our
progress and check out some of the music and art Magenta helps create
right here on this blog. As we begin accepting code from community
contributors, the blog will also be open to posts from these
contributors, not just Google Brain team members.

Research Themes

We’ll talk about our research goals in more depth later, via a series
of tutorial blog postings. But here’s a short outline to give an idea
of where we’re heading.


Generation

Our main goal is to design algorithms that learn how to generate art
and music. There’s been a lot of great work in image generation from
neural networks, such as DeepDream from A. Mordvintsev et al. at
Google and Neural Style from L. Gatys et al. at U. Tübingen. We
believe this area is in its infancy, and expect to see
fast progress here. For those following machine learning closely, it
should be clear that this progress is already well underway. But
there remain a number of interesting questions: How can we make models
like these truly
generative? How can
we better take advantage of user feedback?

Attention and Surprise

It’s not enough just to sample images or sequences from some learned
distribution. Art is dynamic! Artists and musicians draw our
attention to one thing at the expense of another. They change their
story over time—is any Beatles album exactly like another?—and there’s
always some element of surprise at play. How do we capture effects
like attention and surprise in a machine learning model? While we
don’t have a complete answer for this question, we can point to some
interesting models such as the Show, Attend and Tell model by Xu et
al. from MILA in Montreal, which learns to control an attentional lens
and uses it to generate descriptive sentences of images.


Storytelling

This leads to perhaps our biggest challenge: combining generation,
attention and surprise to tell a compelling story. So much
machine-generated music and art is good in small chunks, but lacks any
sort of long-term narrative arc. (To be fair, my own 2002 music falls
into this category.) Alternately, some machine-generated content does
have long-term structure, but that structure is provided TO rather
than learned BY the algorithm. This is the case, for example, in David
Cope’s very interesting Experiments in Musical Intelligence, in which
an AI model deconstructs compositions by human composers, finds common
signatures in them, and recombines them into new works. The
design of models that learn to construct long narrative arcs is
important not only for music and art generation, but also areas like
language modeling, where it remains a challenge to carry meaning even
across a long paragraph, much less whole stories. Attention models
like the Show, Attend and Tell point to one promising direction, but
this remains a very challenging task.


Evaluation

Evaluating the output of generative models is deceptively
difficult. The time will come when Magenta has 20 different music
generation models available in open source. How do we decide which
ones are good? One option is to compare model output to training data
by measuring
likelihood. For
music and art, this doesn’t work very well. As argued very nicely in
A note on the evaluation of generative models (Theis et al.), it’s
easy to generate outputs that are close in terms of likelihood, but
far in terms of appeal (and vice versa). This motivates work on
artificial adversaries such as Generative Adversarial Networks by
Goodfellow et al. from MILA in Montreal. In the end, to answer the
evaluation question we need to get Magenta tools in the hands of
artists and musicians, and Magenta media in front of viewers and
listeners. As Magenta evolves, we’ll be working on good ways to
achieve this.

Other Google efforts

Finally, we want to mention other Google efforts and resources related
to Magenta. The Artists and Machine Intelligence
project is connecting with artists
to ask: What do art and technology have to do with each other? What is
machine intelligence, and what does ‘machine intelligence art’ look,
sound and feel like? Check out their
blog for more
about AMI.

The Google Cultural Institute is fostering the discovery of exhibits
and collections from museums and archives all around the world. Via
their Lab at the Cultural Institute, they’re also
connecting directly with artists. As we make TensorFlow/Magenta the
best machine learning platform in the world for art and music
generation, we’ll work closely with both AMI and the Google Cultural
Institute to connect artists with technology. To learn more about our
various efforts, be sure to check out the Google Research blog.



A Recurrent Neural Network Music Generation Tutorial


We are excited to release our first
tutorial model,
a recurrent neural network that generates music. It serves as an end-to-end primer on how to build
a recurrent network in TensorFlow. It also
demonstrates a sampling of what’s to come in Magenta. In addition, we are
releasing code that converts MIDI files to a format that TensorFlow can
understand, making it easy to create training datasets from any collection of
MIDI files.

This tutorial will allow you to generate music with a recurrent neural
network. It’s purposefully a simple model, so don’t expect stellar
musical results. We’ll post more complex models soon.

Background on Recurrent Neural Networks

A recurrent neural network (RNN) has looped, or recurrent, connections which
allow the network to hold information across inputs. These connections can be
thought of as similar to memory. RNNs are particularly useful for learning
sequential data like music.

In TensorFlow, the recurrent connections in a graph are unrolled into
an equivalent feed-forward network. That network is then trained using
a gradient descent technique called backpropagation through time.

An RNN’s recurrent connection unrolled through time. Image courtesy
of Chris Olah.
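The recurrence, and what unrolling it over a sequence looks like, can be sketched in a few lines of plain Python. This is a toy single-unit RNN with fixed, illustrative weights, not the released model:

```python
import math

def rnn_step(h, x, w_h=0.5, w_x=1.0):
    """One step of a minimal single-unit RNN: the new hidden state
    mixes the previous state (the recurrent connection) with the
    current input. Weights are fixed here for illustration."""
    return math.tanh(w_h * h + w_x * x)

def unroll(inputs):
    """Unroll the recurrence over a whole sequence, as TensorFlow
    does when it converts the loop into a feed-forward graph."""
    h = 0.0
    states = []
    for x in inputs:
        h = rnn_step(h, x)
        states.append(h)
    return states
```

Feeding [1.0, 0.0, 0.0] shows the point of the recurrent connection: the later states stay nonzero even though the later inputs are zero, because the hidden state carries information forward.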

There are endless ways that an RNN can connect back to itself with recurrent
connections. People typically stick to a few common patterns, the most common
being Long Short-Term Memory (LSTM) cells and Gated Recurrent Units (GRUs). These
both have multiplicative gates that protect their internal memory from being
overwritten too easily, allowing them to handle longer sequences. We use LSTMs
in this model. To learn more about RNNs and specifically LSTMs, check out
Chris Olah’s fantastic post. Experts
in the field might also like to look at Goodfellow, Bengio and Courville’s
RNN chapter from their book
“Deep Learning.”
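A single-unit LSTM step can be sketched in plain Python to show how the multiplicative gates protect the cell memory. Weight names are illustrative assumptions and biases are omitted for brevity:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(c, h, x, W):
    """One step of a single-unit LSTM. The gates are values in
    (0, 1) that multiplicatively control what enters the cell
    memory c, what survives in it, and what is exposed as output."""
    i = sigmoid(W['i_x'] * x + W['i_h'] * h)    # input gate
    f = sigmoid(W['f_x'] * x + W['f_h'] * h)    # forget gate
    o = sigmoid(W['o_x'] * x + W['o_h'] * h)    # output gate
    g = math.tanh(W['g_x'] * x + W['g_h'] * h)  # candidate value
    c = f * c + i * g       # gated memory update
    h = o * math.tanh(c)    # gated output
    return c, h
```

Because the forget gate f multiplies the old memory rather than overwriting it, the cell can hold information across many steps, which is what lets LSTMs handle longer sequences.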

This Release

This RNN is the first in a series of models we will be releasing that
predict the next note given a sequence of previous notes. They do this
by learning a probability distribution over the next note given all
the previous notes. By sampling from that distribution and feeding the
chosen note back into the model at the next step, the RNN can dream up
an entire melody. Generative models are typically unsupervised,
meaning that there are samples but no labels. However, we turn melody
generation into a supervised problem by trying to predict the next
note in a sequence; that way, labels can be derived from any dataset
of just music and nothing else. This allows us to use RNNs, which are
trained with supervised methods.
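The sample-and-feed-back loop can be sketched as follows. The `next_note_probs` interface here is a hypothetical stand-in for the trained RNN, not the released model's API:

```python
import random

def generate_melody(next_note_probs, seed, length, rng=None):
    """Dream up a melody by repeatedly sampling the next note from
    the model's distribution and feeding the choice back in.
    `next_note_probs` maps the melody so far to a
    {note: probability} dict (a hypothetical interface)."""
    rng = rng or random.Random(0)
    melody = list(seed)
    for _ in range(length):
        probs = next_note_probs(melody)
        notes = list(probs)
        weights = [probs[n] for n in notes]
        melody.append(rng.choices(notes, weights=weights)[0])
    return melody
```

Each sampled note becomes part of the conditioning context for the next prediction, which is how a model trained only on next-note prediction can produce an entire melody.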

It takes a bit of work to put together a training set of melodies, so
we are providing code that reads an archive of MIDI files and outputs
monophonic melody lines extracted from them in a format TensorFlow can
understand. After you have that ready, instructions to build and run
the model are available in the repository.
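One simple heuristic for reducing polyphonic note data to a monophonic melody line, sketched on bare (start_time, pitch) tuples rather than real MIDI, is to keep only the highest pitch at each start time. This is an illustrative heuristic, not necessarily the rule the released converter uses:

```python
def extract_melody(notes):
    """Reduce possibly-polyphonic note events to a monophonic line
    by keeping, at each start time, only the highest pitch.
    `notes` is a list of (start_time, pitch) tuples."""
    by_start = {}
    for start, pitch in notes:
        by_start[start] = max(by_start.get(start, pitch), pitch)
    return [by_start[t] for t in sorted(by_start)]
```

For a chord (C and E at time 0) followed by a single D, this keeps the E then the D, yielding one note per time step as a sequence model expects.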


As always, we are excited to hear from you. Let us know what you liked, didn’t
like, and want to see in the future from Magenta. You can add some code to our
GitHub or join our
discussion group.


Reading List


Magenta’s primary goal is to push the envelope forward in research on music and art generation. Another goal of ours is to teach others about that research. That includes gathering important works in the field in one place, a resource that, if well curated, will be valuable to the community for years to come.

Toward that end, today we are publishing a batch of reviews of research papers that we think everyone in the field should read and understand. They will be hosted in the reviews section of our GitHub. This list certainly isn’t definitive and is purposefully meant to be organic. It includes:

  1. DRAW: A Recurrent Neural Network For Image Generation by Gregor et al. (paper)
  2. Generating Sequences with Recurrent Neural Networks by Graves. (paper, video)
  3. A Neural Algorithm of Artistic Style by Gatys et al. (paper)

We will have two more for you soon and others on a rolling basis.

There are certainly many other papers and resources that belong here. We want this to be a community endeavor and encourage high-quality summaries, both in terms of reviews and selection. So if you have a favorite, please file an issue saying which paper you want to write about. After we approve the topic, submit a pull request and we’ll be delighted to showcase your work.

– Magenta
