[Editorial Note: We’re excited to feature a guest blog post by another member of our extended community, Hanoi Hantrakul, whose team recently won the Outside Lands Hackathon by building an interactive application based on NSynth.]
I review (with animations!) backprop and truncated backprop through time (TBPTT), and introduce a multi-scale adaptation of TBPTT for hierarchical recurrent neural networks that has logarithmic space complexity. I had hoped to use it to study long-term dependencies, but the implementation grew too complicated and collapsed under its own weight. Finally, I lay out some reasons why long-term dependencies are difficult to deal with, going beyond the well-studied kind of gradient vanishing caused by system dynamics.
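To make the starting point concrete, here is a minimal numpy sketch of plain fixed-window TBPTT (not the multi-scale adaptation described above): a vanilla RNN is trained to predict the next value of a sequence, gradients are backpropagated only within each window of `k` steps, and the hidden state is carried across windows without gradient flow. All function and variable names here are illustrative, not from the post's actual implementation.

```python
import numpy as np

def tbptt_train(seq, k=5, hidden=8, lr=0.05, epochs=10, seed=0):
    """Train a tiny vanilla RNN to predict seq[t+1] from seq[:t+1],
    using truncated backprop through time with window length k.
    Returns the total loss per epoch."""
    rng = np.random.default_rng(seed)
    Wxh = rng.normal(0, 0.1, (hidden, 1))      # input -> hidden
    Whh = rng.normal(0, 0.1, (hidden, hidden)) # hidden -> hidden
    Why = rng.normal(0, 0.1, (1, hidden))      # hidden -> output
    losses = []
    for _ in range(epochs):
        h = np.zeros((hidden, 1))
        total = 0.0
        for start in range(0, len(seq) - 1, k):
            xs = seq[start:start + k]
            ys = seq[start + 1:start + k + 1]
            hs, cache = [h], []
            # Forward through one window, starting from the carried-over state.
            for x, y in zip(xs, ys):
                h = np.tanh(Wxh * x + Whh @ hs[-1])
                pred = Why @ h
                total += 0.5 * ((pred - y) ** 2).item()
                hs.append(h)
                cache.append((x, y, pred))
            # Backward: gradients flow only within this window (the truncation).
            dWxh, dWhh, dWhy = (np.zeros_like(W) for W in (Wxh, Whh, Why))
            dh_next = np.zeros((hidden, 1))
            for t in reversed(range(len(cache))):
                x, y, pred = cache[t]
                dy = pred - y
                dWhy += dy * hs[t + 1].T
                dh = Why.T * dy + dh_next
                dz = (1 - hs[t + 1] ** 2) * dh  # tanh derivative
                dWxh += dz * x
                dWhh += dz @ hs[t].T
                dh_next = Whh.T @ dz
            for W, dW in ((Wxh, dWxh), (Whh, dWhh), (Why, dWhy)):
                np.clip(dW, -5, 5, out=dW)  # guard against exploding gradients
                W -= lr * dW
            # Carry the state forward, but no gradient crosses this boundary.
            h = hs[-1]
        losses.append(total)
    return losses
```

Because `dh_next` is reset at each window boundary, memory is O(k) rather than O(T); the price is that no credit is assigned across windows, which is exactly the limitation the multi-scale variant tries to address.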
Last summer at Magenta, I took on a somewhat ambitious project. Whereas most of Magenta was working on the symbolic level (scores, MIDI, pianorolls), I felt that this left out several important aspects of music, such as timbre and phrasing. Instead, I decided to work on a generative model of real audio.
[Editorial Note: One of the best parts of working on the Magenta project is getting to interact with the awesome community of artists and coders. Today, we’re very happy to have a guest blog post by one of those community members, Parag Mital, who has implemented a fast sampler for NSynth to make it easier for everyone to generate their own sounds with the model.]
For mobile users on a cellular data connection: this first demo downloads about 5 MB of data. Every time you change the model in the demo, you will use another 5 MB of data.
We present Performance RNN, an LSTM-based recurrent neural network designed to model polyphonic music with expressive timing and dynamics. Here’s an example generated by the model:
Sketch-RNN, a generative model for vector drawings, is now available in Magenta. For an overview of the model, see the April 2017 Google Research blog post Teaching Machines to Draw (David Ha). For the technical machine learning details, see the arXiv paper A Neural Representation of Sketch Drawings (David Ha and Douglas Eck).
Vector drawings of flamingos from our Jupyter notebook.
In a previous post, we described the details of NSynth (Neural Audio Synthesis), a new approach to audio synthesis using neural networks. We hinted at further releases to enable you to make your own music with these technologies. Today, we’re excited to follow through on that promise by releasing a playable set of neural synthesizer instruments:
- An interactive AI Experiment made in collaboration with Google Creative Lab that lets you interpolate between pairs of instruments to create new sounds.
One of the difficult problems in using machine learning to generate sequences, such as melodies, is creating long-term structure. Long-term structure comes very naturally to people, but it’s very hard for machines. Basic machine learning systems can generate a short melody that stays in key, but they have trouble generating a longer melody that follows a chord progression, or follows a multi-bar song structure of verses and choruses. Likewise, they can produce a screenplay with grammatically correct sentences, but not one with a compelling plot line. Without long-term structure, the content produced by recurrent neural networks (RNNs) often seems wandering and random.
But what if these RNN models could recognize and reproduce longer-term structure?
The Magenta team is happy to announce our first step toward providing an easy-to-use
interface between musicians and TensorFlow. This release makes it
possible to connect a TensorFlow model to a MIDI controller and synthesizer in
real time.
Don’t have your own MIDI keyboard? There are many free software
components you can download and use with our interface. Find out more details on
setting up your own TensorFlow-powered MIDI rig in the