Category

TensorFlow

Distributed TensorFlow and the hidden layers of engineering work

By | machinelearning, TensorFlow

With all the recent buzz around machine learning, it’s no surprise that companies are starting to experiment with their own ML models, and many of them are choosing TensorFlow. Because TensorFlow is open source, you can run it locally to quickly build prototypes and run fail-fast experiments that help you get a proof of concept working at a small scale. Then, when you’re ready, you can take TensorFlow, your data, and the same code and push it all up to Google Cloud to take advantage of multiple CPUs, GPUs, or soon even TPUs.

When you’re ready to take your ML work to the next level, you will have to make some choices about how to set up your infrastructure. Many of these choices affect how much time you spend on operational engineering work versus ML engineering work. To help, we’ve published a pair of solution tutorials that show you how to create and run a distributed TensorFlow cluster on Google Compute Engine, and how to run the same code to train the same model on Google Cloud Machine Learning Engine. The solutions train a model on the MNIST dataset, which isn’t necessarily the most exciting example to work with, but does allow us to emphasize the engineering aspects of the solutions.

We’ve already talked about the open-source nature of TensorFlow, which lets you run it on your laptop, on a server in your private data center, or even on a Raspberry Pi. TensorFlow can also run in a distributed cluster, allowing you to divide your training workloads across multiple machines, which can save you a significant amount of time waiting for results. The first solution shows you how to set up a group of Compute Engine instances running TensorFlow, as in Figure 1, by creating a reusable custom image and executing an initiation script with Cloud Shell. There are quite a few steps involved in creating the environment and getting it to function properly. Even though they aren’t complex steps, they are operational engineering steps, and they will take time away from your actual ML development.

Figure 1. A distributed TensorFlow cluster on Google Compute Engine.
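
To make the idea of a cluster concrete, here is a minimal sketch of how each machine might be told about its role. It builds a `TF_CONFIG`-style JSON description (a cluster of parameter servers and workers, plus the current task). The host names, the port, and the helper `build_tf_config` are assumptions made for illustration and do not come from the solution tutorials.

```python
import json


def build_tf_config(ps_hosts, worker_hosts, task_type, task_index):
    """Return a TF_CONFIG-style JSON string for one node of the cluster."""
    cluster = {"ps": list(ps_hosts), "worker": list(worker_hosts)}
    if task_type not in cluster:
        raise ValueError("unknown task type: %s" % task_type)
    if not 0 <= task_index < len(cluster[task_type]):
        raise ValueError("task index out of range for %s" % task_type)
    # Every node shares the same cluster map; only the task entry differs.
    config = {"cluster": cluster, "task": {"type": task_type, "index": task_index}}
    return json.dumps(config, sort_keys=True)


# Hypothetical host names and port, for illustration only.
PS_HOSTS = ["ps-0:2222"]
WORKER_HOSTS = ["worker-0:2222", "worker-1:2222"]

# Configuration that the second worker would receive.
worker_1_config = json.loads(build_tf_config(PS_HOSTS, WORKER_HOSTS, "worker", 1))
print(worker_1_config["task"])
```

On a self-managed cluster you would have to generate and distribute a description like this to every instance yourself, which is one example of the operational work described above.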

The second solution uses the same code with Cloud ML Engine, and with one command you’ll automatically provision the compute resources needed to train your model. This solution also delves into some of the general details of neural networks and distributed training. It also gives you a chance to try out TensorBoard to visualize your training and resulting model as seen in Figure 2. The time you save provisioning compute resources can be spent analyzing your ML work more deeply.

Figure 2. Visualizing the training result with TensorBoard.

Regardless of how you train your model, the whole point is that you want to use it to make predictions. Traditionally, this is where the most engineering work has to be done. If you want to build a web service to run your predictions, at a minimum you’ll have to provision, configure, and secure web servers, load balancers, and monitoring agents, and create some kind of versioning process. In both of these solutions, you’ll use the Cloud ML Engine prediction service to offload all of those operational tasks and host your model in a reliable, scalable, and secure environment. Once you set up your model for predictions, you’ll quickly spin up a Cloud Datalab instance and download a simple notebook to execute and test the predictions. In this notebook you’ll draw a number with your mouse or trackpad, as in Figure 3, and the drawing will be converted to an image matrix that matches the MNIST data format. The notebook will send your image to your new prediction API and tell you which number it detected, as in Figure 4.

Figure 3.
Figure 4.
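
The conversion step in the notebook can be sketched roughly as follows. This is not the notebook's actual code: it assumes the drawing has already been reduced to a 28x28 grid of 0-255 grayscale values, that the model was trained on pixel values scaled to [0, 1], and that the model's input is named `image` (a hypothetical name). The `{"instances": [...]}` wrapper is the request shape used for online prediction.

```python
import json


def to_mnist_vector(pixels):
    """Flatten a 28x28 grid of 0-255 grayscale values into 784 floats in [0, 1]."""
    if len(pixels) != 28 or any(len(row) != 28 for row in pixels):
        raise ValueError("expected a 28x28 image")
    return [value / 255.0 for row in pixels for value in row]


def build_request(pixels, input_key="image"):
    """Wrap one image in an online-prediction request body.

    "image" is a hypothetical input name; use whatever name your
    exported model actually expects.
    """
    return json.dumps({"instances": [{input_key: to_mnist_vector(pixels)}]})


# A blank canvas with a single fully lit pixel in the middle.
canvas = [[0] * 28 for _ in range(28)]
canvas[14][14] = 255

body = json.loads(build_request(canvas))
vector = body["instances"][0]["image"]
print(len(vector))
```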

This brings up one last and critical point about the engineering efforts required to host your model for predictions, which is not deeply expanded upon in these solutions, but is something that Cloud ML Engine and Cloud Dataflow can easily address for you. When working with pre-built machine learning models that work on standard datasets, it can be easy to lose track of the fact that machine learning model training, deployment, and prediction are often at the end of a series of data pipelines. In the real world, it’s unlikely that your datasets will be pristine and collected specifically for the purpose of learning from the data.

Rather, you’ll usually have to preprocess the data before you can feed it into your TensorFlow model. Common preprocessing steps include de-duplication, scaling/transforming data values, creating vocabularies, and handling unusual situations. The TensorFlow model is then trained on the clean, processed data.

At prediction time, the client sends the same kind of raw data. Yet your TensorFlow model has been trained with de-duplicated, transformed, and cleaned-up data with specific vocabulary mappings. Because your prediction infrastructure might not be written in Python, there is a significant amount of engineering work necessary to build libraries that carry out these tasks with exacting consistency in whatever language or system you use. Often, preprocessing before training is not done in quite the same way as preprocessing before prediction. Even the smallest inconsistency can cause your predictions to behave poorly or unexpectedly. By using Cloud Dataflow to do the preprocessing and Cloud ML Engine to carry out the predictions, it’s possible to minimize or completely avoid this additional engineering work. This is because Cloud Dataflow can apply the same preprocessing transformation code to both historical data during training and real-time data during prediction.
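
The principle of sharing one preprocessing definition between training and prediction can be illustrated in plain Python. This sketch does not use the Cloud Dataflow API; the function names, the (value, category) row layout, and the sample data are all assumptions made for illustration.

```python
def fit_preprocessor(rows):
    """Learn scaling bounds and a vocabulary from (value, category) training rows."""
    unique_rows = list(dict.fromkeys(rows))  # de-duplicate while keeping order
    values = [value for value, _ in unique_rows]
    vocab = {}
    for _, category in unique_rows:
        # Index 0 is reserved for categories never seen during training.
        vocab.setdefault(category, len(vocab) + 1)
    return {"min": min(values), "max": max(values), "vocab": vocab}


def preprocess(row, params):
    """Apply the learned transformation to a single raw row."""
    value, category = row
    span = params["max"] - params["min"]
    scaled = (value - params["min"]) / span if span else 0.0
    return (scaled, params["vocab"].get(category, 0))


# Hypothetical raw training data containing one duplicate row.
training_rows = [(10.0, "red"), (20.0, "blue"), (10.0, "red"), (30.0, "green")]
params = fit_preprocessor(training_rows)

# Training and prediction both go through the same preprocess() function.
train_features = [preprocess(row, params) for row in dict.fromkeys(training_rows)]
request_features = preprocess((20.0, "purple"), params)
print(train_features, request_features)
```

Because `preprocess` and the learned `params` are the only path from raw data to model input, a value seen at prediction time is scaled and mapped exactly as it was during training, and an unseen category falls back to a reserved index rather than an arbitrary one.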

Summary 

Developing new machine learning models is getting easier as TensorFlow adds new APIs and abstraction layers and allows you to run it wherever you want. Cloud Machine Learning Engine is powered by TensorFlow so you aren’t locked into a proprietary managed service, and we’ll even show you how to build your own TensorFlow cluster on Compute Engine if you want. But we think that you might want to spend less time on the engineering work needed to set up your training and prediction environments, and more time tuning, analyzing and perfecting your model. With Cloud Machine Learning Engine, Cloud Datalab, and Cloud Dataflow you can optimize your time. Offload the operational engineering work to us, quickly and easily analyze and visualize your data, and build preprocessing pipelines that are reusable for training and prediction.




Data Science Weekly – Issue 195

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #195

Aug 17 2017

Editor Picks

 

  • Hype or Not? Some Perspective on OpenAI’s DotA 2 Bot
    The OpenAI news came as such a shock. How can this be true? Have there been recent breakthroughs that I wasn’t aware of? As I started looking more into what exactly the DotA 2 bot was doing, how it was trained, and what game environment it was in, I came to the conclusion that it’s an impressive achievement, but not the AI breakthrough the press would like you to believe it is. That’s what this post is about. I would like to offer a sober explanation of what’s actually new…
  • Amazing graphics from the 1950s New York Times archive
    The “morgue” is a smelly storage room in a dark basement just down the street from The New York Times headquarters. About seven million photographs and tens of millions of clippings are stored there. A journalist’s dream, a minimalist’s nightmare…

 


 


Data Science Articles & Videos

 

  • Meet the Bregman Divergences
    What I hope to do in this post is gently introduce you to the Bregman divergences, point out some of their interesting properties, and highlight one result that I found surprising and I believe is underappreciated…
  • Captioning Novel Objects in Images
    The task of visual description aims to develop visual systems that generate contextual descriptions about objects in images. Visual description is challenging because it requires recognizing not only objects (bear), but other visual elements, such as actions (standing) and attributes (brown), and constructing a fluent sentence describing how objects, actions, and attributes are related in an image (such as the brown bear is standing on a rock in the forest)…
  • Simple Square Packing Algorithm
    In a recent project the design asked for a component which shows a small number of values in squares. It was important to represent the relation between the values, so they should be mapped to the area and not the size of the squares…
  • Autoregressive Convolutional Neural Networks for Asynchronous Time Series
    We propose 'Significance-Offset Convolutional Neural Network', a deep convolutional network architecture for multivariate time series regression. The model is inspired by standard autoregressive (AR) models and gating mechanisms used in recurrent neural networks. It involves an AR-like weighting system, where the final predictor is obtained as a weighted sum of sub-predictors while the weights are data-dependent functions learnt through a convolutional network…

 


 

Jobs

 

  • Data Scientist – Qubit – London, UK

    We’re looking for a Data Scientist to join our Research team, to help us develop intelligent products around this data, and conduct cutting-edge research into consumer behaviour on the web.

    This is a great opportunity to conduct real R&D around human behaviour. Our data collection tools store more than 1 billion data points every day. Overall, Qubit technology tracks consumer journeys leading to billions of pounds of online spending worldwide every year, for some of the largest names in online retail.

    We’re looking for someone smart and motivated, with experience solving real data analysis problems with statistical and machine learning techniques. As part of our research team you’ll help to understand our ever growing dataset, working closely with other parts of the business to ensure our products are ahead of the competition…

 


 

Training & Resources

 

  • Pandas tips and tricks
    This post includes some useful tips for how to use Pandas for efficiently preprocessing and feature engineering from large datasets…

  • Python Data Science Handbook
    This website contains the full text of the Python Data Science Handbook by Jake VanderPlas; the content is available on GitHub in the form of Jupyter notebooks…

 


 

Books

 

  • The Book of R: A First Course in Programming and Statistics

    "The Book of R is a comprehensive, beginner-friendly guide to R, the world’s most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you’ll find everything you need to begin using R effectively for statistical analysis"

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.

 


 
P.S. Want to break into Data Science? We've put together a comprehensive guide to get you started. Check it out here! 🙂 – All the best, Hannah & Sebastian

Follow on Twitter
Copyright © 2013-2017 DataScienceWeekly.org, All rights reserved.

Data Science Weekly – Issue 194

Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #194

Aug 10 2017

Editor Picks

 

  • A computer was asked to predict which start-ups would be successful. The results were astonishing

    In 2009, Ira Sager of Businessweek magazine set a challenge for Quid AI's CEO Bob Goodson: programme a computer to pick 50 unheard of companies that are set to rock the world. Nearly eight years later, the magazine revisited the list to see how “Goodson plus the machine” had performed. The results surprised even Goodson: Evernote, Spotify, Etsy, Zynga, Palantir, Cloudera, OPOWER – the list goes on…
  • Dots vs. polygons: How I choose the right visualization
    When I start designing a map I consider: How do I want the viewer to read the information on my map? Do I want them to see how a measurement varies across a geographic area at a glance? Do I want to show the level of variability within a specific region? Or do I want to indicate busy pockets of activity or the relative volume/density within an area?…
  • PyTorch vs TensorFlow — spotting the difference
    In this post I want to explore some of the key similarities and differences between two popular deep learning frameworks: PyTorch and TensorFlow. Why those two and not the others? There are many deep learning frameworks and many of them are viable tools, I chose those two just because I was interested in comparing them specifically…

 


 


Data Science Articles & Videos

 

  • Transitioning entirely to neural machine translation
    Creating seamless, highly accurate translation experiences for the 2 billion people who use Facebook is difficult. We need to account for context, slang, typos, abbreviations, and intent simultaneously. To continue improving the quality of our translations, we recently switched from using phrase-based machine translation models to neural networks to power all of our backend translation systems, which account for more than 2,000 translation directions and 4.5 billion translations each day…
  • An Algorithm Trained on Emoji Knows When You’re Being Sarcastic on Twitter
    Detecting the sentiment of social-media posts is already useful for tracking attitudes toward brands and products, and for identifying signals that might indicate trends in the financial markets. But more accurately discerning the meaning of tweets and comments could help computers automatically spot and quash abuse and hate speech online. A deeper understanding of Twitter should also help academics understand how information and influence flows through the network. What’s more, as machines become smarter, the ability to sense emotion could become an important feature of human-to-machine communication…
  • Whiz Kid Invents an AI System to Diagnose Her Grandfather's Eye Disease
    Kopparapu and her team—including her 15-year-old brother, Neeyanth, and her high school classmate Justin Zhang—trained an artificial intelligence system to recognize signs of diabetic retinopathy in photos of eyes and offer a preliminary diagnosis. She presented the system last month…
  • Exploring the census income dataset using bubble plot
    When exploring a data set, we look at the connection between different features in the data and between the features and the target. This can give us a lot of insights about how we should formulate the problem, the required preprocessing (missing values, normalization), which algorithm we should use to build our model, whether we should segment our data and build different models for different subsets of our dataset, etc…

 


 

Jobs

 

  • Data Scientist – BuzzFeed – New York City, USA

    BuzzFeed’s data science team is diverse, coming from varying backgrounds, experiences, and skill sets. The team uses data-driven methods to power decisions, inform strategy, build robust data products, and identify opportunities for innovation across the company. We are true hybrids – software engineers, statisticians, mathematicians, domain experts and analysts – who specialize in translating questions into methodical approaches, experiments, and products. We think deeply about the limitations of data, and communicate our output coherently…

 


 

Training & Resources

 

  • hipsteR: re-educating people who learned R before it was cool
    I was an early adopter of R, having first learned S (yay!) and then S-plus (yuck!). But at times my knowledge of R seems stuck in 2001. I keep finding out about “new” R functions (like replicate, which was new in 2003). This is a tutorial for people like me, or people who were taught by people like me…

  • Tidyverse
    Welcome to the new and improved tidyverse website. We are working hard to make tidyverse.org the place to go to learn the tidyverse and to keep up to date with it as it evolves…

  • Diamond Part 1
    We are excited to announce Diamond, an open-source Python solver for certain kinds of generalized linear models. This post covers the mathematics used by Diamond. The sister post covers the specifics of diamond. If you just want to use the package, check out the Github page…

 


 

Books

 

  • The Book of R: A First Course in Programming and Statistics

    "The Book of R is a comprehensive, beginner-friendly guide to R, the world’s most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you’ll find everything you need to begin using R effectively for statistical analysis"

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.

 


 
P.S. Want to be a Data Scientist? We've put together a comprehensive guide to help get you started. Check it out here! 🙂 – All the best, Hannah & Sebastian


A Brief Practical Introduction to Deep Learning with TensorFlow (Breve introducción práctica al Deep Learning con Tensorflow)


NEWS: A new book on TensorFlow, titled “HELLO WORLD EN TENSORFLOW para iniciarse en la programación del Deep Learning”, will be available to read on this website following its launch next Monday, February 1. Everyone is invited to the launch! The print edition is already available on lulu.com (and soon on amazon.com), and some copies will be on sale at the launch. NOTE: The information in this collection of the 8 posts on TensorFlow has been fully updated and expanded in this new book.

The post Breve introducción práctica al Deep Learning con Tensorflow appeared first on Jordi Torres – Professor and Researcher at UPC & BSC: Supercomputing for Artificial Intelligence and Deep Learning.


TensorFlow Serving 1.0


Posted by Kiril Gorovoy, Software Engineer

We’ve come a long way since our initial open source release in February 2016 of TensorFlow Serving, a high performance serving system for machine learned models, designed for production environments. Today, we are happy to announce the release of TensorFlow Serving 1.0. Version 1.0 is built from TensorFlow head, and our future versions will be minor-version aligned with TensorFlow releases.

For a good overview of the system, watch Noah Fiedel’s talk given at Google I/O 2017.

When we first announced the project, it was a set of libraries providing the core functionality to manage a model’s lifecycle and serve inference requests. We later introduced a gRPC Model Server binary with a Predict API and an example of how to deploy it on Kubernetes. Since then, we’ve worked hard to expand its functionality to fit different use cases and to stabilize the API to meet the needs of users. Today there are over 800 projects within Google using TensorFlow Serving in production. We’ve battle-tested the server and the API and have converged on a stable, robust, high-performance implementation.

We’ve listened to the open source community and are excited to offer a prebuilt binary available through apt-get install. Now, to get started using TensorFlow Serving, you can simply install and run without needing to spend time compiling. As always, a Docker container can still be used to install the server binary on non-Linux systems.

With this release, TensorFlow Serving is also officially deprecating and ending support for the legacy SessionBundle model format. SavedModel, TensorFlow’s model format introduced as part of TensorFlow 1.0, is now the officially supported format.

To get started, please check out the documentation for the project and our tutorial. Enjoy TensorFlow Serving 1.0!




Independent research firm names Google Cloud the Insight PaaS Leader


Forrester Research, a leading analyst firm, just named Google Cloud Platform (GCP) the leader in The Forrester Wave™: Insight Platforms-As-A-Service, Q3 2017, its analysis of cloud providers offering Platform as a Service. According to the report, an insight PaaS makes it easier to:

  • Manage and access large, complex data sets
  • Update and evolve applications that deliver insight at the moment of action
  • Update and upgrade technology
  • Integrate and coordinate team member activities

For this Wave, Forrester evaluated eight separate vendors. It looked at 36 evaluation criteria spanning three broad buckets: current offering, strategy, and market presence.

Of the eight vendors, Google Cloud’s insight PaaS scored highest for both current offering and strategy.

“Google was the only vendor in our evaluation to offer insight execution features like full machine learning automation with hyperparameter tuning, container management and API management. Google will appeal to firms that want flexibility and extreme scalability for highly competent data scientists and cloud application development teams used to building solutions on PaaS.” (The Forrester Wave: Insight Platforms-As-A-Service, Q3 2017)

Our presence in the Insight Platform as a Service market goes way back. We started with a vision for serverless computing back in 2008 with Google App Engine and added serverless data processing in 2010 with Google BigQuery. In 2016 we added machine learning (Cloud Machine Learning Engine) to GCP to help bring the power of TensorFlow (Google’s open source machine learning framework) to everyone. We continue to be amazed by what companies like Snap and The Telegraph are doing with these technologies and look forward to building on these insight services to help you build the amazing applications of tomorrow.

Sign up here to get a complimentary copy of the report.




Data Science Weekly – Issue 193

.headerContent a:link,.headerContent a:visited,.headerContent a .yshortcuts {
color:#000000;
font-weight:normal;
text-decoration:underline;
}
#headerImage{
height:auto;
max-width:600px !important;
}
#templateBody{
border-top:0;
border-bottom:0;
}
.bodyContent{
color:#000000;
font-family:Helvetica;
font-size:16px;
line-height:150%;
padding-top:40px;
padding-bottom:40px;
text-align:left;
}
.bodyContent a:link,.bodyContent a:visited,.bodyContent a .yshortcuts {
color:#FF0000;
font-weight:normal;
text-decoration:none;
}
.bodyContent img{
display:inline;
height:auto;
max-width:600px !important;
}
#templateFooter{
border-top:2px solid #000000;
border-bottom:20px solid #000000;
}
.footerContent{
color:#000000;
font-family:Helvetica;
font-size:10px;
line-height:150%;
padding-top:20px;
padding-bottom:20px;
text-align:center;
}
.footerContent a:link,.footerContent a:visited,.footerContent a .yshortcuts,.footerContent a span {
color:#000000;
font-weight:bold;
text-decoration:none;
}
.footerContent img{
display:inline;
height:auto;
max-width:600 !important;
}
@media only screen and (max-width: 500px){
body,table,td,p,a,li,blockquote{
-webkit-text-size-adjust:none !important;
}

} @media only screen and (max-width: 500px){
body{
width:auto !important;
}

} @media only screen and (max-width: 500px){
td[id=bodyCell]{
padding:10px;
}

} @media only screen and (max-width: 500px){
table[id=templateContainer]{
max-width:600px !important;
width:75% !important;
}

} @media only screen and (max-width: 500px){
h1{
font-size:40px !important;
line-height:100% !important;
}

} @media only screen and (max-width: 500px){
h2{
font-size:20px !important;
line-height:100% !important;
}

} @media only screen and (max-width: 500px){
h3{
font-size:18px !important;
line-height:100% !important;
}

} @media only screen and (max-width: 500px){
h4{
font-size:16px !important;
line-height:100% !important;
}

} @media only screen and (max-width: 500px){
table[id=templatePreheader]{
display:none !important;
}

} @media only screen and (max-width: 500px){
td[class=headerContent]{
font-size:20px !important;
line-height:150% !important;
}

} @media only screen and (max-width: 500px){
td[class=bodyContent]{
font-size:18px !important;
line-height:125% !important;
}

} @media only screen and (max-width: 500px){
td[class=footerContent]{
font-size:14px !important;
line-height:150% !important;
}

} @media only screen and (max-width: 500px){
td[class=footerContent] a{
display:block !important;
}

}


Curated news, articles and jobs related to Data Science. 
Keep up with all the latest developments

Issue #193

Aug 3 2017

Editor Picks

 

  • Inside Salesforce’s Quest to Bring Artificial Intelligence to Everyone
    Starting two years ago, a band of artificial-intelligence acolytes within Salesforce escaped the towering headquarters with the goal of crazily multiplying the impact of the machine learning models that increasingly shape our digital world—by automating the creation of those models. As shoppers checked out sofas above their heads, they built a system to do just that…
  • Machines Are Developing Language Skills Inside Virtual Worlds
    Now teams at DeepMind, an AI-focused subsidiary of Alphabet, and Carnegie Mellon University have developed a way for machines to figure out simple principles of language for themselves inside 3-D environments based on first-person shooter computer games…

 


 


Data Science Articles & Videos

 

  • Machine Learning Infrastructure at Stripe
    Machine learning at Stripe has a foundation built on Python and the PyData stack, with scikit-learn and pandas continuing to be core components of an ML pipeline that feeds a production system written in Scala. This talk will cover the ML Infra team’s work to bridge the serialization and scoring gap between Python and the JVM, as well as how ML Engineers ship models to production…

  • Fashioning with Networks: Neural Style Transfer to Design Clothes
    In this paper, the neural style transfer algorithm is applied to fashion so as to synthesize new custom clothes. We construct an approach to personalize and generate new custom clothes based on a user’s preference and by learning the user’s fashion choices from a limited set of clothes from their closet…
  • The AI Hierarchy of Needs
    Think of AI as the top of a pyramid of needs. Yes, self-actualization (AI) is great, but you first need food, water and shelter (data literacy, collection and infrastructure)…
  • Natural Language Processing with Small Feed-Forward Networks
    We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models…
  • Earth from Space
    Analyzing DigitalGlobe’s high resolution satellite data to obtain a detailed representation of land use on Earth…
  • Predicting Personality from Book Preferences with User-Generated Content Labels
    Psychological studies have shown that personality traits are associated with book preferences. However, past findings are based on questionnaires focusing on conventional book genres and are unrepresentative of niche content. For a more comprehensive measure of book content, this study harnesses a massive archive of content labels, also known as 'tags', created by users of an online book catalogue, Goodreads.com…

 


 

Jobs

 

  • Junior Data Scientist/Data Scientist – Penguin Random House US – New York City, USA

    The Data Science & Analytics group at Penguin Random House is seeking a Junior Data Scientist/Data Scientist…In this role, you will have an opportunity to work on a variety of high-profile projects under the mentorship of Senior Data Scientists and in collaboration with key decision makers across the organization…We are an agile team of data scientists and software engineers. The team has a wide mandate encompassing pricing systems, recommendation / personalization systems, title segmentation, supply chain, as well as ad-hoc analysis and data exploration….

 


 

Training & Resources

 

  • Deep recommender models using PyTorch.
    Spotlight uses PyTorch to build both deep and shallow recommender models. By providing both a slew of building blocks for loss functions (various pointwise and pairwise ranking losses), representations (shallow factorization representations, deep sequence models), and utilities for fetching (or generating) recommendation datasets, it aims to be a tool for rapid exploration and prototyping of new recommender models…

 


 

Books

 

  • Text Mining with R: A Tidy Approach

    Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you’ll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You’ll learn how tidytext and other tidy tools in R can make text analysis easier and more effective….

    For a detailed list of books covering Data Science, Machine Learning, AI and associated programming languages check out our resources page.

 


 
P.S. Looking to hire a Data Scientist? Find an awesome one among our readers! Email us for details on how to post your job 🙂 – All the best, Hannah & Sebastian

Copyright © 2013-2016 DataScienceWeekly.org, All rights reserved.

How AI can help make safer baby food (and other products)

By | machinelearning, TensorFlow

Editor’s note: Whether you’re growing cucumbers or building your own robot arm, machine learning can help. In this guest editorial, Takeshi Ogino of Kewpie tells us how they used machine learning to ensure the quality and safety of the ingredients that go into their food products.

Quality control is a challenge for most industries, but in the world of food production, it’s one of the biggest. With food, products are only as good as the ingredients that go into them. Raw materials can vary dramatically, from produce box to produce box, or even from apple to apple. This means inspecting and sorting the good ingredients from the bad is one of the most important tasks any food company does. But inspecting by hand is time-consuming and arduous, in terms of both overhead and manpower. So what’s a food company to do?

At Kewpie Corporation, we turned to a surprising place to explore better ways to ensure food quality: artificial intelligence built on TensorFlow.

Although Kewpie Corporation is most famous for our namesake mayonnaise, we’ve been around for 100 years with dozens of products, from dressings to condiments to baby foods. We’ve always believed that good products begin with good ingredients.

Ingredients that are safe and also give you peace of mind

Last October, we began investigating whether AI and machine learning could ensure the safety and purity of our ingredients faster and more reliably than ever.

The project began with a simple question: “What does it mean to be a ‘good’ ingredient?” The ingredients we purchase must be safe, of course, and from trustworthy producers. But we didn’t think that went far enough. Ingredients also need to offer peace of mind. For example, the color of potatoes can vary in ways that have nothing to do with safety or freshness.

Kewpie depends on manual visual detection and inspection of our raw ingredients. We inspect the entire volume of ingredients used each day, which, at four to five tons, is a considerable workload. The inspection process requires a certain level of mastery, so scaling this process is not easy. At times we’ve been bottlenecked by inspections, and we’ve struggled to boost production when needed.

We’d investigated the potential for mechanizing the process a number of times in the past. However, the standard technology available to us, machine vision, was not practical in terms of precision or cost. Using machine vision meant setting sorting definitions for every ingredient. At the Tosu Plant alone we handle more than 400 types of ingredients, and across the company we handle thousands.

That’s when I began to wonder whether using machine learning might solve our problem.

Using unsupervised machine learning to detect defective ingredients

We researched AI and machine learning technology across dozens of companies, including some dedicated research organizations. In the end, we decided to go with TensorFlow. We were impressed with its capabilities as well as the strength of its ecosystem, which is global and open. Algorithms that are announced in papers get implemented quickly, and there’s a low threshold for trying out new approaches.

One great thing about TensorFlow is that it has such a broad developer community. Through Google, we connected with our development partner, BrainPad Inc., who impressed us with their ability to deliver production-level solutions using the latest deep learning techniques. But even BrainPad, who had developed a number of systems to detect defective products in manufacturing processes, had never encountered a company with stricter inspection standards than ours. Furthermore, because our inspections are carried out on conveyor belts, they had to be extremely accurate at high speeds. Achieving that balance between precision and speed was a challenge BrainPad looked forward to tackling.

Sorting diced potato pieces at the Tosu Plant.

To kick off the project, we started with one of our most difficult inspection targets: diced potatoes. Because they’re an ingredient in baby food, diced potatoes are subject to the strictest scrutiny both in terms of safety and peace of mind. That meant feeding more than 18,000 line photographs into TensorFlow so that the AI could thoroughly learn the threshold between acceptable and defective ingredients.

Our big breakthrough came when we decided to use the AI not as a “sorter” but as an “anomaly detector.” Designing the AI as a sorter meant supervised learning, an approach that requires a label for each training instance. In this case, that meant feeding TensorFlow an enormous volume of data on both acceptable and defective ingredients, and it was hugely challenging for us to collect enough defective samples. By training the system as an anomaly detector instead, we could employ unsupervised learning, which meant we only needed to feed it data on good ingredients. The system was then able to learn what acceptable ingredients look like and reject as defective any that failed to match. With this approach, we achieved both the precision and speed we wanted, with fewer defective samples overall.
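To make the idea of learning only from good ingredients concrete, here is a minimal sketch in plain Python. It is not the model Kewpie and BrainPad built on TensorFlow, whose details the post does not give; it swaps in a much simpler technique (per-feature z-scores) purely for illustration. The two features (brightness, redness), the sample values, and the 0.99 quantile are all assumptions invented for the example.

```python
import random
from statistics import mean, stdev


def fit_good_profile(good_samples):
    """Learn per-feature mean and standard deviation from acceptable samples only."""
    columns = list(zip(*good_samples))
    return [(mean(col), stdev(col)) for col in columns]


def anomaly_score(profile, sample):
    """Largest absolute z-score across features: how far the sample sits from 'good'."""
    return max(abs(x - mu) / sigma for x, (mu, sigma) in zip(sample, profile))


def choose_threshold(profile, good_samples, quantile=0.99):
    """Pick the score below which roughly `quantile` of the good training samples fall."""
    scores = sorted(anomaly_score(profile, s) for s in good_samples)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


def is_defective(profile, threshold, sample):
    """Flag anything that looks less like the good samples than the threshold allows."""
    return anomaly_score(profile, sample) > threshold


# Hypothetical features per potato piece: (brightness, redness). Values are made up.
rng = random.Random(0)
good = [(rng.gauss(0.80, 0.03), rng.gauss(0.20, 0.02)) for _ in range(500)]

# Training uses acceptable samples only; no defective examples are needed.
profile = fit_good_profile(good)
threshold = choose_threshold(profile, good)

typical_piece = (0.80, 0.20)      # at the centre of the good distribution
discoloured_piece = (0.40, 0.60)  # far darker and redder than anything in training

print(is_defective(profile, threshold, typical_piece))
print(is_defective(profile, threshold, discoloured_piece))
```

The point of the sketch is the shape of the workflow: the detector never sees a defective example during training, and the threshold is set from the good data alone, so anything that falls outside the learned profile is rejected.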

By early April, we were able to test a prototype at the Tosu Plant. There, we ran ingredients through the conveyor belt and had the AI identify which ones were defective. We had great results. The AI picked out defective ingredients with near-perfect accuracy, which was hugely exciting to our staff.

The inspection team at the Tosu Plant.

It’s important to note that our goal has always been to use AI to help our plant staff, not replace them. The AI-enabled inspection system performs a rough removal of defective ingredients, then our trained staff inspects that work to ensure nothing slips through. That way we get “good” ingredients faster than ever and are able to process more food and boost production.

Today we may only be working with diced potatoes, but we can’t wait to expand to more ingredients like eggs, grains and so many others. If all goes well, we hope to offer our inspection system to other manufacturers who might benefit. Existing inspection systems such as machine vision have not been universally adopted in our industry because they’re expensive and require considerable space. So there’s no question that the need for AI-enabled inspection systems is critical. We hope, through machine learning, we’re bringing even more safe and reassuring products to more people around the world.




Guest post: How Seenit uses Google Cloud Platform and Couchbase to power our video collaboration platform

By | machinelearning, TensorFlow

Editor’s Note: In this guest post, Seenit CTO Dave Starling walks us through how they use Google Cloud Platform (GCP) and Couchbase to build their innovative crowdsourced video platform.

Since we started Seenit in 2014, our goal has been to give businesses the tools to tell interesting stories through crowdsourced video. But getting there wasn’t simple. What we envisioned for Seenit didn’t exist at the time we started, challenging us to define our product architecture from the ground up. We learned a lot, which is why today I thought I’d share a little on how we’re using Couchbase and GCP to bring Seenit to life.

When we first began looking at what we wanted to build as a platform, we came up with a list of requirements for our database and cloud provider. We chose to run Couchbase on GCP because it offered us distributed architecture that’s highly scalable and available globally. Our clients are typically large enterprises, sometimes in dozens of countries all over the world. We wanted to make sure that everyone, no matter where they are, could get a consistently good user experience.

By applying Couchbase’s N1QL and Full Text Search (FTS) with Google Cloud Machine Learning APIs, our customers can easily filter submissions by objects, words or phrases. And because everything is on GCP, we can duplicate our entire platform within minutes on 12 VMs.

Here’s how it works:

  1. We use Google Compute Engine to autoscale between two and 20 servers.
  2. Google Cloud Storage allows for unified object storage and retrieval. Near-infinite scalability means the service is capable of handling everything from small applications to builds of exabyte-scale systems.
  3. Couchbase’s Full Text Search (FTS) enables us to examine all the words in every document and match them with designated criteria.
  4. Cloud Machine Learning APIs sort clips by objects, gender of speakers and sentiment. The APIs all speak the same language so communication is seamless.
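As a rough illustration of filtering submissions by words, objects and sentiment, the sketch below runs the same kind of query over a few in-memory JSON documents. It is not Seenit’s implementation and it does not call Couchbase FTS, N1QL or the Cloud Machine Learning APIs; the document fields (`transcript`, `labels`, `sentiment`) and the helper names are hypothetical, chosen only to show the idea.

```python
import json

# Hypothetical submission documents; every field name here is invented for illustration.
SUBMISSIONS = json.loads("""
[
  {"id": "clip-1", "transcript": "We launched the new office in Berlin",
   "labels": ["office", "person"], "sentiment": 0.7},
  {"id": "clip-2", "transcript": "The commute was terrible today",
   "labels": ["car", "street"], "sentiment": -0.6},
  {"id": "clip-3", "transcript": "Our Berlin team loves the new kitchen",
   "labels": ["kitchen", "person"], "sentiment": 0.9}
]
""")


def matches(doc, phrase=None, label=None, min_sentiment=None):
    """Return True when the document satisfies every filter that was supplied."""
    if phrase is not None and phrase.lower() not in doc["transcript"].lower():
        return False
    if label is not None and label not in doc["labels"]:
        return False
    if min_sentiment is not None and doc["sentiment"] < min_sentiment:
        return False
    return True


def filter_submissions(docs, **filters):
    """Return the ids of all documents that pass the given filters, in input order."""
    return [doc["id"] for doc in docs if matches(doc, **filters)]


# Word or phrase filter, standing in for a full text search.
berlin_clips = filter_submissions(SUBMISSIONS, phrase="berlin")

# Object label plus sentiment filter, standing in for ML-derived metadata.
positive_people = filter_submissions(SUBMISSIONS, label="person", min_sentiment=0.8)

print(berlin_clips)
print(positive_people)
```

In a real deployment the text match and the metadata match would be pushed down to the database rather than looped over in application code; the sketch only shows how text-derived and ML-derived fields can sit in the same JSON document and be combined in one query.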

Last year, when we began looking for a machine learning platform, we wanted something that would talk JSON, store JSON and search JSON. We knew a machine learning platform that did all of that would integrate nicely into our Couchbase system. TensorFlow fit our criteria. We love that it isn’t restricted. We can build our own domain-specific models and use Google tools to train them.

Although TensorFlow is an open source machine learning platform, we use it through Cloud Machine Learning Engine. It’s a fully managed service, which is great for us because we don’t need to build and manage our own hardware. This allows us to do a lot of manipulation and extract a lot of really interesting data. It’s fully integrated with Couchbase, especially with Full Text Search but also with N1QL, so we can search and extract intelligence and provide value to our customers. It’s a serverless architecture with the advantage of the custom hardware that Google has started building.

It’s also been great that we feel engaged with the community and with the product and engineering teams. As a startup, it’s important to feel like you can stand on the shoulders of giants, so to speak. The support we get from organizations like Google and Couchbase allows us to do lots of things that we otherwise wouldn’t be able to do with the resources we have.

There’s plenty more to share, but I’ll stop here. If you want to learn more, you might want to check out the joint talk GCP Product Manager Anil Dhawan and I recently gave at Couchbase Connect.

I also recommend checking out Couchbase and other tools on Cloud Launcher. You can use free trial credits to play around and even deploy something of your own. Good luck!




Data Science Weekly – Issue 190

By | machinelearning, TensorFlow


Issue #190

July 13 2017

Editor Picks

 

  • Technical Debt in Machine Learning
    Experienced teams know when to back up seeing a piling debt, but technical debt in machine learning piles extremely fast. You can create months worth of debt in a matter of one working day and even the most experienced teams can miss a moment when the debt is so huge that it sets them back for half a year, which is often enough to kill a fast-paced project…

 


 

A Message from this week's Sponsor:

 

 
STPF is the premier opportunity for outstanding scientists and engineers to learn first-hand about policymaking while contributing their knowledge and analytical skills to address some of today’s most pressing societal challenges. Enhance your career while engaging with policy administrators and thought leaders.

For over 43 years, doctoral level scientists, social scientists, engineers, and health/medical professionals have applied their knowledge and technical expertise to policymaking at the national and international levels. Fellows serve yearlong assignments in all three branches of the federal government and represent a broad range of backgrounds, disciplines and career stages.

For more information, visit: go.stpf-aaas.org/DSW
 


 

Data Science Articles & Videos

 

  • Are Search Engines Fair? Auditing Search Engines for Differential Satisfaction
    Many online services, such as search engines, social media platforms, and digital marketplaces, are advertised as being available to any user, regardless of their age, gender, or other demographic factors. However, there are growing concerns that these services may systematically underserve some groups of users…
  • Where Machine Learning meets rule-based verification
    This post addresses some high-level questions like: Longer term, how much of the verification of Intelligent Autonomous Systems can be done with just Machine Learning (ML)? Should most requirements remain rule-based, and if so – how does that connect to the ML part? And how will the uneasy interface between ML and rules influence general ML-based systems?…
  • Privacy-preserving generative deep neural networks support clinical data sharing
    Though it is widely recognized that data sharing enables faster scientific progress, the sensible need to protect participant privacy hampers this practice in medicine. We train deep neural networks that generate synthetic subjects closely resembling study participants. Using the SPRINT trial as an example, we show that machine-learning models built from simulated participants generalize to the original dataset…
  • The Confluence of Geometry and Learning
    The learning signal for our 3D perception capability likely comes from making consistent connections among different perspectives of the world that only capture partial evidence of the 3D reality. We present methods for building 3D prediction systems that can learn in a similar manner…
  • Lessons learned from building a Hello World Neural Network
    I remember being impressed by a model that generates natural language descriptions of images and their regions, developed at Stanford University in 2015, and thinking that I would like to be able to do similar things at some point. So I started searching…
  • Recommendation System Algorithms
    Today, many companies use big data to make super relevant recommendations and grow revenue. Among a variety of recommendation algorithms, data scientists need to choose the best one according to a business’s limitations and requirements. To simplify this task, the Statsbot team has prepared an overview of the main existing recommendation system algorithms…

  • Controlling Linguistic Style Aspects in Neural Language Generation
    Most work on neural natural language generation (NNLG) focuses on controlling the content of the generated text. We experiment with controlling several stylistic aspects of the generated text, in addition to its content. The method is based on a conditioned RNN language model, where the desired content as well as the stylistic parameters serve as conditioning contexts…

 


 

Jobs

 

  • Data Scientist – Hello Fresh – Berlin, Germany

    We are looking for a smart, result-oriented individual who can translate data insights into recommendations driving high-end business value across areas of demand management, marketing, customer lifecycle, and product development. Our ideal candidate has solid background in data science, including predictive modelling, forecasting and validation techniques. So if you are passionate about finding answers in scientific investigation and leading new solutions, feel invited to apply!…

 


 

Training & Resources

 

  • Neural Networks
    Nice collection of slides & pointers on poorly understood / unintuitive properties of Neural Networks…

 


 

 

 


 