How to build deep learning open source communities online

IoT Now asked Adam Gibson and Chris Nicholson, the founders of Deeplearning4j — the first commercial-grade, open source, distributed neural net library written for Java and Scala, with one of the most active communities on Gitter — to share their thoughts and lessons learned on open source community building.

IoT Now: Tell us a little bit about yourself and the Deeplearning4j community. How did it all begin?

Chris: We started building Deeplearning4j in late 2013. Adam had been involved with machine learning for about four years at that time, and deep artificial neural networks were looking more and more promising. The first network in Deeplearning4j was a restricted Boltzmann machine, since that was the net that Geoff Hinton had come up with back in 2006, which was the turning point in the field. I was working for another startup doing PR and recruiting, and had previously worked as a journalist, so I took care of the documentation (and still do), since we believed that proper communication was key to making open source code valuable.

IoT Now: What are the main issues discussed in the deeplearning4j channel?

The main issue used to be installation. Engineers in the community taught us a lot about how to write clearer instructions, and how to make the code and experience better. If we hadn’t had that feedback loop, Deeplearning4j would be worse. Open source communities are amazing for quality control! The sooner you fix an issue, the fewer demands you get from the community about that issue. It’s a great incentive to move quickly.

Now the main issues are loading data and tuning neural nets. We are working on communicating better about that, and on making the framework better, so that ETL and tuning get easier. Finally, there are a lot of basic questions about machine learning and deep learning. Many software engineers have figured out that deep learning and machine learning are really powerful tools, so they’re trying to grasp new ideas. We’ve written a lot of introductory material, and we point them to various web pages where those ideas are explained.

IoT Now: What common goals do you have as a community?

The community is centered around Deeplearning4j and our scientific computing library, ND4J, which powers the neural nets. So we answer questions about how to use the libs, and in the process, we help people understand more about deep learning in general. It’s not a deep learning hotline, unfortunately, so there are some questions we don’t tend to answer. But we do help engineers in the DL4J community build apps and understand how neural nets work. The common goal is to learn about deep learning, and to build cool shit.

We’ve only seen the tip of the iceberg in terms of what deep learning can do. So far, we’ve seen huge advances in image recognition, machine translation, machine transcription and time series predictions. By many metrics, machine perception now equals or surpasses human perception, and that will change society in ways that are hard to imagine. Those changes just haven’t been implemented yet. So the secondary goal of the community is to bring this narrow form of AI into the world, so that it can make a difference.

IoT Now: What are the most important factors that you have taken into account while creating and maintaining the community? What factors contribute to its success?

Creating and maintaining a community is a huge commitment of time and effort. You have to be available, and you have to try to understand where other people are coming from. They don’t always know the jargon to ask precise questions, so you have to have the patience to figure out together with them what they’re trying to ask, or where they’re stuck.

We’re not always as patient as we should be. Being available, making that effort, and offering support for powerful tools like this are a good way to build a community. When the makers of a big project are available to answer esoteric questions about how it works, that creates a lot of trust, because people know that you speak with authority and that if something is really broken, it’s going to get fixed. There’s a tight feedback loop between the community and the project creators.

IoT Now: What are the key challenges that you encounter while managing the community?

One of the challenges is: What questions do we care about, and what questions do people need to answer for themselves? If someone has really basic questions about Java, an IDE like IntelliJ, or a build tool like Maven, most of the time they need to figure that out for themselves. Our Gitter channel isn’t the right place to hash through that, although we do help in special cases, because sometimes you need to expand your heap space for neural nets to work.

You also have to find a balance between building the community and building the product. Ideally, you’d have a big team with full-time support engineers and the rest of the team working on the code base. But most open-source projects have very small teams. There are just a handful of people capable of support, and they’re the ones who also should be fixing bugs and adding features.

IoT Now: How do you encourage participants’ commitment and contribution to the community?

You create a smart, friendly environment in the community. You remind them you appreciate contributions, and you show them, as best you can, what needs to be worked on. We created top-level files recognising our contributors, showing people how to contribute, and laying down the rules of the community. We also wrote a devguide, and we now label all issues as bug, enhancement or documentation, so that people can scan the queue quickly and explore where they can add something.

IoT Now: Tell us about the time commitment required to set up and establish the community. How much community maintenance is required on an ongoing basis?

Skymind is a distributed team, with engineers in Australia, Europe and the US, and Deeplearning4j community members in almost every time zone. There’s a Skymind engineer watching the Gitter queue probably 12–16 hours out of any weekday. This is a pretty serious commitment, because there are fewer than 10 of us. It’s not their full-time job, but maybe they’ll be running unit tests and answering questions on Gitter in their downtime.

IoT Now: Based on your experience, do you feel that open source communities have evolved over the past few years? If so, how?

Open source is winning the enterprise stack, so it’s a lot more important than it used to be. The biggest organisations in the world are running on open source software. Linux won the operating system, Hadoop won big data storage. And open source won because when you do it right, you get better code. More eyeballs mean more uptime. So the size of the OSS community, and the quality of attention that software engineers bring to open source projects, have both increased over the years.

IoT Now: What advice would you give to someone who wants to start an online open source community from scratch?

First, build something neat, something you care about. Focus on building one thing that works. Then, share it with people. They will help you improve it, and they may help you think about what to build next. Don’t do too much upfront development. Try to scope it so that you can ship in a reasonable amount of time. A few weeks, say. Open source is valuable because it’s a conversation, and the conversation leads you places, so that you and the project evolve in ways you can’t anticipate. Also, by open-sourcing early, you’re increasing your exposure and therefore your chances of getting help. We’ve had amazing developers join the community and the Skymind team.

IoT Now: What digital tools do you use to help manage and grow your community?

The code lives on GitHub, and the conversation lives on Gitter. There are about 1360 devs on the Gitter channel now, so it’s probably one of the more lively neural net conversations on the planet. Our website is hosted on GitHub, so the content lives there, too. We generate a lot of automatic documentation with Javadoc (always a WIP…). We ask people to use Maven as their automated build tool. One of the biggest problems with any software is the install, and Maven helps make that a little easier. You need to constantly try to clear away obstacles, so that people can just use your code and not worry about other stuff.

IoT Now: Can you share a success story of a community member that happened thanks to their participation in your channel?

For most of the stories, you just had to be there. But in general, a lot of data scientists and Java engineers come, and they just build something for their companies that works. They’ll come back later and say: “We saw a 200% increase in ad coverage when we made DL4J part of the recommender system.” Another guy built an app with DL4J and then an investor saw it and he raised funds. So that’s all pretty cool. With open source, you’re throwing a rock out into the ocean, and you don’t always hear it hit the water. You can’t even see the ripples. So it’s encouraging when people come back and say “thanks” and tell us how it helped them. That makes it more meaningful.

IoT Now: Thanks.

Gitter is well supported among the developer community, with over 300,000 regularly active users. Popular software communities using Gitter include .NET, Node.js and Meteor. Visit the Deeplearning4j community on Gitter.
