15 Top Open Source Artificial Intelligence Tools
Artificial intelligence (AI) is one of the most popular areas of scientific research. Companies like IBM, Google, Microsoft, Facebook and Amazon are investing heavily in their own R&D or acquiring start-ups that have made progress in machine learning, neural networks, natural language processing and image processing. Given this level of interest, the conclusion Stanford experts reached in their AI report should come as no surprise: "The growing use of AI may have a far-reaching positive impact on our society and economy, which will occur between now and 2030."
In a recent article, we outlined 45 very interesting or promising AI projects. In this article, we focus on open source AI tools and take a closer look at 15 of the best-known open source AI projects.
The following open source AI applications are at the forefront of AI research.
1. Caffe
Caffe was created by Yangqing Jia during his PhD at the University of California, Berkeley. It is a deep learning framework built around an expressive architecture and extensible code. It is best known for its speed, which has made it popular with both researchers and enterprise users. According to its website, it can process more than 60 million images per day on a single NVIDIA K40 GPU. It is managed by the Berkeley Vision and Learning Center (BVLC) and supported by NVIDIA and Amazon.
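To put that throughput figure in per-image terms, the short calculation below converts images per day into images per second and milliseconds per image. It is plain arithmetic on the number quoted above, not a benchmark, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_image_latency_ms(images_per_day: float) -> float:
    """Average time per image, in milliseconds, at a given daily throughput."""
    return SECONDS_PER_DAY / images_per_day * 1000.0


# Figure quoted from the Caffe website (single NVIDIA K40 GPU).
images_per_day = 60_000_000

images_per_second = images_per_day / SECONDS_PER_DAY
latency_ms = per_image_latency_ms(images_per_day)

print(f"{images_per_second:.0f} images/s")  # 694 images/s
print(f"{latency_ms:.2f} ms/image")         # 1.44 ms/image
```

In other words, the quoted rate works out to roughly 700 images per second, or about a millisecond and a half per image on average.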
2. CNTK
CNTK stands for Computational Network Toolkit, Microsoft's open source artificial intelligence toolkit. It performs well whether it runs on a single CPU, a single GPU, multiple GPUs, or multiple machines with multiple GPUs. Microsoft mainly uses it for speech recognition research, but it is also well suited to machine translation, image recognition, image captioning, text processing, language understanding and language modeling.
3. Deeplearning4j
Deeplearning4j is an open source deep learning library for the Java virtual machine (JVM). It runs in distributed environments and integrates with Hadoop and Apache Spark. It can be used to configure deep neural networks and is compatible with Java, Scala and other JVM languages.
The project is managed by a commercial company called Skymind, which provides support, training and an enterprise distribution for the project.
4. DMTK
DMTK stands for Distributed Machine Learning Toolkit. Like CNTK, DMTK is an open source artificial intelligence tool from Microsoft. Designed for big data applications, its goal is to train AI systems faster. It includes three main components: the DMTK framework, the LightLDA topic model algorithm, and the Distributed (Multisense) Word Embedding algorithm. To demonstrate its speed, Microsoft claims that on an eight-machine cluster it can train a topic model with 1 million topics and a 10-million-word vocabulary (10 trillion parameters in total) on a document collection of 100 billion tokens. According to the claim, no other tool has matched this result.
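The parameter count in that claim can be checked with simple arithmetic. The sketch below assumes the model stores one parameter per (topic, word) pair, as in a standard LDA topic-word matrix. That layout is an assumption made for illustration, not a description of LightLDA's internals.

```python
# Sizes quoted in Microsoft's claim.
num_topics = 1_000_000        # 1 million topics
vocabulary_size = 10_000_000  # 10-million-word vocabulary

# Assumption: one parameter per (topic, word) pair, i.e. a dense
# topic-word matrix of shape num_topics x vocabulary_size.
num_parameters = num_topics * vocabulary_size

print(f"{num_parameters:,} parameters")  # 10,000,000,000,000 parameters
```

One million times ten million is 10^13, which matches the 10 trillion figure.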
5. H2O
H2O is aimed more at enterprise users than at scientific research, and it counts many large companies among its customers, including Capital One, Cisco, Nielsen Catalina, PayPal and Transamerica. It claims that anyone can use the power of machine learning and predictive analytics to solve business problems. It can be used for predictive modeling, risk and fraud analysis, insurance analytics, advertising technology, health care and customer intelligence.
It comes in two open source versions: standard H2O and Sparkling Water, which is integrated with Apache Spark. Paid enterprise support is also available.
6. Mahout
Mahout is an Apache Foundation project and an open source machine learning framework. According to its website, it has three main features: a programming environment for building scalable algorithms, premade algorithms for tools like Spark and H2O, and a vector math experimentation environment called Samsara. Companies that use Mahout include Adobe, Accenture, Foursquare, Intel, LinkedIn, Twitter, Yahoo and many others. Its website lists third parties that offer professional support.
7. MLlib
Because of its speed, Apache Spark has become one of the most popular big data processing tools. MLlib is Spark's scalable machine learning library. It integrates with Hadoop and interoperates with NumPy and R. It covers a wide range of machine learning algorithms and utilities, including classification, regression, decision trees, recommendation, clustering, topic modeling, feature transformations, model evaluation, ML pipeline construction, ML persistence, survival analysis, frequent itemset and sequential pattern mining, distributed linear algebra and statistics.
8. NuPIC
NuPIC, managed by Numenta, is an open source AI project based on Hierarchical Temporal Memory (HTM) theory. Essentially, HTM tries to create a computer system that mimics the human neocortex. The goal is to build machines that approach or surpass human performance on many cognitive tasks.
In addition to the open source license, Numenta offers NuPIC under a commercial license, and it also licenses the patents behind the technology.
9. OpenNN
OpenNN is a C++ programming library that implements neural network algorithms. It is designed for developers and researchers with an advanced understanding of artificial intelligence. Its key features include depth