Enabling reliable, secure collaboration on data science and machine learning projects
A conversation with Paul Taylor, chief architect in Watson Data and AI, and IBM Fellow.
The toughest part of machine learning with Spark isn't what you think it is.
Learn the somewhat quirky process for integrating Logstash with the Amazon Elasticsearch Service.
Explore techniques that restrict Kibana access to specific IP addresses or a proxy server, protect your ES cluster, and block entry by unauthorized users.
Learn to configure the access policies crucial to working successfully with the Amazon Elasticsearch Service.
Learn how to manage Apache Spark configuration overrides for an AWS Elastic MapReduce cluster to save time and money.
Learn how to create, structure, and compile your Scala script to a JAR file, and use SBT to run it on a distributed Spark cluster.
Learn how to use steps in the EMR console to schedule and run Spark scripts stored in Amazon S3, on both new and existing clusters.
Learn how to use SSH to connect to the master node of your Elastic MapReduce (EMR) cluster.
Learn how to set up an SSH tunnel and web proxy to use tools like Hue, Zeppelin, and ResourceManager.
Learn three different ways of running Hive queries on your EMR cluster: by script from the terminal, through the Hue web interface, or as steps in the EMR console.